pip/src/pip/_vendor
Donald Stufft 95bcf8c5f6 Move all internal APIs to pip._internal 2017-08-31 14:53:00 -04:00
cachecontrol
colorama
distlib
html5lib
lockfile
msgpack
packaging
pkg_resources
progress
pytoml
requests
webencodings
Makefile
README.rst
__init__.py
appdirs.py
distro.py
ipaddress.py
pyparsing.py
retrying.py
six.py
vendor.txt

README.rst

Policy
======

* Vendored libraries **MUST NOT** be modified except as required to
  successfully vendor them.

* Vendored libraries **MUST** be released copies of libraries available on
  PyPI.

* The versions of libraries vendored in pip **MUST** be reflected in
  ``pip/_vendor/vendor.txt``.

* Vendored libraries **MUST** function without any build steps such as ``2to3`` or
  compilation of C code. In practice, this limits vendoring to pure-Python
  libraries that support 2.x and 3.x from a single source.

* Any modifications made to libraries **MUST** be noted in
  ``pip/_vendor/README.rst`` and their corresponding patches **MUST** be
  included in ``tasks/vendoring/patches``.
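
A minimal sketch of how the ``vendor.txt`` rule could be checked mechanically.
It assumes that each non-comment line of ``vendor.txt`` is an exact pin of the
form ``name==version``. The helper ``parse_vendor_txt`` is hypothetical and is
not part of pip or of its vendoring task.

```python
def parse_vendor_txt(text):
    """Map each pinned library name to its version.

    Assumes one ``name==version`` pin per line; ``#`` starts a comment.
    Raises ValueError for any line that is not an exact pin.
    """
    pins = {}
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep or not name.strip() or not version.strip():
            raise ValueError("not an exact pin: %r" % raw)
        pins[name.strip().lower()] = version.strip()
    return pins


# Illustrative input only; the versions here are made up for the example.
example = "six==1.10.0\n# a comment\nrequests==2.14.2  # trailing note\n"
pins = parse_vendor_txt(example)
```

Rejecting anything other than ``==`` keeps the file an exact record of what is
vendored, which is what the policy asks for.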


Rationale
---------

Historically, pip has not had any dependencies except for setuptools itself,
choosing instead to implement any functionality it needed in order to avoid
taking on a dependency. However, starting with pip 1.5 we began to replace code
that was implemented inside of pip with reusable libraries from PyPI. This
brought the typical benefits of reusing libraries instead of reinventing the
wheel: higher-quality, more battle-tested code, centralization of bug fixes
(particularly security-sensitive ones), and better/more features for less work.

However, there are several issues with having dependencies in the traditional
way (via ``install_requires``) for pip. These issues are:

* **Fragility.** When pip depends on another library to function, then if for
  whatever reason that library either isn't installed or an incompatible
  version is installed, pip ceases to function. This is of course true for
  all Python applications; however, for every application *except* pip the
  way you fix it is by re-running pip. Obviously, when pip can't run, you can't
  use pip to fix pip, so you're left having to resolve the dependencies and
  install them by hand.

* **Making other libraries uninstallable.** One of pip's current dependencies is
  the ``requests`` library, for which pip requires a fairly recent version to run.
  If pip depended on ``requests`` in the traditional manner, then we'd either
  have to maintain compatibility with every ``requests`` version that has ever
  existed (and ever will), OR allow pip to render certain versions of ``requests``
  uninstallable. (The second issue, although technically true for any Python
  application, is magnified by pip's ubiquity; pip is installed by default in
  Python, in ``pyvenv``, and in ``virtualenv``.)

* **Security.** This might seem puzzling at first glance, since vendoring
  has a tendency to complicate updating dependencies for security updates,
  and that holds true for pip. However, given the *other* reasons for avoiding
  dependencies, the alternative is for pip to reinvent the wheel itself.
  This is what pip did historically. It forced pip to re-implement its own
  HTTPS verification routines as a workaround for the Python standard library's
  lack of SSL validation. Those routines ended up with bugs similar to the ones
  in the validation routines of ``requests`` and ``urllib3``, except that they
  had to be discovered and fixed independently. Even though we're vendoring,
  reusing libraries keeps pip more secure by relying on the great work of our
  dependencies, *and* allows for faster, easier security fixes by simply
  pulling in newer versions of dependencies.

* **Bootstrapping.** Currently, most popular methods of installing pip rely
  on pip's self-contained nature to install pip itself. These tools work by bundling
  a copy of pip, adding it to ``sys.path``, and then executing that copy of pip.
  This is done, rather than implementing a "mini installer", to reduce duplication:
  pip already knows how to install a Python package, and is far more battle-tested
  than any "mini installer" could ever possibly be.
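
The bootstrapping approach described in the last item can be sketched as
follows. This is only an illustration of the general technique (putting an
archive on ``sys.path`` and importing from it), not the code of any particular
installer. The archive ``demo_bundle.zip`` and the module ``demo_installer``
are hypothetical stand-ins for a bundled copy of pip.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_bundle(archive_path, module_name):
    """Import a pure-Python module straight out of a zip (or wheel) archive."""
    # Archives placed on sys.path are importable via the standard zipimport hook.
    sys.path.insert(0, archive_path)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


def make_demo_bundle(directory):
    """Build a tiny stand-in archive; a real tool would ship an actual pip wheel."""
    archive_path = os.path.join(directory, "demo_bundle.zip")
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr(
            "demo_installer.py",
            "def main():\n    return 'installed'\n",
        )
    return archive_path


workdir = tempfile.mkdtemp()
bundle = make_demo_bundle(workdir)
installer = import_from_bundle(bundle, "demo_installer")
result = installer.main()
```

This only works because the bundled code is self-contained and pure Python,
which is why the policy above rules out build steps and C extensions.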

Many downstream redistributors have policies against this kind of bundling, and
instead opt to patch the software they distribute to debundle it and make it
rely on the global versions of the software that they already have packaged
(which may have their own patches applied). We (the pip team) would prefer
that pip *not* be debundled in this manner, for the reasons above, and that it
instead be left intact as it is now. The one exception is that it is
acceptable to remove the
``pip/_vendor/requests/cacert.pem`` file provided you ensure that the
``ssl.get_default_verify_paths().cafile`` API returns the correct CA bundle for
your system. This will ensure that pip uses your system-provided CA bundle
instead of the copy bundled with pip.
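
A quick way to see what that API reports on a given system is sketched below.
``ssl.get_default_verify_paths()`` is part of the standard library; the wrapper
function ``system_ca_bundle`` is a hypothetical helper added for illustration.

```python
import ssl


def system_ca_bundle():
    """Return the default CA file path known to the ssl module, or None."""
    # cafile is None when the default CA file does not exist on this system.
    return ssl.get_default_verify_paths().cafile


cafile = system_ca_bundle()
if cafile is None:
    print("No default CA file found; keep the bundled cacert.pem.")
else:
    print("Default CA file:", cafile)
```

If this reports no CA file, or one that is not your system's intended bundle,
removing ``cacert.pem`` would leave pip without usable trust roots.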

In the longer term, if someone has a *portable* solution to the above problems,
other than the bundling method we currently use, that doesn't add unreasonable
problems of its own, then we would be happy to consider it, and possibly
switch to it. This solution must function correctly across all of the
situations in which we expect pip to be used and must not mandate some external
mechanism such as OS packages.


Modifications
-------------

* ``html5lib`` has been modified to import ``six`` from ``pip._vendor``
* ``setuptools`` is completely stripped to only keep ``pkg_resources``
* ``pkg_resources`` has been modified to import its dependencies from ``pip._vendor``
* ``CacheControl`` has been modified to import its dependencies from ``pip._vendor``
* ``packaging`` has been modified to import its dependencies from ``pip._vendor``
* ``requests`` has been modified *not* to optionally load any C dependencies
* ``distro`` has been modified to delay importing ``argparse`` to avoid errors on Python 2.6
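
Most of the modifications listed above are import rewrites. The sketch below
shows what such a rewrite looks like for a single library. It is an
illustration only and not the code used by pip's vendoring task; the function
``rewrite_imports`` and its regular expressions are assumptions made for this
example.

```python
import re


def rewrite_imports(source, library, prefix="pip._vendor"):
    """Point top-level imports of ``library`` at the vendored copy."""
    name = re.escape(library)
    # "import six" -> "from pip._vendor import six"
    source = re.sub(
        r"^([ \t]*)import %s[ \t]*$" % name,
        r"\1from %s import %s" % (prefix, library),
        source,
        flags=re.MULTILINE,
    )
    # "from six.moves import x" -> "from pip._vendor.six.moves import x"
    source = re.sub(
        r"^([ \t]*)from %s(\.|[ \t])" % name,
        r"\1from %s.%s\2" % (prefix, library),
        source,
        flags=re.MULTILINE,
    )
    return source


original = "import six\nfrom six.moves import urllib\nimport sixpack\n"
rewritten = rewrite_imports(original, "six")
```

Anchoring on whole names leaves unrelated modules such as the made-up
``sixpack`` untouched. Forms like ``import six as s`` are not handled by this
sketch.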


Automatic Vendoring
-------------------

Vendoring is automated via the ``vendoring.update`` task (defined in
``tasks/vendoring/__init__.py``) from the content of
``pip/_vendor/vendor.txt`` and the different patches in
``tasks/vendoring/patches/``.
Launch it via ``invoke vendoring.update`` (requires ``invoke>=0.13.0``).


Debundling
----------

As mentioned in the rationale, we, the pip team, would prefer it if pip was not
debundled (other than optionally ``pip/_vendor/requests/cacert.pem``) and that
pip was left intact. However, if you insist on doing so, we have a
semi-supported method that we do test in our CI, but that requires a bit of
extra work on your end in order to solve the problems described above.

1. Delete everything in ``pip/_vendor/`` **except** for
   ``pip/_vendor/__init__.py``.

2. Generate wheels for each of pip's dependencies (and any of their
   dependencies) using your patched copies of these libraries. These must be
   placed somewhere on the filesystem that pip can access (``pip/_vendor`` is
   the default assumption).

3. Modify ``pip/_vendor/__init__.py`` so that the ``DEBUNDLED`` variable is
   ``True``.

4. *(Optional)* If you've placed the wheels in a location other than
   ``pip/_vendor/``, then modify ``pip/_vendor/__init__.py`` so that the
   ``WHEEL_DIR`` variable points to the location you've placed them.
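
To make the role of ``DEBUNDLED`` and ``WHEEL_DIR`` more concrete, here is a
rough sketch of the kind of logic they could control: when debundled, every
wheel found in the wheel directory is put ahead of the existing import path.
This is an assumption about the mechanism, not a copy of
``pip/_vendor/__init__.py``; the helper ``prepend_wheels`` and the demo file
names are hypothetical.

```python
import glob
import os
import tempfile


def prepend_wheels(wheel_dir, search_path):
    """Return a new search path with every wheel in wheel_dir placed first."""
    wheels = sorted(glob.glob(os.path.join(wheel_dir, "*.whl")))
    return wheels + list(search_path)


# Demo with empty stand-in files; real wheels would come from step 2.
demo_dir = tempfile.mkdtemp()
for name in ("six-1.10.0-py2.py3-none-any.whl", "notes.txt"):
    open(os.path.join(demo_dir, name), "w").close()

new_path = prepend_wheels(demo_dir, ["/usr/lib/python3/site-packages"])
```

In a debundled install, the equivalent of
``sys.path[:] = prepend_wheels(WHEEL_DIR, sys.path)`` would run only when
``DEBUNDLED`` is ``True``.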