Fix false hash mismatches when installing a package that has a cached wheel.

This would occur when, for example, installing from a requirements file that references a certain hashed sdist, a common situation.

As of pip 7, pip always tries to build a wheel for each requirement (if one wasn't provided directly) and installs from that. The way this was implemented, InstallRequirement.link pointed to the cached wheel, which obviously had a different hash than the index-sourced archive, so spurious mismatch errors would result.

Now we no longer read from the wheel cache in hash-checking mode.

Make populate_link(), rather than the `link` setter, responsible for mapping InstallRequirement.link to a cached wheel. populate_link() isn't called until until prepare_files(). At that point, when we've examined all InstallRequirements and their potential --hash options, we know whether we should be requiring hashes and thus whether to use the wheel cache at all.

The only place that sets InstallRequirement.link other than InstallRequirement itself is pip.wheel, which does so long after hashes have been checked, when it's unpacking the wheel it just built, so it won't cause spurious hash mismatches.
This commit is contained in:
Erik Rose 2015-10-16 15:58:59 -04:00
parent e23f59673e
commit 925e4b4466
4 changed files with 69 additions and 29 deletions

View File

@ -428,11 +428,11 @@ Windows
Wheel Cache
~~~~~~~~~~~
Pip will read from the subdirectory ``wheels`` within the pip cache dir and use
any packages found there. This is disabled via the same ``no-cache-dir`` option
that disables the HTTP cache. The internal structure of that cache is not part
of the pip API. As of 7.0 pip uses a subdirectory per sdist that wheels were
built from, and wheels within that subdirectory.
Pip will read from the subdirectory ``wheels`` within the pip cache directory
and use any packages found there. This is disabled via the same
``--no-cache-dir`` option that disables the HTTP cache. The internal structure
of that is not part of the pip API. As of 7.0, pip makes a subdirectory for
each sdist that wheels are built from and places the resulting wheels inside.
Pip attempts to choose the best wheels from those built in preference to
building a new wheel. Note that this means when a package has both optional
@ -463,11 +463,11 @@ variety of platforms.)
The recommended hash algorithm at the moment is sha256, but stronger ones are
allowed, including all those supported by ``hashlib``. However, weaker ones
such as md5, sha1, and sha224 are excluded to avert false assurances of
such as md5, sha1, and sha224 are excluded to avoid giving a false sense of
security.
Hash verification is an all-or-nothing proposition. Specifying a ``--hash``
against any requirement not only checks that hash but also activates
against any requirement not only checks that hash but also activates a global
*hash-checking mode*, which imposes several other security restrictions:
* Hashes are required for all requirements. This is because a partially-hashed
@ -477,7 +477,7 @@ against any requirement not only checks that hash but also activates
``#md5=...`` syntax suffice to satisfy this rule (regardless of hash
strength, for legacy reasons), though you should use a stronger
hash like sha256 whenever possible.
* Hashes are required for all dependencies. An error is raised if there is a
* Hashes are required for all dependencies. An error results if there is a
dependency that is not spelled out and hashed in the requirements file.
* Requirements that take the form of project names (rather than URLs or local
filesystem paths) must be pinned to a specific version using ``==``. This
@ -506,9 +506,21 @@ fetches only the preferred archive for each package, so you may still need to
add hashes for alternatives archives using :ref:`pip hash`: for instance if
there is both a binary and a source distribution.
Hash-checking mode also functions with :ref:`pip download` and :ref:`pip
wheel`. A :ref:`comparison of hash-checking mode with other repeatability
strategies <Repeatability>` is available in the User Guide.
The :ref:`wheel cache <Wheel cache>` is disabled in hash-checking mode to
prevent spurious hash mismatch errors. These would otherwise occur while
installing sdists that had already been automatically built into cached wheels:
those wheels would be selected for installation, but their hashes would not
match the sdist ones from the requirements file. A further complication is that
locally built wheels are nondeterministic: contemporary modification times make
their way into the archive, making hashes unpredictable across machines and
cache flushes. However, wheels fetched from index servers land in pip's HTTP
cache, not its wheel cache, and are used normally in hash-checking mode. The
only potential penalty is thus extra build time for sdists, and this can be
solved by making sure pre-built wheels are available from the index server.
Hash-checking mode also works with :ref:`pip download` and :ref:`pip wheel`. A
:ref:`comparison of hash-checking mode with other repeatability strategies
<Repeatability>` is available in the User Guide.
.. warning::
Beware of the ``setup_requires`` keyword arg in :file:`setup.py`. The

View File

@ -239,28 +239,25 @@ class InstallRequirement(object):
return '<%s object: %s editable=%r>' % (
self.__class__.__name__, str(self), self.editable)
def populate_link(self, finder, upgrade):
def populate_link(self, finder, upgrade, require_hashes):
"""Ensure that if a link can be found for this, that it is found.
Note that self.link may still be None - if Upgrade is False and the
requirement is already installed.
If require_hashes is True, don't use the wheel cache, because cached
wheels, always built locally, have different hashes than the files
downloaded from the index server and thus throw false hash mismatches.
Furthermore, cached wheels at present have undeterministic contents due
to file modification times.
"""
if self.link is None:
self.link = finder.find_requirement(self, upgrade)
@property
def link(self):
return self._link
@link.setter
def link(self, link):
# Lookup a cached wheel, if possible.
if self._wheel_cache is None:
self._link = link
else:
self._link = self._wheel_cache.cached_wheel(link, self.name)
if self._link != link:
logger.debug('Using cached wheel link: %s', self._link)
if self._wheel_cache is not None and not require_hashes:
old_link = self.link
self.link = self._wheel_cache.cached_wheel(self.link, self.name)
if old_link != self.link:
logger.debug('Using cached wheel link: %s', self.link)
@property
def specifier(self):

View File

@ -341,7 +341,7 @@ class RequirementSet(object):
# Actually prepare the files, and collect any exceptions. Most hash
# exceptions cannot be checked ahead of time, because
# req.populate_links() needs to be called before we can make decisions
# req.populate_link() needs to be called before we can make decisions
# based on link type.
discovered_reqs = []
hash_errors = HashErrors()
@ -502,7 +502,8 @@ class RequirementSet(object):
"can delete this. Please delete it and try again."
% (req_to_install, req_to_install.source_dir)
)
req_to_install.populate_link(finder, self.upgrade)
req_to_install.populate_link(
finder, self.upgrade, require_hashes)
# We can't hit this spot and have populate_link return None.
# req_to_install.satisfied_by is None here (because we're
# guarded) and upgrade has no impact except when satisfied_by

View File

@ -3,7 +3,7 @@ import textwrap
import pytest
from tests.lib import (pyversion, path_to_url,
from tests.lib import (pyversion, path_to_url, requirements_file,
_create_test_package_with_subdirectory)
from tests.lib.local_repos import local_checkout
@ -313,3 +313,33 @@ def test_constrained_to_url_install_same_url(script, data):
'install', '--no-index', '-f', data.find_links, '-c',
script.scratch_path / 'constraints.txt', to_install)
assert 'Running setup.py install for singlemodule' in result.stdout
@pytest.mark.network
def test_double_install_spurious_hash_mismatch(script, tmpdir):
"""Make sure installing the same hashed sdist twice doesn't throw hash
mismatch errors.
Really, this is a test that we disable reads from the wheel cache in
hash-checking mode. Locally, implicitly built wheels of sdists obviously
have different hashes from the original archives. Comparing against those
causes spurious mismatch errors.
"""
script.pip('install', 'wheel') # Otherwise, it won't try to build wheels.
with requirements_file('simple==1.0 --hash=sha256:393043e672415891885c9a2a'
'0929b1af95fb866d6ca016b42d2e6ce53619b653',
tmpdir) as reqs_file:
# Install a package (and build its wheel):
result = script.pip_install_local(
'-r', reqs_file.abspath, expect_error=False)
assert 'Successfully installed simple-1.0' in str(result)
# Uninstall it:
script.pip('uninstall', '-y', 'simple', expect_error=False)
# Then install it again. We should not hit a hash mismatch, and the
# package should install happily.
result = script.pip_install_local(
'-r', reqs_file.abspath, expect_error=False)
assert 'Successfully installed simple-1.0' in str(result)