Add `pip hash` command.

Erik Rose 2015-10-07 23:41:24 -04:00
parent c62cd71f0f
commit 09008bf190
7 changed files with 128 additions and 14 deletions

@ -14,5 +14,4 @@ Reference Guide
pip_show
pip_search
pip_wheel
pip_hash

@ -0,0 +1,42 @@
.. _`pip hash`:

pip hash
------------

.. contents::

Usage
*****

.. pip-command-usage:: hash


Description
***********

.. pip-command-description:: hash


Overview
++++++++
``pip hash`` is a convenient way to get a hash digest for use with
:ref:`hash-checking mode`, especially for packages with multiple archives. The
error message from ``pip install --require-hashes ...`` will give you one
hash, but, if there are multiple archives (like source and binary ones), you
will need to manually download and compute a hash for the others. Otherwise, a
spurious hash mismatch could occur when :ref:`pip install` is passed a different
set of options, like :ref:`--no-binary <install_--no-binary>`.


Example
********

Compute the hash of a downloaded archive::

    $ pip download SomePackage
    Collecting SomePackage
    Downloading SomePackage-2.2.tar.gz
    Saved ./pip_downloads/SomePackage-2.2.tar.gz
    Successfully downloaded SomePackage
    $ pip hash ./pip_downloads/SomePackage-2.2.tar.gz
    ./pip_downloads/SomePackage-2.2.tar.gz:
    --hash=sha256:93e62e05c7ad3da1a233def6731e8285156701e3419a5fe279017c429ec67ce0
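
When a package ships several archives, each digest reported by ``pip hash`` can go on the same requirement line as its own ``--hash`` option. The following is a minimal sketch of that formatting step, not part of this commit: ``requirement_line`` is a hypothetical helper, and the byte strings are stand-ins for real archive contents.

```python
import hashlib


def requirement_line(name, version, digests, algorithm="sha256"):
    """Format a pinned requirement with one --hash option per archive digest.

    Hypothetical helper for illustration; pip itself does not provide it.
    """
    hashes = " ".join("--hash=%s:%s" % (algorithm, d) for d in digests)
    return "%s==%s %s" % (name, version, hashes)


# Stand-in archive contents; real digests would come from `pip hash <file>`.
sdist_digest = hashlib.sha256(b"source archive bytes").hexdigest()
wheel_digest = hashlib.sha256(b"binary archive bytes").hexdigest()

line = requirement_line("SomePackage", "2.2", [sdist_digest, wheel_digest])
print(line)
```

With both hashes listed, an install should accept whichever of the two archives pip ends up selecting.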

@ -460,7 +460,7 @@ binary and source distributions or when it offers binary distributions for a
variety of platforms.)
The recommended hash algorithm at the moment is sha256, but stronger ones are
allowed, including all those supported by ``hashlib``. However, weak hashes
allowed, including all those supported by ``hashlib``. However, weaker ones
such as md5, sha1, and sha224 are excluded to avert false assurances of
security.
@ -485,12 +485,28 @@ against any requirement not only checks that hash but also activates
to setuptools, giving up pip's ability to enforce any of the above.
Hash-checking mode can be forced on with the ``--require-hashes`` command-line
option. This can be useful in deploy scripts, to ensure that the author of the
option::

    $ pip install --require-hashes -r requirements.txt
    ...
    Hashes are required in --require-hashes mode (implicitly on when a hash is
    specified for any package). These requirements were missing hashes,
    leaving them open to tampering. These are the hashes the downloaded
    archives actually had. You can add lines like these to your requirements
    files to prevent tampering.
        pyelasticsearch==1.0 --hash=sha256:44ddfb1225054d7d6b1d02e9338e7d4809be94edbe9929a2ec0807d38df993fa
        more-itertools==2.2 --hash=sha256:93e62e05c7ad3da1a233def6731e8285156701e3419a5fe279017c429ec67ce0

This can be useful in deploy scripts, to ensure that the author of the
requirements file provided hashes. It is also a convenient way to bootstrap
your list of hashes, since it will show the hashes of the packages it
fetched. (It will fetch only a single archive for each package, so you may
still need to add additional hashes for alternatives: for instance if there is
both a binary and a source distribution available.)
your list of hashes, since it shows the hashes of the packages it fetched. It
fetches only the preferred archive for each package, so you may still need to
add hashes for alternative archives using :ref:`pip hash`: for instance, if
there is both a binary and a source distribution.
Hash-checking mode also functions with :ref:`pip download` and :ref:`pip
wheel`. A :ref:`comparison of hash-checking mode with other repeatability
strategies <Repeatability>` is available in the User Guide.
.. warning::
Beware of the ``setup_requires`` keyword arg in :file:`setup.py`. The

@ -624,16 +624,15 @@ downloaded packages::
This protects against compromises of PyPI, its CDN, the HTTPS certificate
chain, and the network between you and the packages. It also guards
against a package changing without a change in its version number, on
indexes that allow this. This approach is a good fit for automated
deployments to servers.
against a package changing without its version number changing, on indexes
that allow this. This approach is a good fit for automated server deployments.
Hash-checking mode is a labor-saving alternative to running an internal index
Hash-checking mode is a labor-saving alternative to running a private index
server containing approved packages: it removes the need to upload packages,
maintain ACLs, and keep an audit trail (which a VCS give you for the
maintain ACLs, and keep an audit trail (which a VCS gives you on the
requirements file for free). It can also substitute for a vendor library,
providing easier upgrades and less VCS noise. It does not, of course,
provide the availability benefits of an internal index or a vendor library.
provide the availability benefits of a private index or a vendor library.
For more, see :ref:`pip install\'s discussion of hash-checking mode <hash-checking mode>`.
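
The protection described above comes down to comparing a downloaded archive's digest against the pinned ones. The sketch below illustrates that check in isolation. It is not pip's implementation: ``archive_matches`` is a hypothetical name, and the temporary file stands in for a downloaded archive.

```python
import hashlib
import os
import tempfile


def archive_matches(path, allowed_digests, algorithm="sha256"):
    """Return True if the file's hex digest is one of the pinned digests."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as archive:
        # Read in 1 MiB chunks so large archives are not held in memory.
        for chunk in iter(lambda: archive.read(2 ** 20), b""):
            digest.update(chunk)
    return digest.hexdigest() in allowed_digests


# sha256 of b"hello", the same digest the test in this commit expects.
pinned = {"2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"}

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"hello")
ok_before = archive_matches(path, pinned)

# Simulate a package changing without its version number changing.
with open(path, "wb") as f:
    f.write(b"hello, tampered")
ok_after = archive_matches(path, pinned)
os.remove(path)

print(ok_before, ok_after)
```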

@ -6,6 +6,7 @@ from __future__ import absolute_import
from pip.commands.completion import CompletionCommand
from pip.commands.download import DownloadCommand
from pip.commands.freeze import FreezeCommand
from pip.commands.hash import HashCommand
from pip.commands.help import HelpCommand
from pip.commands.list import ListCommand
from pip.commands.search import SearchCommand
@ -18,6 +19,7 @@ from pip.commands.wheel import WheelCommand
commands_dict = {
    CompletionCommand.name: CompletionCommand,
    FreezeCommand.name: FreezeCommand,
    HashCommand.name: HashCommand,
    HelpCommand.name: HelpCommand,
    SearchCommand.name: SearchCommand,
    ShowCommand.name: ShowCommand,
@ -38,6 +40,7 @@ commands_order = [
    ShowCommand,
    SearchCommand,
    WheelCommand,
    HashCommand,
    HelpCommand,
]

pip/commands/hash.py Normal file

@ -0,0 +1,47 @@
from __future__ import absolute_import

import hashlib
import logging
import sys

from pip.basecommand import Command
from pip.exceptions import FAVORITE_HASH
from pip.status_codes import ERROR


logger = logging.getLogger(__name__)


class HashCommand(Command):
    """
    Compute a hash of a local package archive.

    These can be used with --hash in a requirements file to do repeatable
    installs.

    """
    name = 'hash'
    usage = """%prog [options] <file> ..."""
    summary = 'Compute hashes of package archives.'

    def run(self, options, args):
        if not args:
            self.parser.print_usage(sys.stderr)
            return ERROR

        # Print each path followed by a ready-to-paste --hash option.
        for path in args:
            logger.info('%s:\n--hash=%s:%s' % (path,
                                              FAVORITE_HASH,
                                              _hash_of_file(path)))


def _hash_of_file(path):
    """Return the hash digest of a file."""
    with open(path, 'rb') as archive:
        hash = hashlib.new(FAVORITE_HASH)
        # Read in 1 MiB chunks so large archives are not loaded all at once.
        while True:
            data = archive.read(2 ** 20)
            if not data:
                break
            hash.update(data)
    return hash.hexdigest()
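
``_hash_of_file`` feeds the archive to the hash object in fixed-size chunks. A short sketch of why that is safe: updating a ``hashlib`` object incrementally yields the same digest as hashing all the bytes at once, whatever the chunk size. The algorithm name ``"sha256"`` here is an assumption standing in for ``FAVORITE_HASH``, and ``chunked_digest`` is a hypothetical helper.

```python
import hashlib

data = b"hello" * 100000  # 500,000 bytes of stand-in archive content

# Digest computed in one call over the whole byte string.
one_shot = hashlib.sha256(data).hexdigest()


def chunked_digest(data, chunk_size):
    """Hash `data` piece by piece, as a chunked file read would."""
    digest = hashlib.new("sha256")
    for start in range(0, len(data), chunk_size):
        digest.update(data[start:start + chunk_size])
    return digest.hexdigest()


# The chunk size changes memory use, not the resulting digest.
same = all(chunked_digest(data, size) == one_shot
           for size in (1024, 4096, 2 ** 20))
print(same)
```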

@ -0,0 +1,8 @@
def test_basic(script, tmpdir):
    """Run 'pip hash' through its paces."""
    archive = tmpdir / 'hashable'
    archive.write('hello')
    result = script.pip('hash', archive)
    expected = ('--hash=sha256:2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425'
                'e73043362938b9824')
    assert expected in str(result)