* Rename it to reflect that it no longer handles
fetching _not_ using lazy wheels
* Use self as the first parameter
* Unnest the checks, with additional logs showing the reason
when lazy wheel is not used
This keeps all knowledge about preparation and types of requirements in
`RequirementPreparer`, so there's one place to look when we're ready to
start breaking it apart later.
The fact that all of this functionality can be put in terms of the
`RequirementPreparer` indicates that, at least at this point, this is
the cleanest place to put this functionality.
For some time we have not needed to pre-emptively unpack wheels as part
of metadata processing, but kept the existing logic because the
behavior would start to diverge more for different package types. In
this case, though, removing the special cases for wheels makes this
logic a bit simpler, so it is worth doing.
These are the only cases where backtracking can happen. This approach
also accounts for VCS requirements relying on the same ensure function
to do cloning :/
During a build of an extension module within `pip wheel` the source directory is
recursively copied into a temporary directory.
See https://github.com/pypa/pip/issues/7555
When the temporary directory is inside the source directory
(for example by setting `TMPDIR=$PWD/tmp`) this caused an infinite recursion
that ended in:
[Errno 36] File name too long
We prevent that by never copying the target to the target in _copy_source_tree.
Fixes https://github.com/pypa/pip/issues/7872
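
A minimal sketch of such a guard, assuming the copy is done with
`shutil.copytree` and an `ignore` callback. The name `copy_source_tree`
and the exact skip logic are illustrative, not necessarily what pip ships:

```python
import os
import shutil


def copy_source_tree(source, target):
    """Copy source into target, skipping target itself if it lives inside source."""
    source = os.path.abspath(source)
    target = os.path.abspath(target)
    target_parent, target_name = os.path.split(target)

    def ignore(directory, names):
        # copytree calls this once per directory it visits. Returning the
        # target's name while visiting its parent keeps the copy from
        # descending into the directory it is currently writing to.
        if os.path.abspath(directory) == target_parent and target_name in names:
            return [target_name]
        return []

    shutil.copytree(source, target, ignore=ignore, symlinks=True)
```

With `TMPDIR=$PWD/tmp`, the target sits under `tmp/` inside the source, so
`tmp/` is still copied but the build directory inside it is skipped.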
Previously we were writing a delete marker file which is checked in
InstallRequirement.remove_temporary_source which is only invoked if the
user did not pass --no-clean (and a PreviousBuildDirError was not
raised). Since our TempDirectory machinery now respects these conditions
we can just wrap our source directory in that instead of using this
ad-hoc mechanism for tracking our delete preference.
This will let us clean up a lot of dead code that only existed for this
use case.
We want to use this value to determine whether a globally-managed
source_dir should delegate choosing deletion to the global tempdir
manager, so it needs to be above our call to
InstallRequirement.ensure_has_source_dir.
Since we need both the file path and content type to unpack, and we want
to move unpacking out of the lower-level functions, return all the
information needed so it's easier to move the unpacking out.
req.source_dir is only set by:
1. `InstallRequirement.__init__`
2. `InstallRequirement.ensure_has_source_dir`
`InstallRequirement.__init__` is only called with source_dir for
editable requirements, for which we would not call
`RequirementPreparer.prepare_linked_requirement` (only
`prepare_editable_requirement`).
We will use this assertion for justifying later refactoring.
Since
- download_dir is only set by the download command
- download_dir is normalized at the beginning of the download command
- path normalization includes expanduser

the expanduser call in the preparer is redundant.
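
The reasoning above can be checked with a small sketch. `normalize_path`
here is a stand-in written for illustration, assuming normalization expands
`~` and then makes the path absolute; pip's own helper may differ in detail:

```python
import os


def normalize_path(path):
    # Stand-in for the normalization done by the download command:
    # expand "~" first, then make the path absolute and canonical.
    return os.path.normcase(os.path.abspath(os.path.expanduser(path)))


download_dir = normalize_path("~/downloads")
# An absolute path has no leading "~", so expanding it again changes nothing.
assert os.path.expanduser(download_dir) == download_dir
```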
* Do not clean up download tempdir immediately
The previous logic forced us to handle populating the download directory
in this function right next to the download and hash checking. By
extending the lifetime of the directory we can more easily separate the
code.
This also allows for additional optimizations later: by using metadata
from wheels directly instead of unpacking them, we can avoid extracting
wheels unnecessarily. Unpacked files can be easily 3x larger than the
archives themselves, so this should reduce disk utilization and general
IO significantly.
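
As a rough illustration of reading metadata without unpacking, the sketch
below pulls the `METADATA` file straight out of the wheel archive using the
standard `zipfile` module. The function name `read_wheel_metadata` is
hypothetical and this is not the code pip uses:

```python
import zipfile


def read_wheel_metadata(wheel_file):
    """Return the METADATA text from a wheel without extracting the archive."""
    with zipfile.ZipFile(wheel_file) as zf:
        for name in zf.namelist():
            # A wheel keeps its metadata in "<name>-<version>.dist-info/METADATA".
            parent, _, filename = name.rpartition("/")
            if (
                filename == "METADATA"
                and parent.endswith(".dist-info")
                and "/" not in parent
            ):
                return zf.read(name).decode("utf-8")
    raise ValueError("no .dist-info/METADATA entry found")
```

Only the one archive member is decompressed, so nothing else is written to
disk.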
This makes the behavior of this function easier to test, since we can
use a different file to distinguish the already-downloaded case from the
existing-file-hash-failed case.
_check_download_dir will only return a falsy value if either:
* the provided path does not exist
* the hash does not match - in which case the file is unlinked
so the file cannot exist at either of these points.
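
The invariant can be sketched as follows. `check_download_dir` below is a
simplified stand-in that takes a single SHA-256 digest; the real
`_check_download_dir` has a different signature, so treat the parameters as
assumptions:

```python
import hashlib
import os


def check_download_dir(path, expected_sha256):
    """Return path if it exists and matches the hash, else None.

    On a mismatch the stale file is removed, so a None result always
    means nothing is left at path.
    """
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        actual = hashlib.sha256(f.read()).hexdigest()
    if actual != expected_sha256:
        os.unlink(path)
        return None
    return path
```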
This will be home to Downloader, Download, and associated helper
functions. Since this is an abstraction over PipSession, it makes
sense to keep these functions in a separate module.
Also move a helper function here from operations.prepare.
This simplifies the work done in the operations.prepare helper functions
and also opens the door to removing session and progress_bar from
RequirementPreparer itself.
A plain loop is easier to follow than chained generators consumed by
a helper function, and reduces the number of objects being passed around
just to download a file.
Instead of computing hashes on-the-fly we do it after fully downloading
the file. This step will let us move hash checking to a higher-level
function without introducing a lot of complexity.
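
A minimal sketch of the two-step shape, assuming the response body arrives
as an iterable of byte chunks. The helper names `save_chunks` and
`file_sha256` are made up for this example:

```python
import hashlib


def save_chunks(chunks, path):
    # Step 1: write every chunk to disk, with no hashing yet.
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)


def file_sha256(path, block_size=8192):
    # Step 2: hash the completed file in a separate pass.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Because hashing only needs the finished file, the check can live in any
caller that knows the path.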
Now the place we construct the progress indicator doesn't need to know
about our strategy for consuming the response, freeing us to extract the
chunk iterator construction into the caller.
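
One way to picture the separation, as a sketch only: the progress wrapper
accepts any iterable of chunks, and the caller decides how those chunks are
produced. `with_progress`, `iter_chunks`, and the `report` callback are
hypothetical names, not pip's API:

```python
def with_progress(chunks, total, report):
    """Yield chunks unchanged, calling report(done, total) after each one."""
    done = 0
    for chunk in chunks:
        done += len(chunk)
        report(done, total)
        yield chunk


def iter_chunks(stream, chunk_size):
    # Built by the caller; the progress wrapper never sees the stream.
    return iter(lambda: stream.read(chunk_size), b"")
```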