For some time we have not needed to pre-emptively unpack wheels as part
of metadata processing, but we kept the existing logic because removing
it would have made the behavior diverge further between package types.
In this case, though, removing the special cases for wheels makes the
logic a bit simpler, so it is worth doing.
This check only applies to explicit requirements since we avoid
downloading the dist from finder altogether when there is a matching
installation (although the check wouldn’t change the behaviour in that
case anyway).
We could do this when we build the `ExplicitRequirement` instead, as
we did for `SpecifierRequirement`, but that would require us to resolve
the direct requirement’s version eagerly, which I don’t want to do.
The implemented approach checks the version only after resolution, at
which point the distribution is already built anyway and the operation
is cheap.
There are a few changes here:
1. The byte-compilation now occurs after we copy the root-scheme files
and files from any wheel data dirs
2. Instead of iterating over the files in the unpacked wheel directory,
we iterate over the installed files as they exist in the installation
path
3. In addition to asserting that pyc files were created, we also add
them to the list of installed files, so they will be included in RECORD
By compiling after installation, we no longer depend on a separate
temporary directory - this brings us closer to installing directly from
wheel files.
By compiling with source files as they exist in the installation output
directory, we no longer generate pyc files with an embedded randomized
temp directory - this means that wheel installs can be deterministic.
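The embedded path is the `co_filename` of the compiled code object. A
small illustration, using only the built-in `compile` and `marshal`
(both paths and the `bytecode_for` helper are made up for the example,
not taken from pip):

```python
import marshal

SOURCE = "def f():\n    return 1\n"


def bytecode_for(path):
    """Compile SOURCE as if it lived at `path`.

    Returns the code object and its marshalled bytes, which is the
    payload that ends up inside a pyc file.
    """
    code = compile(SOURCE, path, "exec")
    return code, marshal.dumps(code)


# Hypothetical locations: a randomized temp dir vs. the final install path.
temp_code, temp_bytes = bytecode_for("/tmp/pip-unpacked-abc123/pkg/mod.py")
final_code, final_bytes = bytecode_for("/venv/lib/site-packages/pkg/mod.py")
```

The same source yields different bytes when only the path differs, which
is why compiling from a randomized temp directory cannot be reproducible.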
In order to add generated pyc files to the RECORD file for our package,
we need to know their path! To raise confidence that we're doing this
correctly, we assert the existence of the expected 'pyc' files while
still using the old installation logic.
Some valid reasons why pyc files may not be generated:
1. Syntax error in the installed Python files
2. There is already a pyc file in-place that isn't writable by the
current user
We don't fail installation in those cases today, and we wouldn't want to
change our behavior here, so we only assert that the pyc file was
created if `compileall.compile_file` indicates success.
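A minimal sketch of that rule, assuming the default pyc location given
by `importlib.util.cache_from_source`. The helper name
`compile_and_check` is hypothetical and this is not the actual pip code:

```python
import compileall
import importlib.util
import os


def compile_and_check(source_path):
    """Compile one file and return the pyc path, or None on failure.

    A falsy result from compile_file (syntax error, unwritable pyc) is
    tolerated, matching the existing behavior of not failing the install.
    """
    success = compileall.compile_file(source_path, force=True, quiet=2)
    if not success:
        return None
    pyc_path = importlib.util.cache_from_source(source_path)
    # Only assert existence when compilation reported success.
    assert os.path.exists(pyc_path), pyc_path
    return pyc_path
```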
`compileall.compile_file` returns a success value, but it can report
"successful" without actually generating a pyc file if the input file
was filtered out and compilation was not attempted.
In our file processing we mirror that logic, to ensure that a truthy
success returned by `compileall.compile_file` actually indicates a file
was written.
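As a rough sketch of the mirrored filter: to my understanding,
`compileall.compile_file` only attempts compilation for regular files
whose name ends in `.py`, so the same two checks are applied up front.
The function name and its parameters are illustrative assumptions:

```python
import os


def pyc_source_file_paths(lib_dir, installed_paths):
    """Yield full paths of installed files that compile_file would compile.

    `installed_paths` are paths relative to `lib_dir` (an assumption made
    for this sketch).
    """
    for relative_path in sorted(installed_paths):
        full_path = os.path.join(lib_dir, relative_path)
        if not os.path.isfile(full_path):
            continue  # missing, or a directory: nothing to compile
        if not full_path.endswith(".py"):
            continue  # compile_file would skip it and still report success
        yield full_path
```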
We want to move towards having more control over the generation of pyc
files, which will allow us to provide deterministic installs and
generate pyc files without relying on an already-extracted wheel.
To that end, here we are stripping away one layer of abstraction,
`compileall.compile_dir`. `compileall.compile_dir` essentially recurses
through the provided directories and passes the files and args verbatim
to `compileall.compile_file`, so removing that layer means that we
directly call `compileall.compile_file`.
We make the assumption that we can successfully walk over the
source file tree, since we just wrote it, and omit the per-directory
traversal error handling done by `compileall.compile_dir`.
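Roughly, the replacement looks like the sketch below. It uses `os.walk`
as a stand-in for whatever traversal the real code performs, and
`compile_tree` is a hypothetical name:

```python
import compileall
import os


def compile_tree(root):
    """Call compile_file on every .py file under `root`.

    Returns a dict mapping each source path to whether compilation
    succeeded. No per-directory error handling is done: the tree is
    assumed to be readable because it was just written.
    """
    results = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Like compile_dir, do not descend into bytecode cache dirs.
        dirnames[:] = [d for d in dirnames if d != "__pycache__"]
        for filename in sorted(filenames):
            if not filename.endswith(".py"):
                continue
            path = os.path.join(dirpath, filename)
            results[path] = bool(
                compileall.compile_file(path, force=True, quiet=2)
            )
    return results
```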
Since the Distribution pulls its data directly from the Wheel file,
without extracting intermediate files to disk, this brings us closer to
installing from Wheels without extracting everything.
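To illustrate the idea of reading from the archive directly, here is a
standard-library-only sketch. It is not how pip builds its Distribution;
`read_wheel_metadata` and its `info_dir` argument are assumptions for
the example:

```python
import zipfile
from email.parser import Parser


def read_wheel_metadata(wheel_file: zipfile.ZipFile, info_dir: str):
    """Parse `<info_dir>/METADATA` straight out of an open wheel archive.

    Nothing is written to disk; the member is read into memory and parsed
    as RFC 822 style headers.
    """
    raw = wheel_file.read(f"{info_dir}/METADATA")
    return Parser().parsestr(raw.decode("utf-8"))
```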
This big chunk of code was independent of the rest of our wheel
installation process. Moving it out enforces that there are no
dependencies between it and the original function, and makes it easier
to read the original function.
This makes get_csv_rows_for_installed simpler, because it is not
modifying its arguments. We can also more easily refactor RECORD file
reading since it is now decoupled from getting the installed RECORD file
rows.
Reducing the scope of variables reduces possible dependencies between
parts of this function, and will make it easier to extract this section
into its own function.
This reduces our dependence on the files being extracted to the
filesystem.
Compare the name extraction to the similar code in
`utils.wheel.wheel_dist_info_dir`.
We don't need to give `.data` directories the same strict
treatment (yet) because it isn't inconvenient if there happen
to be multiple of them in a single Wheel file.
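The difference in strictness can be sketched as follows. These helpers
are illustrative and only loosely modelled on the behavior described
above; they are not the functions in `utils.wheel`:

```python
def top_level_dirs(names):
    """Return the set of first path components of archive member names."""
    return {name.split("/", 1)[0] for name in names if "/" in name}


def single_dist_info_dir(names):
    """Strict: exactly one top-level `.dist-info` directory is required."""
    info_dirs = sorted(
        d for d in top_level_dirs(names) if d.endswith(".dist-info")
    )
    if len(info_dirs) != 1:
        raise ValueError(
            f"expected exactly one .dist-info directory, found {info_dirs}"
        )
    return info_dirs[0]


def data_dirs(names):
    """Lenient: any number of top-level `.data` directories is accepted."""
    return sorted(d for d in top_level_dirs(names) if d.endswith(".data"))
```

Here `names` would typically come from `ZipFile.namelist()` on the open
wheel.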
Currently we normalize incoming text in `get_entrypoints` so that it is
more compatible with `pkg_resources`. It turns out that `pkg_resources`
already does the same normalization, so we can omit it.
This simplifies `get_entrypoints`, opening the way for us to pass it a
plain string instead of a file path.