Commit bad03ef931 introduced the new
link_hash attribute that holds the link's hash info, but that attribute
does the same thing as _hashes, and some existing usages still populate
that old attribute. Since the plural variant covers more use cases (a
file can be hashed with multiple algorithms), we restore the logic from
before that commit, which uses _hashes, and consolidate link_hash back
into that attribute.
The pip-specific Path implementation has been removed, and all its
usages replaced by pathlib.Path. The tmpdir and tmpdir_factory fixtures
are also removed, and all usages are replaced by tmp_path and
tmp_path_factory, which use pathlib.Path.
The pip() function now also accepts pathlib.Path so we don't need to put
str() everywhere. Path arguments are coerced with os.fspath() into str.
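A minimal sketch of the coercion described above. The helper name `coerce_args` and the `StrPath` alias are hypothetical and are not the actual test-suite code; only `os.fspath()` and `pathlib.Path` come from the description.

```python
import os
from pathlib import Path
from typing import List, Union

# Hypothetical alias: anything os.fspath() can turn into a str path.
StrPath = Union[str, "os.PathLike[str]"]


def coerce_args(*args: StrPath) -> List[str]:
    """Return all arguments as str; os.fspath() passes str through unchanged."""
    return [os.fspath(arg) for arg in args]


# Callers can mix plain strings and Path objects without calling str().
args = coerce_args("install", "--target", Path("site") / "pkgs")
```

Using `os.fspath()` rather than `str()` also rejects objects that are not path-like (it raises `TypeError`), which tends to surface mistakes earlier.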
This reworks the HTML parsing logic to gracefully fall back to `html5lib`
on non-compliant HTML 5 documents. Falling back with a warning softens
the failure mode for users of commercial package index solutions that do
not follow the requisite standards and serve malformed HTML documents.
The earlier variant _returned_ an iterable object from a generator. This
did not properly handle the fallback, resulting in the html5lib code
path not being executed.
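One plausible shape of this pitfall, sketched with hypothetical function names (`fallback_links`, `parse_broken`, `parse_fixed`); the real pip code differs. In a generator function, `return some_iterable` ends the generator without yielding anything, so the fallback has to be delegated with `yield from`.

```python
from typing import Iterator


def fallback_links(doc: str) -> Iterator[str]:
    # Stand-in for the html5lib-based code path.
    yield "fallback:" + doc


def parse_broken(doc: str, use_fallback: bool) -> Iterator[str]:
    if use_fallback:
        # Because this function contains `yield`, it is a generator.
        # `return <value>` only sets StopIteration.value; nothing is yielded.
        return fallback_links(doc)
    yield "primary:" + doc


def parse_fixed(doc: str, use_fallback: bool) -> Iterator[str]:
    if use_fallback:
        # Delegate to the fallback so its items are actually produced.
        yield from fallback_links(doc)
        return
    yield "primary:" + doc


broken_result = list(parse_broken("page", use_fallback=True))
fixed_result = list(parse_fixed("page", use_fallback=True))
```

Here `broken_result` is empty while `fixed_result` contains the fallback item, which matches the symptom of the fallback path never running.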
The html5lib library isn't strictly required as the same functionality
can be achieved through the stdlib html.parser module.
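As a rough illustration of what the stdlib offers here, the sketch below collects anchor attributes with `html.parser.HTMLParser`. The class name `AnchorCollector` and the sample page are assumptions made for this example, not pip's implementation.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional, Tuple


class AnchorCollector(HTMLParser):
    """Hypothetical helper: record the attributes of every <a> tag."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.anchors: List[Dict[str, Optional[str]]] = []

    def handle_starttag(
        self, tag: str, attrs: List[Tuple[str, Optional[str]]]
    ) -> None:
        if tag == "a":
            self.anchors.append(dict(attrs))


page = (
    "<html><body>"
    '<a href="pkg-1.0.tar.gz#sha256=abc" data-requires-python="&gt;=3.7">'
    "pkg-1.0</a>"
    "</body></html>"
)
collector = AnchorCollector()
collector.feed(page)
collector.close()
hrefs = [anchor.get("href") for anchor in collector.anchors]
```

Attribute values arrive already unescaped, so `data-requires-python` is seen as `>=3.7` without further processing.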
html5lib is also one of the largest users of the six library. By
dropping this unnecessary dependency, the pip project gets closer to
dropping six as well.
Additionally, html5lib maintenance has slowed down and the project has
rejected pull requests to drop Python 2 support.
For now, the html5lib code remains, but is gated behind a command
line option: `--use-deprecated=html5lib`. After a sufficient amount of
time has passed without any reported bugs, the vendored library and this
flag can be removed completely.
The pretend library was used by very few tests. In all cases, it is
simple enough to switch to stdlib unittest.mock.
Using stdlib means there is one fewer library to install before running
tests. It can also simplify mypy usage via typeshed.
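A small sketch of the kind of call-recording check that `unittest.mock` covers. The `notify` function and its arguments are hypothetical and exist only to give the mock something to record.

```python
from unittest import mock


def notify(logger, name: str) -> str:
    """Hypothetical code under test: logs a message and returns a value."""
    logger.warning("skipping %s", name)
    return name.upper()


# A Mock records every call made on it and on its attributes.
logger = mock.Mock()
result = notify(logger, "pkg")

# Raises AssertionError if the call did not happen exactly once as given.
logger.warning.assert_called_once_with("skipping %s", "pkg")
```

Since `unittest.mock` ships with the standard library, no extra test dependency is needed for this style of assertion.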
The html5lib package (as well as stdlib html.parser) already unescapes
attributes. There is no need to do so a second time.
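To illustrate why a second unescape is not just redundant but can change the value, here is a sketch using stdlib `html.parser`. The `TitleGrabber` class and the sample markup are assumptions for this example.

```python
import html
from html.parser import HTMLParser
from typing import List


class TitleGrabber(HTMLParser):
    """Hypothetical helper: collect the title attribute of <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.titles: List[str] = []

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if tag == "a" and key == "title" and value is not None:
                self.titles.append(value)


grabber = TitleGrabber()
# The document author wants the literal text "R&amp;D" as the title.
grabber.feed('<a title="R&amp;amp;D" href="x">x</a>')
grabber.close()

once = grabber.titles[0]    # already unescaped by the parser
twice = html.unescape(once) # unescaping again alters the intended value
```

The parser yields the intended literal text, while the extra `html.unescape()` call collapses it further.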
Unnecessary since cba45215b9.
This introduces a collect_sources() method to do the same thing, but
instead of flattening links eagerly, it returns each repository entry
separately (and returns None for invalid repository options), so
subsequent code can better distinguish which link comes from which
repository.
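A simplified sketch of the idea, not pip's actual API: the `Source` type, the `build_source` helper, the validity rule, and the URLs are all hypothetical. It only shows one entry per repository option, with `None` marking an invalid option.

```python
from typing import List, NamedTuple, Optional, Sequence


class Source(NamedTuple):
    """Hypothetical per-repository entry keeping its links together."""

    location: str
    links: List[str]


def build_source(location: str) -> Optional[Source]:
    # Assumed validity rule for this sketch only.
    if not location.startswith(("https://", "file://")):
        return None
    return Source(location, [location.rstrip("/") + "/pkg-1.0.tar.gz"])


def collect_sources(locations: Sequence[str]) -> List[Optional[Source]]:
    # One result per input, in order; links are not flattened.
    return [build_source(location) for location in locations]


sources = collect_sources(
    ["https://a.example/simple/", "not a url", "file:///wheels"]
)
```

Because positions line up with the inputs, a caller can tell which repository produced which links and which option was rejected.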
Use pyupgrade to convert simple string formatting to use f-string
syntax. pyupgrade is intentionally timid and will not create an f-string
if it would make the expression longer or if the substitution parameters
are anything but simple names or dotted names.
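A short before/after illustration of the rule described above. The variable names are made up for the example, and the "left alone" case follows from the stated restriction to simple and dotted names.

```python
from types import SimpleNamespace

name = "pip"
version = SimpleNamespace(major=22, minor=1)

# Simple name: eligible for conversion.
before_simple = "Installing {}".format(name)
after_simple = f"Installing {name}"

# Dotted names: also eligible.
before_dotted = "{}.{}".format(version.major, version.minor)
after_dotted = f"{version.major}.{version.minor}"

# The argument is a call expression, not a simple or dotted name,
# so per the rule above this line would be left as-is.
left_alone = "Installing {}".format(name.upper())
```

Each converted pair produces the same string, so the rewrite is purely syntactic.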
make_link_collector() was in self_outdated_check, a module responsible
for checking whether the currently-running pip is outdated, but is
imported by things that have nothing to do with this outdated check. Move
the function to be a class method in LinkCollector so the module
hierarchy makes more sense.
add failing test
apply the fix
add template NEWS entry according to https://pip.pypa.io/en/latest/development/contributing/#news-entries (wrong PR #)
rename news entry to the current PR #
respond to review comments
fix test failures
fix tests by adding uuid salt in urls
cache html page fetching by link
make CI pass (?)
make the types much better
finally listen to the maintainer and cache parse_links() by url :)
avoid caching parse_links() when the url is an index url
cleanup
add testing for uncacheable marking
only conditionally vendor _lru_cache for py2
bugfix => feature
python 2 does not cache!
Do away with type: ignore with getattr()
respond to review comments