Commit 107c1c2

Merge remote-tracking branch 'origin/main' into fix-direct-preference
2 parents 8284705 + c3f08f0 commit 107c1c2

14 files changed, +247 -112 lines changed

.github/workflows/release.yml

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ jobs:

     steps:
       - name: Download all the dists
-        uses: actions/download-artifact@fa0a91b85d4f404e444e00e005971372dc801d16 # v4
+        uses: actions/download-artifact@cc203385981b70ca67e1cc392babf9cc229d5806 # v4
         with:
           name: python-package-distributions
           path: dist/

docs/html/topics/more-dependency-resolution.md

Lines changed: 15 additions & 6 deletions
@@ -132,6 +132,8 @@ operations:
 * `get_preference` - this provides information to the resolver to help it choose
   which requirement to look at "next" when working through the resolution
   process.
+* `narrow_requirement_selection` - this provides a way to limit the number of
+  identifiers passed to `get_preference`.
 * `find_matches` - given a set of constraints, determine what candidates exist
   that satisfy them. This is essentially where the finder interacts with the
   resolver.
@@ -140,19 +142,26 @@ operations:
 * `get_dependencies` - get the dependency metadata for a candidate. This is
   the implementation of the process of getting and reading package metadata.

-Of these methods, the only non-trivial one is the `get_preference` method. This
-implements the heuristics used to guide the resolution, telling it which
-requirement to try to satisfy next. It's this method that is responsible for
-trying to guess which route through the dependency tree will be most productive.
-As noted above, it's doing this with limited information. See the following
-diagram
+Of these methods, the only non-trivial ones are the `get_preference` and
+`narrow_requirement_selection` methods. These implement heuristics used
+to guide the resolution, telling it which requirement to try to satisfy next.
+It's these methods that are responsible for trying to guess which route through
+the dependency tree will be most productive. As noted above, it's doing this
+with limited information. See the following diagram:

 ![](deps.png)

 When the provider is asked to choose between the red requirements (A->B and
 A->C) it doesn't know anything about the dependencies of B or C (i.e., the
 grey parts of the graph).

+Pip's current implementation of the provider implements
+`narrow_requirement_selection` as follows:
+
+* If Requires-Python is present only consider that
+* If there are causes of resolution conflict (backtrack causes) then
+  only consider them until there are no longer any resolution conflicts
+
 Pip's current implementation of the provider implements `get_preference` as
 follows:

news/13135.feature.rst

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+Use :pep:`753` "Well-known Project URLs in Metadata" normalization rules when
+identifying an equivalent project URL to replace a missing ``Home-Page`` field
+in ``pip show``.
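
To illustrate the normalization this news entry refers to, here is a minimal sketch that mirrors the `normalize_project_url_label` helper added to `src/pip/_internal/commands/show.py` in this commit. The `find_homepage` helper and the sample labels are hypothetical and exist only for illustration.

```python
import string
from typing import Iterable, Optional


def normalize_project_url_label(label: str) -> str:
    # Same logic as the helper in this commit: drop ASCII punctuation and
    # whitespace, then lowercase.
    chars_to_remove = string.punctuation + string.whitespace
    removal_map = str.maketrans("", "", chars_to_remove)
    return label.translate(removal_map).lower()


def find_homepage(project_urls: Iterable[str]) -> Optional[str]:
    # Hypothetical helper: entries are "label, url" strings, as in the
    # Project-URL metadata field.
    for entry in project_urls:
        label, url = entry.split(",", maxsplit=1)
        if normalize_project_url_label(label) == "homepage":
            return url.strip()
    return None


for label in ("Homepage", "Home-Page", "home_page", "Home Page"):
    print(label, "->", normalize_project_url_label(label))
```

The removed code only stripped hyphens, underscores, and surrounding whitespace, so a label such as `Home Page` (with an inner space) would not have matched `homepage`; the new helper treats it as equivalent.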

news/13229.bugfix.rst

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+Parse wheel filenames according to `binary distribution format specification
+<https://packaging.python.org/en/latest/specifications/binary-distribution-format/#file-format>`_.
+When a filename doesn't match the spec a deprecation warning is emitted and the
+filename is parsed using the old method.
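
As a rough illustration of "the old method", the sketch below applies the legacy regular expression from the `wheel.py` hunk of this commit and expands the compressed tag sets the way the fallback branch does. It uses plain tuples as a stand-in for `packaging.tags.Tag`, and `parse_legacy` is a hypothetical name; the spec-compliant path (`parse_wheel_filename`) is deliberately left out so the example only needs the standard library.

```python
import re
from itertools import product
from typing import FrozenSet, Tuple

# Legacy pattern as shown in the wheel.py hunk of this commit.
LEGACY_WHEEL_FILE_RE = re.compile(
    r"""^(?P<namever>(?P<name>[^\s-]+?)-(?P<ver>[^\s-]*?))
    ((-(?P<build>\d[^-]*?))?-(?P<pyver>[^\s-]+?)-(?P<abi>[^\s-]+?)-(?P<plat>[^\s-]+?)
    \.whl|\.dist-info)$""",
    re.VERBOSE,
)

TagTriple = Tuple[str, str, str]  # (interpreter, abi, platform); stand-in for Tag


def parse_legacy(filename: str) -> Tuple[str, str, FrozenSet[TagTriple]]:
    info = LEGACY_WHEEL_FILE_RE.match(filename)
    # The pattern also accepts "*.dist-info" names, which carry no tags.
    if info is None or info.group("pyver") is None:
        raise ValueError(f"{filename!r} is not a legacy wheel filename")
    name = info.group("name").replace("_", "-")
    version = info.group("ver").replace("_", "-")
    # Each tag component may hold several dot-separated values; the wheel
    # supports every combination of them.
    tags = frozenset(
        product(
            info.group("pyver").split("."),
            info.group("abi").split("."),
            info.group("plat").split("."),
        )
    )
    return name, version, tags


name, version, tags = parse_legacy("my_pkg-1.0_local-py2.py3-none-any.whl")
print(name, version, sorted(tags))
```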

news/13253.feature.rst

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+Speed up resolution by first only considering the preference of
+candidates that must be required to complete the resolution.
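
The narrowing step behind this entry can be sketched as follows. This is a simplified, standalone approximation of `narrow_requirement_selection` from the `provider.py` hunk in this commit, not pip's actual code: `Cause` is a hypothetical stand-in that stores names directly, and the value of `REQUIRES_PYTHON_IDENTIFIER` here is a placeholder.

```python
from typing import List, NamedTuple, Optional, Sequence

# Placeholder value for this sketch only.
REQUIRES_PYTHON_IDENTIFIER = "<requires-python>"


class Cause(NamedTuple):
    # Hypothetical simplification: names only, no requirement objects.
    requirement: str
    parent: Optional[str]


def narrow(identifiers: Sequence[str], backtrack_causes: Sequence[Cause]) -> List[str]:
    # Collect every name involved in a conflict once, so membership checks
    # below are constant time instead of a scan per identifier.
    conflicted = set()
    for cause in backtrack_causes:
        conflicted.add(cause.requirement)
        if cause.parent is not None:
            conflicted.add(cause.parent)

    narrowed = []
    for identifier in identifiers:
        # Requires-Python short-circuits everything else.
        if identifier == REQUIRES_PYTHON_IDENTIFIER:
            return [identifier]
        if identifier in conflicted:
            narrowed.append(identifier)

    # Fall back to the full list when nothing was narrowed.
    return narrowed or list(identifiers)


print(narrow(["a", "b", "c"], [Cause("b", "c")]))
```

Only the identifiers returned here would then be ranked by `get_preference`, which is where the saving comes from when the candidate list is long.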

src/pip/_internal/commands/show.py

Lines changed: 9 additions & 5 deletions
@@ -1,4 +1,5 @@
 import logging
+import string
 from optparse import Values
 from typing import Generator, Iterable, Iterator, List, NamedTuple, Optional

@@ -13,6 +14,13 @@
 logger = logging.getLogger(__name__)


+def normalize_project_url_label(label: str) -> str:
+    # This logic is from PEP 753 (Well-known Project URLs in Metadata).
+    chars_to_remove = string.punctuation + string.whitespace
+    removal_map = str.maketrans("", "", chars_to_remove)
+    return label.translate(removal_map).lower()
+
+
 class ShowCommand(Command):
     """
     Show information about one or more installed packages.
@@ -135,13 +143,9 @@ def _get_requiring_packages(current_dist: BaseDistribution) -> Iterator[str]:
         if not homepage:
             # It's common that there is a "homepage" Project-URL, but Home-page
             # remains unset (especially as PEP 621 doesn't surface the field).
-            #
-            # This logic was taken from PyPI's codebase.
             for url in project_urls:
                 url_label, url = url.split(",", maxsplit=1)
-                normalized_label = (
-                    url_label.casefold().replace("-", "").replace("_", "").strip()
-                )
+                normalized_label = normalize_project_url_label(url_label)
                 if normalized_label == "homepage":
                     homepage = url.strip()
                     break

src/pip/_internal/index/package_finder.py

Lines changed: 1 addition & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -514,11 +514,7 @@ def _sort_key(self, candidate: InstallationCandidate) -> CandidateSortingKey:
514514
)
515515
if self._prefer_binary:
516516
binary_preference = 1
517-
if wheel.build_tag is not None:
518-
match = re.match(r"^(\d+)(.*)$", wheel.build_tag)
519-
assert match is not None, "guaranteed by filename validation"
520-
build_tag_groups = match.groups()
521-
build_tag = (int(build_tag_groups[0]), build_tag_groups[1])
517+
build_tag = wheel.build_tag
522518
else: # sdist
523519
pri = -(support_num)
524520
has_allowed_hash = int(link.is_hash_allowed(self._hashes))
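
The lines removed from `_sort_key` converted the build tag string into an `(int, str)` tuple before it entered the sort key; after this commit that conversion lives behind `wheel.build_tag`. A small sketch of why the tuple form matters for ordering is below. The regular expression is the one shown in the diff, while `split_build_tag` and the sample tags are hypothetical.

```python
import re
from typing import Tuple


def split_build_tag(raw: str) -> Tuple[int, str]:
    # Leading digits become an int so they compare numerically; whatever
    # follows is kept as a string tiebreaker.
    match = re.match(r"^(\d+)(.*)$", raw)
    if match is None:
        raise ValueError(f"build tag must start with a digit: {raw!r}")
    number, suffix = match.groups()
    return int(number), suffix


raw_tags = ["10", "9", "2local"]
print(sorted(raw_tags))                       # plain string ordering
print(sorted(raw_tags, key=split_build_tag))  # numeric-aware ordering
```

With plain strings, `"10"` sorts before `"9"`; with the tuple key, build 9 correctly ranks below build 10.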

src/pip/_internal/metadata/importlib/_envs.py

Lines changed: 9 additions & 3 deletions
@@ -8,10 +8,14 @@
 import zipimport
 from typing import Iterator, List, Optional, Sequence, Set, Tuple

-from pip._vendor.packaging.utils import NormalizedName, canonicalize_name
+from pip._vendor.packaging.utils import (
+    InvalidWheelFilename,
+    NormalizedName,
+    canonicalize_name,
+    parse_wheel_filename,
+)

 from pip._internal.metadata.base import BaseDistribution, BaseEnvironment
-from pip._internal.models.wheel import Wheel
 from pip._internal.utils.deprecation import deprecated
 from pip._internal.utils.filetypes import WHEEL_EXTENSION

@@ -26,7 +30,9 @@ def _looks_like_wheel(location: str) -> bool:
         return False
     if not os.path.isfile(location):
         return False
-    if not Wheel.wheel_file_re.match(os.path.basename(location)):
+    try:
+        parse_wheel_filename(os.path.basename(location))
+    except InvalidWheelFilename:
         return False
     return zipfile.is_zipfile(location)

src/pip/_internal/models/wheel.py

Lines changed: 64 additions & 43 deletions
@@ -3,13 +3,13 @@
 """

 import re
-from typing import Dict, Iterable, List
+from typing import Dict, Iterable, List, Optional

 from pip._vendor.packaging.tags import Tag
+from pip._vendor.packaging.utils import BuildTag, parse_wheel_filename
 from pip._vendor.packaging.utils import (
-    InvalidWheelFilename as PackagingInvalidWheelName,
+    InvalidWheelFilename as _PackagingInvalidWheelFilename,
 )
-from pip._vendor.packaging.utils import parse_wheel_filename

 from pip._internal.exceptions import InvalidWheelFilename
 from pip._internal.utils.deprecation import deprecated
@@ -18,54 +18,75 @@
 class Wheel:
     """A wheel file"""

-    wheel_file_re = re.compile(
+    legacy_wheel_file_re = re.compile(
         r"""^(?P<namever>(?P<name>[^\s-]+?)-(?P<ver>[^\s-]*?))
         ((-(?P<build>\d[^-]*?))?-(?P<pyver>[^\s-]+?)-(?P<abi>[^\s-]+?)-(?P<plat>[^\s-]+?)
         \.whl|\.dist-info)$""",
         re.VERBOSE,
     )

     def __init__(self, filename: str) -> None:
-        """
-        :raises InvalidWheelFilename: when the filename is invalid for a wheel
-        """
-        wheel_info = self.wheel_file_re.match(filename)
-        if not wheel_info:
-            raise InvalidWheelFilename(f"{filename} is not a valid wheel filename.")
         self.filename = filename
-        self.name = wheel_info.group("name").replace("_", "-")
-        _version = wheel_info.group("ver")
-        if "_" in _version:
-            try:
-                parse_wheel_filename(filename)
-            except PackagingInvalidWheelName as e:
-                deprecated(
-                    reason=(
-                        f"Wheel filename {filename!r} is not correctly normalised. "
-                        "Future versions of pip will raise the following error:\n"
-                        f"{e.args[0]}\n\n"
-                    ),
-                    replacement=(
-                        "to rename the wheel to use a correctly normalised "
-                        "name (this may require updating the version in "
-                        "the project metadata)"
-                    ),
-                    gone_in="25.1",
-                    issue=12938,
-                )
-
-            _version = _version.replace("_", "-")
-
-        self.version = _version
-        self.build_tag = wheel_info.group("build")
-        self.pyversions = wheel_info.group("pyver").split(".")
-        self.abis = wheel_info.group("abi").split(".")
-        self.plats = wheel_info.group("plat").split(".")
-
-        # All the tag combinations from this file
-        self.file_tags = {
-            Tag(x, y, z) for x in self.pyversions for y in self.abis for z in self.plats
-        }
+
+        # To make mypy happy specify type hints that can come from either
+        # parse_wheel_filename or the legacy_wheel_file_re match.
+        self.name: str
+        self._build_tag: Optional[BuildTag] = None
+
+        try:
+            wheel_info = parse_wheel_filename(filename)
+            self.name, _version, self._build_tag, self.file_tags = wheel_info
+            self.version = str(_version)
+        except _PackagingInvalidWheelFilename as e:
+            # Check if the wheel filename is in the legacy format
+            legacy_wheel_info = self.legacy_wheel_file_re.match(filename)
+            if not legacy_wheel_info:
+                raise InvalidWheelFilename(e.args[0]) from None
+
+            deprecated(
+                reason=(
+                    f"Wheel filename {filename!r} is not correctly normalised. "
+                    "Future versions of pip will raise the following error:\n"
+                    f"{e.args[0]}\n\n"
+                ),
+                replacement=(
+                    "to rename the wheel to use a correctly normalised "
+                    "name (this may require updating the version in "
+                    "the project metadata)"
+                ),
+                gone_in="25.3",
+                issue=12938,
+            )
+
+            self.name = legacy_wheel_info.group("name").replace("_", "-")
+            self.version = legacy_wheel_info.group("ver").replace("_", "-")
+
+            # Generate the file tags from the legacy wheel filename
+            pyversions = legacy_wheel_info.group("pyver").split(".")
+            abis = legacy_wheel_info.group("abi").split(".")
+            plats = legacy_wheel_info.group("plat").split(".")
+            self.file_tags = frozenset(
+                Tag(interpreter=py, abi=abi, platform=plat)
+                for py in pyversions
+                for abi in abis
+                for plat in plats
+            )
+
+    @property
+    def build_tag(self) -> BuildTag:
+        if self._build_tag is not None:
+            return self._build_tag
+
+        # Parse the build tag from the legacy wheel filename
+        legacy_wheel_info = self.legacy_wheel_file_re.match(self.filename)
+        assert legacy_wheel_info is not None, "guaranteed by filename validation"
+        build_tag = legacy_wheel_info.group("build")
+        match = re.match(r"^(\d+)(.*)$", build_tag)
+        assert match is not None, "guaranteed by filename validation"
+        build_tag_groups = match.groups()
+        self._build_tag = (int(build_tag_groups[0]), build_tag_groups[1])
+
+        return self._build_tag

     def get_formatted_file_tags(self) -> List[str]:
         """Return the wheel's tags as a sorted list of strings."""

src/pip/_internal/resolution/resolvelib/provider.py

Lines changed: 43 additions & 22 deletions
@@ -104,6 +104,49 @@ def __init__(
     def identify(self, requirement_or_candidate: Union[Requirement, Candidate]) -> str:
         return requirement_or_candidate.name

+    def narrow_requirement_selection(
+        self,
+        identifiers: Iterable[str],
+        resolutions: Mapping[str, Candidate],
+        candidates: Mapping[str, Iterator[Candidate]],
+        information: Mapping[str, Iterator["PreferenceInformation"]],
+        backtrack_causes: Sequence["PreferenceInformation"],
+    ) -> Iterable[str]:
+        """Produce a subset of identifiers that should be considered before others.
+
+        Currently pip narrows the following selection:
+            * Requires-Python, if present is always returned by itself
+            * Backtrack causes are considered next because they can be identified
+              in linear time here, whereas because get_preference() is called
+              for each identifier, it would be quadratic to check for them there.
+              Further, the current backtrack causes likely need to be resolved
+              before other requirements as a resolution can't be found while
+              there is a conflict.
+        """
+        backtrack_identifiers = set()
+        for info in backtrack_causes:
+            backtrack_identifiers.add(info.requirement.name)
+            if info.parent is not None:
+                backtrack_identifiers.add(info.parent.name)
+
+        current_backtrack_causes = []
+        for identifier in identifiers:
+            # Requires-Python has only one candidate and the check is basically
+            # free, so we always do it first to avoid needless work if it fails.
+            # This skips calling get_preference() for all other identifiers.
+            if identifier == REQUIRES_PYTHON_IDENTIFIER:
+                return [identifier]
+
+            # Check if this identifier is a backtrack cause
+            if identifier in backtrack_identifiers:
+                current_backtrack_causes.append(identifier)
+                continue
+
+        if current_backtrack_causes:
+            return current_backtrack_causes
+
+        return identifiers
+
     def get_preference(
         self,
         identifier: str,
@@ -156,20 +199,9 @@ def get_preference(
         unfree = bool(operators)
         requested_order = self._user_requested.get(identifier, math.inf)

-        # Requires-Python has only one candidate and the check is basically
-        # free, so we always do it first to avoid needless work if it fails.
-        requires_python = identifier == REQUIRES_PYTHON_IDENTIFIER
-
-        # Prefer the causes of backtracking on the assumption that the problem
-        # resolving the dependency tree is related to the failures that caused
-        # the backtracking
-        backtrack_cause = self.is_backtrack_cause(identifier, backtrack_causes)
-
         return (
-            not requires_python,
             not direct,
             not pinned,
-            not backtrack_cause,
             requested_order,
             not unfree,
             identifier,
@@ -224,14 +256,3 @@ def is_satisfied_by(self, requirement: Requirement, candidate: Candidate) -> boo
     def get_dependencies(self, candidate: Candidate) -> Sequence[Requirement]:
         with_requires = not self._ignore_dependencies
         return [r for r in candidate.iter_dependencies(with_requires) if r is not None]
-
-    @staticmethod
-    def is_backtrack_cause(
-        identifier: str, backtrack_causes: Sequence["PreferenceInformation"]
-    ) -> bool:
-        for backtrack_cause in backtrack_causes:
-            if identifier == backtrack_cause.requirement.name:
-                return True
-            if backtrack_cause.parent and identifier == backtrack_cause.parent.name:
-                return True
-        return False
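
With `requires_python` and `backtrack_cause` moved out of the preference tuple, the remaining fields visible in the `get_preference` hunk order identifiers through ordinary tuple comparison. The sketch below assumes the resolver works on the identifier with the smallest key; `Info`, `preference`, and the sample data are hypothetical simplifications rather than pip's real inputs.

```python
import math
from typing import Dict, NamedTuple, Tuple


class Info(NamedTuple):
    # Hypothetical, simplified inputs for the sketch.
    direct: bool
    pinned: bool
    unfree: bool


def preference(identifier: str, info: Info, requested_order: float) -> Tuple:
    # Same field order as the tuple left in the diff. False sorts before
    # True, so "not direct" puts direct requirements first.
    return (
        not info.direct,
        not info.pinned,
        requested_order,
        not info.unfree,
        identifier,
    )


infos: Dict[str, Info] = {
    "alpha": Info(direct=False, pinned=False, unfree=True),
    "beta": Info(direct=False, pinned=True, unfree=True),
    "gamma": Info(direct=True, pinned=False, unfree=False),
}
user_requested = {"gamma": 0}

ordered = sorted(
    infos,
    key=lambda name: preference(name, infos[name], user_requested.get(name, math.inf)),
)
print(ordered)
```

Earlier fields dominate later ones, so the trailing `identifier` only breaks ties and keeps the ordering deterministic.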
