Commit 0d4ed13

Merge pull request #13253 from notatallshaw/Use-new-narrow_requirement_selection-resolvelib-API-to-speed-up-resolution
Use new `narrow requirement selection` resolvelib api to reduce cost of resolution
2 parents 81bee61 + 6727e65 commit 0d4ed13

4 files changed: +134 -43 lines changed


docs/html/topics/more-dependency-resolution.md

Lines changed: 15 additions & 6 deletions
@@ -132,6 +132,8 @@ operations:
 * `get_preference` - this provides information to the resolver to help it choose
   which requirement to look at "next" when working through the resolution
   process.
+* `narrow_requirement_selection` - this provides a way to limit the number of
+  identifiers passed to `get_preference`.
 * `find_matches` - given a set of constraints, determine what candidates exist
   that satisfy them. This is essentially where the finder interacts with the
   resolver.
@@ -140,19 +142,26 @@ operations:
 * `get_dependencies` - get the dependency metadata for a candidate. This is
   the implementation of the process of getting and reading package metadata.

-Of these methods, the only non-trivial one is the `get_preference` method. This
-implements the heuristics used to guide the resolution, telling it which
-requirement to try to satisfy next. It's this method that is responsible for
-trying to guess which route through the dependency tree will be most productive.
-As noted above, it's doing this with limited information. See the following
-diagram
+Of these methods, the only non-trivial ones are the `get_preference` and
+`narrow_requirement_selection` methods. These implement heuristics used
+to guide the resolution, telling it which requirement to try to satisfy next.
+It's these methods that are responsible for trying to guess which route through
+the dependency tree will be most productive. As noted above, it's doing this
+with limited information. See the following diagram:

 ![](deps.png)

 When the provider is asked to choose between the red requirements (A->B and
 A->C) it doesn't know anything about the dependencies of B or C (i.e., the
 grey parts of the graph).

+Pip's current implementation of the provider implements
+`narrow_requirement_selection` as follows:
+
+* If Requires-Python is present only consider that
+* If there are causes of resolution conflict (backtrack causes) then
+  only consider them until there are no longer any resolution conflicts
+
 Pip's current implementation of the provider implements `get_preference` as
 follows:

news/13253.feature.rst

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+Speed up resolution by first only considering the preference of
+candidates that must be required to complete the resolution.

src/pip/_internal/resolution/resolvelib/provider.py

Lines changed: 43 additions & 22 deletions
@@ -103,6 +103,49 @@ def __init__(
     def identify(self, requirement_or_candidate: Union[Requirement, Candidate]) -> str:
         return requirement_or_candidate.name

+    def narrow_requirement_selection(
+        self,
+        identifiers: Iterable[str],
+        resolutions: Mapping[str, Candidate],
+        candidates: Mapping[str, Iterator[Candidate]],
+        information: Mapping[str, Iterator["PreferenceInformation"]],
+        backtrack_causes: Sequence["PreferenceInformation"],
+    ) -> Iterable[str]:
+        """Produce a subset of identifiers that should be considered before others.
+
+        Currently pip narrows the following selection:
+            * Requires-Python, if present is always returned by itself
+            * Backtrack causes are considered next because they can be identified
+              in linear time here, whereas because get_preference() is called
+              for each identifier, it would be quadratic to check for them there.
+              Further, the current backtrack causes likely need to be resolved
+              before other requirements as a resolution can't be found while
+              there is a conflict.
+        """
+        backtrack_identifiers = set()
+        for info in backtrack_causes:
+            backtrack_identifiers.add(info.requirement.name)
+            if info.parent is not None:
+                backtrack_identifiers.add(info.parent.name)
+
+        current_backtrack_causes = []
+        for identifier in identifiers:
+            # Requires-Python has only one candidate and the check is basically
+            # free, so we always do it first to avoid needless work if it fails.
+            # This skips calling get_preference() for all other identifiers.
+            if identifier == REQUIRES_PYTHON_IDENTIFIER:
+                return [identifier]
+
+            # Check if this identifier is a backtrack cause
+            if identifier in backtrack_identifiers:
+                current_backtrack_causes.append(identifier)
+                continue
+
+        if current_backtrack_causes:
+            return current_backtrack_causes
+
+        return identifiers
+
     def get_preference(
         self,
         identifier: str,
@@ -153,20 +196,9 @@ def get_preference(
         unfree = bool(operators)
         requested_order = self._user_requested.get(identifier, math.inf)

-        # Requires-Python has only one candidate and the check is basically
-        # free, so we always do it first to avoid needless work if it fails.
-        requires_python = identifier == REQUIRES_PYTHON_IDENTIFIER
-
-        # Prefer the causes of backtracking on the assumption that the problem
-        # resolving the dependency tree is related to the failures that caused
-        # the backtracking
-        backtrack_cause = self.is_backtrack_cause(identifier, backtrack_causes)
-
         return (
-            not requires_python,
             not direct,
             not pinned,
-            not backtrack_cause,
             requested_order,
             not unfree,
             identifier,
@@ -221,14 +253,3 @@ def is_satisfied_by(self, requirement: Requirement, candidate: Candidate) -> boo
     def get_dependencies(self, candidate: Candidate) -> Sequence[Requirement]:
         with_requires = not self._ignore_dependencies
         return [r for r in candidate.iter_dependencies(with_requires) if r is not None]
-
-    @staticmethod
-    def is_backtrack_cause(
-        identifier: str, backtrack_causes: Sequence["PreferenceInformation"]
-    ) -> bool:
-        for backtrack_cause in backtrack_causes:
-            if identifier == backtrack_cause.requirement.name:
-                return True
-            if backtrack_cause.parent and identifier == backtrack_cause.parent.name:
-                return True
-        return False
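
The narrowing logic added in `provider.py` can be tried in isolation. The following is a minimal standalone sketch, not pip's code: `Named`, `Info`, and `narrow` are hypothetical stand-ins for pip's requirement/candidate objects, resolvelib's `PreferenceInformation`, and the new method, and the constant's value is only a placeholder for pip's own `REQUIRES_PYTHON_IDENTIFIER`. It illustrates why building the set of backtrack names once keeps the check linear, where the removed `is_backtrack_cause` rescanned the causes for every identifier.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Sequence

# Placeholder value (assumption); pip defines its own constant.
REQUIRES_PYTHON_IDENTIFIER = "<Python from Requires-Python>"


@dataclass(frozen=True)
class Named:
    """Hypothetical stand-in for a requirement or candidate with a name."""

    name: str


@dataclass(frozen=True)
class Info:
    """Hypothetical stand-in for resolvelib's PreferenceInformation."""

    requirement: Named
    parent: Optional[Named] = None


def narrow(identifiers: Iterable[str], backtrack_causes: Sequence[Info]) -> List[str]:
    """Mirror the narrowing rules from the diff using the stand-in types."""
    identifiers = list(identifiers)

    # One pass over the causes; membership tests below are then O(1) each.
    backtrack_names = set()
    for info in backtrack_causes:
        backtrack_names.add(info.requirement.name)
        if info.parent is not None:
            backtrack_names.add(info.parent.name)

    causes = []
    for identifier in identifiers:
        # Requires-Python short-circuits everything else.
        if identifier == REQUIRES_PYTHON_IDENTIFIER:
            return [identifier]
        if identifier in backtrack_names:
            causes.append(identifier)

    # Fall back to the full list when nothing is in conflict.
    return causes or identifiers


causes = [Info(requirement=Named("b"), parent=Named("a"))]
print(narrow(["x", "a", "b", "y"], causes))
print(narrow(["x", REQUIRES_PYTHON_IDENTIFIER, "a"], causes))
print(narrow(["x", "y"], causes))
```

In this sketch both the conflicting requirement (`b`) and its parent (`a`) count as backtrack causes, matching the two `add` calls in the diff.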

tests/unit/resolution_resolvelib/test_provider.py

Lines changed: 74 additions & 15 deletions
@@ -1,5 +1,5 @@
 import math
-from typing import TYPE_CHECKING, Dict, Iterable, Optional, Sequence
+from typing import TYPE_CHECKING, Dict, Iterable, List, Optional, Sequence

 import pytest


@@ -36,61 +36,53 @@ def build_req_info(
 @pytest.mark.parametrize(
     "identifier, information, backtrack_causes, user_requested, expected",
     [
-        # Test case for REQUIRES_PYTHON_IDENTIFIER
-        (
-            REQUIRES_PYTHON_IDENTIFIER,
-            {REQUIRES_PYTHON_IDENTIFIER: [build_req_info("python")]},
-            [],
-            {},
-            (False, False, True, True, math.inf, True, REQUIRES_PYTHON_IDENTIFIER),
-        ),
         # Pinned package with "=="
         (
             "pinned-package",
             {"pinned-package": [build_req_info("pinned-package==1.0")]},
             [],
             {},
-            (True, False, False, True, math.inf, False, "pinned-package"),
+            (False, False, math.inf, False, "pinned-package"),
         ),
         # Star-specified package, i.e. with "*"
         (
             "star-specified-package",
             {"star-specified-package": [build_req_info("star-specified-package==1.*")]},
             [],
             {},
-            (True, False, True, True, math.inf, False, "star-specified-package"),
+            (False, True, math.inf, False, "star-specified-package"),
         ),
         # Package that caused backtracking
         (
             "backtrack-package",
             {"backtrack-package": [build_req_info("backtrack-package")]},
             [build_req_info("backtrack-package")],
             {},
-            (True, False, True, False, math.inf, True, "backtrack-package"),
+            (False, True, math.inf, True, "backtrack-package"),
         ),
         # Root package requested by user
         (
             "root-package",
             {"root-package": [build_req_info("root-package")]},
             [],
             {"root-package": 1},
-            (True, False, True, True, 1, True, "root-package"),
+            (False, True, 1, True, "root-package"),
         ),
         # Unfree package (with specifier operator)
         (
             "unfree-package",
             {"unfree-package": [build_req_info("unfree-package<1")]},
             [],
             {},
-            (True, False, True, True, math.inf, False, "unfree-package"),
+            (False, True, math.inf, False, "unfree-package"),
         ),
         # Free package (no operator)
         (
             "free-package",
             {"free-package": [build_req_info("free-package")]},
             [],
             {},
-            (True, False, True, True, math.inf, True, "free-package"),
+            (False, True, math.inf, True, "free-package"),
         ),
     ],
 )
@@ -115,3 +107,70 @@ def test_get_preference(
     )

     assert preference == expected, f"Expected {expected}, got {preference}"
+
+
+@pytest.mark.parametrize(
+    "identifiers, backtrack_causes, expected",
+    [
+        # REQUIRES_PYTHON_IDENTIFIER is present
+        (
+            [REQUIRES_PYTHON_IDENTIFIER, "package1", "package2", "backtrack-package"],
+            [build_req_info("backtrack-package")],
+            [REQUIRES_PYTHON_IDENTIFIER],
+        ),
+        # REQUIRES_PYTHON_IDENTIFIER is present after backtrack causes
+        (
+            ["package1", "package2", "backtrack-package", REQUIRES_PYTHON_IDENTIFIER],
+            [build_req_info("backtrack-package")],
+            [REQUIRES_PYTHON_IDENTIFIER],
+        ),
+        # Backtrack causes present (direct requirement)
+        (
+            ["package1", "package2", "backtrack-package"],
+            [build_req_info("backtrack-package")],
+            ["backtrack-package"],
+        ),
+        # Multiple backtrack causes
+        (
+            ["package1", "backtrack1", "backtrack2", "package2"],
+            [build_req_info("backtrack1"), build_req_info("backtrack2")],
+            ["backtrack1", "backtrack2"],
+        ),
+        # No special identifiers - return all
+        (
+            ["package1", "package2"],
+            [],
+            ["package1", "package2"],
+        ),
+        # Empty list of identifiers
+        (
+            [],
+            [],
+            [],
+        ),
+    ],
+)
+def test_narrow_requirement_selection(
+    identifiers: List[str],
+    backtrack_causes: Sequence["PreferenceInformation"],
+    expected: List[str],
+    factory: Factory,
+) -> None:
+    """Test that narrow_requirement_selection correctly prioritizes identifiers:
+    1. REQUIRES_PYTHON_IDENTIFIER (if present)
+    2. Backtrack causes (if present)
+    3. All other identifiers (as-is)
+    """
+    provider = PipProvider(
+        factory=factory,
+        constraints={},
+        ignore_dependencies=False,
+        upgrade_strategy="to-satisfy-only",
+        user_requested={},
+    )
+
+    result = provider.narrow_requirement_selection(
+        identifiers, {}, {}, {}, backtrack_causes
+    )
+
+    assert list(result) == expected, f"Expected {expected}, got {list(result)}"
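
The shortened preference tuples in the updated `test_get_preference` cases imply an ordering, because Python compares tuples element by element. The sketch below sorts the new expected tuples copied from the test cases above. It assumes the resolver tries the identifier with the smallest preference first, which is an assumption about resolvelib's behaviour and is not shown in this diff.

```python
import math

# New-style preference tuples, copied from the updated test expectations:
# (not direct, not pinned, requested_order, not unfree, identifier)
preferences = [
    (False, True, math.inf, True, "free-package"),
    (False, True, math.inf, False, "unfree-package"),
    (False, True, 1, True, "root-package"),
    (False, True, math.inf, True, "backtrack-package"),
    (False, True, math.inf, False, "star-specified-package"),
    (False, False, math.inf, False, "pinned-package"),
]

# Tuples compare lexicographically, so earlier fields dominate later ones.
order = [pref[-1] for pref in sorted(preferences)]
for name in order:
    print(name)
```

Note that `backtrack-package` no longer gets a dedicated slot in the tuple: after this change, backtrack causes are prioritized by `narrow_requirement_selection` before `get_preference` is consulted.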
