-
Couldn't load subscription status.
- Fork 31
Improve DependencyNode, get recursive build/install dependencies. #763
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@dhellmann This MR implements the logic to get build and install requirements of a DependencyNode. |
Extend `DependencyNode` to get all install dependencies and build requirements. The new method return unique dependencies by recursively walking the dependency graph. The build requirements include all recursive installation requirements of build requirements. Signed-off-by: Christian Heimes <[email protected]>
| c.add_child(f, Requirement(f.canonicalized_name), RequirementType.BUILD_BACKEND) | ||
|
|
||
| assert sorted(a.iter_install_requirements()) == [b, e] | ||
| assert sorted(a.iter_build_requirements()) == [c, e, f] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect to see D in this list because D is a build requirement of C. We can't build A until we can install C, and we can't install C until we can build it.
Like I think I said on another review, we have the same issue here that we had with determining whether something was a build dependency during bootstrap. There, the package "inherits" the "build dependency" type from the why chain. If anything earlier in the why chain is a build dependency, then so is the current package.
In the bootstrap process we build packages using in-order traversal. We build the build dependencies, then the package, then its install dependencies. That separates the dependency types and allows us to cope with potential circular dependencies in installation dependencies, since those are not forbidden.
I think we need to apply the same logic when building in parallel. That is going to potentially limit the amount we do in parallel, but I think that's required to achieve a correct build, without deadlocking on cyclic dependencies or trying to use something that isn't available.
I think that means keeping track of 2 things in build-parallel for each package: Is the package ready to be built, and is the package ready to be used to build other packages. That will complicate the logic for deciding what work to do, and while a literal topological traversal will give us the right order for sequential build I don't think it gives us parallel builds on its own.
The current implementation in the main branch doesn't attempt to traverse the graph explicitly. During each pass it iterates over the set of all unbuilt nodes and asks if the build dependencies for the node have been built. If so, it adds the current node to the set to be built. I think we need to extend that logic so that for each build dependency we check not only that it has been built, but that its installation dependencies are built. Only at that point can we say that the build dependencies are ready to be used to build the current node.
We could build that using the topological sorter by having the sorter tell us when a package is ready to be built, but then keeping those built nodes in a separate list and waiting to mark them as done in the graph when the installation dependencies are also built.
Alternatively, we could fix the loop we have already to keep track of the same information.
I'm not sure which will be easier to follow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done some refactoring of the existing logic and added install dependency checking in #794
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect to see D in this list because D is a build requirement of C. We can't build A until we can install C, and we can't install C until we can build it.
No, A does not need D to build. A only needs C+E+F to build. The indirect dependencies A -> C -> D is handled by the topology graph. C does not become available until D is built. Because C has a dependency on D and A has a dependency on C, the topology converges on the solution D, then C, then A.
Convert the
DependencyNodeandDependencyEdgeinto data classes. The classes are frozen (immutable) and support comparison, sorting, and hashing.Extend
DependencyNodeto get all install dependencies and build requirements. The new method return unique dependencies by recursively walking the dependency graph. The build requirements include all recursive installation requirements of build requirements.