Description
To solve #755, Fromager needs an API that returns the list of packages that were present in the build environment of a package. The graph file only contains the top-level build requirements from pyproject.toml [build-system].requires. Dependencies of build requirements are not listed as dependencies of a package.
build vs. install dependency
It is important to keep in mind that build dependencies and install dependencies of a package must be handled as separate graphs. In the general case, Python package dependencies are NOT DAGs. Installation dependencies can be cyclic. Circular dependencies are explicitly supported by Python packaging, pip, and uv. They occur in practice. I think that some of our downstream packages have cyclic dependency graphs.
So far we got lucky that build dependencies are directed acyclic graphs. Packages have far fewer build dependencies than installation dependencies. Most packages depend on flit, hatchling, setuptools, and a few other packages. In general, developers are more careful about their build systems.
example
torchvision needs torch as a build system requirement. The dependency graph entry for torchvision lists torch as a build-system requirement. The entry does not contain any installation dependencies of torch. torchvision needs to wait for torch and all of torch's installation requirements to become ready before it can be built.
Among others, torch has an installation requirement on astunparse, which itself has an installation requirement on six. torch does not have a build requirement on astunparse. In this simplified example, Fromager can build torch and six in parallel. When six is done, it can build astunparse. Once astunparse, torch, and six are ready, Fromager can start to build torchvision.
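The build order described above can be sketched with the standard library's graphlib. This is only an illustration of the example, not Fromager code: the must_wait_for mapping and the build_waves() helper are hypothetical names, and the mapping is written by hand to mirror the paragraph.

```python
from graphlib import TopologicalSorter

# Hypothetical, hand-written "must be ready before" map for the example.
# Keys are packages, values are the packages they have to wait for.
must_wait_for = {
    "torch": set(),
    "six": set(),
    "astunparse": {"six"},
    "torchvision": {"torch", "astunparse", "six"},
}


def build_waves(graph):
    """Group packages into waves that could be built in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose predecessors are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = build_waves(must_wait_for)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

The first wave contains torch and six, the second astunparse, and the last torchvision.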
problem transformation
There are multiple ways to think about the dependency problem.
One approach is to use separate topologically sorted graphs for build dependencies and installation dependencies. The build graph tracks package -> build system requirements, the install graph tracks package -> install requirements. An installation dependency becomes ready to build when all its build dependencies are done. A build dependency becomes ready when all of its installation dependencies are done.
Another approach is to have one graph where each node tracks package -> build system requirements + recursive installation requirements of build system requirements. That's what PR #763 and #796 are doing. With #796 the torchvision node has the predecessors {torch, astunparse, six}.
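A minimal sketch of how the predecessors of a node could be computed in the single-graph approach. It is not the code from #763 or #796; install_closure(), build_predecessors(), and the two dictionaries are hypothetical stand-ins for the graph data. The traversal keeps a seen set so that cyclic installation dependencies do not cause an endless loop.

```python
def install_closure(package, install_requires):
    """Return the package plus all of its recursive install requirements."""
    seen = set()
    stack = [package]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited, this also breaks install cycles
        seen.add(current)
        stack.extend(install_requires.get(current, ()))
    return seen


def build_predecessors(package, build_requires, install_requires):
    """Build system requirements plus their recursive install requirements."""
    predecessors = set()
    for requirement in build_requires.get(package, ()):
        predecessors |= install_closure(requirement, install_requires)
    return predecessors


# Hypothetical data for the torchvision example.
build_requires = {"torchvision": {"torch"}}
install_requires = {"torch": {"astunparse"}, "astunparse": {"six"}}

predecessors = build_predecessors("torchvision", build_requires, install_requires)
print(sorted(predecessors))

# A cyclic install graph still terminates.
cyclic = install_closure("a", {"a": {"b"}, "b": {"a"}})
```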
We could make the dependencies explicit and include the indirect build dependencies in the graph as a new edge type build-dependency. A build-dependency is any dependency of a build requirement that is not already a build-system, build-backend, or build-sdist dependency. BuildEnvironment.get_distributions() returns all installed packages as Mapping[str, Version]. The mapping should be equivalent to the set of packages from #763's iter_build_requirements() method. It's redundant information, but maybe useful to have.
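As a rough sketch of that definition, the build-dependency edges could be derived by subtracting the direct build requirements from the packages found in the build environment. Everything here is an assumption for illustration: indirect_build_dependencies() is a hypothetical helper, the installed mapping stands in for what the proposed BuildEnvironment.get_distributions() would return, plain strings replace Version objects, and the version numbers are made up.

```python
from typing import Mapping

# Edge types that already describe direct build requirements.
DIRECT_BUILD_TYPES = ("build-system", "build-backend", "build-sdist")


def indirect_build_dependencies(
    installed: Mapping[str, str],
    direct: Mapping[str, set[str]],
) -> dict[str, str]:
    """Return installed packages that are not direct build requirements."""
    direct_names: set[str] = set()
    for req_type in DIRECT_BUILD_TYPES:
        direct_names |= direct.get(req_type, set())
    return {
        name: version
        for name, version in installed.items()
        if name not in direct_names
    }


# Hypothetical build environment of torchvision (versions are made up).
installed = {
    "torch": "2.0.0",
    "astunparse": "1.0.0",
    "six": "1.0.0",
    "setuptools": "70.0.0",
}
direct = {"build-system": {"torch", "setuptools"}}

indirect = indirect_build_dependencies(installed, direct)
print(sorted(indirect))
```

In this toy data, astunparse and six would become build-dependency edges of torchvision, while torch and setuptools stay build-system edges.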
addendum
If we decide to extend the graph, then I would also like to add two additional flags to the graph:
- is_toplevel_dependency: bool: is the package an installation requirement of a top-level package? Some packages like flit or hatchling should not be needed at installation time.
- is_build_dependency: bool: is the package a direct or indirect build dependency of any package?
Both flags can be true at the same time, e.g. torch is both a build and install dependency of torchvision. The information is redundant as the flags can be reconstructed from the dependency edges, too.
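A sketch of how both flags could be reconstructed from the edges, under a few assumptions: edges are plain (parent, child, type) tuples, the edge type names "install" and "build-dependency" are used as shown, and is_toplevel_dependency is read as "reachable from a top-level package through install edges". compute_flags() is a hypothetical helper, not an existing Fromager function.

```python
BUILD_TYPES = {"build-system", "build-backend", "build-sdist", "build-dependency"}


def compute_flags(edges, toplevel):
    """Derive both flags for every package that appears in the edge list."""
    # Any package that is the target of a build edge is a build dependency.
    build_deps = {child for _, child, req_type in edges if req_type in BUILD_TYPES}

    install_children = {}
    for parent, child, req_type in edges:
        if req_type == "install":
            install_children.setdefault(parent, set()).add(child)

    # Walk install edges from the top-level packages; the set makes it cycle-safe.
    reachable = set()
    stack = [c for top in toplevel for c in install_children.get(top, ())]
    while stack:
        current = stack.pop()
        if current in reachable:
            continue
        reachable.add(current)
        stack.extend(install_children.get(current, ()))

    names = {name for parent, child, _ in edges for name in (parent, child)}
    return {
        name: {
            "is_toplevel_dependency": name in reachable,
            "is_build_dependency": name in build_deps,
        }
        for name in sorted(names)
    }


# Hypothetical edges for the torchvision example.
edges = [
    ("torchvision", "torch", "build-system"),
    ("torchvision", "setuptools", "build-system"),
    ("torchvision", "astunparse", "build-dependency"),
    ("torchvision", "six", "build-dependency"),
    ("torchvision", "torch", "install"),
    ("torch", "astunparse", "install"),
    ("astunparse", "six", "install"),
]
flags = compute_flags(edges, toplevel={"torchvision"})
for name, value in flags.items():
    print(name, value)
```

Here torch ends up with both flags set, while setuptools is only a build dependency.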