Collaborator

@tiran tiran commented Sep 29, 2025

Redesign the parallel build command on top of #763 and #795. The tracking dependency sorter handles the logic to return new buildable nodes as soon as all their build dependencies are done.

@tiran tiran requested a review from a team as a code owner September 29, 2025 09:07
@mergify mergify bot added the ci label Sep 29, 2025
@tiran tiran force-pushed the build-parallel-redesign branch from b48abd5 to 1ea4d8b Compare September 29, 2025 09:22
@tiran tiran marked this pull request as draft September 29, 2025 09:22
Member

@dhellmann dhellmann left a comment

I'm starting to think that instead of reproducing the topology of the graph, we should include order data in the graph file based on the bootstrap and then use that order.

Iterate over the nodes in the order they were built by the bootstrap step. For each node, if its build dependencies are available, queue it to build and remove it from the to-be-built list. Repeat until the list is empty.

That algorithm would give us an order that we know is buildable, because it is the order in which things were actually built during the bootstrap step. So we would just be parallelizing based on what has already been built, and not trying to figure out how to get the same traversal order out of the graph a second time.
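The loop described above can be sketched as follows. This is only an illustration of the idea, not code from this PR: `bootstrap_order` and `build_deps` are hypothetical inputs standing in for the order data and build-dependency edges that would be stored in the graph file, and each yielded batch is assumed to be fully built before the next pass.

```python
from collections.abc import Iterator


def build_rounds(
    bootstrap_order: list[str],
    build_deps: dict[str, set[str]],
) -> Iterator[list[str]]:
    """Yield batches of nodes whose build dependencies are already built.

    Both arguments are hypothetical: ``bootstrap_order`` is the order in
    which the bootstrap step built the nodes, ``build_deps`` maps a node
    to the set of nodes it needs at build time.
    """
    remaining = list(bootstrap_order)
    built: set[str] = set()
    while remaining:
        # Walk the list in bootstrap order; a node is ready when all of
        # its build dependencies have been built in an earlier batch.
        ready = [n for n in remaining if build_deps.get(n, set()) <= built]
        if not ready:
            raise RuntimeError(f"no buildable node among {remaining}")
        yield ready
        # Assumption: the caller finishes the whole batch before resuming.
        built.update(ready)
        remaining = [n for n in remaining if n not in built]
```

With the order `["a", "b", "c", "d", "e"]` and build dependencies `b -> a`, `d -> a, b`, `e -> c`, the batches come out as `["a", "c"]`, `["b", "e"]`, `["d"]`. A real implementation would probably queue nodes as individual builds finish instead of waiting for a full batch.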

def __bool__(self) -> bool:
    return self.is_active()

def get_available(self) -> set[DependencyNode]:
Member

I avoided having to wrap the sorter like this in #794 by having the user of the topological sorter split the ready nodes into exclusive and non-exclusive lists and then yield them separately. That is the equivalent of making this method a generator that yields lists of nodes to build.
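A minimal sketch of that generator approach, built directly on the standard library's `graphlib.TopologicalSorter`. It is not the code from #794: the function name `iter_batches`, the plain string nodes, and the `exclusive` set are assumptions made for illustration, and the sketch assumes the consumer finishes each batch before asking for the next one.

```python
from collections.abc import Iterator
from graphlib import TopologicalSorter


def iter_batches(
    graph: dict[str, set[str]],
    exclusive: set[str],
) -> Iterator[list[str]]:
    """Yield lists of nodes to build; exclusive nodes get a list of their own.

    ``graph`` maps each node to the set of its build dependencies.
    """
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    while sorter.is_active():
        ready = sorter.get_ready()
        parallel = sorted(n for n in ready if n not in exclusive)
        alone = sorted(n for n in ready if n in exclusive)
        if parallel:
            yield parallel
        for node in alone:
            yield [node]
        # Assumption: every yielded batch was built before we resume here.
        sorter.done(*ready)
```

For the graph `{"b": {"a"}, "c": {"a"}, "e": {"a"}, "d": {"b", "c"}}` with `c` marked exclusive, this yields `["a"]`, `["b", "e"]`, `["c"]`, `["d"]`.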

# it's a new ``[build-system].requires``.
yield edge.destination_node
# recursively get install dependencies of this build dep (depth first).
for install_edge in self._traverse_install_requirements(
Member

The installation requirements of A are not part of its build requirements. We will be ready to build A before these things are built and, as you've pointed out elsewhere, there will be cases where we must, because the installation dependencies can have cycles. This is why in #794 I separate the installation dependencies from the topological sorter.

Collaborator Author

I think you are still misunderstanding the algorithm. Or I'm not understanding your point. This line is getting the installation requirement of a build requirement. The purpose of the code is to reconstruct which packages are necessary for the build environment of a package.
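To make the distinction concrete, here is a small sketch of how a build environment could be reconstructed: the direct build requirements of a package plus the recursive installation requirements of each of those build requirements. It is not the PR's `DependencyNode` code; `build_environment` and the two mapping arguments are hypothetical names chosen for the example.

```python
def build_environment(
    pkg: str,
    build_reqs: dict[str, set[str]],
    install_reqs: dict[str, set[str]],
) -> set[str]:
    """Return everything that must be installed before ``pkg`` can be built.

    ``build_reqs`` and ``install_reqs`` are hypothetical mappings from a
    package to its direct build and installation requirements.
    """
    env: set[str] = set()
    # Start from the direct build requirements of the package ...
    stack = list(build_reqs.get(pkg, ()))
    while stack:
        node = stack.pop()
        if node in env:
            continue  # already visited; this also breaks install cycles
        env.add(node)
        # ... and follow installation requirements depth first.
        stack.extend(install_reqs.get(node, ()))
    return env
```

With `build_reqs = {"a": {"b"}}` and `install_reqs = {"a": {"x"}, "b": {"c"}, "c": {"d"}, "d": {"c"}}`, the build environment of `a` is `{"b", "c", "d"}`. The installation requirement `x` of `a` itself is not included, and the install cycle between `c` and `d` does not cause an endless loop.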

tiran added 5 commits October 6, 2025 16:42
Extend `DependencyNode` to get all install dependencies and build
requirements. The new methods return unique dependencies by recursively
walking the dependency graph. The build requirements include all
recursive installation requirements of build requirements.

Signed-off-by: Christian Heimes <[email protected]>
`TopologicalSorter.get_ready()` returns a node only once. The
tracking topological sorter keeps track of which nodes are marked as done.
The `get_available()` method returns nodes again and again, until
they are marked as done. The graph is active until all nodes are marked
as done.

Individual nodes can be marked as exclusive nodes. ``get_available``
treats exclusive nodes specially and returns:

1. one or more non-exclusive nodes
2. exactly one exclusive node that is a predecessor of another node
3. exactly one exclusive node

The class uses a lock for ``is_active``, ``get_available``, and ``done``,
so the methods can be used from a thread pool and from future callbacks.

Signed-off-by: Christian Heimes <[email protected]>
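
The behaviour described in this commit message can be approximated with a thin wrapper around `graphlib.TopologicalSorter`. The sketch below is a simplified stand-in, not the class from this PR: the name `TrackingSorter` and the string nodes are assumptions, and it hands out any single exclusive node without preferring one that is a predecessor of another node. Making sure an exclusive node actually runs alone is left to the caller.

```python
import threading
from graphlib import TopologicalSorter


class TrackingSorter:
    """Hypothetical, simplified tracking sorter (not the PR's class)."""

    def __init__(self, graph: dict[str, set[str]], exclusive: set[str]) -> None:
        self._sorter = TopologicalSorter(graph)
        self._sorter.prepare()
        self._exclusive = set(exclusive)
        # Nodes returned by get_ready() that are not marked as done yet.
        self._pending: set[str] = set()
        self._lock = threading.Lock()

    def is_active(self) -> bool:
        with self._lock:
            # True while nodes are still pending or not yet handed out.
            return self._sorter.is_active()

    def get_available(self) -> set[str]:
        with self._lock:
            # get_ready() returns a node only once, so remember it here
            # and keep returning it until done() is called for it.
            self._pending.update(self._sorter.get_ready())
            regular = self._pending - self._exclusive
            if regular:
                return regular
            # Only exclusive nodes are left: return exactly one of them.
            return set(sorted(self._pending)[:1])

    def done(self, *nodes: str) -> None:
        with self._lock:
            self._sorter.done(*nodes)
            self._pending.difference_update(nodes)
```

Calling `get_available()` twice in a row returns the same nodes until `done()` is called for them, which lets a scheduler poll the sorter from a future callback without tracking handed-out nodes itself.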
The `DependencyGraph.get_build_topology` method returns a
`TrackingTopologicalSorter` with all nodes in the graph. Each node
tracks its build dependency set. Nodes for a package with exclusive
builds are flagged as exclusive.

Signed-off-by: Christian Heimes <[email protected]>
Re-implement the `build-parallel` command on top of `get_build_topology`
and `TrackingTopologicalSorter`.

Signed-off-by: Christian Heimes <[email protected]>
Signed-off-by: Christian Heimes <[email protected]>
@tiran tiran force-pushed the build-parallel-redesign branch from 1ea4d8b to d597d25 Compare October 6, 2025 14:42