Skip to content

Conversation

@antiguru
Copy link
Member

Gives the compiler the opportunity to inline consolidate calls for small
amounts of data by separating the complex consolidate logic for more than
one update to a different function. We don't mark the function as cold or
inline never because we want to leave the decision to the compiler.

In local testing, this shaved a few percentages of the spines example. With
the patch:

Running ["new"] arrangement
5.682871841s    loading complete
14.032609636s   queries complete
14.037398497s   shut down

With current master:

Running ["new"] arrangement
6.010566878s    loading complete
14.673966926s   queries complete
14.678923984s   shut down

Signed-off-by: Moritz Hoffmann [email protected]

Gives the compiler the opportunity to inline consolidate calls for small
amounts of data by separating the complex consolidate logic for more than
one update to a different function. We don't mark the function as cold or
inline never because we want to leave the decision to the compiler.

In local testing, this shaved a few percentages of the spines example. With
the patch:
```
Running ["new"] arrangement
5.682871841s    loading complete
14.032609636s   queries complete
14.037398497s   shut down
```

With current master:
```
Running ["new"] arrangement
6.010566878s    loading complete
14.673966926s   queries complete
14.678923984s   shut down
```

Signed-off-by: Moritz Hoffmann <[email protected]>
@antiguru antiguru requested review from Copilot and frankmcsherry and removed request for Copilot July 23, 2025 14:16
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR optimizes the consolidation functions by separating fast-path logic for small datasets from the complex consolidation logic for larger datasets. The optimization enables the compiler to inline the fast-path consolidation calls, resulting in improved performance for common cases with small amounts of data.

Key changes:

  • Added inline hints to public consolidation functions to encourage inlining
  • Extracted complex consolidation logic into separate "slow" functions for slices with more than one element
  • Restructured control flow to prioritize the simple case handling for single-element or empty slices

// In a world where there are not many results, we may never even need to call in to merge sort.
slice.sort_by(|x,y| x.0.cmp(&y.0));
/// Part of `consolidate_slice` that handles slices of length greater than 1.
fn consolidate_slice_slow<T: Ord, R: Semigroup>(slice: &mut [(T, R)]) -> usize {
Copy link

Copilot AI Jul 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function consolidate_slice_slow should be marked as pub(crate) or remain private. Making it public exposes an implementation detail that users shouldn't call directly.

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming nit but, it's not that it's slow it's just general. There isn't a fast version, there's just a less general version.

// In a world where there are not many results, we may never even need to call in to merge sort.
slice.sort_unstable_by(|x,y| (&x.0, &x.1).cmp(&(&y.0, &y.1)));
/// Part of `consolidate_updates_slice` that handles slices of length greater than 1.
pub fn consolidate_updates_slice_slow<D: Ord, T: Ord, R: Semigroup>(slice: &mut [(D, T, R)]) -> usize {
Copy link

Copilot AI Jul 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function consolidate_updates_slice_slow is marked as public but appears to be an implementation detail. Consider making it pub(crate) or private to avoid exposing internal implementation.

Suggested change
pub fn consolidate_updates_slice_slow<D: Ord, T: Ord, R: Semigroup>(slice: &mut [(D, T, R)]) -> usize {
fn consolidate_updates_slice_slow<D: Ord, T: Ord, R: Semigroup>(slice: &mut [(D, T, R)]) -> usize {

Copilot uses AI. Check for mistakes.
Signed-off-by: Moritz Hoffmann <[email protected]>
Copy link
Member

@frankmcsherry frankmcsherry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good! I had some nits that we can follow up with async.

consolidate_slice_slow(slice)
}
else {
slice.iter().filter(|x| !x.1.is_zero()).count()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not the most important thing, but I wonder if there is a way to write this that is more helpful to Rust. Like, rather than iter and hope Rust realizes the slices are either len 0 or len 1, we could get(0) and unwrap_or(0)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see these were inherited from the previous implementation! We can explore in follow-up work, or .. ignore if no improvements!

// In a world where there are not many results, we may never even need to call in to merge sort.
slice.sort_by(|x,y| x.0.cmp(&y.0));
/// Part of `consolidate_slice` that handles slices of length greater than 1.
fn consolidate_slice_slow<T: Ord, R: Semigroup>(slice: &mut [(T, R)]) -> usize {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming nit but, it's not that it's slow it's just general. There isn't a fast version, there's just a less general version.

@frankmcsherry frankmcsherry merged commit 80ed856 into TimelyDataflow:master Jul 26, 2025
5 checks passed
@github-actions github-actions bot mentioned this pull request Jul 26, 2025
@antiguru antiguru deleted the outlined_consolidate branch July 26, 2025 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants