Inline consolidate fast-path #629
Conversation
Gives the compiler the opportunity to inline consolidate calls for small amounts of data by separating the complex consolidate logic for more than one update into a different function. We don't mark that function as `cold` or `inline(never)` because we want to leave the decision to the compiler.

In local testing, this shaved a few percent off the spines example. With the patch:

```
Running ["new"]
arrangement 5.682871841s
loading complete 14.032609636s
queries complete 14.037398497s
shut down
```

With current master:

```
Running ["new"]
arrangement 6.010566878s
loading complete 14.673966926s
queries complete 14.678923984s
shut down
```

Signed-off-by: Moritz Hoffmann <[email protected]>
Pull Request Overview
This PR optimizes the consolidation functions by separating fast-path logic for small datasets from the complex consolidation logic for larger datasets. The optimization enables the compiler to inline the fast-path consolidation calls, resulting in improved performance for common cases with small amounts of data.
Key changes:
- Added inline hints to public consolidation functions to encourage inlining
- Extracted complex consolidation logic into separate "slow" functions for slices with more than one element
- Restructured control flow to prioritize the simple case handling for single-element or empty slices
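A rough sketch of the resulting shape, for orientation only: this is not the PR's exact code, it uses a simplified stand-in for the crate's `Semigroup` trait, and the general path below is a safe, illustrative version of the compaction loop rather than the crate's tuned implementation.

```rust
/// Simplified stand-in for the crate's `Semigroup` trait, just for this sketch.
pub trait Semigroup {
    fn plus_equals(&mut self, rhs: &Self);
    fn is_zero(&self) -> bool;
}

impl Semigroup for i64 {
    fn plus_equals(&mut self, rhs: &Self) { *self += *rhs; }
    fn is_zero(&self) -> bool { *self == 0 }
}

/// Small, inlinable wrapper: empty and single-element slices are handled
/// directly; everything else goes to the general function.
#[inline]
pub fn consolidate_slice<T: Ord, R: Semigroup>(slice: &mut [(T, R)]) -> usize {
    if slice.len() > 1 {
        consolidate_slice_slow(slice)
    } else {
        // At most one element: count it only if its difference is non-zero.
        slice.iter().filter(|x| !x.1.is_zero()).count()
    }
}

/// General path for slices of length greater than 1: sort by key, merge
/// adjacent equal keys, and compact the non-zero entries to the front.
fn consolidate_slice_slow<T: Ord, R: Semigroup>(slice: &mut [(T, R)]) -> usize {
    slice.sort_by(|x, y| x.0.cmp(&y.0));
    let mut offset = 0;
    for index in 1..slice.len() {
        if slice[offset].0 == slice[index].0 {
            // Same key: fold this difference into the entry at `offset`.
            let (lo, hi) = slice.split_at_mut(index);
            lo[offset].1.plus_equals(&hi[0].1);
        } else {
            // New key: keep the previous entry if it accumulated to non-zero.
            if !slice[offset].1.is_zero() {
                offset += 1;
            }
            slice.swap(offset, index);
        }
    }
    if offset < slice.len() && !slice[offset].1.is_zero() {
        offset += 1;
    }
    offset
}
```

The point of the split is that the wrapper body is tiny, so the compiler is free to inline it at call sites where slices are usually empty or single-element, while the sort-and-merge work stays out of line.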
// In a world where there are not many results, we may never even need to call in to merge sort.
slice.sort_by(|x,y| x.0.cmp(&y.0));
/// Part of `consolidate_slice` that handles slices of length greater than 1.
fn consolidate_slice_slow<T: Ord, R: Semigroup>(slice: &mut [(T, R)]) -> usize {
Copilot AI · Jul 23, 2025
The function consolidate_slice_slow should be marked as pub(crate) or remain private. Making it public exposes an implementation detail that users shouldn't call directly.
Naming nit, but it's not that it's slow, it's just general. There isn't a fast version; there's just a less general version.
// In a world where there are not many results, we may never even need to call in to merge sort.
slice.sort_unstable_by(|x,y| (&x.0, &x.1).cmp(&(&y.0, &y.1)));
/// Part of `consolidate_updates_slice` that handles slices of length greater than 1.
pub fn consolidate_updates_slice_slow<D: Ord, T: Ord, R: Semigroup>(slice: &mut [(D, T, R)]) -> usize {
Copilot AI · Jul 23, 2025
The function consolidate_updates_slice_slow is marked as public but appears to be an implementation detail. Consider making it pub(crate) or private to avoid exposing internal implementation.
Suggested change:
- pub fn consolidate_updates_slice_slow<D: Ord, T: Ord, R: Semigroup>(slice: &mut [(D, T, R)]) -> usize {
+ fn consolidate_updates_slice_slow<D: Ord, T: Ord, R: Semigroup>(slice: &mut [(D, T, R)]) -> usize {
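With that suggestion applied, the updates variant would end up shaped much like the pairwise one. A brief illustrative sketch, again reusing the simplified `Semigroup` stand-in from the earlier sketch and a safe stand-in for the merge loop rather than the crate's actual code:

```rust
/// Inlinable wrapper for update triples `(data, time, diff)`; at most one
/// update is handled here, everything else goes to the private general path.
#[inline]
pub fn consolidate_updates_slice<D: Ord, T: Ord, R: Semigroup>(slice: &mut [(D, T, R)]) -> usize {
    if slice.len() > 1 {
        consolidate_updates_slice_slow(slice)
    } else {
        // Zero or one update: count the non-zero diffs directly.
        slice.iter().filter(|x| !x.2.is_zero()).count()
    }
}

/// General path: sort by `(data, time)`, merge equal pairs, compact non-zeros.
fn consolidate_updates_slice_slow<D: Ord, T: Ord, R: Semigroup>(slice: &mut [(D, T, R)]) -> usize {
    slice.sort_unstable_by(|x, y| (&x.0, &x.1).cmp(&(&y.0, &y.1)));
    let mut offset = 0;
    for index in 1..slice.len() {
        if slice[offset].0 == slice[index].0 && slice[offset].1 == slice[index].1 {
            let (lo, hi) = slice.split_at_mut(index);
            lo[offset].2.plus_equals(&hi[0].2);
        } else {
            if !slice[offset].2.is_zero() {
                offset += 1;
            }
            slice.swap(offset, index);
        }
    }
    if offset < slice.len() && !slice[offset].2.is_zero() {
        offset += 1;
    }
    offset
}
```

Here the helper is private, so the only public surface is the small inlinable wrapper.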
Signed-off-by: Moritz Hoffmann <[email protected]>
frankmcsherry left a comment
Seems good! I had some nits that we can follow up with async.
        consolidate_slice_slow(slice)
    }
    else {
        slice.iter().filter(|x| !x.1.is_zero()).count()
Probably not the most important thing, but I wonder if there is a way to write this that is more helpful to Rust. Like, rather than `iter` and hoping Rust realizes the slices are either len 0 or len 1, we could `get(0)` and `unwrap_or(0)`?
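A minimal sketch of what that suggestion might look like, purely illustrative and not code from the PR; `count_nonzero_short` is a hypothetical helper name, and `Semigroup` is the simplified stand-in from the sketch earlier on this page.

```rust
/// Hypothetical phrasing of the length <= 1 fast path using `get(0)`,
/// so the compiler doesn't have to reason about an iterator pipeline.
fn count_nonzero_short<T, R: Semigroup>(slice: &[(T, R)]) -> usize {
    slice
        .get(0)
        .map(|x| if x.1.is_zero() { 0 } else { 1 })
        .unwrap_or(0)
}
```

Whether this actually helps codegen would need to be checked; for a slice the optimizer can already prove has at most one element, both forms may compile to the same thing.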
Ah, I see these were inherited from the previous implementation! We can explore this in follow-up work, or ignore it if there are no improvements!