refactor: support row/bucket shuffle for aggregation #19155

dqhl76 · 2025-12-25T05:05:54Z

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

This PR includes:

Support for Two Shuffle Methods: Added support for both Row and Bucket shuffle modes. Please refer to the diagram below for a visual comparison of the differences.
Recursive Spilling in Final Aggregate: Enabled the final aggregate processor to spill and restore data recursively. This enhancement does not introduce a performance regression compared to the previous implementation.
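To illustrate what "spill and restore data recursively" can mean, here is a minimal sketch and not the code in this PR. It assumes an in-memory `Vec` stands in for a spill file, the raw key doubles as its hash, and `FANOUT`, `max_groups`, and `aggregate_recursive` are hypothetical names chosen for the example. When a partition holds more distinct groups than the budget allows, it is split into sub-partitions and each one is aggregated again, one level deeper.

```rust
use std::collections::HashMap;

/// Hypothetical fan-out per spill level (an assumption, not a Databend setting).
const FANOUT: u64 = 4;

/// Sums values per key. If a partition holds more than `max_groups` distinct
/// keys, it is split into FANOUT sub-partitions (standing in for spill files)
/// and each sub-partition is aggregated recursively.
pub fn aggregate_recursive(
    rows: Vec<(u64, i64)>,
    max_groups: usize,
    depth: u32,
) -> HashMap<u64, i64> {
    assert!(max_groups >= 1, "budget must allow at least one group");

    // Try to aggregate in memory; stop as soon as the budget is exceeded.
    let mut table: HashMap<u64, i64> = HashMap::new();
    let mut fits = true;
    for &(key, value) in &rows {
        *table.entry(key).or_insert(0) += value;
        if table.len() > max_groups {
            fits = false;
            break;
        }
    }
    if fits {
        return table;
    }

    // "Spill": repartition on the next two bits of the key so that each
    // recursion level separates keys the previous level could not.
    let divisor = FANOUT.pow(depth);
    let mut parts: Vec<Vec<(u64, i64)>> = (0..FANOUT).map(|_| Vec::new()).collect();
    for row in rows {
        let idx = ((row.0 / divisor) % FANOUT) as usize;
        parts[idx].push(row);
    }

    // "Restore": aggregate each sub-partition; key sets are disjoint, so the
    // partial results can simply be merged.
    let mut result = HashMap::new();
    for part in parts {
        result.extend(aggregate_recursive(part, max_groups, depth + 1));
    }
    result
}
```

Calling `aggregate_recursive(rows, 2, 0)` on rows with four distinct keys forces one level of repartitioning before every sub-partition fits the budget of two groups.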

Reason for Dual Shuffle Methods:

Background: In our previous implementation, we used a "growth payload" strategy for the partial aggregation stage. While this dynamic approach was flexible and effective for varying data sizes, it necessitated an alignment processor (TransformPartitionBucket) and made implementing spilling support for the final aggregation difficult.

Solution: We moved to a fixed-bucket strategy determined at plan-time (based on node_num and thread_num). This symmetry between partial and final aggregation simplifies the architecture and enables simpler spilling support.

The Trade-off: With high concurrency (many nodes/threads), a purely fixed-bucket approach creates excessive buckets, degrading partial aggregation performance.

The Fix: We introduce a Row Shuffle method for high-concurrency scenarios. This forces the partial aggregation to use a single bucket, followed by a scatter operation to distribute data to the final aggregation processors, avoiding the overhead of managing too many buckets early in the pipeline.
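The decision described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the planner code from this PR: the `ShuffleMode` enum, the `MAX_BUCKETS` threshold, and both function names are hypothetical, and the real rule for deriving the bucket count from node_num and thread_num may differ.

```rust
/// Hypothetical shuffle modes, named for this example only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ShuffleMode {
    /// Partial aggregation keeps a single bucket; rows are scattered afterwards.
    Row,
    /// Partial aggregation writes into a fixed number of buckets.
    Bucket(usize),
}

/// Assumed cap on the bucket count; the value is illustrative only.
pub const MAX_BUCKETS: usize = 256;

/// Picks a mode at plan time from the cluster shape.
/// Assumption: one bucket per (node, thread) pair in bucket mode.
pub fn choose_shuffle_mode(node_num: usize, thread_num: usize) -> ShuffleMode {
    let buckets = node_num.max(1).saturating_mul(thread_num.max(1));
    if buckets > MAX_BUCKETS {
        // Too many buckets would slow down partial aggregation.
        ShuffleMode::Row
    } else {
        ShuffleMode::Bucket(buckets)
    }
}

/// Row-shuffle scatter: maps a group-key hash to a final aggregation processor.
pub fn scatter_target(hash: u64, final_processors: usize) -> usize {
    assert!(final_processors > 0, "need at least one final processor");
    (hash % final_processors as u64) as usize
}
```

With these assumptions, a single node with 8 threads stays in bucket mode with 8 buckets, while 16 nodes with 64 threads each would need 1024 buckets and therefore falls back to row shuffle.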

Tests

Unit Test
Logic Test
Benchmark Test
No Test - Explain why

Type of change

Bug Fix (non-breaking change which fixes an issue)
New Feature (non-breaking change which adds functionality)
Breaking Change (fix or feature that could cause existing functionality not to work as expected)
Documentation Update
Refactoring
Performance Improvement
Other (please describe):

refactor: refactor spiller row shuffle spill tmp save clean pass bucket partial force spill pass tpch basic pass
feat: add support for partitioned aggregate serialization and deserialization save remove unused activate_worker
refactor: also add stats for final aggregate
feat: implement configurable aggregate shuffle mode
refactor: add basic bucket level shuffled agg
refactor: remove legacy final agg
feat: refactor aggregate function in build_aggregate to make it clear

github-actions · 2026-01-03T09:29:59Z

Docker Image for PR

tag: pr-19155-0907155-1767432417

note: this image tag is only available for internal use.

github-actions · 2026-01-03T10:29:08Z

ClickBench Report

hits: https://benchmark.databend.com/clickbench/pr/19155/20674481839/hits.html
internal: https://benchmark.databend.com/clickbench/pr/19155/20674481839/internal.html
load: https://benchmark.databend.com/clickbench/pr/19155/20674481839/load.html
tpch100: https://benchmark.databend.com/clickbench/pr/19155/20674481839/tpch100.html
tpch1000: https://benchmark.databend.com/clickbench/pr/19155/20674481839/tpch1000.html

fixup

github-actions · 2026-01-06T01:49:26Z

Docker Image for PR

tag: pr-19155-3d71ace-1767663886

note: this image tag is only available for internal use.

github-actions · 2026-01-06T02:52:12Z

ClickBench Report

hits: https://benchmark.databend.com/clickbench/pr/19155/20733784091/hits.html
internal: https://benchmark.databend.com/clickbench/pr/19155/20733784091/internal.html
load: https://benchmark.databend.com/clickbench/pr/19155/20733784091/load.html
tpch100: https://benchmark.databend.com/clickbench/pr/19155/20733784091/tpch100.html
tpch1000: https://benchmark.databend.com/clickbench/pr/19155/20733784091/tpch1000.html

github-actions bot added the pr-refactor label (this PR changes the code base without new features or bugfix) Dec 25, 2025

This was referenced Dec 25, 2025

chore(query): add stats for new final aggregators #19156

Merged

refactor: refactor partition stream in aggregation spiller #19158

Closed

dqhl76 added 6 commits December 25, 2025 16:44

fmt

672b885

enable_experiment_aggregate = 1

fc733c1

license check

df270a0

make clippy happy

40737d1

make clippy happy

70710f5

dqhl76 force-pushed the aggregate-24 branch from 079cef9 to 70710f5 Compare December 25, 2025 08:44

try fix hang when merge join is after agg

4075b59

dqhl76 force-pushed the aggregate-24 branch from 9145a12 to 4075b59 Compare December 29, 2025 07:38

dqhl76 added 14 commits December 29, 2025 16:07

fmt

f427bb6

fmt

d499dc8

temp save

85d6c6e

simplify AggregateShuffleMode

d2f3352

simplify new partial aggregate

29cbbfb

not finish

017cfe4

save

6db6e57

save

75d85b3

fix

c6dd2c1

fix

e7fc7d9

row

156608f

bucket

ac3d821

finish

4bbc54d

fmt + add test

307c3b7

dqhl76 added the ci-benchmark label (Benchmark: run all test) Jan 3, 2026

clean useless branch

b342d56

dqhl76 added 6 commits January 4, 2026 15:45

try support final aggregate spill

7cc05d6

feat: support final aggregate spill

b7b9866

forget on_finish for output triggered finish

01ce590

shuffle mode determination for not cluster aggregation

c80335f

add more test

0be800a

fixup

fix

0b255dd

dqhl76 added and removed the ci-benchmark label (Benchmark: run all test) Jan 6, 2026

dqhl76 added 3 commits January 6, 2026 20:31

add more test

82c47d5

remove debug info

73888eb

disable for now, prefer enable in another PR

748c33c

dqhl76 marked this pull request as ready for review January 7, 2026 06:11

Merge branch 'main' into aggregate-24

5c6a707

dqhl76 mentioned this pull request Jan 8, 2026

refactor: optimize final aggregation for large datasets #19208

Draft

11 tasks
