Incremental group emission in HashAggregate #19562

ahmed-mez · 2025-12-30T13:13:45Z

Which issue does this PR close?

Closes Long poll (async runtime stall) in HashAggregate when emitting large number of groups #18907

Rationale for this change

This is another attempt to solve the long polls observed and explained in #18907. Mostly inspired by @alamb's comment #18906 (comment) on the first attempt PR.

What changes are included in this PR?

This PR is an exploratory implementation to gauge the scope and complexity of supporting incremental group emission in GroupedHashAggregateStream. The goal is to understand how large this change would be and identify a path to ship it in smaller, reviewable increments.

Are these changes tested?

Test coverage from preexisting aggregate tests
Benchmark incremental_emit included to prove the benefits of these changes

Are there any user-facing changes?

alamb · 2026-01-05T20:53:32Z

run benchmarks

alamb-ghbot · 2026-01-05T20:53:40Z

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing ahmed/long-poll-take-2 (224ce3a) to a51e3a0 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

ahmed-mez · 2026-01-06T09:29:06Z

Thank you @alamb for triggering the benchmarks. The results haven't been posted by the bot yet, I wonder if it's expected.

alamb-ghbot · 2026-01-06T18:58:38Z

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing ahmed/long-poll-take-2 (224ce3a) to a51e3a0 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

alamb · 2026-01-06T18:58:56Z

Thank you @alamb for triggering the benchmarks. The results haven't been posted by the bot yet, I wonder if it's expected.

No I am not sure what is going on -- my runner script died. I just restarted it and hopefully we'll get some info

alamb · 2026-01-06T19:39:18Z

The reason the benchmarks are failing I think is that that they have basically slowed down to a crawl.

SELECT "WatchID", "ClientIP", COUNT(*) AS c, SUM("IsRefresh"), AVG("ResolutionWidth") FROM hits GROUP BY "WatchID", "ClientIP" ORDER BY c DESC LIMIT 10;

Query 32 iteration 0 took 147686.4 ms and returned 10 rows

I killed it after this

Perhaps you can run the benchmarks locally

benchmarks/bench.sh data clickbench_partitioned
benchmarks/bench.sh run clickbench_partitioned

ahmed-mez · 2026-01-06T20:23:53Z

The reason the benchmarks are failing I think is that that they have basically slowed down to a crawl.
SELECT "WatchID", "ClientIP", COUNT(*) AS c, SUM("IsRefresh"), AVG("ResolutionWidth") FROM hits GROUP BY "WatchID", "ClientIP" ORDER BY c DESC LIMIT 10;

Query 32 iteration 0 took 147686.4 ms and returned 10 rows
I killed it after this

Perhaps you can run the benchmarks locally
benchmarks/bench.sh data clickbench_partitioned
benchmarks/bench.sh run clickbench_partitioned

Oh that's interesting! That benchmark query runs much more faster locally (same branch and commit)

SELECT "WatchID", "ClientIP", COUNT(*) AS c, SUM("IsRefresh"), AVG("ResolutionWidth") FROM hits GROUP BY "WatchID", "ClientIP" ORDER BY c DESC LIMIT 10;

Query 32 iteration 0 took 11717.4 ms and returned 10 rows
Query 32 iteration 1 took 11660.8 ms and returned 10 rows
Query 32 iteration 2 took 11855.8 ms and returned 10 rows
Query 32 iteration 3 took 12496.2 ms and returned 10 rows
Query 32 iteration 4 took 12555.9 ms and returned 10 rows
Query 32 avg time: 12057.21 ms

I'll keep digging and try to compare to base locally.
Apart from that, is the failed benchmark CI job accessible somehow for further debugging?

ahmed-mez · 2026-01-06T20:37:18Z

This is the comparison output and it's not great 😞 At first glance mainly topk queries are impacted by the perf regression

Comparing main and ahmed_long-poll-take-2
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       main ┃ ahmed_long-poll-take-2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │    0.87 ms │                0.82 ms │ +1.06x faster │
│ QQuery 1     │    8.58 ms │                8.22 ms │     no change │
│ QQuery 2     │   29.57 ms │               28.78 ms │     no change │
│ QQuery 3     │   27.93 ms │               24.25 ms │ +1.15x faster │
│ QQuery 4     │  185.18 ms │              289.26 ms │  1.56x slower │
│ QQuery 5     │  253.48 ms │             2514.65 ms │  9.92x slower │
│ QQuery 6     │    0.90 ms │                0.85 ms │ +1.06x faster │
│ QQuery 7     │    8.87 ms │                9.46 ms │  1.07x slower │
│ QQuery 8     │  215.91 ms │              535.34 ms │  2.48x slower │
│ QQuery 9     │  352.77 ms │              351.77 ms │     no change │
│ QQuery 10    │   59.07 ms │               66.03 ms │  1.12x slower │
│ QQuery 11    │   70.65 ms │               74.87 ms │  1.06x slower │
│ QQuery 12    │  246.81 ms │             2539.77 ms │ 10.29x slower │
│ QQuery 13    │  335.89 ms │              955.45 ms │  2.84x slower │
│ QQuery 14    │  221.80 ms │              332.14 ms │  1.50x slower │
│ QQuery 15    │  207.82 ms │              454.78 ms │  2.19x slower │
│ QQuery 16    │  455.08 ms │             2316.21 ms │  5.09x slower │
│ QQuery 17    │  441.56 ms │             1630.44 ms │  3.69x slower │
│ QQuery 18    │  898.82 ms │             9670.17 ms │ 10.76x slower │
│ QQuery 19    │   20.34 ms │               21.70 ms │  1.07x slower │
│ QQuery 20    │  470.21 ms │              480.47 ms │     no change │
│ QQuery 21    │  507.89 ms │              523.32 ms │     no change │
│ QQuery 22    │  827.85 ms │              813.45 ms │     no change │
│ QQuery 23    │ 2344.75 ms │             2317.04 ms │     no change │
│ QQuery 24    │   52.42 ms │               52.83 ms │     no change │
│ QQuery 25    │  113.44 ms │              109.39 ms │     no change │
│ QQuery 26    │   53.71 ms │               53.89 ms │     no change │
│ QQuery 27    │  581.05 ms │              572.82 ms │     no change │
│ QQuery 28    │ 5174.20 ms │             5402.70 ms │     no change │
│ QQuery 29    │  221.13 ms │              221.62 ms │     no change │
│ QQuery 30    │  216.52 ms │              293.76 ms │  1.36x slower │
│ QQuery 31    │  228.53 ms │              386.75 ms │  1.69x slower │
│ QQuery 32    │ 1214.74 ms │            11660.79 ms │  9.60x slower │
│ QQuery 33    │ 1108.52 ms │                   FAIL │  incomparable │
│ QQuery 34    │ 1164.78 ms │            29807.41 ms │ 25.59x slower │
│ QQuery 35    │  294.71 ms │              837.63 ms │  2.84x slower │
│ QQuery 36    │   31.74 ms │               41.91 ms │  1.32x slower │
│ QQuery 37    │   21.16 ms │               24.12 ms │  1.14x slower │
│ QQuery 38    │   32.40 ms │               33.90 ms │     no change │
│ QQuery 39    │   52.27 ms │               51.59 ms │     no change │
│ QQuery 40    │    9.27 ms │                8.25 ms │ +1.12x faster │
│ QQuery 41    │    8.02 ms │                7.70 ms │     no change │
│ QQuery 42    │    6.86 ms │                6.72 ms │     no change │
└──────────────┴────────────┴────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                     ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (main)                     │ 17669.59ms │
│ Total Time (ahmed_long-poll-take-2)   │ 75533.04ms │
│ Average Time (main)                   │   420.70ms │
│ Average Time (ahmed_long-poll-take-2) │  1798.41ms │
│ Queries Faster                        │          4 │
│ Queries Slower                        │         21 │
│ Queries with No Change                │         17 │
│ Queries with Failure                  │          1 │
└───────────────────────────────────────┴────────────┘

github-actions bot added logical-expr Logical plan and expressions functions Changes to functions implementation ffi Changes to the ffi crate physical-plan Changes to the physical-plan crate labels Dec 30, 2025

ahmed-mez force-pushed the ahmed/long-poll-take-2 branch 2 times, most recently from 9ebbf95 to eacfedb Compare December 30, 2025 13:20

Incremental group emission in HashAggregate

224ce3a

ahmed-mez force-pushed the ahmed/long-poll-take-2 branch from eacfedb to 224ce3a Compare December 30, 2025 13:24

ahmed-mez mentioned this pull request Dec 30, 2025

Emit aggregation groups in chunks to avoid blocking async runtime #18906

Open

alamb mentioned this pull request Jan 5, 2026

Andrew Lamb Weekly-ish Open Source plan - 2026-01-05 #19652

Open

25 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Incremental group emission in HashAggregate #19562

Incremental group emission in HashAggregate #19562

ahmed-mez commented Dec 30, 2025

Uh oh!

alamb commented Jan 5, 2026

Uh oh!

alamb-ghbot commented Jan 5, 2026

Uh oh!

ahmed-mez commented Jan 6, 2026

Uh oh!

alamb-ghbot commented Jan 6, 2026

Uh oh!

alamb commented Jan 6, 2026

Uh oh!

alamb commented Jan 6, 2026

Uh oh!

ahmed-mez commented Jan 6, 2026

Uh oh!

ahmed-mez commented Jan 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Incremental group emission in HashAggregate #19562

Are you sure you want to change the base?

Incremental group emission in HashAggregate #19562

Conversation

ahmed-mez commented Dec 30, 2025

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

Uh oh!

alamb commented Jan 5, 2026

Uh oh!

alamb-ghbot commented Jan 5, 2026

Uh oh!

ahmed-mez commented Jan 6, 2026

Uh oh!

alamb-ghbot commented Jan 6, 2026

Uh oh!

alamb commented Jan 6, 2026

Uh oh!

alamb commented Jan 6, 2026

Uh oh!

ahmed-mez commented Jan 6, 2026

Uh oh!

ahmed-mez commented Jan 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants