Combine small pages in Limit #128531

dnhatn · 2025-05-27T19:55:10Z

Reverted in #129107

Currently, the Limit operator does not combine small pages into larger ones; it simply passes them along, except for chunking pages larger than the limit. This change integrates EstimatesRowSize into Limit and adjusts it to emit larger pages. As a result, pages up to twice the pageSize may be emitted, which is preferable to emitting undersized pages. This should reduce the number of transport requests and responses between clusters, or between coordinator and data nodes, for queries without TopN or STATS when target shards produce small pages because the shards are small or the filters are highly selective.
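To illustrate the idea described above, here is a minimal sketch of buffering small pages until a target size is reached. It is not the actual LimitOperator code: the class `PageCombiningLimit`, its methods, and the use of `List<Integer>` as a stand-in for a page are all hypothetical, and it assumes each input page is smaller than `pageSize`, which is why an emitted page stays below twice `pageSize`.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical model of a limit that combines small pages.
 * A "page" is modelled as a plain list of row values; the real operator works on blocks.
 */
final class PageCombiningLimit {
    private final int pageSize;   // target number of rows per emitted page
    private int remaining;        // rows still allowed by the limit
    private final List<Integer> pending = new ArrayList<>();

    PageCombiningLimit(int limit, int pageSize) {
        this.remaining = limit;
        this.pageSize = pageSize;
    }

    /** Accepts one input page; returns a combined page when one is ready, otherwise null. */
    List<Integer> addInput(List<Integer> page) {
        int take = Math.min(page.size(), remaining);
        if (take == 0) {
            return null; // limit already reached or empty input
        }
        pending.addAll(page.subList(0, take));
        remaining -= take;
        // Emit once enough rows are buffered, or as soon as the limit is exhausted.
        if (pending.size() >= pageSize || remaining == 0) {
            return flush();
        }
        return null;
    }

    /** Emits whatever is still buffered when the input ends; null if nothing is pending. */
    List<Integer> finish() {
        return pending.isEmpty() ? null : flush();
    }

    private List<Integer> flush() {
        List<Integer> out = new ArrayList<>(pending);
        pending.clear();
        return out;
    }
}
```

With a limit of 10 and a `pageSize` of 4, four input pages of three rows each would leave this sketch as two pages (six rows, then four rows) instead of four undersized ones.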

elasticsearchmachine · 2025-05-27T21:32:37Z

Hi @dnhatn, I've created a changelog YAML for you.

elasticsearchmachine · 2025-05-27T21:35:22Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

idegtiarenko · 2025-05-28T08:03:44Z

...ugin/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/ManyShardsIT.java

+                OperatorStatus exchangeSink = driverProfile.operators().get(2);
+                assertThat(exchangeSink.status(), instanceOf(ExchangeSinkOperator.Status.class));
+                ExchangeSinkOperator.Status exchangeStatus = (ExchangeSinkOperator.Status) exchangeSink.status();
+                assertThat(exchangeStatus.pagesReceived(), lessThanOrEqualTo(1));


I was expecting this to be strictly equalTo(1). When could this be 0?

It could be 0 if early termination kicks in.

nik9000 · 2025-05-28T19:22:58Z

x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/operator/LimitOperator.java


-    public LimitOperator(Limiter limiter) {
+    private final int pageSize;
+    private int pendingRows;


Could you move this one below the final ones? I just want to keep the mutable ones not mixed in.

Sure, I regrouped these in c94c72c

dnhatn · 2025-05-29T23:03:40Z

@idegtiarenko @nik9000 Thanks!

elasticsearchmachine · 2025-05-29T23:06:06Z

💔 Backport failed

Status	Branch	Result
❌	8.19	Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 128531

This PR reverts #128531. With #128531, the Limit operator was updated to combine smaller pages into a larger page to reduce overhead, such as the number of exchange requests. However, this has a significant implication: the combined larger page does not retain the attributes of the blocks from the smaller pages. For example, if the smaller pages have ordinal-based BytesRef blocks, the larger page will not. This can cause a significant slowdown if subsequent operators have optimizations for ordinal-based blocks. The Enrich operator has such optimizations, and our benchmarks have shown this performance regression. One possible way to reduce the regression is to set a threshold (e.g., 1000 rows), above which the Limit operator would pass the page along without combining. However, even with a threshold of 1000, the performance regression does not go away completely. Alternatively, we could allow exchange requests to return multiple pages (up to the page size limit). To minimize risk, this PR reverts the previous change, and we will introduce a revised change later.
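The threshold idea mentioned above could look roughly like the sketch below. It is purely illustrative and not code from either PR: `ThresholdCombiner`, `Page`, and the boolean `ordinals` flag are hypothetical stand-ins, with the flag modelling "this page still carries ordinal-based blocks". Pages above the threshold pass through untouched and keep the flag; smaller pages are merged, and the merged page is assumed to lose it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of threshold-based pass-through; not the Elasticsearch implementation. */
final class ThresholdCombiner {
    /** Stand-in for a page: a row count plus a flag for ordinal-based blocks. */
    static final class Page {
        final int rows;
        final boolean ordinals;

        Page(int rows, boolean ordinals) {
            this.rows = rows;
            this.ordinals = ordinals;
        }
    }

    private final int threshold;
    private int pendingRows; // rows buffered from small pages

    ThresholdCombiner(int threshold) {
        this.threshold = threshold;
    }

    /** Returns the pages that can be emitted after receiving this input page, in order. */
    List<Page> addInput(Page page) {
        List<Page> out = new ArrayList<>();
        if (page.rows > threshold) {
            // Flush buffered rows first so row order is preserved.
            if (pendingRows > 0) {
                out.add(new Page(pendingRows, false));
                pendingRows = 0;
            }
            out.add(page); // passed along unchanged, ordinals retained
        } else {
            pendingRows += page.rows;
            if (pendingRows >= threshold) {
                out.add(new Page(pendingRows, false)); // merged page loses ordinals
                pendingRows = 0;
            }
        }
        return out;
    }
}
```

In this model only pages at or below the threshold ever lose the flag, which matches the observation above that a threshold reduces the regression without removing it completely.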

dnhatn · 2025-06-09T15:27:23Z

Reverted in #129107

elasticsearchmachine added the v9.1.0 label May 27, 2025

Combine small pages in Limit

8945341

dnhatn force-pushed the merge-pages-limit branch from 3f78b48 to 8945341 Compare May 27, 2025 21:30

dnhatn added :Analytics/ES|QL AKA ESQL >enhancement v8.19.0 labels May 27, 2025

Update docs/changelog/128531.yaml

817b468

dnhatn requested review from idegtiarenko and nik9000 May 27, 2025 21:34

dnhatn marked this pull request as ready for review May 27, 2025 21:35

elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label May 27, 2025

dnhatn added 3 commits May 27, 2025 18:49

fix tests

0b49fda

Merge remote-tracking branch 'elastic/main' into merge-pages-limit

26d093b

Merge remote-tracking branch 'elastic/main' into merge-pages-limit

5cc6475

idegtiarenko reviewed May 28, 2025

idegtiarenko approved these changes May 28, 2025

dnhatn added the auto-backport Automatically create backport pull requests when merged label May 28, 2025

nik9000 approved these changes May 28, 2025

dnhatn added 5 commits May 28, 2025 18:57

fields

c94c72c

Merge remote-tracking branch 'elastic/main' into merge-pages-limit

4a2b8a4

Merge branch 'main' into merge-pages-limit

48fa160

Merge remote-tracking branch 'elastic/main' into merge-pages-limit

ed05f88

Merge remote-tracking branch 'elastic/main' into merge-pages-limit

db7a30c

dnhatn merged commit 1ab2e6c into elastic:main May 29, 2025
18 checks passed

dnhatn deleted the merge-pages-limit branch May 29, 2025 23:04

elasticsearchmachine added the backport pending label May 29, 2025

dnhatn mentioned this pull request Jun 8, 2025

Revert "Combine small pages in Limit" #129107

Merged

dnhatn added >non-issue and removed >enhancement backport pending v8.19.0 auto-backport Automatically create backport pull requests when merged labels Jun 9, 2025
