-
Notifications
You must be signed in to change notification settings - Fork 25.7k
ESQL: Remove redundant TopN
#131833
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
kanoshiou
wants to merge
15
commits into
elastic:main
Choose a base branch
from
kanoshiou:remove-redundant-topn
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
ESQL: Remove redundant TopN
#131833
Changes from 12 commits
Commits
Show all changes
15 commits
Select commit
Hold shift + click to select a range
f95816f
Remove redundant `TopN`
kanoshiou 870ca7a
Update docs/changelog/131833.yaml
kanoshiou 6caf40a
Merge branch 'main' into remove-redundant-topn
kanoshiou 744217e
Merge branch 'main' into remove-redundant-topn
kanoshiou 544acbe
Merge branch 'refs/heads/main' into remove-redundant-topn
kanoshiou fc919f7
PruneRedundantTopN
kanoshiou 8eedd7b
Merge branch 'refs/heads/main' into remove-redundant-topn
kanoshiou 6b470d4
Update
kanoshiou df5862f
Add more tests
kanoshiou 62a43b7
Merge branch 'main' into remove-redundant-topn
kanoshiou f06f02e
Merge branch 'refs/heads/main' into remove-redundant-topn
kanoshiou ef65a37
Merge branch 'refs/heads/main' into remove-redundant-topn
kanoshiou c722a21
Merge branch 'main' into remove-redundant-topn
kanoshiou 7a7346d
refactor(esql): Simplify PruneRedundantTopN optimization logic
kanoshiou b101aec
Merge branch 'refs/heads/main' into remove-redundant-topn
kanoshiou File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| pr: 131833 | ||
| summary: "ESQL: Remove redundant `TopN`" | ||
| area: ES|QL | ||
| type: enhancement | ||
| issues: | ||
| - 131233 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
102 changes: 102 additions & 0 deletions
102
...rc/main/java/org/elasticsearch/xpack/esql/optimizer/rules/logical/PruneRedundantTopN.java
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,102 @@ | ||
| /* | ||
| * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one | ||
| * or more contributor license agreements. Licensed under the Elastic License | ||
| * 2.0; you may not use this file except in compliance with the Elastic License | ||
| * 2.0. | ||
| */ | ||
|
|
||
| package org.elasticsearch.xpack.esql.optimizer.rules.logical; | ||
|
|
||
| import org.elasticsearch.xpack.esql.optimizer.LogicalOptimizerContext; | ||
| import org.elasticsearch.xpack.esql.plan.logical.Drop; | ||
| import org.elasticsearch.xpack.esql.plan.logical.Eval; | ||
| import org.elasticsearch.xpack.esql.plan.logical.Filter; | ||
| import org.elasticsearch.xpack.esql.plan.logical.Insist; | ||
| import org.elasticsearch.xpack.esql.plan.logical.Limit; | ||
| import org.elasticsearch.xpack.esql.plan.logical.LogicalPlan; | ||
| import org.elasticsearch.xpack.esql.plan.logical.OrderBy; | ||
| import org.elasticsearch.xpack.esql.plan.logical.Project; | ||
| import org.elasticsearch.xpack.esql.plan.logical.Rename; | ||
| import org.elasticsearch.xpack.esql.plan.logical.TopN; | ||
| import org.elasticsearch.xpack.esql.plan.logical.inference.Completion; | ||
|
|
||
| import java.util.ArrayDeque; | ||
| import java.util.Collections; | ||
| import java.util.Deque; | ||
| import java.util.IdentityHashMap; | ||
| import java.util.Set; | ||
|
|
||
| /** | ||
| * Removes redundant TopN operations from the logical plan to improve execution efficiency. | ||
| * <p> | ||
| * Multiple TopN nodes may appear in a query plan—particularly after optimization passes— | ||
| * and some of them can be safely removed if they share the same sort order and are not separated | ||
| * by operations that disrupt sorting semantics. | ||
| * <p> | ||
| * For instance: | ||
| * <pre> | ||
| * from test | sort x | limit 100 | sort x | limit 10 | ||
| * </pre> | ||
| * Both <code>sort x | limit 100</code> and <code>sort x | limit 10</code> will be transformed into TopN nodes. | ||
| * Since they sort by the same key and the latter applies a stricter (or equal) limit, | ||
| * the first TopN becomes redundant and can be pruned. | ||
| */ | ||
| public class PruneRedundantTopN extends OptimizerRules.ParameterizedOptimizerRule<TopN, LogicalOptimizerContext> { | ||
|
|
||
| public PruneRedundantTopN() { | ||
| super(OptimizerRules.TransformDirection.DOWN); | ||
| } | ||
|
|
||
| @Override | ||
| protected LogicalPlan rule(TopN plan, LogicalOptimizerContext ctx) { | ||
| Set<TopN> redundant = findRedundantTopN(plan, ctx); | ||
| if (redundant.isEmpty()) { | ||
| return plan; | ||
| } | ||
| return plan.transformDown(TopN.class, topN -> redundant.contains(topN) ? topN.child() : topN); | ||
| } | ||
|
|
||
| /** | ||
| * breadth-first recursion to find redundant TopNs in the children tree. | ||
| * Returns an identity set (we need to compare and prune the exact instances) | ||
| */ | ||
| private Set<TopN> findRedundantTopN(TopN parentTopN, LogicalOptimizerContext ctx) { | ||
| Set<TopN> result = Collections.newSetFromMap(new IdentityHashMap<>()); | ||
|
|
||
| Deque<LogicalPlan> toCheck = new ArrayDeque<>(); | ||
| toCheck.push(parentTopN.child()); | ||
|
|
||
| while (toCheck.isEmpty() == false) { | ||
| LogicalPlan p = toCheck.pop(); | ||
| if (p instanceof TopN childTopN) { | ||
| // Check if a child TopN is redundant compared to a parent TopN. | ||
| // A child TopN is redundant if it matches the parent's sort order and has a greater or equal limit. | ||
| if (childTopN.order().equals(parentTopN.order()) | ||
| // Although `PushDownAndCombineLimits` is expected to have propagated the stricter (lower) limit, | ||
| // we still compare limit values here to ensure correctness and avoid relying solely on prior optimizations. | ||
| // This limit check should always pass, but we validate it explicitly for robustness. | ||
| && (int) parentTopN.limit().fold(ctx.foldCtx()) <= (int) childTopN.limit().fold(ctx.foldCtx())) { | ||
| result.add(childTopN); | ||
| toCheck.push(childTopN.child()); | ||
| } | ||
| } else if (canRemoveRedundantChildTopN(p)) { | ||
| for (LogicalPlan child : p.children()) { | ||
| toCheck.push(child); | ||
| } | ||
| } | ||
| } | ||
| return result; | ||
| } | ||
|
|
||
| private boolean canRemoveRedundantChildTopN(LogicalPlan p) { | ||
| return p instanceof Completion | ||
| || p instanceof Drop | ||
| || p instanceof Eval | ||
| || p instanceof Rename | ||
| || p instanceof Filter | ||
| || p instanceof Insist | ||
| || p instanceof Limit | ||
| || p instanceof OrderBy | ||
| || p instanceof Project; | ||
| } | ||
| } | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing past limits and filters is incorrect. There is probably a class of nodes where
TOP N | COMMAND | TOP Mcan be simplified toCOMMAND | TOP min(N,M), but these cannot be included:The first takes the top 10
as, removes those withb <= 10and then gives the topcfrom this subset.The second takes random 10 rows, then takes those with
b > 10, and finally gives the topcfrom this subset. But the subsets are different.