-
Notifications
You must be signed in to change notification settings - Fork 25.6k
ESQL: Pushdown Lookup Join past Project #129503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESQL: Pushdown Lookup Join past Project #129503
Conversation
…st-project-take-2
We only add an optimization, it must still work with older nodes.
…st-project-take-2
|
This still needs greater test coverage. But I think the productive code is ready to review, so I'll undraft this. |
|
Hi @alex-spies, I've created a changelog YAML for you. |
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
…st-project-take-2
...n/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/rules/logical/PushDownUtils.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I split out the setup part of the LogicalPlanOptimizerTests so that we can stuff related tests into a dedicated class rather than having one big pool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, LGTM.
.../java/org/elasticsearch/xpack/esql/optimizer/rules/logical/PushDownJoinPastProjectTests.java
Outdated
Show resolved
Hide resolved
.../src/test/java/org/elasticsearch/xpack/esql/optimizer/AbstractLogicalPlanOptimizerTests.java
Outdated
Show resolved
Hide resolved
...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/local/EsqlProject.java
Outdated
Show resolved
Hide resolved
...n/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/rules/logical/PushDownUtils.java
Outdated
Show resolved
Hide resolved
...n/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/rules/logical/PushDownUtils.java
Show resolved
Hide resolved
.../main/java/org/elasticsearch/xpack/esql/optimizer/rules/logical/PushDownJoinPastProject.java
Outdated
Show resolved
Hide resolved
.../main/java/org/elasticsearch/xpack/esql/optimizer/rules/logical/PushDownJoinPastProject.java
Outdated
Show resolved
Hide resolved
|
|
||
| for (NamedExpression proj : newProjections) { | ||
| // TODO: add assert to Project that ensures Alias to attr or pure attr. | ||
| Attribute coreAttr = (Attribute) (proj instanceof Alias as ? as.child() : proj); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Attribute coreAttr = (Attribute) (proj instanceof Alias as ? as.child() : proj); | |
| Attribute coreAttr = (Attribute) Alias.unwrap(proj); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, even better, maybe skip iterating twice over join's output attributes -- do all in one single loop? You then won't need to test for unwrapped object (here and below).
Edit: and the cast won't be necessary anymore (now it feels like it would deserve a comment as to why it's safe).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The single loop would get rid of the unwrapping and casting, but that makes the handling of shadowing less isolated - which I find harder to explain nicely.
.../main/java/org/elasticsearch/xpack/esql/optimizer/rules/logical/PushDownJoinPastProject.java
Outdated
Show resolved
Hide resolved
.../main/java/org/elasticsearch/xpack/esql/optimizer/rules/logical/PushDownJoinPastProject.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
.../main/java/org/elasticsearch/xpack/esql/optimizer/rules/logical/PushDownJoinPastProject.java
Show resolved
Hide resolved
…st-project-take-2
| public Project(Source source, LogicalPlan child, List<? extends NamedExpression> projections) { | ||
| super(source, child); | ||
| this.projections = projections; | ||
| assert validateProjections(projections); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional safety measure: there is an unwritten invariant of Projects, let's write it out.
💔 Backport failed
You can use sqren/backport to manually backport by running |
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
Add a new logical plan optimization: When there is a Project (KEEP/DROP/RENAME/renaming EVALs) in a LOOKUP JOIN's left child (the "main" side), perform the Project after the LOOKUP JOIN. This prevents premature field extractions when the lookup join happens on data nodes. (cherry picked from commit 809dab1) # Conflicts: # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizerTests.java # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/plan/logical/CommandLicenseTests.java
Add a new logical plan optimization: When there is a Project (KEEP/DROP/RENAME/renaming EVALs) in a LOOKUP JOIN's left child (the "main" side), perform the Project after the LOOKUP JOIN. This prevents premature field extractions when the lookup join happens on data nodes. (cherry picked from commit 809dab1) # Conflicts: # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizerTests.java # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/plan/logical/CommandLicenseTests.java
Add a new logical plan optimization: When there is a Project (KEEP/DROP/RENAME/renaming EVALs) in a LOOKUP JOIN's left child (the "main" side), perform the Project after the LOOKUP JOIN. This prevents premature field extractions when the lookup join happens on data nodes.
Add a new logical plan optimization: When there is a Project (KEEP/DROP/RENAME/renaming EVALs) in a LOOKUP JOIN's left child (the "main" side), perform the Project after the LOOKUP JOIN. This prevents premature field extractions when the lookup join happens on data nodes.
Add a new logical plan optimization: When there is a Project (KEEP/DROP/RENAME/renaming EVALs) in a LOOKUP JOIN's left child (the "main" side), perform the Project after the LOOKUP JOIN. This prevents premature field extractions when the lookup join happens on data nodes.
Closes #119082.
Alternative approach to #127776 that doesn't rely on renaming the lookup fields that
LOOKUP JOINadds.In general, we're doing this simple transformation:
The tricky part is dealing with the case where the lookup fields added by
LOOKUP JOINwould shadow some attributes if performed after theProject; in this case, we leaveEvals in place that assign temporary names to any would-be shadowed attributes. This is the same approach that other pushdown rules take when they push down past anOrder(SORT) - seePushDownUtils#pushGeneratingPlanPastProjectAndOrderByfor reference.Example with shadowing: