@alex-spies
Contributor

@alex-spies alex-spies commented Jun 16, 2025

Closes #119082.

Alternative approach to #127776 that doesn't rely on renaming the lookup fields that LOOKUP JOIN adds.

In general, we're doing this simple transformation:

Join[LEFT][leftKeyField][rightKeyField]
|_Project[leftKeyField, leftOtherField]
| \_...
\_EsRelation[LOOKUP][rightKeyField, rightOtherField]

->

Project[leftKeyField, leftOtherField, rightOtherField]
\_Join[LEFT][leftKeyField][rightKeyField]
  |_...
  \_EsRelation[LOOKUP][rightKeyField, rightOtherField]

The tricky part is the case where the lookup fields added by LOOKUP JOIN would shadow attributes that the Project still needs once the Project is performed after the join; in this case, we leave Evals in place that assign temporary names to any would-be shadowed attributes. This is the same approach that other pushdown rules take when they push down past an Order (SORT); see PushDownUtils#pushGeneratingPlanPastProjectAndOrderBy for reference.

Example with shadowing:

Join[LEFT][leftKeyField][rightKeyField]
|_Project[leftKeyField, nameConflictField AS renamedField]
| \_...
\_EsRelation[LOOKUP][rightKeyField, nameConflictField]

->

Project[leftKeyField, $$nameConflictField$temp_name$ AS renamedField, nameConflictField]
\_Join[LEFT][leftKeyField][rightKeyField]
  |_Eval[nameConflictField AS $$nameConflictField$temp_name$]
  |  \_...
  \_EsRelation[LOOKUP][rightKeyField, nameConflictField]
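
To make the shadowing handling concrete, here is a minimal sketch in Java. It is not the actual rule implementation: attributes are modeled as plain strings, the names JoinPastProjectSketch, Projection, Result and pushJoinPastProject are hypothetical, and the temporary name format simply mirrors the placeholder used in the example above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: attributes are plain strings and all names are hypothetical.
final class JoinPastProjectSketch {

    /** A projection "sourceName AS outputName"; both are equal for a plain attribute. */
    record Projection(String outputName, String sourceName) {}

    /** Evals left below the join (original name -> temporary name) and the Project placed above it. */
    record Result(Map<String, String> evalsBelowJoin, List<Projection> projectAboveJoin) {}

    static Result pushJoinPastProject(List<Projection> projections, List<String> lookupFields) {
        Set<String> added = new LinkedHashSet<>(lookupFields);
        Map<String, String> evals = new LinkedHashMap<>();
        List<Projection> above = new ArrayList<>();

        for (Projection p : projections) {
            if (added.contains(p.outputName())) {
                // In the original plan the join already shadowed this output, so it is dropped.
                continue;
            }
            String source = p.sourceName();
            if (added.contains(source)) {
                // The lookup field would shadow this attribute once the join runs first:
                // keep it alive under a temporary name via an Eval below the join.
                source = evals.computeIfAbsent(source, n -> "$$" + n + "$temp_name$");
            }
            above.add(new Projection(p.outputName(), source));
        }
        // The fields added by the lookup pass through the new top-level Project unchanged.
        for (String field : added) {
            above.add(new Projection(field, field));
        }
        return new Result(evals, above);
    }

    public static void main(String[] args) {
        Result r = pushJoinPastProject(
            List.of(new Projection("leftKeyField", "leftKeyField"), new Projection("renamedField", "nameConflictField")),
            List.of("nameConflictField")
        );
        System.out.println(r.evalsBelowJoin());
        System.out.println(r.projectAboveJoin());
    }
}
```

Running main on the shadowing example yields one Eval entry for nameConflictField and a three-element Project, matching the shape of the transformed plan shown above.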

@alex-spies alex-spies added the >enhancement, auto-backport, :Analytics/ES|QL, and v8.19.0 labels Jun 17, 2025
@alex-spies
Contributor Author

This still needs more test coverage, but I think the production code is ready for review, so I'll undraft this.

@elasticsearchmachine
Collaborator

Hi @alex-spies, I've created a changelog YAML for you.

@alex-spies alex-spies marked this pull request as ready for review June 17, 2025 15:58
@elasticsearchmachine elasticsearchmachine added the Team:Analytics label Jun 17, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@alex-spies alex-spies changed the title from "ESQL: Pushdown Lookup Join past Project take 2" to "ESQL: Pushdown Lookup Join past Project" Jun 17, 2025
Contributor Author

I split out the setup part of the LogicalPlanOptimizerTests so that we can stuff related tests into a dedicated class rather than having one big pool.

@alex-spies alex-spies requested review from astefan and bpintea June 17, 2025 16:04
Contributor

@bpintea bpintea left a comment

Nice, LGTM.


for (NamedExpression proj : newProjections) {
    // TODO: add assert to Project that ensures Alias to attr or pure attr.
    Attribute coreAttr = (Attribute) (proj instanceof Alias as ? as.child() : proj);
Contributor

Suggested change
- Attribute coreAttr = (Attribute) (proj instanceof Alias as ? as.child() : proj);
+ Attribute coreAttr = (Attribute) Alias.unwrap(proj);

Contributor

@bpintea bpintea Jun 18, 2025

Actually, even better, maybe skip iterating twice over the join's output attributes and do it all in one single loop? You then won't need to test for the unwrapped object (here and below).
Edit: the cast won't be necessary anymore either (as it stands, it feels like it deserves a comment explaining why it's safe).

Contributor Author

The single loop would get rid of the unwrapping and casting, but that makes the handling of shadowing less isolated - which I find harder to explain nicely.

Contributor

@astefan astefan left a comment

LGTM

public Project(Source source, LogicalPlan child, List<? extends NamedExpression> projections) {
    super(source, child);
    this.projections = projections;
    assert validateProjections(projections);
Contributor Author

Additional safety measure: there is an unwritten invariant of Projects, let's write it out.
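
For readers unfamiliar with the invariant, here is a sketch of what such a check could look like, going by the TODO in the earlier snippet ("Alias to attr or pure attr"). The types below are simplified stand-ins rather than the real ES|QL classes, and the body of validateProjections is an assumption; only its name and the Alias.unwrap call appear in this conversation.

```java
import java.util.List;

// Simplified stand-in types (hypothetical); the real ES|QL classes are richer.
final class ProjectInvariantSketch {

    interface Expression {}

    interface NamedExpression extends Expression {
        String name();
    }

    record Attribute(String name) implements NamedExpression {}

    record Literal(Object value) implements Expression {}

    record Alias(String name, Expression child) implements NamedExpression {
        /** Returns the aliased child, or the expression itself if it is not an Alias. */
        static Expression unwrap(Expression e) {
            return e instanceof Alias a ? a.child() : e;
        }
    }

    /** Assumed invariant: every projection is an attribute or an alias of an attribute. */
    static boolean validateProjections(List<? extends NamedExpression> projections) {
        for (NamedExpression projection : projections) {
            if (!(Alias.unwrap(projection) instanceof Attribute)) {
                return false;
            }
        }
        return true;
    }
}
```

With a check of this kind in the constructor, a cast such as (Attribute) Alias.unwrap(proj) is safe whenever assertions are enabled, which is the justification the earlier review comment asked for.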

@alex-spies alex-spies merged commit 809dab1 into elastic:main Jun 23, 2025
32 checks passed
@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
8.19 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 129503

@alex-spies
Contributor Author

💚 All backports created successfully

Status Branch Result
8.19

alex-spies added a commit to alex-spies/elasticsearch that referenced this pull request Jun 23, 2025
Add a new logical plan optimization:

When there is a Project (KEEP/DROP/RENAME/renaming EVALs) in a LOOKUP JOIN's left child (the "main" side), perform the Project after the LOOKUP JOIN. This prevents premature field extractions when the lookup join happens on data nodes.

(cherry picked from commit 809dab1)

# Conflicts:
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizerTests.java
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/plan/logical/CommandLicenseTests.java
elasticsearchmachine pushed a commit that referenced this pull request Jun 23, 2025
kderusso pushed a commit to kderusso/elasticsearch that referenced this pull request Jun 23, 2025
julian-elastic pushed a commit to julian-elastic/elasticsearch that referenced this pull request Jun 24, 2025
mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Jun 25, 2025
Labels

:Analytics/ES|QL, auto-backport, backport pending, >enhancement, Team:Analytics, v8.19.0, v9.1.0

Development

Successfully merging this pull request may close these issues.

ESQL: LOOKUP JOIN push down optimizations

4 participants