ESQL: Fix ReplaceAliasingEvalWithProject in case of shadowing #137025
Hi @alex-spies, I've created a changelog YAML for you.
Pinging @elastic/es-analytical-engine (Team:Analytics)
bpintea
left a comment
Thanks, interesting it didn't surface earlier.
I've only left optional style notes.
import static org.elasticsearch.xpack.esql.optimizer.rules.logical.TemporaryNameUtils.locallyUniqueTemporaryName;

/**
 * Replace aliasing evals (eval x=a) with a projection which can be further combined / simplified.
Why do we require a project on top? It seems we are missing a lot of optimization opportunities.
For example, could we end up duplicating a field with a billion docs many times for the ES|QL query below? I only tested logical planning, so I'm not sure whether later rules handle this.
from test
| EVAL salary = salary+1, salary = salary +1, salary = salary +1
Eval[[salary{f}#17 + 1[INTEGER] AS salary#5, salary{r}#5 + 1[INTEGER] AS salary#8, salary{r}#8 + 1[INTEGER] AS salary#11]]
\_Limit[1000[INTEGER],false,false]
\_EsRelation[test][_meta_field{f}#18, emp_no{f}#12, first_name{f}#13, ..]
The example above should be optimized to just this, with no project needed:
from test
| EVAL salary = salary+3
This rule doesn't propagate shadowed, internal columns from an eval. We could do that! But I don't think we do.
I expect this is what you'd want this to become?
from test
| EVAL salary = ((salary+1)+1)+1
(Which should be simplified by some other rule to salary+3, I think.)
> Why do we require project on top?
Great question! That rule is super old, and I don't recall why we don't trigger it always. My hunch is that we wanted it mostly to combine the aliases from the eval with downstream projections. But we could profit from propagating the aliases more generally.
That said, on its own, there is no performance difference between a simple alias in an eval vs. in a projection. Both are cheap! They just incRef the underlying block. (Unless that block is sent over the wire. But that could also be tackled on the serialization level.)
Yes, I would expect us to get to salary = ((salary+1)+1)+1 and then fold the constants in evals eventually. You don't have to address it in this PR, it seems like a bigger change
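The rewrite discussed here, from three chained `salary = salary + 1` assignments to `((salary+1)+1)+1` and then to `salary + 3`, can be sketched on a toy expression model. Everything below (`Expr`, `Field`, `Lit`, `Add`, `Assign`, `EvalInlining`) is hypothetical and is not the ES|QL optimizer's API; it also ignores integer overflow and any other semantics a real rule would have to respect.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model, not the ES|QL expression classes.
interface Expr {}
record Field(String name) implements Expr {}
record Lit(int value) implements Expr {}
record Add(Expr left, Expr right) implements Expr {}
record Assign(String name, Expr value) {}

final class EvalInlining {
    // Replace references to names that an earlier assignment already defined.
    static Expr substitute(Expr e, Map<String, Expr> defs) {
        if (e instanceof Field f) {
            return defs.getOrDefault(f.name(), f);
        }
        if (e instanceof Add a) {
            return new Add(substitute(a.left(), defs), substitute(a.right(), defs));
        }
        return e;
    }

    // Inline assignments left to right; a later assignment shadows an earlier one.
    static Map<String, Expr> inline(List<Assign> evals) {
        Map<String, Expr> defs = new LinkedHashMap<>();
        for (Assign a : evals) {
            // The right-hand side sees the definitions made so far, then replaces them.
            defs.put(a.name(), substitute(a.value(), defs));
        }
        return defs;
    }

    // Fold (x + c1) + c2 into x + (c1 + c2), and c1 + c2 into a single literal.
    static Expr fold(Expr e) {
        if (e instanceof Add a) {
            Expr l = fold(a.left());
            Expr r = fold(a.right());
            if (l instanceof Lit x && r instanceof Lit y) {
                return new Lit(x.value() + y.value());
            }
            if (l instanceof Add inner && inner.right() instanceof Lit c1 && r instanceof Lit c2) {
                return new Add(inner.left(), new Lit(c1.value() + c2.value()));
            }
            return new Add(l, r);
        }
        return e;
    }

    public static void main(String[] args) {
        Assign step = new Assign("salary", new Add(new Field("salary"), new Lit(1)));
        Expr inlined = inline(List.of(step, step, step)).get("salary");
        System.out.println(inlined);
        System.out.println(fold(inlined));
    }
}
```

Inlining and folding are kept as separate steps here, which mirrors the suggestion that a different rule would simplify the inlined expression.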
There's a separate approach to this optimizer rule, which would inline all eval expressions in a first step, and could then very simply extract any simple renames into a project.
This would sidestep all the shadowing shenanigans and would address your comment below.
It would have interesting behavior because, in a way, it'd do the opposite of extracting common expressions in EVALs, which we may want to implement in the future. E.g. EVAL x = to_lower(y), z1 = length(x), z2 = starts_with("foo", x) - this eval re-uses x and inlining it into z1 and z2 would make us re-compute it unnecessarily. OTOH, this would allow for simplifications like simplifying salary = salary+1, salary = salary +1, salary = salary +1 into salary+3.
I'm not sure we want to jump on this right now, but maybe it'll become useful in the future, or if we find that our eval expressions bottleneck queries and need to be optimized better.
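To make the trade-off above concrete, here is a small sketch that counts how often the shared expression is evaluated with and without inlining. It models `EVAL x = to_lower(y), z1 = length(x), z2 = starts_with("foo", x)` with plain Java string methods; `InliningCost`, `CostRow` and the call counter are illustrative assumptions, not ES|QL code.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical result row holding z1 and z2.
record CostRow(int z1, boolean z2) {}

final class InliningCost {
    // Counts evaluations of the (assumed expensive) shared expression.
    static final AtomicInteger TO_LOWER_CALLS = new AtomicInteger();

    static String toLower(String s) {
        TO_LOWER_CALLS.incrementAndGet();
        return s.toLowerCase(Locale.ROOT);
    }

    // x is computed once and reused by z1 and z2, as in the original EVAL.
    static CostRow shared(String y) {
        String x = toLower(y);
        return new CostRow(x.length(), "foo".startsWith(x));
    }

    // After inlining x into z1 and z2, to_lower(y) is evaluated twice.
    static CostRow inlined(String y) {
        return new CostRow(toLower(y).length(), "foo".startsWith(toLower(y)));
    }

    public static void main(String[] args) {
        shared("FO");
        System.out.println("shared calls: " + TO_LOWER_CALLS.getAndSet(0));
        inlined("FO");
        System.out.println("inlined calls: " + TO_LOWER_CALLS.getAndSet(0));
    }
}
```

Both variants produce the same row; only the number of `toLower` evaluations differs, which is the cost that a future common-expression-extraction rule would try to avoid.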
julian-elastic
left a comment
This rule is pretty complicated. I feel like it tries to do too much at the same time and in the same function. I wonder if we can refactor it somehow, e.g. have a method that does just the aliasing, have a method that does just the renaming, have a method that builds the final output. The changes themselves seem correct. Also added some testing recommendations in the comments.
I think a multi-pass approach could be easier to read, but would likely take another day to refactor :/ Conceptually, we could ignore shadowing in a first pass and just turn the Eval into a … It's still complex though :/ The root problem is that our optimizer rules need to deal with shadowing in the first place, because our name conflict resolution is still always taken into account in LogicalPlans, not only during the initial resolution. If you see a nicer way that'd work, feel free to go for it! For now, I just need this to be correct so that I can unmute the generative tests again.
Thanks for the reviews @bpintea and @julian-elastic!
…c#137025) Fix elastic#137019: a bug that happened when the Eval has (non-aliasing) fields that happen to overwrite the attributes that we try to alias in a subsequent Project.
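The kind of bug described in this commit message can be illustrated with a toy row model. The query shape `EVAL x = a, a = b + 1`, the class `ShadowingDemo`, and the temporary column name `$$a$temp` are all assumptions made for illustration; the temporary-name variant is only one plausible way to avoid the conflict and is not claimed to be the PR's actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model: a row is a map from column name to value.
final class ShadowingDemo {
    // Reference semantics of EVAL x = a, a = b + 1, evaluated left to right.
    static Map<String, Integer> original(Map<String, Integer> row) {
        Map<String, Integer> out = new LinkedHashMap<>(row);
        out.put("x", out.get("a"));      // x aliases the incoming a
        out.put("a", out.get("b") + 1);  // non-aliasing field then overwrites a
        return out;
    }

    // Naive rewrite: keep Eval[a = b + 1] and move the alias into a Project on top.
    static Map<String, Integer> naiveRewrite(Map<String, Integer> row) {
        Map<String, Integer> afterEval = new LinkedHashMap<>(row);
        afterEval.put("a", afterEval.get("b") + 1);  // the incoming a is lost here
        Map<String, Integer> out = new LinkedHashMap<>(afterEval);
        out.put("x", afterEval.get("a"));            // wrongly aliases the new a
        return out;
    }

    // Sketch of a fix: compute the shadowing field under a temporary name,
    // then let the Project alias the incoming a and rename the temporary column.
    static Map<String, Integer> rewriteWithTemporaryName(Map<String, Integer> row) {
        Map<String, Integer> afterEval = new LinkedHashMap<>(row);
        afterEval.put("$$a$temp", afterEval.get("b") + 1);
        Map<String, Integer> out = new LinkedHashMap<>();
        out.put("b", afterEval.get("b"));
        out.put("x", afterEval.get("a"));          // still the incoming a
        out.put("a", afterEval.get("$$a$temp"));   // rename the temporary column to a
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> row = Map.of("a", 10, "b", 20);
        System.out.println("original x = " + original(row).get("x"));                    // 10
        System.out.println("naive x = " + naiveRewrite(row).get("x"));                   // 21
        System.out.println("temporary-name x = " + rewriteWithTemporaryName(row).get("x")); // 10
    }
}
```

In this sketch the naive rewrite changes the value of `x`, while the temporary-name variant matches the original left-to-right evaluation.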
💔 Backport failed
You can use sqren/backport to manually backport by running |
…c#137025) Fix elastic#137019: a bug that happened when the Eval has (non-aliasing) fields that happen to overwrite the attributes that we try to alias in a subsequent Project. (cherry picked from commit 386b156) # Conflicts: # x-pack/plugin/esql/qa/testFixtures/src/main/resources/eval.csv-spec # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation.
…137025) (#137318) * ESQL: Fix ReplaceAliasingEvalWithProject in case of shadowing (#137025) Fix #137019: a bug that happened when the Eval has (non-aliasing) fields that happen to overwrite the attributes that we try to alias in a subsequent Project. (cherry picked from commit 386b156) # Conflicts: # x-pack/plugin/esql/qa/testFixtures/src/main/resources/eval.csv-spec # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java * Fix tests
…137025) (#137316) * ESQL: Fix ReplaceAliasingEvalWithProject in case of shadowing (#137025) Fix #137019: a bug that happened when the Eval has (non-aliasing) fields that happen to overwrite the attributes that we try to alias in a subsequent Project. (cherry picked from commit 386b156) # Conflicts: # x-pack/plugin/esql/qa/testFixtures/src/main/resources/eval.csv-spec # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java * Remove accidentally committed test from other PR * Fix tests
Fix #137019