[ES|QL] Non-Correlated Subquery in FROM command #135744
Hi @fang-xing-esql, I've created a changelog YAML for you.
Pull Request Overview
This PR introduces support for non-correlated subqueries within the FROM command in ES|QL, allowing queries to reference multiple data sources including both index patterns and subqueries. The implementation enables subqueries to be processed similarly to Fork operations, with key distinctions in index resolution and predicate pushdown capabilities.
- Adds grammar and parser support for subquery syntax in FROM commands
- Implements UnionAll logical plan to handle mixed index patterns and subqueries
- Enables predicate pushdown optimization specifically for UnionAll operations
Reviewed Changes
Copilot reviewed 36 out of 39 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| EsqlBaseParser.g4 | Updates grammar to support subquery syntax in FROM_MODE |
| LogicalPlanBuilder.java | Creates UnionAll plans and handles subquery/index pattern combinations |
| UnionAll.java | New logical plan extending Fork with union-typed field support |
| Subquery.java | New logical plan node representing subquery placeholders |
| Analyzer.java | Resolves subquery indices and handles union-typed fields |
| PushDownAndCombineFilters.java | Adds predicate pushdown optimization for UnionAll |
| EsqlSession.java | Implements subquery index resolution during pre-analysis |
| Various test files | Adds comprehensive test coverage for subquery functionality |
Comments suppressed due to low confidence (1)
x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/parser/SubqueryTests.java:1
- There's a typo in "nested fork/subquery is not supported, it passes Analyzer" - should be "nested fork/subquery is not supported; it passes Analyzer" (semicolon instead of comma for better grammar).
| } | ||
| return parent; | ||
| } else { // We should not reach here as the grammar does not allow it | ||
| throw new ParsingException("FROM is required in a subquery"); |
Copilot
AI
Oct 2, 2025
The error message "FROM is required in a subquery" is misleading since the grammar already enforces this requirement. Consider a more descriptive message like "Invalid subquery structure" or remove the comment and exception if this code path is truly unreachable.
| throw new ParsingException("FROM is required in a subquery"); | |
| throw new ParsingException("Invalid subquery structure"); |
| LogicalPlan newChild = switch (child) { | ||
| case Project project -> maybePushDownFilterPastProjectForUnionAllChild(pushable, project); | ||
| case Limit limit -> maybePushDownFilterPastLimitForUnionAllChild(pushable, limit); | ||
| default -> null; // TODO add a general push down for unexpected pattern |
Copilot
AI
Oct 2, 2025
The TODO comment indicates incomplete functionality. Consider implementing the general push down logic or at least provide a more specific plan for when this will be addressed, as returning null could lead to silent failures in optimization.
| default -> null; // TODO add a general push down for unexpected pattern | |
| default -> { | |
| // Fallback: unknown child type, do not push down filter for this child. | |
| // Consider implementing general push down logic here in the future. | |
| yield child; | |
| } |
| boolean supportsAggregateMetricDouble, | ||
| boolean supportsDenseVector | ||
| boolean supportsDenseVector, | ||
| Set<IndexPattern> subqueryIndices |
Merging subqueryIndices into mainIndices is another option. It would require changes to EsqlCCSUtils.initCrossClusterState and EsqlCCSUtils.createIndexExpressionFromAvailableClusters, as they associate the ExecutionInfo with only one index pattern today.
| hasCapabilities(adminClient(), List.of(ENABLE_FORK_FOR_REMOTE_INDICES.capabilityName())) | ||
| ); | ||
| } | ||
| // Subqueries in FROM are not fully tested in CCS yet |
When a subquery exists in the query, convertToRemoteIndices doesn't generate a correct remote index pattern yet, so the query becomes invalid. Subqueries are not fully tested in CCS yet; I'm working on it as a follow-up.
Pinging @elastic/es-analytical-engine (Team:Analytics)
Pinging @elastic/kibana-esql (ES|QL-ui)
luigidellaquila
left a comment
Thanks @fang-xing-esql, that looks pretty good.
I left another round of comments, most of them are very minor changes, but one needs to be fixed before moving forward
Thank you so much for reviewing this PR in such depth, I really appreciate the time and thought you put into it, @astefan ! I'll create a follow up issue for improving I made a couple of changes and added additional tests related to subqueries combined with fork and full-text functions, both in the main query and within subqueries. Initially, I planned to address these in a separate PR since this one is already quite large, but I think these scenarios are important enough to include here. First, Second, the validation of commands before a full-text function, as well as the verification of the field referenced by that function, is now deferred to
Thank you so much for your thoughtful reviews @luigidellaquila! They really help this PR to become a better PR. I think I addressed all of them, please just let me know if there is anything that I missed.
luigidellaquila
left a comment
Thanks @fang-xing-esql
There are still a couple of things I'd like to check in detail (the pushdown logic in particular), but from what I could see it looks good, so I'm approving to unblock the merge.
| return DATE_NANOS; | ||
| } | ||
|
|
||
| if (t1.isCounter()) { |
nit: I'd expect this logic to be in EsqlDataTypeConverter.commonType() (that you are using below btw).
The only logical difference is noCounter().
Maybe commonTypes() could just do
for (List<Attribute> out : outputs) {
type = EsqlDataTypeConverter.commonType(type, out.get(i).dataType()).noCounter();
}
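To make the suggested fold concrete, here is a minimal, self-contained sketch. The `Type` enum and the pairwise `commonType` rule are hypothetical stand-ins, not the real `DataType` or `EsqlDataTypeConverter.commonType()`; only the shape of the loop (fold column `i` across all branch outputs, stripping the counter flavour each time) follows the comment above.

```java
import java.util.List;

final class CommonTypeSketch {
    // Hypothetical stand-in for DataType; COUNTER_LONG models a counter variant of LONG.
    enum Type {
        INTEGER, LONG, COUNTER_LONG, KEYWORD, UNSUPPORTED;

        Type noCounter() {
            return this == COUNTER_LONG ? LONG : this;
        }
    }

    // Hypothetical pairwise rule (an assumption, not the real converter):
    // equal types stay as they are, INTEGER and LONG widen to LONG, anything else conflicts.
    static Type commonType(Type left, Type right) {
        if (left == null) return right;
        if (right == null) return left;
        Type l = left.noCounter();
        Type r = right.noCounter();
        if (l == r) return l;
        boolean integral = (l == Type.INTEGER || l == Type.LONG) && (r == Type.INTEGER || r == Type.LONG);
        return integral ? Type.LONG : Type.UNSUPPORTED;
    }

    // Fold the type of column i across every branch output, as in the suggested loop.
    static Type commonTypeOfColumn(List<List<Type>> outputs, int i) {
        Type type = null;
        for (List<Type> out : outputs) {
            type = commonType(type, out.get(i)).noCounter();
        }
        return type;
    }

    public static void main(String[] args) {
        List<List<Type>> outputs = List.of(
            List.of(Type.INTEGER, Type.KEYWORD),
            List.of(Type.COUNTER_LONG, Type.LONG)
        );
        System.out.println(commonTypeOfColumn(outputs, 0));
        System.out.println(commonTypeOfColumn(outputs, 1));
    }
}
```

Keeping the counter handling in one place (`noCounter()` applied after each pairwise step) is the only logical difference the comment calls out.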
bpintea
left a comment
Started my journey on this PR. Minor/optional notes only.
It'd be great to have some javadoc for this class.
(It would seem it's mostly useful in PushDownFilterAndLimitIntoUnionAll, where we first push "below" UnionAll and then, once there, below Subquery?
Seems like LogicalPlanBuilder#visitRelation always places the index pattern first, followed by the subqueries.
I'm wondering thus if this marker node is really needed. Wouldn't these push downs be doable on each branch by existing rules, if Subquery wasn't there?
But I still have to look into it and haven't grasped it all yet. Javadoc would be great in any case, though. :) )
There are a couple of reasons the Subquery node exists in the plan.
First, I originally added this node because in the future we may want to support qualifiers with subqueries (like FROM idx1, (FROM idx2, idx3) as idx23), and this is a place where we can store the name (qualifier) of the subquery, which can be referenced by the parent query.
Second, when I started working on pushing down filters into subqueries, I realized that PushDownAndCombineFilters intentionally does not push filters below a limit (refer to here). However, UnionAll/Fork adds an implicit limit to each branch, so this rule doesn't help predicate pushdown for subqueries. The Subquery node is also used as a pattern for predicate pushdown. I'll add more comments in the code.
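As a generic illustration of why a filter pushdown rule would be conservative around limits, the sketch below uses plain Java streams rather than ES|QL plan nodes. It shows that swapping a filter and a limit can change the result; the class and method names are made up for this example.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class FilterLimitOrderSketch {
    // Models "LIMIT n | WHERE p": the filter only sees the first n rows.
    static List<Integer> limitThenFilter(List<Integer> rows, int limit, Predicate<Integer> filter) {
        return rows.stream().limit(limit).filter(filter).collect(Collectors.toList());
    }

    // Models the plan after pushing the filter below the limit: "WHERE p | LIMIT n".
    static List<Integer> filterThenLimit(List<Integer> rows, int limit, Predicate<Integer> filter) {
        return rows.stream().filter(filter).limit(limit).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> rows = List.of(1, 2, 3, 4, 5, 6);
        Predicate<Integer> even = x -> x % 2 == 0;
        System.out.println(limitThenFilter(rows, 3, even)); // [2]
        System.out.println(filterThenLimit(rows, 3, even)); // [2, 4, 6]
    }
}
```

The two orders agree only when the limit does not actually cut off any rows, which is why a rule that crosses the implicit per-branch limit needs a recognizable pattern (here, the Subquery node) to know it is dealing with that specific limit.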
bpintea
left a comment
One issue I see that we could align is the treatment of the "union"/conflicting types: in case these emerge out of an index pattern, we return the type as unsupported, along with the original_types and a suggested_cast. User is in the know, all good.
With UnionAll, we return it of type keyword, even if no index in the union has the conflicting field of that type and null values, but with no other indication whatsoever that there's a conflict. The user won't know of the conflict, which I think is problematic, as this could occur frequently.
Don't know if we have a decision here or track this somewhere?
(Some more comments to follow, sorry :) )
| ThreadPool.Names.SEARCH_COORDINATION, | ||
| ThreadPool.Names.SYSTEM_READ | ||
| ); | ||
| if (subqueryIndexPattern != null) { |
Can this now ever happen? Shouldn't it be guaranteed by the grammar that it's never null?
The piece of code that processes subquery index patterns in EsqlSession will be in a separate PR from @craigtaverner, and it will be shared by subqueries and views, stay tuned :).
| ); | ||
|
|
||
| } else { | ||
| // occurs when dealing with local relations (row a = 1) |
This isn't currently supported by the grammar, no?
| && indexResolution.matches(indexPattern) == false | ||
| && context.subqueryResolution().isEmpty() == false) { | ||
| // index pattern does not match main index | ||
| indexResolution = context.subqueryResolution().getOrDefault(indexPattern, indexResolution); |
| indexResolution = context.subqueryResolution().getOrDefault(indexPattern, indexResolution); | |
| indexResolution = context.subqueryResolution().get(indexPattern); | |
| if (indexResolution == null) {... |
Wouldn't it be a bug if the simple get() returned null? And wouldn't we introduce a new one if an "unknown" pattern then resolved to the "main" index?
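The difference between the two lookups can be shown with a plain `Map`. This is only a sketch: the map of pattern to resolution is modelled with strings, and `lenientLookup` / `strictLookup` are hypothetical helper names, not methods from the PR.

```java
import java.util.HashMap;
import java.util.Map;

final class ResolutionLookupSketch {
    // Mirrors getOrDefault(indexPattern, indexResolution): an unknown pattern
    // silently falls back to the main index resolution.
    static String lenientLookup(Map<String, String> subqueryResolution, String pattern, String mainResolution) {
        return subqueryResolution.getOrDefault(pattern, mainResolution);
    }

    // Mirrors the suggested get() plus null check: an unknown pattern is reported
    // as a bug instead of being resolved against the wrong index.
    static String strictLookup(Map<String, String> subqueryResolution, String pattern) {
        String resolution = subqueryResolution.get(pattern);
        if (resolution == null) {
            throw new IllegalStateException("no resolution for index pattern [" + pattern + "]");
        }
        return resolution;
    }

    public static void main(String[] args) {
        Map<String, String> subqueryResolution = new HashMap<>();
        subqueryResolution.put("index2", "resolution-of-index2");

        System.out.println(lenientLookup(subqueryResolution, "index3", "resolution-of-index1"));
        try {
            strictLookup(subqueryResolution, "index3");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```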
I think there are some fixes needed on conflicting/union types from different branches. I've left a few notes, out of which one is, I think, a bug.
Potentially related to that: when running the query in that comment - with index1 only having "x": 1 and index2 having "x": "2" (i.e. both text and keyword fields) - I get a CCE:
[2025-10-17T12:49:31,742][ERROR][o.e.r.ChunkedRestResponseBodyPart] [runTask-0] failure encoding chunk java.lang.ClassCastException: class org.elasticsearch.compute.data.IntVectorBlock cannot be cast to class org.elasticsearch.compute.data.LongBlock (org.elasticsearch.compute.data.IntVectorBlock and org.elasticsearch.compute.data.LongBlock are in unnamed module of loader java.net.FactoryURLClassLoader @11d2dd2d)
at org.elasticsearch.xpack.esql.action.PositionToXContent$1.valueToXContent(PositionToXContent.java:72)
at org.elasticsearch.xpack.esql.action.PositionToXContent.positionToXContent(PositionToXContent.java:53)
at org.elasticsearch.xpack.esql.action.ResponseXContentUtils.lambda$rowValues$7(ResponseXContentUtils.java:114)
at [email protected]/org.elasticsearch.rest.ChunkedRestResponseBodyPart$1.encodeChunk(ChunkedRestResponseBodyPart.java:161)
at [email protected]/org.elasticsearch.rest.RestController$EncodedLengthTrackingChunkedRestResponseBodyPart.encodeChunk(RestController.java:1012)
at [email protected]/org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.writeChunk(Netty4HttpPipeliningHandler.java:436)
at [email protected]/org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.doWriteChunkedResponse(Netty4HttpPipeliningHandler.java:263)
at [email protected]/org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.doWrite(Netty4HttpPipeliningHandler.java:231)
at [email protected]/org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.write(Netty4HttpPipeliningHandler.java:184)
at [email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:891)
at [email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:956)
at [email protected]/io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1263)
at [email protected]/io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
at [email protected]/io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
at [email protected]/io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
at [email protected]/io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
at [email protected]/io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:998)
at [email protected]/io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at java.base/java.lang.Thread.run(Thread.java:1575)
Also, something like:
FROM index1, (FROM index2) | EVAL l = x::LONG | EVAL i = x::INTEGER | KEEP x, l, i
fails with a verification exception:
"org.elasticsearch.xpack.esql.VerificationException: Found 1 problem\nline 130:47: Output has changed from [[x{r}#204, l{r}#191, i{r}#194]] to [[x{r}#204, l{r}#191, i{r}#194]].
at org.elasticsearch.xpack.esql.optimizer.LogicalPlanOptimizer.optimize(LogicalPlanOptimizer.java:121)
at org.elasticsearch.xpack.esql.session.EsqlSession.optimizedPlan(EsqlSession.java:947)
at org.elasticsearch.xpack.esql.session.EsqlSession$1.lambda$onResponse$1(EsqlSession.java:228)
at [email protected]/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:261)
at [email protected]/org.elasticsearch.action.support.SubscribableListener$SuccessResult.complete(SubscribableListener.java:406)
at [email protected]/org.elasticsearch.action.support.SubscribableListener.tryComplete(SubscribableListener.java:326)
at [email protected]/org.elasticsearch.action.support.SubscribableListener.addListener(SubscribableListener.java:222)
at [email protected]/org.elasticsearch.action.support.SubscribableListener.lambda$andThen$1(SubscribableListener.java:534)
at [email protected]/org.elasticsearch.action.ActionListener.run(ActionListener.java:465)
at [email protected]/org.elasticsearch.action.support.SubscribableListener.newForked(SubscribableListener.java:138)
at [email protected]/org.elasticsearch.action.support.SubscribableListener.andThen(SubscribableListener.java:534)
at [email protected]/org.elasticsearch.action.support.SubscribableListener.andThen(SubscribableListener.java:489)
at org.elasticsearch.xpack.esql.session.EsqlSession$1.onResponse(EsqlSession.java:228)
at org.elasticsearch.xpack.esql.session.EsqlSession$1.onResponse(EsqlSession.java:219)
at [email protected]/org.elasticsearch.action.support.SubscribableListener$SuccessResult.complete(SubscribableListener.java:406)
at [email protected]/org.elasticsearch.action.support.SubscribableListener.tryComplete(SubscribableListener.java:326)
at [email protected]/org.elasticsearch.action.support.SubscribableListener.setResult(SubscribableListener.java:355)
at [email protected]/org.elasticsearch.action.support.SubscribableListener.onResponse(SubscribableListener.java:262)
at org.elasticsearch.xpack.esql.session.EsqlSession.analyzeWithRetry(EsqlSession.java:885)
at org.elasticsearch.xpack.esql.session.EsqlSession.lambda$resolveIndices$14(EsqlSession.java:538)
at [email protected]/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:261)
at [email protected]/org.elasticsearch.action.support.SubscribableListener$SuccessResult.complete(SubscribableListener.java:406)
at [email protected]/org.elasticsearch.action.support.SubscribableListener.tryComplete(SubscribableListener.java:326)
at [email protected]/org.elasticsearch.action.support.SubscribableListener.setResult(SubscribableListener.java:355)
at [email protected]/org.elasticsearch.action.support.SubscribableListener.onResponse(SubscribableListener.java:262)
at org.elasticsearch.xpack.esql.session.EsqlSession.preAnalyzeSubqueryIndices(EsqlSession.java:554)
at org.elasticsearch.xpack.esql.session.EsqlSession.lambda$preAnalyzeSubqueryIndices$15(EsqlSession.java:551)
at [email protected]/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:261)
at org.elasticsearch.xpack.esql.session.EsqlSession.lambda$preAnalyzeSubqueryIndex$16(EsqlSession.java:589)
at [email protected]/org.elasticsearch.action.ActionListenerImplementations$DelegatingFailureActionListener.onResponse(ActionListenerImplementations.java:233)
at org.elasticsearch.xpack.esql.session.IndexResolver.lambda$resolveAsMergedMapping$0(IndexResolver.java:100)
at [email protected]/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:261)
at [email protected]/org.elasticsearch.action.ActionListener$3.onResponse(ActionListener.java:413)
at [email protected]/org.elasticsearch.tasks.TaskManager$1.onResponse(TaskManager.java:228)
at [email protected]/org.elasticsearch.tasks.TaskManager$1.onResponse(TaskManager.java:222)
at [email protected]/org.elasticsearch.action.ActionListenerImplementations$RunBeforeActionListener.onResponse(ActionListenerImplementations.java:350)
at [email protected]/org.elasticsearch.action.ActionListener$3.onResponse(ActionListener.java:413)
at [email protected]/org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:33)
at [email protected]/org.elasticsearch.action.ActionListenerImplementations$MappedActionListener.onResponse(ActionListenerImplementations.java:111)
at org.elasticsearch.xpack.esql.action.EsqlResolveFieldsAction.finishHim(EsqlResolveFieldsAction.java:338)
at org.elasticsearch.xpack.esql.action.EsqlResolveFieldsAction.lambda$doExecuteForked$7(EsqlResolveFieldsAction.java:232)
at [email protected]/org.elasticsearch.core.AbstractRefCounted$1.closeInternal(AbstractRefCounted.java:125)
at [email protected]/org.elasticsearch.core.AbstractRefCounted.decRef(AbstractRefCounted.java:77)
at [email protected]/org.elasticsearch.action.support.RefCountingRunnable.close(RefCountingRunnable.java:113)
at [email protected]/org.elasticsearch.core.Releasables$4.close(Releasables.java:178)
at [email protected]/org.elasticsearch.common.util.concurrent.RunOnce.run(RunOnce.java:41)
at [email protected]/org.elasticsearch.action.fieldcaps.RequestDispatcher.innerExecute(RequestDispatcher.java:177)
at [email protected]/org.elasticsearch.action.fieldcaps.RequestDispatcher$1.doRun(RequestDispatcher.java:146)
at [email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
at [email protected]/org.elasticsearch.action.fieldcaps.TransportFieldCapabilitiesAction$1.onResponse(TransportFieldCapabilitiesAction.java:328)
at [email protected]/org.elasticsearch.action.fieldcaps.TransportFieldCapabilitiesAction$1.onResponse(TransportFieldCapabilitiesAction.java:324)
at [email protected]/org.elasticsearch.common.util.concurrent.AbstractThrottledTaskRunner$1.doRun(AbstractThrottledTaskRunner.java:136)
at [email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
at [email protected]/org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:35)
at [email protected]/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1067)
at [email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1575)\n"
| Map<String, AbstractConvertFunction> convertFunctions = new HashMap<>(); | ||
| plan.forEachDown(p -> p.forEachExpression(AbstractConvertFunction.class, f -> { | ||
| if (f.field() instanceof Attribute attr && unionAll.output().contains(attr)) { | ||
| convertFunctions.putIfAbsent(attr.name(), f); |
I think this is problematic: we convert an emerging union output attribute to the first[*] type showing up in a conversion function, irrespective of this conversion function applying to the union output attribute or not.
That is: if x is a field in two indices (of different types), part of the output of something like:
FROM index1, (FROM index2) | EVAL some_field = x::LONG | EVAL some_other_field = x::INTEGER
x's type will be either LONG or INTEGER, depending on the order of those EVALs. Which I think is wrong.
[*] first in order of walking the tree.
I guess we probably want to restrict the map to only those attributes that apply a conversion function to another one with the same name? In which case, I guess we'll want to simply push down the conversion function to the UnionAll branches -- so not just copy it, but push it away downwards: right now the conversions stay, even if done on the same name attribute.
It is a good idea, I'll see how to deal with it. I'd like to keep the explicit casting push down in this PR. I'm going to remove the implicit casting among subqueries and main index patterns from this PR and do it as a follow-up, as suggested by @alex-spies in the design review. If you haven't reviewed the implicit casting part, you can skip it for now.
Thank you so much for catching this! I think we can handle multiple conversion functions on the same UnionAll output better. The collectConvertFunctions method has been updated to collect all explicit conversion functions from the main query that reference the outputs of UnionAll. These explicit conversion functions are now pushed down into each UnionAll branch and returned as new output attributes. And the subsequent commands after UnionAll can reference the outputs without confusion.
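The order dependence raised in this thread, and the "collect everything" alternative, can be sketched with ordinary collections. `Conversion` is a hypothetical record standing in for an `AbstractConvertFunction` applied to a named attribute; the real `collectConvertFunctions` may differ in structure.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ConversionCollectionSketch {
    // Hypothetical stand-in: "attribute::targetType" as seen while walking the plan.
    record Conversion(String attribute, String targetType) {}

    // Original approach: putIfAbsent keeps whichever conversion is visited first.
    static Map<String, String> firstWins(List<Conversion> conversions) {
        Map<String, String> byName = new LinkedHashMap<>();
        for (Conversion c : conversions) {
            byName.putIfAbsent(c.attribute(), c.targetType());
        }
        return byName;
    }

    // Alternative: keep every distinct target type requested for an attribute,
    // so each one can be pushed down as its own output.
    static Map<String, Set<String>> collectAll(List<Conversion> conversions) {
        Map<String, Set<String>> byName = new LinkedHashMap<>();
        for (Conversion c : conversions) {
            byName.computeIfAbsent(c.attribute(), k -> new LinkedHashSet<>()).add(c.targetType());
        }
        return byName;
    }

    public static void main(String[] args) {
        List<Conversion> evals = List.of(new Conversion("x", "LONG"), new Conversion("x", "INTEGER"));
        System.out.println(firstWins(evals));  // {x=LONG}
        System.out.println(collectAll(evals)); // {x=[LONG, INTEGER]}
    }
}
```

With `firstWins`, reversing the two EVALs flips the recorded type of `x`; `collectAll` yields the same set of target types regardless of visiting order.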
|
|
||
| private LogicalPlan maybePushDownConvertFunctionsToChild(LogicalPlan child, List<Alias> aliases, List<Attribute> output) { | ||
| // Fork/UnionAll adds an EsqlProject on top of each child plan during resolveFork, check this pattern before pushing down | ||
| if (aliases.isEmpty() == false && child instanceof EsqlProject esqlProject) { |
Why do we need to check for the EsqlProject presence / pattern?
Up to here, the branches of UnionAll follow a common pattern like below:
UnionAll/Fork
  Project
    Eval (optional)
      Subquery (main index patterns do not have this node)
        ...
In both ResolveUnionTypesInUnionAll and PushDownFilterAndLimitIntoUnionAll, the pattern checks are intentionally strict — these rules only apply their transformations when a branch of UnionAll exactly matches the expected patterns. The main goal is to have a tight/better control over their behavior and ensure they perform only the intended transformations. If the structure of subqueries changes in the future and no longer fits the expected pattern, these rules will simply stop applying, allowing us to detect such changes early in the process.
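A strict shape check of this kind might look like the sketch below. The node types are hypothetical records, not the real `LogicalPlan` classes, and the check covers only the Project, optional Eval, Subquery chain described above.

```java
final class BranchPatternSketch {
    // Hypothetical, minimal plan nodes used only for this illustration.
    interface Plan {}
    record Relation(String indexPattern) implements Plan {}
    record Subquery(Plan child) implements Plan {}
    record Eval(Plan child) implements Plan {}
    record Project(Plan child) implements Plan {}

    // True only when the branch is exactly Project -> [Eval] -> Subquery.
    // Any other shape is left alone, so a rule guarded by this check simply does not fire.
    static boolean matchesSubqueryBranch(Plan branch) {
        if (branch instanceof Project project) {
            Plan below = project.child();
            if (below instanceof Eval eval) {
                below = eval.child();
            }
            return below instanceof Subquery;
        }
        return false;
    }

    public static void main(String[] args) {
        Plan subqueryBranch = new Project(new Eval(new Subquery(new Relation("index2"))));
        Plan mainBranch = new Project(new Relation("index1"));
        System.out.println(matchesSubqueryBranch(subqueryBranch)); // true
        System.out.println(matchesSubqueryBranch(mainBranch));     // false
    }
}
```

The trade-off is the one stated in the reply: a strict match skips the optimization on unexpected shapes rather than risk an unintended rewrite.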
| DataType targetType = convertFunctions.containsKey(oldAttr.name()) | ||
| ? convertFunctions.get(oldAttr.name()).dataType() | ||
| : oldAttr.dataType(); | ||
| if (oldAttr.dataType() != targetType) { |
I think this condition can only be true if convertFunctions.containsKey(oldAttr.name()). Maybe we can simplify a bit the code here for better legibility.
…across subqueries
…on only on coordinator node
This PR enables support for non-correlated subqueries within the FROM command. Related to https://github.com/elastic/esql-planning/issues/89

A non-correlated subquery in this context is one that is fully self-contained and does not reference attributes from the outer query. Enabling support for these subqueries in the FROM command provides an additional way to define a data source, beyond directly specifying index patterns in an ES|QL query.

Example

This feature is built on top of Fork. Subqueries are processed in a manner similar to how Fork operates today, with modifications made to the following components to support this functionality:
- FROM_MODE is updated to support subquery syntax.
- LogicalPlanBuilder creates a UnionAll logical plan on top of multiple data sources. Each data source can be either index patterns or subqueries. UnionAll extends Fork, but unlike Fork, each UnionAll leg may fetch data from different indices; this is one of the key differences between UnionAll and Fork.
- … fieldcaps calls to build an IndexResolution for each subquery.
- … UnionAll leg, InvalidMappedField are not created across them. If conversion functions are required for common fields between the main index and subquery indices, those conversion functions must be pushed down into each UnionAll leg.
- … UnionAll and Fork, as predicate pushdown applies only to UnionAll, while Fork remains unchanged.

Restrictions and follow-ups to be addressed in the next PRs:
- … LogicalPlanOptimizer will error out if the subquery has commands besides the FROM command. This is tracked in [ES|QL] Allow nested non-correlated subqueries in from command #136034.
- Improve FieldNameUtils.resolveFieldNames to identify subquery field names for the field caps call, instead of using all fields *. Tracked in #137283.
- … LocalRelation created by PruneFilters includes all of the output of the subquery, and the output is the superset of the outputs from each subquery, which looks confusing; the outputs that are not directly from the current subquery can be excluded from the LocalRelation.
- Push down Filters/Predicates on mixed data type UnionAll output with explicit casting. Tracked in #137284.
- … UnionAll output
- Remove the implicit LIMIT appended to each subquery (UnionAll branch). Tracked in #138106.
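To illustrate what "the output is the superset of the outputs from each subquery" means for column layout, here is a small sketch over plain column-name lists. Filling missing columns with null is an assumption made for this example, and `unionColumns` / `alignRow` are hypothetical helpers, not code from the PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class UnionOutputSketch {
    // The combined output is the ordered union of the column names of every branch.
    static List<String> unionColumns(List<List<String>> branchOutputs) {
        Set<String> columns = new LinkedHashSet<>();
        for (List<String> output : branchOutputs) {
            columns.addAll(output);
        }
        return new ArrayList<>(columns);
    }

    // Pads one branch row to the combined layout; columns the branch does not
    // produce become null (an assumption for this sketch).
    static List<Object> alignRow(List<String> unionColumns, Map<String, Object> row) {
        List<Object> aligned = new ArrayList<>();
        for (String column : unionColumns) {
            aligned.add(row.get(column));
        }
        return aligned;
    }

    public static void main(String[] args) {
        List<String> columns = unionColumns(List.of(List.of("a", "x"), List.of("x", "b")));
        System.out.println(columns); // [a, x, b]
        Map<String, Object> rowFromSecondBranch = Map.of("x", 2, "b", "k");
        System.out.println(alignRow(columns, rowFromSecondBranch)); // [null, 2, k]
    }
}
```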