Skip to content

Conversation

@fang-xing-esql
Copy link
Member

@fang-xing-esql fang-xing-esql commented Sep 30, 2025

This PR enables support for non-correlated subqueries within the FROM command. Related to https://github.com/elastic/esql-planning/issues/89

A non-correlated subquery in this context is one that is fully self-contained and does not reference attributes from the outer query. Enabling support for these subqueries in the FROM command provides an additional way to define a data source, beyond directly specifying index patterns in an ES|QL query.

Example

FROM index1, (FROM index2
              | WHERE a > 10
              | EVAL b = a * 2
              | STATS cnt = COUNT(*) BY c
              | SORT cnt desc
              | LIMIT 10)
, index3, (FROM index4 
           | stats count(*))
| WHERE d > 10
| STATS max = max(*) BY e
| SORT max desc

This feature is built on top of Fork. Subqueries are processed in a manner similar to how Fork operates today, with modifications made to the following components to support this functionality:

  • Grammar: FROM_MODE is updated to support subquery syntax.
  • Parser: LogicalPlanBuilder creates a UnionAll logical plan on top of multiple data sources. Each data source can be either index patterns or subqueries. UnionAll extends Fork, but unlike Fork, each UnionAll leg may fetch data from different indices—this is one of the key differences between UnionAll and Fork.
  • PreAnalyzer: Extracts index patterns from subqueries and issues fieldcaps calls to build an IndexResolution for each subquery.
  • Analyzer: Resolves indices referenced by subqueries and handles union-typed fields referenced within them. Since subquery index patterns and main query index patterns are accessed separately behind each UnionAll leg, InvalidMappedField are not created across them. If conversion functions are required for common fields between the main index and subquery indices, those conversion functions must be pushed down into each UnionAll leg.
  • LogicalPlanOptimizer: Pushes down eligible filters/predicates from the main query into subqueries. This is another key distinction between UnionAll and Fork, as predicate pushdown applies only to UnionAll, while Fork remains unchanged.

Restrictions and follow ups to be addressed in the next PRs:

@elasticsearchmachine
Copy link
Collaborator

Hi @fang-xing-esql, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Hi @fang-xing-esql, I've created a changelog YAML for you.

@fang-xing-esql fang-xing-esql added the test-release Trigger CI checks against release build label Sep 30, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces support for non-correlated subqueries within the FROM command in ES|QL, allowing queries to reference multiple data sources including both index patterns and subqueries. The implementation enables subqueries to be processed similarly to Fork operations, with key distinctions in index resolution and predicate pushdown capabilities.

  • Adds grammar and parser support for subquery syntax in FROM commands
  • Implements UnionAll logical plan to handle mixed index patterns and subqueries
  • Enables predicate pushdown optimization specifically for UnionAll operations

Reviewed Changes

Copilot reviewed 36 out of 39 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
EsqlBaseParser.g4 Updates grammar to support subquery syntax in FROM_MODE
LogicalPlanBuilder.java Creates UnionAll plans and handles subquery/index pattern combinations
UnionAll.java New logical plan extending Fork with union-typed field support
Subquery.java New logical plan node representing subquery placeholders
Analyzer.java Resolves subquery indices and handles union-typed fields
PushDownAndCombineFilters.java Adds predicate pushdown optimization for UnionAll
EsqlSession.java Implements subquery index resolution during pre-analysis
Various test files Adds comprehensive test coverage for subquery functionality
Comments suppressed due to low confidence (1)

x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/parser/SubqueryTests.java:1

  • There's a typo in "nested fork/subquery is not supported, it passes Analyzer" - should be "nested fork/subquery is not supported; it passes Analyzer" (semicolon instead of comma for better grammar).
/*

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

}
return parent;
} else { // We should not reach here as the grammar does not allow it
throw new ParsingException("FROM is required in a subquery");
Copy link

Copilot AI Oct 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message "FROM is required in a subquery" is misleading since the grammar already enforces this requirement. Consider a more descriptive message like "Invalid subquery structure" or remove the comment and exception if this code path is truly unreachable.

Suggested change
throw new ParsingException("FROM is required in a subquery");
throw new ParsingException("Invalid subquery structure");

Copilot uses AI. Check for mistakes.
LogicalPlan newChild = switch (child) {
case Project project -> maybePushDownFilterPastProjectForUnionAllChild(pushable, project);
case Limit limit -> maybePushDownFilterPastLimitForUnionAllChild(pushable, limit);
default -> null; // TODO add a general push down for unexpected pattern
Copy link

Copilot AI Oct 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TODO comment indicates incomplete functionality. Consider implementing the general push down logic or at least provide a more specific plan for when this will be addressed, as returning null could lead to silent failures in optimization.

Suggested change
default -> null; // TODO add a general push down for unexpected pattern
default -> {
// Fallback: unknown child type, do not push down filter for this child.
// Consider implementing general push down logic here in the future.
yield child;
}

Copilot uses AI. Check for mistakes.
boolean supportsAggregateMetricDouble,
boolean supportsDenseVector
boolean supportsDenseVector,
Set<IndexPattern> subqueryIndices
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merging this subqueryIndices into the mainIndices is another option, it will require changes to EsqlCCSUtils.initCrossClusterState and EsqlCCSUtils.createIndexExpressionFromAvailableClusters, as they associate the ExecutionInfo with only one index pattern today.

hasCapabilities(adminClient(), List.of(ENABLE_FORK_FOR_REMOTE_INDICES.capabilityName()))
);
}
// Subqueries in FROM are not fully tested in CCS yet
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there is subquery exists in the query convertToRemoteIndices doesn't generate a correct remote index pattern yet, the query becomes invalid. Subqueries are not fully tested in CCS yet, working on it as a follow up.

@fang-xing-esql fang-xing-esql marked this pull request as ready for review October 2, 2025 15:16
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Oct 2, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/kibana-esql (ES|QL-ui)

Copy link
Contributor

@luigidellaquila luigidellaquila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @fang-xing-esql, that looks pretty good.
I left another round of comments, most of them are very minor changes, but one needs to be fixed before moving forward

@fang-xing-esql
Copy link
Member Author

Thank you so much for reviewing this PR in such depth, I really appreciate the time and thought you put into it, @astefan !

I'll create a follow up issue for improving FieldNameUtils.resolveFieldNames, we can do a better job here to recognize field names referenced in subqueries, instead of sending * to the field caps call., thank you for reminding me about it!

I made a couples of changes and added additional tests related to subqueries combined with fork and full-text functions, both in the main query and within subqueries. Initially, I planned to address these in a separate PR since this one is already quite large, but I think these scenarios are important enough to include here.

First, FieldNameUtils.resolveFieldNames currently asserts when encountering nested forks. For cases involving a subquery and a fork, this check has been deferred to LogicalVerifier, which now throws a VerificationException instead of an assertion error. as allowing nested subqueries and subquery + fork is planned as the next step.

Second, the validation of commands before a full-text function, as well as the verification of the field referenced by that function, is now deferred to LogicalVerifier when a UnionAll plan exists in its child plans. Since the output of a UnionAll plan is always a ReferenceAttribute, and certain full-text functions apply only to fields, deferring this validation allows the PushDownFilterAndLimitIntoUnionAll rule to push the full-text functions down into individual subqueries. This enables those validations to occur within the subquery context, increasing the chances that full-text functions can be properly evaluated.

@fang-xing-esql
Copy link
Member Author

fang-xing-esql commented Oct 14, 2025

Thanks @fang-xing-esql, that looks pretty good. I left another round of comments, most of them are very minor changes, but one needs to be fixed before moving forward

Thank you so much for your thoughtful reviews @luigidellaquila! They really help this PR to become a better PR. I think I addressed all of them, please just let me know if there is anything that I missed.

Copy link
Contributor

@luigidellaquila luigidellaquila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @fang-xing-esql
There are still a couple of things I'd like to check in detail (the pushdown logic in particular), but for what I could see it looks good, so I'm approving to unblock the merge.

return DATE_NANOS;
}

if (t1.isCounter()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd expect this logic to be in EsqlDataTypeConverter.commonType() (that you are using below btw).
The only logical difference is noCounter().

Maybe commonTypes() could just do

                for (List<Attribute> out : outputs) {
                    type = EsqlDataTypeConverter.commonType(type, out.get(i).dataType()).noCounter();
                }

Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Started my journey on this PR. Minor/optional notes only.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be great to have some javadoc for this class.
(I would seem it's mostly useful in PushDownFilterAndLimitIntoUnionAll, where we first push "below" UnionAll and then once there, below Subquery?
Seems like the LogicalPlanBuilder#visitRelation always places the index patern first, followed by the subqueries.
I'm wondering thus if this marker node is really needed. Wouldn't these push downs be doable on each branch by existing rules, if Subquery wasn't there?
But still to look into it and didn't grasp it all yet. Javadoc would be great in any case, though. :) )

Copy link
Member Author

@fang-xing-esql fang-xing-esql Oct 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple of reasons that Subquery node exists in the plan.

First, originally I added this node because in the future we may want to support qualifiers with subqueries(like FROM idx1, (FROM idx2, idx3) as idx23), and this is a place that we can store the name(qualifier) of the subqueries, that can be referenced by the parent query.

Second, when I started working on pushing down filters into subqueries, I realize PushDownAndCombineFilters does not push down filters below limit intentionally, refer to here, however UnionAll/Fork adds a implicit limit for each branch, so this rule doesn't help the predicate pushdown for subqueries. The Subquery node is also used as a pattern for predicate pushdown. I'll add more comments in the code.

Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One issue I see that we could align is the treatment of the "union"/conflicting types: in case these emerge out of an index pattern, we return the type as unsupported, along with the original_types and a suggested_cast. User is in the know, all good.

With UnionAll, we return it of type keyword, even if no index in the union has the conflicting field of that type and null values, but with no other indication whatsoever that there's a conflict. The user won't know of the conflict, which I think is problematic, as this could occur frequently.

Don't know if we have a decision here or track this somewhere?

(Some more comments to follow, sorry :) )

ThreadPool.Names.SEARCH_COORDINATION,
ThreadPool.Names.SYSTEM_READ
);
if (subqueryIndexPattern != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this now ever happen? Shouldn't it be guaranteed by the grammar that it's never null?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The piece of code that processes subquery index patterns in EsqlSession will be in a separate PR from @craigtaverner, and it will be shared by subqueries and views, stay tuned :).

);

} else {
// occurs when dealing with local relations (row a = 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't currently supported by the grammar, no?

&& indexResolution.matches(indexPattern) == false
&& context.subqueryResolution().isEmpty() == false) {
// index pattern does not match main index
indexResolution = context.subqueryResolution().getOrDefault(indexPattern, indexResolution);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
indexResolution = context.subqueryResolution().getOrDefault(indexPattern, indexResolution);
indexResolution = context.subqueryResolution().get(indexPattern);
if (indexResolution == null) {...

Wouldn't it be a bug if the simple get() would return a null? And wouldn't we introduce a new one, if an "unkown" pattern resolves then to the "main" index?

Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are some fixes needed on conflicting/union types from different branches. I've left a few notes, out of which one is, I think, a bug.

Potentially related to that: when running the query in that comment - with index1 only having "x": 1 and index2 having "x": "2" (i.e. both text and keyword fields) - I get a CCE:

[2025-10-17T12:49:31,742][ERROR][o.e.r.ChunkedRestResponseBodyPart] [runTask-0] failure encoding chunk java.lang.ClassCastException: class org.elasticsearch.compute.data.IntVectorBlock cannot be cast to class org.elasti
csearch.compute.data.LongBlock (org.elasticsearch.compute.data.IntVectorBlock and org.elasticsearch.compute.data.LongBlock are in unnamed module of loader java.net.FactoryURLClassLoader @11d2dd2d)
        at org.elasticsearch.xpack.esql.action.PositionToXContent$1.valueToXContent(PositionToXContent.java:72)
        at org.elasticsearch.xpack.esql.action.PositionToXContent.positionToXContent(PositionToXContent.java:53)
        at org.elasticsearch.xpack.esql.action.ResponseXContentUtils.lambda$rowValues$7(ResponseXContentUtils.java:114)
        at [email protected]/org.elasticsearch.rest.ChunkedRestResponseBodyPart$1.encodeChunk(ChunkedRestResponseBodyPart.java:161)
        at [email protected]/org.elasticsearch.rest.RestController$EncodedLengthTrackingChunkedRestResponseBodyPart.encodeChunk(RestController.java:1012)
        at [email protected]/org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.writeChunk(Netty4HttpPipeliningHandler.java:436)
        at [email protected]/org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.doWriteChunkedResponse(Netty4HttpPipeliningHandler.java:263)
        at [email protected]/org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.doWrite(Netty4HttpPipeliningHandler.java:231)
        at [email protected]/org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.write(Netty4HttpPipeliningHandler.java:184)
        at [email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:891)
        at [email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:956)
        at [email protected]/io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1263)
        at [email protected]/io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
        at [email protected]/io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
        at [email protected]/io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
        at [email protected]/io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
        at [email protected]/io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:998)
        at [email protected]/io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at java.base/java.lang.Thread.run(Thread.java:1575)

Also, something like:

FROM index1, (FROM index2) | EVAL l = x::LONG | EVAL i = x::INTEGER | KEEP x, l, i

fails with a verification exception:

"org.elasticsearch.xpack.esql.VerificationException: Found 1 problem\nline 130:47: Output has changed from [[x{r}#204, l{r}#191, i{r}#194]] to [[x{r}#204, l{r}#191, i{r}#194]].
	at org.elasticsearch.xpack.esql.optimizer.LogicalPlanOptimizer.optimize(LogicalPlanOptimizer.java:121)
	at org.elasticsearch.xpack.esql.session.EsqlSession.optimizedPlan(EsqlSession.java:947)
	at org.elasticsearch.xpack.esql.session.EsqlSession$1.lambda$onResponse$1(EsqlSession.java:228)
	at [email protected]/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:261)
	at [email protected]/org.elasticsearch.action.support.SubscribableListener$SuccessResult.complete(SubscribableListener.java:406)
	at [email protected]/org.elasticsearch.action.support.SubscribableListener.tryComplete(SubscribableListener.java:326)
	at [email protected]/org.elasticsearch.action.support.SubscribableListener.addListener(SubscribableListener.java:222)
	at [email protected]/org.elasticsearch.action.support.SubscribableListener.lambda$andThen$1(SubscribableListener.java:534)
	at [email protected]/org.elasticsearch.action.ActionListener.run(ActionListener.java:465)
	at [email protected]/org.elasticsearch.action.support.SubscribableListener.newForked(SubscribableListener.java:138)
	at [email protected]/org.elasticsearch.action.support.SubscribableListener.andThen(SubscribableListener.java:534)
	at [email protected]/org.elasticsearch.action.support.SubscribableListener.andThen(SubscribableListener.java:489)
	at org.elasticsearch.xpack.esql.session.EsqlSession$1.onResponse(EsqlSession.java:228)
	at org.elasticsearch.xpack.esql.session.EsqlSession$1.onResponse(EsqlSession.java:219)
	at [email protected]/org.elasticsearch.action.support.SubscribableListener$SuccessResult.complete(SubscribableListener.java:406)
	at [email protected]/org.elasticsearch.action.support.SubscribableListener.tryComplete(SubscribableListener.java:326)
	at [email protected]/org.elasticsearch.action.support.SubscribableListener.setResult(SubscribableListener.java:355)
	at [email protected]/org.elasticsearch.action.support.SubscribableListener.onResponse(SubscribableListener.java:262)
	at org.elasticsearch.xpack.esql.session.EsqlSession.analyzeWithRetry(EsqlSession.java:885)
	at org.elasticsearch.xpack.esql.session.EsqlSession.lambda$resolveIndices$14(EsqlSession.java:538)
	at [email protected]/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:261)
	at [email protected]/org.elasticsearch.action.support.SubscribableListener$SuccessResult.complete(SubscribableListener.java:406)
	at [email protected]/org.elasticsearch.action.support.SubscribableListener.tryComplete(SubscribableListener.java:326)
	at [email protected]/org.elasticsearch.action.support.SubscribableListener.setResult(SubscribableListener.java:355)
	at [email protected]/org.elasticsearch.action.support.SubscribableListener.onResponse(SubscribableListener.java:262)
	at org.elasticsearch.xpack.esql.session.EsqlSession.preAnalyzeSubqueryIndices(EsqlSession.java:554)
	at org.elasticsearch.xpack.esql.session.EsqlSession.lambda$preAnalyzeSubqueryIndices$15(EsqlSession.java:551)
	at [email protected]/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:261)
	at org.elasticsearch.xpack.esql.session.EsqlSession.lambda$preAnalyzeSubqueryIndex$16(EsqlSession.java:589)
	at [email protected]/org.elasticsearch.action.ActionListenerImplementations$DelegatingFailureActionListener.onResponse(ActionListenerImplementations.java:233)
	at org.elasticsearch.xpack.esql.session.IndexResolver.lambda$resolveAsMergedMapping$0(IndexResolver.java:100)
	at [email protected]/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:261)
	at [email protected]/org.elasticsearch.action.ActionListener$3.onResponse(ActionListener.java:413)
	at [email protected]/org.elasticsearch.tasks.TaskManager$1.onResponse(TaskManager.java:228)
	at [email protected]/org.elasticsearch.tasks.TaskManager$1.onResponse(TaskManager.java:222)
	at [email protected]/org.elasticsearch.action.ActionListenerImplementations$RunBeforeActionListener.onResponse(ActionListenerImplementations.java:350)
	at [email protected]/org.elasticsearch.action.ActionListener$3.onResponse(ActionListener.java:413)
	at [email protected]/org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:33)
	at [email protected]/org.elasticsearch.action.ActionListenerImplementations$MappedActionListener.onResponse(ActionListenerImplementations.java:111)
	at org.elasticsearch.xpack.esql.action.EsqlResolveFieldsAction.finishHim(EsqlResolveFieldsAction.java:338)
	at org.elasticsearch.xpack.esql.action.EsqlResolveFieldsAction.lambda$doExecuteForked$7(EsqlResolveFieldsAction.java:232)
	at [email protected]/org.elasticsearch.core.AbstractRefCounted$1.closeInternal(AbstractRefCounted.java:125)
	at [email protected]/org.elasticsearch.core.AbstractRefCounted.decRef(AbstractRefCounted.java:77)
	at [email protected]/org.elasticsearch.action.support.RefCountingRunnable.close(RefCountingRunnable.java:113)
	at [email protected]/org.elasticsearch.core.Releasables$4.close(Releasables.java:178)
	at [email protected]/org.elasticsearch.common.util.concurrent.RunOnce.run(RunOnce.java:41)
	at [email protected]/org.elasticsearch.action.fieldcaps.RequestDispatcher.innerExecute(RequestDispatcher.java:177)
	at [email protected]/org.elasticsearch.action.fieldcaps.RequestDispatcher$1.doRun(RequestDispatcher.java:146)
	at [email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
	at [email protected]/org.elasticsearch.action.fieldcaps.TransportFieldCapabilitiesAction$1.onResponse(TransportFieldCapabilitiesAction.java:328)
	at [email protected]/org.elasticsearch.action.fieldcaps.TransportFieldCapabilitiesAction$1.onResponse(TransportFieldCapabilitiesAction.java:324)
	at [email protected]/org.elasticsearch.common.util.concurrent.AbstractThrottledTaskRunner$1.doRun(AbstractThrottledTaskRunner.java:136)
	at [email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
	at [email protected]/org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:35)
	at [email protected]/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1067)
	at [email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1575)\n"

Map<String, AbstractConvertFunction> convertFunctions = new HashMap<>();
plan.forEachDown(p -> p.forEachExpression(AbstractConvertFunction.class, f -> {
if (f.field() instanceof Attribute attr && unionAll.output().contains(attr)) {
convertFunctions.putIfAbsent(attr.name(), f);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is problematic: we convert an emerging union output attribute to the first[*] type showing up in a conversion function, irrespective of this conversion function applying to the union output attribute or not.
That is: if x is a field in two indices (of different types), part of the output of something like:

FROM index1, (FROM index2) | EVAL some_field = x::LONG | EVAL some_other_field = x::INTEGER

x's type will be either LONG or INTEGER, depending on the order of those EVALs. Which I think is wrong.

[*] first in order of walking the tree.

I guess we probably want to restrict the map to only those attributes that apply a conversion function to another one with the same name? In which case, I guess we'll want to simply push down the conversion function to the UnionAll branches -- so not just copy it, but push it away downwards: right now the conversions stay, even if done on the same name attribute.

Copy link
Member Author

@fang-xing-esql fang-xing-esql Oct 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a good idea, I'll see how to deal with it. I'd like to keep the explicit casting push down in this PR. And I'm going to remove the implicit casting among subqueries and main index patterns from this PR, and do it as a follow up, as suggested by @alex-spies in the design review, if you haven't review the implicit casting part, you can skip it for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for catching this! I think we can handle multiple conversion functions on the same UnionAll output better. The collectConvertFunctions method has been updated to collect all explicit conversion functions from the main query that reference the outputs of UnionAll. These explicit conversion functions are now pushed down into each UnionAll branch and returned as new output attributes. And the subsequent commands after UnionAll can reference the outputs without confusion.


private LogicalPlan maybePushDownConvertFunctionsToChild(LogicalPlan child, List<Alias> aliases, List<Attribute> output) {
// Fork/UnionAll adds an EsqlProject on top of each child plan during resolveFork, check this pattern before pushing down
if (aliases.isEmpty() == false && child instanceof EsqlProject esqlProject) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to check for the EsqlProject presence / pattern?

Copy link
Member Author

@fang-xing-esql fang-xing-esql Oct 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to here, the branches of UnionAll follow a common pattern like below:

UnionAll/Fork
  Project
    Eval (optional)
      Subquery (main index patterns do not have this node)
        ...

In both ResolveUnionTypesInUnionAll and PushDownFilterAndLimitIntoUnionAll, the pattern checks are intentionally strict — these rules only apply their transformations when a branch of UnionAll exactly matches the expected patterns. The main goal is to have a tight/better control over their behavior and ensure they perform only the intended transformations. If the structure of subqueries changes in the future and no longer fits the expected pattern, these rules will simply stop applying, allowing us to detect such changes early in the process.

DataType targetType = convertFunctions.containsKey(oldAttr.name())
? convertFunctions.get(oldAttr.name()).dataType()
: oldAttr.dataType();
if (oldAttr.dataType() != targetType) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this condition can only be true if convertFunctions.containsKey(oldAttr.name()). Maybe we can simplify a bit the code here for better legibility.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >enhancement ES|QL-ui Impacts ES|QL UI Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) test-release Trigger CI checks against release build v9.3.0