Description
Elasticsearch Version
main/9.1
Installed Plugins
No response
Java Version
bundled
OS Version
Problem Description
The LOOKUP JOIN query fails when the lookup operator is combined with ENRICH.
Steps to Reproduce
The query is:
FROM test191 | EVAL language_code = "1" | LOOKUP JOIN test-lookup191 ON color.keyword | ENRICH _remote:lang_r
Here, test191 is the test index from 191_lookup_join_text.yml, test-lookup191 is the lookup index from the same test, and lang_r is an enrich policy with language_code as the key.
This produces the following exception on main:
[2025-06-12T16:03:09,790][WARN ][o.e.x.e.a.EsqlResponseListener] [node-1] ESQL request failed with status [INTERNAL_SERVER_ERROR]: java.lang.ClassCastException: class org.elasticsearch.xpack.esql.plan.physical.ProjectExec cannot be cast to class org.elasticsearch.xpack.esql.plan.physical.EsQueryExec (org.elasticsearch.xpack.esql.plan.physical.ProjectExec and org.elasticsearch.xpack.esql.plan.physical.EsQueryExec are in unnamed module of loader java.net.URLClassLoader @357c9bd9)
at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planLookupJoin(LocalExecutionPlanner.java:719)
at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:294)
at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planLimit(LocalExecutionPlanner.java:838)
at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:259)
at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planOutput(LocalExecutionPlanner.java:387)
at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:298)
at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:217)
at org.elasticsearch.xpack.esql.plugin.ComputeService.runCompute(ComputeService.java:582)
at org.elasticsearch.xpack.esql.plugin.ComputeService.executePlan(ComputeService.java:406)
at org.elasticsearch.xpack.esql.plugin.ComputeService.execute(ComputeService.java:198)
at org.elasticsearch.xpack.esql.plugin.TransportEsqlQueryAction.lambda$innerExecute$3(TransportEsqlQueryAction.java:236)
at org.elasticsearch.xpack.esql.session.EsqlSession.executeSubPlans(EsqlSession.java:248)
at org.elasticsearch.xpack.esql.session.EsqlSession.executeOptimizedPlan(EsqlSession.java:213)
at org.elasticsearch.xpack.esql.session.EsqlSession$1.lambda$onResponse$0(EsqlSession.java:190)
at [email protected]/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:261)
at org.elasticsearch.xpack.esql.planner.premapper.PreMapper.lambda$preMapper$0(PreMapper.java:33)
at [email protected]/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:261)
at org.elasticsearch.xpack.esql.expression.function.fulltext.QueryBuilderResolver.resolveQueryBuilders(QueryBuilderResolver.java:50)
at org.elasticsearch.xpack.esql.planner.premapper.PreMapper.queryRewrite(PreMapper.java:38)
at org.elasticsearch.xpack.esql.planner.premapper.PreMapper.preMapper(PreMapper.java:31)
at org.elasticsearch.xpack.esql.session.EsqlSession$1.onResponse(EsqlSession.java:187)
at org.elasticsearch.xpack.esql.session.EsqlSession$1.onResponse(EsqlSession.java:184)
at org.elasticsearch.xpack.esql.session.EsqlSession.analyzeAndMaybeRetry(EsqlSession.java:549)
The log with the plans can be seen here: https://gist.github.com/smalyshev/22dea5574e516ef499d37556d6a61d6f
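To illustrate the failure mode in the top frame, here is a minimal sketch of how an unconditional cast produces this kind of ClassCastException when a planner method receives a node type it did not expect. The classes below are hypothetical stand-ins, not the real ES|QL plan classes, and the guarded variant is only one possible way to surface a clearer error; it is not a proposed fix for planLookupJoin.

```java
public class CastSketch {
    // Hypothetical stand-ins for physical plan nodes (assumption, not the real classes).
    interface PhysicalPlan {}
    record EsQueryExec(String index) implements PhysicalPlan {}
    record ProjectExec(PhysicalPlan child) implements PhysicalPlan {}

    // Unconditional cast: throws ClassCastException for any node that is not an EsQueryExec.
    static String indexUnchecked(PhysicalPlan right) {
        return ((EsQueryExec) right).index();
    }

    // Guarded variant: fails with a message that names the unexpected node type.
    static String indexChecked(PhysicalPlan right) {
        if (right instanceof EsQueryExec query) {
            return query.index();
        }
        throw new IllegalStateException(
            "expected EsQueryExec but got " + right.getClass().getSimpleName());
    }

    public static void main(String[] args) {
        // Hypothetical shape: a projection sitting where a query node was expected.
        PhysicalPlan right = new ProjectExec(new EsQueryExec("test-lookup191"));
        try {
            indexUnchecked(right);
        } catch (ClassCastException e) {
            System.out.println("unchecked cast failed");
        }
        try {
            indexChecked(right);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```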
IMHO the underlying issue is that LookupJoin is not processed correctly when it is on the remote side (inside FragmentExec). I had been hitting "Duplicate name ids are not allowed" errors in Layout.java, and discovered this issue while trying to construct a reduced example of that one. By default, many mapping operations leave LookupJoin to execute on the coordinator side of the computation because of this check:
    if (left instanceof FragmentExec fragment) {
        return new FragmentExec(bp);
    }
This check often fails when FragmentExec is not the immediate child of LookupJoin (the same issue we dealt with when working on remote ENRICH), but if I manage to force LookupJoin onto the FragmentExec side, other problems start.
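As a rough illustration of why an immediate-child check misses a nested FragmentExec, here is a minimal sketch using hypothetical stand-in classes (Plan, EvalExec and this FragmentExec are assumptions for the example, not the real ES|QL classes). It contrasts the immediate-child test with a recursive subtree search; whether a subtree search is the right approach for the mapper is not something this sketch establishes.

```java
import java.util.List;

public class FragmentCheckSketch {
    // Hypothetical stand-ins for physical plan nodes (assumption, not the real classes).
    interface Plan {
        List<Plan> children();
    }

    record FragmentExec() implements Plan {
        public List<Plan> children() {
            return List.of();
        }
    }

    record EvalExec(Plan child) implements Plan {
        public List<Plan> children() {
            return List.of(child);
        }
    }

    // Mirrors the quoted check: only the immediate child is inspected.
    static boolean immediateChildIsFragment(Plan left) {
        return left instanceof FragmentExec;
    }

    // Alternative: walk the whole subtree looking for a FragmentExec.
    static boolean subtreeContainsFragment(Plan plan) {
        if (plan instanceof FragmentExec) {
            return true;
        }
        for (Plan child : plan.children()) {
            if (subtreeContainsFragment(child)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // An intermediate node sits between the join's left side and the fragment.
        Plan left = new EvalExec(new FragmentExec());
        System.out.println(immediateChildIsFragment(left)); // prints false
        System.out.println(subtreeContainsFragment(left));  // prints true
    }
}
```

With an intermediate node in between, the immediate-child test returns false even though a FragmentExec exists further down, which matches the behavior described above.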
The test above is constructed to exercise duplicate fields (both test191 and test-lookup191 have color and description fields), but it looks like the current code fails even before getting to the remote side.
Logs (if relevant)
https://gist.github.com/smalyshev/22dea5574e516ef499d37556d6a61d6f