-
Notifications
You must be signed in to change notification settings - Fork 25.6k
resolve remotes with field caps call #133947
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 9 commits
938aa13
068eaa3
73980fa
168f8a5
a2ab223
8b7f8ef
3f25a1e
ff93b63
dd4b4ff
95f8dde
101f8be
42d933e
36cfd23
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||
|---|---|---|---|---|
|
|
@@ -23,6 +23,7 @@ | |||
| import org.elasticsearch.common.util.concurrent.ThreadContext; | ||||
| import org.elasticsearch.common.util.iterable.Iterables; | ||||
| import org.elasticsearch.core.Tuple; | ||||
| import org.elasticsearch.index.query.QueryBuilder; | ||||
| import org.elasticsearch.tasks.Task; | ||||
| import org.elasticsearch.threadpool.ThreadPool; | ||||
| import org.elasticsearch.transport.AbstractTransportRequest; | ||||
|
|
@@ -39,21 +40,24 @@ | |||
| import org.elasticsearch.xpack.core.enrich.EnrichPolicy; | ||||
| import org.elasticsearch.xpack.esql.action.EsqlExecutionInfo; | ||||
| import org.elasticsearch.xpack.esql.analysis.EnrichResolution; | ||||
| import org.elasticsearch.xpack.esql.analysis.PreAnalyzer; | ||||
| import org.elasticsearch.xpack.esql.core.type.DataType; | ||||
| import org.elasticsearch.xpack.esql.core.type.EsField; | ||||
| import org.elasticsearch.xpack.esql.core.util.StringUtils; | ||||
| import org.elasticsearch.xpack.esql.index.EsIndex; | ||||
| import org.elasticsearch.xpack.esql.index.MappingException; | ||||
| import org.elasticsearch.xpack.esql.io.stream.PlanStreamInput; | ||||
| import org.elasticsearch.xpack.esql.io.stream.PlanStreamOutput; | ||||
| import org.elasticsearch.xpack.esql.plan.IndexPattern; | ||||
| import org.elasticsearch.xpack.esql.plan.logical.Enrich; | ||||
| import org.elasticsearch.xpack.esql.session.EsqlCCSUtils; | ||||
| import org.elasticsearch.xpack.esql.session.IndexResolver; | ||||
|
|
||||
| import java.io.IOException; | ||||
| import java.util.ArrayList; | ||||
| import java.util.Collection; | ||||
| import java.util.Collections; | ||||
| import java.util.HashMap; | ||||
| import java.util.HashSet; | ||||
| import java.util.List; | ||||
| import java.util.Map; | ||||
| import java.util.Set; | ||||
|
|
@@ -110,22 +114,37 @@ public static UnresolvedPolicy from(Enrich e) { | |||
| /** | ||||
| * Resolves a set of enrich policies | ||||
| * | ||||
| * @param enriches the unresolved policies | ||||
| * @param preAnalysis to retrieve indices and enriches to resolve | ||||
| * @param requestFilter to resolve target clusters | ||||
| * @param executionInfo the execution info | ||||
| * @param listener notified with the enrich resolution | ||||
| */ | ||||
| public void resolvePolicies(List<Enrich> enriches, EsqlExecutionInfo executionInfo, ActionListener<EnrichResolution> listener) { | ||||
| if (enriches.isEmpty()) { | ||||
| public void resolvePolicies( | ||||
| PreAnalyzer.PreAnalysis preAnalysis, | ||||
| QueryBuilder requestFilter, | ||||
| EsqlExecutionInfo executionInfo, | ||||
| ActionListener<EnrichResolution> listener | ||||
| ) { | ||||
| if (preAnalysis.enriches.isEmpty()) { | ||||
| listener.onResponse(new EnrichResolution()); | ||||
| return; | ||||
| } | ||||
|
|
||||
| doResolvePolicies( | ||||
| new HashSet<>(executionInfo.getClusters().keySet()), | ||||
| enriches.stream().map(EnrichPolicyResolver.UnresolvedPolicy::from).toList(), | ||||
| executionInfo, | ||||
| listener | ||||
| ); | ||||
| doResolveRemotes(preAnalysis.indices, requestFilter, listener.delegateFailureAndWrap((l, remotes) -> { | ||||
| doResolvePolicies(remotes, preAnalysis.enriches.stream().map(UnresolvedPolicy::from).toList(), executionInfo, l); | ||||
| })); | ||||
| } | ||||
|
|
||||
| private void doResolveRemotes(List<IndexPattern> indexPatterns, QueryBuilder requestFilter, ActionListener<Set<String>> listener) { | ||||
| switch (indexPatterns.size()) { | ||||
| case 0 -> listener.onResponse(Set.of()); | ||||
| case 1 -> indexResolver.resolveConcreteIndices( | ||||
| indexPatterns.getFirst().indexPattern(), | ||||
| requestFilter, | ||||
| listener.map(EsqlCCSUtils::getRemotesOf) | ||||
| ); | ||||
| default -> listener.onFailure(new MappingException("Queries with multiple indices are not supported")); | ||||
|
||||
| listener.onFailure(new MappingException("Queries with multiple indices are not supported")); |
but that is a good call.
This is OK for now, but it will change when we start supporting subqueries and proper JOINs.
I suspect we would still need a distinction between main/top-level from and all the nested ones.
I will think about a way to improve it (across all usages)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand it correctly, latest CI run (that is green) includes my tests.
All good then 🎉
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -42,6 +42,7 @@ | |
| import java.util.TreeMap; | ||
| import java.util.TreeSet; | ||
|
|
||
| import static java.util.stream.Collectors.toSet; | ||
| import static org.elasticsearch.xpack.esql.core.type.DataType.DATETIME; | ||
| import static org.elasticsearch.xpack.esql.core.type.DataType.KEYWORD; | ||
| import static org.elasticsearch.xpack.esql.core.type.DataType.OBJECT; | ||
|
|
@@ -74,6 +75,16 @@ public IndexResolver(Client client) { | |
| this.client = client; | ||
| } | ||
|
|
||
| public void resolveConcreteIndices(String indexPattern, QueryBuilder requestFilter, ActionListener<Set<String>> listener) { | ||
| client.execute( | ||
| EsqlResolveFieldsAction.TYPE, | ||
| createFieldCapsRequest(indexPattern, Set.of("_id"), requestFilter, false), | ||
| listener.delegateFailureAndWrap((l, response) -> { | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wonder though what happens here if one of the clusters fails. Will it fail this listener and thus the whole request? In other places, we have to do some handling to figure out what failed and skip failed clusters and so on, but we don't do it here. I'm not sure what happens here if one of the clusters fails? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What I am afraid of is if |
||
| l.onResponse(response.getIndexResponses().stream().map(FieldCapabilitiesIndexResponse::getIndexName).collect(toSet())); | ||
| }) | ||
| ); | ||
| } | ||
|
|
||
| /** | ||
| * Resolves a pattern to one (potentially compound meaning that spawns multiple indices) mapping. | ||
| */ | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the index pattern also contains a cluster/project prefix, will this still be correct?
eg.
The enrich can only be calculated on the coordinator (after the STATS reduction), but I guess the enrich index on the coordinator won't be resolved. I'm not sure it works at all today tbh, maybe worth checking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is a valid situation, where all supplied indices are correct specific indices (not aliases nor patterns) and exist.
In such cases we would resolve to exact same 2 identifiers and would derive 2 remotes from them:
remote1andremote2There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before merging, I'd just prefer to have a test for this query and make sure we are not introducing a regression.
Apart from that, I think we can proceed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For LOOKUP JOINs we have a specific check: we just don't allow it.
I'm checking ENRICH now, but I couldn't find similar validation for it in the codebase, so I guess it's being treated differently...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it is behaving the same way. It looks like we have a similar test for enrich:
elasticsearch/x-pack/plugin/esql/src/internalClusterTest/java/org/elasticsearch/xpack/esql/action/CrossClusterEnrichIT.java
Lines 398 to 421 in 37b28e4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@idegtiarenko I added a test in
mainfor this case, see #134199, and it seems to work fine.I think it makes sense to run it on this PR and see if the results are the same.