Skip to content

Conversation

@idegtiarenko
Copy link
Contributor

This change updates enriches to resolve remotes using dedicated field caps rather than executionInfo.getClusters().keySet().
This is required in order to be able to resolve flat index expressions.

Related to: ES-12612

@idegtiarenko idegtiarenko added >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL v9.2.0 labels Sep 1, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@idegtiarenko idegtiarenko requested a review from n1v0lg September 1, 2025 14:48
Copy link
Contributor

@luigidellaquila luigidellaquila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @idegtiarenko, it looks good in general.

I left just a couple of comments regarding some corner cases (maybe a bit paranoid)

private void doResolveRemotes(List<IndexPattern> indexPatterns, QueryBuilder requestFilter, ActionListener<Set<String>> listener) {
switch (indexPatterns.size()) {
case 0 -> listener.onResponse(Set.of());
case 1 -> indexResolver.resolveConcreteIndices(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the index pattern also contains a cluster/project prefix, will this still be correct?

eg.

FROM remote1:idx1,remote2:idx2
| STATS max(a) by b
| ENRICH policy on a, b

The enrich can only be calculated on the coordinator (after the STATS reduction), but I guess the enrich index on the coordinator won't be resolved. I'm not sure it works at all today tbh, maybe worth checking.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the index pattern also contains a cluster/project prefix, will this still be correct?

Yes, this is a valid situation, where all supplied indices are correct specific indices (not aliases nor patterns) and exist.
In such cases we would resolve to exact same 2 identifiers and would derive 2 remotes from them: remote1 and remote2

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before merging, I'd just prefer to have a test for this query and make sure we are not introducing a regression.
Apart from that, I think we can proceed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For LOOKUP JOINs we have a specific check: we just don't allow it.

I'm checking ENRICH now, but I couldn't find similar validation for it in the codebase, so I guess it's being treated differently...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is behaving the same way. It looks like we have a similar test for enrich:

public void testAggThenEnrichRemote() {
String query = String.format(Locale.ROOT, """
FROM *:events,events
| eval ip= TO_STR(host)
| %s
| stats c = COUNT(*) by os
| %s
| sort vendor
""", enrichHosts(Enrich.Mode.ANY), enrichVendors(Enrich.Mode.REMOTE));
var error = expectThrows(VerificationException.class, () -> runQuery(query, randomBoolean()).close());
assertThat(error.getMessage(), containsString("ENRICH with remote policy can't be executed after [stats c = COUNT(*) by os]@4:3"));
}
public void testEnrichCoordinatorThenEnrichRemote() {
String query = String.format(Locale.ROOT, """
FROM *:events,events
| eval ip= TO_STR(host)
| %s
| %s
| sort vendor
""", enrichHosts(Enrich.Mode.COORDINATOR), enrichVendors(Enrich.Mode.REMOTE));
var error = expectThrows(VerificationException.class, () -> runQuery(query, randomBoolean()).close());
assertThat(error.getMessage(), containsString("ENRICH with remote policy can't be executed after [ENRICH _COORDINATOR"));
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@idegtiarenko I added a test in main for this case, see #134199, and it seems to work fine.
I think it makes sense to run it on this PR and see if the results are the same.

requestFilter,
listener.map(EsqlCCSUtils::getRemotesOf)
);
default -> listener.onFailure(new MappingException("Queries with multiple indices are not supported"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is OK for now, but it will change when we start supporting subqueries and proper JOINs.

I'm not sure how it will actually work, maybe the enriches in a subquery will have to consider the remotes from the main indices in the subquery itself.

Anyway, I'd add a // TODO here, so that we don't forget.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the error message is a bit misleading, it took me a couple of seconds to realize that we were referring to multiple FROM, not to multiple indices. I know it comes from a similar check in EsqlSession, probably we should change both.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied it from

listener.onFailure(new MappingException("Queries with multiple indices are not supported"));

but that is a good call.

This is OK for now, but it will change when we start supporting subqueries and proper JOINs.

I suspect we would still need a distinction between main/top-level from and all the nested ones.
I will think about a way to improve it (across all usages)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand it correctly, latest CI run (that is green) includes my tests.
All good then 🎉

# Conflicts:
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSession.java
idegtiarenko and others added 4 commits September 8, 2025 09:41
# Conflicts:
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSession.java
Copy link
Contributor

@smalyshev smalyshev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, the only request is maybe add some comments on what's going on there especially in enrich resolver. I worked on that code and it still took me some time to realize what's going on, it may be tough on somebody who has not as familiar with the code.

client.execute(
EsqlResolveFieldsAction.TYPE,
createFieldCapsRequest(indexPattern, Set.of("_id"), requestFilter, false),
listener.delegateFailureAndWrap((l, response) -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder though what happens here if one of the clusters fails. Will it fail this listener and thus the whole request? In other places, we have to do some handling to figure out what failed and skip failed clusters and so on, but we don't do it here. I'm not sure what happens here if one of the clusters fails?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I am afraid of is if EsqlResolveFieldsAction returns partial response with say cluster A failing and thus having no indices but other clusters having indices, then A would be excluded from the list but not marked as skipped, and then there could be problems with it downstream.

@idegtiarenko
Copy link
Contributor Author

This is going to be replaced with #134290

@idegtiarenko idegtiarenko deleted the es-12612_separate_field_caps branch September 23, 2025 08:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants