ES|QL: Allow FORK with remote indices #128310
Pinging @elastic/es-analytical-engine (Team:Analytics)
Pinging @elastic/es-search-relevance (Team:Search Relevance)
Got another failure after merging main, and I cannot reproduce it locally 😞. It seems to be the same type of issue as before, with results not coming back:
Putting this back in draft until I figure it out.
👍
Looking forward to this being stabilized and merged.
| FORK ( WHERE language_name == "English" | EVAL x = 1 )
       ( WHERE language_name != "English" )
| SORT _fork, language_name
FROM employees
All the tests that were querying the `languages` index failed, with rows being duplicated:
Actual:

| language_name:keyword | x:integer | _fork:keyword |
| --- | --- | --- |
| English | 1 | fork1 |
| English | 1 | fork1 |
| French | null | fork2 |
| French | null | fork2 |
| German | null | fork2 |
| German | null | fork2 |
| Spanish | null | fork2 |
| Spanish | null | fork2 |

Expected:

| language_name:keyword | x:integer | _fork:keyword |
| --- | --- | --- |
| English | 1 | fork1 |
| French | null | fork2 |
| German | null | fork2 |
| Spanish | null | fork2 |

Debugging these tests showed that we were querying both the local and the remote index, which is why we had duplicate results.
`languages` was an index used primarily for testing lookup join, for which we have some special handling that prevents duplicates:
Lines 298 to 302 in b1081c0
if (Arrays.stream(localIndices).anyMatch(i -> LOOKUP_INDICES.contains(i.trim().toLowerCase(Locale.ROOT)))) {
    // If the query contains lookup indices, use only remotes to avoid duplication
    onlyRemotes = true;
}
final boolean onlyRemotesFinal = onlyRemotes;
`languages` is not part of `LOOKUP_INDICES` (but should it be?).
Anyway, I thought it would be better to rewrite these tests with an index that has no special handling, so I used `employees` here instead of `languages`.
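To make the duplication concrete, here is a minimal, self-contained sketch of the behaviour described above. It is not the actual test code: the class name, the `expand` helper, the `remote_cluster` alias, and the contents of `LOOKUP_INDICES` are all assumptions chosen for illustration. It only shows how an index without special handling ends up being queried both locally and remotely, while a lookup index is queried on the remote only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; not the actual Elasticsearch test code.
final class CcsIndexPatternSketch {

    // Assumed contents, for illustration only. "languages" is deliberately absent,
    // mirroring the observation that it is not part of LOOKUP_INDICES.
    static final Set<String> LOOKUP_INDICES = Set.of("languages_lookup");

    // Assumed alias of the remote cluster in the test setup.
    static final String REMOTE_ALIAS = "remote_cluster";

    /** Expands local index names into the index expressions a multi-cluster test would query. */
    static List<String> expand(String[] localIndices) {
        // Same condition as the snippet quoted above: any lookup index switches to remotes only.
        boolean onlyRemotes = Arrays.stream(localIndices)
            .anyMatch(i -> LOOKUP_INDICES.contains(i.trim().toLowerCase(Locale.ROOT)));
        List<String> expressions = new ArrayList<>();
        for (String index : localIndices) {
            String name = index.trim();
            if (onlyRemotes == false) {
                expressions.add(name); // local copy of the data
            }
            expressions.add(REMOTE_ALIAS + ":" + name); // remote copy of the same data
        }
        return expressions;
    }

    public static void main(String[] args) {
        // Queried on both clusters, so every row comes back twice.
        System.out.println(expand(new String[] { "languages" }));
        // Queried on the remote only, so no duplication.
        System.out.println(expand(new String[] { "languages_lookup" }));
    }
}
```

Under these assumptions, `languages` expands to both a local and a remote expression, which matches the doubled rows in the actual output above.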
}

return input -> {
    checkForRemoteClusters(input, source(ctx), "FORK");
We could have made the `ENABLE_FORK_FOR_REMOTE_INDICES` capability available only in snapshots and used it here, so that CCS support would be available only in snapshots, but I don't think we need to 🤔
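For reference, the alternative mentioned here (which this PR does not adopt) could look roughly like the sketch below. Everything in it is a stand-in: the class, the `snapshotOnly` flag, the `checkForkOverRemotes` helper, and the error message are hypothetical and are not the Elasticsearch capability API. Detecting a remote index by the presence of `:` is also a deliberate simplification.

```java
// Hypothetical sketch of gating FORK over remote indices on a snapshot-only capability.
// All names are illustrative stand-ins, not the Elasticsearch API.
final class ForkRemoteGateSketch {

    /** Stand-in for a capability that may be restricted to snapshot builds. */
    static final class Capability {
        final String name;
        final boolean snapshotOnly;

        Capability(String name, boolean snapshotOnly) {
            this.name = name;
            this.snapshotOnly = snapshotOnly;
        }

        boolean isEnabled(boolean snapshotBuild) {
            // A snapshot-only capability is enabled only in snapshot builds.
            return snapshotOnly == false || snapshotBuild;
        }
    }

    static final Capability ENABLE_FORK_FOR_REMOTE_INDICES =
        new Capability("enable_fork_for_remote_indices", true);

    /** Rejects FORK over a remote index pattern unless the capability is enabled for this build. */
    static void checkForkOverRemotes(String indexPattern, boolean snapshotBuild) {
        boolean hasRemote = indexPattern.contains(":"); // simplification: "cluster:index"
        if (hasRemote && ENABLE_FORK_FOR_REMOTE_INDICES.isEnabled(snapshotBuild) == false) {
            throw new IllegalArgumentException(
                "FORK is not supported with remote indices [" + indexPattern + "]");
        }
    }

    public static void main(String[] args) {
        checkForkOverRemotes("employees", false);               // local index: always allowed
        checkForkOverRemotes("remote_cluster:employees", true); // remote index, snapshot build: allowed
        try {
            checkForkOverRemotes("remote_cluster:employees", false); // remote index, release build
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a gate like this, release builds would keep rejecting remote indices in FORK while snapshot builds accept them, which is the trade-off being weighed in this comment.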
Is it still true that cross-cluster
Checked with @smalyshev - this is about how FORK is marked in elasticsearch/x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/Fork.java (line 36 in ebb94bd).
We agreed that this is fine to begin with, even though it puts some restrictions in place; for example, we cannot execute a remote lookup join after FORK.
@ioanatia I guess we can update the docs to mention that the limitation disappears in 9.2.0: elasticsearch/docs/reference/query-languages/esql/_snippets/commands/layout/fork.md (line 42 in 208cd45).
Looks good. Thanks @ioanatia
I will get that addressed too. @leemthompo
Related: #121652
We initially skipped the CCS tests (PR) after we merged streaming support for FORK because the tests were flaky. In theory, given FORK's execution model, queries using FORK with remote indices should just work.
That was back in May. Since then, CCS has stabilized for the GA release and we no longer see these failures.
I am not sure whether there was a single fix that made it possible to query remote indices with FORK.
More likely, several separate changes (#127328, #127328, etc.) compounded to get us to the point where we can support FORK with CCS.