Skip to content

Conversation

ioanatia
Copy link
Contributor

@ioanatia ioanatia commented May 22, 2025

related #121652

We initially skipped CCS tests (PR) after we merged streaming support for FORK because the tests were flaky. In theory, given the FORK's execution model, queries using FORK with remote indices should just work.

This was back in May and since then CCS has stabilized for the GA release and we no longer see these failures.
I am not sure whether there was a single fix that enabled us to enable querying remote indices with FORK.
There are probably more separate changes (#127328, #127328 etc) that compounded to this moment where we can support FORK with CCS.

@ioanatia ioanatia added :Analytics/ES|QL AKA ESQL :Search Relevance/Search Catch all for Search Relevance >non-issue labels May 22, 2025
@ioanatia ioanatia changed the title ESTest FORK with multi cluster ES|QL: Test FORK with multi cluster May 22, 2025
@ioanatia ioanatia marked this pull request as ready for review May 22, 2025 14:58
@elasticsearchmachine elasticsearchmachine added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch labels May 22, 2025
@ioanatia ioanatia requested a review from tteofili May 22, 2025 14:58
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@ioanatia ioanatia mentioned this pull request May 21, 2025
23 tasks
@ioanatia
Copy link
Contributor Author

got another failure after merging main - cannot replicate locally 😞

seems to be the same type of issue as before with results not coming back:

java.lang.AssertionError: Expected more data but no more entries found after [3] |  
-- | --
  | Data mismatch: |  
  |   |  
  | Actual: |  
  | _fork:keyword \| emp_no:integer \| x:keyword \| y:keyword \| z:keyword \| a:keyword |  
  | fork1         \| 10048          \| Florian   \| 10048     \| Syrotiuk  \| Florian 10048 Syrotiuk |  
  | fork1         \| 10081          \| Zhongwei  \| 10081     \| Rosen     \| Zhongwei 10081 Rosen |  
  | fork2         \| null           \| 2         \| 10081     \| 10048     \| null |  
  |   |  
  | Expected: |  
  | _fork:keyword \| emp_no:integer \| x:keyword \| y:keyword \| z:keyword \| a:keyword |  
  | fork1         \| 10048          \| Florian   \| 10048     \| Syrotiuk  \| Florian 10048 Syrotiuk |  
  | fork1         \| 10081          \| Zhongwei  \| 10081     \| Rosen     \| Zhongwei 10081 Rosen |  
  | fork2         \| null           \| 2         \| 10081     \| 10048     \| null |  
  | fork3         \| 10048          \| Syrotiuk  \| null      \| null      \| null |  
  | fork3         \| 10081          \| Rosen     \| null      \| null      \| null |  
  | fork4         \| 10048          \| abc       \| aaa       \| null      \| null |  
  | fork4         \| 10081          \| abc       \| aaa       \| null      \| null |  
  |  

putting this back in draft until I figure it out

@ioanatia ioanatia marked this pull request as draft May 22, 2025 19:46
@ioanatia ioanatia mentioned this pull request Jun 30, 2025
14 tasks
Copy link
Contributor

@idegtiarenko idegtiarenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
Looking forward for this stabilized and merged

@ioanatia ioanatia changed the title ES|QL: Test FORK with multi cluster ES|QL: Allow FORK with remote indices Sep 2, 2025
| FORK ( WHERE language_name == "English" | EVAL x = 1 )
( WHERE language_name != "English" )
| SORT _fork, language_name
FROM employees
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all the tests that were querying the languages index failed - with rows being duplicated:

Actual: |  
-- | --
language_name:keyword \| x:integer \| _fork:keyword |  
English               \| 1         \| fork1 |  
English               \| 1         \| fork1 |  
French                \| null      \| fork2 |  
French                \| null      \| fork2 |  
German                \| null      \| fork2 |  
German                \| null      \| fork2 |  
Spanish               \| null      \| fork2 |  
Spanish               \| null      \| fork2 |  
  |  
Expected: |  
language_name:keyword \| x:integer \| _fork:keyword |  
English               \| 1         \| fork1 |  
French                \| null      \| fork2 |  
German                \| null      \| fork2 |  
Spanish               \| null      \| fork2

debugging these test showed that we were querying both the local and remote index, which is why we had duplicate results.

languages was an index that was primarily used for testing lookup join for which we had some special handling such that we disallow duplicates:

if (Arrays.stream(localIndices).anyMatch(i -> LOOKUP_INDICES.contains(i.trim().toLowerCase(Locale.ROOT)))) {
// If the query contains lookup indices, use only remotes to avoid duplication
onlyRemotes = true;
}
final boolean onlyRemotesFinal = onlyRemotes;

languages is not part of LOOKUP_INDICES (but it should be?)
anyway, I thought that it might be better to rewrite these tests with an index that has no special handling - so I used employees here instead of languages.

}

return input -> {
checkForRemoteClusters(input, source(ctx), "FORK");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could have made ENABLE_FORK_FOR_REMOTE_INDICES capability available only in snapshots and use it here - such that CCS support is only available in snapshots - but I don't think we need to 🤔

@ioanatia ioanatia marked this pull request as ready for review September 2, 2025 14:50
@smalyshev
Copy link
Contributor

Is it still true that cross-cluster FORK requires coordinator-side aggregation? I'm not sure I understand the mechanism why.

@ioanatia
Copy link
Contributor Author

ioanatia commented Sep 2, 2025

Is it still true that cross-cluster FORK requires coordinator-side aggregation? I'm not sure I understand the mechanism why.

I am not sure what you mean by coordinator-side aggregation.
A query using FORK gets split into multiple plans - a subplan for each branch and a main coordinator plan.
Each subplan that pertains to a FORK branch gets split into a data node plan and coordinator plan.

For example:

FROM test
| FORK
    ( WHERE content:"fox" ) // sub plan 1
    ( WHERE content:"dog" ) // sub plan 2
| SORT _fork
| KEEP _fork, id, content

gets translated to something like this:

fork_planning

It was a conscious decision to execute FORK this way and one of the benefits was that we did not need to make any special concessions for CCS support.
Anyway, this PR does not make any fundamental changes in how FORK gets planned - we simply enable CCS support here.

@ioanatia
Copy link
Contributor Author

ioanatia commented Sep 2, 2025

Is it still true that cross-cluster FORK requires coordinator-side aggregation? I'm not sure I understand the mechanism why.

Checked with @smalyshev - this is about having FORK marked as ExecutesOn.Coordinator, similar to STATS:

public class Fork extends LogicalPlan implements PostAnalysisPlanVerificationAware, TelemetryAware, ExecutesOn.Coordinator {

We agreed that for the beginning this is fine, even though it puts some restrictions in place such as we cannot execute a remote lookup join after FORK.
We will look into removing this restriction in the future - e.g. if we can push more plans that come after FORK to the subplans etc.

@leemthompo
Copy link
Contributor

leemthompo commented Sep 3, 2025

@ioanatia I guess can update the docs to mention that the the limitation disappears in 9.2.0

- Using remote cluster references and `FORK` is not supported.

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks @ioanatia

@ioanatia ioanatia merged commit 580f734 into elastic:main Sep 3, 2025
33 checks passed
@ioanatia ioanatia deleted the fork_ccs_tests branch September 3, 2025 17:58
@ioanatia
Copy link
Contributor Author

ioanatia commented Sep 3, 2025

@ioanatia I guess can update the docs to mention that the the limitation disappears in 9.2.0

I will get that addressed too. @leemthompo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >non-issue :Search Relevance/Search Catch all for Search Relevance Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v9.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants