Conversation

@smalyshev
Contributor

Only serialize clusterInfo once it is in a consistent, fully initialized state.

    }
} finally {
    executionInfo.clusterInfoInitializing(false);
}
Contributor

I believe it could protect from concurrent addition while serializing, however I am not following what could trigger that.

This completes synchronously before any of the listeners in analyzedPlan could fail. Am I missing something?

Contributor Author

The problem can happen when we're running an async query: the query runner has just started to construct the executionInfo here, and an async GET request arrives at the same time. That request also uses executionInfo, and when it is serialized, writeCollection has a race condition: it reads the size first and then serializes the elements. If an element is added in between, the written count no longer matches the elements that follow, so we get broken serialization data as a result.
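
To make the size-then-elements race concrete, here is a minimal, deterministic sketch. It is not the Elasticsearch code: the `writeCollection` below is a hypothetical stand-in that only mimics the "write count, then write elements" pattern, and `RacyQueue` is an invented helper that simulates another thread adding an entry right after the size was read. It assumes the underlying collection tolerates concurrent modification during iteration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Collection;
import java.util.concurrent.ConcurrentLinkedQueue;

class SizeThenElementsRace {
    // Hypothetical stand-in for a size-prefixed collection writer (not the real API).
    static void writeCollection(DataOutputStream out, Collection<String> items) throws IOException {
        out.writeInt(items.size());   // step 1: the count is read and written
        for (String item : items) {   // step 2: the elements are iterated afterwards
            out.writeUTF(item);
        }
    }

    // Invented helper: simulates a concurrent add landing between step 1 and step 2.
    static class RacyQueue extends ConcurrentLinkedQueue<String> {
        @Override
        public int size() {
            int observed = super.size();
            super.add("late-cluster"); // the "other thread" adds an entry here
            return observed;
        }
    }

    // Returns {count declared in the stream, number of elements actually written}.
    static int[] demo() throws IOException {
        RacyQueue clusters = new RacyQueue();
        clusters.add("local");

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        writeCollection(new DataOutputStream(bytes), clusters);

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        int declared = in.readInt();
        int written = 0;
        while (in.available() > 0) {  // count every element that really is in the stream
            in.readUTF();
            written++;
        }
        return new int[] { declared, written };
    }

    public static void main(String[] args) throws IOException {
        int[] result = demo();
        // Prints: declared=1 written=2
        System.out.println("declared=" + result[0] + " written=" + result[1]);
    }
}
```

A reader that trusts the declared count stops after one element and then misinterprets the leftover bytes as whatever field comes next, which is the kind of broken serialization data described above.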

Contributor

I see. Since it is just starting, it does not contain any meaningful information worth waiting for?

Contributor Author

@smalyshev Aug 15, 2025

Precisely. We don't want to actually show anybody half-initialized cluster info, even if it didn't cause the race. And since we didn't initialize it yet, it can't contain anything important.
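
The guard discussed in this thread can be sketched roughly as follows. This is an illustration under assumptions, not the merged change: only the `clusterInfoInitializing(boolean)` call and the `try`/`finally` shape come from the diff above, while the class name, the `volatile` flag, `putCluster`, `initialize`, and `snapshotForSerialization` are hypothetical names invented for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ExecutionInfoSketch {
    private final Map<String, String> clusterInfo = new ConcurrentHashMap<>();
    // Assumption: a volatile flag makes the state visible to the thread serving GET requests.
    private volatile boolean clusterInfoInitializing;

    void clusterInfoInitializing(boolean initializing) {
        this.clusterInfoInitializing = initializing;
    }

    void putCluster(String alias, String status) {
        clusterInfo.put(alias, status);
    }

    // Hypothetical view handed to the serializer: empty while initialization is in progress.
    Map<String, String> snapshotForSerialization() {
        return clusterInfoInitializing ? Map.of() : Map.copyOf(clusterInfo);
    }

    // Populates the cluster info; `midway` stands in for a concurrent reader arriving mid-setup.
    void initialize(List<String> aliases, Runnable midway) {
        clusterInfoInitializing(true);
        try {
            for (String alias : aliases) {
                putCluster(alias, "running");
                midway.run();
            }
        } finally {
            // Mirrors the diff: always clear the flag, even if initialization throws.
            clusterInfoInitializing(false);
        }
    }
}
```

In this sketch a reader that shows up during `initialize` sees an empty map rather than a partial one, and sees the full map once the flag is cleared. Nothing is lost by hiding the partial state, since it holds nothing meaningful yet.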

@smalyshev marked this pull request as ready for review August 15, 2025 16:23
@elasticsearchmachine added the Team:Analytics label (meta label for the analytical engine team: ESQL/Aggs/Geo) Aug 15, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@smalyshev enabled auto-merge (squash) August 18, 2025 17:31
@smalyshev merged commit bc7db6c into elastic:main Aug 18, 2025
34 checks passed
szybia added a commit to szybia/elasticsearch that referenced this pull request Aug 19, 2025
…improv

* upstream/main: (92 commits)
  ESQL: mark LOOKUP JOIN as ExecutesOn.Any by default (elastic#133064)
  Fix 404s in REST API landing page (elastic#133086)
  Fix release tests for OptimizerVerificationTests (elastic#133100)
  Make Glob non-recursive (elastic#132798)
  Update ES|QL function list for release versions (elastic#133096)
  Split transport version func test into abstract base (elastic#133035)
  Omit project ID from snapshot metrics (elastic#133098)
  Mute org.elasticsearch.xpack.esql.analysis.AnalyzerTests testNoDenseVectorFailsForMagnitude elastic#133013
  Mute org.elasticsearch.xpack.esql.optimizer.OptimizerVerificationTests testRemoteEnrichAfterCoordinatorOnlyPlans elastic#133015
  Mute org.elasticsearch.test.rest.yaml.CcsCommonYamlTestSuiteIT test {p0=search/160_exists_query/Test exists query on _id field} elastic#133097
  Rename initial to unreferenced in transport versions (elastic#133082)
  Rename exception type header (elastic#133045)
  ESQL: Pluggable tests for Operator status (elastic#132876)
  ESQL: Mark new signatures in MIN and MAX (elastic#132980)
  Don't try to serialize half-baked cluster info (elastic#132756)
  migrate ml_rollover_legacy_indices transport version (elastic#133008)
  Enable `exclude_source_vectors` by default for new indices (elastic#131907)
  Expose APIs needed by flush during translog replay (elastic#132960)
  Change reporting_user role to leverage reserved kibana privileges (elastic#132766)
  Update TasksIT for batched execution (elastic#132762)
  ...
szybia added a commit to szybia/elasticsearch that referenced this pull request Aug 19, 2025
* upstream/main: (58 commits)
  ESQL: mark LOOKUP JOIN as ExecutesOn.Any by default (elastic#133064)
  Fix 404s in REST API landing page (elastic#133086)
  Fix release tests for OptimizerVerificationTests (elastic#133100)
  Make Glob non-recursive (elastic#132798)
  Update ES|QL function list for release versions (elastic#133096)
  Split transport version func test into abstract base (elastic#133035)
  Omit project ID from snapshot metrics (elastic#133098)
  Mute org.elasticsearch.xpack.esql.analysis.AnalyzerTests testNoDenseVectorFailsForMagnitude elastic#133013
  Mute org.elasticsearch.xpack.esql.optimizer.OptimizerVerificationTests testRemoteEnrichAfterCoordinatorOnlyPlans elastic#133015
  Mute org.elasticsearch.test.rest.yaml.CcsCommonYamlTestSuiteIT test {p0=search/160_exists_query/Test exists query on _id field} elastic#133097
  Rename initial to unreferenced in transport versions (elastic#133082)
  Rename exception type header (elastic#133045)
  ESQL: Pluggable tests for Operator status (elastic#132876)
  ESQL: Mark new signatures in MIN and MAX (elastic#132980)
  Don't try to serialize half-baked cluster info (elastic#132756)
  migrate ml_rollover_legacy_indices transport version (elastic#133008)
  Enable `exclude_source_vectors` by default for new indices (elastic#131907)
  Expose APIs needed by flush during translog replay (elastic#132960)
  Change reporting_user role to leverage reserved kibana privileges (elastic#132766)
  Update TasksIT for batched execution (elastic#132762)
  ...
Labels

:Analytics/ES|QL (AKA ESQL), >non-issue, Team:Analytics (meta label for the analytical engine team: ESQL/Aggs/Geo), v9.2.0

3 participants