@afoucret
Contributor

Description

This PR adds a new asynchronous pre-optimization step to the ES|QL logical plan execution pipeline.
The pre-optimization step is positioned between the Analyzer and the Optimizer, allowing asynchronous operations to be performed once the logical plan is analyzed and before it is optimized.
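
As a rough illustration of where the new step sits, the sketch below chains three stages with `CompletableFuture`. Everything in it is hypothetical: `Plan`, `analyze`, `preOptimize`, `optimize`, and `run` are stand-ins invented for this example, not the classes added by this PR, and futures are used here only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: none of these names are the actual ES|QL classes.
public class PipelineSketch {

    // A "plan" here is just the ordered list of stages it has passed through.
    static final class Plan {
        private final List<String> stages;

        Plan(List<String> stages) {
            this.stages = List.copyOf(stages);
        }

        List<String> stages() {
            return stages;
        }

        // Returns a new plan that records one more completed stage.
        Plan advance(String stage) {
            List<String> next = new ArrayList<>(stages);
            next.add(stage);
            return new Plan(next);
        }
    }

    // Synchronous analysis step.
    static Plan analyze(Plan parsed) {
        return parsed.advance("ANALYZED");
    }

    // Asynchronous pre-optimization: this is where a call to an external
    // service could happen without blocking the calling thread.
    static CompletableFuture<Plan> preOptimize(Plan analyzed) {
        return CompletableFuture.supplyAsync(() -> analyzed.advance("PRE_OPTIMIZED"));
    }

    // Synchronous optimization, run only after pre-optimization completes.
    static Plan optimize(Plan preOptimized) {
        return preOptimized.advance("OPTIMIZED");
    }

    // Analyzer -> asynchronous pre-optimizer -> optimizer.
    static CompletableFuture<Plan> run(Plan parsed) {
        return preOptimize(analyze(parsed)).thenApply(PipelineSketch::optimize);
    }

    public static void main(String[] args) {
        Plan result = run(new Plan(List.of("PARSED"))).join();
        System.out.println(result.stages());
    }
}
```

The point of the ordering is that the optimizer only ever sees a plan whose asynchronous work has already been resolved.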

Context / Use case

This infrastructure is required for the TEXT_EMBEDDING function implementation (issue #131022).
By evaluating text embeddings before query optimization, we ensure they benefit from all subsequent optimizations that apply to constants.

The PreMapper was originally placed before the Optimizer but it has been moved after, so it can be used for this purpose.

Key Changes

  • Added a new PRE_OPTIMIZED stage in the LogicalPlan
  • Created LogicalPlanPreOptimizer class for handling asynchronous pre-optimization
  • Created LogicalPreOptimizerContext to support the pre-optimization process
  • Updated EsqlSession to include the pre-optimization step in the execution flow

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jul 17, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@nik9000 nik9000 requested a review from astefan July 17, 2025 13:25
@afoucret
Contributor Author

afoucret commented Jul 18, 2025

@astefan Got a couple of CI failures caused by the branch being outdated but all is green right now.

Contributor

@astefan astefan left a comment

A few notes and comments:

  1. The text_embedding function is currently the main beneficiary of this PR. It is supposed to call an external service and create what is, essentially, an array of floats that can be used later in the query.
  2. Due to its nature, the performance of the text_embedding function depends mainly on the availability of the external service, which can be a bottleneck.
  3. Because this step resembles enrich policy discovery or lookup index resolution, it was decided that resolving the text_embedding output value is best placed in the overall "discovery" set of steps (index resolution, analyzer, logical optimizer, physical optimizer). This set of steps is performed on the coordinator node.
  4. Another argument in favor of the pre-optimizer step is that the Literal that comes out of the text_embedding function can be used in further optimization rules from the LogicalPlanOptimizer. I personally do not consider this a strong argument. There could be a similar optimization step on each data node.
  5. Also, calling the external service has a cost associated with it. One of the arguments in favor of calling this service on the coordinator node only is that the cost is greatly reduced this way. It is essentially one call. There are some downsides to this decision:
    • the coordinator becomes a bottleneck
    • some steps that still validate the correctness of the query (everything that comes after the pre-optimization step: logical optimizer, physical optimizer, local logical optimizer, local physical optimizer) can wait an unnecessarily long time before reporting a potentially invalid query back to the user
    • any shard/index specific optimizations cannot be applied. For example, if the field that is used in the text_embedding function has a null value or no value on some of the shards, I am assuming that calling the external service for that specific value/field is unnecessary.
  6. I think it is OK to start with this pre-optimizer step only for constant values (for example TEXT_EMBEDDING("Who is Victor Hugo?", "test_dense_inference")), purely from the point of view of calling the external service only once.
    • there is also a LocalLogicalPlanOptimizer. If there is any shard/index specific behavior, my preference would be to do this "external service" call from each "relevant" data node, or to have heuristic logic that decides which approach is better: one call from the coordinator or multiple calls from each data node.
  7. I am not convinced that the EsqlSession should change the way it handles the Listeners while calling the entire flow of analyzer, logical optimizer, physical optimizer, but I understand the need to have this call async. If it doesn't break any existing tests/behavior, it's OK with me.
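
To make the "constant values only, one call" idea from point 6 concrete, here is a minimal sketch. It assumes a toy expression model (`Literal`, `FieldRef`, `TextEmbedding`) and a hypothetical `foldConstantEmbeddings` helper; none of these are the real ES|QL classes, and the embedding service is passed in as a plain function so the example stays self-contained and synchronous.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical sketch: these types are illustrative, not the ES|QL expression classes.
public class ConstantEmbeddingSketch {

    interface Expr {}

    static final class Literal implements Expr {
        final Object value;
        Literal(Object value) { this.value = value; }
    }

    static final class FieldRef implements Expr {
        final String name;
        FieldRef(String name) { this.name = name; }
    }

    static final class TextEmbedding implements Expr {
        final Expr input;
        final String inferenceId;
        TextEmbedding(Expr input, String inferenceId) {
            this.input = input;
            this.inferenceId = inferenceId;
        }
    }

    // Replaces TEXT_EMBEDDING calls whose input is a constant string with a
    // literal vector. Identical (inferenceId, text) pairs share one service call.
    // Calls on non-constant inputs (for example a field) are left untouched.
    static List<Expr> foldConstantEmbeddings(List<Expr> exprs, BiFunction<String, String, float[]> service) {
        Map<String, Literal> cache = new HashMap<>();
        List<Expr> out = new ArrayList<>();
        for (Expr e : exprs) {
            if (e instanceof TextEmbedding && ((TextEmbedding) e).input instanceof Literal) {
                TextEmbedding te = (TextEmbedding) e;
                Object value = ((Literal) te.input).value;
                if (value instanceof String) {
                    String text = (String) value;
                    String key = te.inferenceId + "\u0000" + text;
                    out.add(cache.computeIfAbsent(key, k -> new Literal(service.apply(text, te.inferenceId))));
                    continue;
                }
            }
            out.add(e);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fake service (an assumption for the demo): counts calls, returns a dummy vector.
        BiFunction<String, String, float[]> fake = (text, id) -> {
            calls[0]++;
            return new float[] { text.length() };
        };
        List<Expr> plan = List.of(
            new TextEmbedding(new Literal("Who is Victor Hugo?"), "test_dense_inference"),
            new TextEmbedding(new FieldRef("title"), "test_dense_inference")
        );
        List<Expr> folded = foldConstantEmbeddings(plan, fake);
        System.out.println("service calls: " + calls[0]);
        System.out.println("first folded to literal: " + (folded.get(0) instanceof Literal));
    }
}
```

A per-shard variant, as suggested in the sub-point about the LocalLogicalPlanOptimizer, would need the non-constant branch to be handled on the data nodes instead; this sketch deliberately leaves those expressions alone.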

@astefan astefan self-requested a review July 21, 2025 13:12
Contributor

@astefan astefan left a comment

LGTM

@afoucret afoucret merged commit c666679 into elastic:main Jul 21, 2025
33 checks passed
Labels

:Analytics/ES|QL AKA ESQL >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.2.0
