Cleanup: Remove obsolete TODO from SparseVectorQueryBuilder #134702

mridula-s109 · 2025-09-14T21:17:54Z

Summary

Small cleanup to bring SparseVectorQueryBuilder in line with current inference APIs and the pattern used by SemanticQueryBuilder.

Changes

Replace CoordinatedInferenceAction.Request with InferenceAction.Request (parity with SemanticQueryBuilder).
Remove outdated TODO about moving this to xpack core (it’s already in core).
Refactor response handling to a validateAndExtractTextExpansionResults(...) helper for clearer errors and consistency.
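
As a rough illustration of what a validate-and-extract helper of this kind does, here is a minimal sketch. Every type in it (`InferenceResult`, `TextExpansionResult`, `WarningResult`) is a hypothetical stand-in rather than an Elasticsearch class, and the error messages are assumptions; only the helper's name comes from this PR.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Sketch only: every type here is a hypothetical stand-in, not an Elasticsearch class.
class TextExpansionValidationSketch {

    /** Marker for anything an inference call might hand back. */
    interface InferenceResult {}

    /** Stand-in for a sparse (token -> weight) expansion result. */
    static final class TextExpansionResult implements InferenceResult {
        final Map<String, Float> weightedTokens;

        TextExpansionResult(Map<String, Float> weightedTokens) {
            this.weightedTokens = weightedTokens;
        }
    }

    /** Stand-in for a result that only carries a warning message. */
    static final class WarningResult implements InferenceResult {
        final String warning;

        WarningResult(String warning) {
            this.warning = warning;
        }
    }

    /** Checks the response shape in one place and fails with a specific message. */
    static TextExpansionResult validateAndExtractTextExpansionResults(List<? extends InferenceResult> results) {
        if (results == null || results.isEmpty()) {
            throw new IllegalStateException("inference response contained no results");
        }
        if (results.size() > 1) {
            throw new IllegalStateException("expected a single result but got [" + results.size() + "]");
        }
        InferenceResult first = results.get(0);
        if (first instanceof TextExpansionResult) {
            return (TextExpansionResult) first;
        }
        if (first instanceof WarningResult) {
            // Surface the warning text instead of a generic type error.
            throw new IllegalStateException(((WarningResult) first).warning);
        }
        throw new IllegalArgumentException(
            "expected text expansion results but got [" + first.getClass().getSimpleName() + "]");
    }

    public static void main(String[] args) {
        TextExpansionResult ok = validateAndExtractTextExpansionResults(
            Collections.singletonList(new TextExpansionResult(Collections.singletonMap("cat", 1.5f))));
        System.out.println(ok.weightedTokens);
    }
}
```

Keeping the checks in one helper means each failure mode (empty response, multiple results, warning, wrong type) produces its own message rather than a cast failure at the call site.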

mridula-s109 · 2025-09-14T21:18:44Z

Hi @kderusso 👋

This draft PR addresses the SparseVectorQueryBuilder refactoring.

Highlights:

Standardized on InferenceAction.Request (no need for CoordinatedInferenceAction.Request)
Clean separation from ML coordination logic
Minimal, mostly mechanical refactoring — tests all pass

Would love your thoughts, especially on whether the InferenceAction.Request standardization and overall approach match the intended architectural direction. 🙏

kderusso

Hey @mridula-s109 so sorry about the confusion here, the Jira ticket you were assigned said that Sparse Vector was still in ml and we didn't verify that it had already been moved to core during planning 😅

I think the cleanup refactoring that you've done here makes sense for the most part, with some caveats:

I would like to understand the difference between a CoordinatedInferenceAction and an InferenceAction. By changing this, what is actually changing under the hood and what side effects are there? Knowing that someone has done this research is important for peace of mind.
The bulk of the work for this type of refactor lies in test failures - and there are a lot of these in your PR.

It would be worth thinking about whether we need this PR or, since it's already in core, whether we should just try refactoring to server.

...ugin/core/src/main/java/org/elasticsearch/xpack/core/ml/search/SparseVectorQueryBuilder.java

...core/src/main/java/org/elasticsearch/xpack/core/search/vectors/SparseVectorQueryBuilder.java

...src/test/java/org/elasticsearch/xpack/core/search/vectors/SparseVectorQueryBuilderTests.java

mridula-s109 · 2025-09-17T13:42:28Z

Hey @kderusso, thanks for clarifying! Since sparse vector is already in core, would you be okay with me closing this PR and also resolving the Jira issue?

kderusso · 2025-09-17T13:56:47Z

Can you please open a smaller PR to clear out the TODOs to avoid confusion down the line?

kderusso · 2025-09-17T13:57:48Z

Also, did you time box looking into moving this to server? How bad would it be?

kderusso · 2025-10-01T17:51:36Z

I noticed that sparse vector yaml tests are still in ML, if we're keeping this PR we should probably move them (unless there's a compelling "this will break everything" reason to keep them)

mridula-s109 · 2025-10-02T15:48:54Z

Thanks for the thoughtful feedback @kderusso and @ioanatia 🙏

To clarify the CoordinatedInferenceAction vs InferenceAction part:

CoordinatedInferenceAction is used when we need to coordinate inference requests across clusters (CCS mode, reduce roundtrips=false, etc.). It wraps results into a “coordination format.”
InferenceAction is the simpler path for local inference requests within a single cluster. It doesn’t do the extra cross-cluster handling.

Since SparseVector queries don’t currently need cross-cluster coordination, switching to InferenceAction.Request aligns with what we already do in SemanticQueryBuilder. Functionally, this shouldn’t change behaviour for local queries, but it does reduce unnecessary complexity.

On the scope of the refactor:

You’re right that SparseVectorQueryBuilder is already in core, so the big move work is unnecessary.
I’ve now aligned it with the SemanticQueryBuilder pattern (as suggested by Ioana) and cleaned up the TODO to avoid future confusion.
The remaining open question is whether we should take the next step of moving this fully into server (like knn). I haven’t deeply time-boxed that yet, but my sense is it’s a larger lift and may be best tracked separately.

mridula-s109 · 2025-10-02T16:34:27Z

I tested everything locally with ./gradlew :x-pack:plugin:inference:check and all the tests passed. I’ll also double-check any failures once CI finishes running.

kderusso · 2025-10-02T18:50:48Z

Since SparseVector queries don’t currently need cross-cluster coordination,

Is that true, or would that break CCS for sparse vector search?

ioanatia · 2025-10-06T10:18:26Z

This does not break CCS support. The CCS support that we recently added for SparseVectorQueryBuilder kicks in only when we query semantic_text fields with sparse_vector. In that case, the query interceptor will be responsible for getting the inference results and rewriting the sparse_vector query. So the code path that Mridula touches here will not even be reached.

Taking a closer look at CoordinatedInferenceAction, its purpose is to be a unified, internal API that can handle multiple types of tasks and that is able to call either the inference API or internal ML models (using the internal infer model API).
If used with inference endpoints that refer to Elasticsearch models hosted on the ML node, it will call the infer model API directly.
The same InferModelAction is called by the inference service when it is used with an inference endpoint that points to an ML model running on Elasticsearch ML nodes.
I don't really see how any of these would break CCS.

This change actually breaks something else, in a more subtle way.
Right now on main, for the sparse_vector query, the inference_id can point to both an Inference Endpoint and a model ID.

On this branch, if inference_id points to a model ID, the query will break.
However, I don't think we ever intended to support model_id; the inference_id option was always documented as an inference API endpoint.
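
A toy model of that difference, assuming two made-up registries and two made-up resolver methods standing in for the real lookup logic. None of this is Elasticsearch code, the IDs are invented, and the endpoint-before-model lookup order is an assumption. It only illustrates why an `inference_id` that names a model ID resolves on main but fails on this branch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model only: the registries, IDs and method names are hypothetical.
class InferenceIdResolutionSketch {

    enum Target { INFERENCE_ENDPOINT, ML_MODEL }

    static final Set<String> ENDPOINT_IDS = new HashSet<>(Arrays.asList("my-sparse-endpoint"));
    static final Set<String> MODEL_IDS = new HashSet<>(Arrays.asList("my-sparse-model"));

    /** Behaviour described for main: the ID may name an inference endpoint or a model. */
    static Target resolveEndpointOrModel(String inferenceId) {
        if (ENDPOINT_IDS.contains(inferenceId)) {
            return Target.INFERENCE_ENDPOINT;
        }
        if (MODEL_IDS.contains(inferenceId)) {
            return Target.ML_MODEL;
        }
        throw new IllegalArgumentException("unknown inference_id [" + inferenceId + "]");
    }

    /** Behaviour described for this branch: only inference endpoints are accepted. */
    static Target resolveEndpointOnly(String inferenceId) {
        if (ENDPOINT_IDS.contains(inferenceId)) {
            return Target.INFERENCE_ENDPOINT;
        }
        throw new IllegalArgumentException("no inference endpoint found for [" + inferenceId + "]");
    }

    public static void main(String[] args) {
        System.out.println(resolveEndpointOrModel("my-sparse-model"));
        try {
            resolveEndpointOnly("my-sparse-model");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```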

elasticsearchmachine · 2025-10-08T15:50:38Z

Pinging @elastic/search-relevance (Team:Search - Relevance)

elasticsearchmachine · 2025-10-22T12:37:55Z

Hi @mridula-s109, I've created a changelog YAML for you.

mridula-s109 · 2025-10-29T11:48:51Z

As it's already been moved to core, I just removed the redundant TODO, @kderusso. We can evaluate moving to server later when necessary.

…134702)

* Removed the TODO
* Updated the sparevector similar to Semantic query builder
* Update docs/changelog/134702.yaml
* Remove obsolete TODO comment from SparseVectorQueryBuilder
* [CI] Auto commit changes from spotless
* Modified the TODO to make more sense
* Delete docs/changelog/134702.yaml
* Update x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ml/search/SparseVectorQueryBuilder.java
  Co-authored-by: Kathleen DeRusso <[email protected]>
* Update SparseVectorQueryBuilder.java

---------

Co-authored-by: elasticsearchmachine <[email protected]>
Co-authored-by: Kathleen DeRusso <[email protected]>

mridula-s109 requested a review from kderusso September 14, 2025 21:17

mridula-s109 self-assigned this Sep 14, 2025

elasticsearchmachine added the v9.2.0 label Sep 14, 2025

kderusso reviewed Sep 15, 2025

elasticsearchmachine added v9.3.0 and removed v9.2.0 labels Oct 2, 2025

mridula-s109 added 2 commits October 2, 2025 14:30

Removed the TODO

a7f26f3

Updated the sparevector similar to Semantic query builder

2b3a991

mridula-s109 force-pushed the refactor_sparsevectorquerybuilder branch from 3ff23bf to 2b3a991 Compare October 2, 2025 15:30

Merge branch 'main' into refactor_sparsevectorquerybuilder

e82fd03

Merge branch 'main' into refactor_sparsevectorquerybuilder

2cef34d

mridula-s109 changed the title ~~Refactor SparseVectorQueryBuilder out of ML plugin~~ Cleanup: Align SparseVectorQueryBuilder with InferenceAction and remove legacy TODO Oct 2, 2025

mridula-s109 changed the title ~~Cleanup: Align SparseVectorQueryBuilder with InferenceAction and remove legacy TODO~~ Cleanup: Align SparseVectorQueryBuilder with InferenceAction Oct 2, 2025

mridula-s109 marked this pull request as ready for review October 2, 2025 15:59

mridula-s109 requested review from ioanatia and kderusso October 2, 2025 15:59

elasticsearchmachine added the needs:triage label Oct 2, 2025

mridula-s109 added the Team:Search Relevance and Team:Search - Relevance labels Oct 2, 2025

elasticsearchmachine removed the Team:Search Relevance and Team:Search - Relevance labels Oct 2, 2025

elasticsearchmachine removed the Team:Search Relevance label Oct 8, 2025

Merge branch 'main' into refactor_sparsevectorquerybuilder

62d3545

mridula-s109 added the >tech debt label Oct 22, 2025

Update docs/changelog/134702.yaml

5429d08

mridula-s109 added 2 commits October 29, 2025 11:42

Remove obsolete TODO comment from SparseVectorQueryBuilder

1da4a66

Merge branch 'main' into refactor_sparsevectorquerybuilder

6d1ecd6

mridula-s109 changed the title ~~Cleanup: Align SparseVectorQueryBuilder with InferenceAction~~ Cleanup:SparseVectorQueryBuilder with InferenceAction Oct 29, 2025

[CI] Auto commit changes from spotless

9c1eb88

kderusso reviewed Oct 29, 2025

Modified the TODO to make more sense

8973b94

mridula-s109 changed the title ~~Cleanup:SparseVectorQueryBuilder with InferenceAction~~ Cleanup: Remove obsolete TODO from SparseVectorQueryBuilder Oct 29, 2025

mridula-s109 added >non-issue and removed >tech debt labels Oct 29, 2025

mridula-s109 added 3 commits October 29, 2025 13:45

Delete docs/changelog/134702.yaml

aad211d

Merge branch 'main' into refactor_sparsevectorquerybuilder

3527e4a

Merge branch 'main' into refactor_sparsevectorquerybuilder

55496f5

kderusso reviewed Oct 29, 2025

Update x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/…

b7569b0

…ml/search/SparseVectorQueryBuilder.java Co-authored-by: Kathleen DeRusso <[email protected]>

mridula-s109 requested a review from kderusso October 29, 2025 15:01

Merge branch 'main' into refactor_sparsevectorquerybuilder

5b9eb22

kderusso reviewed Oct 29, 2025

mridula-s109 added 3 commits October 29, 2025 19:55

Update SparseVectorQueryBuilder.java

0ac0d94

Merge branch 'main' into refactor_sparsevectorquerybuilder

d9d4397

Merge branch 'main' into refactor_sparsevectorquerybuilder

fb96c4f

kderusso approved these changes Oct 30, 2025

mridula-s109 merged commit 42049d1 into elastic:main Oct 30, 2025
34 checks passed
