
Conversation

@DonalEvans (Contributor)

When creating an inference endpoint, if the inference ID is referenced by ingest pipeline processors or used in semantic_text mappings in non-empty indices, prevent the endpoint from being created.

Closes #124272

@DonalEvans added the >bug, :ml (Machine learning), Team:ML (Meta label for the ML team), and v9.3.0 labels on Oct 23, 2025
@elasticsearchmachine (Collaborator)

Hi @DonalEvans, I've created a changelog YAML for you.

@elasticsearchmachine (Collaborator)

Pinging @elastic/ml-core (Team:ML)

Comment on lines 278 to 291
// Search for documents in the indices
for (String indexName : indicesUsingInferenceId) {
    SearchRequest countRequest = new SearchRequest(indexName);
    countRequest.indicesOptions(IndicesOptions.LENIENT_EXPAND_OPEN);
    countRequest.allowPartialSearchResults(true);
    // We just need to know whether any documents exist at all
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder().size(0).trackTotalHits(true).trackTotalHitsUpTo(1);
    countRequest.source(searchSourceBuilder);
    SearchResponse searchResponse = client.search(countRequest).actionGet();
    if (searchResponse.getHits().getTotalHits().value() > 0) {
        nonEmptyIndices.add(indexName);
    }
    searchResponse.decRef();
}
Member

I am not sure we should check whether a document exists in the index or not. We have no control over how "live" this lookup actually is. Maybe somebody is indexing something between your check here and your logic to delete?

If at all possible, we should simply verify the indices & pipelines using the IDs, and that's it.

Contributor Author

If we prevent inference endpoint creation whenever any indices exist that reference that endpoint's ID, regardless of whether those indices have actually used the endpoint, then we prevent users from creating an index with a semantic text field mapping and then creating the endpoint used by that mapping. This would be a potentially breaking change, since we don't currently enforce any ordering when creating mappings/inference endpoints; we just log a warning if the index is created before the endpoint.

I think there might be a misunderstanding about what this change is doing, as you mention "your logic to delete," but there is no deletion happening in this code path. The inference endpoint is either created successfully, or not created at all because there are indices that use the inference ID and have documents in them. If a user tries to index documents before creating the endpoint, they will get a ResourceNotFoundException because the endpoint referenced by the semantic text field doesn't exist yet, so I'm not sure it's possible to get into any kind of race here where indexing a document while the endpoint is being created results in a bad state.

The one exception to this would be the scenario where a user creates the mapping, indexes some documents that don't contain a semantic text field (which should not result in a ResourceNotFoundException), then creates the endpoint. In this scenario, with the changes in this PR, the endpoint creation would fail because documents exist in the index, whereas without the changes, the endpoint would be created successfully. I'm not sure if this is a use case we want to support though, as it seems reasonable to require that users create their inference endpoints before they start indexing documents (even if those documents don't use the endpoint).

Member

  1. My point stays the same, on creation or deletion. If we are doing ANY logic based on "do docs exist", it's incredibly likely that there is a race condition and the guarantees just aren't there. You cannot make a strong guarantee on documents existing or not in an index.
  2. Reading [ML] When creating an inference endpoint check existing semantic text fields for usage #124272 (comment), I agree that we should prevent mixing of tasks/vital qualities. But this applies no matter what part of the life-cycle we are in. To me, it seems like the validation check should only be on the "vital qualities" of the model. If users force-deleted an inference ID, they have made a drastic, purposeful change. They have shown they do not want "our protection".

Contributor Author @DonalEvans (Oct 28, 2025)

The original plan was to check if the model configuration for any existing semantic text fields was compatible with the new endpoint being created, and while we do store information about task type, embedding type and number of dimensions for a semantic text field, we crucially don't store information about the model name. Since different models may produce incompatible embeddings even if they're using the same embedding type and dimensions, we have no way to guarantee that the new endpoint being created will be compatible with any existing semantic text fields. I'll add a comment to the issue about this for visibility.
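
To illustrate the gap described above, here is a minimal, self-contained sketch (hypothetical record and field names, not the real Elasticsearch classes) of the information that ends up stored for a semantic text field; because the model name is not part of it, two different models can produce identical stored settings:

```java
// Hypothetical illustration only: not the actual Elasticsearch mapping classes.
public class StoredSemanticTextSettingsExample {

    // Roughly the non-updatable pieces persisted for a semantic_text field once
    // inference results have been indexed. Note there is no model name field.
    record StoredSettings(
        String taskType,     // e.g. "text_embedding" or "sparse_embedding"
        Integer dimensions,  // null for sparse embeddings
        String elementType,  // e.g. "float" or "byte"
        String similarity    // e.g. "cosine" or "dot_product"
    ) {}

    public static void main(String[] args) {
        // Two different dense models can legitimately produce identical stored settings,
        // so compatibility with a new endpoint cannot be fully guaranteed from this data.
        StoredSettings fromModelA = new StoredSettings("text_embedding", 384, "float", "cosine");
        StoredSettings fromModelB = new StoredSettings("text_embedding", 384, "float", "cosine");
        System.out.println("Indistinguishable by stored settings: " + fromModelA.equals(fromModelB));
    }
}
```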

I'm still not sure I understand the issue with potential race conditions between endpoint creation and document indexing. As far as I understand it, if the endpoint hasn't been created when we try to index a document with a semantic text field, the indexing will simply fail, and the endpoint creation will continue and succeed, regardless of when it checks whether there are docs in the index (because at no point are there any docs in it). I don't see a possibility of a race where we successfully create the endpoint AND end up with a document in the index which has the wrong inference configuration, but I am new to Elasticsearch so I may just have misunderstood how things work. I hope it doesn't seem like I'm being difficult, I just want to understand why this approach won't result in the behaviour we want.

Member @benwtrent (Oct 29, 2025)

I realize we don't have the model name.

Here are valid scenarios for docs existing in the index that should be allowed:

  1. The index is being restored via snapshot to a new cluster and has inference results
  2. The index was created without a semantic_text field, documents indexed, then one added later
  3. The index was created with a semantic_text field, but all documents indexed never actually used the field
  4. The user has an index with semantic_text referencing the model ID, but force deleted the previous model_id (indicating they are in control)
  5. The index exists, but the user hasn't fully completed a "delete by query" (it could be in flight) that is removing all the previous docs that reference the inference field.

In all 5 of those, we should not prevent the creation of the endpoint based on documents existing.

I realize only one of those is a "race condition". Sorry for the confusion and conflation of terms. I am simply trying to say that we should only validate the critical pieces of the model that are non-updatable in the mapping, if they exist (task type, dimensions, etc.). If those critical pieces of information do not exist, we should defer to the user's judgement and allow the endpoint to be created anyway.
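
A rough sketch of the validation being described, using hypothetical types and method names rather than the actual PR code: reject creation only when stored, non-updatable settings exist for the field and conflict with the new endpoint; if nothing has been stored yet, defer to the user and allow the creation.

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical sketch, not the actual Elasticsearch implementation.
public class EndpointCompatibilityCheckSketch {

    record ModelSettings(String taskType, Integer dimensions, String elementType) {}

    /** Returns an error message if the new endpoint conflicts with stored settings, or empty to allow creation. */
    static Optional<String> checkCompatibility(Optional<ModelSettings> stored, ModelSettings newEndpoint) {
        if (stored.isEmpty()) {
            // Nothing has been indexed against this field yet, so there is nothing to conflict with.
            return Optional.empty();
        }
        ModelSettings existing = stored.get();
        if (existing.taskType().equals(newEndpoint.taskType()) == false) {
            return Optional.of("task type mismatch: " + existing.taskType() + " vs " + newEndpoint.taskType());
        }
        if (Objects.equals(existing.dimensions(), newEndpoint.dimensions()) == false) {
            return Optional.of("dimensions mismatch: " + existing.dimensions() + " vs " + newEndpoint.dimensions());
        }
        if (Objects.equals(existing.elementType(), newEndpoint.elementType()) == false) {
            return Optional.of("element type mismatch: " + existing.elementType() + " vs " + newEndpoint.elementType());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        ModelSettings newEndpoint = new ModelSettings("text_embedding", 128, "float");
        // No stored settings yet: creation is allowed.
        System.out.println(checkCompatibility(Optional.empty(), newEndpoint));
        // Stored settings with different dimensions: creation should be rejected.
        Optional<ModelSettings> stored = Optional.of(new ModelSettings("text_embedding", 384, "float"));
        System.out.println(checkCompatibility(stored, newEndpoint));
    }
}
```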

Contributor Author

Thanks for the explanation! So it sounds like we're okay with the possibility of a user creating an endpoint that isn't compatible with existing mappings (if the model is different, for example), but that we should at least check those model settings that we do have access to.

I'll implement that logic for indices with semantic text fields, but I'm not sure if it's possible to do the same kind of fine-grained check for pipelines, since as far as I can tell, they only store the inference endpoint ID and not any of the model settings. Would preventing endpoint creation in the event that any pipelines exist which reference the endpoint ID be acceptable, or is that too heavy-handed, and we should dispense with checking the pipelines altogether?

Member

Would preventing endpoint creation in the event that any pipelines exist which reference the endpoint ID be acceptable, or is that too heavy-handed, and we should dispense with checking the pipelines altogether?

I think we should dispense with checking pipelines altogether. Pipelines and indices are completely separate from each other, so one shouldn't impact the other, I don't think. At least, not in this case.

- Do not check for pipelines using the inference ID
- Check if existing semantic text fields have compatible model settings
- Update and expand test coverage for the new behaviour
- Improve existing test InferenceServiceExtension implementations
- Move SemanticTextInfoExtractor from xpack.core.ml.utils to
  xpack.inference.common
@DonalEvans requested a review from benwtrent on October 31, 2025 at 00:52
Comment on lines +348 to +353
// Try to create an inference endpoint with the same ID but different dimensions
// from when the document with the semantic text field was indexed
ResponseException responseException = assertThrows(
    ResponseException.class,
    () -> putModel(endpointId, mockDenseServiceModelConfig(128), TaskType.TEXT_EMBEDDING)
);
Member

Seems like this should fail even if there aren't any docs in the index.

Contributor Author

Do you mean that it should fail if no docs with semantic text fields were ever indexed, or it should fail if docs were indexed but then removed from the index? In the former case, we wouldn't have stored the model details, because that information is only populated when we get the inference results back when ingesting a document with a semantic text field. In the latter case, the endpoint creation does fail with these changes, because the semantic text field mapping isn't cleared when documents are deleted.

Since we don't persist information about the model until a doc with a semantic text field is indexed, I don't think there's any risk to the user if they create an endpoint with certain settings, delete it, then create a new one with different settings, but never index any documents with semantic text fields before creating the endpoint for the second time.

Member

Since we don't persist information about the model until a doc with a semantic text field is indexed, I don't think there's any risk to the user if they create an endpoint with certain settings,

Yep, you are correct. I have been out of the loop here for a bit. LGTM!

Comment on lines 65 to 67
-    static final String DIMENSIONS_FIELD = "dimensions";
-    static final String SIMILARITY_FIELD = "similarity";
-    static final String ELEMENT_TYPE_FIELD = "element_type";
+    public static final String DIMENSIONS_FIELD = "dimensions";
+    public static final String SIMILARITY_FIELD = "similarity";
+    public static final String ELEMENT_TYPE_FIELD = "element_type";
Member

Do these need to be public?

Member

I realize it's for consistency, but having it in this PR implies an access/behavior change, and there isn't one.

Contributor Author

My mistake, I was originally using these in tests, but then realized that ServiceFields already defines the same Strings as public. I'll put them back the way they were. It would be nice if we could have a single source of truth for these sorts of constants, because they're defined independently in a lot of different places, which makes it a real pain if we ever want to update them.

@DonalEvans added the auto-backport (Automatically create backport pull requests when merged), branch:9.2, branch:9.1, and branch:8.19 labels on Nov 3, 2025
@DonalEvans merged commit 635fda1 into elastic:main on Nov 3, 2025 (34 checks passed)
@DonalEvans deleted the check-existing-semantic-text-fields-when-creating-inference-endpoint branch on November 3, 2025 at 21:35
@elasticsearchmachine (Collaborator)

💔 Backport failed

Branch  Result
9.2     Commit could not be cherrypicked due to conflicts
8.19    Commit could not be cherrypicked due to conflicts
9.1     Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 137055

DonalEvans added a commit to DonalEvans/elasticsearch that referenced this pull request Nov 4, 2025
…ngs (elastic#137055)

When creating an inference endpoint, if the inference ID is used in
incompatible semantic_text mappings, prevent the endpoint from
being created.

Closes elastic#124272

- Check if existing semantic text fields have compatible model settings
- Update and expand test coverage for the new behaviour
- Improve existing test InferenceServiceExtension implementations
- Move SemanticTextInfoExtractor from xpack.core.ml.utils to
  xpack.inference.common

(cherry picked from commit 635fda1)

# Conflicts:
#	x-pack/plugin/inference/qa/test-service-plugin/src/main/java/org/elasticsearch/xpack/inference/mock/TestDenseInferenceServiceExtension.java
DonalEvans added a commit to DonalEvans/elasticsearch that referenced this pull request Nov 4, 2025
…ngs (elastic#137055)

When creating an inference endpoint, if the inference ID is used in
incompatible semantic_text mappings, prevent the endpoint from
being created.

Closes elastic#124272

- Check if existing semantic text fields have compatible model settings
- Update and expand test coverage for the new behaviour
- Improve existing test InferenceServiceExtension implementations
- Move SemanticTextInfoExtractor from xpack.core.ml.utils to
  xpack.inference.common

(cherry picked from commit 635fda1)

# Conflicts:
#	x-pack/plugin/inference/qa/test-service-plugin/src/main/java/org/elasticsearch/xpack/inference/mock/TestDenseInferenceServiceExtension.java
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/LocalStateInferencePlugin.java
@DonalEvans (Contributor Author)

💚 All backports created successfully

Backport branches: 9.2, 9.1, 8.19

Questions? Please refer to the Backport tool documentation.

DonalEvans added a commit to DonalEvans/elasticsearch that referenced this pull request Nov 4, 2025
…ngs (elastic#137055)

When creating an inference endpoint, if the inference ID is used in
incompatible semantic_text mappings, prevent the endpoint from
being created.

Closes elastic#124272

- Check if existing semantic text fields have compatible model settings
- Update and expand test coverage for the new behaviour
- Improve existing test InferenceServiceExtension implementations
- Move SemanticTextInfoExtractor from xpack.core.ml.utils to
  xpack.inference.common

(cherry picked from commit 635fda1)

# Conflicts:
#	x-pack/plugin/inference/qa/test-service-plugin/src/main/java/org/elasticsearch/xpack/inference/mock/TestDenseInferenceServiceExtension.java
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/LocalStateInferencePlugin.java
phananh1010 added a commit to phananh1010/elasticsearch that referenced this pull request Nov 6, 2025
BASE=20ce72d7b9a4a1c6848c11ca4a372cf4b95689fe
HEAD=d33ebfba924e43642ba5fd3ccacd5aec5b2ba0db
Branch=main
phananh1010 added a commit to phananh1010/elasticsearch that referenced this pull request Nov 7, 2025
BASE=20ce72d7b9a4a1c6848c11ca4a372cf4b95689fe
HEAD=d33ebfba924e43642ba5fd3ccacd5aec5b2ba0db
Branch=main

Labels

auto-backport (Automatically create backport pull requests when merged), backport pending, >bug, :ml (Machine learning), Team:ML (Meta label for the ML team), v8.19.7, v9.1.7, v9.2.1, v9.3.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ML] When creating an inference endpoint check existing semantic text fields for usage

3 participants