Skip to content

Conversation

@jimczi
Copy link
Contributor

@jimczi jimczi commented Jul 3, 2025

In #129825, we modified the dense_vector field type to delay setting index options until the field's dimensions are known. However, this introduced a discrepancy for indices created before that change, which would previously default to int8_hnsw even when dimensions were not set.

This discrepancy leads to an assertion failure in mixed-version clusters, where the serialized mappings differ between nodes:

[2025-07-02T20:37:29,852][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [v9.0.4-2] fatal error in thread [elasticsearch[v9.0.4-2][clusterApplierService#updateTask][T#1]], exiting java.lang.AssertionError: provided source [{"_doc":{"properties":{"vector":{"type":"dense_vector","index":true,"similarity":"cosine"}}}}] differs from mapping [{"_doc":{"properties":{"vector":{"type":"dense_vector","index":true,"similarity":"cosine","index_options":{"type":"int8_hnsw","m":16,"ef_construction":100}}}}}]

This commit resolves the issue by ensuring that indices created before the change continue to default to int8_hnsw index options, even if dimensions remain unset.

Closes #130085

In elastic#129825, we modified the dense_vector field type to delay setting index options until the field's dimensions are known. However, this introduced a discrepancy for indices created before that change, which would previously default to int8_hnsw even when dimensions were not set.

This discrepancy leads to an assertion failure in mixed-version clusters, where the serialized mappings differ between nodes:
```
[2025-07-02T20:37:29,852][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [v9.0.4-2] fatal error in thread [elasticsearch[v9.0.4-2][clusterApplierService#updateTask][T#1]], exiting java.lang.AssertionError: provided source [{"_doc":{"properties":{"vector":{"type":"dense_vector","index":true,"similarity":"cosine"}}}}] differs from mapping [{"_doc":{"properties":{"vector":{"type":"dense_vector","index":true,"similarity":"cosine","index_options":{"type":"int8_hnsw","m":16,"ef_construction":100}}}}}]
```

This commit resolves the issue by ensuring that indices created before the change continue to default to int8_hnsw index options, even if dimensions remain unset.
@jimczi jimczi requested a review from pmpailis July 3, 2025 09:57
@jimczi jimczi added >test Issues or PRs that are addressing/adding tests :Search Relevance/Vectors Vector search v9.1.0 v9.2.0 labels Jul 3, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Jul 3, 2025
Copy link
Contributor

@pmpailis pmpailis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, thanks @jimczi !

@jimczi jimczi added the auto-backport Automatically create backport pull requests when merged label Jul 3, 2025
Copy link
Member

@kderusso kderusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

@jimczi jimczi merged commit f91124a into elastic:main Jul 3, 2025
32 checks passed
@jimczi jimczi deleted the default_dense_vector_dims branch July 3, 2025 13:13
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
9.1 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 130540

@jimczi jimczi removed the auto-backport Automatically create backport pull requests when merged label Jul 3, 2025
jimczi added a commit to jimczi/elasticsearch that referenced this pull request Jul 3, 2025
elastic#130540)

In elastic#129825, we modified the dense_vector field type to delay setting index options until the field's dimensions are known. However, this introduced a discrepancy for indices created before that change, which would previously default to int8_hnsw even when dimensions were not set.

This discrepancy leads to an assertion failure in mixed-version clusters, where the serialized mappings differ between nodes:
```
[2025-07-02T20:37:29,852][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [v9.0.4-2] fatal error in thread [elasticsearch[v9.0.4-2][clusterApplierService#updateTask][T#1]], exiting java.lang.AssertionError: provided source [{"_doc":{"properties":{"vector":{"type":"dense_vector","index":true,"similarity":"cosine"}}}}] differs from mapping [{"_doc":{"properties":{"vector":{"type":"dense_vector","index":true,"similarity":"cosine","index_options":{"type":"int8_hnsw","m":16,"ef_construction":100}}}}}]
```

This commit resolves the issue by ensuring that indices created before the change continue to default to int8_hnsw index options, even if dimensions remain unset.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport pending :Search Relevance/Vectors Vector search Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch >test Issues or PRs that are addressing/adding tests v9.1.0 v9.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CI] MixedClusterEsqlSpecIT failling on main->9.0 bwc tests

4 participants