Commit 2058c4c

Fix flaky MMR diversification YAML tests (elastic#143706)
The default int8_hnsw index type quantizes float32 vectors to int8, introducing enough scoring error to non-deterministically reorder documents with close cosine similarities. With only 4 dimensions the quantization is particularly coarse.

Use explicit hnsw index type on test dense_vector mappings to get exact float scoring and deterministic KNN result ordering. Update expected results to reflect exact cosine ordering.

Closes elastic#143430
Closes elastic#143609
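
The reordering effect described in the commit message can be illustrated with a small sketch. It uses a naive linear quantizer over the range [0, 1], which is an assumption made for illustration only; Elasticsearch's actual int8_hnsw quantization scheme is more involved and is not reproduced here. The vectors `a`, `b` and the query are hypothetical values chosen so that their exact cosine similarities are close.

```python
import math


def quantize(vec, levels=127):
    """Naive linear quantizer (illustrative assumption, not Elasticsearch's
    scheme): map each component in [0, 1] to the nearest integer step."""
    return [round(x * levels) for x in vec]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return document names ordered by descending cosine similarity."""
    return sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)


# Hypothetical 4-dimensional vectors with nearly identical cosine scores.
query = [1.0, 0.0, 0.0, 0.0]
docs = {
    "a": [1.0, 0.503, 0.0, 0.0],
    "b": [0.6, 0.3025, 0.0, 0.0],
}

exact_order = rank(query, docs)

# Quantize both the query and the documents, then rank again.
quantized_docs = {name: quantize(vec) for name, vec in docs.items()}
quantized_order = rank(quantize(query), quantized_docs)

print(exact_order)      # ['a', 'b']
print(quantized_order)  # ['b', 'a']
```

With float scoring, `a` narrowly outranks `b`; after rounding to integer steps the order flips, which is the kind of instability a test asserting a fixed hit order cannot tolerate.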
1 parent 2ad21ee commit 2058c4c

2 files changed: 11 additions, 11 deletions

muted-tests.yml

Lines changed: 0 additions & 6 deletions
@@ -366,15 +366,9 @@ tests:
 - class: org.elasticsearch.xpack.esql.qa.mixed.EsqlClientYamlIT
   method: test {p0=esql/40_tsdb/TS Command grouping on text field}
   issue: https://github.com/elastic/elasticsearch/issues/142544
-- class: org.elasticsearch.multiproject.test.CoreWithMultipleProjectsClientYamlTestSuiteIT
-  method: test {yaml=search.retrievers/result-diversification/10_mmr_result_diversification_retriever/Test MMR result diversification single index float type}
-  issue: https://github.com/elastic/elasticsearch/issues/143430
 - class: org.elasticsearch.repositories.azure.AzureBlobContainerRetriesTests
   method: testWriteLargeBlob
   issue: https://github.com/elastic/elasticsearch/issues/143551
-- class: org.elasticsearch.multiproject.test.CoreWithMultipleProjectsClientYamlTestSuiteIT
-  method: test {yaml=search.retrievers/result-diversification/10_mmr_result_diversification_retriever/Test MMR result diversification multiple indexes}
-  issue: https://github.com/elastic/elasticsearch/issues/143609
 - class: org.elasticsearch.index.query.PrefixQueryBuilderTests
   method: testPrefixCircuitBreakerTripsWithLowLimit
   issue: https://github.com/elastic/elasticsearch/issues/143548

rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search.retrievers/result-diversification/10_mmr_result_diversification_retriever.yml

Lines changed: 11 additions & 5 deletions
@@ -23,6 +23,8 @@ setup:
 textvector:
   type: dense_vector
   dims: 4
+  index_options:
+    type: hnsw
 id:
   type: integer

@@ -41,6 +43,8 @@ setup:
 textvector:
   type: dense_vector
   dims: 4
+  index_options:
+    type: hnsw
 id:
   type: integer

@@ -76,6 +80,8 @@ setup:
 textvector:
   type: dense_vector
   dims: 4
+  index_options:
+    type: hnsw
 rankfield:
   type: integer
 id:
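
For readers who create such an index programmatically, the mapping change in the hunks above can be sketched as a request body. This is a minimal sketch: the `mappings`/`properties` wrapper and the helper name `dense_vector_mapping` are assumptions added for illustration, while the field names and the `index_options` setting come from the diff.

```python
import json


def dense_vector_mapping(dims, index_type="hnsw"):
    """Build a create-index body with an explicit dense_vector index type.

    Hypothetical helper; the field names mirror the test mappings in this diff.
    """
    return {
        "mappings": {
            "properties": {
                "textvector": {
                    "type": "dense_vector",
                    "dims": dims,
                    # Explicit hnsw keeps float vectors unquantized,
                    # per the commit message.
                    "index_options": {"type": index_type},
                },
                "id": {"type": "integer"},
            }
        }
    }


body = dense_vector_mapping(dims=4)
print(json.dumps(body, indent=2))
```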
@@ -215,7 +221,7 @@ teardown:
   - match: { hits.hits.6._source.textbody: "ninth text" }
   - match: { hits.hits.7._source.textbody: "seventh text" }
   - match: { hits.hits.8._source.textbody: "eighth text" }
-  - match: { hits.hits.9._source.textbody: "fourth text" }
+  - match: { hits.hits.9._source.textbody: "fifth text" }

   - do:
       search:
@@ -308,7 +314,7 @@ teardown:
   - match: { hits.hits.6._source.textbody: "ninth text" }
   - match: { hits.hits.7._source.textbody: "seventh text" }
   - match: { hits.hits.8._source.textbody: "eighth text" }
-  - match: { hits.hits.9._source.textbody: "fourth text" }
+  - match: { hits.hits.9._source.textbody: "fifth text" }

   - do:
       search:
@@ -356,7 +362,7 @@ teardown:
   - length: { hits.hits: 3 }
   - match: { hits.hits.0._source.textbody: "seventh text" }
   - match: { hits.hits.1._source.textbody: "eighth text" }
-  - match: { hits.hits.2._source.textbody: "fourth text" }
+  - match: { hits.hits.2._source.textbody: "fifth text" }

 ---
 "Test MMR result diversification byte vector type":
@@ -459,7 +465,7 @@ teardown:
   - match: { hits.hits.6._source.textbody: "ninth text" }
   - match: { hits.hits.7._source.textbody: "third text other" }
   - match: { hits.hits.8._source.textbody: "third text duplicate other" }
-  - match: { hits.hits.9._source.textbody: "fifth text other" }
+  - match: { hits.hits.9._source.textbody: "sixth text other" }

   - do:
       search:
@@ -514,7 +520,7 @@ teardown:
   - match: { hits.hits.6._source.textbody: "ninth text" }
   - match: { hits.hits.7._source.textbody: "seventh text" }
   - match: { hits.hits.8._source.textbody: "eighth text" }
-  - match: { hits.hits.9._source.textbody: "fourth text" }
+  - match: { hits.hits.9._source.textbody: "fifth text" }

 ---
 "Test MMR result diversification rank_window_size restricts top docs":
