Commit 7727dff

Knn vector rescoring to sort score docs (#122653) (#122678)
RescoreKnnVectorQuery rewrites to KnnScoreDocQuery, which takes a sorted array of doc ids and a corresponding array of scores for those docs. When scoring docs, a binary search is performed on the docs array, and the global ids are converted back to segment-level ids by subtracting the context docBase.

RescoreKnnVectorQuery did not sort the array of docs, which caused the binary search to return non-deterministic results. That in turn made us look up the wrong docs, sometimes using out-of-bounds ids. One symptom was a DfsProfilerIT test failure, which triggered a Lucene assertion about a doc id being outside the range of the live docs bitset.

The fix is simply to sort the score docs array before extracting doc ids and scores and providing them to KnnScoreDocQuery upon rewrite.

Relates to #116663
Closes #119711
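The lookup described in the commit message can be illustrated with a minimal, self-contained sketch. This is not the KnnScoreDocQuery implementation; the class `SortedDocLookup` and its methods are hypothetical names chosen for illustration, and only `java.util.Arrays` is used. It shows why the doc id array has to be sorted before a binary search, and how a global doc id maps to a segment-level id.

```java
import java.util.Arrays;

// Hypothetical illustration; not code from Elasticsearch or Lucene.
final class SortedDocLookup {

    // Returns the position of globalDoc in docs, or -1 if it is absent.
    // Arrays.binarySearch is only defined for arrays sorted in ascending order.
    static int indexOf(int[] docs, int globalDoc) {
        int idx = Arrays.binarySearch(docs, globalDoc);
        return idx >= 0 ? idx : -1;
    }

    // Converts a global doc id to a segment-level id by subtracting the segment's docBase.
    static int toSegmentDoc(int globalDoc, int docBase) {
        return globalDoc - docBase;
    }

    public static void main(String[] args) {
        int[] byScore = {42, 7, 19, 3};  // order in which score-ranked hits might arrive
        int[] sorted = byScore.clone();
        Arrays.sort(sorted);             // {3, 7, 19, 42}

        // On the sorted copy the lookup is well defined.
        System.out.println(indexOf(sorted, 42));
        // On the unsorted array the result is undefined by the method's contract,
        // so a doc that is present may be reported as missing or at the wrong slot.
        System.out.println(indexOf(byScore, 42));
        // A global id of 42 in a segment whose docBase is 40 is local doc 2.
        System.out.println(toSegmentDoc(42, 40));
    }
}
```

The unsorted case is the failure mode the commit addresses: the search may land on the wrong slot, and a later docBase subtraction then produces an id that does not belong to the segment.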
1 parent 9760c38 commit 7727dff

3 files changed: +8 -3 lines changed

docs/changelog/122653.yaml

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+pr: 122653
+summary: Knn vector rescoring to sort score docs
+area: Vector Search
+type: bug
+issues:
+ - 119711

muted-tests.yml

Lines changed: 0 additions & 3 deletions
@@ -338,9 +338,6 @@ tests:
 - class: org.elasticsearch.xpack.restart.FullClusterRestartIT
   method: testWatcherWithApiKey {cluster=UPGRADED}
   issue: https://github.com/elastic/elasticsearch/issues/119396
-- class: org.elasticsearch.search.profile.dfs.DfsProfilerIT
-  method: testProfileDfs
-  issue: https://github.com/elastic/elasticsearch/issues/119711
 - class: org.elasticsearch.xpack.security.authc.ldap.ADLdapUserSearchSessionFactoryTests
   issue: https://github.com/elastic/elasticsearch/issues/119882
 - class: org.elasticsearch.index.mapper.AbstractShapeGeometryFieldMapperTests

server/src/main/java/org/elasticsearch/search/vectors/RescoreKnnVectorQuery.java

Lines changed: 2 additions & 0 deletions
@@ -23,6 +23,7 @@
 
 import java.io.IOException;
 import java.util.Arrays;
+import java.util.Comparator;
 import java.util.Objects;
 
 /**
@@ -60,6 +61,7 @@ public Query rewrite(IndexSearcher searcher) throws IOException {
         TopDocs topDocs = searcher.search(query, k);
         vectorOperations = topDocs.totalHits.value;
         ScoreDoc[] scoreDocs = topDocs.scoreDocs;
+        Arrays.sort(scoreDocs, Comparator.comparingInt(scoreDoc -> scoreDoc.doc));
         int[] docIds = new int[scoreDocs.length];
         float[] scores = new float[scoreDocs.length];
         for (int i = 0; i < scoreDocs.length; i++) {
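The effect of the added `Arrays.sort` call can be tried in isolation with the sketch below. It uses a hypothetical `Hit` class as a stand-in for Lucene's `ScoreDoc` so that it runs without Lucene on the classpath; `SortScoreDocs`, `sortByDoc`, `docIds` and `scores` are illustrative names, not part of the commit. The point is that sorting the paired objects first keeps each score aligned with its doc id when the two parallel arrays are extracted.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration; Hit stands in for Lucene's ScoreDoc.
final class SortScoreDocs {

    // A doc id paired with its score.
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Sorts hits in place by ascending doc id, mirroring the comparator used in the fix.
    static void sortByDoc(Hit[] hits) {
        Arrays.sort(hits, Comparator.comparingInt(hit -> hit.doc));
    }

    // Extracts the doc ids in the current array order.
    static int[] docIds(Hit[] hits) {
        int[] ids = new int[hits.length];
        for (int i = 0; i < hits.length; i++) {
            ids[i] = hits[i].doc;
        }
        return ids;
    }

    // Extracts the scores in the current array order, so scores[i] belongs to docIds[i].
    static float[] scores(Hit[] hits) {
        float[] out = new float[hits.length];
        for (int i = 0; i < hits.length; i++) {
            out[i] = hits[i].score;
        }
        return out;
    }

    public static void main(String[] args) {
        // Hits as a score-ranked search might return them: best score first.
        Hit[] hits = {new Hit(42, 0.9f), new Hit(7, 0.7f), new Hit(3, 0.5f)};
        sortByDoc(hits);
        System.out.println(Arrays.toString(docIds(hits)));
        System.out.println(Arrays.toString(scores(hits)));
    }
}
```

Sorting the doc id array on its own after extraction would break the pairing with the scores array, which is presumably why the fix sorts the `ScoreDoc` objects before the loop.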
