
Commit 9340fb7

Simplify HnswGraph#search. (#627)
Currently, the contract on `bound` is that it holds the score at the top of the `results` priority queue. This means that a candidate is only considered if its score is better than the bound *or* if fewer than `topK` results have been accumulated so far. I think it would be simpler if `bound` always held the minimum score required for a candidate to be considered. This would also be more consistent with how our WAND support works, which trusts `setMinCompetitiveScore` alone instead of also checking whether the priority queue is full.
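To make the proposed contract concrete, here is a minimal sketch of a bound that alone decides whether a candidate is considered. `MinScoreBound` is a hypothetical stand-in written for illustration, not Lucene's `BoundsChecker`. It assumes that higher scores are better and that `check` returns `true` when a score fails the bound, which is how the diff appears to use it.

```java
/** Hypothetical illustration of the proposed contract; not Lucene's BoundsChecker. */
final class MinScoreBound {
  // Assumption: higher scores are better. Until the caller sets a bound
  // (i.e. until topK results exist), every score is competitive.
  private float minScore = Float.NEGATIVE_INFINITY;

  /** Raises (or sets) the minimum score a candidate must reach. */
  void set(float score) {
    minScore = score;
  }

  /** Returns true when the score is NOT competitive and the candidate can be skipped. */
  boolean check(float score) {
    return score < minScore;
  }
}
```

Because the bound starts at a value that rejects nothing, callers no longer need a separate "is the queue full yet?" test next to every `check` call.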
1 parent 68beb1a commit 9340fb7


lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraph.java

Lines changed: 11 additions & 10 deletions
@@ -167,17 +167,17 @@ static NeighborQueue searchLevel(
       }
     }
 
-    // Set the bound to the worst current result and below reject any newly-generated candidates
-    // failing to exceed this bound
+    // A bound that holds the minimum similarity to the query vector that a candidate vector must
+    // have to be considered.
     BoundsChecker bound = BoundsChecker.create(similarityFunction.reversed);
-    bound.set(results.topScore());
+    if (results.size() >= topK) {
+      bound.set(results.topScore());
+    }
     while (candidates.size() > 0) {
       // get the best candidate (closest or best scoring)
       float topCandidateScore = candidates.topScore();
-      if (results.size() >= topK) {
-        if (bound.check(topCandidateScore)) {
-          break;
-        }
+      if (bound.check(topCandidateScore)) {
+        break;
       }
       int topCandidateNode = candidates.pop();
       graphValues.seek(level, topCandidateNode);
@@ -189,11 +189,12 @@ static NeighborQueue searchLevel(
         }
 
         float score = similarityFunction.compare(query, vectors.vectorValue(friendOrd));
-        if (results.size() < topK || bound.check(score) == false) {
+        if (bound.check(score) == false) {
           candidates.add(friendOrd, score);
           if (acceptOrds == null || acceptOrds.get(friendOrd)) {
-            results.insertWithOverflow(friendOrd, score);
-            bound.set(results.topScore());
+            if (results.insertWithOverflow(friendOrd, score) && results.size() >= topK) {
+              bound.set(results.topScore());
+            }
           }
         }
       }
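The pattern in the new code (gate on the bound alone, and only raise the bound once `topK` results exist) can be sketched outside of Lucene as a plain top-k collector. Everything below is an illustrative assumption: `BoundedTopK` and `collect` are hypothetical names, a `java.util.PriorityQueue` stands in for `NeighborQueue`, and higher scores are taken to be better.

```java
import java.util.PriorityQueue;

/** Hypothetical sketch of the "bound alone decides" pattern; not Lucene code. */
final class BoundedTopK {
  /** Returns the k highest scores in ascending order (fewer if the input is shorter). */
  static float[] collect(float[] scores, int k) {
    if (k < 1) {
      throw new IllegalArgumentException("k must be >= 1");
    }
    // Min-heap: the head is the worst score currently kept.
    PriorityQueue<Float> results = new PriorityQueue<>();
    // Nothing is rejected until k results have been accumulated.
    float bound = Float.NEGATIVE_INFINITY;
    for (float score : scores) {
      if (score < bound) {
        continue; // not competitive: the bound is the only gate
      }
      results.add(score);
      if (results.size() > k) {
        results.poll(); // overflow: drop the worst kept score
      }
      if (results.size() >= k) {
        bound = results.peek(); // raise the bound only once the queue is full
      }
    }
    float[] out = new float[results.size()];
    for (int i = 0; i < out.length; i++) {
      out[i] = results.poll(); // drains in ascending order
    }
    return out;
  }
}
```

For example, `BoundedTopK.collect(new float[] {0.2f, 0.9f, 0.1f, 0.5f, 0.7f}, 2)` keeps `0.7` and `0.9`: the bound stays at negative infinity for the first score, becomes `0.2` once two results exist, and then rejects `0.1` without any separate size check.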