Skip to content
3 changes: 3 additions & 0 deletions lucene/CHANGES.txt
Original file line number Diff line number Diff line change
Expand Up @@ -328,6 +328,9 @@ Optimizations

* GITHUB#15618: SegmentTermsEnumFrame: avoid unnecessary array allocations that waste memory (Misha Dmitriev)

* GITHUB#15607: Utilize bulk scoring for diversity checking when building HNSW vector indices. This results
in some performance improvements during indexing and segment merges. (Ben Trent)

Bug Fixes
---------------------
* GITHUB#14161: PointInSetQuery's constructor now throws IllegalArgumentException
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,8 @@ public class HnswGraphBuilder implements HnswBuilder {
protected final int M; // max number of connections on upper layers
private final double ml;

private final int[] bulkScoreNodes; // for bulk scoring
private final float[] bulkScores; // for bulk scoring
Comment on lines +72 to +73
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was worried about thread-safety of these arrays (given we have concurrent merging), but from this comment it looks like instances of this class are not shared across threads, but rather multiple instances of this class (across different threads) can operate on a single HnswGraph?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaivalnp the scorer itself isn't threadsafe. I assumed that since we were using a scorer, we were OK.

I had the same threading concerns and looked up and it seems that each thread as a unique builder object instance (The worker of the thread) and they all work on the same graph.

private final SplittableRandom random;
protected final UpdateableRandomVectorScorer scorer;
protected final HnswGraphSearcher graphSearcher;
Expand Down Expand Up @@ -156,6 +158,10 @@ protected HnswGraphBuilder(
this.hnsw = hnsw;
this.hnswLock = hnswLock;
this.graphSearcher = graphSearcher;
// pick a number that keeps us from scoring TOO much for diversity checking
// but enough to take advantage of bulk scoring
this.bulkScoreNodes = new int[8];
this.bulkScores = new float[8];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe host 8 up as a private static final variable?

entryCandidates = new GraphBuilderKnnCollector(1);
beamCandidates = new GraphBuilderKnnCollector(beamWidth);
beamCandidates0 = new GraphBuilderKnnCollector(Math.min(beamWidth / 2, M * 3));
Expand Down Expand Up @@ -470,12 +476,18 @@ static void popToScratch(GraphBuilderKnnCollector candidates, NeighborArray scra
*/
private boolean diversityCheck(float score, NeighborArray neighbors, RandomVectorScorer scorer)
throws IOException {
int bulkCount = 0;
final int bulkScoreChunk = Math.min((neighbors.nodes().length + 1) / 2, bulkScoreNodes.length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC neighbors.nodes() returns the internal array used to store neighbor nodes, which can be larger than actual number of neighbors and is exponentially growing -- can we have edge cases where the length is about twice the actual number of neighbors, causing us to bulk score everything?

I wonder if we should use neighbors.size() instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call! And good call on the array copy!, Just needed to handle the tail. New benchmarks show even nicer perf ;)

for (int i = 0; i < neighbors.size(); i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we don't need to perform any operation per-node of this loop, can we reduce some iterations using something like:

private boolean diversityCheck(float score, NeighborArray neighbors, RandomVectorScorer scorer)
    throws IOException {
  final int chunk = Math.min(Math.ceilDiv(neighbors.size(), 2), bulkScoreNodes.length);
  for (int start = 0; start < neighbors.size(); start += chunk) {
    int length = Math.min(neighbors.size() - start, chunk);
    System.arraycopy(neighbors.nodes(), start, bulkScoreNodes, 0, length);
    if (scorer.bulkScore(bulkScoreNodes, bulkScores, length) >= score) {
      return false;
    }
  }
  return true;
}

float neighborSimilarity = scorer.score(neighbors.nodes()[i]);
if (neighborSimilarity >= score) {
return false;
bulkScoreNodes[bulkCount++] = neighbors.nodes()[i];
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this more, doing 8 always, even if there are only 8 connections seems foolish. I will benchmark with Math.min((neighbors.nodes().length + 1)/2, bulkScoreNodes.length)).

if (bulkCount == bulkScoreChunk || i == neighbors.size() - 1) {
if (scorer.bulkScore(bulkScoreNodes, bulkScores, bulkCount) >= score) {
return false;
}
bulkCount = 0;
}
}
assert bulkCount == 0;
return true;
}

Expand Down