Commits (21)
05029df
Avoid reconstructing HNSW graph during singleton merging
Jul 29, 2025
47c4d1c
Made changes in HNSW util classes to accommodate deletes for segment m…
Aug 9, 2025
42e8c0a
Restored codec files to original state
Aug 9, 2025
a6c5272
Fixed getNewOrdMapping function in ConcurrentHnswMerger for handling …
Aug 9, 2025
215d038
Merge branch 'main' into singleton
Oct 30, 2025
bd6f1ab
Fixed graph disconnected issue by adding support for reconnecting and …
Oct 31, 2025
df0f219
Removed and ignored some test cases as we are breaking graph building…
Oct 31, 2025
01666ff
Made some functions and members in HnswGraphBuilder protected and pac…
Oct 31, 2025
a792f4d
Disabled TestHnswBitVectorsFormat.testMergeStability test as the fail…
Oct 31, 2025
3f23ac9
Use HnswGraphBuilder.ml instead of inverseMaxConn
Oct 31, 2025
cb934af
Revert "Use HnswGraphBuilder.ml instead of inverseMaxConn"
Nov 4, 2025
a0f5169
No need of calculating score here
Nov 5, 2025
c23f662
Use a different factor for checking disconnected nodes
Nov 7, 2025
ff647c5
Set 85% as threshold for disconnected nodes, 40% for graph deletes an…
Nov 7, 2025
59ebbdc
Fix fixDisconnectedNodes function where node with no connection was b…
Nov 7, 2025
5a0a73d
tidy
Nov 7, 2025
f2d4f4b
Added check to repair graph only when it has deletes
Nov 11, 2025
6ff2eed
Merge branch 'main' into singleton
Nov 11, 2025
fd7aa18
Fixed vectorCount for calculating test, added test case
Nov 14, 2025
6b80092
Merge branch 'main' into singleton
Nov 14, 2025
f29f9e8
Reverted test override in TestHnswBitVectorsFormat
Nov 17, 2025

Lucene99HnswVectorsReader.java

@@ -481,7 +481,7 @@ static FieldEntry create(
}

/** Read the nearest-neighbors graph from the index input */
-  private static final class OffHeapHnswGraph extends HnswGraph {
+  static final class OffHeapHnswGraph extends HnswGraph {

final IndexInput dataIn;
final int[][] nodesByLevel;

Lucene99HnswVectorsWriter.java

@@ -32,6 +32,7 @@
import org.apache.lucene.codecs.hnsw.FlatFieldVectorsWriter;
import org.apache.lucene.codecs.hnsw.FlatVectorsScorer;
import org.apache.lucene.codecs.hnsw.FlatVectorsWriter;
import org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat;
import org.apache.lucene.index.ByteVectorValues;
import org.apache.lucene.index.DocsWithFieldSet;
import org.apache.lucene.index.FieldInfo;
@@ -69,6 +70,8 @@ public final class Lucene99HnswVectorsWriter extends KnnVectorsWriter {

private static final long SHALLOW_RAM_BYTES_USED =
RamUsageEstimator.shallowSizeOfInstance(Lucene99HnswVectorsWriter.class);
static final int DELETE_THRESHOLD_PERCENT = 30;
Contributor:
I'm curious whether we have done any testing to motivate this choice. I'd guess that as the number of gaps left in the neighborhoods by removing deleted nodes increases, we would expect a drop-off in recall, or maybe performance, but I don't have a good intuition about whether there is a knee in the curve, or how strong the effect is.
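To make the gate concrete (this mirrors the check added in mergeOneField further down; the numbers are hypothetical):

    // Hypothetical numbers, only to illustrate how DELETE_THRESHOLD_PERCENT is applied:
    int oldGraphSize = 1_000_000; // graph size before the merge
    int newLength = 750_000;      // live vectors surviving the deletes
    int deletedPercent = ((oldGraphSize - newLength) * 100) / oldGraphSize; // = 25
    boolean rewriteExistingGraph = deletedPercent < DELETE_THRESHOLD_PERCENT; // 25 < 30 -> rewrite
    // At 30% deletes or more the merger falls back to rebuilding the graph from scratch.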


private final SegmentWriteState segmentWriteState;
private final IndexOutput meta, vectorIndex;
private final int M;
@@ -347,12 +350,195 @@ private void reconstructAndWriteNeighbours(
}
}

/**
* Processes an off-heap HNSW graph, removing deleted nodes and writing the graph structure to the
* vector index.
*
* @param graph The off-heap graph to process
* @param docMap Mapping from old document IDs to new document IDs
* @param graphSize The size of the graph (number of vectors)
* @param offsets Array to store the byte offsets for each node at each level
* @return A mock HnswGraph implementation that provides access to the graph structure
* @throws IOException If an error occurs while writing to the vector index
*/
private HnswGraph deleteNodesWriteGraph(
Lucene99HnswVectorsReader.OffHeapHnswGraph graph,
Contributor:
Could we change the signature to accept an HnswGraph?

MergeState.DocMap docMap,
int graphSize,
int[][] offsets)
throws IOException {
if (graph == null) return null;

int[] scratch = new int[graph.maxConn() * 2];
final int numLevels = graph.numLevels();
final int[][] validNodesPerLevel = new int[numLevels][];

// Process all levels
for (int level = 0; level < numLevels; level++) {
int[] sortedNodes = NodesIterator.getSortedNodes(graph.getNodesOnLevel(level));

// Count and collect valid nodes
int validNodeCount = 0;
for (int node : sortedNodes) {
if (docMap.get(node) != -1) {
Contributor:
We might be able to pass in the size of the new graph? At least in the main case of merging we should know it (I think?)

validNodeCount++;
}
}

// Special case for top level with no valid nodes
if (level == numLevels - 1 && validNodeCount == 0 && level > 0) {
Contributor:
if level ==0 and validNodeCount == 0 the new graph should be empty. I'm not sure how that case will get handled here?

Contributor:
In this case though (the top level would be empty) -- isn't it also possible that a lower level is empty?

Contributor Author:
> if level ==0 and validNodeCount == 0 the new graph should be empty. I'm not sure how that case will get handled here?

This means 100% of the nodes are deleted, right? I think we will never reach this case, as the entry condition to this function checks that deletes are less than 30%.

validNodeCount = 1; // We'll create one connection to lower level
}

validNodesPerLevel[level] = new int[validNodeCount];
Contributor:
I wonder if we could avoid the up-front counting, allocate a full-sized array and then use only the part of it that we fill up

Contributor Author:
Sure
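A minimal sketch of that suggestion, using org.apache.lucene.util.ArrayUtil for the trim; illustrative only, not the PR code:

    int[] valid = new int[sortedNodes.length]; // over-allocate to the level size
    int count = 0;
    for (int node : sortedNodes) {
      int mapped = docMap.get(node); // subject to the ord-vs-docID fix discussed below
      if (mapped != -1) {
        valid[count++] = mapped;
      }
    }
    // Keep only the filled prefix instead of counting valid nodes up front.
    validNodesPerLevel[level] = ArrayUtil.copyOfSubArray(valid, 0, count);
    offsets[level] = new int[count];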

offsets[level] = new int[validNodeCount];

int validNodeIndex = 0;
int nodeOffsetId = 0;

// Process nodes at this level
for (int node : sortedNodes) {
if (docMap.get(node) == -1) {
Contributor Author (@Pulkitg64, Aug 5, 2025):
This is incorrect. The graph does not store docIDs; it stores ordinals, whereas docMap maps old docIDs to new docIDs.
The correct implementation is to create a map from old ordinals to new ordinals.

Will fix this in next revision.
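A rough sketch of that mapping, where oldVectorValues stands for the old segment's vector values for this field (the name is illustrative, not from the PR):

    int[] oldOrdToNewOrd = new int[oldVectorValues.size()];
    int nextOrd = 0;
    for (int oldOrd = 0; oldOrd < oldOrdToNewOrd.length; oldOrd++) {
      int newDoc = docMap.get(oldVectorValues.ordToDoc(oldOrd)); // -1 for deleted docs
      // In a single-segment merge the surviving vectors keep their relative order,
      // so a running counter assigns the new ordinals.
      oldOrdToNewOrd[oldOrd] = (newDoc == -1) ? -1 : nextOrd++;
    }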

continue;
}

// Store mapped node ID
int mappedNode = docMap.get(node);
validNodesPerLevel[level][validNodeIndex++] = mappedNode;

// Process neighbors
graph.seek(level, node);
long offsetStart = vectorIndex.getFilePointer();

// Process and write neighbors with delta encoding
writeNeighbors(graph, docMap, scratch, level == 0 ? graphSize : 0);

offsets[level][nodeOffsetId++] =
Math.toIntExact(vectorIndex.getFilePointer() - offsetStart);
}

// Special case for empty top level
Contributor:
Perhaps we should special-case the first empty level we find and make that the top level, unless it is the bottom level, in which case the whole graph is empty.

if (level == numLevels - 1 && validNodeIndex == 0 && level > 0) {
int entryNode = validNodesPerLevel[level - 1][0];
validNodesPerLevel[level][0] = entryNode;

long offsetStart = vectorIndex.getFilePointer();
vectorIndex.writeVInt(1);
vectorIndex.writeVInt(entryNode);
offsets[level][0] = Math.toIntExact(vectorIndex.getFilePointer() - offsetStart);
}
}

return new HnswGraph() {
@Override
public int nextNeighbor() {
throw new UnsupportedOperationException();
}

@Override
public void seek(int level, int target) {
throw new UnsupportedOperationException();
}

@Override
public int size() {
return graphSize;
}

@Override
public int numLevels() throws IOException {
return numLevels;
}

@Override
public int maxConn() {
return graph.maxConn();
}

@Override
public int entryNode() {
return validNodesPerLevel[0][0];
}

@Override
public int neighborCount() {
throw new UnsupportedOperationException();
}

@Override
public NodesIterator getNodesOnLevel(int level) {
return new ArrayNodesIterator(validNodesPerLevel[level], validNodesPerLevel[level].length);
}
};
}

/** Writes neighbors with delta encoding to the vector index. */
private void writeNeighbors(
Lucene99HnswVectorsReader.OffHeapHnswGraph graph,
Contributor:
Can we delegate to an existing method (maybe with a refactor) to ensure we write in the same format? E.g. what if we switch to GroupVarInt encoding? We want to make sure this method tracks that change.

Contributor Author:
Sure.

MergeState.DocMap docMap,
int[] scratch,
int maxNode)
throws IOException {
int lastNode = -1;
int actualSize = 0;

int neighborLength = graph.neighborCount();
for (int i = 0; i < neighborLength; i++) {
int curNode = docMap.get(graph.nextNeighbor());
if (curNode == -1 || curNode == lastNode || (maxNode > 0 && curNode >= maxNode)) {
continue;
}

scratch[actualSize++] = lastNode == -1 ? curNode : curNode - lastNode;
lastNode = curNode;
}

vectorIndex.writeVInt(actualSize);
for (int i = 0; i < actualSize; i++) {
vectorIndex.writeVInt(scratch[i]);
}
}
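To make the encoding above concrete: if a node's remapped neighbors are 2, 5 and 9, the method writes the count 3 followed by 2, 3 and 4, i.e. the first ordinal as-is and then the gaps between consecutive ordinals; deleted or duplicate neighbors (and, on level 0, neighbors beyond the new graph size) are skipped before encoding.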

@Override
public void mergeOneField(FieldInfo fieldInfo, MergeState mergeState) throws IOException {
CloseableRandomVectorScorerSupplier scorerSupplier =
flatVectorWriter.mergeOneFieldToIndex(fieldInfo, mergeState);
try {
long vectorIndexOffset = vectorIndex.getFilePointer();

if (mergeState.liveDocs.length == 1
Contributor:
Have you seen IncrementalHnswGraphMerger and MergingHnswGraphBuilder? They select the biggest graph with no deletions and merge the other segments' graphs into it. Could we expose a utility method here for rewriting a graph (in memory) to drop deletions, and then use it there?

Here we are somewhat mixing the on-disk graph format with the logic of dropping deleted nodes, which I think we could abstract out into the util.hnsw realm?

Contributor Author:
Just saw that class. I think this is a good idea. Will do it in next revision.

&& (mergeState.knnVectorsReaders[0] instanceof PerFieldKnnVectorsFormat.FieldsReader)) {
PerFieldKnnVectorsFormat.FieldsReader fieldsReader =
(PerFieldKnnVectorsFormat.FieldsReader) mergeState.knnVectorsReaders[0];
HnswGraph oldGraph = fieldsReader.getGraph(fieldInfo.name);
int oldGraphSize = oldGraph.size();
int newLength = scorerSupplier.totalVectorCount();
int maxDocLength = mergeState.maxDocs[0];
if (oldGraphSize == maxDocLength) {

if (((oldGraphSize - newLength) * 100) / oldGraphSize < DELETE_THRESHOLD_PERCENT) {
int[][] offsets = new int[oldGraph.numLevels()][];
HnswGraph reconstructedGraph =
deleteNodesWriteGraph(
(Lucene99HnswVectorsReader.OffHeapHnswGraph) oldGraph,
mergeState.docMaps[0],
newLength,
offsets);

long vectorIndexLength = vectorIndex.getFilePointer() - vectorIndexOffset;
writeMeta(
fieldInfo,
vectorIndexOffset,
vectorIndexLength,
scorerSupplier.totalVectorCount(),
reconstructedGraph,
offsets);

IOUtils.close(scorerSupplier);
return;
}
}
}
// build the graph using the temporary vector data
// we use Lucene99HnswVectorsReader.DenseOffHeapVectorValues for the graph construction
// doesn't need to know docIds