- The `GraphIndexBuilder` `M` parameter now represents the maximum degree of the graph, instead of half the maximum degree. (The former behavior was motivated by making it easy to make apples-to-apples comparisons with Lucene HNSW graphs.) So, if you were building a graph of M=16 with JVector2, you should build it with M=32 with JVector3.
- `NodeSimilarity.ReRanker` has been renamed to `Reranker`.
- The `NodeSimilarity.Reranker` API has changed. The interface is no longer parameterized, and the `similarityTo` method no longer takes a Map parameter (provided by `search` with the full vectors associated with the nodes returned). This is because we discovered that (in contrast with the original DiskANN design) it is more performant to read vectors lazily from disk at reranking time, since this only has to fetch vectors for the topK nodes instead of all nodes visited.
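The reasoning behind the lazy-read change can be illustrated with a small, self-contained sketch. Nothing here is JVector API: `LazyRerankSketch`, `rerank`, and the `loadVector` callback are hypothetical names, and dot product stands in for whatever similarity function the index actually uses. The point is only that the loader is invoked once per candidate being reranked, not once per node visited during the search.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical illustration only; these are not JVector classes.
final class LazyRerankSketch {
    // Dot product as a stand-in similarity function (an assumption).
    static float dot(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /**
     * Reorders the candidate ordinals by exact similarity to the query,
     * highest first. Full vectors are fetched on demand through loadVector,
     * so only the candidates passed in are ever read.
     * Candidate ordinals are assumed to be distinct.
     */
    static List<Integer> rerank(float[] query,
                                List<Integer> candidates,
                                IntFunction<float[]> loadVector) {
        Map<Integer, Float> exactScores = new HashMap<>();
        for (int ordinal : candidates) {
            exactScores.put(ordinal, dot(query, loadVector.apply(ordinal)));
        }
        List<Integer> ordered = new ArrayList<>(candidates);
        ordered.sort(Comparator.comparing((Integer o) -> exactScores.get(o)).reversed());
        return ordered;
    }
}
```

If a search visits hundreds of nodes but only the topK candidates reach `rerank`, the loader (standing in for a disk read) runs topK times.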
- `OnHeapGraphIndex::ramBytesUsedOneNode` no longer takes an `int nodeLevel` parameter.
- In-graph deletes are supported through `GraphIndexBuilder.markNodeDeleted`. Deleted nodes are removed when `GraphIndexBuilder.cleanup` is called (which is not threadsafe with respect to other concurrent changes). To write a graph with deleted nodes to disk, a `Map` must be supplied indicating what ordinals to change the remaining node ids to, since on-disk graphs may not contain "holes" in the ordinal sequence.
- `GraphSearcher.search` now has an experimental overload that takes a `float threshold` parameter that may be used instead of topK; (approximately) all the nodes with similarities greater than the given threshold will be returned.
- Binary Quantization is available as an alternative to Product Quantization. Our tests show that it's primarily suitable for ada002 embedding vectors and loses too much accuracy with smaller embeddings.
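As a rough sketch of the kind of ordinal map described for writing a graph with deletions, the helper below assigns consecutive new ordinals to the surviving nodes so the result has no holes. `OrdinalRemapSketch` and `compactOrdinals` are hypothetical names, and tracking deletions in a `BitSet` is an assumption made for the example; how the resulting map is handed to the on-disk writer is not shown here.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not part of JVector.
final class OrdinalRemapSketch {
    /**
     * Builds an old-ordinal -> new-ordinal map covering every node in
     * [0, size) that is not marked deleted. New ordinals are assigned
     * in increasing order starting at 0, so the sequence is contiguous.
     */
    static Map<Integer, Integer> compactOrdinals(int size, BitSet deleted) {
        Map<Integer, Integer> oldToNew = new LinkedHashMap<>();
        int next = 0;
        for (int old = 0; old < size; old++) {
            if (!deleted.get(old)) {
                oldToNew.put(old, next++);
            }
        }
        return oldToNew;
    }
}
```

For example, with five nodes and ordinals 1 and 3 deleted, the surviving ordinals 0, 2, and 4 map to 0, 1, and 2.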
- `GraphIndexBuilder.complete` is now `cleanup`.
- The `Bits` parameter to `GraphSearcher.search` is no longer nullable; pass `Bits.ALL` instead of `null` to indicate that all ordinals are acceptable.
- `NeighborQueue`, `NeighborArray`, and `NeighborSimilarity` have been renamed to `NodeQueue`, `NodeArray`, and `NodeSimilarity`, respectively.