Avoid reconstructing HNSW graphs during segment merging. #15003
Conversation
This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.
I don't feel qualified to do the review, but I agree with the motivation. I wonder if this optimization could also be applied when there is more than one segment to merge, by first applying deletions on the biggest segment and then adding vectors from the other segments?
msokolov left a comment:
This seems like a promising direction! I left a bunch of comments. My main one is about whether we should do this on-heap to make it more flexible (e.g. so we could use it when merging multiple graphs, too).
private static final long SHALLOW_RAM_BYTES_USED =
    RamUsageEstimator.shallowSizeOfInstance(Lucene99HnswVectorsWriter.class);
static final int DELETE_THRESHOLD_PERCENT = 30;
I'm curious whether we have done any testing to motivate this choice? I guess that as the number of gaps left behind in the neighborhoods by removing the deleted nodes increases, we would expect to see a drop-off in recall, or maybe performance, but I don't have a good intuition about whether there is a knee in the curve or how strong the effect is.
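For context, a minimal sketch of how such a percentage gate could work; the constant mirrors the snippet above, but the method itself is illustrative rather than the PR's actual code:

```java
// Illustrative sketch: only take the rewrite path when the fraction of deleted
// nodes is below DELETE_THRESHOLD_PERCENT, on the assumption that large holes
// in the neighbor lists would otherwise hurt recall.
static boolean belowDeleteThreshold(int deletedNodeCount, int totalNodeCount) {
  if (totalNodeCount == 0) {
    return false;
  }
  return deletedNodeCount * 100L / totalNodeCount < DELETE_THRESHOLD_PERCENT;
}
```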
 * @throws IOException If an error occurs while writing to the vector index
 */
private HnswGraph deleteNodesWriteGraph(
    Lucene99HnswVectorsReader.OffHeapHnswGraph graph,
Could we change the signature to accept an HnswGraph?
// Count and collect valid nodes
int validNodeCount = 0;
for (int node : sortedNodes) {
  if (docMap.get(node) != -1) {
We might be able to pass in the size of the new graph? At least in the main case of merging we should know it (I think?).
| } | ||
|
|
||
| // Special case for top level with no valid nodes | ||
| if (level == numLevels - 1 && validNodeCount == 0 && level > 0) { |
If level == 0 and validNodeCount == 0, the new graph should be empty. I'm not sure how that case will get handled here?
In this case though (the top level would be empty) -- isn't it also possible that a lower level is empty?
> If level == 0 and validNodeCount == 0, the new graph should be empty. I'm not sure how that case will get handled here?

This means 100% of the nodes are deleted, right? I think we will never reach this case, because the entry condition to this function checks that deletes are less than 30%.
  validNodeCount = 1; // We'll create one connection to lower level
}

validNodesPerLevel[level] = new int[validNodeCount];
I wonder if we could avoid the up-front counting, allocate a full-sized array, and then use only the part of it that we fill up.
Sure
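A sketch of that suggestion, in the context of the method above; the `isLive` predicate is a placeholder for however liveness per graph node is determined, not an existing API:

```java
// Illustrative: skip the counting pass by over-allocating, then keep only the
// filled prefix (or trim it if a right-sized array is required downstream).
int[] valid = new int[sortedNodes.length];
int count = 0;
for (int node : sortedNodes) {
  if (isLive.test(node)) { // placeholder liveness check; see the ordinal-mapping discussion later in the thread
    valid[count++] = node;
  }
}
validNodesPerLevel[level] = ArrayUtil.copyOfSubArray(valid, 0, count);
```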
    Math.toIntExact(vectorIndex.getFilePointer() - offsetStart);
}

// Special case for empty top level
Perhaps we should special-case the first empty level we find and make that the top level, unless it is the bottom level, in which case the whole graph is empty.
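A sketch of that suggestion, reusing names from the snippets above: scan down from the top and make the highest non-empty level the new top level.

```java
// Illustrative: the highest non-empty level becomes the new top level;
// if every level is empty, the merged graph has no nodes at all.
int newTopLevel = -1;
for (int level = numLevels - 1; level >= 0; level--) {
  if (validNodesPerLevel[level].length > 0) {
    newTopLevel = level;
    break;
  }
}
int newNumLevels = newTopLevel + 1; // 0 means an empty graph
```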
/** Writes neighbors with delta encoding to the vector index. */
private void writeNeighbors(
    Lucene99HnswVectorsReader.OffHeapHnswGraph graph,
Can we delegate to an existing method (maybe with a refactor) to ensure we write in the same format? E.g. what if we switch to GroupVarInt encoding -- we want to make sure this method tracks that change.
Sure.
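For reference, a generic sketch of delta plus vInt encoding of a sorted neighbor list; the real writer should delegate to the codec's existing method so that a future format change (e.g. GroupVarInt) is picked up automatically:

```java
import java.io.IOException;
import org.apache.lucene.store.DataOutput;

// Generic sketch only -- not the Lucene99 codec's actual writer.
static void writeSortedNeighbors(DataOutput out, int[] neighbors, int size) throws IOException {
  out.writeVInt(size);
  int prev = 0;
  for (int i = 0; i < size; i++) {
    out.writeVInt(neighbors[i] - prev); // delta from the previous (smaller) neighbor ordinal
    prev = neighbors[i];
  }
}
```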
try {
  long vectorIndexOffset = vectorIndex.getFilePointer();

  if (mergeState.liveDocs.length == 1
Have you seen IncrementalHnswGraphMerger and MergingHnswGraphBuilder? They select the biggest graph with no deletions and merge the other segments' graphs into it. Could we expose a utility method here for rewriting a graph (in memory) to drop deletions, and then use it there?
Here we are somewhat mixing the on-disk graph format with the logic of dropping deleted nodes, which I think we could abstract out into the util.hnsw realm.
Just saw that class. I think this is a good idea. Will do it in the next revision.
@jpountz Yes, good idea, let me try doing that in this PR only.
Thanks @msokolov. Yes, I think that would be the best way forward for this optimization. Working on it.
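A rough, format-agnostic sketch of what such a util.hnsw helper could look like, built only on the read-side HnswGraph API. The helper name and the list-of-arrays return type are placeholders; a real version would presumably populate an OnHeapHnswGraph and handle all levels, not just level 0.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.util.hnsw.HnswGraph;
import org.apache.lucene.util.hnsw.HnswGraph.NodesIterator;

// Hypothetical helper: collect the level-0 adjacency of any HnswGraph with
// deleted ordinals dropped and the surviving neighbors renumbered.
// oldToNewOrd[oldOrd] == -1 marks a deleted node; other entries are new ordinals.
static List<int[]> liveAdjacencyAtLevelZero(HnswGraph graph, int[] oldToNewOrd) throws IOException {
  List<int[]> adjacency = new ArrayList<>();
  NodesIterator nodes = graph.getNodesOnLevel(0);
  while (nodes.hasNext()) {
    int oldOrd = nodes.nextInt();
    if (oldToNewOrd[oldOrd] == -1) {
      continue; // deleted node: skip it entirely
    }
    graph.seek(0, oldOrd);
    List<Integer> kept = new ArrayList<>();
    for (int nb = graph.nextNeighbor();
        nb != DocIdSetIterator.NO_MORE_DOCS;
        nb = graph.nextNeighbor()) {
      if (oldToNewOrd[nb] != -1) {
        kept.add(oldToNewOrd[nb]); // keep and remap live neighbors only
      }
    }
    adjacency.add(kept.stream().mapToInt(Integer::intValue).toArray());
  }
  return adjacency;
}
```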
// Process nodes at this level
for (int node : sortedNodes) {
  if (docMap.get(node) == -1) {
This is incorrect. The graph does not store docIDs; it stores ordinals, whereas docMap maps old docIDs to new docIDs.
The correct implementation is to create a map from old ordinals to new ordinals.
Will fix this in the next revision.
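A minimal sketch of that ordinal remapping; the liveness predicate is a placeholder for however deletions are detected per graph ordinal:

```java
import java.util.function.IntPredicate;

// Illustrative: map old graph ordinals to new dense ordinals, with -1 for deleted nodes.
static int[] buildOldToNewOrdMap(int graphSize, IntPredicate isLiveOrd) {
  int[] oldToNew = new int[graphSize];
  int next = 0;
  for (int oldOrd = 0; oldOrd < graphSize; oldOrd++) {
    oldToNew[oldOrd] = isLiveOrd.test(oldOrd) ? next++ : -1;
  }
  return oldToNew;
}
```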
The failing test is running fine on my macOS desktop and I have not changed anything in the related classes. Even with the same failing seed I am unable to reproduce the issue. Not sure why this test is failing in the check.
I created a pull request to my own repository, and the tests are passing there: Pulkitg64#1. This issue looks to be transient and the next commit should fix it.
Adding some KnnPerfTest results where I tried to simulate deletes while indexing docs. We are seeing consistent improvement in indexing time and indexing rate (except one weird case where we deleted 40% of docs) without impacting recall. Num docs: 1MM
I am confused! This PR suddenly got so much simpler, which is great, but I feel like it dropped a few things that seemed important. E.g. we are no longer checking the largest graph to see if its delete % is below a threshold? Also I think we are now ignoring the various edge cases around upper-level graph layers possibly becoming empty?
With maxConn = 16, I am seeing much better results. But in a weird case with 25% deletes I am seeing a regression in indexing rate. Trying maxConn = 8 in the next benchmark run.
@Pulkitg64 what exactly are you benchmarking? It seems like the latest version of this PR does nothing to actually correct the graph nodes? We should handle:
Ah, maybe I don't fully grok the current impl. It seems like it's doing the "largest graph" thing, but now it's more clever and doing the initialized-graph thing, and that is where the deletes are being removed?
Yeah, the
Yes, in the first revision I added the arbitrary percentage without doing any testing. But this time, I wanted to see the impact of the mergePolicy that kicks in when the delete % is higher than a certain threshold. I thought we may not need an explicit check on the delete % of the largest graph because the merge policy will automatically take care of this.
That's right @benwtrent, we are skipping deleted nodes from the largest graph in the
@Pulkitg64 pretty damn clever ;). I gotta think through this. Intuitively, it SHOULD work, even for singleton merges.
It's fascinating that we actually see recall improving in many cases! Intuitively, I think when we merge more segments in we have an opportunity to patch up the holes left by the deleted docs, and maybe we somehow end up doing that in an even better way the second time around? I do wonder what recall will look like for graphs with high deletion rates that are singleton-merged only? I wonder if we could test that with
Based on @msokolov's suggestion, I ran the benchmarks simulating singleton merging. For this I indexed 1M docs, force-merged the segments, then deleted documents and force-merged again. I am seeing a consistent improvement (about 50x speedup) in force-merge time after deletes, but also a degradation in recall numbers (about 10%). It's probably because of the disconnectedness issue (let me try to find the connectedness numbers for these graphs as well).
I would think so. My gut is that we don't actually go through and "fix up" anything when there is just one graph. We just pick the biggest one, and since there are no more vectors to add, we just drop connections on the ground. I would expect us to have to iterate through the graph and, for every vector that is significantly disconnected, attempt to reconnect it with NNDescent starting at its original place in the graph (initializing with its neighbors' neighbors if all its connections were removed).
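A conceptual sketch of that repair pass; both the disconnection test and the reconnection strategy are open questions in this thread, so they are passed in as placeholders rather than real Lucene APIs:

```java
import java.util.function.IntConsumer;
import java.util.function.IntPredicate;

// Conceptual sketch only. `isBadlyDisconnected` would flag nodes that lost too
// many neighbors to deletes; `reconnect` would re-run neighbor selection for a
// node, seeded from its surviving neighbors (or their neighbors if none survive).
static void repairDisconnectedNodes(
    int graphSize, IntPredicate isBadlyDisconnected, IntConsumer reconnect) {
  for (int node = 0; node < graphSize; node++) {
    if (isBadlyDisconnected.test(node)) {
      reconnect.accept(node);
    }
  }
}
```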
Thank you for all the work @Pulkitg64. OK, I think your plan from this comment (50% check) is the way to go for now. My intuition is that the relative threshold is likely better, but we would need to make the percentage lower (like 25% or 20%, as your other benchmarks show). However, the 50% shows really nice improvements already, so let's move forward with that. As for "Reconstruct completely", yes, let's focus on 40% or fewer deleted total docs? Does that align with your benchmarking results? If so, I think we have our two thresholds and we can move forward with this PR. Thank you for this practical and powerful improvement for vector indexing!
Since in my approach, even if the graph has no deletes, it will still try to reconnect nodes, which will make merging slower, I agree with you that the "relative threshold" approach is likely better. Sharing benchmark results in some time.
Keeping this threshold at 40% for now.
Taking 15% as the threshold for the relative reduction in connections needed to consider a node disconnected, as a sweet spot between recall and indexing performance, I am getting the results below:
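A minimal sketch of that 15% relative-reduction check (constant and names are illustrative):

```java
// Illustrative: a node is treated as disconnected when deletes removed at least
// 15% of the connections it originally had.
static boolean isDisconnected(int originalConnectionCount, int survivingConnectionCount) {
  if (originalConnectionCount == 0) {
    return false;
  }
  int lost = originalConnectionCount - survivingConnectionCount;
  return lost * 100 >= 15 * originalConnectionCount;
}
```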
benwtrent left a comment:
OK, reading through it all, I really like where it ended up.
My main concern is testing. Maybe HnswGraphTestCase.testRandomReadWriteAndMerge is enough? Could you add some logic there for "delete few, delete many"? Just to make sure we adequately exercise this logic.
This really is great work! Thank you for all the iteration. The numbers are great!
    OnHeapHnswGraph initializedGraph,
    BitSet initializedNodes)

/**
 * Fixes disconnected nodes at a specific level by performing graph searches from their existing
https://www.vldb.org/pvldb/vol18/p5166-upreti.pdf has an in-place delete algorithm (search for algorithm 6). It may be worth investigating for this use case as it could be less expensive than repairing disconnected nodes, and the paper indicates they've successfully used this algorithm with no observed loss in accuracy.
Worth noting that this algorithm does not guarantee the removal of all in-edges to the deleted node in a directed graph, so even after performing this kind of deletion we'd have to sweep the graph and remove any dead edges.
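A rough sketch of the core idea on a plain adjacency list, not the paper's exact Algorithm 6: when removing a node, offer its neighbors to each other as replacement edges, then sweep away any edges that still point at deleted nodes.

```java
import java.util.List;

// Rough sketch of the in-place delete idea on a plain adjacency list.
static void deleteInPlace(List<List<Integer>> adjacency, boolean[] deleted, int d) {
  deleted[d] = true;
  List<Integer> nbrs = adjacency.get(d);
  for (int a : nbrs) {
    if (deleted[a]) continue;
    for (int b : nbrs) {
      // Offer d's neighbors to each other; a real implementation would apply
      // the usual diversity / maxConn pruning instead of adding unconditionally.
      if (a != b && !deleted[b] && !adjacency.get(a).contains(b)) {
        adjacency.get(a).add(b);
      }
    }
  }
  adjacency.get(d).clear();
  // As noted above, other nodes may still hold in-edges to d in a directed
  // graph, so a final sweep must drop those dead edges.
}
```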
Thanks @mccullocht for the idea and for sharing this paper. Using the neighbors and neighbors-of-neighbors of the deleted nodes for reconnection seems like a good idea (definitely faster) and I think it is worth pursuing.
But I wonder if we should do it as part of this PR or a different one. I can work on this idea immediately as part of a different issue if that makes sense to you.
It looks like this PR has been thoroughly tested and I didn't want to block it, but trying this algorithm would be a good follow-up.
Hi @benwtrent
@Pulkitg64 nope! I am back from eating a bunch of 🦃. I will merge and backport this. Super excited for this change. It's a practical and very useful optimization for HNSW!
optimizes HNSW graph merging. Now instead of not considering a large graph to merge into, deleted nodes are removed and the related connections are repaired. Additionally, nodes might be promoted to higher levels to account for loss of connectivity at higher layers. Especially in singleton merges, this improves throughput significantly.
Note that this caused some CI failures: #15467 -- let's hold off on backporting until we resolve that.
@mikemccand I already backported; however, a fix is in flight.
Wow, fast! Thanks @benwtrent. |
Description
This is a draft PR to optimize HNSW graph merging during singleton merges. When merging a single segment with deletions, the current implementation reconstructs the entire graph with only live nodes, which is a time-consuming process. This PR avoids full graph reconstruction by dropping deleted nodes and renumbering the remaining live nodes.
TODOs:
- Add specific unit tests
- Benchmarks (luceneutil)