-
Notifications
You must be signed in to change notification settings - Fork 1.2k
A specialized Trie for Block Tree Index #14333
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
mikemccand
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! Thank you for persisting @gf2121!
|
|
||
| @SuppressWarnings("NonFinalStaticField") | ||
| static Codec defaultCodec = LOADER.lookup("Lucene101"); | ||
| static Codec defaultCodec = LOADER.lookup("Lucene103"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay! This means default Codec is now using trie!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is causing the OOMs. See #14487
|
Thank you @mikemccand and @jpountz for the patient review and all these great suggestions! I raised mikemccand/luceneutil#369 to switch codec for luceneutil. |
|
Nightly confirmed the speedup. https://benchmarks.mikemccandless.com/2025.04.14.18.05.20.html Annotation PR: mikemccand/luceneutil#370 |
That's ~1.2 M primary key lookups per second, yay! PK lookups are used all the time, not just the "key/value store" use case... e.g. updating or deleting a document by its id. |
|
I saw a message raising concern about some OOM test failures possibly related to this? Am I confused? Is excessive memory usage something we should be concerned about? Hmm maybe it was a test issue only #14487 |
|
If we think the issue last week was with the tests, should we go ahead and back-port this change? |
|
We plan to allow CI chew on this change for a couple of weeks and backport it if everything goes well. See #14333 (comment). When backporting, i'll try to catch all changes made to |
|
You anticipated my concern with #14447 😄 |
* A specialized Trie for Block Tree Index (#14333) * Fix OOM of TestTrie (#14488) * Compute the doc range more efficiently when flushing doc block (#14447) * Fix TestForTooMuchCloning (#14547) * Fix tests: too many calls to IndexInput.clone during merging (#14595) * Simplify rootCodeFp (#14596) --------- Co-authored-by: panguixin <[email protected]>
|
Thanks for the fantastic change! Want to share that we adopted and backported this codec to Lucene 912, and ran it against an Amazon Search internal benchmark, from which we observed an increase of 2.7% in Searcher throughput : D Looking forward to seeing the actual improvements this brings in production! |
|
Thank you @Coqueue.
Small correction: Amazon Product Search internal benchmarks. Also, we did this backport just to evaluate performance impact to our service from this change since we so heavily use Lucene's terms dictionary, but for actually running this in production we will upgrade to 10.x. Hmm this change doesn't have the milestone set -- I'll set it. |
|
This is great info, thanks for sharing! |
Context
Write MSB VLong for better outputs sharing in block tree index #12631 introduced a MSBVLong format to encode the first fp of FST output. It is the first time we benefit from the output sharing in blocktree. The change reduces ~13% tip size, in turn caused a performance regression when accumulating output bytes Nightly benchmark regression for term dict queries #12659. Then Disable suffix sharing for block tree index #12722 introduce a complex and tricky OutputAccumulator to get the performance back a bit, while still slower than no output prefix sharing.
Disable suffix sharing for block tree index #12722 we disabled suffix sharing as we find that very few suffix get shared in block tree.
Proposal
Before the PRs mentioned above, the fst in block tree is almost like a trie - no output prefix sharing and few suffix sharing. When output prefix get shared, we saved space, but also lost performance. IMO for
tip, performance is more important than storage size, which is usually a very small part of the whole index, and loaded off-heap.This PR tries to introduce a simple trie that specialized for blocktree index:
IndexInputwithout copy....
PR is still a draft, but the number looks promising.
Storage
Search