Optimize FST on-heap BytesReader #12879
Actually 300 -> 245 seconds is quite a performance gain. It seems FST's own store is quite a bit faster than the
    return new ReverseBytesReader(byteBuffers.get(0).array());
  }
  return new FST.BytesReader() {
    private byte[] current = byteBuffers.get(0).array();
array() returns the internal array and there's no byte-copy here
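To illustrate the point, here is a minimal sketch (the class and method names are hypothetical, not part of the PR) showing that `ByteBuffer.array()` on a heap buffer hands back the buffer's own backing array, so a write through that array is visible through the buffer:

```java
import java.nio.ByteBuffer;

class ArrayViewDemo {
  /** Returns true if a write through array() is visible through the buffer itself. */
  static boolean sharesBackingArray() {
    ByteBuffer buffer = ByteBuffer.allocate(4); // heap buffer backed by a byte[]
    byte[] backing = buffer.array();            // same array the buffer uses, no copy
    backing[2] = 42;                            // mutate through the array
    return buffer.get(2) == 42;                 // absolute get observes the change
  }

  public static void main(String[] args) {
    System.out.println(sharesBackingArray());
  }
}
```

Because the array is shared rather than copied, handing it to a reader costs nothing extra, but it also means the reader sees any later mutation of the buffer.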
    // use a faster implementation for single-block case
    return new ReverseBytesReader(byteBuffers.get(0).array());
  }
  return new FST.BytesReader() {
Do tests exercise this path sometimes? We would need tests to either make large FSTs, or, randomize the block size so sometimes it's tiny?
I believe testRealFST did this sometimes
Hmm I'm still worried about test coverage of this. Conditionals like this are dangerous for Lucene: since our tests only test tiny FSTs, this code path would be rarely/never executed, likely even by our nightly benchmarks. Ideally we would find a way to randomize the page size in many tests, or at least in one test?
Sorry I meant TestFSTs.testRealTerms. The default block size is 32KB and the FST is 55-80KB, thus 2-3 blocks.
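As a quick check of the block arithmetic, assuming a 32KB block corresponds to 15 block bits (the helper below is hypothetical and only does the ceiling division):

```java
class BlockCountDemo {
  /** Number of fixed-size blocks needed to hold numBytes, with blockSize = 2^blockBits. */
  static int blocksNeeded(long numBytes, int blockBits) {
    long blockSize = 1L << blockBits;
    // ceiling division: round up so a partially filled last block still counts
    return (int) ((numBytes + blockSize - 1) >>> blockBits);
  }

  public static void main(String[] args) {
    System.out.println(blocksNeeded(55 * 1024, 15)); // 55KB FST
    System.out.println(blocksNeeded(80 * 1024, 15)); // 80KB FST
  }
}
```

With these inputs the helper gives 2 and 3 blocks, so an FST in that size range would go through the multi-block reader rather than the single-block shortcut.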
  return new FST.BytesReader() {
    private byte[] current = byteBuffers.get(0).array();
    private int nextBuffer = -1;
    private int nextRead = 0;
You don't need the = 0 -- it's Java's default for int fields already.
Good catch! I've removed it
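For reference, a tiny sketch of the rule being applied here (the class is hypothetical; the field names only mirror the snippet under review): an `int` field with no initializer starts at 0, while a non-default value such as -1 still needs its initializer.

```java
class DefaultInitDemo {
  private int nextRead;        // no initializer: int fields default to 0
  private int nextBuffer = -1; // non-default start value must stay explicit

  /** Returns {nextRead, nextBuffer} as seen on a freshly constructed instance. */
  static int[] freshValues() {
    DefaultInitDemo d = new DefaultInitDemo();
    return new int[] {d.nextRead, d.nextBuffer};
  }

  public static void main(String[] args) {
    int[] v = freshValues();
    System.out.println(v[0] + " " + v[1]);
  }
}
```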
mikemccand left a comment
I left some minor comments -- I think this is nearly ready? (though it is still marked as draft?) -- given the gains of this optimization I think it makes sense to specialize the single buffer vs multiple buffer cases.
  assert byteBuffers != null; // freeze() must be called first
  if (byteBuffers.size() == 1) {
    // use a faster implementation for single-block case
    return new ReverseBytesReader(byteBuffers.get(0).array());
Can we assert hasArray() before this? .array() is optional (only valid for heap-backed buffers, not DirectByteBuffer, I think).
Since we call toWriteableBufferList the array should be accessible. I think if the array is somehow not accessible, HeapByteBuffer (the one being used) would throw an exception, which would then be caught by tests.
Yeah I realize the code is safe but it takes some thinking to confirm it -- maybe add the assert with your awesome explanation above? So future code readers know why this ByteBuffer is for sure backed by an array.
That's a good idea. I've added the assertion. I decided to put it right after the ByteBuffer list is created (in freeze()) and to assert it for every ByteBuffer in the list.
Awesome, thanks, looks great @dungba88!
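A rough sketch of what such a check can look like, assuming a plain `List<ByteBuffer>`; the helper name is hypothetical and this is not the code that was merged. It relies on `ByteBuffer.hasArray()`, which returns false when the buffer has no accessible backing array, for example a read-only view of a heap buffer:

```java
import java.nio.ByteBuffer;
import java.util.List;

class HasArrayDemo {
  /** Hypothetical helper: true only if every buffer exposes an accessible backing array. */
  static boolean allArrayBacked(List<ByteBuffer> buffers) {
    for (ByteBuffer b : buffers) {
      if (!b.hasArray()) {
        return false; // array() would throw for this buffer
      }
    }
    return true;
  }

  public static void main(String[] args) {
    ByteBuffer heap = ByteBuffer.allocate(8);      // writable heap buffer
    ByteBuffer readOnly = heap.asReadOnlyBuffer(); // same storage, array not accessible
    System.out.println(allArrayBacked(List.of(heap)));
    System.out.println(allArrayBacked(List.of(heap, readOnly)));
  }
}
```

Checking once when the list is created, rather than at each `array()` call site, keeps the reader code free of repeated checks while still documenting the invariant.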
mikemccand left a comment
Thanks @dungba88 -- I plan to merge some time today.
* Move size() to FSTStore
* Remove size() completely
* Allow FST builder to use different DataOutput
* access BytesStore byte[] directly for copying
* Rename BytesStore
* Change class to final
* Reorder methods
* Remove unused methods
* Rename truncate to setPosition() and remove skipBytes()
* Simplify the writing operations
* Update comment
* remove unused parameter
* Simplify BytesStore operation
* tidy code
* Rename copyBytes to writeTo
* Simplify BytesStore operations
* Embed writeBytes() to FSTCompiler
* Fix the write bytes method
* Remove the default block bits constant
* add assertion
* Rename method parameter names
* Move reverse to FSTCompiler
* Revert setPosition call
* Address comments
* Return immediately when writing 0 bytes
* Add comment &
* Rename variables
* Fix the compile error
* Remove isReadable()
* Remove isReadable()
* Optimize ReadWriteDataOutput
* tidy code
* Freeze the DataOutput once finished()
* Refactor
* freeze the DataOutput before use
* Improvement of ReadWriteDataOutput
* tidy code
* Address comments and add off-heap FST tests
* Remove the hardcoded random
* Ignore the Test2BFSTOffHeap test
* Simplify ReadWriteDataOutput
* Do not expose blockBits
* tidy code
* Remove 0 initialization
* Add assertion and comment
Thanks @mikemccand for merging
Description
This is the same as #12624, except for a slight change in the implementation of ReadWriteDataOutput. This PR keeps the original implementation from BytesStore (just the getReverseBytesReader() part), to make sure there is no regression. My theory is that ReverseRandomAccessReader needs to seek to the correct buffer on every readByte, and reading from the ByteBuffer has some (small) overhead compared to accessing the byte array directly. But these should have an insignificant impact on performance. More extensive benchmarking is needed.
See the diff between the 2 approaches at dungba88#13
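To make the two reading strategies concrete, here is a simplified sketch of a reverse reader over a single array and one over several fixed-size blocks. All class names are hypothetical and this is not the Lucene code; it assumes every block except possibly the last is full, and it leaves out everything an FST.BytesReader needs beyond setPosition and readByte.

```java
class ReverseReaderSketch {
  /** Single-block case: one array, one index decrement per byte. */
  static final class SingleArrayReverseReader {
    private final byte[] bytes;
    private int pos;

    SingleArrayReverseReader(byte[] bytes) {
      this.bytes = bytes;
    }

    void setPosition(int pos) {
      this.pos = pos;
    }

    byte readByte() {
      return bytes[pos--]; // read, then move towards the start
    }
  }

  /** Multi-block case: caches the current block, switches only at block boundaries. */
  static final class MultiBlockReverseReader {
    private final byte[][] blocks;
    private final int blockBits;
    private final int blockMask;
    private byte[] current;
    private int nextBuffer;
    private int nextRead;

    MultiBlockReverseReader(byte[][] blocks, int blockBits) {
      this.blocks = blocks;
      this.blockBits = blockBits;
      this.blockMask = (1 << blockBits) - 1;
      setPosition(0);
    }

    void setPosition(long pos) {
      int bufferIndex = (int) (pos >> blockBits); // which block holds pos
      nextBuffer = bufferIndex - 1;               // block to switch to next
      current = blocks[bufferIndex];
      nextRead = (int) (pos & blockMask);         // offset inside that block
    }

    byte readByte() {
      if (nextRead == -1) {                       // ran off the start of this block
        current = blocks[nextBuffer--];
        nextRead = blockMask;                     // last index of a full block
      }
      return current[nextRead--];
    }
  }
}
```

The per-byte cost in the multi-block reader is one comparison plus an array access; the block lookup happens only in setPosition and at block boundaries, which is the property the description contrasts with seeking the buffer on every readByte.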
Using Test2BFST, there is some improvement over the ByteBuffersDataInput:
ReverseRandomAccessReader with ByteBuffersDataInput
This PR
I ran the test several times and the results are consistent.