Optimize FST on-heap BytesReader #12879

dungba88 · 2023-12-06T01:52:38Z

Description

This is the same as #12624, except for a slight change in the implementation of ReadWriteDataOutput. This PR keeps the original implementation that BytesStore had (just the getReverseBytesReader() part), to make sure there is no regression. My theory is that ReverseRandomAccessReader needs to seek to the correct buffer on every readByte, and reading from the ByteBuffer has some (small) overhead compared to accessing the byte array directly. These should have an insignificant impact on performance, but more extensive benchmarking is needed.
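To make the theory above concrete, here is a minimal, self-contained sketch. It is not the Lucene code: `ReverseReader`, `SingleArrayReverseReader`, and `MultiBlockReverseReader` are hypothetical names, and the block layout (fixed power-of-two blocks addressed by `blockBits`) is an assumption made for illustration. It only shows why a reader that must locate the right block on every `readByte` does more work per byte than one that indexes a single `byte[]`.

```java
import java.util.List;

/** Illustrative sketch only; these types are hypothetical and are not Lucene's classes. */
final class ReverseReaderSketch {

  /** Hypothetical minimal reader: readByte() returns the current byte, then steps backward. */
  interface ReverseReader {
    void setPosition(long pos);

    byte readByte();
  }

  /** Single-block case: one array access and one decrement per byte. */
  static final class SingleArrayReverseReader implements ReverseReader {
    private final byte[] bytes;
    private int pos;

    SingleArrayReverseReader(byte[] bytes) {
      this.bytes = bytes;
    }

    @Override
    public void setPosition(long pos) {
      this.pos = (int) pos;
    }

    @Override
    public byte readByte() {
      return bytes[pos--];
    }
  }

  /** Multi-block case: every read first locates the block, then the offset inside it. */
  static final class MultiBlockReverseReader implements ReverseReader {
    private final List<byte[]> blocks;
    private final int blockBits;
    private final int blockMask;
    private long pos;

    MultiBlockReverseReader(List<byte[]> blocks, int blockBits) {
      this.blocks = blocks;
      this.blockBits = blockBits;
      this.blockMask = (1 << blockBits) - 1;
    }

    @Override
    public void setPosition(long pos) {
      this.pos = pos;
    }

    @Override
    public byte readByte() {
      byte[] block = blocks.get((int) (pos >>> blockBits)); // extra lookup on every byte
      byte b = block[(int) (pos & blockMask)];
      pos--;
      return b;
    }
  }

  public static void main(String[] args) {
    byte[] flat = {1, 2, 3, 4, 5, 6};
    // Same logical bytes split into blocks of 4 (blockBits = 2); the tail is padding.
    List<byte[]> blocks = List.of(new byte[] {1, 2, 3, 4}, new byte[] {5, 6, 0, 0});
    ReverseReader single = new SingleArrayReverseReader(flat);
    ReverseReader multi = new MultiBlockReverseReader(blocks, 2);
    single.setPosition(5);
    multi.setPosition(5);
    for (int i = 0; i < flat.length; i++) {
      System.out.println(single.readByte() + " " + multi.readByte());
    }
  }
}
```

Both readers return the same bytes in the same order; the difference is only in the per-read bookkeeping, which is the cost the description speculates about.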

See the diff between the two approaches at dungba88#13

Using Test2BFST, there is some improvement over the ByteBuffersDataInput:

ReverseRandomAccessReader with ByteBuffersDataInput

  1> TEST: now verify [fst size=4621076364; nodeCount=2252341486; arcCount=2264078585]
  1> 0...: took 0 seconds
  1> 1000000...: took 27 seconds
  1> 2000000...: took 54 seconds
  1> 3000000...: took 82 seconds
  1> 4000000...: took 109 seconds
  1> 5000000...: took 137 seconds
  1> 6000000...: took 165 seconds
  1> 7000000...: took 192 seconds
  1> 8000000...: took 219 seconds
  1> 9000000...: took 247 seconds
  1> 10000000...: took 275 seconds
  1> 11000000...: took 300 seconds

This PR

  1> TEST: now verify [fst size=4621076364; nodeCount=2252341486; arcCount=2264078585]
  1> 0...: took 0 seconds
  1> 1000000...: took 22 seconds
  1> 2000000...: took 44 seconds
  1> 3000000...: took 66 seconds
  1> 4000000...: took 89 seconds
  1> 5000000...: took 111 seconds
  1> 6000000...: took 133 seconds
  1> 7000000...: took 155 seconds
  1> 8000000...: took 178 seconds
  1> 9000000...: took 200 seconds
  1> 10000000...: took 222 seconds
  1> 11000000...: took 245 seconds

I ran the test several times and the results are consistent.
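As a quick sanity check on the two logs above, the sketch below derives the relative time reduction and the average seconds per million from the final progress lines (300 s vs. 245 s at the 11,000,000 mark). The class and method names are hypothetical, and treating the progress counter as "11 million units of work" is an assumption about what the test prints.

```java
/** Hypothetical helper for summarizing the two Test2BFST verify logs. */
final class VerifyTimingSummary {

  /** Fraction of the baseline time that the candidate saves. */
  static double reduction(double baselineSeconds, double candidateSeconds) {
    return (baselineSeconds - candidateSeconds) / baselineSeconds;
  }

  /** Average seconds spent per million units of progress. */
  static double secondsPerMillion(double totalSeconds, int millions) {
    return totalSeconds / millions;
  }

  public static void main(String[] args) {
    double baseline = 300; // ReverseRandomAccessReader with ByteBuffersDataInput
    double candidate = 245; // this PR
    int millions = 11; // last progress line in both logs

    System.out.printf("time reduction: %.1f%%%n", 100 * reduction(baseline, candidate));
    System.out.printf("baseline:  %.1f s per million%n", secondsPerMillion(baseline, millions));
    System.out.printf("candidate: %.1f s per million%n", secondsPerMillion(candidate, millions));
  }
}
```

The reduction works out to 55/300, a little over 18% of the baseline verify time.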

dungba88 · 2023-12-06T02:07:41Z

I think we'll likely go with the approach in #12624 (simpler, despite some potential regression). This PR exists for comparison, and in case someone has a strong argument for favoring latency.

To reviewer: Discussion outside of ReadWriteDataOutput should also go to #12624 instead.

dungba88 · 2023-12-10T14:57:58Z

Note: We already merged #12624, but this PR can be used as a benchmark candidate for #12884.

mikemccand · 2023-12-11T19:41:03Z

Actually 300 -> 245 seconds is quite a performance gain. It seems FST's own store is quite a bit faster than the ByteBuffer backed store. I think we should make this change? I'll try to review soon.

dungba88 · 2023-12-12T04:57:44Z

lucene/core/src/java/org/apache/lucene/util/fst/ReadWriteDataOutput.java

+      return new ReverseBytesReader(byteBuffers.get(0).array());
+    }
+    return new FST.BytesReader() {
+      private byte[] current = byteBuffers.get(0).array();


array() returns the internal array and there's no byte-copy here
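The point about `array()` can be checked with plain `java.nio`. This is a small standalone demo, not code from the PR: the class and method names are made up, and it assumes a writable heap buffer such as one from `ByteBuffer.allocate` or `ByteBuffer.wrap`.

```java
import java.nio.ByteBuffer;

/** Standalone demo (hypothetical names): array() exposes the backing array without copying. */
final class ArrayNoCopyDemo {

  /** Returns true if repeated array() calls return the same object and see buffer writes. */
  static boolean sharesStorage(ByteBuffer buffer) {
    byte[] first = buffer.array();
    byte[] second = buffer.array();
    buffer.put(0, (byte) 42); // absolute put at buffer index 0
    // Buffer index 0 maps to array index arrayOffset() in the backing array.
    return first == second && first[buffer.arrayOffset()] == 42;
  }

  public static void main(String[] args) {
    System.out.println(sharesStorage(ByteBuffer.allocate(4)));
    System.out.println(sharesStorage(ByteBuffer.wrap(new byte[8])));
  }
}
```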

mikemccand · 2023-12-21T12:10:56Z

lucene/core/src/java/org/apache/lucene/util/fst/ReadWriteDataOutput.java

+      // use a faster implementation for single-block case
+      return new ReverseBytesReader(byteBuffers.get(0).array());
+    }
+    return new FST.BytesReader() {


Do tests exercise this path sometimes? We would need tests to either make large FSTs, or, randomize the block size so sometimes it's tiny?

I believe testRealFST did this sometimes

Hmm I'm still worried about test coverage of this. Conditionals like this are dangerous for Lucene, since our tests only test tiny FSTs, this code path would be rarely/never executed, likely even by our nightly benchmarks. Ideally we would find a way to randomize the page size in many tests, or at least, one test?

Sorry I meant TestFSTs.testRealTerms. The default block size is 32KB and the FST is 55-80KB, thus 2-3 blocks.
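The "2-3 blocks" figure follows from a ceiling division of the FST size by the block size. The sketch below reproduces that arithmetic; `blocksNeeded` is a hypothetical helper, and `blockBits = 15` is an assumption derived from the 32KB default mentioned above (2^15 bytes).

```java
/** Hypothetical helper: how many fixed-size blocks a given number of bytes occupies. */
final class BlockCountDemo {

  /** Ceiling of sizeInBytes / 2^blockBits, computed without floating point. */
  static long blocksNeeded(long sizeInBytes, int blockBits) {
    long blockSize = 1L << blockBits;
    return (sizeInBytes + blockSize - 1) >>> blockBits;
  }

  public static void main(String[] args) {
    int blockBits = 15; // assumed: 32KB blocks
    System.out.println(blocksNeeded(55L * 1024, blockBits)); // lower end of the quoted FST size
    System.out.println(blocksNeeded(80L * 1024, blockBits)); // upper end of the quoted FST size
  }
}
```

Any FST larger than one block exercises the multi-block reader, which is why a 55-80KB FST is enough to cover that path with the default block size.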

mikemccand · 2023-12-21T12:13:18Z

lucene/core/src/java/org/apache/lucene/util/fst/ReadWriteDataOutput.java

+    return new FST.BytesReader() {
+      private byte[] current = byteBuffers.get(0).array();
+      private int nextBuffer = -1;
+      private int nextRead = 0;


You don't need the = 0 -- it's javac's default already.

Good catch! I've removed it
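For readers unfamiliar with the rule behind this comment: Java gives instance fields a default value (0 for `int`) when no initializer is present, so `= 0` is redundant while `= -1` is not. A tiny demo with made-up names:

```java
/** Demo of default field values; the class and field names are hypothetical. */
final class DefaultFieldValueDemo {
  private int implicitZero; // no initializer: starts at 0
  private int explicitZero = 0; // redundant initializer: same starting value
  private int minusOne = -1; // a non-default starting value needs an initializer

  /** Returns the starting values of the three fields of a fresh instance. */
  static int[] initialValues() {
    DefaultFieldValueDemo d = new DefaultFieldValueDemo();
    return new int[] {d.implicitZero, d.explicitZero, d.minusOne};
  }

  public static void main(String[] args) {
    System.out.println(java.util.Arrays.toString(initialValues()));
  }
}
```

Note that this applies to fields only; local variables have no default and must be assigned before use.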

mikemccand

I left some minor comments -- I think this is nearly ready? (though it is still marked as draft?) -- given the gains of this optimization I think it makes sense to specialize the single buffer vs multiple buffer cases.

mikemccand · 2024-01-04T12:51:09Z

lucene/core/src/java/org/apache/lucene/util/fst/ReadWriteDataOutput.java

+    assert byteBuffers != null; // freeze() must be called first
+    if (byteBuffers.size() == 1) {
+      // use a faster implementation for single-block case
+      return new ReverseBytesReader(byteBuffers.get(0).array());


Can we assert hasArray() before this? .array() is optional (only valid for heap-backed buffers, not DirectByteBuffer, I think).

Since we call toWriteableBufferList, the array should be accessible. I think if the array were somehow not accessible, HeapByteBuffer (the one being used) would throw an exception, which would be caught by tests.

Yeah I realize the code is safe but it takes some thinking to confirm it -- maybe add the assert with your awesome above explanation? So future code readers know why this ByteBuffer is for sure backed by an array.

That's a good idea. I've added the assertion. I decided to put it right after the ByteBuffer list is created (in freeze()) and to assert on every ByteBuffer in the list.

Awesome, thanks, looks great @dungba88!
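The kind of check discussed in this thread can be sketched with plain `java.nio`. This is not the assertion that was added to `freeze()`; `allBackedByArrays` is a hypothetical helper that only illustrates when `hasArray()` holds. A writable heap buffer reports `true`, while a read-only view reports `false` even though it is heap-based; direct buffers typically report `false` as well.

```java
import java.nio.ByteBuffer;
import java.util.List;

/** Hypothetical sketch of a "every buffer is array-backed" check; not the PR's code. */
final class HasArrayCheckDemo {

  /** Returns true only if array() is safe to call on every buffer in the list. */
  static boolean allBackedByArrays(List<ByteBuffer> buffers) {
    for (ByteBuffer buffer : buffers) {
      if (!buffer.hasArray()) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    ByteBuffer heap = ByteBuffer.allocate(16);
    ByteBuffer readOnly = heap.asReadOnlyBuffer();
    ByteBuffer direct = ByteBuffer.allocateDirect(16);

    System.out.println("heap only:      " + allBackedByArrays(List.of(heap)));
    System.out.println("with read-only: " + allBackedByArrays(List.of(heap, readOnly)));
    System.out.println("direct hasArray: " + direct.hasArray());
  }
}
```

Running such a check once, where the buffer list is created, documents the invariant in one place instead of at every `array()` call site.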

mikemccand

Thanks @dungba88 -- I plan to merge some time today.

* Move size() to FSTStore
* Remove size() completely
* Allow FST builder to use different DataOutput
* access BytesStore byte[] directly for copying
* Rename BytesStore
* Change class to final
* Reorder methods
* Remove unused methods
* Rename truncate to setPosition() and remove skipBytes()
* Simplify the writing operations
* Update comment
* remove unused parameter
* Simplify BytesStore operation
* tidy code
* Rename copyBytes to writeTo
* Simplify BytesStore operations
* Embed writeBytes() to FSTCompiler
* Fix the write bytes method
* Remove the default block bits constant
* add assertion
* Rename method parameter names
* Move reverse to FSTCompiler
* Revert setPosition call
* Address comments
* Return immediately when writing 0 bytes
* Add comment &
* Rename variables
* Fix the compile error
* Remove isReadable()
* Remove isReadable()
* Optimize ReadWriteDataOutput
* tidy code
* Freeze the DataOutput once finished()
* Refactor
* freeze the DataOutput before use
* Improvement of ReadWriteDataOutput
* tidy code
* Address comments and add off-heap FST tests
* Remove the hardcoded random
* Ignore the Test2BFSTOffHeap test
* Simplify ReadWriteDataOutput
* Do not expose blockBits
* tidy code
* Remove 0 initialization
* Add assertion and comment

dungba88 · 2024-01-08T14:09:26Z

Thanks @mikemccand for merging

dungba88 added 30 commits November 14, 2023 07:55

Move size() to FSTStore

a6259fc

Remove size() completely

e0e1517

Merge branch 'apache:main' into remove-size

9b986a8

Allow FST builder to use different DataOutput

f2d8234

access BytesStore byte[] directly for copying

13c9359

Rename BytesStore

fa08d51

Change class to final

c6fb4b5

Reorder methods

f1e8f89

Remove unused methods

2f9c730

Rename truncate to setPosition() and remove skipBytes()

847828d

Simplify the writing operations

0913062

Update comment

0de0d26

remove unused parameter

ef3fdc6

Simplify BytesStore operation

f00d24f

tidy code

12963df

Rename copyBytes to writeTo

f1e81b8

Merge branch 'remove-size' into pr-12543-1

b7ceec9

Simplify BytesStore operations

c8165ad

Merge branch 'simplify-bytesstore' into pr-12543-1

d8cb927

Merge branch 'pr-12543-1' into pr-12543

096fa97

Embed writeBytes() to FSTCompiler

9a002c0

Fix the write bytes method

1c201d4

Merge branch 'apache:main' into pr-12543

cfdeeff

Merge branch 'apache:main' into pr-12543-1

1420cfc

Remove the default block bits constant

7efcde0

Merge branch 'pr-12543-1' into pr-12543

8f98e7b

add assertion

b140b91

Rename method parameter names

dbc1918

Move reverse to FSTCompiler

a5c7e14

Revert setPosition call

50de8f7

dungba88 added 2 commits December 6, 2023 01:06

Ignore the Test2BFSTOffHeap test

5552271

Simplify ReadWriteDataOutput

6cc31c9

dungba88 mentioned this pull request Dec 6, 2023

Allow FST builder to use different writer (#12543) #12624

Merged

Merge branch 'pr-12543' into pr-12543-1

ea3d588

dungba88 marked this pull request as draft December 6, 2023 09:22

dungba88 added 4 commits December 10, 2023 23:52

Merge branch 'pr-12543-1' into tmp-rwdo

e14a909

Merge pull request #21 from dungba88/tmp-rwdo

013b42b

Merge main

Merge branch 'apache:main' into pr-12543-1

c9f230a

Do not expose blockBits

34efa96

dungba88 mentioned this pull request Dec 10, 2023

Create a simple JMH benchmark to measure FST compilation / traversal times #12884

Open

tidy code

cbceb85

dungba88 changed the title ~~Allow FST builder to use different writer (alternative reverse BytesReader)~~ Optimize FST on-heap BytesReader Dec 12, 2023

dungba88 commented Dec 12, 2023

mikemccand reviewed Dec 21, 2023

dungba88 added 2 commits December 23, 2023 14:52

Merge branch 'apache:main' into pr-12543-1

ef9eae6

Remove 0 initialization

9fccfbc

dungba88 marked this pull request as ready for review December 23, 2023 06:58

mikemccand reviewed Jan 4, 2024

Add assertion and comment

3c388fe

mikemccand approved these changes Jan 5, 2024

mikemccand merged commit 4c883a4 into apache:main Jan 6, 2024

mikemccand added this to the 9.10.0 milestone Jan 6, 2024
