pool wire write buffer #2799
Conversation
zachmu left a comment
Idea is sound, see comments
Code under review:

```go
type ByteBuffer struct {
```
This idea has merit and is surely better than allocating a buffer for each request, but the way you're managing the memory is suboptimal. It's also good to use the same backing array for multiple values in a request.
In the main use of this object in the handler, you're getting a zero-length slice (which has some larger backing array) and then calling append on it byte by byte. This will grow the backing array in some cases, but it's not being done under your deliberate control. Rather, you're then calling Double if the backing array is low on space after the appends have already happened.
Basically: in these methods you are referring to the len of the byte slice, when your concern is usually the cap. It's fine to let append happen byte by byte as long as the appends aren't doubling the backing array too often; that's the expensive bit.
I think this would probably work slightly better if you scrapped the explicit capacity management altogether and let the Go runtime manage it automatically for you. Either that, or always manage it explicitly yourself, i.e., grow before you serialize a value with all those append calls. But it's not clear to me that manual management is actually any better if you use the same strategy as the Go runtime does (double once we're full).
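For context, a minimal standalone illustration of the len vs. cap distinction being described (this is not code from the PR):

```go
package main

import "fmt"

func main() {
	// A zero-length slice over a 1 KiB backing array: len is 0,
	// but cap is what decides when append must reallocate.
	buf := make([]byte, 0, 1024)
	fmt.Println(len(buf), cap(buf)) // 0 1024

	// Appending byte by byte reuses the backing array until it is
	// full; no reallocation happens along the way.
	for i := 0; i < 1024; i++ {
		buf = append(buf, byte(i))
	}
	fmt.Println(len(buf), cap(buf)) // 1024 1024, still the same array

	// Only this append reallocates and grows the capacity; the
	// reallocation is the expensive bit the comment refers to.
	buf = append(buf, 0)
	fmt.Println(len(buf), cap(buf)) // 1025 and a larger cap
}
```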
I agree with all of this, but there are two caveats that limit our ability to let the runtime handle this for us. (1) The runtime chooses doubling based on the cap of the slice, not the full backing array, so for the current setup the doubled array is usually actually smaller than the original backing array. (2) Doubled arrays are not reference swapped; we need a handle to the new buffer to know when to grow.
I'm not aware of a way to override the default runtime growth behavior to ignore the slice cap and instead double based on the backing array's cap. So ByteBuffer still does the doubling, and exposes a Grow(n int) interface to track when this should happen. We pay for 2 mallocs on doubling, because the first one is never big enough. Not calling Grow after allocating, or growing by too little relative to the bytes actually written, will stomp previously written memory.
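To make the scheme concrete, here is a hedged sketch of how such a buffer could work; only the ByteBuffer, Get, and Grow names appear in the PR, and the field names and doubling details below are assumptions:

```go
// Sketch only: field names and the doubling details are assumed,
// not taken from the PR.
type ByteBuffer struct {
	buf []byte // current backing array (assumed non-empty at init)
	i   int    // offset of the first free byte
}

// Get returns a zero-length slice positioned at the free offset;
// the caller appends into it and then reports usage via Grow.
func (b *ByteBuffer) Get() []byte {
	return b.buf[b.i:b.i:len(b.buf)]
}

// Grow advances the free offset by the n bytes the caller wrote,
// doubling the backing array once it is exhausted. The old array
// is not copied or reference-swapped: slices already handed out
// keep it alive, so their contents are not stomped.
func (b *ByteBuffer) Grow(n int) {
	b.i += n
	if b.i >= len(b.buf) {
		b.buf = make([]byte, 2*len(b.buf))
		b.i = 0
	}
}
```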
zachmu left a comment
LGTM, although a few tests of using the Get() / Grow() sequence across some boundaries would be great.
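For instance, a boundary test along the lines being requested might look like this (written against the hypothetical sketch above, so everything beyond the Get/Grow names is an assumption):

```go
import "testing"

// Sketch of a boundary test against the hypothetical Get/Grow
// sketch above; the starting size of 8 bytes is arbitrary.
func TestGrowAtBoundary(t *testing.T) {
	b := &ByteBuffer{buf: make([]byte, 8)}

	first := append(b.Get(), 1, 2, 3, 4, 5, 6, 7, 8) // exactly fills the backing array
	b.Grow(8)                                        // the boundary case: this should double

	second := append(b.Get(), 9, 10)
	b.Grow(2)

	// The first value must survive the doubling untouched.
	if first[0] != 1 || first[7] != 8 || second[0] != 9 {
		t.Fatal("value stomped across a Grow boundary")
	}
}
```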
sql/byte_buffer.go (outdated)
```go
// are responsible for accurately reporting which bytes
// they expect to be protected.
func (b *ByteBuffer) Grow(n int) {
	if b.i+n > len(b.buf) {
```
Actually, is there an off-by-one error here? A test at the boundary would be good.
Yeah, maybe not an error, but it's definitely better to preemptively double when we know the next Grow will exceed the buffer.
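Concretely, the distinction being discussed comes down to the comparison operator; a tiny standalone demo (the values are invented for illustration):

```go
package main

import "fmt"

func main() {
	// Suppose the backing array holds 8 bytes, 4 are in use, and
	// the caller reports 4 more: the buffer is now exactly full.
	length, i, n := 8, 4, 4

	// Strict check, as in the diff fragment: an exact fill does not
	// double, so the next Get has zero capacity to append into.
	fmt.Println("strict check doubles:", i+n > length) // false

	// Preemptive check, as the reply suggests: double as soon as
	// we know the buffer will be full.
	fmt.Println("preemptive check doubles:", i+n >= length) // true
}
```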
ByteBuffer is a type that lets us avoid most allocations for spooling values to the wire. Notably, the object is responsible for doubling the backing array size when appropriate, and a Grow(n int) interface is necessary to track when this should happen. Letting the runtime do all of this would be preferable, but the runtime doubles based on slice size, and the refactors required to make that workable are more invasive. We pay for 2 mallocs on doubling, because the first one is never big enough. Not calling Grow after allocating, or growing by too little relative to the bytes actually written, will stomp previously written memory.
As long as we track bytes used with the Grow interface, this works smoothly and shaves ~30% off of table scans.
Perf here: dolthub/dolt#8693
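To illustrate the intended usage pattern, a hypothetical handler-style loop; writeRow and its signature are invented here, and ByteBuffer refers to the sketch earlier in this thread:

```go
import "strconv"

// Hypothetical usage sketch: serialize each value into the pooled
// array via Get, then report the bytes consumed via Grow so every
// value in the row shares one backing array.
func writeRow(b *ByteBuffer, row []int64) [][]byte {
	fields := make([][]byte, 0, len(row))
	for _, v := range row {
		dst := b.Get()                      // zero-length slice into the pool
		dst = strconv.AppendInt(dst, v, 10) // append-style serialization, no fresh malloc
		b.Grow(len(dst))                    // report exactly how many bytes were used
		fields = append(fields, dst)
	}
	return fields
}
```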