-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Optimize exponential histogram builder for in-order construction #135836
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
...togram/src/main/java/org/elasticsearch/exponentialhistogram/ExponentialHistogramBuilder.java
Show resolved
Hide resolved
...togram/src/main/java/org/elasticsearch/exponentialhistogram/ExponentialHistogramBuilder.java
Show resolved
Hide resolved
.../src/main/java/org/elasticsearch/exponentialhistogram/FixedCapacityExponentialHistogram.java
Show resolved
Hide resolved
| * @return true, if the last bucket added successfully via {@link #tryAddBucket(long, long, boolean)} was a positive one. | ||
| */ | ||
| boolean wasLastAddedBucketPositive() { | ||
| return positiveBuckets.numBuckets > 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How so, can't we mix adding positive and negative buckets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use a single array for storage, where we have all buckets for negative values followed by the buckets for positive values. Therefore no, you can't add a negative bucket after a positive one in FixedCapacityExponentialHistogram, this invariant is already enforced by tryAddBucket.
| // result was already returned on a previous call, return a new instance | ||
| adjustResultCapacity(result.getCapacity(), true); | ||
| } | ||
| assert resultAlreadyReturned == false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
Prior to this PR, the exponential histogram builder was only intended for usage in tests.
It was very allocaty and inefficient under high load, due to internally using
TreeMaps for storing the histogram buckets prior to building.This PR makes the builder efficient by introducing a happy path: If the buckets are provided in order, we directly construct the result
FixedCapacityExponentialHistogram, avoiding unnecessary copies, allocations and sorting.In addition, it is possible to provide the bucket count in advance if available, to avoid resizing.
It is still possible to provide the buckets out of order. We detect this case and then fallback to using the
TreeMaps for bucket storage.This is required, because for #135625 we'll have to introduce wire (de)serialization code for exponential histograms, which internally will use the builder and therefore needs to be efficient. In addition it looks like for ES|QL histogram copying is required, which is also made more efficient with this PR.