@JonasKunz
Contributor

Prior to this PR, the exponential histogram builder was intended only for use in tests.
It allocated heavily and was inefficient under high load, because it internally used TreeMaps to store the histogram buckets prior to building.

This PR makes the builder efficient by introducing a happy path: if the buckets are provided in order, we construct the result FixedCapacityExponentialHistogram directly, avoiding unnecessary copies, allocations, and sorting.
In addition, it is possible to provide the bucket count in advance if available, to avoid resizing.

It is still possible to provide the buckets out of order. We detect this case and then fall back to using the TreeMaps for bucket storage.
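To illustrate the idea, here is a minimal sketch of an in-order fast path with a TreeMap fallback. It is not the code from this PR: the class `InOrderBucketBuilder` and all of its members are hypothetical, buckets are reduced to plain (index, count) pairs, and the choice to let a later bucket overwrite an earlier one with the same index in the fallback path is an assumption made only for this example.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not the Elasticsearch builder.
final class InOrderBucketBuilder {
    private long[] indices;
    private long[] counts;
    private int size = 0;
    // Allocated only once an out-of-order bucket is seen.
    private TreeMap<Long, Long> fallback = null;

    // Passing the expected bucket count up front avoids resizing the arrays.
    InOrderBucketBuilder(int estimatedBucketCount) {
        int capacity = Math.max(1, estimatedBucketCount);
        indices = new long[capacity];
        counts = new long[capacity];
    }

    InOrderBucketBuilder add(long index, long count) {
        if (fallback != null) {
            // Slow path already active: the TreeMap keeps buckets sorted.
            fallback.put(index, count);
        } else if (size == 0 || index > indices[size - 1]) {
            // Happy path: strictly increasing index, append to the arrays.
            if (size == indices.length) {
                indices = Arrays.copyOf(indices, size * 2);
                counts = Arrays.copyOf(counts, size * 2);
            }
            indices[size] = index;
            counts[size] = count;
            size++;
        } else {
            // Out-of-order bucket detected: migrate everything to the TreeMap.
            fallback = new TreeMap<>();
            for (int i = 0; i < size; i++) {
                fallback.put(indices[i], counts[i]);
            }
            fallback.put(index, count);
        }
        return this;
    }

    boolean usedFastPath() {
        return fallback == null;
    }

    // Returns { sortedIndices, matchingCounts }.
    long[][] build() {
        if (fallback == null) {
            return new long[][] { Arrays.copyOf(indices, size), Arrays.copyOf(counts, size) };
        }
        long[] idx = new long[fallback.size()];
        long[] cnt = new long[fallback.size()];
        int i = 0;
        for (Map.Entry<Long, Long> e : fallback.entrySet()) {
            idx[i] = e.getKey();
            cnt[i] = e.getValue();
            i++;
        }
        return new long[][] { idx, cnt };
    }
}
```

Unlike the builder described above, this sketch still copies the arrays in `build()` to trim them to size; it only demonstrates how the out-of-order case can be detected and handled without penalizing in-order callers.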

This is required because for #135625 we'll have to introduce wire (de)serialization code for exponential histograms, which will use the builder internally and therefore needs it to be efficient. In addition, it looks like ES|QL requires histogram copying, which this PR also makes more efficient.

@elasticsearchmachine elasticsearchmachine added external-contributor Pull request authored by a developer outside the Elasticsearch team v9.3.0 needs:triage Requires assignment of a team area label labels Oct 2, 2025
@JonasKunz JonasKunz changed the title Optimize exponential histogram builder for construction in order Optimize exponential histogram builder for in-order construction Oct 2, 2025
@JonasKunz JonasKunz added :StorageEngine/Mapping The storage related side of mappings >non-issue labels Oct 2, 2025
@elasticsearchmachine elasticsearchmachine added Team:StorageEngine and removed needs:triage Requires assignment of a team area label labels Oct 2, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

* @return true, if the last bucket added successfully via {@link #tryAddBucket(long, long, boolean)} was a positive one.
*/
boolean wasLastAddedBucketPositive() {
return positiveBuckets.numBuckets > 0;
Contributor

How so, can't we mix adding positive and negative buckets?

Contributor Author

We use a single array for storage, in which all buckets for negative values are followed by the buckets for positive values. So no, you can't add a negative bucket after a positive one in FixedCapacityExponentialHistogram; this invariant is already enforced by tryAddBucket.
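A minimal sketch of that layout, under stated assumptions: the class `SignedBucketStore` and its fields are hypothetical, only the method names `tryAddBucket` and `wasLastAddedBucketPositive` are borrowed from the snippet under review, and checks such as index ordering within each sign are left out.

```java
// Hypothetical illustration only; not FixedCapacityExponentialHistogram itself.
final class SignedBucketStore {
    // One backing array pair: all negative buckets first, then all positive ones.
    private final long[] indices;
    private final long[] counts;
    private int negativeCount = 0;
    private int positiveCount = 0;

    SignedBucketStore(int capacity) {
        indices = new long[capacity];
        counts = new long[capacity];
    }

    // Returns false (and stores nothing) if the store is full or if a negative
    // bucket is offered after a positive one has already been added.
    boolean tryAddBucket(long index, long count, boolean isPositive) {
        int used = negativeCount + positiveCount;
        if (used == indices.length) {
            return false;
        }
        if (isPositive == false && positiveCount > 0) {
            return false;
        }
        indices[used] = index;
        counts[used] = count;
        if (isPositive) {
            positiveCount++;
        } else {
            negativeCount++;
        }
        return true;
    }

    // Because negatives can never follow positives, a non-empty positive range
    // implies that the most recently accepted bucket was positive.
    boolean wasLastAddedBucketPositive() {
        return positiveCount > 0;
    }
}
```

With this invariant in place, checking whether any positive bucket exists is enough to tell the sign of the last successfully added bucket, which is the reasoning behind the one-line check quoted above.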

// result was already returned on a previous call, return a new instance
adjustResultCapacity(result.getCapacity(), true);
}
assert resultAlreadyReturned == false;
Contributor

Ditto.

@JonasKunz JonasKunz merged commit 4b9175e into elastic:main Oct 7, 2025
34 checks passed
@JonasKunz JonasKunz deleted the exp-histo-optimize-builder branch October 7, 2025 09:00

4 participants