@JonasKunz
Contributor

There was a ToDo in the exponential histogram merging algorithm for a potential optimization for merging small histograms.

The current algorithm merges all histograms into an accumulator with the desired bucket count limit (e.g. 320, which would be the OpenTelemetry default). During each merge, the merger has to iterate through all buckets of both the accumulator and the histogram being merged in. If the latter is much smaller than the accumulator, most of that time may be wasted iterating the accumulator.

A potential optimization for this is to:

  • Buffer all incoming buckets first, until we have roughly as many buckets as the size of the accumulator
  • Merge the buffered buckets with each other when the buffer is full (this takes O(n log n), but the cost is amortized across the buffered histograms)
  • Only afterwards merge the result into the accumulator
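
The three steps above can be sketched roughly as follows. This is a minimal illustration under simplifying assumptions, not the PoC from this PR and not the Elasticsearch implementation: the class `BufferedMergeSketch` and all of its methods are hypothetical, buckets are plain (index, count) pairs at a single fixed scale, and the bucket count limit with its downscaling step is left out entirely.

```java
import java.util.Arrays;

/** Hypothetical sketch; NOT the PoC from this PR and not Elasticsearch code. */
final class BufferedMergeSketch {
    // Accumulator: bucket indices sorted ascending, with parallel counts.
    private long[] accIdx = new long[0];
    private long[] accCnt = new long[0];
    // Pending {index, count} pairs from small histograms, unsorted.
    private final long[][] buffer;
    private int buffered = 0;

    BufferedMergeSketch(int bufferCapacity) {
        if (bufferCapacity < 1) {
            throw new IllegalArgumentException("bufferCapacity must be >= 1");
        }
        buffer = new long[bufferCapacity][];
    }

    /** Step 1: buffer the buckets of one incoming histogram; flush when the buffer is full. */
    void add(long[] indices, long[] counts) {
        for (int i = 0; i < indices.length; i++) {
            if (buffered == buffer.length) {
                flush();
            }
            buffer[buffered++] = new long[] { indices[i], counts[i] };
        }
    }

    /** Steps 2 and 3: sort the buffer, then do one linear merge pass into the accumulator. */
    void flush() {
        // Step 2: O(n log n) sort of the buffered buckets by index.
        Arrays.sort(buffer, 0, buffered, (x, y) -> Long.compare(x[0], y[0]));

        // Step 3: merge two sorted sequences, summing counts of equal indices.
        long[] outIdx = new long[accIdx.length + buffered];
        long[] outCnt = new long[outIdx.length];
        int a = 0, b = 0, n = 0;
        while (a < accIdx.length || b < buffered) {
            long idx;
            long cnt;
            if (b == buffered || (a < accIdx.length && accIdx[a] <= buffer[b][0])) {
                idx = accIdx[a];
                cnt = accCnt[a];
                a++;
            } else {
                idx = buffer[b][0];
                cnt = buffer[b][1];
                b++;
            }
            if (n > 0 && outIdx[n - 1] == idx) {
                outCnt[n - 1] += cnt; // same bucket as the previous output entry
            } else {
                outIdx[n] = idx;
                outCnt[n] = cnt;
                n++;
            }
        }
        accIdx = Arrays.copyOf(outIdx, n);
        accCnt = Arrays.copyOf(outCnt, n);
        buffered = 0;
    }

    /** Number of accumulator buckets; only reflects data that has been flushed. */
    int bucketCount() {
        return accIdx.length;
    }

    /** Count stored for a bucket index in the accumulator, or 0 if absent. */
    long countAt(long index) {
        int pos = Arrays.binarySearch(accIdx, index);
        return pos >= 0 ? accCnt[pos] : 0;
    }
}
```

In this sketch the accumulator is only walked once per flush instead of once per incoming histogram, which is where the potential saving comes from. The price is the extra copy of every bucket into the buffer plus the sort.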

I did a PoC implementation for this and ran some benchmarks.

Benchmark results without optimization
Benchmark                           (bucketCount)  (histoImplementation)  (mergedHistoSizeFactor)  Mode  Cnt   Score   Error  Units
ExponentialHistogramMergeBench.add            250           array-backed                     0.01  avgt    5   1,938 ± 0,031  us/op
ExponentialHistogramMergeBench.add            250           array-backed                      0.1  avgt    5   2,106 ± 0,013  us/op
ExponentialHistogramMergeBench.add            250           array-backed                     0.25  avgt    5   1,867 ± 0,013  us/op
ExponentialHistogramMergeBench.add            250           array-backed                      0.5  avgt    5   2,136 ± 0,009  us/op
ExponentialHistogramMergeBench.add            250           array-backed                      1.0  avgt    5   2,231 ± 0,002  us/op
ExponentialHistogramMergeBench.add            250           array-backed                      2.0  avgt    5   2,796 ± 0,048  us/op
ExponentialHistogramMergeBench.add            250             compressed                     0.01  avgt    5   5,210 ± 0,039  us/op
ExponentialHistogramMergeBench.add            250             compressed                      0.1  avgt    5   2,115 ± 0,018  us/op
ExponentialHistogramMergeBench.add            250             compressed                     0.25  avgt    5   2,453 ± 0,020  us/op
ExponentialHistogramMergeBench.add            250             compressed                      0.5  avgt    5   4,687 ± 0,032  us/op
ExponentialHistogramMergeBench.add            250             compressed                      1.0  avgt    5   3,331 ± 0,055  us/op
ExponentialHistogramMergeBench.add            250             compressed                      2.0  avgt    5   5,171 ± 0,016  us/op
ExponentialHistogramMergeBench.add            500           array-backed                     0.01  avgt    5   3,541 ± 0,026  us/op
ExponentialHistogramMergeBench.add            500           array-backed                      0.1  avgt    5   4,017 ± 0,032  us/op
ExponentialHistogramMergeBench.add            500           array-backed                     0.25  avgt    5   3,638 ± 0,008  us/op
ExponentialHistogramMergeBench.add            500           array-backed                      0.5  avgt    5   4,164 ± 0,018  us/op
ExponentialHistogramMergeBench.add            500           array-backed                      1.0  avgt    5   4,455 ± 0,021  us/op
ExponentialHistogramMergeBench.add            500           array-backed                      2.0  avgt    5   5,244 ± 0,070  us/op
ExponentialHistogramMergeBench.add            500             compressed                     0.01  avgt    5   6,552 ± 0,127  us/op
ExponentialHistogramMergeBench.add            500             compressed                      0.1  avgt    5   4,701 ± 0,025  us/op
ExponentialHistogramMergeBench.add            500             compressed                     0.25  avgt    5   4,987 ± 0,033  us/op
ExponentialHistogramMergeBench.add            500             compressed                      0.5  avgt    5   6,237 ± 0,040  us/op
ExponentialHistogramMergeBench.add            500             compressed                      1.0  avgt    5   9,526 ± 0,074  us/op
ExponentialHistogramMergeBench.add            500             compressed                      2.0  avgt    5   8,496 ± 0,091  us/op
ExponentialHistogramMergeBench.add           1000           array-backed                     0.01  avgt    5   6,921 ± 0,026  us/op
ExponentialHistogramMergeBench.add           1000           array-backed                      0.1  avgt    5   7,882 ± 0,059  us/op
ExponentialHistogramMergeBench.add           1000           array-backed                     0.25  avgt    5   7,245 ± 0,077  us/op
ExponentialHistogramMergeBench.add           1000           array-backed                      0.5  avgt    5   8,322 ± 0,039  us/op
ExponentialHistogramMergeBench.add           1000           array-backed                      1.0  avgt    5   5,205 ± 0,015  us/op
ExponentialHistogramMergeBench.add           1000           array-backed                      2.0  avgt    5   6,001 ± 0,030  us/op
ExponentialHistogramMergeBench.add           1000             compressed                     0.01  avgt    5   9,271 ± 0,133  us/op
ExponentialHistogramMergeBench.add           1000             compressed                      0.1  avgt    5   9,483 ± 0,087  us/op
ExponentialHistogramMergeBench.add           1000             compressed                     0.25  avgt    5   9,652 ± 0,036  us/op
ExponentialHistogramMergeBench.add           1000             compressed                      0.5  avgt    5   9,630 ± 0,082  us/op
ExponentialHistogramMergeBench.add           1000             compressed                      1.0  avgt    5   9,501 ± 0,058  us/op
ExponentialHistogramMergeBench.add           1000             compressed                      2.0  avgt    5   9,497 ± 0,073  us/op
ExponentialHistogramMergeBench.add           2000           array-backed                     0.01  avgt    5  13,714 ± 0,059  us/op
ExponentialHistogramMergeBench.add           2000           array-backed                      0.1  avgt    5  15,715 ± 0,278  us/op
ExponentialHistogramMergeBench.add           2000           array-backed                     0.25  avgt    5  12,880 ± 0,383  us/op
ExponentialHistogramMergeBench.add           2000           array-backed                      0.5  avgt    5  14,455 ± 0,076  us/op
ExponentialHistogramMergeBench.add           2000           array-backed                      1.0  avgt    5   9,649 ± 0,052  us/op
ExponentialHistogramMergeBench.add           2000           array-backed                      2.0  avgt    5  11,839 ± 0,093  us/op
ExponentialHistogramMergeBench.add           2000             compressed                     0.01  avgt    5  39,321 ± 0,263  us/op
ExponentialHistogramMergeBench.add           2000             compressed                      0.1  avgt    5  40,071 ± 0,202  us/op
ExponentialHistogramMergeBench.add           2000             compressed                     0.25  avgt    5  35,409 ± 0,240  us/op
ExponentialHistogramMergeBench.add           2000             compressed                      0.5  avgt    5  29,272 ± 0,345  us/op
ExponentialHistogramMergeBench.add           2000             compressed                      1.0  avgt    5  19,886 ± 0,055  us/op
ExponentialHistogramMergeBench.add           2000             compressed                      2.0  avgt    5  21,253 ± 0,112  us/op
ExponentialHistogramMergeBench.add           5000           array-backed                     0.01  avgt    5  34,710 ± 0,168  us/op
ExponentialHistogramMergeBench.add           5000           array-backed                      0.1  avgt    5  26,595 ± 0,098  us/op
ExponentialHistogramMergeBench.add           5000           array-backed                     0.25  avgt    5  31,429 ± 0,304  us/op
ExponentialHistogramMergeBench.add           5000           array-backed                      0.5  avgt    5  35,262 ± 0,286  us/op
ExponentialHistogramMergeBench.add           5000           array-backed                      1.0  avgt    5  24,084 ± 0,142  us/op
ExponentialHistogramMergeBench.add           5000           array-backed                      2.0  avgt    5  31,712 ± 0,348  us/op
ExponentialHistogramMergeBench.add           5000             compressed                     0.01  avgt    5  96,002 ± 1,436  us/op
ExponentialHistogramMergeBench.add           5000             compressed                      0.1  avgt    5  45,676 ± 0,394  us/op
ExponentialHistogramMergeBench.add           5000             compressed                     0.25  avgt    5  45,431 ± 1,326  us/op
ExponentialHistogramMergeBench.add           5000             compressed                      0.5  avgt    5  45,547 ± 0,312  us/op
ExponentialHistogramMergeBench.add           5000             compressed                      1.0  avgt    5  45,401 ± 0,254  us/op
ExponentialHistogramMergeBench.add           5000             compressed                      2.0  avgt    5  46,643 ± 0,628  us/op
Benchmark results with optimization
Benchmark                           (bucketCount)  (histoImplementation)  (mergedHistoSizeFactor)  Mode  Cnt    Score   Error  Units
ExponentialHistogramMergeBench.add            250           array-backed                     0.01  avgt    5    0,358 ± 0,022  us/op
ExponentialHistogramMergeBench.add            250           array-backed                      0.1  avgt    5    1,517 ± 0,039  us/op
ExponentialHistogramMergeBench.add            250           array-backed                     0.25  avgt    5    2,541 ± 0,021  us/op
ExponentialHistogramMergeBench.add            250           array-backed                      0.5  avgt    5    4,094 ± 0,844  us/op
ExponentialHistogramMergeBench.add            250           array-backed                      1.0  avgt    5    2,710 ± 0,013  us/op
ExponentialHistogramMergeBench.add            250           array-backed                      2.0  avgt    5    4,375 ± 0,093  us/op
ExponentialHistogramMergeBench.add            250             compressed                     0.01  avgt    5    0,413 ± 0,013  us/op
ExponentialHistogramMergeBench.add            250             compressed                      0.1  avgt    5    1,947 ± 0,012  us/op
ExponentialHistogramMergeBench.add            250             compressed                     0.25  avgt    5    3,014 ± 0,018  us/op
ExponentialHistogramMergeBench.add            250             compressed                      0.5  avgt    5    4,766 ± 0,199  us/op
ExponentialHistogramMergeBench.add            250             compressed                      1.0  avgt    5    5,705 ± 0,111  us/op
ExponentialHistogramMergeBench.add            250             compressed                      2.0  avgt    5    9,454 ± 0,312  us/op
ExponentialHistogramMergeBench.add            500           array-backed                     0.01  avgt    5    0,541 ± 0,003  us/op
ExponentialHistogramMergeBench.add            500           array-backed                      0.1  avgt    5    2,913 ± 0,090  us/op
ExponentialHistogramMergeBench.add            500           array-backed                     0.25  avgt    5    5,334 ± 0,020  us/op
ExponentialHistogramMergeBench.add            500           array-backed                      0.5  avgt    5    8,040 ± 0,148  us/op
ExponentialHistogramMergeBench.add            500           array-backed                      1.0  avgt    5    6,993 ± 0,073  us/op
ExponentialHistogramMergeBench.add            500           array-backed                      2.0  avgt    5    9,247 ± 0,682  us/op
ExponentialHistogramMergeBench.add            500             compressed                     0.01  avgt    5    0,646 ± 0,006  us/op
ExponentialHistogramMergeBench.add            500             compressed                      0.1  avgt    5    3,238 ± 0,069  us/op
ExponentialHistogramMergeBench.add            500             compressed                     0.25  avgt    5    6,454 ± 0,021  us/op
ExponentialHistogramMergeBench.add            500             compressed                      0.5  avgt    5    9,250 ± 0,113  us/op
ExponentialHistogramMergeBench.add            500             compressed                      1.0  avgt    5   13,230 ± 0,042  us/op
ExponentialHistogramMergeBench.add            500             compressed                      2.0  avgt    5   15,835 ± 0,171  us/op
ExponentialHistogramMergeBench.add           1000           array-backed                     0.01  avgt    5    1,150 ± 0,065  us/op
ExponentialHistogramMergeBench.add           1000           array-backed                      0.1  avgt    5    5,615 ± 0,031  us/op
ExponentialHistogramMergeBench.add           1000           array-backed                     0.25  avgt    5   10,645 ± 0,112  us/op
ExponentialHistogramMergeBench.add           1000           array-backed                      0.5  avgt    5   15,657 ± 0,154  us/op
ExponentialHistogramMergeBench.add           1000           array-backed                      1.0  avgt    5    9,503 ± 0,106  us/op
ExponentialHistogramMergeBench.add           1000           array-backed                      2.0  avgt    5   13,908 ± 0,109  us/op
ExponentialHistogramMergeBench.add           1000             compressed                     0.01  avgt    5    1,267 ± 0,054  us/op
ExponentialHistogramMergeBench.add           1000             compressed                      0.1  avgt    5    6,254 ± 0,267  us/op
ExponentialHistogramMergeBench.add           1000             compressed                     0.25  avgt    5   11,773 ± 0,028  us/op
ExponentialHistogramMergeBench.add           1000             compressed                      0.5  avgt    5   18,682 ± 3,312  us/op
ExponentialHistogramMergeBench.add           1000             compressed                      1.0  avgt    5   16,758 ± 0,218  us/op
ExponentialHistogramMergeBench.add           1000             compressed                      2.0  avgt    5   27,900 ± 1,093  us/op
ExponentialHistogramMergeBench.add           2000           array-backed                     0.01  avgt    5    2,171 ± 0,010  us/op
ExponentialHistogramMergeBench.add           2000           array-backed                      0.1  avgt    5   11,449 ± 0,324  us/op
ExponentialHistogramMergeBench.add           2000           array-backed                     0.25  avgt    5   20,769 ± 0,194  us/op
ExponentialHistogramMergeBench.add           2000           array-backed                      0.5  avgt    5   31,568 ± 0,700  us/op
ExponentialHistogramMergeBench.add           2000           array-backed                      1.0  avgt    5   14,743 ± 0,083  us/op
ExponentialHistogramMergeBench.add           2000           array-backed                      2.0  avgt    5   26,280 ± 0,296  us/op
ExponentialHistogramMergeBench.add           2000             compressed                     0.01  avgt    5    2,370 ± 0,072  us/op
ExponentialHistogramMergeBench.add           2000             compressed                      0.1  avgt    5   12,201 ± 0,066  us/op
ExponentialHistogramMergeBench.add           2000             compressed                     0.25  avgt    5   22,911 ± 0,060  us/op
ExponentialHistogramMergeBench.add           2000             compressed                      0.5  avgt    5   37,543 ± 0,120  us/op
ExponentialHistogramMergeBench.add           2000             compressed                      1.0  avgt    5   27,279 ± 0,168  us/op
ExponentialHistogramMergeBench.add           2000             compressed                      2.0  avgt    5   53,994 ± 0,192  us/op
ExponentialHistogramMergeBench.add           5000           array-backed                     0.01  avgt    5    5,345 ± 0,033  us/op
ExponentialHistogramMergeBench.add           5000           array-backed                      0.1  avgt    5   23,568 ± 2,101  us/op
ExponentialHistogramMergeBench.add           5000           array-backed                     0.25  avgt    5   40,661 ± 1,310  us/op
ExponentialHistogramMergeBench.add           5000           array-backed                      0.5  avgt    5   54,654 ± 0,399  us/op
ExponentialHistogramMergeBench.add           5000           array-backed                      1.0  avgt    5   35,928 ± 0,237  us/op
ExponentialHistogramMergeBench.add           5000           array-backed                      2.0  avgt    5   63,468 ± 0,900  us/op
ExponentialHistogramMergeBench.add           5000             compressed                     0.01  avgt    5    5,741 ± 0,038  us/op
ExponentialHistogramMergeBench.add           5000             compressed                      0.1  avgt    5   25,882 ± 0,085  us/op
ExponentialHistogramMergeBench.add           5000             compressed                     0.25  avgt    5   48,965 ± 0,145  us/op
ExponentialHistogramMergeBench.add           5000             compressed                      0.5  avgt    5   71,124 ± 1,751  us/op
ExponentialHistogramMergeBench.add           5000             compressed                      1.0  avgt    5   67,148 ± 0,239  us/op
ExponentialHistogramMergeBench.add           5000             compressed                      2.0  avgt    5  134,633 ± 0,754  us/op

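The mergedHistoSizeFactor is the size of the merged histograms relative to the accumulator (e.g. 0.1 means the merged histograms are 90% smaller than the accumulator), and the score is the time to merge a single histogram into the accumulator. Scores are printed with a decimal comma, so 1,938 us/op means 1.938 µs per operation.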

As it turns out, the optimization is only a net positive for bucket counts >= 500 and only if the merged histograms are at least 90% smaller than the accumulator (mergedHistoSizeFactor <= 0.1). Otherwise it even slows merging down due to the additional buffering and copying.

We could optimize for this (e.g. only buffer histograms whose size is below a threshold), but it does not seem worth it at this point. If we use a bucket limit around the OpenTelemetry default (e.g. 320), we don't see significant gains for reasonably sized inputs. At the same time, the optimization certainly increases the complexity of the code.

Therefore, with this PR I propose to document these findings in the code and remove the ToDo, but not to add the optimization, as it does not seem worth it right now.

@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@elasticsearchmachine elasticsearchmachine added the external-contributor Pull request authored by a developer outside the Elasticsearch team label Sep 24, 2025
@JonasKunz JonasKunz merged commit aa04178 into elastic:main Sep 29, 2025
34 checks passed
@JonasKunz JonasKunz deleted the update-exp-histo-optimization-todo branch September 29, 2025 14:49

Labels

external-contributor Pull request authored by a developer outside the Elasticsearch team >non-issue :StorageEngine/Mapping The storage related side of mappings Team:StorageEngine v9.2.0

3 participants