Convert OTel Histograms to CloudWatch Values/Counts #376
Description
New implementation for converting OTel histograms to CloudWatch Values/Counts for emission to CloudWatch by the CloudWatch Agent. The OTel histogram format is incompatible with the CloudWatch APIs. A mapping algorithm is needed to transform OTel histograms to Values/Counts.
OTel histograms are in the format:
See the following for more details on OTel histogram format: https://opentelemetry.io/docs/specs/otel/metrics/data-model/#histogram
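For orientation, the fields of an explicit-bucket histogram data point described in the linked spec can be sketched roughly as follows. The class name `HistogramPoint` and its helper are hypothetical and are not the collector's actual data types; only the field meanings (explicit bounds, per-bucket counts, count, sum, optional min/max) come from the spec.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class HistogramPoint:
    """Hypothetical stand-in for one OTel explicit-bucket histogram data point."""
    explicit_bounds: List[float]   # N boundaries define N + 1 buckets
    bucket_counts: List[int]       # one count per bucket, so len == N + 1
    count: int                     # total number of recorded measurements
    total: float                   # sum of all recorded measurements
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def bucket_ranges(self) -> List[Tuple[float, float, int]]:
        """Return (lower, upper, count) per bucket; the outer buckets are open-ended."""
        lowers = [float("-inf")] + self.explicit_bounds
        uppers = self.explicit_bounds + [float("inf")]
        return list(zip(lowers, uppers, self.bucket_counts))


point = HistogramPoint(
    explicit_bounds=[10.0, 20.0],
    bucket_counts=[1, 5, 2],
    count=8,
    total=130.0,
    minimum=4.0,
    maximum=27.0,
)
for lower, upper, n in point.bucket_ranges():
    print(f"({lower}, {upper}] -> {n}")
```

Note that the first and last buckets have no finite outer edge, which is one reason the format cannot be sent to CloudWatch directly.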
For the purposes of this algorithm, input histograms are assumed to always use Delta temporality, because the CloudWatch Agent uses the cumulativetodelta processor to convert them before emission.
CloudWatch accepts histograms using the Values/Counts model in the PutMetricData API.
See the following for more details: https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_MetricDatum.html. This API accepts:
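As a rough illustration of the Values/Counts model, the sketch below builds the shape of a single `MetricDatum` from value/count pairs. The function name `to_metric_datum` and the metric name are hypothetical; `MetricName`, `Values`, `Counts`, and `Unit` are fields of the `MetricDatum` structure. No API call is made here, and this is not the agent's actual code path.

```python
from typing import Dict, List, Tuple


def to_metric_datum(name: str, pairs: List[Tuple[float, int]], unit: str = "None") -> Dict:
    """Build a MetricDatum-shaped dict from (value, count) pairs.

    Values[i] is reported as having occurred Counts[i] times.
    Pairs with a zero count carry no information, so they are dropped.
    """
    kept = [(v, c) for v, c in pairs if c > 0]
    return {
        "MetricName": name,
        "Values": [v for v, _ in kept],
        "Counts": [float(c) for _, c in kept],  # Counts are doubles in the API
        "Unit": unit,
    }


datum = to_metric_datum("latency", [(1.0, 2), (2.0, 0), (3.0, 1)])
print(datum)
```

The two arrays must stay the same length and in the same order, since each count applies to the value at the same index.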
This algorithm converts each bucket of the input histogram into at most 10 value/count pairs, referred to as "inner buckets". The values of the inner buckets are spread evenly across the bucket span. The counts of the inner buckets are determined by an exponential mapping: counts are weighted more heavily toward one side of the bucket according to an exponential function that depends on how the density of the neighboring buckets is changing.
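The idea can be sketched for a single bucket as below. This is a simplified illustration and not the implementation in this PR: the function name `split_bucket`, the use of inner-bucket midpoints as values, the geometric weighting, and the `ratio` parameter are all assumptions. In the real algorithm the direction and strength of the weighting come from the neighboring bucket densities; here `ratio` is simply passed in, and the bucket is assumed to have finite edges.

```python
from typing import List, Tuple


def split_bucket(lower: float, upper: float, count: int,
                 ratio: float = 1.0, max_inner: int = 10) -> List[Tuple[float, int]]:
    """Split one histogram bucket into at most max_inner (value, count) pairs.

    ratio > 1 weights counts toward the upper edge, ratio < 1 toward the
    lower edge, and ratio == 1 spreads them uniformly (assumed weighting).
    """
    if count <= 0:
        return []
    n = min(max_inner, count)
    width = (upper - lower) / n
    # Values spread evenly across the span (midpoint of each inner bucket).
    values = [lower + width * (i + 0.5) for i in range(n)]
    # Exponentially growing (or shrinking) weights across the inner buckets.
    weights = [ratio ** i for i in range(n)]
    scale = count / sum(weights)
    raw = [w * scale for w in weights]
    # Largest-remainder rounding so the inner counts add up to count.
    counts = [int(x) for x in raw]
    leftover = count - sum(counts)
    by_remainder = sorted(range(n), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return [(v, c) for v, c in zip(values, counts) if c > 0]


uniform = split_bucket(0.0, 10.0, 20)
skewed = split_bucket(0.0, 10.0, 100, ratio=2.0)
print(uniform)
print(skewed)
```

Whatever weighting is used, the inner counts should sum to the original bucket count so that the total sample count of the histogram is preserved.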
The following image demonstrates how an example input histogram is converted to the values/count model. The red dots indicate the values/counts that are pushed to CloudWatch.

Testing
Unit testing
Used the new tools introduced previously to send histogram test cases to CloudWatch and then retrieve the percentile metrics.
Most percentiles fall within the expected range. A few are off by a percent or two. I believe this is because the back end applies another SEH1 mapping for efficient storage, which slightly modifies the values that the agent sends to CloudWatch.
For our accuracy tests, we see several improvements:
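As a local sanity check, a percentile can be estimated directly from value/count pairs before they are sent. The sketch below uses a simple nearest-rank rule; the function name `weighted_percentile` is hypothetical, and CloudWatch's own percentile computation may differ from this, so it should be read only as a rough reference point.

```python
from typing import List, Tuple


def weighted_percentile(pairs: List[Tuple[float, int]], p: float) -> float:
    """Nearest-rank percentile over (value, count) pairs, with p in (0, 100]."""
    ordered = sorted(pairs)
    total = sum(c for _, c in ordered)
    if total <= 0:
        raise ValueError("no samples")
    rank = p / 100.0 * total
    running = 0
    for value, count in ordered:
        running += count
        # The first value whose cumulative count reaches the rank is the answer.
        if running >= rank:
            return value
    return ordered[-1][0]


pairs = [(1.0, 5), (2.0, 3), (3.0, 2)]
print(weighted_percentile(pairs, 50), weighted_percentile(pairs, 99))
```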