martincostello
Member

@martincostello martincostello commented Oct 3, 2025

Fixes #6538

Changes

Updates the code that writes the protobuf fields for metrics histograms to use packed format instead of unpacked.
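For readers unfamiliar with the two encodings, here is a minimal sketch of how they differ on the wire. It assumes `bucket_counts` is a `repeated fixed64` field with field number 6, as in the OTLP `HistogramDataPoint` message. The helper names (`EncodeUnpacked`, `EncodePacked`, and so on) are hypothetical and are not the exporter's actual code.

```csharp
using System;
using System.Collections.Generic;

// Two bucket counts, encoded both ways for comparison.
ulong[] bucketCounts = { 1, 2 };
Console.WriteLine(Convert.ToHexString(EncodeUnpacked(6, bucketCounts)));
// prints 310100000000000000310200000000000000 (tag 0x31 repeated per value)
Console.WriteLine(Convert.ToHexString(EncodePacked(6, bucketCounts)));
// prints 321001000000000000000200000000000000 (tag 0x32, length 0x10, then values)

// Writes a base-128 varint, least significant group first.
static void WriteVarint(List<byte> output, ulong value)
{
    while (value >= 0x80)
    {
        output.Add((byte)((value & 0x7F) | 0x80));
        value >>= 7;
    }
    output.Add((byte)value);
}

// Writes a 64-bit value as 8 little-endian bytes.
static void WriteFixed64(List<byte> output, ulong value)
{
    for (int i = 0; i < 8; i++)
    {
        output.Add((byte)((value >> (8 * i)) & 0xFF));
    }
}

// Unpacked: every element carries its own tag with wire type 1 (64-bit).
static byte[] EncodeUnpacked(int fieldNumber, ulong[] values)
{
    var output = new List<byte>();
    foreach (ulong value in values)
    {
        WriteVarint(output, ((ulong)fieldNumber << 3) | 1UL);
        WriteFixed64(output, value);
    }
    return output.ToArray();
}

// Packed: one tag with wire type 2 (length-delimited), one byte length,
// then all elements back to back.
static byte[] EncodePacked(int fieldNumber, ulong[] values)
{
    var output = new List<byte>();
    WriteVarint(output, ((ulong)fieldNumber << 3) | 2UL);
    WriteVarint(output, (ulong)(values.Length * 8));
    foreach (ulong value in values)
    {
        WriteFixed64(output, value);
    }
    return output.ToArray();
}
```

The wire type in the tag is the only thing a strict receiver needs to tell the two apart, which matches the `wrong wireType = 1` error in the linked issue.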

TODO

Merge requirement checklist

  • CONTRIBUTING guidelines followed (license requirements, nullable enabled, static analysis, etc.)
  • Unit tests added/updated
  • Appropriate CHANGELOG.md files updated for non-trivial changes
  • Changes in public API reviewed (if applicable)

Benchmarks

Manually edited this code:

to:

-     [Params(1, 10, 100)]
+     [Params(1, 10, 20)]

Otherwise, it takes 75 minutes just to reach the warm-up phase for the 100-batch case.

This PR (0bdac99)

gRPC

Command:

dotnet run --configuration Release --framework net9.0 -- --filter "*OtlpGrpc*" --memory

BenchmarkDotNet v0.15.4, Windows 11 (10.0.26100.6584/24H2/2024Update/HudsonValley)
13th Gen Intel Core i7-13700H 2.90GHz, 1 CPU, 20 logical and 14 physical cores
.NET SDK 9.0.305
  [Host]     : .NET 9.0.9 (9.0.9, 9.0.925.41916), X64 RyuJIT x86-64-v3
  DefaultJob : .NET 9.0.9 (9.0.9, 9.0.925.41916), X64 RyuJIT x86-64-v3


| Method | NumberOfBatches | NumberOfSpans | Mean | Error | StdDev | Allocated |
|---|---:|---:|---:|---:|---:|---:|
| OtlpExporter_Batching | 1 | 10000 | 4.075 s | 0.0196 s | 0.0183 s | 15.51 KB |
| OtlpExporter_Batching | 10 | 10000 | 40.868 s | 0.0999 s | 0.0935 s | 149.53 KB |
| OtlpExporter_Batching | 20 | 10000 | 81.431 s | 0.2181 s | 0.2040 s | 298.94 KB |

HTTP

Command:

dotnet run --configuration Release --framework net9.0 -- --filter "*OtlpHttp*" --memory

BenchmarkDotNet v0.15.4, Windows 11 (10.0.26100.6584/24H2/2024Update/HudsonValley)
13th Gen Intel Core i7-13700H 2.90GHz, 1 CPU, 20 logical and 14 physical cores
.NET SDK 9.0.305
  [Host]     : .NET 9.0.9 (9.0.9, 9.0.925.41916), X64 RyuJIT x86-64-v3
  DefaultJob : .NET 9.0.9 (9.0.9, 9.0.925.41916), X64 RyuJIT x86-64-v3


| Method | NumberOfBatches | NumberOfSpans | Mean | Error | StdDev | Median | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|
| OtlpExporter_Batching | 1 | 10000 | 12.11 ms | 0.892 ms | 2.616 ms | 11.52 ms | 7.25 KB |
| OtlpExporter_Batching | 10 | 10000 | 83.81 ms | 6.606 ms | 19.164 ms | 75.51 ms | 75.89 KB |
| OtlpExporter_Batching | 100 | 10000 | 944.47 ms | 77.790 ms | 226.916 ms | 882.53 ms | 726.85 KB |

main (6a70665)

gRPC

Command:

dotnet run --configuration Release --framework net9.0 -- --filter "*OtlpGrpc*" --memory

BenchmarkDotNet v0.15.4, Windows 11 (10.0.26100.6584/24H2/2024Update/HudsonValley)
13th Gen Intel Core i7-13700H 2.90GHz, 1 CPU, 20 logical and 14 physical cores
.NET SDK 9.0.305
  [Host]     : .NET 9.0.9 (9.0.9, 9.0.925.41916), X64 RyuJIT x86-64-v3
  DefaultJob : .NET 9.0.9 (9.0.9, 9.0.925.41916), X64 RyuJIT x86-64-v3


| Method | NumberOfBatches | NumberOfSpans | Mean | Error | StdDev | Allocated |
|---|---:|---:|---:|---:|---:|---:|
| OtlpExporter_Batching | 1 | 10000 | 4.083 s | 0.0189 s | 0.0177 s | 15.08 KB |
| OtlpExporter_Batching | 10 | 10000 | 40.795 s | 0.0784 s | 0.0733 s | 149.53 KB |
| OtlpExporter_Batching | 20 | 10000 | 81.629 s | 0.1562 s | 0.1461 s | 299.03 KB |

HTTP

Command:

dotnet run --configuration Release --framework net9.0 -- --filter "*OtlpHttp*" --memory

BenchmarkDotNet v0.15.4, Windows 11 (10.0.26100.6584/24H2/2024Update/HudsonValley)
13th Gen Intel Core i7-13700H 2.90GHz, 1 CPU, 20 logical and 14 physical cores
.NET SDK 9.0.305
  [Host]     : .NET 9.0.9 (9.0.9, 9.0.925.41916), X64 RyuJIT x86-64-v3
  DefaultJob : .NET 9.0.9 (9.0.9, 9.0.925.41916), X64 RyuJIT x86-64-v3


| Method | NumberOfBatches | NumberOfSpans | Mean | Error | StdDev | Median | Allocated |
|---|---:|---:|---:|---:|---:|---:|---:|
| OtlpExporter_Batching | 1 | 10000 | 7.872 ms | 0.4105 ms | 1.210 ms | 7.698 ms | 7.25 KB |
| OtlpExporter_Batching | 10 | 10000 | 62.448 ms | 3.8543 ms | 10.997 ms | 56.356 ms | 75.89 KB |
| OtlpExporter_Batching | 100 | 10000 | 765.617 ms | 42.4808 ms | 123.919 ms | 754.422 ms | 726.85 KB |

End-to-end testing

I ran two Grafana k6 performance tests against an application of my own with the following two commits:

Both runs have similar performance profiles, and metrics histograms render similarly.

Comparison

image

v1.13.0

image image image

This PR (0bdac99)

image image image

@github-actions github-actions bot added the pkg:OpenTelemetry.Exporter.OpenTelemetryProtocol Issues related to OpenTelemetry.Exporter.OpenTelemetryProtocol NuGet package label Oct 3, 2025

codecov bot commented Oct 3, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.77%. Comparing base (ac8d45e) to head (ad3ef55).
⚠️ Report is 6 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6567      +/-   ##
==========================================
+ Coverage   86.64%   86.77%   +0.13%     
==========================================
  Files         258      258              
  Lines       11910    11946      +36     
==========================================
+ Hits        10319    10366      +47     
+ Misses       1591     1580      -11     
| Flag | Coverage Δ |
|---|---|
| unittests-Project-Experimental | 86.59% <100.00%> (+0.23%) ⬆️ |
| unittests-Project-Stable | 86.70% <100.00%> (+0.12%) ⬆️ |
| unittests-Solution | 86.72% <100.00%> (+0.38%) ⬆️ |
| unittests-UnstableCoreLibraries-Experimental | 85.87% <ø> (ø) |

| Files with missing lines | Coverage Δ |
|---|---|
| ...ntation/Serializer/ProtobufOtlpMetricSerializer.cs | 98.91% <100.00%> (+0.14%) ⬆️ |
| ...ol/Implementation/Serializer/ProtobufSerializer.cs | 91.58% <100.00%> (+0.07%) ⬆️ |
| src/OpenTelemetry/Metrics/AggregatorStore.cs | 87.59% <100.00%> (+1.96%) ⬆️ |

... and 3 files with indirect coverage changes

@martincostello martincostello changed the title [WIP] [OTLP] Use packed format for metric histograms [OTLP] Use packed format for metric histograms Oct 3, 2025
martincostello added a commit to martincostello/costellobot that referenced this pull request Oct 6, 2025
martincostello added a commit to martincostello/costellobot that referenced this pull request Oct 6, 2025
martincostello added a commit to martincostello/opentelemetry-dotnet that referenced this pull request Oct 6, 2025
@martincostello martincostello marked this pull request as ready for review October 8, 2025 08:39
@martincostello martincostello requested a review from a team as a code owner October 8, 2025 08:39
@Copilot Copilot AI review requested due to automatic review settings October 8, 2025 08:39

@Copilot Copilot AI left a comment

Pull Request Overview

This PR updates the OTLP metric histogram serialization to use packed format instead of unpacked format, fixing compatibility issues where some OTLP receivers were rejecting payloads with HTTP 400 responses due to gRPC protocol errors.

Key Changes:

  • Refactored histogram bucket serialization from individual field writes to packed format
  • Added helper method WriteDouble for packed double serialization
  • Replaced inline histogram bucket writing with dedicated WriteHistogramBuckets method
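As an illustration of the packed double case, here is a minimal sketch that writes a packed `repeated double` field into a preallocated buffer. It assumes `explicit_bounds` is field number 7 of the OTLP `HistogramDataPoint` message. The names `WriteVarint32` and `WritePackedDoubles` are hypothetical and do not reflect the signatures actually added in this PR.

```csharp
using System;
using System.Buffers.Binary;

// Two explicit bounds written as one packed field.
var buffer = new byte[32];
int written = WritePackedDoubles(buffer, 0, 7, new[] { 0.0, 5.0 });
Console.WriteLine(Convert.ToHexString(buffer, 0, written));
// prints 3A1000000000000000000000000000001440

// Writes a base-128 varint at `position` and returns the new position.
static int WriteVarint32(byte[] buffer, int position, uint value)
{
    while (value >= 0x80)
    {
        buffer[position++] = (byte)((value & 0x7F) | 0x80);
        value >>= 7;
    }
    buffer[position++] = (byte)value;
    return position;
}

// Writes tag (wire type 2), byte length, then each double as 8 little-endian bytes.
static int WritePackedDoubles(byte[] buffer, int position, int fieldNumber, double[] values)
{
    position = WriteVarint32(buffer, position, (uint)((fieldNumber << 3) | 2));
    // Doubles are fixed width, so the payload length is known before any value is written.
    position = WriteVarint32(buffer, position, (uint)(values.Length * sizeof(double)));
    foreach (double value in values)
    {
        BinaryPrimitives.WriteDoubleLittleEndian(buffer.AsSpan(position, 8), value);
        position += 8;
    }
    return position;
}
```

Because every element is exactly 8 bytes, the length prefix can be computed up front and the field written in a single pass, with no need to reserve space and patch the length afterwards.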

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| ProtobufSerializer.cs | Added WriteDouble helper method for packed double serialization |
| ProtobufOtlpMetricSerializer.cs | Refactored histogram bucket serialization to use packed format with dedicated helper methods |
| CHANGELOG.md | Added entry documenting the fix for OTLP receiver rejection issues |


linux-foundation-easycla bot commented Oct 8, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

Use packed format for metric histograms.

Resolves open-telemetry#6538.
@martincostello
Member Author

Given that the OTLP integration tests pass without this change, I'll try to create a unit test that validates that the serialized protobuf message is laid out in the expected packed format.

Add snapshot test for the protobuf serialized metrics.
@github-actions github-actions bot added infra Infra work - CI/CD, code coverage, linters dependencies Pull requests that update a dependency file pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package labels Oct 8, 2025
Resolve IDE0005 warnings caused by Verify adding an implicit using for Xunit.
public static class ProtobufOtlpMetricSerializerTests
{
    [Fact]
    public static async Task WriteMetricsData_Serializes_Metrics_Correctly()
Member Author

I've verified manually that if this test is added to main it fails.

@martincostello
Member Author

martincostello commented Oct 8, 2025

I also tried a test with the assertion below, but the Google.Protobuf library appears to be lenient enough to accept both packed and unpacked fields, so it also passes on main. Copilot suggests that both it and protobuf-net are permissive and cannot be made strict, so if that's the case the Verify-based test is probably the best option.

var deserialized = Proto.Metrics.V1.MetricsData.Parser.ParseFrom(buffer, 0, actual);

Assert.NotNull(deserialized);
Assert.NotNull(deserialized.ResourceMetrics);
Assert.NotEmpty(deserialized.ResourceMetrics);

foreach (var resourceMetrics in deserialized.ResourceMetrics)
{
    foreach (var scope in resourceMetrics.ScopeMetrics)
    {
        foreach (var metric in scope.Metrics)
        {
            if (metric.Histogram is { } histogram)
            {
                foreach (var point in histogram.DataPoints)
                {
                    Assert.NotNull(point.BucketCounts);
                    Assert.NotEmpty(point.BucketCounts);

                    Assert.NotNull(point.ExplicitBounds);
                    Assert.NotEmpty(point.ExplicitBounds);
                }
            }
        }
    }
}
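
Since a lenient parser hides the difference, another option is to inspect the wire types in the raw bytes. Here is a minimal sketch, assuming a single flat message and field number 6 for `bucket_counts`. The names `ReadVarint` and `WireTypesForField` are hypothetical. In a real exported payload the data point is nested several messages deep, so an actual test would first have to descend through the enclosing length-delimited fields.

```csharp
using System;
using System.Collections.Generic;

// Hand-built inputs: field 6 holding the values 1 and 2, unpacked and packed.
byte[] unpacked = Convert.FromHexString("310100000000000000310200000000000000");
byte[] packed = Convert.FromHexString("321001000000000000000200000000000000");
Console.WriteLine(string.Join(",", WireTypesForField(unpacked, 6))); // prints 1,1
Console.WriteLine(string.Join(",", WireTypesForField(packed, 6)));   // prints 2

// Reads a base-128 varint starting at `offset` and advances `offset` past it.
static ulong ReadVarint(byte[] data, ref int offset)
{
    ulong result = 0;
    int shift = 0;
    while (true)
    {
        byte b = data[offset++];
        result |= (ulong)(b & 0x7F) << shift;
        if ((b & 0x80) == 0)
        {
            return result;
        }
        shift += 7;
    }
}

// Walks the top-level fields of one message and records the wire type of
// every occurrence of `fieldNumber`. Nested messages are skipped, not entered.
static List<int> WireTypesForField(byte[] message, int fieldNumber)
{
    var wireTypes = new List<int>();
    int offset = 0;
    while (offset < message.Length)
    {
        ulong tag = ReadVarint(message, ref offset);
        int wireType = (int)(tag & 0x7);
        if ((int)(tag >> 3) == fieldNumber)
        {
            wireTypes.Add(wireType);
        }
        switch (wireType)
        {
            case 0: ReadVarint(message, ref offset); break; // varint
            case 1: offset += 8; break;                     // 64-bit
            case 2:                                         // length-delimited
                int length = (int)ReadVarint(message, ref offset);
                offset += length;
                break;
            case 5: offset += 4; break;                     // 32-bit
            default: throw new InvalidOperationException($"Unsupported wire type {wireType}.");
        }
    }
    return wireTypes;
}
```

A check of this kind would fail on unpacked output, where the field appears once per element with wire type 1, and pass on packed output, where it appears once with wire type 2.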

Move the code to generate the `Batch<Metrics>` to another method.
Comment on lines +304 to +305
this.StartTimeExclusive = startTimeExclusive;
this.EndTimeInclusive = endTimeInclusive;
Member

Is there any option to make these calls via reflection?

Member Author

I can do that if desired, but IMHO that's a bit yuck if the tests already have [InternalsVisibleTo]. Alternatively, I could just make the setters internal instead of private.

Labels
dependencies Pull requests that update a dependency file infra Infra work - CI/CD, code coverage, linters pkg:OpenTelemetry.Exporter.OpenTelemetryProtocol Issues related to OpenTelemetry.Exporter.OpenTelemetryProtocol NuGet package pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package
Development

Successfully merging this pull request may close these issues.

[bug][metrics OtelExporter] Invalid protobuf structure: proto: wrong wireType = 1 for field BucketCounts
2 participants