-
Notifications
You must be signed in to change notification settings - Fork 151
Various optimizations for Activity handling #8040
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| else if ((string.IsNullOrEmpty(activityTraceId) && w3cActivity is { ParentSpanId: not null }) | ||
| || w3cActivity is { Parent: not null }) | ||
| { | ||
| // We know we have a parent context, but also that the traceID is (weirdly) null, so we fallback |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really see how this is possible really looking at the Activity code
Pretty sure I wrote it all though 🙈
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I'm not sure how this can happen...can we log an error for this? It might seem weird to the end-user, but I would love for us to more proactively get a better sense of when this is happening
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
string.IsNullOrEmpty(activityTraceId) && w3cActivity is { ParentSpanId: not null }
This condition makes sense in that it's the testing the case where the first if (!StringUtil.IsNullOrEmpty(activityTraceId) && w3cActivity is { ParentSpanId: { } parentSpanId }) fails the activityTraceId portion of the conditional but passes the parentSpanId part of the conditional. However, I don't see how this is possible. This is the part where I think error logging could identify if we ever hit this.
w3cActivity is { Parent: not null }
It seems like this would happen for .NET 5 but the Activity was set to a Hierarchical format. That one I can easily see happening and shouldn't error log.
What do you think of separating out the two conditionals and for the ParentSpanId condition we only do error logging? In fact the current ParentId lookup you proposed doesn't look right for that branch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really see how this is possible really looking at the Activity code
Pretty sure I wrote it all though 🙈
Yeah, I purposefully avoided changing the checks here because of the complexity of hierarchical IDs vs W3C and how Activity has evolved over the years 😅 Definitely opted for an "I trust you" possibility 😂
From my own reading of Activity, activityTraceId will only ever be null if we are using hierarchical IDs, so I think your assertion makes sense - we can update this to log a warning when string.IsNullOrEmpty(activityTraceId) && w3cActivity is { ParentSpanId: not null }, though what should we do in that case, in terms of finding a parent? 🤔 Just, don't have a parent? If we do that, then we'll end up falling into the subsequent block, where we try to modify the activity to match any ambient span, which also feels a bit... off...
In fact the current
ParentIdlookup you proposed doesn't look right for that branch
Hmmm, you're right... the previous condition:
if (!StringUtil.IsNullOrEmpty(activityTraceId) && w3cActivity is { ParentSpanId: { } parentSpanId })could only ever be hit for w3c IDs, the hierarchical ones would never be hit 🤔 With my change, we now do hit this for hierarchical IDs... which means that this is a behaviour change, but then again, it feels like the previous logic was off - I don't think we would ever have correctly grabbed the parent for hierarchical IDs? 😕 I'll debug it and see what was happening in those cases, because I'm super confused 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I just stepped through this, for hierarchical IDs, we would previously never hit any of the branches for setting the parent 🤦♂️
My fix now does (correctly I think) hit the second branch when you're using hierarchical IDs, and looks up the parent correctly in ActivityMappingById, so I believe we need to keep that branch in some form, but it's all a bit confusing, so I've reworked it a lot, as below (pushing it shortly, but CI is getting tired, so will do it in a one-er shortly)
Also noticed we were completely ignore hierarchical IDs in < .NET Core 3 (i.e. IActivity) which didn't seem right?
if (activity is IW3CActivity activity3)
{
var activityTraceId = activity3.TraceId;
var activitySpanId = activity3.SpanId;
if (!StringUtil.IsNullOrEmpty(activityTraceId))
{
// W3C ID
if (activity3 is { ParentSpanId: { } parentSpanId })
{
if (ActivityMappingById.TryGetValue(new ActivityKey(activityTraceId, parentSpanId), out ActivityMapping mapping))
{
parent = mapping.Scope.Span.Context;
}
else
{
// create a new parent span context for the ActivityContext
// ... code elided, same as before
}
}
}
else
{
// No traceID, so much be Hierarchical ID
if (activity3.ParentSpanId is { } parentSpanId)
{
Log.Error("Activity with ID {ActivityId} had parent span ID {ParentSpanId} but TraceID was missing", activity.Id, parentSpanId);
}
else
{
var parentId = activity3.ParentId;
if (!StringUtil.IsNullOrEmpty(parentId) && ActivityMappingById.TryGetValue(new ActivityKey(parentId), out ActivityMapping mapping))
{
parent = mapping.Scope.Span.Context;
}
}
}
// ... Same "fixup" code as before, only for W3C ID spans
}
else
{
// NEW BRANCH
// non-IW3CActivity, i.e. we're in .NET Core 2.x territory. Only have hierarchical IDs to worry about here
var parentId = activity.ParentId;
if (!StringUtil.IsNullOrEmpty(parentId) && ActivityMappingById.TryGetValue(new ActivityKey(parentId), out ActivityMapping mapping))
{
parent = mapping.Scope.Span.Context;
}
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow thanks and nice catch 👍
## Summary of changes Update `Activity`-based benchmarks to reduce variability and make comparisons easier ## Reason for change The various `Activity`-related benchmarks call the global `Tracer` instance, so we should make sure to configure it with our default benchmarking settings (basically disabling background jobs like telemetry/discovery/remote config) to reduce variation. Also added a "baseline" job for comparison and a version that uses hiearachical IDs instead of W3C IDs. Was a prerequisite for a bunch of other work. ## Implementation details - Configure `Benchmarks.OpenTelemetry.InstrumentedApi` project and `Benchmarks.Trace/ActivityBenchmark` to setup the global tracer with Telemetry etc disabled - Add two extra benchmarks to `Benchmarks.Trace/ActivityBenchmark` - `StartStopWithChild_Baseline`, which is the same as `StartStopWithChild` but without the DD integration - `StartStopWithChild_Hierarchical`, which is the same as `StartStopWithChild` but uses hierarchical ID format - Don't run either of these in CI for now (to avoid extra load), just for local comparisons - Simplify the benchmark to just do explicit duck typing (which is closer to what we do normally anyway, and removes a bunch of code) ## Test coverage Running these benchmarks locally shows the improvements we need to make, and highlights that we clearly have a bug with hierarchical IDs 😬 | Method | Runtime | Mean | Error | StdDev | Gen0 | Gen1 | Allocated | | ------------------------------- | -------------------- | ----------: | ----------: | ----------: | -----: | -----: | --------: | | StartStopWithChild_Baseline | .NET 6.0 | 672.9 ns | 23.23 ns | 66.66 ns | 0.0038 | - | 1.09 KB | | StartStopWithChild_Hierarchical | .NET 6.0 | 35,478.1 ns | 1,035.96 ns | 3,021.94 ns | - | - | 10.47 KB | | StartStopWithChild | .NET 6.0 | 3,681.7 ns | 50.23 ns | 44.53 ns | 0.0153 | - | 4.87 KB | | StartStopWithChild_Baseline | .NET 8.0 | 570.8 ns | 16.96 ns | 49.22 ns | 0.0029 | - | 1.09 KB | | StartStopWithChild_Hierarchical | .NET 8.0 | 32,478.5 ns | 1,215.04 ns | 3,466.57 ns | - | - | 10.48 KB | | StartStopWithChild | .NET 8.0 | 3,021.4 ns | 59.24 ns | 163.16 ns | 0.0153 | - | 4.77 KB | | StartStopWithChild_Baseline | .NET Core 3.1 | 752.6 ns | 15.06 ns | 30.09 ns | 0.0038 | - | 1.19 KB | | StartStopWithChild_Hierarchical | .NET Core 3.1 | 34,170.1 ns | 572.80 ns | 478.31 ns | - | - | 10.75 KB | | StartStopWithChild | .NET Core 3.1 | 4,905.0 ns | 31.63 ns | 29.58 ns | 0.0153 | - | 5.05 KB | | StartStopWithChild_Baseline | .NET Framework 4.7.2 | 770.7 ns | 6.03 ns | 5.64 ns | 0.2012 | - | 1.24 KB | | StartStopWithChild_Hierarchical | .NET Framework 4.7.2 | 37,386.2 ns | 542.73 ns | 453.21 ns | 1.8921 | 0.1221 | 11.77 KB | | StartStopWithChild | .NET Framework 4.7.2 | 5,884.5 ns | 64.54 ns | 60.37 ns | 0.8621 | - | 5.3 KB | ## Other details https://datadoghq.atlassian.net/browse/LANGPLAT-915 Part of a stack working to improve OTel performance - #8036 👈 - #8037 - #8038 - #8039 - #8040 - #8041 - #8042
c54293a to
052d4a9
Compare
3645391 to
793b5c3
Compare
BenchmarksBenchmark execution time: 2026-01-12 16:03:28 Comparing candidate commit 982d0dc in PR branch Found 8 performance improvements and 6 performance regressions! Performance is the same for 156 metrics, 16 unstable metrics. scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild net472
scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild netcoreapp3.1
scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces net6.0
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleMoreComplexBody net6.0
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody net6.0
scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net472
scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net6.0
scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1
scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync net6.0
scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark net6.0
scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog netcoreapp3.1
scenario:Benchmarks.Trace.RedisBenchmark.SendReceive net472
|
## Summary of changes Reduce allocation of `ActivityHandlerCommon` by removing `string` concatenation ## Reason for change The `ActivityHandlerCommon.ActivityStarted` and `ActivityStopped` methods need to store and retrieve `Activity` instances from a `ConcurrentDictionary<>`. Today they're doing that be concatenating the `Activity`'s `TraceId` and `SpanID`, or by using it's `ID`. All that concatenation causes a bunch of allocation, so instead introduce a simple `struct` to use as the key instead ## Implementation details Introduce `ActivityKey`, which is essentially `internal readonly record struct ActivityKey(string TraceId, string SpanId`, and use that for all of the dictionary lookups. Which avoids all the string concatenation allocations. ## Test coverage Added some unit tests for `ActivityKey`, by mostly covered by existing integration tests for correctness. Benchmarks show a significant improvement over [the previous results](#8036), particularly for Hierachical IDs which clearly were buggy | Method | Runtime | Mean | StdDev | Gen0 | Allocated | Compared to #8036 | | ------------------------------- | -------------------- | -------: | --------: | -----: | --------: | ----------------: | | StartStopWithChild_Hierarchical | .NET 6.0 | 4.217 us | 0.3227 us | 0.0153 | 4.09 KB | -6.38 KB | | StartStopWithChild_Hierarchical | .NET 8.0 | 3.413 us | 0.2505 us | 0.0076 | 4.09 KB | -6.39 KB | | StartStopWithChild_Hierarchical | .NET Core 3.1 | 5.676 us | 0.4636 us | 0.0153 | 4.32 KB | -6.43 KB | | StartStopWithChild_Hierarchical | .NET Framework 4.7.2 | 6.813 us | 0.4969 us | 0.7324 | 4.53 KB | -7.24 KB | | | | | | | | | | StartStopWithChild | .NET 6.0 | 4.105 us | 0.2677 us | 0.0153 | 4.3 KB | -0.57 KB | | StartStopWithChild | .NET 8.0 | 3.475 us | 0.1570 us | 0.0114 | 4.2 KB | -0.57 KB | | StartStopWithChild | .NET Core 3.1 | 5.647 us | 0.3129 us | 0.0153 | 4.48 KB | -0.57 KB | | StartStopWithChild | .NET Framework 4.7.2 | 6.842 us | 0.2992 us | 0.7629 | 4.69 KB | -0.61 KB | ## Other details https://datadoghq.atlassian.net/browse/LANGPLAT-915 Part of a stack working to improve OTel performance - #8036 - #8037 👈 - #8038 - #8039 - #8040 - #8041 - #8042
052d4a9 to
2e9e4cf
Compare
793b5c3 to
b6debcc
Compare
## Summary of changes Fix incorrect nullable annotations on `Activity` duck types ## Reason for change While working on other performance things, noticed that the nullable annotations often declared non-nullability when they actually could be null. A particularly confusing part are the `TraceId` and `SpanId` values in `IW3CActivity`. These were marked non-nullable because when you call `Activity.Start()` then these will always be non-null, but _only_ if you're using W3C IDs. If you're using hierarchical IDs ([the default in <.NET 5](https://learn.microsoft.com/en-us/dotnet/core/compatibility/core-libraries/5.0/default-activityidformat-changed)) then these values _will_ be null. As a side note, I suspect this explains the "we saw errors about these being null in error tracking but don't understand why" scenarios 😄 Also, I think we should rename `IW3CActivity` to `IActivity3` instead. W3C _implies_ that it's a W3C activity, but that's not necessarily the case, and is essentially the source of the above confusion I think. `IActivity3` would then be consistently named with `IActivity5` and `IActivity6` we also currently have. ## Implementation details Add nullable annotations to values that _can_ be null, and fix the fallout (mostly in `ActivityKey`) ## Test coverage Covered by existing tests sufficiently I think ## Other details https://datadoghq.atlassian.net/browse/LANGPLAT-915 Part of a stack working to improve OTel performance - #8036 - #8037 👈 - #8038 - #8039 - #8040 - #8041 - #8042
b6debcc to
a19980f
Compare
2e9e4cf to
ff96a9a
Compare
This comment has been minimized.
This comment has been minimized.
| ActivityKey? activityKey = null; | ||
|
|
||
| if (activity is IW3CActivity w3cActivity) | ||
| if (activity is IW3CActivity activity3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Level of care low, but I actually prefer w3cActivity here because activity3 could be taken to mean it's an IActivity3 reference (even though IActivity3 doesn't exist, but IActivity5 and IActivity6 do).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Level of care low, but I actually prefer w3cActivity here because activity3 could be taken to mean it's an IActivity3 reference
I have a pretty strong opinion on not using w3cActivity, because it comes with implicit assumptions (I know, because the code made them, as did I 😅)
even though IActivity3 doesn't exist, but IActivity5 and IActivity6 do)
Yeah I'm going to rename IW3CActitvity to IActivity3 later, when it's less conflicting 😉
| // No traceID, so much be Hierarchical ID | ||
| if (activity3.ParentSpanId is { } parentSpanId) | ||
| { | ||
| // This is a weird scenario - we're in a hierarchical ID, we don't have a trace ID, but we _do_ have a _parentSpanID?! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we care about activities with hierarchical IDs? AFAIK those are not compatible with OpenTelemetry. Could we just ignore them? (general question, not specific to this PR)
except special cases like the aspnetcore observer, if we still use that without DD_TRACE_OTEL_ENABLED.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh, I see the comment below
non-IW3CActivity, i.e. we're in .NET Core 2.x territory. Only have hierarchical IDs to worry about here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we just ignore them? (general question, not specific to this PR)
Great question, maybe we could 🤔 Big breaking change though...
tracer/src/Datadog.Trace/Activity/Handlers/ActivityHandlerCommon.cs
Outdated
Show resolved
Hide resolved
tracer/src/Datadog.Trace/Activity/Handlers/IgnoreActivityHandler.cs
Outdated
Show resolved
Hide resolved
## Summary of changes Avoids allocating a closure in .NET Core if we can avoid it ## Reason for change .NET Core's `ConcurrentDictionary.GetOrAdd()` method allows providing a "state" object which we can pass to the `GetOrAdd` method. Using this method avoids allocating a closure every time the method is hit, and we can pass the state using a value tuple to avoid additional allocation there ## Implementation details `#if`/`#else` to glory ## Test coverage Functionality is covered by existing tests, benchmarks show an incremental improvement over #8037 for .NET Core, as expected: | Method | Runtime | Mean | StdDev | Gen0 | Allocated | Compared to #8037 | | ------------------------------- | -------------------- | -------: | --------: | -----: | --------: | ----------------: | | StartStopWithChild_Hierarchical | .NET 6.0 | 3.746 us | 0.1128 us | 0.0114 | 3.81 KB | -0.28 KB | | StartStopWithChild_Hierarchical | .NET 8.0 | 2.759 us | 0.0374 us | 0.0114 | 3.81 KB | -0.28 KB | | StartStopWithChild_Hierarchical | .NET Core 3.1 | 4.762 us | 0.0584 us | 0.0153 | 4.04 KB | -0.28 KB | | StartStopWithChild_Hierarchical | .NET Framework 4.7.2 | 5.651 us | 0.0717 us | 0.7324 | 4.54 KB | 0.01 KB (noise) | | | | | | | | | | StartStopWithChild | .NET 6.0 | 3.607 us | 0.0508 us | 0.0153 | 4.02 KB | -0.28 KB | | StartStopWithChild | .NET 8.0 | 2.921 us | 0.0617 us | 0.0114 | 3.91 KB | -0.29 KB | | StartStopWithChild | .NET Core 3.1 | 4.922 us | 0.0407 us | 0.0153 | 4.2 KB | -0.28 KB | | StartStopWithChild | .NET Framework 4.7.2 | 6.008 us | 0.0979 us | 0.7629 | 4.69 KB | 0 | ## Other details https://datadoghq.atlassian.net/browse/LANGPLAT-915 Part of a stack working to improve OTel performance - #8036 - #8037 - #8038 - #8039 👈 - #8040 - #8041 - #8042
Execution-Time Benchmarks Report ⏱️Execution-time results for samples comparing This PR (8040) and master. ✅ No regressions detected - check the details below Full Metrics ComparisonFakeDbCommand
HttpMessageHandler
Comparison explanationExecution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph). Duration chartsFakeDbCommand (.NET Framework 4.8)gantt
title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8040) - mean (69ms) : 67, 70
master - mean (69ms) : 67, 70
section Bailout
This PR (8040) - mean (72ms) : 71, 74
master - mean (73ms) : 71, 74
section CallTarget+Inlining+NGEN
This PR (8040) - mean (1,009ms) : 969, 1049
master - mean (1,008ms) : 971, 1045
FakeDbCommand (.NET Core 3.1)gantt
title Execution time (ms) FakeDbCommand (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8040) - mean (106ms) : 104, 108
master - mean (106ms) : 103, 109
section Bailout
This PR (8040) - mean (107ms) : 106, 109
master - mean (107ms) : 106, 108
section CallTarget+Inlining+NGEN
This PR (8040) - mean (733ms) : 669, 797
master - mean (732ms) : 678, 785
FakeDbCommand (.NET 6)gantt
title Execution time (ms) FakeDbCommand (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8040) - mean (94ms) : 92, 96
master - mean (94ms) : 91, 96
section Bailout
This PR (8040) - mean (94ms) : 93, 95
master - mean (94ms) : 93, 95
section CallTarget+Inlining+NGEN
This PR (8040) - mean (712ms) : 675, 750
master - mean (713ms) : 679, 746
FakeDbCommand (.NET 8)gantt
title Execution time (ms) FakeDbCommand (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8040) - mean (92ms) : 90, 95
master - mean (92ms) : 90, 94
section Bailout
This PR (8040) - mean (93ms) : 92, 95
master - mean (94ms) : 92, 95
section CallTarget+Inlining+NGEN
This PR (8040) - mean (635ms) : 616, 655
master - mean (634ms) : 618, 650
HttpMessageHandler (.NET Framework 4.8)gantt
title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8040) - mean (193ms) : 190, 196
master - mean (193ms) : 189, 197
section Bailout
This PR (8040) - mean (196ms) : 193, 199
master - mean (196ms) : 194, 199
section CallTarget+Inlining+NGEN
This PR (8040) - mean (1,112ms) : 1068, 1157
master - mean (1,124ms) : 1054, 1195
HttpMessageHandler (.NET Core 3.1)gantt
title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8040) - mean (277ms) : 271, 283
master - mean (276ms) : 271, 281
section Bailout
This PR (8040) - mean (277ms) : 274, 281
master - mean (276ms) : 272, 281
section CallTarget+Inlining+NGEN
This PR (8040) - mean (925ms) : 871, 979
master - mean (923ms) : 863, 983
HttpMessageHandler (.NET 6)gantt
title Execution time (ms) HttpMessageHandler (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8040) - mean (271ms) : 265, 276
master - mean (272ms) : 265, 278
section Bailout
This PR (8040) - mean (270ms) : 267, 273
master - mean (270ms) : 266, 273
section CallTarget+Inlining+NGEN
This PR (8040) - mean (923ms) : 879, 968
master - mean (923ms) : 869, 976
HttpMessageHandler (.NET 8)gantt
title Execution time (ms) HttpMessageHandler (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8040) - mean (270ms) : 264, 276
master - mean (269ms) : 262, 276
section Bailout
This PR (8040) - mean (271ms) : 265, 277
master - mean (269ms) : 265, 272
section CallTarget+Inlining+NGEN
This PR (8040) - mean (823ms) : 803, 843
master - mean (822ms) : 804, 841
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
ff96a9a to
81a3391
Compare
## Summary of changes Reduce allocation of `ActivityHandlerCommon` by removing `string` concatenation ## Reason for change The `ActivityHandlerCommon.ActivityStarted` and `ActivityStopped` methods need to store and retrieve `Activity` instances from a `ConcurrentDictionary<>`. Today they're doing that be concatenating the `Activity`'s `TraceId` and `SpanID`, or by using it's `ID`. All that concatenation causes a bunch of allocation, so instead introduce a simple `struct` to use as the key instead ## Implementation details Introduce `ActivityKey`, which is essentially `internal readonly record struct ActivityKey(string TraceId, string SpanId`, and use that for all of the dictionary lookups. Which avoids all the string concatenation allocations. ## Test coverage Added some unit tests for `ActivityKey`, by mostly covered by existing integration tests for correctness. Benchmarks show a significant improvement over [the previous results](#8036), particularly for Hierachical IDs which clearly were buggy | Method | Runtime | Mean | StdDev | Gen0 | Allocated | Compared to #8036 | | ------------------------------- | -------------------- | -------: | --------: | -----: | --------: | ----------------: | | StartStopWithChild_Hierarchical | .NET 6.0 | 4.217 us | 0.3227 us | 0.0153 | 4.09 KB | -6.38 KB | | StartStopWithChild_Hierarchical | .NET 8.0 | 3.413 us | 0.2505 us | 0.0076 | 4.09 KB | -6.39 KB | | StartStopWithChild_Hierarchical | .NET Core 3.1 | 5.676 us | 0.4636 us | 0.0153 | 4.32 KB | -6.43 KB | | StartStopWithChild_Hierarchical | .NET Framework 4.7.2 | 6.813 us | 0.4969 us | 0.7324 | 4.53 KB | -7.24 KB | | | | | | | | | | StartStopWithChild | .NET 6.0 | 4.105 us | 0.2677 us | 0.0153 | 4.3 KB | -0.57 KB | | StartStopWithChild | .NET 8.0 | 3.475 us | 0.1570 us | 0.0114 | 4.2 KB | -0.57 KB | | StartStopWithChild | .NET Core 3.1 | 5.647 us | 0.3129 us | 0.0153 | 4.48 KB | -0.57 KB | | StartStopWithChild | .NET Framework 4.7.2 | 6.842 us | 0.2992 us | 0.7629 | 4.69 KB | -0.61 KB | ## Other details https://datadoghq.atlassian.net/browse/LANGPLAT-915 Part of a stack working to improve OTel performance - #8036 - #8037 👈 - #8038 - #8039 - #8040 - #8041 - #8042
81a3391 to
0996918
Compare
| Method | Mean | Error | StdDev | Allocated | |----------------- |----------:|----------:|----------:|----------:| | ArrayBasedUpdate | 30.029 us | 0.1795 us | 0.1402 us | - | | PairBasedUpdate | 2.815 us | 0.0232 us | 0.0206 us | - |
- Split ParentSpanId access - Refactor AcitivityHandlerCommon to explicitly split W3C vs hierarchical IDs - Add handling for < .NET Core 3 activities
…to be grappend from in-process parent
0996918 to
982d0dc
Compare
bouwkast
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 lot easier to follow the various logic with the refactorings/renaming/comments thanks!
zacharycmontoya
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the refactoring and fixes!!
## Summary of changes Update code that enumerates properties of `IActivity` duck types to use allocation-free enumeration ## Reason for change Our duck types all use the public APIs of `Activity` to grab tags, events, and links. These types are generally implemented internally as custom linked list objects, but are exposed as `IEnumerable<>` types. The internal types implement a `struct`-based enumerator, but as the types aren't exposed directly, the compiler can't use those and ends up allocating. That's... annoying. To work around that, this adds a "hack" that OTel were [actually using themselves](https://github.com/open-telemetry/opentelemetry-dotnet/blob/73bff75ef653f81fe6877299435b21131be36dc0/src/OpenTelemetry/Internal/EnumerationHelper.cs#L58) (and [which has been discussed elsewhere](https://www.macrosssoftware.com/2020/07/13/enumerator-performance-surprises/)), of building a dynamic method that finds the `GetEnumerator` type and uses that. We can't use it everywhere because we don't have the types available (e.g. links and events), but we _can_ use it for tags. There's essentially a trade-off of startup time vs runtime allocation, which is worth it in this case IMO. Additionally, the `Activity` implementation uses an empty array as the return value when there _are_ no tags/links/events, so we can avoid the enumerator allocation in those cases by specifically looking for an empty array and bailing if so. ## Implementation details I chose to add the `EnumerationHelper` as an explicitly `Activity`-related thing for now, but we can consider exposing this more widely if we get confidence. I also avoided the concurrent dictionary approach that is the "default" for this, and instead added an additional helper to read a static field that we populate once. We have to wait until we have a "real" activity to do the population, so that we get the right type, which is a bit annoying, but it works... ## Test coverage Added a couple of unit tests for the result and confirmed it works as expected. Benchmarks vs #8040 shw a nice allocation improvement: | Method | Runtime | Mean | StdDev | Gen0 | Gen1 | Allocated | Compared to #8040 | | ------------------------------- | -------------------- | ---------: | --------: | -----: | -----: | --------: | ----------------: | | StartStopWithChild_Hierarchical | .NET 6.0 | 3,248.6 ns | 28.71 ns | 0.0114 | 0.0038 | 3.6 KB | -0.19 KB | | StartStopWithChild_Hierarchical | .NET 8.0 | 2,651.1 ns | 80.80 ns | 0.0114 | - | 3.6 KB | -0.19 KB | | StartStopWithChild_Hierarchical | .NET Core 3.1 | 4,482.2 ns | 263.51 ns | 0.0153 | - | 3.83 KB | -0.19 KB | | StartStopWithChild_Hierarchical | .NET Framework 4.7.2 | 4,915.8 ns | 63.81 ns | 0.7019 | - | 4.33 KB | -0.18 KB | | StartStopWithChild | .NET 6.0 | 3,176.8 ns | 37.73 ns | 0.0114 | - | 3.59 KB | -0.19 KB | | StartStopWithChild | .NET 8.0 | 2,598.0 ns | 40.05 ns | 0.0114 | - | 3.59 KB | -0.19 KB | | StartStopWithChild | .NET Core 3.1 | 4,260.7 ns | 59.98 ns | 0.0076 | - | 3.78 KB | -0.19 KB | | StartStopWithChild | .NET Framework 4.7.2 | 5,097.7 ns | 46.98 ns | 0.6790 | - | 4.2 KB | -0.19 KB | ## Other details https://datadoghq.atlassian.net/browse/LANGPLAT-915 Part of a stack working to improve OTel performance - #8036 - #8037 - #8038 - #8039 - #8040 - #8041 👈 - #8042 --------- Co-authored-by: Lucas Pimentel <lucas.pimentel@datadoghq.com>
…8042) ## Summary of changes Improve performance around the "population" of tags from an `Activity` into a `Span` ## Reason for change Currently we do a _lot_ of allocation in the `OtlpHelpers.AgentConvertSpan` method. The `OpenTelemetryTags` object also has a lot of properties on it that we never set, and are only used for _reading_ tags in the `OperationMapper`, even though many may never even be read. This increases the size of the tags object. This PR aims to do various optimizations around the "close activity" paths: - Reduce the size of the `OpenTelemetryTags` object by removing properties only used by `OperationMapper` - Add properties to `OpenTelemetryTags` for values that we explictly set in `OtlpHelpers.AgentConvertSpan` ## Implementation details - Currently we're always creating the `OpenTelemetryTags`, even though we _may_ throw it away if the activity is ignored, so that's an easy win. - I think we can assume that the tags object passed to `AgentConvertSpan` is only ever `OpenTelemetryTags` based on the call sites, so we can interact with the tags object directly where posisble. - "Simplify" some of the code paths by avoiding `span.SetTag` and favouring `tags.SetTag` when we know it's not a "special" tag. - Inline a few method calls in places where they will often not be called, at the expense of some clarity - Refactor `OperationMapper` to use the `GetTag()` API, and cache the values at various points. The resulting code is uglier, but overall means we reduce ~100 bytes off every span, so I think it's worth it ## Test coverage Functionality is covered by existing, this is just a refactoring, and there's good tests for `OperationMapper` currently. Benchmarks vs #8041 show a nice allocation improvement, at the expense of slower execution (this was local though, so not 100% on that, will see what the CI results show). If there is a significant slow down, I'll try to isolate it, but I think the allocation improvments are probably worth it either way | Method | Runtime | Mean | StdDev | Gen0 | Allocated | Compared to #8041 | | ------------------------------- | -------------------- | -------: | --------: | -----: | --------: | ----------------: | | StartStopWithChild_Hierarchical | .NET 6.0 | 4.438 us | 0.7347 us | 0.0076 | 3.2 KB | -0.40 KB | | StartStopWithChild_Hierarchical | .NET 8.0 | 3.216 us | 0.4260 us | 0.0076 | 3.2 KB | -0.40 KB | | StartStopWithChild_Hierarchical | .NET Core 3.1 | 5.293 us | 0.8316 us | 0.0076 | 3.42 KB | -0.41 KB | | StartStopWithChild_Hierarchical | .NET Framework 4.7.2 | 5.735 us | 0.5592 us | 0.6332 | 3.92 KB | -0.41 KB | | | | | | | | | | StartStopWithChild | .NET 6.0 | 3.848 us | 0.5325 us | 0.0076 | 3.19 KB | -0.40 KB | | StartStopWithChild | .NET 8.0 | 3.080 us | 0.2698 us | 0.0076 | 3.19 KB | -0.40 KB | | StartStopWithChild | .NET Core 3.1 | 5.094 us | 0.4982 us | 0.0076 | 3.38 KB | -0.40 KB | | StartStopWithChild | .NET Framework 4.7.2 | 6.172 us | 0.5948 us | 0.6104 | 3.79 KB | -0.41 KB | ## Other details Added a couple of extra tests that were missing from the previous PR in the stack too https://datadoghq.atlassian.net/browse/LANGPLAT-915 Part of a stack working to improve OTel performance - #8036 - #8037 - #8038 - #8039 - #8040 - #8041 - #8042 👈
Summary of changes
Various optimizations in the
Activityhandling code to reduce allocations and execution timeReason for change
Some properties are triksy, as they do a bunch of allocation, so we should avoid them if we can. The changes in here look bigger than they are diff-wise, I've added comments to aid review.
Implementation details
ParentIduntil we definitely need it.ParentIduntil we're sure we don't have something better alreadyTracer.Instance.ActiveScope?.Spanuntil we know we need it (very minor optimisations but why not 🤷♂️)StopActivitySlowto a separate method, as we shouldn't hit this now, so should help the JIT out with things like code size etc of the calling method (conjecture, not tested, but I think it's better from code understanding PoV too)ShouldIgnoreByOperationNamewhich also improves execution time 10x from ~30us to ~3us.Test coverage
Functionally covered by existing tests. Benchmarks compared to #8039 show improvments. Allocations listed below, but execution time is also improved:
Other details
https://datadoghq.atlassian.net/browse/LANGPLAT-915
Part of a stack working to improve OTel performance
Activitybenchmarks for stability #8036ActivityKeyto remove string concatenation #8037Activityduck types #8038ActivityMappingById.GetOrAdd#8039Activityobjects #8041OpenTelemetryTagspopulation to reduce allocation and work #8042