Improve performance for UriHelpers.CleanUri#8199
Conversation
Execution-Time Benchmarks Report ⏱️Execution-time results for samples comparing This PR (8199) and master. ✅ No regressions detected - check the details below Full Metrics ComparisonFakeDbCommand
HttpMessageHandler
Comparison explanationExecution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph). Duration chartsFakeDbCommand (.NET Framework 4.8)gantt
title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8199) - mean (69ms) : 67, 71
master - mean (69ms) : 67, 71
section Bailout
This PR (8199) - mean (73ms) : 72, 74
master - mean (73ms) : 72, 75
section CallTarget+Inlining+NGEN
This PR (8199) - mean (1,052ms) : 973, 1131
master - mean (1,043ms) : 1000, 1086
FakeDbCommand (.NET Core 3.1)gantt
title Execution time (ms) FakeDbCommand (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8199) - mean (116ms) : 113, 119
master - mean (116ms) : 113, 120
section Bailout
This PR (8199) - mean (117ms) : 114, 119
master - mean (117ms) : 115, 119
section CallTarget+Inlining+NGEN
This PR (8199) - mean (773ms) : 712, 834
master - mean (777ms) : 725, 828
FakeDbCommand (.NET 6)gantt
title Execution time (ms) FakeDbCommand (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8199) - mean (102ms) : 98, 105
master - mean (102ms) : 99, 105
section Bailout
This PR (8199) - mean (103ms) : 101, 104
master - mean (103ms) : 101, 104
section CallTarget+Inlining+NGEN
This PR (8199) - mean (761ms) : 739, 783
master - mean (763ms) : 737, 790
FakeDbCommand (.NET 8)gantt
title Execution time (ms) FakeDbCommand (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8199) - mean (94ms) : 91, 96
master - mean (94ms) : 91, 97
section Bailout
This PR (8199) - mean (95ms) : 93, 97
master - mean (95ms) : 93, 98
section CallTarget+Inlining+NGEN
This PR (8199) - mean (640ms) : 626, 654
master - mean (635ms) : 622, 648
HttpMessageHandler (.NET Framework 4.8)gantt
title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8199) - mean (213ms) : 205, 222
master - mean (213ms) : 205, 221
section Bailout
This PR (8199) - mean (218ms) : 211, 224
master - mean (219ms) : 214, 225
section CallTarget+Inlining+NGEN
This PR (8199) - mean (1,225ms) : 1170, 1281
master - mean (1,223ms) : 1172, 1274
HttpMessageHandler (.NET Core 3.1)gantt
title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8199) - mean (321ms) : 307, 335
master - mean (326ms) : 315, 337
section Bailout
This PR (8199) - mean (323ms) : 311, 335
master - mean (326ms) : 316, 336
section CallTarget+Inlining+NGEN
This PR (8199) - mean (1,051ms) : 1000, 1103
master - mean (1,069ms) : 1016, 1121
HttpMessageHandler (.NET 6)gantt
title Execution time (ms) HttpMessageHandler (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8199) - mean (299ms) : 287, 310
master - mean (301ms) : 292, 310
section Bailout
This PR (8199) - mean (303ms) : 288, 318
master - mean (304ms) : 291, 316
section CallTarget+Inlining+NGEN
This PR (8199) - mean (992ms) : 936, 1047
master - mean (1,001ms) : 930, 1071
HttpMessageHandler (.NET 8)gantt
title Execution time (ms) HttpMessageHandler (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8199) - mean (311ms) : 299, 322
master - mean (316ms) : 303, 328
section Bailout
This PR (8199) - mean (310ms) : 301, 320
master - mean (317ms) : 305, 329
section CallTarget+Inlining+NGEN
This PR (8199) - mean (1,007ms) : 902, 1113
master - mean (1,030ms) : 939, 1120
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||
| private static string CleanUriAndRemoveIds(Uri uri, bool removeScheme) | ||
| { | ||
| var sb = StringBuilderCache.Acquire(); |
There was a problem hiding this comment.
NIT: should we create this with a size estimation line in line 66? Something like var sb = StringBuilderCache.Acquire(uri.AbsolutePath.Length + 64);
There was a problem hiding this comment.
tbh, I don't know. We had started to not bother in various place, and just to always allocate the Max size, so we basically always end up creating one of the max size anyway, and creating a smaller one isn't actually a benefit if we're just going to allocate a bigger one later 😄
NachoEchevarria
left a comment
There was a problem hiding this comment.
Very nice! Thanks!
…r of any edge cases
…uired | Method | Job | Mean | Allocated | Alloc Ratio | | --------------------------- | ------------------ | --------: | --------: | ----------: | | CleanUri_Original | .NET 10.0 | 98.36 ns | 168 B | 0.21 | | CleanUri_Updated | .NET 10.0 | 75.24 ns | 120 B | 0.15 | | CleanUri_Original | .NET 6.0 | 126.13 ns | 168 B | 0.21 | | CleanUri_Updated | .NET 6.0 | 120.56 ns | 120 B | 0.15 | | CleanUri_Original | .NET Core 2.1 | 217.51 ns | 800 B | 1.00 | | CleanUri_Updated | .NET Core 2.1 | 195.59 ns | 752 B | 0.94 | | CleanUri_Original | .NET Core 3.1 | 247.55 ns | 792 B | 0.99 | | CleanUri_Updated | .NET Core 3.1 | 229.05 ns | 744 B | 0.93 | | CleanUri_Original | .NET Framework 4.8 | 186.70 ns | 802 B | 1.00 | | CleanUri_Updated | .NET Framework 4.8 | 205.19 ns | 754 B | 0.94 | | | | | | | | CleanUri_IgnoreIds_Original | .NET 10.0 | 55.58 ns | 128 B | 0.15 | | CleanUri_IgnoreIds_Updated | .NET 10.0 | 53.36 ns | 128 B | 0.15 | | CleanUri_IgnoreIds_Original | .NET 6.0 | 68.13 ns | 128 B | 0.15 | | CleanUri_IgnoreIds_Updated | .NET 6.0 | 73.44 ns | 128 B | 0.15 | | CleanUri_IgnoreIds_Original | .NET Core 2.1 | 150.71 ns | 856 B | 1.00 | | CleanUri_IgnoreIds_Updated | .NET Core 2.1 | 150.08 ns | 856 B | 1.00 | | CleanUri_IgnoreIds_Original | .NET Core 3.1 | 163.51 ns | 848 B | 0.99 | | CleanUri_IgnoreIds_Updated | .NET Core 3.1 | 172.51 ns | 848 B | 0.99 | | CleanUri_IgnoreIds_Original | .NET Framework 4.8 | 106.02 ns | 859 B | 1.00 | | CleanUri_IgnoreIds_Updated | .NET Framework 4.8 | 106.11 ns | 859 B | 1.00 | | | | | | | | CleanUri_WithIds_Original | .NET 10.0 | 104.45 ns | 168 B | 0.18 | | CleanUri_WithIds_Updated | .NET 10.0 | 80.02 ns | 115 B | 0.13 | | CleanUri_WithIds_Original | .NET 6.0 | 142.08 ns | 168 B | 0.18 | | CleanUri_WithIds_Updated | .NET 6.0 | 132.24 ns | 120 B | 0.13 | | CleanUri_WithIds_Original | .NET Core 2.1 | 234.61 ns | 912 B | 1.00 | | CleanUri_WithIds_Updated | .NET Core 2.1 | 219.09 ns | 856 B | 0.94 | | CleanUri_WithIds_Original | .NET Core 3.1 | 266.88 ns | 888 B | 0.97 | | CleanUri_WithIds_Updated | .NET Core 3.1 | 246.59 ns | 840 B | 0.92 | | CleanUri_WithIds_Original | .NET Framework 4.8 | 194.05 ns | 915 B | 1.00 | | CleanUri_WithIds_Updated | .NET Framework 4.8 | 202.65 ns | 859 B | 0.94 |
26766b1 to
0dc0106
Compare
BenchmarksBenchmark execution time: 2026-02-16 18:18:17 Comparing candidate commit 0dc0106 in PR branch Found 11 performance improvements and 6 performance regressions! Performance is the same for 154 metrics, 21 unstable metrics. scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody net6.0
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody net6.0
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs net6.0
scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net472
scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice netcoreapp3.1
scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool net6.0
scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearch netcoreapp3.1
scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync netcoreapp3.1
scenario:Benchmarks.Trace.ILoggerBenchmark.EnrichedLog net6.0
scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog netcoreapp3.1
scenario:Benchmarks.Trace.RedisBenchmark.SendReceive net6.0
scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan net6.0
scenario:Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin net6.0
|
| // the flag changed in .NET 10 | ||
| private static readonly ulong DisablePathAndQueryCanonicalizationFlag | ||
| = FrameworkDescription.Instance.RuntimeVersion.Major >= 10 | ||
| ? 1UL << 55 | ||
| : 0x200000000000; |
There was a problem hiding this comment.
Is there anything we can do to preemptively workaround this if it changes again in future version of .NET?
Like assuming it was created with DangerousDisablePathAndQueryCanonicalizationFlag and take the safe route if RuntimeVersion.Major > 10 (until we add support for whatever the new value is)
There was a problem hiding this comment.
Is there anything we can do to preemptively workaround this if it changes again in future version of .NET?
Well, the good news is that our tests would crash if they change it in future versions (both the unit tests here and the integration tests) 😄
Like assuming it was created with DangerousDisablePathAndQueryCanonicalizationFlag and take the safe route if RuntimeVersion.Major > 10 (until we add support for whatever the new value is)
I'd generally rather we didn't do that, as I would be concerned about the performance regression going unnoticed...
|
|
||
| #if NET6_0_OR_GREATER | ||
| [DuckCopy] | ||
| internal struct UriStruct |
There was a problem hiding this comment.
You're adding the same duck type in #8203 at tracer/src/Datadog.Trace/Util/Http/HttpRequestUtils.cs.
Is that intentional?
There was a problem hiding this comment.
Yeah, I just didn't want to stack the PRs, I wasn't sure anyway would notice, and was going to clean it up after 😉
There was a problem hiding this comment.
Pull request overview
This PR improves the performance of UriHelpers.CleanUri() by reducing memory allocations. The primary optimization is to use a single GetComponents() call to retrieve URI components at once instead of accessing individual properties, which trigger separate internal parsing operations. Additionally, when IDs need to be removed, the code now uses a single StringBuilder instance throughout the entire operation instead of partially building strings and then allocating another StringBuilder.
The PR handles a complex edge case in .NET 6+ where URIs created with DangerousDisablePathAndQueryCanonicalization flag cannot use GetComponents() with path components without throwing exceptions. This is detected using duck typing to inspect internal URI flags.
Changes:
- Refactored
CleanUri()to delegate to specialized methods for performance - Added extensive characterization tests for existing behavior
- Implemented duck typing detection for .NET 6+ dangerous URI flag
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tracer/src/Datadog.Trace/Util/UriHelpers.cs | Refactored main implementation to use GetComponents(), added specialized methods, and added .NET 6+ dangerous flag detection |
| tracer/test/Datadog.Trace.Tests/Util/UriHelpersTests.cs | Added comprehensive test coverage for IsIdentifierSegment and CleanUri methods with 180+ test cases |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
## Summary of changes A few minor improvements to the standard `AspNetCoreDiagnosticObserver` ## Reason for change Looking into obvious perf improvements for ASP.NET Core, but only a few minor things stood out (apart from related PRs like #8199 and #8203). ## Implementation details - Reduce size of MVC tags object by not deriving from `WebTags` (we never set those tags anyway) - Delay creating spanlinks collection if we don't need it - HttpRoute always matches AspNetCoreRoute, so can make it readonly ## Test coverage Covered by existing tests, benchmarks show (tiny) allocation gains ## Other details Relates to https://datadoghq.atlassian.net/browse/LANGPLAT-842 Related PRs: - #8167 - #8170 - #8180 - #8196 - #8199 - #8203
Summary of changes
Reduce allocations for
UriHelpers.CleanUri()that's called as part of HTTP requests (among others)Reason for change
The allocations showed up in some profiling so decided to dig in. And Oh Boy, this stuff is terrible on < .NET 6 😅 And it's still terrible after this PR, just less terrible 😅
Implementation details
Two main improvements:
StringBuilderto format theUrianyway (because we're stripping identifiers), then use one for the wholestring, instead of partially building the string, and then doing another allocation.GetComponents(), specifying all the components we need, instead of accessing each of theUriproperties ad-hoc. It turns out these properties are actually very expensive, because they trigger a bunch of analysis. Doing it one-shot doesn't avoid that, but it still gives some improvements.Test coverage
I wrote a "characterisation" unit test, seeing as we didn't have any, to describe the current behaviour, to ensure I didn't change anything. Benchmarking was done (as shown below), comparing the original implementation with the new implementation, for three different scenarios:
CleanUri:CleanUri_WithIds:CleanUri_IgnoreIds:The benchmark results heavily depend on the
Uriprovided, so it's only the relative changes that are interesting here, comparing original to updated.Benchmarking code
Original benchmarks
EDIT: I realised that the allocations here aren't quite right, because this is calling the method on the same
Uriinstance in each iteration. AsUriuses heavy caching, that distorts the allocations a lot. A more accurate test is to use a newUrifor each iteration, as the following benchmarks do. The allocation is more because we're both allocating theUriand doing the stripping, and the benchmark doesn't make use of caching, but thankfully we're still better in all cases.EDIT 2: Shortly after submitting this PR, I discovered DangerousDisablePathAndQueryCanonicalization and tl;dr; that's a nightmare 😅 It's only in .NET 6+, but they don't expose whether it was set publicly, so had to do some gnarly stuff with duck types. I'm not proud, but the benchmarks still seem worth pursuing it to me. This is the same issue that plagues #8203
Other details
Spotted while working on https://datadoghq.atlassian.net/browse/LANGPLAT-842