[Code Origin] DEBUG-5199 Reduce per-span allocations and locks#8197

Open
dudikeleti wants to merge 13 commits into master from dudik/co/optimizations
Conversation

@dudikeleti
Contributor

@dudikeleti dudikeleti commented Feb 12, 2026

Summary of changes

  • Add code origin tags for entry spans (frame 0 only) to WebTags to avoid allocating the underlying tag list for common server entry spans.
  • Reduce per-span allocations when adding code-origin tags for entry spans (_dd.code_origin.*) by batching tag appends in TagsList.
  • Add an internal TagsList.BeginTagBatch() / TagBatch.SetTag() helper to reserve capacity, take the tag-list lock once, and set multiple tags with the same override semantics as SetTag.
  • Cache PDB-derived code origin values (file URL + stringified line/column) during per-assembly cache population (stored per endpoint token) to avoid per-span ToString() allocations.

Reason for change

  • Code-origin tags are added on a very hot path and were introducing avoidable allocations and overhead (line/column ToString() and repeated tag container growth/locking).

Implementation details

  • Entry span optimization: code origin for entry spans is represented as WebTags properties for the four base tags (code-origin type, frame index, frame method, frame type). This is explicitly not a valid representation for exit spans, which can include multiple frames and dynamic keys.
  • Batch tag add fast-path: TagBatch preserves TagsList.SetTag() semantics. Callers must keep the critical section small and avoid slow/allocating work while holding the lock.
  • PDB caching: line/column strings are computed once during per-assembly cache population (per endpoint token).
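
To illustrate the PDB caching bullet, here is a minimal sketch of stringifying line and column once at cache-population time and reusing the same string instances for every span. The type names (CodeOriginInfo, CodeOriginCache) and the int token key are hypothetical and chosen for illustration only; they are not the tracer's actual types.

```csharp
using System.Collections.Concurrent;
using System.Globalization;

// Hypothetical holder for the values that end up in code-origin tags.
public sealed class CodeOriginInfo
{
    public CodeOriginInfo(string fileUrl, int line, int column)
    {
        FileUrl = fileUrl;
        // Stringify once here, instead of calling ToString() for every span.
        Line = line.ToString(CultureInfo.InvariantCulture);
        Column = column.ToString(CultureInfo.InvariantCulture);
    }

    public string FileUrl { get; }
    public string Line { get; }
    public string Column { get; }
}

// Hypothetical cache keyed by a method/endpoint token (assumed to be an int here).
public sealed class CodeOriginCache
{
    private readonly ConcurrentDictionary<int, CodeOriginInfo> _byToken =
        new ConcurrentDictionary<int, CodeOriginInfo>();

    // Called while the per-assembly cache is being populated.
    public void Populate(int token, string fileUrl, int line, int column)
    {
        _byToken[token] = new CodeOriginInfo(fileUrl, line, column);
    }

    // Called on the hot path; returns the pre-built strings without formatting anything.
    public bool TryGet(int token, out CodeOriginInfo info)
    {
        return _byToken.TryGetValue(token, out info);
    }
}
```

With this shape, repeated lookups for the same token return the same string instances, so the per-span cost is a dictionary lookup rather than two integer-to-string conversions.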

Test coverage

SpanCodeOriginTests
EndpointDetectorTests

@dudikeleti dudikeleti requested review from a team as code owners February 12, 2026 13:00
@dudikeleti dudikeleti requested a review from Copilot February 12, 2026 13:00
@github-actions github-actions bot added the area:tracer The core tracer library (Datadog.Trace, does not include OpenTracing, native code, or integrations) label Feb 12, 2026
@dudikeleti dudikeleti added type:enhancement Improvement to an existing feature type:performance Performance, speed, latency, resource usage (CPU, memory) area:debugger and removed area:tracer The core tracer library (Datadog.Trace, does not include OpenTracing, native code, or integrations) labels Feb 12, 2026
@dudikeleti dudikeleti force-pushed the dudik/co/optimizations branch from 8411be8 to 9b59bc2 Compare February 12, 2026 13:02
Contributor

Copilot AI left a comment

Pull request overview

Optimize code-origin tag emission on hot paths by batching tag writes and caching PDB-derived strings to reduce allocations and lock contention.

Changes:

  • Add TagsList.BeginTagBatch() / TagBatch to reserve capacity, take the tag lock once, and append multiple tags.
  • Update code-origin tagging to use the batch append path when span.Tags is a TagsList.
  • Cache PDB sequence point URL/line/column strings during per-assembly cache construction to avoid per-span ToString() allocations.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
tracer/src/Datadog.Trace/Tagging/TagsList.cs Adds a lock-holding batch helper to append multiple tags with fewer allocations/locks.
tracer/src/Datadog.Trace/Debugger/SpanCodeOrigin/SpanCodeOrigin.cs Uses batched tag appends for code-origin tags and caches formatted PDB values.

@github-actions github-actions bot added the area:tracer The core tracer library (Datadog.Trace, does not include OpenTracing, native code, or integrations) label Feb 12, 2026
@dudikeleti dudikeleti removed type:enhancement Improvement to an existing feature area:tracer The core tracer library (Datadog.Trace, does not include OpenTracing, native code, or integrations) labels Feb 12, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8411be87b3

@github-actions github-actions bot added the area:tracer The core tracer library (Datadog.Trace, does not include OpenTracing, native code, or integrations) label Feb 12, 2026
@pr-commenter

pr-commenter bot commented Feb 12, 2026

Benchmarks

Benchmark execution time: 2026-02-16 11:49:16

Comparing candidate commit 137fb83 in PR branch dudik/co/optimizations with baseline commit 05bb018 in branch master.

Found 10 performance improvements and 10 performance regressions! Performance is the same for 160 metrics, 12 unstable metrics.

scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild net472

  • 🟥 throughput [-6045.390op/s; -5651.851op/s] or [-7.036%; -6.578%]

scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild net6.0

  • 🟥 throughput [-9754.832op/s; -7622.854op/s] or [-7.814%; -6.106%]

scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild netcoreapp3.1

  • 🟥 throughput [-6733.593op/s; -5440.853op/s] or [-7.097%; -5.734%]

scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces net6.0

  • 🟩 execution_time [-79.862ms; -79.438ms] or [-39.370%; -39.161%]

scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest net6.0

  • 🟥 execution_time [+103.276ms; +104.375ms] or [+107.288%; +108.429%]

scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest netcoreapp3.1

  • 🟩 throughput [+532.203op/s; +1455.590op/s] or [+5.673%; +15.515%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net472

  • 🟩 execution_time [-35.857ms; -31.767ms] or [-15.441%; -13.679%]
  • 🟩 throughput [+85.593op/s; +105.207op/s] or [+8.280%; +10.178%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net6.0

  • 🟥 execution_time [+22.486ms; +29.060ms] or [+11.533%; +14.904%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1

  • 🟥 throughput [-176.766op/s; -118.536op/s] or [-11.339%; -7.604%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice net472

  • 🟩 execution_time [-152.372µs; -147.108µs] or [-7.358%; -7.103%]
  • 🟩 throughput [+36.956op/s; +38.315op/s] or [+7.653%; +7.935%]

scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync net6.0

  • 🟥 execution_time [+11.681ms; +15.928ms] or [+6.097%; +8.314%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark netcoreapp3.1

  • 🟥 throughput [-404.811op/s; -225.195op/s] or [-18.187%; -10.117%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark net6.0

  • 🟩 execution_time [-7.348µs; -2.737µs] or [-15.536%; -5.787%]
  • 🟩 throughput [+1150.622op/s; +3169.043op/s] or [+5.320%; +14.653%]

scenario:Benchmarks.Trace.NLogBenchmark.EnrichedLog net472

  • 🟩 throughput [+7232.561op/s; +8031.381op/s] or [+5.616%; +6.237%]

scenario:Benchmarks.Trace.RedisBenchmark.SendReceive netcoreapp3.1

  • 🟥 execution_time [+10.579ms; +15.083ms] or [+5.360%; +7.642%]

scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore net6.0

  • 🟩 execution_time [-82.053ms; -65.411ms] or [-46.812%; -37.317%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan netcoreapp3.1

  • 🟥 execution_time [+16.980ms; +17.696ms] or [+8.529%; +8.889%]

@dd-trace-dotnet-ci-bot

dd-trace-dotnet-ci-bot bot commented Feb 12, 2026

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing This PR (8197) and master.

✅ No regressions detected - check the details below

Full Metrics Comparison

FakeDbCommand

Metric | Master (Mean ± 95% CI) | Current (Mean ± 95% CI) | Change | Status
.NET Framework 4.8 - Baseline
duration | 69.50 ± (69.43 - 69.70) ms | 69.19 ± (69.20 - 69.41) ms | -0.4% |
.NET Framework 4.8 - Bailout
duration | 73.77 ± (73.57 - 73.85) ms | 73.18 ± (73.18 - 73.48) ms | -0.8% |
.NET Framework 4.8 - CallTarget+Inlining+NGEN
duration | 1033.87 ± (1037.50 - 1045.02) ms | 1044.87 ± (1053.60 - 1065.39) ms | +1.1% | ✅⬆️
.NET Core 3.1 - Baseline
process.internal_duration_ms | 22.40 ± (22.36 - 22.43) ms | 22.27 ± (22.23 - 22.30) ms | -0.6% |
process.time_to_main_ms | 87.14 ± (86.98 - 87.30) ms | 86.78 ± (86.64 - 86.93) ms | -0.4% |
runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% |
runtime.dotnet.mem.committed | 15.47 ± (15.46 - 15.47) MB | 15.48 ± (15.47 - 15.48) MB | +0.1% | ✅⬆️
runtime.dotnet.threads.count | 12 ± (12 - 12) | 12 ± (12 - 12) | +0.0% |
.NET Core 3.1 - Bailout
process.internal_duration_ms | 22.27 ± (22.24 - 22.30) ms | 22.34 ± (22.32 - 22.36) ms | +0.3% | ✅⬆️
process.time_to_main_ms | 88.59 ± (88.45 - 88.73) ms | 88.30 ± (88.16 - 88.44) ms | -0.3% |
runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% |
runtime.dotnet.mem.committed | 15.52 ± (15.51 - 15.52) MB | 15.51 ± (15.51 - 15.51) MB | -0.0% |
runtime.dotnet.threads.count | 13 ± (13 - 13) | 13 ± (13 - 13) | +0.0% |
.NET Core 3.1 - CallTarget+Inlining+NGEN
process.internal_duration_ms | 255.74 ± (252.50 - 258.99) ms | 263.52 ± (261.24 - 265.81) ms | +3.0% | ✅⬆️
process.time_to_main_ms | 494.93 ± (494.40 - 495.47) ms | 493.14 ± (492.70 - 493.59) ms | -0.4% |
runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% |
runtime.dotnet.mem.committed | 52.18 ± (52.16 - 52.20) MB | 52.16 ± (52.13 - 52.19) MB | -0.0% |
runtime.dotnet.threads.count | 28 ± (28 - 28) | 28 ± (28 - 28) | +0.0% |
.NET 6 - Baseline
process.internal_duration_ms | 21.18 ± (21.15 - 21.21) ms | 20.99 ± (20.96 - 21.01) ms | -0.9% |
process.time_to_main_ms | 75.48 ± (75.35 - 75.61) ms | 75.21 ± (75.04 - 75.38) ms | -0.4% |
runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% |
runtime.dotnet.mem.committed | 15.18 ± (15.18 - 15.18) MB | 15.18 ± (15.18 - 15.18) MB | +0.0% | ✅⬆️
runtime.dotnet.threads.count | 10 ± (10 - 10) | 10 ± (10 - 10) | +0.0% |
.NET 6 - Bailout
process.internal_duration_ms | 21.15 ± (21.13 - 21.17) ms | 21.01 ± (20.99 - 21.04) ms | -0.6% |
process.time_to_main_ms | 76.58 ± (76.47 - 76.69) ms | 76.69 ± (76.56 - 76.83) ms | +0.1% | ✅⬆️
runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% |
runtime.dotnet.mem.committed | 15.30 ± (15.29 - 15.30) MB | 15.29 ± (15.29 - 15.30) MB | -0.0% |
runtime.dotnet.threads.count | 11 ± (11 - 11) | 11 ± (11 - 11) | +0.0% |
.NET 6 - CallTarget+Inlining+NGEN
process.internal_duration_ms | 258.07 ± (257.29 - 258.85) ms | 256.68 ± (255.81 - 257.55) ms | -0.5% |
process.time_to_main_ms | 473.26 ± (472.71 - 473.82) ms | 470.53 ± (469.97 - 471.09) ms | -0.6% |
runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% |
runtime.dotnet.mem.committed | 53.02 ± (53.00 - 53.04) MB | 52.94 ± (52.91 - 52.97) MB | -0.1% |
runtime.dotnet.threads.count | 28 ± (28 - 28) | 28 ± (28 - 28) | +0.3% | ✅⬆️
.NET 8 - Baseline
process.internal_duration_ms | 19.00 ± (18.98 - 19.02) ms | 19.02 ± (18.99 - 19.04) ms | +0.1% | ✅⬆️
process.time_to_main_ms | 68.83 ± (68.70 - 68.97) ms | 68.76 ± (68.63 - 68.90) ms | -0.1% |
runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% |
runtime.dotnet.mem.committed | 7.69 ± (7.68 - 7.69) MB | 7.66 ± (7.66 - 7.67) MB | -0.3% |
runtime.dotnet.threads.count | 10 ± (10 - 10) | 10 ± (10 - 10) | +0.0% |
.NET 8 - Bailout
process.internal_duration_ms | 19.02 ± (19.00 - 19.05) ms | 19.07 ± (19.05 - 19.09) ms | +0.3% | ✅⬆️
process.time_to_main_ms | 69.51 ± (69.43 - 69.59) ms | 69.70 ± (69.59 - 69.81) ms | +0.3% | ✅⬆️
runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% |
runtime.dotnet.mem.committed | 7.76 ± (7.75 - 7.77) MB | 7.72 ± (7.71 - 7.73) MB | -0.5% |
runtime.dotnet.threads.count | 11 ± (11 - 11) | 11 ± (11 - 11) | +0.0% |
.NET 8 - CallTarget+Inlining+NGEN
process.internal_duration_ms | 180.80 ± (179.84 - 181.77) ms | 180.51 ± (179.68 - 181.33) ms | -0.2% |
process.time_to_main_ms | 428.11 ± (427.50 - 428.73) ms | 431.88 ± (431.15 - 432.61) ms | +0.9% | ✅⬆️
runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% |
runtime.dotnet.mem.committed | 35.89 ± (35.87 - 35.92) MB | 36.02 ± (35.99 - 36.05) MB | +0.4% | ✅⬆️
runtime.dotnet.threads.count | 26 ± (26 - 26) | 27 ± (27 - 27) | +0.6% | ✅⬆️

HttpMessageHandler

Metric | Master (Mean ± 95% CI) | Current (Mean ± 95% CI) | Change | Status
.NET Framework 4.8 - Baseline
duration | 199.88 ± (199.78 - 200.70) ms | 210.64 ± (210.30 - 211.44) ms | +5.4% | ✅⬆️
.NET Framework 4.8 - Bailout
duration | 203.91 ± (203.73 - 204.75) ms | 214.60 ± (214.34 - 215.45) ms | +5.2% | ✅⬆️
.NET Framework 4.8 - CallTarget+Inlining+NGEN
duration | 1172.24 ± (1172.38 - 1179.74) ms | 1206.35 ± (1206.02 - 1214.68) ms | +2.9% | ✅⬆️
.NET Core 3.1 - Baseline
process.internal_duration_ms | 199.33 ± (198.87 - 199.80) ms | 209.05 ± (208.52 - 209.57) ms | +4.9% | ✅⬆️
process.time_to_main_ms | 91.75 ± (91.49 - 92.00) ms | 97.30 ± (96.99 - 97.60) ms | +6.0% | ✅⬆️
runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% |
runtime.dotnet.mem.committed | 20.59 ± (20.57 - 20.62) MB | 20.42 ± (20.41 - 20.44) MB | -0.8% |
runtime.dotnet.threads.count | 20 ± (20 - 20) | 20 ± (20 - 20) | +0.4% | ✅⬆️
.NET Core 3.1 - Bailout
process.internal_duration_ms | 198.70 ± (198.23 - 199.18) ms | 208.30 ± (207.75 - 208.84) ms | +4.8% | ✅⬆️
process.time_to_main_ms | 92.90 ± (92.65 - 93.15) ms | 98.16 ± (97.87 - 98.46) ms | +5.7% | ✅⬆️
runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% |
runtime.dotnet.mem.committed | 20.60 ± (20.58 - 20.63) MB | 20.49 ± (20.47 - 20.51) MB | -0.6% |
runtime.dotnet.threads.count | 21 ± (21 - 21) | 21 ± (21 - 21) | +0.7% | ✅⬆️
.NET Core 3.1 - CallTarget+Inlining+NGEN
process.internal_duration_ms | 447.71 ± (445.05 - 450.38) ms | 469.58 ± (466.78 - 472.39) ms | +4.9% | ✅⬆️
process.time_to_main_ms | 511.47 ± (510.72 - 512.21) ms | 538.03 ± (537.20 - 538.87) ms | +5.2% | ✅⬆️
runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% |
runtime.dotnet.mem.committed | 62.35 ± (62.23 - 62.46) MB | 61.45 ± (61.29 - 61.61) MB | -1.4% |
runtime.dotnet.threads.count | 29 ± (29 - 29) | 30 ± (30 - 30) | +0.5% | ✅⬆️
.NET 6 - Baseline
process.internal_duration_ms | 201.88 ± (201.42 - 202.35) ms | 209.64 ± (209.20 - 210.08) ms | +3.8% | ✅⬆️
process.time_to_main_ms | 74.07 ± (73.83 - 74.32) ms | 77.49 ± (77.26 - 77.72) ms | +4.6% | ✅⬆️
runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% |
runtime.dotnet.mem.committed | 16.30 ± (16.28 - 16.32) MB | 16.20 ± (16.18 - 16.22) MB | -0.6% |
runtime.dotnet.threads.count | 19 ± (19 - 19) | 19 ± (19 - 19) | +0.1% | ✅⬆️
.NET 6 - Bailout
process.internal_duration_ms | 200.27 ± (199.85 - 200.68) ms | 209.35 ± (208.88 - 209.81) ms | +4.5% | ✅⬆️
process.time_to_main_ms | 74.34 ± (74.13 - 74.56) ms | 78.66 ± (78.45 - 78.88) ms | +5.8% | ✅⬆️
runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% |
runtime.dotnet.mem.committed | 16.36 ± (16.33 - 16.38) MB | 16.24 ± (16.22 - 16.26) MB | -0.7% |
runtime.dotnet.threads.count | 20 ± (20 - 20) | 20 ± (20 - 21) | +1.4% | ✅⬆️
.NET 6 - CallTarget+Inlining+NGEN
process.internal_duration_ms | 459.41 ± (456.71 - 462.11) ms | 468.91 ± (465.29 - 472.54) ms | +2.1% | ✅⬆️
process.time_to_main_ms | 468.22 ± (467.30 - 469.14) ms | 482.65 ± (481.88 - 483.42) ms | +3.1% | ✅⬆️
runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% |
runtime.dotnet.mem.committed | 57.89 ± (57.74 - 58.05) MB | 57.58 ± (57.42 - 57.74) MB | -0.5% |
runtime.dotnet.threads.count | 30 ± (29 - 30) | 30 ± (30 - 30) | +0.8% | ✅⬆️
.NET 8 - Baseline
process.internal_duration_ms | 202.14 ± (201.64 - 202.65) ms | 212.61 ± (212.08 - 213.14) ms | +5.2% | ✅⬆️
process.time_to_main_ms | 79.05 ± (78.79 - 79.31) ms | 83.59 ± (83.32 - 83.87) ms | +5.7% | ✅⬆️
runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% |
runtime.dotnet.mem.committed | 16.23 ± (16.21 - 16.25) MB | 16.02 ± (16.00 - 16.04) MB | -1.3% |
runtime.dotnet.threads.count | 19 ± (19 - 19) | 19 ± (19 - 19) | +0.0% | ✅⬆️
.NET 8 - Bailout
process.internal_duration_ms | 202.37 ± (201.90 - 202.83) ms | 213.36 ± (212.88 - 213.85) ms | +5.4% | ✅⬆️
process.time_to_main_ms | 80.39 ± (80.17 - 80.61) ms | 85.45 ± (85.22 - 85.69) ms | +6.3% | ✅⬆️
runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% |
runtime.dotnet.mem.committed | 16.27 ± (16.25 - 16.28) MB | 16.12 ± (16.10 - 16.13) MB | -0.9% |
runtime.dotnet.threads.count | 20 ± (20 - 20) | 20 ± (20 - 20) | +0.6% | ✅⬆️
.NET 8 - CallTarget+Inlining+NGEN
process.internal_duration_ms | 391.30 ± (387.34 - 395.26) ms | 465.88 ± (458.69 - 473.07) ms | +19.1% | ✅⬆️
process.time_to_main_ms | 476.62 ± (475.78 - 477.46) ms | 497.34 ± (496.48 - 498.20) ms | +4.3% | ✅⬆️
runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% |
runtime.dotnet.mem.committed | 52.96 ± (52.80 - 53.13) MB | 54.37 ± (54.33 - 54.41) MB | +2.7% | ✅⬆️
runtime.dotnet.threads.count | 29 ± (29 - 29) | 29 ± (29 - 29) | -0.2% |
Comparison explanation

Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:

  • Welch's t-test at a 5% significance level
  • Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

Duration charts
FakeDbCommand (.NET Framework 4.8)
gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8197) - mean (69ms)  : 68, 71
    master - mean (70ms)  : 68, 71

    section Bailout
    This PR (8197) - mean (73ms)  : 72, 75
    master - mean (74ms)  : 72, 75

    section CallTarget+Inlining+NGEN
    This PR (8197) - mean (1,059ms)  : 967, 1152
    master - mean (1,041ms)  : 988, 1095

FakeDbCommand (.NET Core 3.1)
gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8197) - mean (115ms)  : 111, 119
    master - mean (116ms)  : 112, 119

    section Bailout
    This PR (8197) - mean (117ms)  : 115, 118
    master - mean (117ms)  : 115, 119

    section CallTarget+Inlining+NGEN
    This PR (8197) - mean (781ms)  : 741, 821
    master - mean (779ms)  : 723, 835

FakeDbCommand (.NET 6)
gantt
    title Execution time (ms) FakeDbCommand (.NET 6)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8197) - mean (102ms)  : 99, 104
    master - mean (102ms)  : 99, 105

    section Bailout
    This PR (8197) - mean (104ms)  : 101, 106
    master - mean (103ms)  : 101, 105

    section CallTarget+Inlining+NGEN
    This PR (8197) - mean (756ms)  : 736, 776
    master - mean (770ms)  : 746, 793

FakeDbCommand (.NET 8)
gantt
    title Execution time (ms) FakeDbCommand (.NET 8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8197) - mean (94ms)  : 92, 97
    master - mean (95ms)  : 92, 97

    section Bailout
    This PR (8197) - mean (95ms)  : 94, 97
    master - mean (95ms)  : 94, 97

    section CallTarget+Inlining+NGEN
    This PR (8197) - mean (642ms)  : 626, 657
    master - mean (637ms)  : 623, 651

HttpMessageHandler (.NET Framework 4.8)
gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8197) - mean (211ms)  : 202, 220
    master - mean (200ms)  : 195, 206

    section Bailout
    This PR (8197) - mean (215ms)  : 207, 223
    master - mean (204ms)  : 199, 210

    section CallTarget+Inlining+NGEN
    This PR (8197) - mean (1,210ms)  : 1144, 1277
    master - mean (1,176ms)  : 1124, 1228

HttpMessageHandler (.NET Core 3.1)
gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8197) - mean (316ms)  : 304, 329
    master - mean (301ms)  : 293, 309

    section Bailout
    This PR (8197) - mean (316ms)  : 308, 325
    master - mean (301ms)  : 292, 309

    section CallTarget+Inlining+NGEN
    This PR (8197) - mean (1,039ms)  : 986, 1091
    master - mean (992ms)  : 951, 1033

HttpMessageHandler (.NET 6)
gantt
    title Execution time (ms) HttpMessageHandler (.NET 6)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8197) - mean (296ms)  : 288, 305
    master - mean (285ms)  : 276, 293

    section Bailout
    This PR (8197) - mean (297ms)  : 289, 305
    master - mean (284ms)  : 279, 288

    section CallTarget+Inlining+NGEN
    This PR (8197) - mean (985ms)  : 932, 1039
    master - mean (956ms)  : 900, 1011

HttpMessageHandler (.NET 8)
gantt
    title Execution time (ms) HttpMessageHandler (.NET 8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8197) - mean (307ms)  : 297, 317
    master - mean (292ms)  : 284, 300

    section Bailout
    This PR (8197) - mean (310ms)  : crit, 301, 319
    master - mean (293ms)  : 286, 300

    section CallTarget+Inlining+NGEN
    This PR (8197) - mean (995ms)  : crit, 888, 1103
    master - mean (901ms)  : 842, 959


@tylfin
Member

tylfin commented Feb 12, 2026

I see the benchmarks above, but are we able to introduce a unit test that provably shows the reduction in allocations?

@dudikeleti dudikeleti requested a review from a team as a code owner February 12, 2026 16:51
@dudikeleti dudikeleti requested a review from Copilot February 12, 2026 18:01
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 8 changed files in this pull request and generated 2 comments.


Comment on lines 349 to 356
internal readonly struct TagBatch : IDisposable
{
    private readonly List<KeyValuePair<string, string>>? _tags;

    internal TagBatch(List<KeyValuePair<string, string>> tags)
    {
        _tags = tags;
    }

Copilot AI Feb 12, 2026

TagBatch is a lock-holding struct, but as a normal (readonly) struct it can be copied accidentally (e.g., passed by value), which can lead to double-dispose or disposing a copy and throwing SynchronizationLockException when Monitor.Exit is called incorrectly. Consider making TagBatch a ref struct (and rely on pattern-based Dispose) or otherwise redesigning it to be non-copyable so misuse is prevented by the type system rather than only by comments.
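
The double-dispose hazard described above can be reproduced with a small stand-alone sketch. LockScope and CopyHazardDemo are hypothetical names used only for illustration; this is not the PR's TagBatch, just a struct of the same general shape that enters a monitor on construction and exits it on Dispose.

```csharp
using System;
using System.Threading;

// Hypothetical lock-holding struct: enters the monitor when created, exits on Dispose.
public readonly struct LockScope : IDisposable
{
    private readonly object _gate;

    public LockScope(object gate)
    {
        _gate = gate;
        Monitor.Enter(_gate);
    }

    public void Dispose()
    {
        Monitor.Exit(_gate);
    }
}

public static class CopyHazardDemo
{
    // Returns true if disposing a by-value copy after the original throws.
    public static bool DoubleDisposeThrows()
    {
        var gate = new object();
        var scope = new LockScope(gate);
        var copy = scope;   // plain assignment silently duplicates the struct

        scope.Dispose();    // releases the lock taken in the constructor
        try
        {
            copy.Dispose(); // second Monitor.Exit: this thread no longer owns the lock
            return false;
        }
        catch (SynchronizationLockException)
        {
            return true;
        }
    }
}
```

Because the compiler accepts the copy without complaint, nothing in this shape stops a caller from releasing the lock twice, which is the misuse the comment is warning about.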

Member

This does concern me... it's really hard to reason about exactly when these copies might or might not be added

Since we last spoke, I've realised that we can have some simpler alternatives. In particular, we can create multiple SetTags methods, which side-steps the allocation issue, at the expense of some minor inconvenience/duplication implementation-wise 🙂

public void SetTags(KeyValuePair<string, string> tag1, KeyValuePair<string, string> tag2);
public void SetTags(KeyValuePair<string, string> tag1, KeyValuePair<string, string> tag2, KeyValuePair<string, string> tag3);
public void SetTags(KeyValuePair<string, string> tag1, KeyValuePair<string, string> tag2, KeyValuePair<string, string> tag3, KeyValuePair<string, string> tag4);

// I don't know if it's worth adding this too - it's much nicer to use and implement
// but can't work with .NET FX/.NET Standard, so may never be used in practice 🤷‍♂️ 
#if NETCOREAPP
public void SetTags(params ReadOnlySpan<KeyValuePair<string, string>> tags);
#endif

Internally, these methods would do the same thing as the existing SetTag, but would ensure the capacity is sufficient first
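
A rough sketch of what such an overload could look like follows. SimpleTagList is a hypothetical, simplified stand-in and not the tracer's TagsList; the assumption that a null value removes an existing tag is taken from the "replace/remove semantics" wording in the commit message further down and may not match the real SetTag exactly.

```csharp
using System.Collections.Generic;

// Hypothetical simplified tag list; illustrates "reserve capacity, lock once, then SetTag semantics".
public sealed class SimpleTagList
{
    private readonly object _lock = new object();
    private List<KeyValuePair<string, string>> _tags;

    public void SetTag(string key, string value)
    {
        lock (_lock)
        {
            SetTagNoLock(key, value);
        }
    }

    // Two-tag overload: one lock acquisition and at most one list growth.
    public void SetTags(KeyValuePair<string, string> tag1, KeyValuePair<string, string> tag2)
    {
        lock (_lock)
        {
            EnsureCapacityNoLock(2);
            SetTagNoLock(tag1.Key, tag1.Value);
            SetTagNoLock(tag2.Key, tag2.Value);
        }
    }

    public string GetTag(string key)
    {
        lock (_lock)
        {
            if (_tags == null) return null;
            foreach (var pair in _tags)
            {
                if (pair.Key == key) return pair.Value;
            }
            return null;
        }
    }

    private void EnsureCapacityNoLock(int additional)
    {
        if (_tags == null)
        {
            _tags = new List<KeyValuePair<string, string>>(additional);
            return;
        }

        int needed = _tags.Count + additional;
        if (_tags.Capacity < needed) _tags.Capacity = needed;
    }

    // Assumed semantics: replace an existing key, remove it when value is null, otherwise append.
    private void SetTagNoLock(string key, string value)
    {
        if (_tags == null)
        {
            if (value == null) return;
            _tags = new List<KeyValuePair<string, string>>();
        }

        for (int i = 0; i < _tags.Count; i++)
        {
            if (_tags[i].Key != key) continue;
            if (value == null) _tags.RemoveAt(i);
            else _tags[i] = new KeyValuePair<string, string>(key, value);
            return;
        }

        if (value != null) _tags.Add(new KeyValuePair<string, string>(key, value));
    }
}
```

The three- and four-tag overloads would repeat the same pattern with a larger reserved count, which is the duplication the comment mentions.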

Contributor Author

@dudikeleti dudikeleti Feb 13, 2026

fixed in 6414c1c & 12b0e4a

Comment on lines +378 to +388
[Fact]
public void EntryCodeOriginTagging_ShouldReduceAllocations()
{
    // 1) the optimized approach allocates less than the baseline
    // 2) the optimized approach stays below a fixed budget to catch regressions
    //
    // This test focuses on allocations only (not CPU).
    const int iterations = 10_000;
    const int rounds = 3;
    const double maxOptimizedRatio = 0.90;
    const long maxOptimizedBytesPerOp = 300;

Copilot AI Feb 12, 2026

The allocation test includes a hard absolute budget (maxOptimizedBytesPerOp = 300) and a fixed ratio threshold, which can be brittle across runtime versions, CPU architectures, and build configurations (Debug/Release), leading to flaky CI. To reduce flakiness while still catching regressions, consider removing/relaxing the absolute byte cap and basing the assertion only on relative improvement vs. the baseline (or gating the absolute check behind an explicit opt-in like an environment variable / dedicated performance test run).
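
One way to express a purely relative check, as a hedged sketch: AllocationProbe and its methods are hypothetical helpers, not code from this PR, and GC.GetAllocatedBytesForCurrentThread is assumed to be available on the target runtime (it exists on .NET Core 3.0+ and .NET Framework 4.8).

```csharp
using System;

// Hypothetical helper comparing allocations of two code paths on the current thread.
public static class AllocationProbe
{
    public static long MeasureBytes(Action action, int iterations)
    {
        action(); // warm-up so one-time allocations (JIT, static caches) are not counted

        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }

        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    // True when the optimized path allocates at most maxRatio times the baseline's bytes.
    public static bool AllocatesLess(Action optimized, Action baseline, int iterations, double maxRatio)
    {
        long optimizedBytes = MeasureBytes(optimized, iterations);
        long baselineBytes = MeasureBytes(baseline, iterations);
        return baselineBytes > 0 && optimizedBytes <= baselineBytes * maxRatio;
    }
}
```

A test built on this would assert only the ratio (for example 0.90, as in the snippet above) and leave any absolute byte budget to a dedicated, opt-in performance run.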

Comment on lines 40 to 50
[Tag(Trace.Tags.CodeOriginType)]
public string CodeOriginType { get; set; }

[Tag(Trace.Tags.CodeOriginFrameIndex)]
public string CodeOriginFrames0Index { get; set; }

[Tag(Trace.Tags.CodeOriginFrameMethod)]
public string CodeOriginFrames0Method { get; set; }

[Tag(Trace.Tags.CodeOriginFrameType)]
public string CodeOriginFrames0Type { get; set; }
Member

I don't think we should add these to the base WebTags type, as that increases the allocations of all derived types, whereas we will only ever actually add these to the root span tag type, which will be a limited subset of those 🤔

Also, we shouldn't do this until after (or ideally, at the same time) as we enable code origins by default 😄

Contributor Author

It seems that we do add code origin to all derived types. The only one that is still missing is WCF, but I plan to add it soon.

I'll take another look.

Contributor Author

@dudikeleti dudikeleti Feb 13, 2026

Also, we shouldn't do this until after (or ideally, at the same time) as we enable code origins by default 😄

Well… now is that time 😉

Member

It seems that we do add code origin to all derived types. The only one that is still missing is WCF, but I plan to add it soon

We don't have ASP.NET support yet do we? 🤔 Also, we don't need it in AspNetCoreEndpointTags or AspNetCoreMvcTags as those aren't service-entry spans.

Tbh, the Endpoint and MVC Tags might not need to be WebTags at all 🤔 we should check whether they really need to be or not. I can look into that separately as part of an effort on aspnetcore perf I'm working on atm

Well… now is that time 😉

Fair enough, I really meant in the same PR, but meh 😉

Contributor Author

We don't have ASP.NET support yet do we? 🤔

#8083

Also, we don't need it in AspNetCoreEndpointTags or AspNetCoreMvcTags as those aren't service-entry spans.

Got it. Let's start with netcore only in this PR. I'll add the others later.

Fair enough, I really meant in the same PR, but meh 😉

Ah got it.


@dudikeleti dudikeleti force-pushed the dudik/co/optimizations branch 4 times, most recently from 09dcf7f to 58f2a67 Compare February 16, 2026 10:47
Make TagBatch.Dispose() a no-op when uninitialized
…Tag helper in TagsList that reserves capacity and preserves replace/remove semantics while setting multiple tags.
@dudikeleti dudikeleti force-pushed the dudik/co/optimizations branch from 137fb83 to 7ce0004 Compare February 16, 2026 12:57
Labels

area:debugger area:tracer The core tracer library (Datadog.Trace, does not include OpenTracing, native code, or integrations) type:performance Performance, speed, latency, resource usage (CPU, memory)

3 participants