Fix flaky test metrics tag values #8099
Execution-Time Benchmarks Report ⏱️
Execution-time results for samples comparing This PR (8099) and master.
✅ No regressions detected - check the details below
Full Metrics Comparison
FakeDbCommand
HttpMessageHandler
Comparison explanation
Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).
Duration charts
FakeDbCommand (.NET Framework 4.8)
gantt
title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8099) - mean (68ms) : 67, 70
master - mean (68ms) : 67, 70
section Bailout
This PR (8099) - mean (72ms) : 71, 74
master - mean (72ms) : 71, 73
section CallTarget+Inlining+NGEN
This PR (8099) - mean (1,030ms) : 966, 1095
master - mean (1,022ms) : 978, 1066
FakeDbCommand (.NET Core 3.1)
gantt
title Execution time (ms) FakeDbCommand (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8099) - mean (106ms) : 104, 109
master - mean (106ms) : 103, 109
section Bailout
This PR (8099) - mean (107ms) : 106, 109
master - mean (107ms) : 106, 108
section CallTarget+Inlining+NGEN
This PR (8099) - mean (745ms) : 692, 799
master - mean (742ms) : 691, 793
FakeDbCommand (.NET 6)
gantt
title Execution time (ms) FakeDbCommand (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8099) - mean (94ms) : 92, 96
master - mean (93ms) : 91, 96
section Bailout
This PR (8099) - mean (94ms) : 93, 96
master - mean (94ms) : 93, 95
section CallTarget+Inlining+NGEN
This PR (8099) - mean (727ms) : 694, 760
master - mean (723ms) : 694, 752
FakeDbCommand (.NET 8)
gantt
title Execution time (ms) FakeDbCommand (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8099) - mean (93ms) : 90, 95
master - mean (92ms) : 90, 94
section Bailout
This PR (8099) - mean (94ms) : 92, 95
master - mean (93ms) : 92, 94
section CallTarget+Inlining+NGEN
This PR (8099) - mean (641ms) : 629, 653
master - mean (640ms) : 623, 657
HttpMessageHandler (.NET Framework 4.8)
gantt
title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8099) - mean (194ms) : 190, 198
master - mean (195ms) : 191, 198
section Bailout
This PR (8099) - mean (197ms) : 194, 199
master - mean (198ms) : 195, 202
section CallTarget+Inlining+NGEN
This PR (8099) - mean (1,144ms) : 1073, 1215
master - mean (1,148ms) : 1090, 1205
HttpMessageHandler (.NET Core 3.1)
gantt
title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8099) - mean (278ms) : 272, 283
master - mean (278ms) : 272, 284
section Bailout
This PR (8099) - mean (279ms) : 274, 285
master - mean (278ms) : 274, 282
section CallTarget+Inlining+NGEN
This PR (8099) - mean (940ms) : 901, 979
master - mean (940ms) : 904, 976
HttpMessageHandler (.NET 6)
gantt
title Execution time (ms) HttpMessageHandler (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8099) - mean (271ms) : 267, 276
master - mean (270ms) : 266, 274
section Bailout
This PR (8099) - mean (270ms) : 268, 273
master - mean (270ms) : 266, 273
section CallTarget+Inlining+NGEN
This PR (8099) - mean (931ms) : 890, 973
master - mean (929ms) : 872, 986
HttpMessageHandler (.NET 8)
gantt
title Execution time (ms) HttpMessageHandler (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8099) - mean (271ms) : 263, 278
master - mean (270ms) : 265, 275
section Bailout
This PR (8099) - mean (270ms) : 266, 274
master - mean (270ms) : 266, 275
section CallTarget+Inlining+NGEN
This PR (8099) - mean (834ms) : 814, 854
master - mean (839ms) : 817, 861
Benchmarks
Benchmark execution time: 2026-01-29 11:25:36
Comparing candidate commit 9a49df3 in PR branch
Found 13 performance improvements and 5 performance regressions! Performance is the same for 155 metrics, 19 unstable metrics.
scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody net6.0
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs net472
scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmarkWithAttack netcoreapp3.1
scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest net472
scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest net6.0
scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1
scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice netcoreapp3.1
scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice net6.0
scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark net6.0
scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog netcoreapp3.1
scenario:Benchmarks.Trace.RedisBenchmark.SendReceive net472
scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore net6.0
scenario:Benchmarks.Trace.SpanBenchmark.StartFinishScope net6.0
scenario:Benchmarks.Trace.SpanBenchmark.StartFinishTwoScopes netcoreapp3.1
testCase.DisplayName :
$"{TestMethod.TestClass.Class.Name}.{testCase.DisplayName}").Trim();

if (testFullName.Length > 200)
For future readability, can we extract 200 into a variable to make clear that it is the maximum name length? var maxTestNameLength = 200
Good idea! Done!
Co-authored-by: Steven Bouwkamp <[email protected]>
Summary of changes
We currently send telemetry metrics related to flaky tests. However, these metrics had some issues.
In some tests, the test name already contained the class name. The resulting test name would include repeated class names. For instance: "Datadog.Trace.Tests.Logging.DirectSubmission.Sink.PeriodicBatching.BatchingSinkTests.Datadog.Trace.Tests.Logging.DirectSubmission.Sink.PeriodicBatching.BatchingSinkTests.WhenRunning_AndAnEventIsQueued_ItIsWrittenToABatch"
We were sanitizing tag values by calling TryNormalizeTagName(), which is useful for value normalization but adds some unnecessary restrictions. For instance, framework values were not being sent because they started with a number, which is fine for tag values but not for tag names. Therefore, TryNormalizeTagName() is no longer used for framework values.
Sometimes the test name was not sent, because we validate tags before sending them and do not send tags longer than 200 characters. The test name is now truncated to 200 characters to avoid empty test name tags.
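The name handling described above (skipping the class name prefix when it is already present, then truncating to the tag length limit) could look roughly like the sketch below. This is an illustration, not the code in this PR: BuildTestFullName, its parameters, and the StartsWith check are assumptions, with className and displayName standing in for TestMethod.TestClass.Class.Name and testCase.DisplayName.

```csharp
using System;

// Hypothetical helper, not the PR's actual code.
// className   stands in for TestMethod.TestClass.Class.Name
// displayName stands in for testCase.DisplayName
// maxLength   is the 200-character tag limit mentioned in the summary
string BuildTestFullName(string className, string displayName, int maxLength = 200)
{
    // Assumed check: prepend the class name only if the display name
    // does not already start with it, so it is never repeated.
    var fullName = displayName.StartsWith(className + ".", StringComparison.Ordinal)
        ? displayName
        : $"{className}.{displayName}";

    fullName = fullName.Trim();

    // Truncate rather than drop the tag when the name is too long.
    return fullName.Length > maxLength
        ? fullName.Substring(0, maxLength)
        : fullName;
}

Console.WriteLine(BuildTestFullName("BatchingSinkTests", "BatchingSinkTests.WhenRunning"));
Console.WriteLine(BuildTestFullName("BatchingSinkTests", "WhenRunning"));
```

Under these assumptions both calls produce the same name, and a name longer than maxLength is cut to exactly maxLength characters instead of being rejected by tag validation.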
Reason for change
Implementation details
Test coverage
Other details