[Test Optimization] Add test.final_status tag #8091

Merged
tonyredondo merged 19 commits into master from tony/test-final-status on Feb 16, 2026

@tonyredondo tonyredondo commented Jan 22, 2026

Summary of changes

Add a new test.final_status tag to the final execution span of tests across NUnit, XUnit, and MsTest frameworks. This tag represents the adjusted final outcome of a test for CI pipeline result determination, with values pass, fail, or skip.

Jira: SDTEST-2985

Core changes:

  • Add TestFinalStatus constant to TestTags.cs and FinalStatus property to TestSpanTags.cs
  • Add shared CalculateFinalStatus() helper in Common.cs implementing the priority logic
  • Implement final status tracking in NUnit (TestOptimizationTestCommand.cs), XUnit (XUnitIntegration.cs), and MsTest (TestMethodAttributeExecuteIntegration.cs)
  • Add ATR budget pre-check methods (GetRemainingBudget/GetRemainingAtrBudget) for early termination detection
  • Add per-row caching for MsTest parameterized tests to track execution results independently

Fix:

  • Corrected test.test_management.attempt_to_fix_passed tag to only be set on Attempt-to-Fix (ATF) tests; previously it was incorrectly set on all retried tests (EFD, ATR, ATF) due to a hardcoded behavior type.

Reason for change

When retry mechanisms are enabled (ATR, EFD, Attempt to Fix), a single test can run multiple times with different outcomes. Some intermediate outcomes are suppressed to avoid failing CI pipelines. Currently, there is no way to query tests by their final adjusted status to build monitors and alerts for hard failures on default branches.

The test.final_status tag enables customers to:

  • Query tests by their effective CI outcome in Datadog
  • Build monitors for tests that are truly failing (not just flaky)
  • Distinguish between tests that eventually passed vs. those that consistently failed
  • Track quarantined/disabled tests separately from actual failures

Priority logic:

  1. Quarantined/disabled tests → always skip (CI never sees actual result)
  2. ATF tests where any execution failed → fail (the test is still flaky; the fix didn't work)
  3. Any execution passed → pass (matches CI behavior: one pass = pipeline pass)
  4. Last retry is skip/inconclusive AND no pass → skip
  5. All executions failed → fail
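
The priority order above can be sketched roughly as follows. This is a minimal illustration, not the code in Common.cs: the real CalculateFinalStatus() receives a testTags object, whereas the FinalStatusSketch class, the Calculate name, and the isQuarantinedOrDisabled and isAttemptToFix flags are assumptions made for this example.

```csharp
using System;

public static class FinalStatusSketch
{
    public const string Pass = "pass";
    public const string Fail = "fail";
    public const string Skip = "skip";

    // Hypothetical stand-in for CalculateFinalStatus(): the quarantined/disabled
    // and ATF flags are assumed to be read from testTags in the real helper.
    public static string Calculate(
        bool anyExecutionPassed,
        bool anyExecutionFailed,
        bool isSkippedOrInconclusive,
        bool isQuarantinedOrDisabled,
        bool isAttemptToFix)
    {
        // 1. Quarantined/disabled tests never affect CI.
        if (isQuarantinedOrDisabled) return Skip;
        // 2. ATF: a single failure means the fix did not work.
        if (isAttemptToFix && anyExecutionFailed) return Fail;
        // 3. One passing execution is enough.
        if (anyExecutionPassed) return Pass;
        // 4. No pass, and the last execution was skipped or inconclusive.
        if (isSkippedOrInconclusive) return Skip;
        // 5. Every execution failed.
        return Fail;
    }

    public static void Main()
    {
        // Flaky test under ATR: failed first, passed on retry.
        Console.WriteLine(Calculate(true, true, false, false, false)); // pass
        // Same outcomes for an ATF test.
        Console.WriteLine(Calculate(true, true, false, false, true));  // fail
        // Quarantined test that failed every time.
        Console.WriteLine(Calculate(false, true, false, true, false)); // skip
    }
}
```

Checking the ATF rule before the "any pass" rule is what makes the same set of outcomes resolve to pass for ATR/EFD but fail for ATF.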

ATF (Attempt to Fix) semantics:

For ATF tests, the goal is to determine if a fix actually resolved a flaky test. Therefore:

  • If any execution fails (initial or retry), the test is still flaky → final_status = fail
  • Only if all executions pass, the fix worked → final_status = pass
  • Skip/inconclusive does not count as failure (only actual failures count)
  • attempt_to_fix_passed tag is derived from the same logic for consistency
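
As a rough illustration of these semantics, the sketch below accumulates per-execution outcomes and derives the ATF result from them. The Outcome enum and AtfTrackerSketch class are hypothetical names for this example; the PR keeps the equivalent flags in framework-specific state, and how the real code treats a run made up only of skips is not modeled here.

```csharp
using System;

public enum Outcome { Pass, Fail, Skip }

// Hypothetical accumulator, for illustration only.
public sealed class AtfTrackerSketch
{
    public bool AnyExecutionPassed { get; private set; }
    public bool AnyExecutionFailed { get; private set; }

    public void Record(Outcome outcome)
    {
        if (outcome == Outcome.Pass) AnyExecutionPassed = true;
        if (outcome == Outcome.Fail) AnyExecutionFailed = true;
        // Skip/inconclusive leaves both flags untouched.
    }

    // Assumed derivation: the fix is considered to have worked
    // only if no execution actually failed.
    public bool AttemptToFixPassed => !AnyExecutionFailed;
}

public static class AtfDemo
{
    public static void Main()
    {
        var tracker = new AtfTrackerSketch();
        tracker.Record(Outcome.Pass);
        tracker.Record(Outcome.Skip);
        Console.WriteLine(tracker.AttemptToFixPassed); // True: a skip is not a failure

        tracker.Record(Outcome.Fail);
        Console.WriteLine(tracker.AttemptToFixPassed); // False: one failure is enough
    }
}
```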

Implementation details

Shared logic (Common.cs):

  • CalculateFinalStatus(anyExecutionPassed, anyExecutionFailed, isSkippedOrInconclusive, testTags) implements the 5-priority determination
  • Added anyExecutionFailed parameter to support ATF-specific behavior

NUnit:

  • Extended RetryState struct with InitialExecutionPassed, InitialExecutionFailed, and AnyRetryPassed fields
  • Added GetRemainingBudget() to FlakyRetryBehavior for ATR budget exhaustion pre-check
  • Set final_status in ExecuteTest() before span close, handling ATR early exit and budget exhaustion
  • AllAttemptsPassed only clears on actual failure (not skip) for ATF semantics
  • AttemptToFixPassed derived from anyExecutionFailed for consistency

XUnit:

  • Extended TestCaseMetadata with InitialExecutionPassed, InitialExecutionFailed, and AnyRetryPassed fields
  • Added unified GetRemainingAtrBudget() handling both v2 and v3 via Math.Max() strategy
  • Updated WriteFinalTagsFromMetadata() with final execution detection (handles stale TotalExecutions on initial EFD)
  • Track InitialExecutionFailed on exception path for first execution
  • AttemptToFixPassed derived from anyExecutionFailed for consistency

MsTest:

  • Added per-row caches (InitialExecutionPassedCache, InitialExecutionFailedCache, AnyRetryPassedCache, AllAttemptsPassedCache) using ConditionalWeakTable<object, ConcurrentDictionary<string, bool>> to track parameterized test results independently
  • Added SetFinalStatusIfApplicable() helper with proper final execution detection
  • Added GetRemainingAtrBudget() for budget exhaustion pre-check
  • Updated SkipTestMethodExecutor and UnitTestRunnerRunSingleTest* integrations for pre-execution skip paths
  • Track InitialExecutionFailed in both exception and non-exception paths
  • AttemptToFixPassed derived from anyExecutionFailed for consistency
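
A per-row cache of the shape named above might look like the following sketch. Only the ConditionalWeakTable<object, ConcurrentDictionary<string, bool>> type comes from the description; the PerRowCacheSketch class, the helper method names, the choice of owner object, and the row-key format are all assumptions for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.CompilerServices;

public static class PerRowCacheSketch
{
    // Keyed weakly by an owner object (assumed here to be the test method
    // instance), so entries do not keep that object alive.
    private static readonly ConditionalWeakTable<object, ConcurrentDictionary<string, bool>> AnyRetryPassedCache =
        new ConditionalWeakTable<object, ConcurrentDictionary<string, bool>>();

    // rowKey is a hypothetical identifier for one parameterized data row.
    public static void MarkRetryPassed(object owner, string rowKey)
    {
        AnyRetryPassedCache.GetOrCreateValue(owner)[rowKey] = true;
    }

    public static bool HasRetryPassed(object owner, string rowKey)
    {
        return AnyRetryPassedCache.TryGetValue(owner, out var rows)
            && rows.TryGetValue(rowKey, out var passed)
            && passed;
    }

    public static void Main()
    {
        var testMethod = new object();
        MarkRetryPassed(testMethod, "row-1");
        Console.WriteLine(HasRetryPassed(testMethod, "row-1")); // True
        Console.WriteLine(HasRetryPassed(testMethod, "row-2")); // False: rows are tracked independently
    }
}
```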

Test coverage

Unit tests (TestFinalStatusTests.cs): 136 tests covering:

  • Priority order verification (quarantined/disabled → ATF-fail → pass → skip → fail)
  • Single execution scenarios (pass/fail/skip)
  • EFD scenarios (new tests, duration-based retries, slow abort)
  • ATR scenarios (initial pass, retry pass, all fail, budget exhaustion)
  • ATF scenarios (any failure → fail, all pass → pass, skip semantics)
  • ATF skip semantics: skip does NOT count as failure (5 dedicated tests)
  • MsTest-specific: parameterized per-row tracking, class/assembly init errors, inconclusive/not-runnable
  • XUnit-specific: SkipException handling, null metadata
  • NUnit-specific: ITR skipped, attribute skipped, inconclusive
  • Mixed feature interactions (EFD+ATR, EFD+ATF, ATR+ATF)
  • Edge cases: empty strings, case sensitivity, null tags

Integration tests:

  • Added TestFinalStatus to span ordering in TestingFrameworkEvpTest.cs for deterministic snapshot verification

Other details

dd-trace-dotnet-ci-bot bot commented Jan 22, 2026

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing This PR (8091) and master.

✅ No regressions detected - check the details below

Full Metrics Comparison

FakeDbCommand

| Metric | Master (Mean ± 95% CI) | Current (Mean ± 95% CI) | Change | Status |
| --- | --- | --- | --- | --- |
| **.NET Framework 4.8 - Baseline** | | | | |
| duration | 69.17 ± (69.15 - 69.45) ms | 69.36 ± (69.31 - 69.56) ms | +0.3% | ✅⬆️ |
| **.NET Framework 4.8 - Bailout** | | | | |
| duration | 73.17 ± (73.09 - 73.41) ms | 73.10 ± (73.03 - 73.34) ms | -0.1% | |
| **.NET Framework 4.8 - CallTarget+Inlining+NGEN** | | | | |
| duration | 1039.74 ± (1040.02 - 1046.06) ms | 1039.82 ± (1040.95 - 1047.98) ms | +0.0% | ✅⬆️ |
| **.NET Core 3.1 - Baseline** | | | | |
| process.internal_duration_ms | 22.31 ± (22.28 - 22.34) ms | 22.34 ± (22.31 - 22.37) ms | +0.2% | ✅⬆️ |
| process.time_to_main_ms | 87.40 ± (87.24 - 87.56) ms | 87.75 ± (87.59 - 87.90) ms | +0.4% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 15.49 ± (15.48 - 15.49) MB | 15.47 ± (15.47 - 15.48) MB | -0.1% | |
| runtime.dotnet.threads.count | 12 ± (12 - 12) | 12 ± (12 - 12) | +0.0% | |
| **.NET Core 3.1 - Bailout** | | | | |
| process.internal_duration_ms | 22.26 ± (22.23 - 22.28) ms | 22.30 ± (22.27 - 22.32) ms | +0.2% | ✅⬆️ |
| process.time_to_main_ms | 88.47 ± (88.32 - 88.63) ms | 88.62 ± (88.48 - 88.75) ms | +0.2% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 15.51 ± (15.50 - 15.51) MB | 15.51 ± (15.51 - 15.52) MB | +0.0% | ✅⬆️ |
| runtime.dotnet.threads.count | 13 ± (13 - 13) | 13 ± (13 - 13) | +0.0% | |
| **.NET Core 3.1 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 255.07 ± (251.56 - 258.57) ms | 254.85 ± (251.65 - 258.05) ms | -0.1% | |
| process.time_to_main_ms | 493.87 ± (493.34 - 494.40) ms | 494.52 ± (493.94 - 495.09) ms | +0.1% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 52.15 ± (52.13 - 52.17) MB | 52.12 ± (52.09 - 52.14) MB | -0.1% | |
| runtime.dotnet.threads.count | 28 ± (28 - 28) | 28 ± (28 - 28) | +0.0% | |
| **.NET 6 - Baseline** | | | | |
| process.internal_duration_ms | 21.06 ± (21.03 - 21.08) ms | 21.10 ± (21.08 - 21.13) ms | +0.2% | ✅⬆️ |
| process.time_to_main_ms | 75.31 ± (75.17 - 75.44) ms | 75.50 ± (75.36 - 75.65) ms | +0.3% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 15.18 ± (15.17 - 15.18) MB | 15.18 ± (15.18 - 15.18) MB | +0.0% | ✅⬆️ |
| runtime.dotnet.threads.count | 10 ± (10 - 10) | 10 ± (10 - 10) | +0.0% | |
| **.NET 6 - Bailout** | | | | |
| process.internal_duration_ms | 20.98 ± (20.95 - 21.00) ms | 21.14 ± (21.10 - 21.17) ms | +0.8% | ✅⬆️ |
| process.time_to_main_ms | 76.27 ± (76.18 - 76.36) ms | 76.66 ± (76.54 - 76.78) ms | +0.5% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 15.30 ± (15.30 - 15.31) MB | 15.30 ± (15.29 - 15.30) MB | -0.1% | |
| runtime.dotnet.threads.count | 11 ± (11 - 11) | 11 ± (11 - 11) | +0.0% | |
| **.NET 6 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 258.59 ± (257.66 - 259.51) ms | 257.46 ± (256.80 - 258.11) ms | -0.4% | |
| process.time_to_main_ms | 475.08 ± (474.28 - 475.88) ms | 473.79 ± (473.24 - 474.34) ms | -0.3% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 52.92 ± (52.89 - 52.94) MB | 52.97 ± (52.94 - 53.00) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 28 ± (28 - 28) | 28 ± (28 - 28) | +0.1% | ✅⬆️ |
| **.NET 8 - Baseline** | | | | |
| process.internal_duration_ms | 18.84 ± (18.82 - 18.86) ms | 18.96 ± (18.93 - 18.99) ms | +0.6% | ✅⬆️ |
| process.time_to_main_ms | 68.28 ± (68.15 - 68.41) ms | 68.91 ± (68.78 - 69.04) ms | +0.9% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 7.68 ± (7.67 - 7.69) MB | 7.69 ± (7.68 - 7.69) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 10 ± (10 - 10) | 10 ± (10 - 10) | +0.0% | |
| **.NET 8 - Bailout** | | | | |
| process.internal_duration_ms | 19.00 ± (18.96 - 19.03) ms | 19.03 ± (19.00 - 19.07) ms | +0.2% | ✅⬆️ |
| process.time_to_main_ms | 69.49 ± (69.37 - 69.62) ms | 70.12 ± (69.99 - 70.24) ms | +0.9% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 7.74 ± (7.73 - 7.75) MB | 7.74 ± (7.74 - 7.75) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 11 ± (11 - 11) | 11 ± (11 - 11) | +0.0% | |
| **.NET 8 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 179.35 ± (178.44 - 180.25) ms | 178.11 ± (177.47 - 178.75) ms | -0.7% | |
| process.time_to_main_ms | 427.95 ± (427.30 - 428.59) ms | 430.47 ± (429.99 - 430.95) ms | +0.6% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 35.89 ± (35.86 - 35.92) MB | 35.98 ± (35.96 - 36.01) MB | +0.3% | ✅⬆️ |
| runtime.dotnet.threads.count | 27 ± (27 - 27) | 27 ± (27 - 27) | -0.0% | |

HttpMessageHandler

| Metric | Master (Mean ± 95% CI) | Current (Mean ± 95% CI) | Change | Status |
| --- | --- | --- | --- | --- |
| **.NET Framework 4.8 - Baseline** | | | | |
| duration | 212.75 ± (212.46 - 213.54) ms | 214.29 ± (214.54 - 215.71) ms | +0.7% | ✅⬆️ |
| **.NET Framework 4.8 - Bailout** | | | | |
| duration | 219.13 ± (218.71 - 219.72) ms | 219.15 ± (218.96 - 219.89) ms | +0.0% | ✅⬆️ |
| **.NET Framework 4.8 - CallTarget+Inlining+NGEN** | | | | |
| duration | 1221.60 ± (1219.66 - 1226.50) ms | 1230.86 ± (1229.57 - 1236.24) ms | +0.8% | ✅⬆️ |
| **.NET Core 3.1 - Baseline** | | | | |
| process.internal_duration_ms | 215.35 ± (214.82 - 215.87) ms | 215.43 ± (214.80 - 216.07) ms | +0.0% | ✅⬆️ |
| process.time_to_main_ms | 100.46 ± (100.17 - 100.74) ms | 101.17 ± (100.86 - 101.47) ms | +0.7% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 20.47 ± (20.46 - 20.49) MB | 20.32 ± (20.31 - 20.34) MB | -0.7% | |
| runtime.dotnet.threads.count | 20 ± (20 - 20) | 20 ± (20 - 20) | +0.1% | ✅⬆️ |
| **.NET Core 3.1 - Bailout** | | | | |
| process.internal_duration_ms | 213.97 ± (213.47 - 214.47) ms | 214.97 ± (214.39 - 215.54) ms | +0.5% | ✅⬆️ |
| process.time_to_main_ms | 101.40 ± (101.12 - 101.68) ms | 102.63 ± (102.31 - 102.95) ms | +1.2% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 20.47 ± (20.46 - 20.49) MB | 20.41 ± (20.40 - 20.43) MB | -0.3% | |
| runtime.dotnet.threads.count | 21 ± (21 - 21) | 21 ± (21 - 21) | -0.2% | |
| **.NET Core 3.1 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 482.92 ± (480.71 - 485.12) ms | 480.33 ± (478.18 - 482.48) ms | -0.5% | |
| process.time_to_main_ms | 551.52 ± (550.59 - 552.45) ms | 554.99 ± (553.88 - 556.10) ms | +0.6% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 60.61 ± (60.48 - 60.74) MB | 60.85 ± (60.72 - 60.98) MB | +0.4% | ✅⬆️ |
| runtime.dotnet.threads.count | 30 ± (30 - 30) | 30 ± (30 - 30) | +0.0% | ✅⬆️ |
| **.NET 6 - Baseline** | | | | |
| process.internal_duration_ms | 212.45 ± (211.98 - 212.93) ms | 214.36 ± (213.88 - 214.84) ms | +0.9% | ✅⬆️ |
| process.time_to_main_ms | 79.06 ± (78.82 - 79.29) ms | 79.95 ± (79.71 - 80.18) ms | +1.1% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 16.19 ± (16.17 - 16.20) MB | 16.11 ± (16.09 - 16.12) MB | -0.5% | |
| runtime.dotnet.threads.count | 20 ± (19 - 20) | 20 ± (19 - 20) | -0.1% | |
| **.NET 6 - Bailout** | | | | |
| process.internal_duration_ms | 213.33 ± (212.77 - 213.89) ms | 214.46 ± (213.94 - 214.98) ms | +0.5% | ✅⬆️ |
| process.time_to_main_ms | 80.86 ± (80.65 - 81.08) ms | 81.36 ± (81.12 - 81.59) ms | +0.6% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 16.17 ± (16.15 - 16.19) MB | 16.20 ± (16.18 - 16.21) MB | +0.2% | ✅⬆️ |
| runtime.dotnet.threads.count | 21 ± (20 - 21) | 20 ± (20 - 21) | -0.3% | |
| **.NET 6 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 476.27 ± (472.33 - 480.21) ms | 476.92 ± (473.13 - 480.72) ms | +0.1% | ✅⬆️ |
| process.time_to_main_ms | 493.12 ± (492.23 - 494.01) ms | 494.59 ± (493.77 - 495.41) ms | +0.3% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 57.29 ± (57.13 - 57.46) MB | 57.25 ± (57.09 - 57.41) MB | -0.1% | |
| runtime.dotnet.threads.count | 30 ± (30 - 30) | 30 ± (30 - 30) | -0.1% | |
| **.NET 8 - Baseline** | | | | |
| process.internal_duration_ms | 219.00 ± (218.41 - 219.60) ms | 216.27 ± (215.59 - 216.95) ms | -1.2% | |
| process.time_to_main_ms | 85.86 ± (85.58 - 86.14) ms | 85.39 ± (85.14 - 85.65) ms | -0.5% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 16.07 ± (16.06 - 16.09) MB | 15.96 ± (15.94 - 15.98) MB | -0.7% | |
| runtime.dotnet.threads.count | 19 ± (19 - 19) | 19 ± (19 - 19) | +0.1% | ✅⬆️ |
| **.NET 8 - Bailout** | | | | |
| process.internal_duration_ms | 218.40 ± (217.80 - 218.99) ms | 217.64 ± (217.00 - 218.29) ms | -0.3% | |
| process.time_to_main_ms | 87.36 ± (87.12 - 87.61) ms | 87.19 ± (86.94 - 87.44) ms | -0.2% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 16.15 ± (16.14 - 16.17) MB | 16.08 ± (16.06 - 16.10) MB | -0.4% | |
| runtime.dotnet.threads.count | 20 ± (20 - 20) | 20 ± (20 - 20) | -0.4% | |
| **.NET 8 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 485.17 ± (477.85 - 492.49) ms | 471.00 ± (463.72 - 478.28) ms | -2.9% | |
| process.time_to_main_ms | 508.16 ± (507.33 - 508.99) ms | 506.40 ± (505.51 - 507.30) ms | -0.3% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 54.29 ± (54.26 - 54.33) MB | 54.36 ± (54.32 - 54.39) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 29 ± (29 - 29) | 29 ± (29 - 29) | +0.1% | ✅⬆️ |

Comparison explanation

Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:

  • Welch's t-test with a significance level of 5%
  • Only results indicating a difference greater than 5% and 5 ms are considered.
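
As a rough illustration of the second threshold only, the sketch below checks whether a difference between two means exceeds both 5% and 5 ms. It assumes the comparison uses the absolute difference measured relative to the master mean, and it leaves out the Welch significance test entirely; the BenchmarkThresholdSketch class and IsConsidered name are hypothetical.

```csharp
using System;

public static class BenchmarkThresholdSketch
{
    // Assumption: both the absolute (5 ms) and relative (5% of master)
    // thresholds must be exceeded for a difference to be considered.
    public static bool IsConsidered(double masterMeanMs, double prMeanMs)
    {
        double absoluteMs = Math.Abs(prMeanMs - masterMeanMs);
        double relative = absoluteMs / masterMeanMs;
        return absoluteMs > 5.0 && relative > 0.05;
    }

    public static void Main()
    {
        // Means reported for HttpMessageHandler, .NET 8, CallTarget+Inlining+NGEN.
        Console.WriteLine(IsConsidered(485.17, 471.00)); // False: 14.17 ms, but only about 2.9%
        // Made-up values for illustration.
        Console.WriteLine(IsConsidered(100.0, 110.0));   // True: 10 ms and 10%
    }
}
```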

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

Duration charts
FakeDbCommand (.NET Framework 4.8)
gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8091) - mean (69ms)  : 68, 71
    master - mean (69ms)  : 67, 71

    section Bailout
    This PR (8091) - mean (73ms)  : 72, 75
    master - mean (73ms)  : 72, 75

    section CallTarget+Inlining+NGEN
    This PR (8091) - mean (1,044ms)  : 995, 1094
    master - mean (1,043ms)  : 1000, 1086

FakeDbCommand (.NET Core 3.1)
gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8091) - mean (116ms)  : 113, 120
    master - mean (116ms)  : 113, 120

    section Bailout
    This PR (8091) - mean (117ms)  : 115, 119
    master - mean (117ms)  : 115, 119

    section CallTarget+Inlining+NGEN
    This PR (8091) - mean (783ms)  : 724, 842
    master - mean (777ms)  : 725, 828

FakeDbCommand (.NET 6)
gantt
    title Execution time (ms) FakeDbCommand (.NET 6)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8091) - mean (102ms)  : 100, 105
    master - mean (102ms)  : 99, 105

    section Bailout
    This PR (8091) - mean (103ms)  : 102, 105
    master - mean (103ms)  : 101, 104

    section CallTarget+Inlining+NGEN
    This PR (8091) - mean (768ms)  : 747, 788
    master - mean (763ms)  : 737, 790

FakeDbCommand (.NET 8)
gantt
    title Execution time (ms) FakeDbCommand (.NET 8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8091) - mean (95ms)  : 91, 98
    master - mean (94ms)  : 91, 97

    section Bailout
    This PR (8091) - mean (96ms)  : 94, 98
    master - mean (95ms)  : 93, 98

    section CallTarget+Inlining+NGEN
    This PR (8091) - mean (641ms)  : 622, 659
    master - mean (635ms)  : 622, 648

HttpMessageHandler (.NET Framework 4.8)
gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8091) - mean (215ms)  : 206, 224
    master - mean (213ms)  : 205, 221

    section Bailout
    This PR (8091) - mean (219ms)  : 214, 225
    master - mean (219ms)  : 214, 225

    section CallTarget+Inlining+NGEN
    This PR (8091) - mean (1,233ms)  : 1183, 1282
    master - mean (1,223ms)  : 1172, 1274

HttpMessageHandler (.NET Core 3.1)
gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8091) - mean (328ms)  : 314, 342
    master - mean (326ms)  : 315, 337

    section Bailout
    This PR (8091) - mean (328ms)  : 314, 343
    master - mean (326ms)  : 316, 336

    section CallTarget+Inlining+NGEN
    This PR (8091) - mean (1,077ms)  : 1039, 1114
    master - mean (1,069ms)  : 1016, 1121

HttpMessageHandler (.NET 6)
gantt
    title Execution time (ms) HttpMessageHandler (.NET 6)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8091) - mean (304ms)  : 294, 314
    master - mean (301ms)  : 292, 310

    section Bailout
    This PR (8091) - mean (305ms)  : 294, 316
    master - mean (304ms)  : 291, 316

    section CallTarget+Inlining+NGEN
    This PR (8091) - mean (1,008ms)  : 952, 1064
    master - mean (1,001ms)  : 930, 1071

HttpMessageHandler (.NET 8)
gantt
    title Execution time (ms) HttpMessageHandler (.NET 8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8091) - mean (313ms)  : 303, 323
    master - mean (316ms)  : 303, 328

    section Bailout
    This PR (8091) - mean (316ms)  : 303, 330
    master - mean (317ms)  : 305, 329

    section CallTarget+Inlining+NGEN
    This PR (8091) - mean (1,015ms)  : 907, 1124
    master - mean (1,030ms)  : 939, 1120


pr-commenter bot commented Jan 22, 2026

Benchmarks

Benchmark execution time: 2026-02-16 16:14:09

Comparing candidate commit 3a460cf in PR branch tony/test-final-status with baseline commit 9281c0d in branch master.

Found 8 performance improvements and 10 performance regressions! Performance is the same for 159 metrics, 15 unstable metrics.

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody net6.0

  • 🟩 execution_time [-31.271ms; -26.470ms] or [-13.798%; -11.680%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody netcoreapp3.1

  • 🟥 execution_time [+18.488ms; +24.884ms] or [+9.360%; +12.598%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody net6.0

  • 🟩 execution_time [-21.529ms; -15.603ms] or [-9.929%; -7.196%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs net6.0

  • 🟥 execution_time [+13.797ms; +14.290ms] or [+7.287%; +7.548%]

scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmarkWithAttack netcoreapp3.1

  • 🟥 throughput [-458.217op/s; -180.596op/s] or [-16.427%; -6.474%]

scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest net6.0

  • 🟥 execution_time [+8.069ms; +10.197ms] or [+8.798%; +11.118%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net472

  • 🟩 execution_time [-31.317ms; -26.326ms] or [-14.285%; -12.009%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net6.0

  • 🟥 execution_time [+24.317ms; +29.201ms] or [+16.292%; +19.564%]
  • 🟥 throughput [-173.159op/s; -132.295op/s] or [-10.768%; -8.227%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1

  • 🟩 execution_time [-64.080ms; -60.586ms] or [-31.206%; -29.504%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice net472

  • 🟥 execution_time [+97.297µs; +102.796µs] or [+5.115%; +5.404%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice netcoreapp3.1

  • 🟩 execution_time [-936.802µs; -687.866µs] or [-33.636%; -24.698%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool net6.0

  • 🟩 execution_time [-133.767µs; -128.340µs] or [-11.654%; -11.182%]
  • 🟩 throughput [+109.870op/s; +114.758op/s] or [+12.610%; +13.171%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice net472

  • 🟥 execution_time [+134.479µs; +139.401µs] or [+5.137%; +5.325%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark net6.0

  • 🟥 throughput [-4007.645op/s; -2356.233op/s] or [-16.948%; -9.965%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishScope netcoreapp3.1

  • 🟥 execution_time [+12.096ms; +15.516ms] or [+6.069%; +7.785%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan net6.0

  • 🟩 execution_time [-21.582ms; -16.497ms] or [-9.981%; -7.629%]

github-actions bot commented Jan 22, 2026

Snapshots difference summary

The following differences have been observed in committed snapshots. It is meant to help the reviewer.
The diff is simplistic, so please check some files anyway while we improve it.

198 occurrences of :

+      test.final_status: fail,

332 occurrences of :

+      test.final_status: pass,

205 occurrences of :

+      test.final_status: skip,

23 occurrences of :

-      test.test_management.attempt_to_fix_passed: true,

24 occurrences of :

-      test.test_management.attempt_to_fix_passed: false,

@tonyredondo tonyredondo marked this pull request as ready for review January 28, 2026 10:45
@tonyredondo tonyredondo requested review from a team as code owners January 28, 2026 10:45
Copilot AI left a comment

Pull request overview

This PR adds a new test.final_status tag to test execution spans across NUnit, XUnit, and MsTest frameworks. The tag represents the adjusted final outcome of a test for CI pipeline result determination, enabling customers to query tests by their effective CI outcome and build monitors for truly failing tests (not just flaky ones).

Changes:

  • Adds test.final_status constant to TestTags.cs with values: pass, fail, or skip
  • Implements shared CalculateFinalStatus() logic in Common.cs with priority-based determination
  • Integrates final status tracking across NUnit, XUnit, and MsTest frameworks with framework-specific execution tracking

Reviewed changes

Copilot reviewed 62 out of 66 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| TestTags.cs / TestSpanTags.cs | Added constant and property for final_status tag |
| Common.cs | Implemented shared CalculateFinalStatus() with 5-priority logic |
| XUnitIntegration.cs | Added final status calculation with ATR/EFD/ATF handling |
| TestOptimizationTestCommand.cs | NUnit final status with retry state tracking |
| MsTest integration files | Final status for pre-execution skips and failures |
| Test snapshots | Updated verified snapshots with final_status values |
| Integration test files | Added final_status to tag removal for deterministic tests |


/// <summary>
/// Retry reason value for Early Flake Detection
/// </summary>
public const string TestRetryReasonEfd = "efd";

Wow, we're not using any namespacing on these tag names? I'm sure these are defined somewhere else, but still 😬

Also, you have efd and atr but not atf? 😅

@tonyredondo tonyredondo merged commit 42e2dc9 into master Feb 16, 2026
142 checks passed
@tonyredondo tonyredondo deleted the tony/test-final-status branch February 16, 2026 19:10
@github-actions github-actions bot added this to the vNext-v3 milestone Feb 16, 2026