
Add Azure DevOps flaky-history annotations and quarantine awareness #8298

Merged: Evangelink merged 6 commits into main from dev/amauryleve/azdo-flaky-history on May 19, 2026

@Evangelink
Member

Part 2 of the brainstorm in #5951 — adds opt-in flaky-test history annotations and quarantine awareness to Microsoft.Testing.Extensions.AzureDevOpsReport. Decorates AzDO log issues with historical flake context and lets known-noisy failures be downgraded so PR gates aren't blocked.

One of three PRs derived from the #5951 brainstorm. The others:

See issue comment for the broader plan.

Why

Today every failure in an AzDO log looks equally bad. There's no way for the build to say "this test failed 4/20 times in the last 14 days — known noise" vs "this test had zero failures in 14 days and just broke — likely regression". Teams that have moved to MTP currently mark failures as warnings manually with a separate task or live with red PR checks for known-flaky tests.

What

Three new opt-in CLI options on Microsoft.Testing.Extensions.AzureDevOpsReport:

| Option | Type | Purpose |
| --- | --- | --- |
| --report-azdo-flaky-history <days> | int (1–90) | Query AzDO REST history for the last N days and annotate failures with [flaky: failed K/N in last Md] or [REGRESSION] (only when ≥5 prior samples). |
| --report-azdo-quarantine-file <path> | string | Path to a text file (one FQN/glob per line, # comments allowed) listing tests considered quarantined. Their failures are demoted to warning and tagged [quarantined]; emits ##vso[build.addbuildtag]has-quarantined-test-failure exactly once. |
| --report-azdo-demote-known-flaky | zero-arity | Together with --report-azdo-flaky-history, auto-demote failures whose flake-rate ≥25% in the window to warning. Default OFF (annotate-only). Requires --report-azdo-flaky-history. |

All three are opt-in; missing AzDO env vars (SYSTEM_ACCESSTOKEN/SYSTEM_COLLECTIONURI/SYSTEM_TEAMPROJECT/BUILD_DEFINITIONID) → log warning and no-op.
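
As a rough illustration of the no-op behaviour and of the Basic auth scheme described in this PR, here is a minimal Python sketch. It is not the extension's C# code: the helper names (`missing_azdo_vars`, `basic_auth_header`, `try_create_headers`) are hypothetical, and only the environment variable names and the `base64(":<token>")` shape come from the description.

```python
import base64

# Variable names come from the PR description; everything else is illustrative.
REQUIRED_VARS = (
    "SYSTEM_ACCESSTOKEN",
    "SYSTEM_COLLECTIONURI",
    "SYSTEM_TEAMPROJECT",
    "BUILD_DEFINITIONID",
)


def missing_azdo_vars(env):
    """Return the required variables that are absent or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def basic_auth_header(token):
    """Build 'Basic base64(":<token>")': empty user name, token as password."""
    raw = (":" + token).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def try_create_headers(env, log=print):
    """Return request headers, or None (feature becomes a no-op) when config is missing."""
    missing = missing_azdo_vars(env)
    if missing:
        log("warning: flaky history disabled, missing: " + ", ".join(missing))
        return None
    return {"Authorization": basic_auth_header(env["SYSTEM_ACCESSTOKEN"])}
```

Passing `os.environ` as `env` would mirror a real pipeline; returning `None` instead of raising keeps a misconfigured agent from failing the test run.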

How it works

  • Auth: the request header is Authorization: Basic base64(":<SYSTEM_ACCESSTOKEN>") (no Microsoft.TeamFoundationServer.Client dependency; just HttpClient plus a source-generated JsonSerializerContext, so it's AOT-safe).
  • History query: GET {project}/_apis/test/Runs?definitions={pipelineDefinitionId}&minLastUpdatedDate=…&maxLastUpdatedDate=…&automated=true&$top=200, paginated with $skip up to MaxRunsToInspect = 200. Per run, GET …/results?api-version=7.1&outcomes=Failed,Passed, paged with a continuation token. Results are aggregated into Dictionary<automatedTestName, FlakyStats>.
  • Bounded session start — history load has a wall-clock budget (default 30 s). If exceeded, log info and degrade to empty stats; tests start immediately.
  • Resilience — REST calls retry on transient errors (3 attempts, exponential backoff, 429 honors Retry-After). Response bodies truncated to 500 chars in error messages. All callbacks catch everything except OperationCanceledException; history/quarantine failures never fail the test run.
  • Quarantine file — line-based (# comments), case-sensitive ordinal matching against FQN, glob patterns (*, ?) compiled to a single alternation regex. Capped at 10 000 patterns / 4 KB per pattern with a logged warning.
  • Build tag: ##vso[build.addbuildtag]has-quarantined-test-failure is guarded by Interlocked.Exchange so concurrent failure events emit it exactly once.
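
The quarantine-file bullet above can be sketched as follows. This is a hedged Python illustration, not the PR's QuarantineFile.cs: the function names are hypothetical, only whole-line `#` comments are handled (an assumption), and the 4 KB cap is approximated as a character count.

```python
import re

MAX_PATTERNS = 10_000          # caps quoted in the PR description
MAX_PATTERN_CHARS = 4 * 1024   # "4 KB per pattern", approximated as characters


def glob_to_regex(glob):
    """Translate '*' and '?' to regex; escape every other character literally."""
    parts = []
    for ch in glob:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def compile_quarantine(text, warn=print):
    """Compile all non-comment lines into one alternation regex (or None if empty)."""
    alternatives = []
    for line in text.splitlines():
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        if len(entry) > MAX_PATTERN_CHARS:
            warn("skipping oversized quarantine pattern")
            continue
        if len(alternatives) >= MAX_PATTERNS:
            warn("quarantine pattern cap reached; ignoring the rest")
            break
        alternatives.append(glob_to_regex(entry))
    if not alternatives:
        return None
    # No IGNORECASE flag: matching stays case-sensitive, as the PR specifies.
    return re.compile("(?:" + "|".join(alternatives) + ")", re.DOTALL)


def is_quarantined(matcher, fqn):
    """True when the whole fully qualified name matches any pattern."""
    return matcher is not None and matcher.fullmatch(fqn) is not None
```

Compiling one alternation means each failure is checked with a single regex call rather than a loop over thousands of patterns, which is presumably why the PR takes the same approach.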

Highlights from the expert-reviewer round

Implementation went through one full round of expert-reviewer. Critical issues addressed:

  • C1 (feature was broken): the runs query was using buildIds=<pipelineDefinitionId> (the wrong AzDO parameter — buildIds filters by individual build run id, not pipeline). Switched to definitions=<pipelineDefinitionId> so the query actually returns data. Unit test now asserts the URL contains definitions=.
  • C2: history load was synchronous & serial — up to 25 000 sequential HTTP requests blocking session start. Bounded with a 30 s wall-clock budget + cancellation; degrades to empty stats on exceed.
  • C3: $top=501 was silently capped server-side. Now properly pages runs via $skip until MaxRunsToInspect is reached.
  • C4: [REGRESSION] previously fired on a single prior sample, generating false positives everywhere. Now requires TotalCount >= 5 (configurable via MinSamplesForRegressionAnnotation).
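
To make the C4 boundary concrete, here is a small Python sketch of how the annotation could be chosen from aggregated counts. It is an illustration under assumptions: the rule "zero prior failures plus enough samples means [REGRESSION]" is inferred from the Why section, and the names `annotation` and `is_likely_flaky` are hypothetical stand-ins for whatever the C# service exposes.

```python
from dataclasses import dataclass

MIN_SAMPLES_FOR_REGRESSION = 5      # mirrors MinSamplesForRegressionAnnotation
KNOWN_FLAKY_RATE_THRESHOLD = 0.25   # mirrors KnownFlakyFailureRateThreshold


@dataclass(frozen=True)
class FlakyStats:
    failed: int
    passed: int

    @property
    def total(self):
        return self.failed + self.passed

    @property
    def failure_rate(self):
        return self.failed / self.total if self.total else 0.0


def annotation(stats, days):
    """Pick the suffix for a test that just failed, given its prior history."""
    if stats is None or stats.total == 0:
        return ""
    if stats.failed > 0:
        return f"[flaky: failed {stats.failed}/{stats.total} in last {days}d]"
    # Zero prior failures: only call it a regression with enough samples.
    if stats.total >= MIN_SAMPLES_FOR_REGRESSION:
        return "[REGRESSION]"
    return ""


def is_likely_flaky(stats):
    """True when the historical failure rate reaches the demotion threshold."""
    return (
        stats is not None
        and stats.total > 0
        and stats.failure_rate >= KNOWN_FLAKY_RATE_THRESHOLD
    )
```

With these definitions, four clean prior runs produce no annotation while five produce `[REGRESSION]`, which is the 4-versus-5 boundary the unit tests are described as covering.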

Major items also addressed: User-Agent/Accept headers added (prevents AzDO WAF-side 403s), source-generated JsonSerializerContext (AOT-safe), error-body truncation, inner-exception preservation on retry exhaustion, quarantine pattern caps, ordinal-case-sensitive matching, IAzureDevOpsHistoryService interface + proper DI registration, guard-clause CLI validation, acceptance tests for invalid-value error paths.
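
The retry, truncation, and inner-exception points can be sketched together. The Python below is a transport-agnostic illustration, not the PR's client: `send` is a hypothetical callable returning `(status, headers, body)`, the set of transient status codes and the one-second backoff base are assumptions, and `Retry-After` is honoured on any retried response rather than only on 429.

```python
import time

MAX_ATTEMPTS = 3
MAX_ERROR_BODY_CHARS = 500
TRANSIENT_STATUSES = {408, 429, 500, 502, 503, 504}  # assumption: not listed in the PR


class HistoryRequestError(Exception):
    """Raised when a history request fails permanently."""


def truncate(body, limit=MAX_ERROR_BODY_CHARS):
    """Keep error messages bounded so large response bodies never flood the log."""
    return body if len(body) <= limit else body[:limit] + "..."


def retry_delay(attempt, headers, base_seconds=1.0):
    """Retry-After (in seconds) wins when present; otherwise back off exponentially."""
    retry_after = headers.get("Retry-After", "")
    if retry_after.isdigit():
        return float(retry_after)
    return base_seconds * (2 ** attempt)


def send_with_retry(send, sleep=time.sleep):
    """Call `send()` up to MAX_ATTEMPTS times, retrying only transient failures."""
    last_error = None
    for attempt in range(MAX_ATTEMPTS):
        try:
            status, headers, body = send()
        except OSError as exc:  # network-level failure, treated as transient
            last_error, headers = exc, {}
        else:
            if status < 400:
                return body
            last_error = HistoryRequestError(f"HTTP {status}: {truncate(body)}")
            if status not in TRANSIENT_STATUSES:
                raise last_error
        if attempt + 1 < MAX_ATTEMPTS:
            sleep(retry_delay(attempt, headers))
    # Chain the last failure so the root cause survives retry exhaustion.
    raise HistoryRequestError("retries exhausted") from last_error
```

Injecting `sleep` keeps the backoff testable without real delays; a caller that must never fail the test run would still wrap this in its own catch-all, as the PR describes.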

Tests

546 unit tests pass. New coverage:

  • AzureDevOpsHistoryServiceTests.cs — history aggregation, paging, time budget, regression boundary (4 vs 5 samples), quarantine tag-emitted-once.
  • AzureDevOpsHistoryClientTests.cs — URL composition (definitions= parameter), User-Agent/Accept headers, 429 retry honoring Retry-After.
  • AzureDevOpsCommandLineProviderTests.cs — cross-product validation (e.g. --demote-known-flaky without --flaky-history, --quarantine-file without --report-azdo).
  • AzureDevOpsCommandLineTests.cs — acceptance-style coverage.

HelpInfoAllExtensionsTests expectations updated for the new options (both --help and --info blocks, alphabetical order preserved).

Build status (local)

  • .\.dotnet\dotnet.exe build src\Platform\Microsoft.Testing.Extensions.AzureDevOpsReport\Microsoft.Testing.Extensions.AzureDevOpsReport.csproj -c Debug → 0 warnings, 0 errors.
  • .\.dotnet\dotnet.exe test test\UnitTests\Microsoft.Testing.Extensions.UnitTests\Microsoft.Testing.Extensions.UnitTests.csproj → 546/546 passed.
  • .\build.cmd -pack → 0 warnings, 0 errors.

Out of scope (deliberate)

Checklist

  • Critical & Major review findings addressed
  • Localized via resx + xlf (regenerated with /t:UpdateXlf, not hand-edited)
  • Help/info acceptance test expectations updated
  • No new public API (besides one internal IAzureDevOpsHistoryService)
  • .\build.cmd green (0 warnings, 0 errors)
  • PR feedback addressed

Refs #5951

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 16, 2026 19:53
Contributor

Copilot AI left a comment


Pull request overview

Adds opt-in Azure DevOps flaky-history annotations and quarantine awareness to the Microsoft.Testing.Extensions.AzureDevOpsReport extension. Failures can now be annotated with historical flake context, demoted from error to warning when known-flaky, and demoted with a [quarantined] tag when listed in a quarantine file (which also emits a one-shot ##vso[build.addbuildtag]has-quarantined-test-failure).
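
The one-shot build tag mentioned above could look roughly like this in Python. This is only a sketch of the "emit exactly once under concurrency" idea: a lock-protected flag stands in for the atomic exchange a C# implementation would more likely use, and the class name `OneShotBuildTag` is hypothetical.

```python
import threading


class OneShotBuildTag:
    """Writes an AzDO build-tag logging command at most once, even under concurrency."""

    def __init__(self, tag, write=print):
        self._command = f"##vso[build.addbuildtag]{tag}"
        self._write = write
        self._lock = threading.Lock()
        self._emitted = False

    def emit(self):
        """Return True only for the single call that actually wrote the command."""
        with self._lock:
            if self._emitted:
                return False
            self._emitted = True
        # Write outside the lock; the flag already guarantees a single winner.
        self._write(self._command)
        return True
```

Every quarantined failure can call `emit()` unconditionally, and only the first caller produces output.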

Changes:

  • New CLI options --report-azdo-flaky-history, --report-azdo-quarantine-file, --report-azdo-demote-known-flaky with cross-option validation in AzureDevOpsCommandLineProvider.
  • New AzureDevOpsHistoryService + AzureDevOpsHistoryClient (AOT-safe JsonSerializerContext) that queries the AzDO REST Runs/Results APIs under a 30 s wall-clock budget, with retries, 429 Retry-After honoring, paging caps, and a regression-annotation min-sample threshold.
  • AzureDevOpsReporter now annotates errors with [flaky: failed K/N in last Md] / [REGRESSION] / [quarantined], and demotes severity per the quarantine file and known-flaky rule.
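
A minimal Python sketch of the annotate-and-demote step listed above. It is an assumption-laden illustration: the real reporter's message layout, tag ordering, and any extra logissue properties are not shown in this PR page, the message is assumed to be a single line needing no escaping, and `format_log_issue` is a hypothetical name. Only the `##vso[task.logissue type=...]` command shape and the tags come from Azure DevOps and the description.

```python
def format_log_issue(message, history_suffix="", quarantined=False, demote_flaky=False):
    """Build a task.logissue command, demoting to warning when the failure is known noise.

    `history_suffix` is the flaky/regression annotation (may be empty);
    `demote_flaky` is True when demotion is enabled and the test is likely flaky.
    """
    tags = []
    if quarantined:
        tags.append("[quarantined]")
    if history_suffix:
        tags.append(history_suffix)
    severity = "warning" if quarantined or demote_flaky else "error"
    text = " ".join([message] + tags)
    return f"##vso[task.logissue type={severity}]{text}"
```

Keeping severity selection separate from suffix building means annotate-only mode (the default) still shows the history tag while leaving the issue as an error.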
| File | Description |
| --- | --- |
| src/.../AzureDevOpsCommandLineOptions.cs | Adds 3 new option name constants. |
| src/.../AzureDevOpsCommandLineProvider.cs | Registers new options and adds cross-option validation. |
| src/.../AzureDevOpsExtensions.cs | Wires AzureDevOpsHistoryService as data consumer + session lifetime handler. |
| src/.../AzureDevOpsHistoryClient.cs | New REST client (auth, paging, retries, AOT JSON). |
| src/.../AzureDevOpsHistoryClientJsonContext.cs | Source-generated JSON context for DTOs. |
| src/.../AzureDevOpsHistoryService.cs | Loads/aggregates flaky stats with a bounded budget; exposes TryGetStats/IsLikelyFlaky. |
| src/.../AzureDevOpsReporter.cs | Adds annotation suffix building, severity demotion, one-shot quarantine build tag. |
| src/.../FlakyStats.cs | Struct holding pass/fail counts and failure rate. |
| src/.../IAzureDevOpsHistoryService.cs | Internal abstraction over the history service. |
| src/.../QuarantineFile.cs | Parses quarantine file (globs, # comments, caps) into regex matchers. |
| src/.../Microsoft.Testing.Extensions.AzureDevOpsReport.csproj | Adds System.Text.Json dependency and DynamicProxyGenAssembly2 IVT for Moq. |
| Directory.Packages.props | Pins System.Text.Json version. |
| src/.../Resources/AzureDevOpsResources.resx | New strings for options, warnings, and annotation templates; fixes prior Eanble/AzureDev Ops typos. |
| src/.../Resources/xlf/*.xlf (12 locales) | Regenerated XLFs for new strings; Description/OptionDescription flipped to needs-review-translation after the English typo fix. |
| test/.../AzureDevOpsHistoryClientTests.cs | Asserts URL composition (definitions=), headers, and run-paging behavior. |
| test/.../AzureDevOpsHistoryServiceTests.cs | Covers aggregation, paging, time-budget timeout, regression threshold, demote, quarantine tag-once. |
| test/.../AzureDevOpsCommandLineProviderTests.cs | Validates cross-option error messages. |
| test/.../AzureDevOpsCommandLineTests.cs | Acceptance-style test for invalid CLI argument errors. |
| test/.../HelpInfoAllExtensionsTests.cs | Updates --help / --info expectations for new options. |

Copilot's findings

  • Files reviewed: 31/31 changed files
  • Comments generated: 0

@Evangelink Evangelink marked this pull request as ready for review May 17, 2026 19:08
Evangelink and others added 2 commits May 17, 2026 21:19
Aligns the failure message with the convention used by Assert.Contains
(which uses GetType().Name) and fixes the unit test expectations in
AssertTests.AreAll.cs that expected the short type name.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 18, 2026 07:14
Contributor

Copilot AI left a comment


Copilot's findings

  • Files reviewed: 32/32 changed files
  • Comments generated: 4

Comment thread src/TestFramework/TestFramework/Assertions/Assert.AreAllDistinct.cs
Copilot AI added 2 commits May 18, 2026 22:03
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Revert unrelated merge-resolution changes that regressed main (Assert.AreEqual updates from RFC 012, AreAllDistinct, PreferAsyncAssertion, Arcade/Versions, eng/common)
- Format DemoteKnownFlakyOptionDescription with the threshold percentage at runtime to keep the resource string and KnownFlakyFailureRateThreshold constant in sync
- Regenerate XLF resources via UpdateXlf

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 19, 2026 04:49
@Evangelink
Member Author

Addressed the open review feedback in 0a2cdb5:

  1. Assert.AreAllDistinct.cs (unrelated change) — Reverted. The bad merge resolution in 0dac227 had also accidentally regressed several other unrelated files merged from main (the RFC 012 changes in Assert.AreEqual*, the MSTEST0064 follow-up in PreferAsyncAssertionFixer/PreferAsyncAssertionAnalyzerTests, and eng/Version.Details.xml / eng/Versions.props / eng/common/* / global.json). All of these were restored from origin/main so the PR diff is now scoped to AzDO files only.
  2. AzureDevOpsExtensions.cs (hard cast to ServiceProvider) — Already fixed in 0dac227 (history service is now created lazily via closure shared by the data consumer and the lifetime handler, no cast to the concrete ServiceProvider).
  3. AzureDevOpsReporter.cs constant/resource drift (25%) — Switched to runtime formatting. KnownFlakyFailureRateThreshold is now internal const; DemoteKnownFlakyOptionDescription carries a {0}% placeholder and AzureDevOpsCommandLineProvider formats it once at startup with KnownFlakyFailureRateThreshold * 100. XLF files regenerated via UpdateXlf; the rendered help text is identical to before, so HelpInfoAllExtensionsTests expectations remain valid.
  4. AzureDevOpsHistoryService.cs budget-task leak — Already fixed in 0dac227 (the budget CancellationTokenSource is canceled as soon as the load task wins the race, so the Task.Delay registration is released promptly).
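
The bounded history load discussed in item 4 can be illustrated with Python's asyncio. This is a sketch of the pattern, not the C# service: `load_with_budget` and its `load` parameter are hypothetical, and `asyncio.wait_for` stands in for the race between the load task and a budget timer (it cancels the load on timeout, and drops its timer once the load finishes first).

```python
import asyncio


async def load_with_budget(load, budget_seconds=30.0, log=print):
    """Await `load()` for at most `budget_seconds`; degrade to empty stats otherwise."""
    try:
        return await asyncio.wait_for(load(), timeout=budget_seconds)
    except asyncio.TimeoutError:
        log("info: flaky-history load exceeded its budget; continuing without history")
        return {}
    except asyncio.CancelledError:
        # Cancellation of the whole session must propagate, never be swallowed.
        raise
    except Exception as exc:
        # Any other failure must not fail the test run.
        log(f"warning: flaky-history load failed: {exc}")
        return {}
```

An empty dictionary means later lookups find no stats, so failures are reported without annotations rather than blocking session start.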

Verified locally:

  • dotnet build src/Platform/Microsoft.Testing.Extensions.AzureDevOpsReport → 0 warnings, 0 errors.
  • AzureDevOps unit tests (32 of them in Microsoft.Testing.Extensions.UnitTests) → all pass.
  • Full Microsoft.Testing.Extensions.UnitTests → 151/153 pass, 2 skipped (the two crash-report tests skipped on Windows, as on main).
  • AreAllDistinct tests in TestFramework.UnitTests (23 tests) — all pass.

Contributor

Copilot AI left a comment


Copilot's findings

  • Files reviewed: 58/58 changed files
  • Comments generated: 1

Comment thread src/TestFramework/TestFramework/Resources/FrameworkMessages.resx
Copilot AI review requested due to automatic review settings May 19, 2026 06:41
@Evangelink Evangelink force-pushed the dev/amauryleve/azdo-flaky-history branch from 6abe725 to 0a2cdb5 Compare May 19, 2026 06:41
Contributor

Copilot AI left a comment


Copilot's findings

  • Files reviewed: 58/58 changed files
  • Comments generated: 0 new

Resolve XLF conflicts by taking main's translations (the AzDO PR should not modify FrameworkMessages localizations).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@Evangelink Evangelink merged commit 7052533 into main May 19, 2026
82 checks passed
@Evangelink Evangelink deleted the dev/amauryleve/azdo-flaky-history branch May 19, 2026 08:51