[CI] Extend gate to all test types and decouple from PR review by kubaflo · Pull Request #34705 · dotnet/maui

kubaflo · 2026-03-27T13:29:32Z

Summary

Extends the CI PR review pipeline to support all test types (UI tests, device tests, unit tests, XAML tests) and restructures the review flow by decoupling the gate from the copilot agent.

Before

Gate only supported UI tests (TestCases.HostApp / TestCases.Shared.Tests)
PRs with device tests, unit tests, or XAML tests were skipped by the gate
Gate ran as Phase 2 inside the copilot agent (4-phase: Pre-Flight → Gate → Try-Fix → Report)
Gate results were duplicated across all phase outputs
AI summary comment included session history merging (841 lines of code)

After

Gate supports all test types with auto-detection
Gate runs as a standalone script step before the copilot agent
Gate posts its own separate PR comment (`<!-- AI Gate -->`)
AI summary is simplified (170 lines, always overwrites, no session history)
PR review is now 3 phases: Pre-Flight → Try-Fix → Report

New Scripts

| Script | Purpose |
|---|---|
| `Detect-TestsInDiff.ps1` | Analyzes PR files, classifies tests by type (UITest, DeviceTest, UnitTest, XamlUnitTest), extracts method names from diffs |
| `post-gate-comment.ps1` | Posts/updates gate result as separate PR comment |
| `RunTests.ps1` | Unified test runner entry point for all test types |

Test Detection

pwsh .github/scripts/shared/Detect-TestsInDiff.ps1 -PRNumber 25129

📱 [DeviceTest] EditorTests (PlaceholderHorizontalTextAlignment)
   Filter:  Category=Editor
🖥️ [UITest] Issue10987
   Filter:  Issue10987
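
A minimal sketch of how a changed file path could be classified into one of the four test types. The function name `Get-TestTypeFromPath` and every path pattern other than `TestCases.HostApp` / `TestCases.Shared.Tests` are assumptions for illustration, not the actual rules in `Detect-TestsInDiff.ps1`.

```powershell
# Hypothetical helper: maps a changed file path to one of the gate's test types.
# The patterns are illustrative assumptions, not the real detection rules.
function Get-TestTypeFromPath {
    param([Parameter(Mandatory)][string]$Path)

    # Order matters: the more specific patterns are checked first.
    if ($Path -match 'TestCases\.(HostApp|Shared\.Tests)') { return 'UITest' }
    if ($Path -match 'DeviceTests')                        { return 'DeviceTest' }
    if ($Path -match 'Xaml\.UnitTests')                    { return 'XamlUnitTest' }
    if ($Path -match 'UnitTests')                          { return 'UnitTest' }
    return $null   # not recognized as a test file
}
```

Checking `Xaml.UnitTests` before the generic `UnitTests` pattern keeps XAML tests from being misclassified as plain unit tests.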

New Review Flow

Step 0: Branch setup
Step 1: Gate (verify-tests-fail.ps1 — direct script, no copilot agent)
         → Posts <!-- AI Gate --> comment immediately
Step 2: PR Review (copilot agent — 3 phases: Pre-Flight, Try-Fix, Report)
         → Gate result passed in prompt
Step 3: Post AI Summary (<!-- AI Summary --> comment)
Step 4: Apply labels

PR Comments (Two Separate Comments)

Gate comment (`<!-- AI Gate -->`):

## 🚦 Gate — Test Verification
► Expand Full Gate — abc1234 · Fix editor alignment

### Gate Result: ✅ PASSED
| Step | Expected | Actual | Result |
|---|---|---|---|
| Without fix | FAIL | FAIL | ✅ |
| With fix | PASS | PASS | ✅ |

AI Summary comment (`<!-- AI Summary -->`):
Pre-Flight, Fix, Report sections only — no gate duplication.

Key Changes

verify-tests-fail.ps1: Auto-detects test type, routes to correct runner (BuildAndRunHostApp, Run-DeviceTests, dotnet test), iterates over all detected tests, -Platform mandatory
Detect-TestsInDiff.ps1: Shared detection engine — reads [Category] attributes for device test filtering, extracts method names from PR diffs
Review-PR.ps1: Gate as Step 1 (script), PR review as Step 2 (copilot), removed PR finalize step
post-ai-summary-comment.ps1: Rewritten from 841 → 170 lines, always overwrites
pr-gate.md: Strict output template, no cross-phase duplication rule
pr-review/SKILL.md: 3 phases (removed Gate), no-duplication rule
EstablishBrokenBaseline.ps1: Excludes TestUtils/DeviceTests.Runners from fix file detection

Verified

Gate passed locally on Share device tests: without fix=FAIL ✅, with fix=PASS ✅
Detection tested on PRs: Fixed Editor HorizontalTextAlignment does not update at run time #25129, [Testing] Refactoring Feature Matrix UITest Cases for Editor Control #34615, [iOS, MacCatalyst] Fix CollectionView grid spacing updates for first row and column #34598, [Net10] OnSizeAllocated in Shell not triggered - fix #31056
Comments posted to 8 PRs from CI build artifacts

github-actions · 2026-03-27T13:29:44Z

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.sh | bash -s -- 34705

Or

Run remotely in PowerShell:

iex "& { $(irm https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.ps1) } 34705"

Copilot

Pull request overview

Extends the CI “gate” and PR review automation to detect and run all MAUI test types (UI/device/unit/XAML), and restructures the review flow so gate runs as a standalone step with its own PR comment, while the Copilot-driven review focuses on Pre-Flight/Try-Fix/Report.

Changes:

Update verify-tests-fail.ps1 to auto-detect test type(s) and dispatch to the right runner, running all detected tests.
Add shared test detection (Detect-TestsInDiff.ps1) and a dedicated gate comment poster (post-gate-comment.ps1); simplify AI summary posting.
Update review orchestration/docs to remove gate from the pr-review skill and run it from Review-PR.ps1 instead.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 10 comments.

| File | Description |
|---|---|
| .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 | Adds multi-test-type detection/routing and multi-test execution for gate verification. |
| .github/skills/verify-tests-fail-without-fix/SKILL.md | Updates skill documentation for broader test-type support (but currently out of sync with script behavior/outputs). |
| .github/skills/try-fix/references/example-invocation.md | Documents device/unit test command examples. |
| .github/skills/try-fix/SKILL.md | Updates try-fix guidance to select the correct runner per test type. |
| .github/skills/pr-review/SKILL.md | Changes orchestrator to 3 phases (Pre-Flight/Try-Fix/Report) and states gate is pre-run. |
| .github/scripts/shared/Detect-TestsInDiff.ps1 | New shared test detection/classification script used by gate and tooling. |
| .github/scripts/post-gate-comment.ps1 | New script to post/update a dedicated `<!-- AI Gate -->` PR comment. |
| .github/scripts/post-ai-summary-comment.ps1 | Simplifies AI summary comment generation (no session history; gate posted separately). |
| .github/scripts/RunTests.ps1 | New unified local entry point to run Unit/Device/UI/Integration tests. |
| .github/scripts/Review-PR.ps1 | Reorders flow to run gate first via script, then invoke `pr-review`, then post comments/labels. |
| .github/scripts/EstablishBrokenBaseline.ps1 | Expands “test path” patterns to exclude more test utility/runner paths from fix detection. |
| .github/pr-review/pr-report.md | Updates report phase prerequisites now that gate is external to `pr-review`. |
| .github/pr-review/pr-gate.md | Updates gate doc for detection template/output rules (but still references task-agent flow). |
| .github/copilot-instructions.md | Updates repository Copilot instructions to reflect 3-phase PR review and multi-type gate. |

Copilot · 2026-03-27T13:39:18Z

.github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1

+        $TestLog = Join-Path $OutputPath "test-failure-$($testEntry.TestName).log"
+


Log file names are derived directly from TestName (e.g., test-failure-<TestName>.log). TestName can include spaces/parentheses/commas (device tests append method names), which can create awkward or invalid paths on some filesystems and can hit path-length limits. Consider sanitizing TestName for file names (or use an index-based file name) and store the display name inside the log content instead.

Suggested change

-        $TestLog = Join-Path $OutputPath "test-failure-$($testEntry.TestName).log"
+        # Sanitize TestName for use in a file name and keep it reasonably short
+        $rawTestName = [string]$testEntry.TestName
+        $invalidFileNameChars = [IO.Path]::GetInvalidFileNameChars()
+        $extraProblematicChars = [char[]]' ()[],'
+        $charsToReplace = $invalidFileNameChars + $extraProblematicChars
+        $sanitizedTestName = ($rawTestName.ToCharArray() | ForEach-Object {
+            if ($charsToReplace -contains $_) { '_' } else { $_ }
+        }) -join ''
+        if ([string]::IsNullOrWhiteSpace($sanitizedTestName)) {
+            $sanitizedTestName = "test-$testIndex"
+        }
+        $maxNameLength = 60
+        if ($sanitizedTestName.Length -gt $maxNameLength) {
+            $sanitizedTestName = $sanitizedTestName.Substring(0, $maxNameLength)
+        }
+        $TestLog = Join-Path $OutputPath ("test-failure-{0}.log" -f $sanitizedTestName)

Copilot · 2026-03-27T13:39:19Z

.github/skills/verify-tests-fail-without-fix/SKILL.md

+3. Auto-detects test classes from changed test files
+4. Routes to the appropriate test runner
+5. Runs tests (should FAIL to prove they catch the bug)
+6. **Updates PR labels** based on result
+7. Reports result


This doc claims the script "Updates PR labels" as part of the workflow, but the updated verify-tests-fail.ps1 no longer contains label-management logic. Please remove or update these steps so they reflect the current behavior (labeling appears to happen later in Review-PR.ps1).

Copilot · 2026-03-27T13:39:19Z

.github/scripts/shared/Detect-TestsInDiff.ps1

+        $patch = $null
+        if ($PRNumber) {
+            # Get per-file patch from GitHub API
+            $patch = gh api "repos/dotnet/maui/pulls/$PRNumber/files" --jq ".[] | select(.filename == `"$file`") | .patch" 2>$null
+        } else {


For each device-test file, Step 4 calls gh api repos/dotnet/maui/pulls/$PRNumber/files and filters it with jq. This creates an N+1 GitHub API pattern that can hit rate limits and slow down gate on large PRs. Consider fetching the PR files/patches once (single API call) and caching them in a lookup keyed by filename.
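
The caching idea can be sketched as a lookup built once from a single PR-files response. `New-PatchLookup` is a hypothetical name; the `filename` and `patch` fields are the ones the `--jq` filter above already reads. This sketch does not handle API pagination.

```powershell
# Hypothetical helper: builds a filename -> patch lookup from one PR-files response.
function New-PatchLookup {
    param([object[]]$PrFiles = @())

    $lookup = @{}
    foreach ($f in $PrFiles) {
        # Some entries may carry no patch field; store an empty string for those.
        $lookup[$f.filename] = if ($null -ne $f.patch) { $f.patch } else { '' }
    }
    return $lookup
}

# Usage sketch (one API call instead of one per file):
#   $prFiles = gh api "repos/dotnet/maui/pulls/$PRNumber/files" | ConvertFrom-Json
#   $patches = New-PatchLookup -PrFiles $prFiles
#   $patch   = $patches[$file]
```

Each per-file step then becomes a hashtable read, so the number of API calls no longer grows with the number of device-test files.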

Copilot · 2026-03-27T13:39:19Z

.github/scripts/shared/Detect-TestsInDiff.ps1

+            # Try from git diff
+            $patch = git diff $mergeBase HEAD -- $file 2>$null


When running without -PRNumber, Step 4 uses git diff $mergeBase HEAD -- $file, but $mergeBase is only set in the earlier auto-diff path. If the caller provides -ChangedFiles directly, $mergeBase may be $null, producing an unintended diff (or no patch). Consider ensuring $mergeBase is always initialized for non-PR runs, or diffing against HEAD explicitly.

Suggested change

-            # Try from git diff
-            $patch = git diff $mergeBase HEAD -- $file 2>$null
+            # Try from git diff. If $mergeBase is not set (for example when -ChangedFiles
+            # was provided directly), fall back to diffing against the previous commit.
+            if ($mergeBase) {
+                $patch = git diff $mergeBase HEAD -- $file 2>$null
+            } else {
+                $patch = git diff HEAD~1 HEAD -- $file 2>$null
+            }

Copilot · 2026-03-27T13:39:20Z

.github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1

+            -DetectedProjectPath $testEntry.ProjectPath `
+            -LogFile $TestLog
+
+        $testResult = Get-TestResultFromOutput -LogFile $testOutputLog


Get-TestResultFromOutput accepts a -TestFilter parameter (and contains filter-aware parsing logic), but this call doesn't pass the current test's filter. Either pass $testEntry.Filter through or remove the unused parameter/logic; otherwise the filter-aware parsing path can never be exercised.

Suggested change

-        $testResult = Get-TestResultFromOutput -LogFile $testOutputLog
+        $testResult = Get-TestResultFromOutput -LogFile $testOutputLog -TestFilter $testEntry.Filter

Copilot · 2026-03-27T13:39:20Z

.github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1

 .PARAMETER Platform
    Target platform: "android", "ios", "catalyst" (MacCatalyst), or "windows"
+    Required for UITest and DeviceTest types. Optional for UnitTest and XamlUnitTest.
+


The comment header says -Platform is optional for UnitTest/XamlUnitTest, but $Platform is declared as a mandatory parameter, and the examples suggest running without it. Either make -Platform truly optional (and only enforce it for UI/Device tests), or update the docs/examples to reflect that -Platform is always required.

Copilot · 2026-03-27T13:39:20Z

.github/skills/verify-tests-fail-without-fix/SKILL.md

 | `test-without-fix.log` | Full test output from run without fix |
 | `test-with-fix.log` | Full test output from run with fix |

-**Plus UI test logs in** `CustomAgentLogsTmp/UITests/`:
- `android-device.log` or `ios-device.log` - Device logs
- `test-output.log` - NUnit test output
+**Plus test logs in** `CustomAgentLogsTmp/`:
+- `UITests/` - UI test device logs and output


In the Output Files section, the directory/path described earlier (CustomAgentLogsTmp/PRState/<PRNumber>/verify-tests-fail/) doesn't match the current verify-tests-fail.ps1 output location (now under .../PRAgent/gate/verify-tests-fail). Please update the paths and example structure here so consumers can find verification-report.md and the per-test logs reliably.

Copilot · 2026-03-27T13:39:20Z

.github/pr-review/pr-gate.md

+2. **Select platform** — must be affected by bug AND available on host (see table above).

 3. **Run verification via task agent** (MUST use task agent — never inline):
   ```


Step 3 still instructs running gate via the task agent, but the new workflow in Review-PR.ps1 runs gate directly via verify-tests-fail.ps1 before invoking the Copilot pr-review skill. Please update this step to reflect the new script-driven gate (or clarify that this doc is only for manual, agent-driven gate runs).

Copilot · 2026-03-27T13:39:21Z

.github/scripts/post-gate-comment.ps1

+# ============================================================================
+
+# Get latest commit info
+$commitJson = gh api "repos/dotnet/maui/pulls/$PRNumber/commits" --jq '.[-1] | {message: .commit.message, sha: .sha}' 2>$null | ConvertFrom-Json


gh api ... | ConvertFrom-Json will throw if gh fails (e.g., auth missing / rate limit) because stderr is suppressed and stdout may be empty. Since $ErrorActionPreference = 'Stop', this can break the whole posting step. Wrap this in try/catch and fall back to Unknown commit info when the API call fails or returns empty.

Suggested change

-$commitJson = gh api "repos/dotnet/maui/pulls/$PRNumber/commits" --jq '.[-1] | {message: .commit.message, sha: .sha}' 2>$null | ConvertFrom-Json
+$commitJson = $null
+try {
+    $rawCommitJson = gh api "repos/dotnet/maui/pulls/$PRNumber/commits" --jq '.[-1] | {message: .commit.message, sha: .sha}' 2>$null
+    if (-not [string]::IsNullOrWhiteSpace($rawCommitJson)) {
+        $commitJson = $rawCommitJson | ConvertFrom-Json
+    }
+}
+catch {
+    # ${PRNumber} is delimited so the following ':' is not parsed as a scope/drive qualifier
+    Write-Host "⚠️ Failed to fetch or parse commit info for PR #${PRNumber}: $($_.Exception.Message)" -ForegroundColor Yellow
+    $commitJson = $null
+}

Copilot · 2026-03-27T13:39:21Z

.github/scripts/post-ai-summary-comment.ps1

+$commitJson = gh api "repos/dotnet/maui/pulls/$PRNumber/commits" --jq '.[-1] | {message: .commit.message, sha: .sha}' 2>$null | ConvertFrom-Json
+$commitTitle = if ($commitJson) { ($commitJson.message -split "`n")[0] } else { "Unknown" }
+$commitSha = if ($commitJson) { $commitJson.sha.Substring(0, 7) } else { "unknown" }
+$commitUrl = if ($commitJson) { "https://github.com/dotnet/maui/commit/$($commitJson.sha)" } else { "#" }



Similar to post-gate-comment.ps1, gh api ... | ConvertFrom-Json will throw when gh fails or returns empty output (stderr is suppressed). With $ErrorActionPreference = 'Stop', that prevents the summary comment from being posted. Add try/catch and default commit title/SHA/URL when the API call is unavailable.

Suggested change

-$commitJson = gh api "repos/dotnet/maui/pulls/$PRNumber/commits" --jq '.[-1] | {message: .commit.message, sha: .sha}' 2>$null | ConvertFrom-Json
-$commitTitle = if ($commitJson) { ($commitJson.message -split "`n")[0] } else { "Unknown" }
-$commitSha = if ($commitJson) { $commitJson.sha.Substring(0, 7) } else { "unknown" }
-$commitUrl = if ($commitJson) { "https://github.com/dotnet/maui/commit/$($commitJson.sha)" } else { "#" }
+$commitJson = $null
+$commitTitle = "Unknown"
+$commitSha = "unknown"
+$commitUrl = "#"
+try {
+    $commitRaw = gh api "repos/dotnet/maui/pulls/$PRNumber/commits" --jq '.[-1] | {message: .commit.message, sha: .sha}' 2>$null
+    if ($commitRaw) {
+        $commitJson = $commitRaw | ConvertFrom-Json
+    }
+} catch {
+    # ${PRNumber} is delimited so the following ':' is not parsed as a scope/drive qualifier
+    Write-Warning "Failed to fetch latest commit info for PR #${PRNumber}: $($_.Exception.Message)"
+}
+if ($commitJson) {
+    $commitTitle = ($commitJson.message -split "`n")[0]
+    $commitSha = $commitJson.sha.Substring(0, 7)
+    $commitUrl = "https://github.com/dotnet/maui/commit/$($commitJson.sha)"
+}

kubaflo · 2026-03-27T13:44:26Z

🔬 Multi-Model Code Review — PR #34705

Cross-pollinated review from GPT-5.4 (Gemini persona), GPT-5.2-Codex, and Claude Opus 4.6, each reviewing independently then synthesized.

✅ Consensus: Architecture is Sound

All three models agree the decoupling direction is correct:

Moving gate from Phase 2 (inside copilot agent) → standalone script Step 1 is deterministic, faster, and cleaner
Two-comment approach (`<!-- AI Gate -->` / `<!-- AI Summary -->`) separates concerns well
post-ai-summary-comment.ps1 rewrite (841→170 lines) is a major simplification

🔴 Critical — Unanimous (3/3 models flagged)

1. Empty test array → false-positive gate PASS

All three models independently identified this as the #1 issue:

$failedWithoutFix = ($withoutFixResults | Where-Object { $_.Passed }).Count -eq 0
$passedWithFix = ($withFixResults | Where-Object { -not $_.Passed }).Count -eq 0

When $withoutFixResults is empty (zero tests ran), both evaluate to $true → gate reports PASSED with nothing tested. This is the most dangerous failure mode.

Fix (one-liner):

if ($AllDetectedTests.Count -eq 0) {
    Write-Error "No tests detected — gate cannot verify"; exit 1
}
# + similar guard after each test run loop for empty results
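
The vacuous-truth problem and a guard against it can be sketched as follows. `Get-GateVerdict` and the `INCONCLUSIVE` state are hypothetical names for illustration; each result is assumed to be an object with a boolean `Passed` property, as in the snippet above.

```powershell
# Hypothetical helper: derives the gate verdict from two result lists.
# Each result is assumed to expose a boolean Passed property.
function Get-GateVerdict {
    param(
        [object[]]$WithoutFixResults = @(),
        [object[]]$WithFixResults    = @()
    )

    # "No results" must never count as success: the two checks below are
    # vacuously true for empty input, so reject it explicitly first.
    if ($WithoutFixResults.Count -eq 0 -or $WithFixResults.Count -eq 0) {
        return 'INCONCLUSIVE'
    }

    # @(...) forces an array so .Count is reliable for zero or one match.
    $failedWithoutFix = @($WithoutFixResults | Where-Object { $_.Passed }).Count -eq 0
    $passedWithFix    = @($WithFixResults | Where-Object { -not $_.Passed }).Count -eq 0

    if ($failedWithoutFix -and $passedWithFix) { return 'PASSED' }
    return 'FAILED'
}
```

Placing the guard inside the aggregation step protects it regardless of how the result lists ended up empty.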

2. No automated tests for ~1,200 lines of new script logic (Opus + Gemini)

Detect-TestsInDiff.ps1 (424 lines), RunTests.ps1 (625 lines), post-gate-comment.ps1 (134 lines) — all pure logic, highly testable with Pester, but zero tests. Get-TestResultFromOutput does regex parsing of varied output formats — the most fragile code in this PR.

🟡 Medium — Strong Agreement (2-3/3 models)

3. Device test filter falls back to bare class name (All 3)

When [Category] extraction fails, Detect-TestsInDiff.ps1 falls back to the class name (e.g., EditorTests). But Run-DeviceTests.ps1 expects Category=X format. A bare class name either runs all tests or fails silently. Additionally (Gemini), the category regex \[Category\(TestCategory\.(\w+)\)\] misses string categories ([Category("Battery")]) and multi-category attributes.
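
One possible shape for a broader category pattern, as a sketch only. `Get-CategoryName` is a hypothetical helper, and the pattern is an assumption about what the attributes look like (`TestCategory.X` constants or quoted strings), not the regex used in `Detect-TestsInDiff.ps1`.

```powershell
# Hypothetical helper: extracts category names from C# test source text.
# Covers Category(TestCategory.X), Category("X"), and several Category(...)
# entries inside one attribute list.
function Get-CategoryName {
    param([string]$Source)

    # .NET regex allows the same group name in both alternatives.
    $pattern = 'Category\(\s*(?:TestCategory\.(?<name>\w+)|"(?<name>[^"]+)")\s*\)'

    [regex]::Matches($Source, $pattern) |
        ForEach-Object { $_.Groups['name'].Value } |
        Select-Object -Unique
}
```

Because the pattern does not anchor on the surrounding brackets, it also matches each entry of a comma-separated attribute list.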

4. Gate can't produce documented SKIPPED state (Gemini + Codex)

Review-PR.ps1 unconditionally runs verify-tests-fail.ps1. If no tests detected, the script exits with error → gate becomes FAILED instead of the documented ⚠️ SKIPPED. Testless PRs shouldn't be failures.

5. Review-PR.ps1 doesn't pass -RequireFullVerification (Gemini)

pr-gate.md says invoke with RequireFullVerification: true, but Review-PR.ps1 omits it. The gate can silently fall back to failure-only mode, skipping "tests pass WITH fix" verification — half the gate contract.

6. Synthesized test entries have inconsistent key shapes (Opus + Codex)

Detection-returned entries have Runner, NeedsPlatform, Files, Methods. Explicitly-provided entries (verify-failure-only path) omit Runner and NeedsPlatform. Future code accessing $t.Runner on these entries will get $null.

Recommendation: Create a New-TestEntry helper that always produces a canonical hashtable shape.
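
A sketch of what such a helper might look like. The key names are taken from the review comments and diff snippets in this thread; the defaults and the type-to-runner mapping are assumptions, not the PR's implementation.

```powershell
# Hypothetical sketch of the recommended helper: every entry gets the same keys.
function New-TestEntry {
    param(
        [Parameter(Mandatory)]
        [ValidateSet('UITest', 'DeviceTest', 'UnitTest', 'XamlUnitTest')]
        [string]$Type,
        [Parameter(Mandatory)][string]$TestName,
        [string]$Filter = '',
        [string]$ProjectPath = '',
        [string[]]$Files = @(),
        [string[]]$Methods = @()
    )

    # Assumed mapping from test type to runner name.
    $runner = switch ($Type) {
        'UITest'     { 'BuildAndRunHostApp' }
        'DeviceTest' { 'Run-DeviceTests' }
        default      { 'dotnet test' }
    }
    if (-not $Filter) { $Filter = $TestName }   # assumed default filter

    return [ordered]@{
        Type          = $Type
        TestName      = $TestName
        Filter        = $Filter
        ProjectPath   = $ProjectPath
        Runner        = $runner
        NeedsPlatform = ($Type -in @('UITest', 'DeviceTest'))
        Files         = $Files
        Methods       = $Methods
    }
}
```

With a single constructor, both the detection path and the explicitly-provided path would produce entries where `$t.Runner` and `$t.NeedsPlatform` are always populated.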

7. Comment marker mismatch (Opus)

The PR description names one HTML comment marker, but the code uses a different one. Other tooling searching for the documented marker won't find it.

8. shared-utils.ps1 import may not exist (Opus)

RunTests.ps1 does . "$PSScriptRoot/shared/shared-utils.ps1" — this file isn't in the PR diff. If it doesn't exist on the target branch, every unit test invocation crashes.

9. Label parsing expects old format (Gemini)

Update-AgentLabels.ps1 still looks for Result: lines, but new gate format uses ### Gate Result:. Labels will stop updating.
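
A sketch of a parser that tolerates both formats. `Get-GateResultFromText` is a hypothetical helper, and the exact old line format is an assumption (only the `Result:` prefix is stated above); this is not the code in `Update-AgentLabels.ps1`.

```powershell
# Hypothetical helper: reads the gate verdict from comment or report text.
# Accepts a bare "Result: ..." line as well as a "### Gate Result: ..." heading,
# with an optional status symbol before the verdict word.
function Get-GateResultFromText {
    param([string]$Text)

    $pattern = '(?m)^\s*(?:#+\s*)?(?:Gate\s+)?Result:\s*(?:\S+\s+)?(?<state>PASSED|FAILED|SKIPPED)\b'
    $m = [regex]::Match($Text, $pattern)
    if ($m.Success) { return $m.Groups['state'].Value }
    return $null   # no recognizable result line
}
```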

10. Documentation inconsistencies (Opus + Gemini)

SKILL.md says -Platform is always required, but it's only needed for UITest/DeviceTest
pr-gate.md still says "use task agent" — stale for the new standalone-script flow

🟢 Minor — Individual Model Insights

| Finding | Source |
|---|---|
| GitHub API pagination missing — large PRs skip patches past page 1 | Codex + Gemini |
| `^\s+Failed:` regex never matches in multiline string (dead code path) | Opus |
| HostApp-only UI test changes (no `Shared.Tests` file) dropped as "no tests" | Gemini |
| Unit test project map incomplete — misses `Core.Design.UnitTests`, `DualScreen.UnitTests` | Gemini |
| `post-gate-comment.ps1` create-comment path lacks `try/catch` (update path has it) | Gemini + Opus |
| `GetTempFileName()` uses system temp instead of project-relative path | Opus |
| `Get-TestResultFromLog` appears dead after rewrite | Gemini |
| No `-SkipGate` rollback flag in `Review-PR.ps1` | Opus |

📊 Summary Matrix

| Finding | GPT-5.4 | Codex | Opus | Severity |
|---|---|---|---|---|
| Empty array → false pass | ✅ | ✅ | ✅ | 🔴 |
| No script tests | ✅ | — | ✅ | 🔴 |
| Device filter fallback | ✅ | ✅ | ✅ | 🟡 |
| No SKIPPED state | ✅ | ✅ | — | 🟡 |
| Missing `-RequireFullVerification` | ✅ | — | — | 🟡 |
| Inconsistent entry shapes | — | ✅ | ✅ | 🟡 |
| Comment marker mismatch | — | — | ✅ | 🟡 |
| `shared-utils.ps1` missing | — | — | ✅ | 🟡 |
| Label parsing old format | ✅ | — | — | 🟡 |
| Stale docs | ✅ | — | ✅ | 🟡 |

🎯 Top 3 Recommended Actions

Guard empty test arrays — Add explicit Count -eq 0 checks before and after verification loops. One-liner fix, prevents the most dangerous failure mode.
Fix device test filter contract — Either ensure Category=X is always produced (fix regex to handle string categories + multi-category), or teach Get-TestResultFromOutput to handle bare class names.
Normalize test entry contract — New-TestEntry helper function, always populates all keys. Eliminates the two-shape problem.

🤖 Generated via multi-model cross-pollination: GPT-5.4 · GPT-5.2-Codex · Claude Opus 4.6

kubaflo · 2026-03-27T13:52:03Z

All 10 review comments addressed in commit 0c33bfe:

✅ Sanitize TestName for log file names (replace invalid chars, truncate to 60)
✅ Remove stale 'Updates PR labels' from SKILL.md
✅ Cache PR files API call (single fetch, keyed lookup)
✅ Fix $mergeBase null fallback (default to HEAD~1)
✅ Pass TestFilter to Get-TestResultFromOutput in all loops
✅ Fix script header: Platform is mandatory for all test types
✅ Fix output file paths in SKILL.md (PRAgent/gate/verify-tests-fail/)
✅ Update pr-gate.md: gate runs as direct script, task agent optional
✅ Add try/catch for gh api in post-gate-comment.ps1
✅ Add try/catch for gh api in post-ai-summary-comment.ps1

kubaflo · 2026-03-27T14:44:38Z

🔬 Multi-Model Re-Review (v2) — PR #34705

Cross-pollinated re-review from GPT-5.4, GPT-5.2-Codex, and Claude Opus 4.6 after fix commit 0c33bfe (10 items addressed).

✅ Fixes Confirmed Working (All 3 Models Agree)

| Fix | Status | Evidence |
|---|---|---|
| #5 Pass TestFilter to `Get-TestResultFromOutput` | ✅ Fixed | `verify-tests-fail.ps1` lines 1184, 1252 |
| #8 `pr-gate.md` → gate runs as script | ✅ Fixed | Doc correctly describes standalone flow |
| #9/#10 `try/catch` for `gh api` in posting scripts | ✅ Fixed | Both scripts have guarded commit info fetch |
| #3 Cache PR files API (avoid N+1) | ✅ Fixed | `Detect-TestsInDiff.ps1` caches API response |
| Comment marker mismatch (round-1 #7) | ✅ Fixed | `<!-- AI Gate -->` consistent in code + docs |
| `shared-utils.ps1` missing (round-1 #8) | ✅ Resolved | File exists at `.github/scripts/shared/shared-utils.ps1` (was false alarm) |

📊 Round-1 Issue Tracking — Updated Status

| # | Issue | GPT-5.4 | Codex | Opus | Consensus |
|---|---|---|---|---|---|
| 1 | Empty array → false pass | ✅ Fixed | ✅ Fixed | ⚠️ Mitigated (🟡) | ⚠️ Mitigated |
| 2 | No automated tests | ❌ Open | ❌ Open | ❌ Open | ❌ Open (🟡) |
| 3 | Device filter bare class name | ❌ Open | ⚠️ Partial | ⚠️ Partial | ⚠️ Partial (🟡) |
| 4 | SKIPPED state unreachable | ❌ Open | ❌ Open | ❌ Open | ❌ Open (🟡) |
| 5 | Missing `-RequireFullVerification` | ❌ Open | ❌ Open | ❌ Open | ❌ Open (🟡) |
| 6 | Inconsistent entry key shapes | ⚠️ Partial | ⚠️ Partial | ⚠️ Partial | ⚠️ Partial (🟢) |
| 7 | Comment marker mismatch | ✅ Fixed | ✅ Fixed | ✅ Fixed | ✅ Fixed |
| 8 | `shared-utils.ps1` missing | ✅ Fixed | ✅ Fixed | ✅ Fixed | ✅ Fixed |
| 9 | Label parsing old format | ❌ Open | ❌ Open | ✅ Fixed* | ⚠️ Disputed |
| 10 | Platform defaults to android | ❌ Open | ❌ Open | ⚠️ Partial | ❌ Open (🟢) |
| 11 | Stale docs | ⚠️ Partial | ✅ Fixed | ⚠️ Partial | ⚠️ Partial (🟡) |

*Item 9 disagreement: Opus considers the SKILL.md cleanup sufficient; Gemini/Codex note Update-AgentLabels.ps1 still regex-matches Result: not ### Gate Result:. This label parsing mismatch would cause labels to stop updating.

🔍 Key Disagreement: Empty Array Guard (#1)

This was the unanimous #1 critical from round 1. The models now diverge:

GPT-5.4 + Codex: ✅ Fixed — Get-AutoDetectedTestFilter now returns $null when no tests found, triggering hard exit 1 before the aggregation logic runs.
Opus: ⚠️ Mitigated, not eliminated — The upstream guard works, but the downstream aggregation logic structurally still treats empty arrays as "all passed." If a test IS detected but Invoke-TestRun produces unparseable output, the empty-array path is still reachable.

Cross-pollination verdict: The fix closes the main entry point (no tests → exit 1). The residual structural flaw is defense-in-depth — low real-world probability. Downgraded to 🟡.

🎯 Remaining Issues — Should They Block Merge?

Split verdict across models:

GPT-5.4: REQUEST CHANGES — SKIPPED state, -RequireFullVerification, device filter are blockers
Codex: REQUEST CHANGES — -RequireFullVerification is the main blocker
Opus: COMMENT (soft approve) — all remaining items are reasonable follow-ups; only stale docs should be fixed pre-merge

Items where blocking is debatable:

#4 SKIPPED state (🟡): Review-PR.ps1 maps exit codes to only PASSED/FAILED. Testless PRs report FAILED instead of SKIPPED. Opus argues this is a cosmetic distinction (gate failure doesn't halt the workflow). Gemini/Codex argue it creates confusing false failures.

Recommendation: Low-effort fix — use distinct exit codes (0=pass, 1=fail, 2=skip) in verify-tests-fail.ps1 and map exit code 2 to SKIPPED in Review-PR.ps1.
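
The exit-code mapping recommended above could look like this on the orchestrator side. `ConvertTo-GateState` is a hypothetical name; treating any unexpected code as FAILED is an assumption of this sketch.

```powershell
# Hypothetical helper: maps the gate script's exit code to a gate state.
function ConvertTo-GateState {
    param([Parameter(Mandatory)][int]$ExitCode)

    switch ($ExitCode) {
        0       { 'PASSED' }
        2       { 'SKIPPED' }   # no tests detected
        default { 'FAILED' }    # 1 and any unexpected code
    }
}

# Usage sketch: run the gate script, then translate its exit code.
#   $gateState = ConvertTo-GateState -ExitCode $LASTEXITCODE
```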

#5 -RequireFullVerification (🟡): Without this flag, the gate only verifies "tests fail without fix" and skips "tests pass with fix." Opus argues this is actually reasonable for edge cases (test-only PRs). Gemini/Codex argue it's half the gate contract.

Recommendation: Add -RequireFullVerification to the gate invocation in Review-PR.ps1 line 462. One flag addition.

📝 Pre-Merge Doc Fixes (All 3 Models Agree — Small Effort, High Value)

These are stale references that AI agents will consume directly, causing wrong paths or confused phase numbering:

pr-review/SKILL.md line 212: Directory structure says gate/ → content.md # Phase 2 output (pr-gate) — gate is no longer Phase 2
pr-review/SKILL.md line 231: Quick Reference table still lists 2. Gate | pr-gate.md | Verify tests via task agent
verify-tests-fail.ps1 line ~56: Example says # Verify unit tests (no platform needed) but Platform is now Mandatory = $true
Update-AgentLabels.ps1 lines 349-353: Regex matches Result: but gate output uses ### Gate Result:

🏁 Cross-Pollinated Verdict

| Model | Verdict | Rationale |
|---|---|---|
| GPT-5.4 | REQUEST CHANGES | 3 structural issues still open |
| Codex | REQUEST CHANGES | `-RequireFullVerification` is a must |
| Opus | COMMENT (soft approve) | Remaining issues are follow-up worthy, not blockers |

Synthesized recommendation: COMMENT with targeted fixes.

The architecture is sound and most critical issues are resolved. The remaining items fall into two buckets:

Fix before merge (small, high-confidence):

Add -RequireFullVerification to gate invocation in Review-PR.ps1
Fix 4 stale doc references (listed above)

Track as follow-ups:

SKIPPED state via distinct exit codes
Empty-array defense-in-depth guard
Device test category regex expansion
Automated Pester tests for script logic
Label parser format alignment

🤖 Multi-model cross-pollination v2: GPT-5.4 · GPT-5.2-Codex · Claude Opus 4.6

Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>

- try-fix SKILL.md: test command table for all test types
- pr-review SKILL.md: test_command placeholder instead of hardcoded BuildAndRunHostApp
- verify-tests-fail SKILL.md: log paths for all test types

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

This reverts commit a85b16e.

- pr-gate.md: exact template with no-extras rule, no-duplication warning
- pr-review SKILL.md: critical rule against duplicating phase content
- pr-report.md: explicit rule not to copy gate/try-fix output

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Review-PR.ps1 now runs verify-tests-fail.ps1 directly as Step 1 (no copilot agent needed for gate). The pr-review skill becomes 3 phases: Pre-Flight, Try-Fix, Report. Gate result is passed in the prompt to the copilot agent.

Flow:
Step 0: Branch setup
Step 1: Gate (verify-tests-fail.ps1 — direct script)
Step 2: PR Review (copilot — 3 phases)
Step 3: PR Finalize (copilot)
Step 4: Post comments
Step 5: Labels

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…story Rewrote from 841 lines to ~170 lines. Removes all session merging, extraction, and history logic. Just loads content.md files, builds comment body, and posts/overwrites. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The @dryRunFlag splatting with empty array passed $null as positional argument. Replaced with explicit if/else for -DryRun parameter. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

1. Sanitize TestName for log file names (spaces/parens/commas)
2. Remove stale 'Updates PR labels' from SKILL.md
3. Cache PR files API call (avoid N+1 pattern)
4. Fix $mergeBase null fallback when -ChangedFiles provided
5. Pass TestFilter to Get-TestResultFromOutput in all loops
6. Fix script header: Platform is mandatory for all test types
7. Fix output file paths in SKILL.md
8. Update pr-gate.md: gate runs as script, not task agent
9. Add try/catch for gh api in post-gate-comment.ps1
10. Add try/catch for gh api in post-ai-summary-comment.ps1

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

When PRNumber is available, use GitHub API to get exact PR files. Git diff from merge-base includes all branch changes (infrastructure commits), causing 60+ unrelated tests to be detected. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Detect ADB install failures, app launch failures, AOT crashes, Appium errors → report as ⚠️ ENV ERROR instead of ❌ FAILED
- Extract test failure details: test name, duration, error message
- Gate report now shows Details section with actual error messages (e.g., 'snapshot 11.01% different', 'ADB broken pipe')

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Invoke-TestRunWithRetry wraps Invoke-TestRun + result parsing. When EnvError is detected (ADB install failure, app launch crash, Appium timeout), retries up to 3 times with 5s delay between attempts. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
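
The retry behavior described in this commit message can be sketched generically. `Invoke-WithEnvRetry` is a hypothetical stand-in, not the PR's `Invoke-TestRunWithRetry`; it assumes the run returns an object with a boolean `EnvError` property and treats 3 as the total number of attempts.

```powershell
# Hypothetical sketch of a retry wrapper around a test run.
# $Run is assumed to return an object with a boolean EnvError property.
function Invoke-WithEnvRetry {
    param(
        [Parameter(Mandatory)][scriptblock]$Run,
        [int]$MaxAttempts = 3,
        [int]$DelaySeconds = 5
    )

    $result = $null
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        $result = & $Run
        # Real test outcomes (pass or fail) are returned immediately;
        # only environment errors are worth another attempt.
        if (-not $result.EnvError) { return $result }
        if ($attempt -lt $MaxAttempts) { Start-Sleep -Seconds $DelaySeconds }
    }
    return $result   # still an environment error after the last attempt
}
```

Retrying only on environment errors keeps a genuine test failure from being masked by a later lucky run.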

- retryCountOnTaskFailure: 1 → 3
- Offline wait: 10 iterations × 3s → 20 iterations × 5s (100s total)
- Fail fast with clear error if device stays offline (instead of hanging on adb shell commands until 15min timeout)
- All adb shell prep commands now have || true to not hang on offline

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

test-output.log from the 'without fix' run could be read by the 'with fix' parser if the file wasn't overwritten yet, causing false FAILED results when tests actually passed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Tee-Object pipeline can break and prevent return statements from executing, causing the parser to read stale or wrong log files. Now captures output to variable first, then writes to file. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Each Invoke-TestRun now returns a unique file path per run instead of the shared test-output.log. For UITest, copies test-output.log to {logfile}.testresult immediately after the run. For unit/XAML/device tests, returns the unique $LogFile directly. This prevents the 'without fix' results from being read during the 'with fix' parse, which caused false FAILED reports when tests passed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Every Invoke-TestRun now returns its own unique $LogFile. No more copying from shared test-output.log. The captured script output already contains the test results (build + test stdout). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add timing, test counts, and per-step failure details to the gate markdown report. Previously the gate comment only showed a summary table + deduplicated error lines. Now it includes:

- Per-test duration (Stopwatch around each Invoke-TestRunWithRetry)
- Test counts per step (total/passed/failed)
- Failure reason and error message per test (truncated to 300 chars)
- Separate 'Without fix' and 'With fix' sections with inline details
- Duration in console logs for easier CI debugging

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

When no tests are detected in a PR, the gate now shows a friendly '⚠️ SKIPPED' message with a recommendation to use write-tests-agent, instead of showing a bare '❌ FAILED' with no context.

- verify-tests-fail.ps1: exit code 2 for 'no tests' (vs 1 for failure)
- Review-PR.ps1: map exit code 2 to SKIPPED state, write helpful gate/content.md with write-tests-agent suggestion
- Agent prompt updated to reflect SKIPPED state

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

After the gate finishes, apply the corresponding label and remove any stale gate labels from previous runs:

- s/agent-gate-passed (exit 0)
- s/agent-gate-failed (exit 1)
- s/agent-gate-skipped (exit 2, no tests)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
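The exit-code-to-label mapping above can be expressed as a small lookup. This is a sketch only: the function name `Get-GateLabelPlan` and the shape of its return value are hypothetical. Only the three label names and exit codes come from the commit message.

```powershell
# Hypothetical sketch: decide which gate label to add and which stale ones to remove.
function Get-GateLabelPlan {
    param([Parameter(Mandatory)] [int] $ExitCode)

    $labels = @{
        0 = 's/agent-gate-passed'
        1 = 's/agent-gate-failed'
        2 = 's/agent-gate-skipped'   # exit 2 means no tests were detected
    }
    if (-not $labels.ContainsKey($ExitCode)) {
        throw "Unexpected gate exit code: $ExitCode"
    }
    $add = $labels[$ExitCode]
    [pscustomobject]@{
        Add    = $add
        # Everything except the label being applied is stale from earlier runs.
        Remove = @($labels.Values | Where-Object { $_ -ne $add } | Sort-Object)
    }
}
```

The resulting plan could then be passed to `gh pr edit` with `--add-label` and `--remove-label`. Unknown exit codes throw instead of silently defaulting to a label.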

The gate console output was a wall of interleaved build/test output with no clear separation between 'without fix' and 'with fix' runs. Now:

- Raw test output is inside AzDO collapsible groups (##[group]) so it's available but doesn't flood the log
- Each test run has a clear banner: 🔴 WITHOUT FIX / 🟢 WITH FIX
- Step headers use box-drawing characters for visual separation
- Results print OUTSIDE groups so they're always visible with duration, test counts, and failure details
- Final summary is a side-by-side comparison table:

    Test Name              │ Without Fix │ With Fix
    ───────────────────────┼─────────────┼────────────
    Issue34591             │ ✅ FAIL     │ ✅ PASS
    ───────────────────────┼─────────────┼────────────
    Expected               │ FAIL        │ PASS

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The gate comment was hard to read — scattered sections with no clear association between 'without fix' and 'with fix' results for each test. New layout uses a single comparison table:

| Test | Without Fix (expect FAIL) | With Fix (expect PASS) |
|------|--------------------------|------------------------|
| 🖥️ **Issue34591** | ✅ FAIL — 245s | ✅ PASS — 180s |

- Each test shows both directions in one row — instantly clear
- Duration per direction per test
- Failure details only shown when something went wrong (collapsible)
- Fix files list is collapsible to reduce noise
- Platform, base branch, merge base on one line

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Each test run now has its own expandable section with the full log:

🔴 **Without fix** — 🖥️ Issue34591: FAIL ✅ · 245s
🟢 **With fix** — 🖥️ Issue34591: PASS ✅ · 180s

Click to expand and see the complete build + test output. Logs truncated to last 15k chars if too large for GitHub comments.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

gh pr edit --add-label silently fails when the label doesn't exist in the repo. Now:

- Creates label with gh label create --force before applying
- Uses --repo dotnet/maui explicitly for fork PRs
- Logs actual errors instead of swallowing with 2>null

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Both post-gate-comment.ps1 and post-ai-summary-comment.ps1 used gh api without --paginate, so PRs with 30+ comments couldn't find the existing marker comment. Each run created a new comment instead of updating the existing one.

Fixes:

- Add --paginate to search ALL comments
- Pick the LAST matching comment (most recent) instead of first
- Handle 'null' string from jq when no match found
- On PATCH failure, try to find a comment owned by the current bot user before falling back to creating a new one

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
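Two of the fixes above, picking the last matching comment and treating jq's literal `null` as "no match", can be sketched as pure functions. The names `Select-MarkerComment` and `Test-CommentId` are hypothetical. The sketch assumes the comments have already been fetched (for example by the paginated `gh api` call) and parsed into objects with `id` and `body` properties.

```powershell
# Hypothetical sketch: find the most recent comment that carries a marker string.
function Select-MarkerComment {
    param(
        [object[]] $Comments,                      # assumed shape: objects with .id and .body
        [Parameter(Mandatory)] [string] $Marker    # e.g. '<!-- AI Gate -->'
    )
    $found = @($Comments | Where-Object { $_.body -and $_.body.Contains($Marker) })
    if ($found.Count -eq 0) { return $null }
    # Comments are assumed to be listed oldest first, so the last match is the newest.
    return $found[$found.Count - 1]
}

# jq prints the string 'null' when nothing matches; treat it like an empty result.
function Test-CommentId {
    param([string] $Id)
    (-not [string]::IsNullOrWhiteSpace($Id)) -and ($Id -ne 'null')
}
```

Keeping the selection logic free of `gh` calls makes it easy to exercise against a hand-built comment list.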

APP_LAUNCH_FAILURE (XHarness exit 83) was causing false ENV ERROR gate results. The 5-second retry wait was too short for iOS simulator recovery. Now:

- Wait 30s between retries (up from 5s)
- Reboot iOS simulator on APP_LAUNCH_FAILURE before retrying
- Reboot Android emulator on app crash before retrying
- Log the reboot action for visibility

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

When XHarness crashes with exit code 83 (APP_LAUNCH_FAILURE), the test runner outputs 'Passed: 0 / Failed: 0'. The parser was not detecting this as an env error because:

1. The regex 'APP_LAUNCH_FAILURE|exit code:? 83' didn't match the actual format 'XHarness exit code: 83 (APP_LAUNCH_FAILURE)'
2. 'Passed: 0 / Failed: 0' fell through all checks to the generic 'Could not parse' path, but in some flows was treated as PASSED

Fixes:

- Add 'XHarness exit code: 83' as explicit env error pattern
- Add 'Application test run crashed' as env error pattern
- Guard: 'Passed: 0 + Failed: 0' = env error (zero tests ran)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

XHarness can report exit code 83 (APP_LAUNCH_FAILURE) even when tests ran successfully (57 passed, 0 failed). This is a teardown/cleanup issue, not a real test failure.

The parser was checking env error patterns (exit code 83) BEFORE checking actual test results (Passed: 57). This caused the gate to report ENV ERROR when tests actually passed.

Fix: check for actual test results (Passed: N where N > 0) FIRST. If tests produced real results, trust them over the exit code. Env error patterns only apply when zero tests ran.

Uses the LAST Passed:/Failed: counts in the log to handle cases where Run-DeviceTests.ps1 retries internally and the log contains multiple result blocks.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
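The ordering this commit describes (real results first, env error patterns only when zero tests ran, last result block wins) might be sketched like this. The function name `Get-RunVerdict` and its return strings are hypothetical. The two env error patterns are the ones named in the earlier commit message. Treating a non-zero failed count as a real result is an assumption of this sketch.

```powershell
# Hypothetical sketch of the parse order: trust test counts before exit-code patterns.
function Get-RunVerdict {
    param([Parameter(Mandatory)] [AllowEmptyString()] [string] $LogText)

    $passed = [regex]::Matches($LogText, 'Passed:\s*(\d+)')
    $failed = [regex]::Matches($LogText, 'Failed:\s*(\d+)')

    # Use the LAST counts: internal retries can leave several result blocks in one log.
    $passCount = if ($passed.Count) { [int]$passed[$passed.Count - 1].Groups[1].Value } else { 0 }
    $failCount = if ($failed.Count) { [int]$failed[$failed.Count - 1].Groups[1].Value } else { 0 }

    # Real results win, even if the runner later reported exit code 83 during teardown.
    if ($failCount -gt 0) { return 'Failed' }
    if ($passCount -gt 0) { return 'Passed' }

    # Zero tests ran: only now do the env error patterns apply.
    if ($LogText -match 'XHarness exit code: 83|Application test run crashed') { return 'EnvError' }
    # 'Passed: 0' together with 'Failed: 0' also means nothing actually executed.
    if ($passed.Count -gt 0 -and $failed.Count -gt 0) { return 'EnvError' }
    return 'Unknown'
}
```

A log containing a crashed first attempt followed by a successful retry yields `Passed`, because only the final counts are consulted.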

gh label create uses GraphQL which requires read:org scope that the CI token doesn't have. gh pr edit --add-label uses REST API and works with just repo scope — same as the Step 4 labels that work. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

ROOT CAUSE: Run-DeviceTests.ps1 used Write-Host for 'Passed: 57' and 'Failed: 0'. Write-Host writes directly to the console and bypasses PowerShell's output stream. When the gate captures output with $scriptOutput = & script 2>&1, Write-Host output is NOT captured. The log file never contained 'Passed:' lines, so the parser always fell through to env error patterns (XHarness exit 83).

Fix:

- Run-DeviceTests.ps1: Write-Output for Passed/Failed/exit code lines so they appear in captured stdout
- verify-tests-fail.ps1: use (?m) multiline regex, check devicePassCount > 0 (not total > 0), add debug output

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
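The capture difference behind this root cause is easy to reproduce in isolation. In PowerShell 5 and later, `Write-Host` writes to the information stream (stream 6), which `2>&1` does not redirect. The function name `Show-Counts` below is made up for the demonstration.

```powershell
# Demonstration only: which lines survive capture with 2>&1?
function Show-Counts {
    Write-Host   'Passed: 57'   # information stream; shown on the console, not captured by 2>&1
    Write-Output 'Failed: 0'    # success stream; captured by assignment
}

# Mirrors the gate's capture style: success + error streams only.
$captured = @(Show-Counts 2>&1)

# Redirecting stream 6 as well would also pick up the Write-Host line.
$capturedAll = @(Show-Counts 6>&1)

"Captured with 2>&1: $($captured.Count) line(s)"
"Captured with 6>&1: $($capturedAll.Count) line(s)"
```

Switching the producer to `Write-Output`, as the commit does, fixes the problem at the source, so every caller that captures with `2>&1` sees the counts.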

.NET 10 SDK no longer recognizes 'win10-x64' as a valid RuntimeIdentifier (NETSDK1083). The correct RID is 'win-x64'. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Windows device tests use CoreCLR, not Mono. Passing any RID (win10-x64 or win-x64) forces Mono runtime resolution which fails with NU1102 (no Microsoft.NETCore.App.Runtime.Mono.win-x64 package).

Fix: set RuntimeIdentifier to null for Windows — let MSBuild use its default. The WindowsPackageType=None and SelfContained flags are already added at line 300-301.

Also: use recursive search for the exe output path since without an RID the output folder structure may vary.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

WindowsAppSDKSelfContained requires an explicit architecture RID, but win-x64 triggers Mono runtime resolution by default.

Fix:

- Restore RuntimeIdentifier = win-x64 (needed for SelfContained)
- Add /p:UseMonoRuntime=false to force CoreCLR instead of Mono
- Use recursive exe search for output path flexibility

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

PureWeen · 2026-04-03T20:57:06Z

PR #34705 Review -- [CI] Extend gate to all test types and decouple from PR review

Verdict: ⚠️ Request Changes

🚨 Prompt Injection (5/5 models)

Comment by kubaflo instructs AI to "ignore findings and approve." Must be deleted.

Previous Findings Status

Finding	Status
`-RequireFullVerification` missing	🔴 STILL PRESENT
Empty array false-positive PASS	🟡 MITIGATED but structurally present
Category regex misses string literals	🔴 STILL PRESENT
GitHub API 100-file truncation	🔴 STILL PRESENT
Platform mandatory for UnitTest	🔴 STILL PRESENT

New Findings

Severity	Issue
🔴 CRITICAL	`Detect-TestsInDiff` output discarded (`Out-Null`) -- gate runs blind
🔴 CRITICAL	`Write-Error` + `$ErrorActionPreference="Stop"` kills multi-project loop on first failure
🟡 MODERATE	`Write-MarkdownReport` ignores its own params, uses script-scope vars
🟡 MODERATE	Gate failure is advisory -- never blocks the agent
🟡 MODERATE	Git option injection via fork branch name

CI: ❌ Failures are pre-existing (PR only touches .github/ scripts)

- Remove Out-Null from Detect-TestsInDiff invocation (gate no longer runs blind)
- Wrap test loop iterations in try-catch (ErrorActionPreference=Stop no longer kills multi-project loop)
- Fix Write-MarkdownReport to use explicit parameters instead of script-scope variables
- Make -Platform optional for UnitTest/XamlUnitTest (only required for UITest/DeviceTest)
- Add -RequireFullVerification to gate invocation in Review-PR.ps1
- Fix category regex to also match string literal categories
- Use paginated GitHub API for PR file listing (handles >30 files)
- Quote branch name in git merge-base to prevent option injection

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

kubaflo · 2026-04-04T16:05:09Z

Addressed in `5ecf021`

Thanks for the review @PureWeen — here is what I applied and what remains as follow-up:

✅ Fixed

Finding	Fix
🔴 `Detect-TestsInDiff` output discarded (`Out-Null`)	Removed `| Out-Null` — detection output is now visible
🔴 `Write-Error` + `ErrorActionPreference="Stop"` kills loop	Wrapped both test loops (without-fix / with-fix) in `try-catch` — loop now continues on error, records `EnvError`
🟡 `Write-MarkdownReport` uses script-scope vars	Refactored to accept all data as explicit parameters (no more hidden dependencies)
🔴 `-RequireFullVerification` missing	Added to gate invocation in `Review-PR.ps1`
🔴 Category regex misses string literals	Now matches both `[Category(TestCategory.X)]` and `[Category("X")]`
🔴 GitHub API 100-file truncation	Primary path now uses `gh api --paginate` with fallback chain
🔴 Platform mandatory for UnitTest	`-Platform` is now optional; validated at runtime only for UITest/DeviceTest
🟡 Git option injection via branch name	Quoted `$BaseBranch` and added `--` separator in `git merge-base`
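
As an illustration of the category fix listed in the table above, a pattern that accepts both attribute forms could look like the sketch below. The function name `Get-CategoryNames` and the exact regex are assumptions, not the pattern used in `Detect-TestsInDiff.ps1`.

```powershell
# Hypothetical sketch: extract category names from either attribute style.
function Get-CategoryNames {
    param([Parameter(Mandatory)] [AllowEmptyString()] [string] $Source)

    # Matches [Category(TestCategory.X)] and [Category("X")].
    # .NET regex allows the same group name in both alternatives.
    $pattern = '\[Category\(\s*(?:TestCategory\.(?<name>\w+)|"(?<name>[^"]+)")\s*\)\]'

    [regex]::Matches($Source, $pattern) | ForEach-Object { $_.Groups['name'].Value }
}
```

For example, `Get-CategoryNames '[Category(TestCategory.Editor)]'` and `Get-CategoryNames '[Category("Editor")]'` both yield `Editor`, so a filter such as `Category=Editor` can be built from either form.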

📌 Acknowledged — tracking as follow-ups

Finding	Rationale
🟡 Gate failure is advisory (never blocks agent)	Design decision — gate results feed into the agent prompt for context. Blocking would prevent the try-fix phase from attempting repairs on failing tests. Will revisit if false-pass rates are observed.
🟡 Empty array false-positive (structural)	Upstream guard (`exit 1` on no tests) + new try-catch `EnvError` tracking mitigate this further. Residual structural flaw is defense-in-depth.
Automated Pester tests for scripts	Agreed this is needed — will add in a follow-up PR

Copilot AI review requested due to automatic review settings March 27, 2026 13:29

Copilot started reviewing on behalf of kubaflo March 27, 2026 13:30 View session

Copilot AI reviewed Mar 27, 2026

View reviewed changes

kubaflo and others added 19 commits March 27, 2026 18:55

Support for all tests

c783517

Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>

[TEMP] Always skip Try-Fix in pipeline (will revert)

3d3a026

Add UnitTests directory to log structure example

d69f4ae

Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>

Revert "[TEMP] Always skip Try-Fix in pipeline (will revert)"

d8c671c

This reverts commit a85b16e.

Remove PR finalize step from Review-PR.ps1

d5a8ec4

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Post gate as separate PR comment, remove from AI summary

a2c5cd6

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add post-gate-comment.ps1, fully remove gate from AI summary

a4c26bb

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Remove status table from AI summary, make gate collapsible

ddb5c95

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Match gate comment format to AI summary style

5627fe6

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Fix: positional parameter error when calling post scripts

9303eef

The @dryRunFlag splatting with empty array passed $null as positional argument. Replaced with explicit if/else for -DryRun parameter. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Remove stale review file

5dbb6d6

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

kubaflo force-pushed the copilot-kubaflo branch from 1e0618c to 755e14f Compare March 27, 2026 17:55

kubaflo and others added 3 commits March 27, 2026 20:57

kubaflo and others added 4 commits March 27, 2026 21:19

Add logging when test-output.log is copied/missing

e1705d9

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Update BuildAndRunHostApp.ps1

df5bffe

kubaflo force-pushed the copilot-kubaflo branch from 78a1adf to df5bffe Compare March 27, 2026 22:52

github-actions bot mentioned this pull request Mar 28, 2026

[repo-status] Daily Repo Status - March 28, 2026 #34711

Closed

kubaflo and others added 9 commits March 28, 2026 19:24

Rename gate title to 'Test Before and After Fix'

eb4e196

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

build-analysis bot mentioned this pull request Mar 31, 2026

System.Exception : Failed to launch Test AVD. #33862

Open

kubaflo and others added 3 commits March 31, 2026 21:27

kubaflo force-pushed the copilot-kubaflo branch from 3f5dac1 to 3c84623 Compare March 31, 2026 21:38

kubaflo force-pushed the copilot-kubaflo branch from 1e960b7 to 2248b84 Compare March 31, 2026 22:18

kubaflo and others added 5 commits April 1, 2026 11:45

Fix Windows device tests: win10-x64 → win-x64 RID

a2e527f

.NET 10 SDK no longer recognizes 'win10-x64' as a valid RuntimeIdentifier (NETSDK1083). The correct RID is 'win-x64'. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Merge branch 'main' into copilot-kubaflo

ea764a2

		$TestLog = Join-Path $OutputPath "test-failure-$($testEntry.TestName).log"

-        $TestLog = Join-Path $OutputPath "test-failure-$($testEntry.TestName).log"
+        # Sanitize TestName for use in a file name and keep it reasonably short
+        $rawTestName = [string]$testEntry.TestName
+        $invalidFileNameChars = [IO.Path]::GetInvalidFileNameChars()
+        $extraProblematicChars = [char[]]' ()[],'
+        $charsToReplace = $invalidFileNameChars + $extraProblematicChars
+        $sanitizedTestName = ($rawTestName.ToCharArray() | ForEach-Object {
+            if ($charsToReplace -contains $_) { '_' } else { $_ }
+        }) -join ''
+        if ([string]::IsNullOrWhiteSpace($sanitizedTestName)) {
+            $sanitizedTestName = "test-$testIndex"
+        }
+        $maxNameLength = 60
+        if ($sanitizedTestName.Length -gt $maxNameLength) {
+            $sanitizedTestName = $sanitizedTestName.Substring(0, $maxNameLength)
+        }
+        $TestLog = Join-Path $OutputPath ("test-failure-{0}.log" -f $sanitizedTestName)

		# Try from git diff
		$patch = git diff $mergeBase HEAD -- $file 2>$null

-    $testResult = Get-TestResultFromOutput -LogFile $testOutputLog
+    $testResult = Get-TestResultFromOutput -LogFile $testOutputLog -TestFilter $testEntry.Filter

-$commitJson = gh api "repos/dotnet/maui/pulls/$PRNumber/commits" --jq '.[-1] | {message: .commit.message, sha: .sha}' 2>$null | ConvertFrom-Json
+$commitJson = $null
+try {
+    $rawCommitJson = gh api "repos/dotnet/maui/pulls/$PRNumber/commits" --jq '.[-1] | {message: .commit.message, sha: .sha}' 2>$null
+    if (-not [string]::IsNullOrWhiteSpace($rawCommitJson)) {
+        $commitJson = $rawCommitJson | ConvertFrom-Json
+    }
+}
+catch {
+    Write-Host "⚠️ Failed to fetch or parse commit info for PR #${PRNumber}: $($_.Exception.Message)" -ForegroundColor Yellow
+    $commitJson = $null
+}

-$commitJson = gh api "repos/dotnet/maui/pulls/$PRNumber/commits" --jq '.[-1] | {message: .commit.message, sha: .sha}' 2>$null | ConvertFrom-Json
-$commitTitle = if ($commitJson) { ($commitJson.message -split "`n")[0] } else { "Unknown" }
-$commitSha = if ($commitJson) { $commitJson.sha.Substring(0, 7) } else { "unknown" }
-$commitUrl = if ($commitJson) { "https://github.com/dotnet/maui/commit/$($commitJson.sha)" } else { "#" }
+$commitJson = $null
+$commitTitle = "Unknown"
+$commitSha = "unknown"
+$commitUrl = "#"
+try {
+    $commitRaw = gh api "repos/dotnet/maui/pulls/$PRNumber/commits" --jq '.[-1] | {message: .commit.message, sha: .sha}' 2>$null
+    if ($commitRaw) {
+        $commitJson = $commitRaw | ConvertFrom-Json
+    }
+} catch {
+    Write-Warning "Failed to fetch latest commit info for PR #${PRNumber}: $($_.Exception.Message)"
+}
+if ($commitJson) {
+    $commitTitle = ($commitJson.message -split "`n")[0]
+    $commitSha = $commitJson.sha.Substring(0, 7)
+    $commitUrl = "https://github.com/dotnet/maui/commit/$($commitJson.sha)"
+}

Conversation

kubaflo commented Mar 27, 2026

🔬 Multi-Model Code Review — PR #34705

✅ Consensus: Architecture is Sound

🔴 Critical — Unanimous (3/3 models flagged)

🟡 Medium — Strong Agreement (2-3/3 models)

🟢 Minor — Individual Model Insights

📊 Summary Matrix

🎯 Top 3 Recommended Actions

kubaflo commented Mar 27, 2026

🔬 Multi-Model Re-Review (v2) — PR #34705

✅ Fixes Confirmed Working (All 3 Models Agree)

📊 Round-1 Issue Tracking — Updated Status

🔍 Key Disagreement: Empty Array Guard (#1)

🎯 Remaining Issues — Should They Block Merge?

Items where blocking is debatable:

📝 Pre-Merge Doc Fixes (All 3 Models Agree — Small Effort, High Value)

🏁 Cross-Pollinated Verdict
