45 commits
c783517
Support for all tests
kubaflo Mar 24, 2026
0bea0a7
Remove UI-test-only language from try-fix and pr-review skills
kubaflo Mar 25, 2026
3d3a026
[TEMP] Always skip Try-Fix in pipeline (will revert)
kubaflo Mar 25, 2026
d69f4ae
Add UnitTests directory to log structure example
kubaflo Mar 25, 2026
d8c671c
Revert "[TEMP] Always skip Try-Fix in pipeline (will revert)"
kubaflo Mar 25, 2026
c543739
Fix gate output: strict templates, no cross-phase duplication
kubaflo Mar 26, 2026
77c5696
Decouple gate from pr-review: run as script step before copilot agent
kubaflo Mar 26, 2026
d5a8ec4
Remove PR finalize step from Review-PR.ps1
kubaflo Mar 26, 2026
a2c5cd6
Post gate as separate PR comment, remove from AI summary
kubaflo Mar 26, 2026
a4c26bb
Add post-gate-comment.ps1, fully remove gate from AI summary
kubaflo Mar 26, 2026
6de98f7
Simplify post-ai-summary-comment.ps1: always overwrite, no session hi…
kubaflo Mar 26, 2026
ddb5c95
Remove status table from AI summary, make gate collapsible
kubaflo Mar 26, 2026
5627fe6
Match gate comment format to AI summary style
kubaflo Mar 26, 2026
9303eef
Fix: positional parameter error when calling post scripts
kubaflo Mar 27, 2026
c1687b7
Address PR review feedback: 10 fixes
kubaflo Mar 27, 2026
5dbb6d6
Remove stale review file
kubaflo Mar 27, 2026
afe7c26
Fix: prefer PR number over git diff for test detection
kubaflo Mar 27, 2026
f58a9f3
Add env error detection and failure details to gate report
kubaflo Mar 27, 2026
755e14f
Add retry logic (3 attempts) for environment errors in gate
kubaflo Mar 27, 2026
03c1371
Emulator boot: retry 3x, longer offline wait, fail-fast on hang
kubaflo Mar 27, 2026
dc0994b
Fix: clear stale test output before each gate run
kubaflo Mar 27, 2026
a467c2e
Replace Tee-Object with capture+write to prevent pipeline breaks
kubaflo Mar 27, 2026
e12891f
Use unique test output files per gate run — never overwrite
kubaflo Mar 27, 2026
e1705d9
Add logging when test-output.log is copied/missing
kubaflo Mar 27, 2026
4060efb
Simplify: all test types return $LogFile directly, no shared files
kubaflo Mar 27, 2026
df5bffe
Update BuildAndRunHostApp.ps1
kubaflo Mar 27, 2026
9f0eaf1
Enhance gate report with per-test execution details
kubaflo Mar 28, 2026
0f28c45
Show helpful SKIPPED message when gate finds no tests
kubaflo Mar 28, 2026
7eef92b
Add gate result labels (s/agent-gate-passed/failed/skipped)
kubaflo Mar 28, 2026
564fa5a
Restructure gate output: collapsible logs + clear summary table
kubaflo Mar 29, 2026
3328d59
Redesign gate PR comment: side-by-side table + collapsible details
kubaflo Mar 29, 2026
70acef0
Embed full test logs in gate comment as collapsible sections
kubaflo Mar 29, 2026
eb4e196
Rename gate title to 'Test Before and After Fix'
kubaflo Mar 29, 2026
b111ca4
Fix gate labels: create labels if missing + log errors
kubaflo Mar 29, 2026
0405a72
Fix duplicate comments: paginate API + update latest match
kubaflo Mar 30, 2026
8a393cb
Improve gate retry: reboot device on app launch failure + 30s wait
kubaflo Mar 31, 2026
2dc6a10
Fix false PASSED on XHarness app crash (exit 83)
kubaflo Mar 31, 2026
3c84623
Fix: trust test results over XHarness exit code
kubaflo Mar 31, 2026
2248b84
Fix gate labels: drop gh label create (needs read:org scope)
kubaflo Mar 31, 2026
4b56567
Fix: Run-DeviceTests Write-Host invisible to gate parser
kubaflo Apr 1, 2026
a2e527f
Fix Windows device tests: win10-x64 → win-x64 RID
kubaflo Apr 1, 2026
2a622d8
Fix Windows device tests: drop RID entirely
kubaflo Apr 2, 2026
104f997
Fix Windows device tests: win-x64 RID + UseMonoRuntime=false
kubaflo Apr 2, 2026
ea764a2
Merge branch 'main' into copilot-kubaflo
kubaflo Apr 2, 2026
5ecf021
Address PR review feedback from PureWeen
kubaflo Apr 4, 2026
35 changes: 23 additions & 12 deletions .github/copilot-instructions.md
@@ -245,66 +245,77 @@ Skills are modular capabilities that can be invoked directly or used by agents.

#### User-Facing Skills

1. **issue-triage** (`.github/skills/issue-triage/SKILL.md`)
1. **pr-review** (`.github/skills/pr-review/SKILL.md`)
- **Purpose**: End-to-end PR review orchestrator — 3 phases: pr-preflight, try-fix, pr-report. Gate runs separately before this skill via Review-PR.ps1.
- **Trigger phrases**: "review PR #XXXXX", "work on PR #XXXXX", "fix issue #XXXXX", "continue PR #XXXXX"
- **Capabilities**: Multi-model fix exploration, alternative comparison, PR review recommendation
- **Do NOT use for**: Just running tests manually → Use `sandbox-agent`
- **Phase instructions** (in `.github/pr-review/`):
- `pr-preflight.md` — Context gathering from issue/PR
- `pr-report.md` — Final recommendation
- **Phase skill**: `try-fix` — Multi-model fix exploration
- **Note**: Gate (test verification) runs as a script step in `Review-PR.ps1` before this skill is invoked. Gate result is passed in the prompt.

2. **issue-triage** (`.github/skills/issue-triage/SKILL.md`)
- **Purpose**: Query and triage open issues that need milestones, labels, or investigation
- **Trigger phrases**: "find issues to triage", "show me old Android issues", "what issues need attention"
- **Scripts**: `init-triage-session.ps1`, `query-issues.ps1`, `record-triage.ps1`

2. **find-reviewable-pr** (`.github/skills/find-reviewable-pr/SKILL.md`)
3. **find-reviewable-pr** (`.github/skills/find-reviewable-pr/SKILL.md`)
- **Purpose**: Finds open PRs in dotnet/maui and dotnet/docs-maui that need review
- **Trigger phrases**: "find PRs to review", "show milestoned PRs", "find partner PRs"
- **Scripts**: `query-reviewable-prs.ps1`
- **Categories**: P/0, milestoned, partner, community, recent, docs-maui

3. **pr-finalize** (`.github/skills/pr-finalize/SKILL.md`)
4. **pr-finalize** (`.github/skills/pr-finalize/SKILL.md`)
- **Purpose**: Verifies PR title and description match actual implementation, AND performs code review for best practices before merge.
- **Trigger phrases**: "finalize PR #XXXXX", "check PR description for #XXXXX", "review commit message"
- **Used by**: Before merging any PR, when description may be stale
- **Note**: Does NOT require agent involvement or session markdown - works on any PR
- **🚨 CRITICAL**: NEVER use `--approve` or `--request-changes` - only post comments. Approval is a human decision.

4. **code-review** (`.github/skills/code-review/SKILL.md`)
5. **code-review** (`.github/skills/code-review/SKILL.md`)
- **Purpose**: Reviews PR code changes for correctness, safety, and consistency with MAUI conventions. Walks through a MAUI-specific checklist covering handler lifecycle, platform code, safe area, threading, public API, and test patterns.
- **Trigger phrases**: "review code for PR #XXXXX", "code review PR #XXXXX", "review this PR's code"
- **Note**: Standalone skill — uses independence-first assessment (reads code before PR description to avoid anchoring bias). Can be used by any agent or invoked directly.
- **🚨 CRITICAL**: NEVER use `--approve` or `--request-changes` — only post comments. Approval is a human decision.

5. **learn-from-pr** (`.github/skills/learn-from-pr/SKILL.md`)
6. **learn-from-pr** (`.github/skills/learn-from-pr/SKILL.md`)
- **Purpose**: Analyzes completed PR to identify repository improvements (analysis only, no changes applied)
- **Trigger phrases**: "what can we learn from PR #XXXXX?", "how can we improve agents based on PR #XXXXX?"
- **Used by**: After complex PRs, when agent struggled to find solution
- **Output**: Prioritized recommendations for instruction files, skills, code comments
- **Note**: For applying changes automatically, use the learn-from-pr agent instead

6. **write-ui-tests** (`.github/skills/write-ui-tests/SKILL.md`)
7. **write-ui-tests** (`.github/skills/write-ui-tests/SKILL.md`)
- **Purpose**: Creates UI tests for GitHub issues and verifies they reproduce the bug
- **Trigger phrases**: "write UI tests for #XXXXX", "create UI test for issue", "add UI test coverage"
- **Output**: Test files that fail without fix, pass with fix

7. **write-xaml-tests** (`.github/skills/write-xaml-tests/SKILL.md`)
8. **write-xaml-tests** (`.github/skills/write-xaml-tests/SKILL.md`)
- **Purpose**: Creates XAML unit tests for XAML parsing, compilation, and source generation
- **Trigger phrases**: "write XAML tests for #XXXXX", "test XamlC behavior", "reproduce XAML parsing bug"
- **Output**: Test files for Controls.Xaml.UnitTests

8. **verify-tests-fail-without-fix** (`.github/skills/verify-tests-fail-without-fix/SKILL.md`)
- **Purpose**: Verifies UI tests catch the bug before fix and pass with fix
9. **verify-tests-fail-without-fix** (`.github/skills/verify-tests-fail-without-fix/SKILL.md`)
- **Purpose**: Verifies tests catch the bug before fix and pass with fix. Auto-detects test type (UI, device, unit, XAML) and dispatches to the appropriate runner.
- **Two modes**: Verify failure only (test creation) or full verification (test + fix)
- **Used by**: After creating tests, before considering PR complete

9. **pr-build-status** (`.github/skills/pr-build-status/SKILL.md`)
10. **pr-build-status** (`.github/skills/pr-build-status/SKILL.md`)
- **Purpose**: Retrieves Azure DevOps build information for PRs (build IDs, stage status, failed jobs)
- **Trigger phrases**: "check build for PR #XXXXX", "why did PR build fail", "get build status"
- **Used by**: When investigating CI failures

10. **run-integration-tests** (`.github/skills/run-integration-tests/SKILL.md`)
11. **run-integration-tests** (`.github/skills/run-integration-tests/SKILL.md`)
- **Purpose**: Build, pack, and run .NET MAUI integration tests locally
- **Trigger phrases**: "run integration tests", "test templates locally", "run macOSTemplates tests", "run RunOniOS tests"
- **Categories**: Build, WindowsTemplates, macOSTemplates, Blazor, MultiProject, Samples, AOT, RunOnAndroid, RunOniOS
- **Note**: **ALWAYS use this skill** instead of manual `dotnet test` commands for integration tests

#### Internal Skills (Used by Agents)

11. **try-fix** (`.github/skills/try-fix/SKILL.md`)
12. **try-fix** (`.github/skills/try-fix/SKILL.md`)
- **Purpose**: Proposes ONE independent fix approach, applies it, tests, records result with failure analysis, then reverts
- **Used by**: pr agent Phase 3 (Fix phase) - rarely invoked directly by users
- **Behavior**: Reads prior attempts to learn from failures. Max 5 attempts per session.
71 changes: 41 additions & 30 deletions .github/pr-review/pr-gate.md
@@ -1,8 +1,9 @@
# PR Gate Test Verification
# PR Gate - Test Before and After Fix

> **⛔ This phase MUST pass before continuing to Try-Fix. If it fails, stop and inform user.**

> 🚨 Gate verification MUST run via task agent — never inline.
> In CI (Review-PR.ps1), the gate runs `verify-tests-fail.ps1` directly as a script step.
> For manual usage, you can invoke it yourself or via a task agent.

---

@@ -26,41 +27,32 @@ Choose a platform that is BOTH affected by the bug AND available on the current

## Steps

1. **Check if tests exist:**
1. **Detect tests in PR** using the shared detection script:
```bash
gh pr view XXXXX --json files --jq '.files[].path' | grep -E "TestCases\.(HostApp|Shared\.Tests)"
pwsh .github/scripts/shared/Detect-TestsInDiff.ps1 -PRNumber XXXXX
```
If NO tests exist → inform user, suggest `write-tests-agent`. Gate is ⚠️ SKIPPED.
This auto-detects all test types: UI tests, device tests, unit tests, XAML tests.
If NO tests detected → inform user, suggest `write-tests-agent`. Gate is ⚠️ SKIPPED.

2. **Select platform** — must be affected by bug AND available on host (see Platform Selection above).
2. **Select platform** — must be affected by bug AND available on host (see table above).

3. **Run verification via task agent** (MUST use task agent — never inline):
3. **Run verification** via `verify-tests-fail.ps1`:
```bash
pwsh .github/skills/verify-tests-fail-without-fix/scripts/verify-tests-fail.ps1 \
-Platform {platform} -RequireFullVerification
```
In CI, `Review-PR.ps1` calls this script directly. For manual usage, you can also invoke
it via a task agent for isolation:
```
Comment on lines +37 to 46
Copilot AI Mar 27, 2026

Step 3 still instructs running gate via the task agent, but the new workflow in Review-PR.ps1 runs gate directly via verify-tests-fail.ps1 before invoking the Copilot pr-review skill. Please update this step to reflect the new script-driven gate (or clarify that this doc is only for manual, agent-driven gate runs).

Invoke the `task` agent with this prompt:

"Invoke the verify-tests-fail-without-fix skill for this PR:
- Platform: {platform}
- TestFilter: 'IssueXXXXX'
- RequireFullVerification: true

Report back: Did tests FAIL without fix? Did tests PASS with fix? Final status?"
```

**Why task agent?** Running inline allows substituting commands and fabricating results. Task agent runs in isolation.

---

## Expected Result

```
╔═══════════════════════════════════════════════════════════╗
║ VERIFICATION PASSED ✅ ║
╠═══════════════════════════════════════════════════════════╣
║ - FAIL without fix (as expected) ║
║ - PASS with fix (as expected) ║
╚═══════════════════════════════════════════════════════════╝
```

---

## If Gate Fails
@@ -72,25 +64,44 @@ Choose a platform that is BOTH affected by the bug AND available on the current

## Output File

> 🚨 **CRITICAL OUTPUT RULES:**
> - Write gate results ONLY to `gate/content.md` — NEVER copy gate results into other phases (pre-flight, try-fix, report)
> - Use the EXACT template below — no extra explanations, no "Reason:" paragraphs, no "Notes:" sections
> - Keep it SHORT — the template is the complete output

```bash
mkdir -p CustomAgentLogsTmp/PRState/{PRNumber}/PRAgent/gate
```

Write `content.md`:
Write `content.md` using this **exact** template (fill in values, don't add anything else):

```markdown
### Gate Result: {✅ PASSED / ❌ FAILED / ⚠️ SKIPPED}

**Platform:** {platform}
**Mode:** Full Verification

- Tests FAIL without fix: {✅/❌}
- Tests PASS with fix: {✅/❌}
| # | Type | Test Name | Filter |
|---|------|-----------|--------|
| 1 | {type} | {name} | `{filter}` |

| Step | Expected | Actual | Result |
|------|----------|--------|--------|
| Without fix | FAIL | {FAIL/PASS} | {✅/❌} |
| With fix | PASS | {FAIL/PASS} | {✅/❌} |
```

If gate is SKIPPED (no tests found), write only:

```markdown
### Gate Result: ⚠️ SKIPPED

No tests detected in PR. Suggest adding tests via `write-tests-agent`.
```

---

## Common Mistakes

- ❌ Running inline — MUST use task agent
- ❌ Using `BuildAndRunHostApp.ps1` — that runs ONE direction; the skill does TWO
- ❌ Claiming results from a single test run — script does TWO runs automatically
- ❌ Adding verbose explanations to gate/content.md — use the exact template above
- ❌ Copying gate results into try-fix/content.md or report/content.md — gate results belong ONLY in gate/content.md
- ❌ Skipping gate because tests are device tests, not UI tests — the skill supports all test types
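
The result template in this file implies a simple decision rule: the gate passes only when tests fail without the fix and pass with it, and it is skipped when no tests are detected. The shell sketch below illustrates that rule under stated assumptions. The function name `gate_status` and its arguments are hypothetical and are not taken from `verify-tests-fail.ps1`, whose actual logic is not shown in this diff.

```bash
#!/usr/bin/env bash
# Hypothetical helper (illustration only): derive the gate status
# from the two verification runs described in the template.
# Arguments (all assumptions, not the real script's interface):
#   $1 - number of tests detected in the PR
#   $2 - outcome of the run WITHOUT the fix (FAIL or PASS)
#   $3 - outcome of the run WITH the fix (FAIL or PASS)
gate_status() {
  local test_count="$1" without_fix="${2:-}" with_fix="${3:-}"

  # No tests detected: the gate is skipped rather than failed.
  if [ "$test_count" -eq 0 ]; then
    echo "SKIPPED"
    return 0
  fi

  # Passing requires the expected outcome in BOTH directions.
  if [ "$without_fix" = "FAIL" ] && [ "$with_fix" = "PASS" ]; then
    echo "PASSED"
  else
    echo "FAILED"
  fi
}
```

For example, `gate_status 2 PASS PASS` prints `FAILED`: tests that already pass without the fix do not demonstrate that they catch the bug.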
5 changes: 4 additions & 1 deletion .github/pr-review/pr-report.md
@@ -4,11 +4,14 @@

> 🚨 **DO NOT post any comments.** This phase only produces output files.

> 🚨 **DO NOT duplicate content from other phases.** Reference gate/try-fix results by status only (e.g., "Gate: ✅ PASSED") — do NOT copy their full output into report/content.md.

---

## Prerequisites

- Phases 1-3 (Pre-Flight, Gate, Try-Fix) must be complete before starting
- Phases 1-2 (Pre-Flight, Try-Fix) must be complete before starting
- Gate result is available from the prompt (ran separately before this skill)

---

5 changes: 3 additions & 2 deletions .github/scripts/BuildAndRunHostApp.ps1
@@ -313,8 +313,9 @@ try {
# Save test output to file
$testOutput | Out-File -FilePath $testOutputFile -Encoding UTF8

# Display test output
$testOutput | ForEach-Object { Write-Host $_ }
# Output test results to the output stream so callers can capture them
# (Write-Host goes to the Information stream which is not captured by 2>&1)
$testOutput | ForEach-Object { Write-Output $_ }

$testExitCode = $LASTEXITCODE

3 changes: 3 additions & 0 deletions .github/scripts/EstablishBrokenBaseline.ps1
@@ -64,6 +64,9 @@ $script:TestPathPatterns = @(
"*.Tests/*",
"*.UnitTests/*",
"*TestCases*",
"*TestUtils*",
"*DeviceTests.Runners*",
"*DeviceTests.Shared*",
"*snapshots*",
"*.png",
"*.jpg",
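
The three added entries extend the set of paths treated as test-related. As a rough illustration only, assuming the patterns are applied as wildcard matches against changed file paths (the matching code itself is not visible in this hunk), the shell sketch below checks a path against the patterns shown here. `is_test_path` is a hypothetical name, and the example paths in the usage note are made up.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: return 0 when a path matches one of the
# wildcard patterns visible in this hunk, 1 otherwise.
# Note: bash `case` matching is case-sensitive, which may differ
# from how the PowerShell script compares paths.
is_test_path() {
  case "$1" in
    *.Tests/*|*.UnitTests/*|*TestCases*|*TestUtils*|\
    *DeviceTests.Runners*|*DeviceTests.Shared*|*snapshots*|*.png|*.jpg)
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```

With this sketch, `is_test_path "src/TestUtils/src/Helper.cs"` succeeds because of the newly added `*TestUtils*` entry, while `is_test_path "src/Core/src/Button.cs"` does not match any pattern.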