| 1 | +--- |
| 2 | +name: analyze-test-run |
| 3 | +description: "Analyze a GitHub Actions integration test run and produce a skill invocation report with failure root-cause issues. TRIGGERS: analyze test run, skill invocation rate, test run report, compare test runs, skill invocation summary, test failure analysis, run report, test results, action run report" |
| 4 | +license: MIT |
| 5 | +metadata: |
| 6 | + author: Microsoft |
| 7 | + version: "1.0.0" |
| 8 | +--- |
| 9 | + |
| 10 | +# Analyze Test Run |
| 11 | + |
| 12 | +Downloads artifacts from a GitHub Actions integration test run, generates a summarized skill invocation report, and files GitHub issues for each test failure with root-cause analysis. |
| 13 | + |
| 14 | +## When to Use |
| 15 | + |
| 16 | +- Summarize results of a GitHub Actions integration test run |
| 17 | +- Calculate skill invocation rates for the skill under test |
| 18 | +- For azure-deploy tests: track the full deployment chain (azure-prepare → azure-validate → azure-deploy) |
| 19 | +- Compare skill invocation across two runs |
| 20 | +- File issues for test failures with root-cause context |
| 21 | + |
| 22 | +## Input |
| 23 | + |
| 24 | +| Parameter | Required | Description | |
| 25 | +|-----------|----------|-------------| |
| 26 | +| **Run ID or URL** | Yes | GitHub Actions run ID (e.g. `22373768875`) or full URL | |
| 27 | +| **Comparison Run** | No | Second run ID/URL for side-by-side comparison | |
| 28 | + |
| 29 | +## Workflow |
| 30 | + |
| 31 | +### Phase 1 — Download & Parse |
| 32 | + |
| 33 | +1. Extract the numeric run ID from the input (strip URL prefix if needed) |
| 34 | +2. Fetch run metadata: |
| 35 | + ```bash |
| 36 | + gh run view <run-id> --repo microsoft/GitHub-Copilot-for-Azure --json jobs,status,conclusion,name |
| 37 | + ``` |
| 38 | +3. Download artifacts to a temp directory: |
| 39 | + ```bash |
| 40 | + gh run download <run-id> --repo microsoft/GitHub-Copilot-for-Azure --dir "${TMPDIR:-/tmp}/gh-run-<run-id>" |
| 41 | + ``` |
| 42 | +4. Locate these files in the downloaded artifacts: |
| 43 | + - `junit.xml` — test pass/fail/skip/error results |
| 44 | + - `*-SKILL-REPORT.md` — generated skill report with per-test details |
| 45 | + - `agent-metadata-*.md` files — raw agent session logs per test |
| 46 | + |
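Step 1 of this phase can be sketched in Python as follows. This is a minimal illustration rather than part of the skill: the helper name `extract_run_id` is hypothetical, and the sketch assumes run URLs use the `/actions/runs/<id>` path layout.

```python
import re


def extract_run_id(value: str) -> str:
    """Return the numeric run ID from a bare ID or a GitHub Actions run URL."""
    value = value.strip()
    # A bare numeric ID is returned unchanged.
    if re.fullmatch(r"[0-9]+", value):
        return value
    # Otherwise look for the ID after /actions/runs/ (assumed URL layout).
    match = re.search(r"/actions/runs/([0-9]+)", value)
    if match is None:
        raise ValueError(f"cannot find a run ID in {value!r}")
    return match.group(1)
```

Trailing URL segments such as `/job/<job-id>` are ignored because only the digits directly after `/actions/runs/` are captured.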
| 47 | +### Phase 2 — Build Summary Report |
| 48 | + |
| 49 | +Produce a markdown report with four sections. See [report-format.md](references/report-format.md) for the exact template. |
| 50 | + |
| 51 | +**Section 1 — Test Results Overview** |
| 52 | + |
| 53 | +Parse `junit.xml` to build: |
| 54 | + |
| 55 | +| Metric | Value | |
| 56 | +|--------|-------| |
| 57 | +| Total tests | count from `<testsuites tests=…>` | |
| 58 | +| Executed | total − skipped | |
| 59 | +| Skipped | count of `<skipped/>` elements | |
| 60 | +| Passed | executed − failed − errors (errors = count of `<error>` elements) | |
| 61 | +| Failed | count of `<failure>` elements | |
| 62 | +| Test Pass Rate | passed / executed as % | |
| 63 | + |
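The metric definitions above could be computed roughly as in the Python sketch below. The function name `summarize_junit` is hypothetical. The sketch assumes the root element carries a `tests` attribute, and it counts `<skipped>`, `<failure>`, and `<error>` elements exactly as the table defines them, so a test case with several `<failure>` children would be counted more than once.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text: str) -> dict:
    """Compute the overview metrics from the text of a junit.xml file."""
    root = ET.fromstring(xml_text)
    # Total comes from the root element's tests attribute (assumed present).
    total = int(root.get("tests", "0"))
    skipped = sum(1 for _ in root.iter("skipped"))
    failed = sum(1 for _ in root.iter("failure"))
    errors = sum(1 for _ in root.iter("error"))
    executed = total - skipped
    passed = executed - failed - errors
    # Guard against division by zero when every test was skipped.
    pass_rate = round(passed / executed * 100, 1) if executed else 0.0
    return {
        "total": total,
        "executed": executed,
        "skipped": skipped,
        "passed": passed,
        "failed": failed,
        "errors": errors,
        "pass_rate": pass_rate,
    }
```

Rounding the pass rate to one decimal place is a choice made for this sketch; the report template may call for a different precision.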
| 64 | +Include a per-test table with name, duration (the `time` attribute, converted from seconds to `Xm Ys`), and Pass/Fail result. |
| 65 | + |
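The seconds-to-`Xm Ys` conversion for the per-test table might look like the following Python sketch. The name `format_duration` is hypothetical, and rounding to the nearest whole second is an assumption, since the text does not say how fractional seconds should be handled.

```python
def format_duration(seconds_text: str) -> str:
    """Convert a JUnit time attribute (seconds, possibly fractional) to 'Xm Ys'."""
    # Assumption: round to the nearest whole second before splitting.
    total_seconds = int(round(float(seconds_text)))
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds}s"
```

For example, a `time` value of `125.4` is rendered as `2m 5s`.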
| 66 | +**Section 2 — Skill Invocation Rate** |
| 67 | + |
| 68 | +Read the "Per-Test Case Results" sections of SKILL-REPORT.md. For each executed test, determine whether the skill under test was invoked. |
| 69 | + |
| 70 | +The skills to track depend on which integration test suite the run belongs to: |
| 71 | + |
| 72 | +**azure-deploy integration tests** — track the full deployment chain: |
| 73 | + |
| 74 | +| Skill | How to detect | |
| 75 | +|-------|---------------| |
| 76 | +| `azure-prepare` | Mentioned as invoked in the narrative or agent-metadata | |
| 77 | +| `azure-validate` | Mentioned as invoked in the narrative or agent-metadata | |
| 78 | +| `azure-deploy` | Mentioned as invoked in the narrative or agent-metadata | |
| 79 | + |
| 80 | +Build a per-test invocation matrix (Yes/No for each skill) and compute rates: |
| 81 | + |
| 82 | +| Skill | Invocation Rate | |
| 83 | +|-------|----------------| |
| 84 | +| azure-deploy | X% (n/total) | |
| 85 | +| azure-prepare | X% (n/total) | |
| 86 | +| azure-validate | X% (n/total) | |
| 87 | +| Full skill chain (P→V→D) | X% (n/total) | |
| 88 | + |
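Given a per-test Yes/No matrix, the rates in the table above could be derived as in this Python sketch. The function name `invocation_rates` and the input shape (test name mapped to a skill-to-boolean mapping) are assumptions made for illustration. A Yes/No matrix cannot confirm the order of invocation, so the chain rate here only checks that all three skills were invoked in the same test.

```python
from typing import Dict, Mapping

CHAIN = ("azure-prepare", "azure-validate", "azure-deploy")


def invocation_rates(matrix: Mapping[str, Mapping[str, bool]]) -> Dict[str, str]:
    """Format per-skill and full-chain invocation rates as 'X% (n/total)'."""
    total = len(matrix)

    def fmt(count: int) -> str:
        percent = round(count / total * 100) if total else 0
        return f"{percent}% ({count}/{total})"

    rates = {
        skill: fmt(sum(1 for row in matrix.values() if row.get(skill, False)))
        for skill in CHAIN
    }
    # The full chain counts only tests where every skill in CHAIN was invoked.
    rates["Full skill chain (P→V→D)"] = fmt(
        sum(1 for row in matrix.values() if all(row.get(s, False) for s in CHAIN))
    )
    return rates
```

For a non-deploy suite, the same formatting applies with a single skill and no chain row.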
| 89 | +> The azure-deploy integration tests exercise the full deployment workflow where the agent is expected to invoke azure-prepare, azure-validate, and azure-deploy in sequence. This three-skill chain tracking is **specific to azure-deploy tests only**. |
| 90 | + |
| 91 | +**All other integration tests** — track only the skill under test: |
| 92 | + |
| 93 | +| Skill | Invocation Rate | |
| 94 | +|-------|----------------| |
| 95 | +| {skill-under-test} | X% (n/total) | |
| 96 | + |
| 97 | +For non-deploy tests (e.g. azure-prepare, azure-ai, azure-kusto), only track whether the primary skill under test was invoked. Do not include azure-prepare/azure-validate/azure-deploy chain columns. |
| 98 | + |
| 99 | +**Section 3 — Report Confidence & Pass Rate** |
| 100 | + |
| 101 | +Extract from SKILL-REPORT.md: |
| 102 | +- Overall Test Pass Rate (from the report's statistics section) |
| 103 | +- Average Confidence (from the report's statistics section) |
| 104 | + |
| 105 | +**Section 4 — Comparison** (only when a second run is provided) |
| 106 | + |
| 107 | +Repeat Phase 1 and Sections 1–3 of Phase 2 for the second run, then produce a side-by-side delta table. See [report-format.md](references/report-format.md) § Comparison. |
| 108 | + |
| 109 | +### Phase 3 — File Issues for Failures |
| 110 | + |
| 111 | +For every test with a `<failure>` element in `junit.xml`: |
| 112 | + |
| 113 | +1. Read the failure message and file:line from the XML |
| 114 | +2. Read the actual line of code from the test file at that location |
| 115 | +3. Read the `agent-metadata-*.md` for that test from the artifacts |
| 116 | +4. Read the corresponding section in the SKILL-REPORT.md for context on what the agent did |
| 117 | +5. Determine root cause category: |
| 118 | + - **Skill not invoked** — agent bypassed skills and used manual commands |
| 119 | + - **Deployment failure** — infrastructure or RBAC error during deployment |
| 120 | + - **Timeout** — test exceeded time limit |
| 121 | + - **Assertion mismatch** — expected files/links not found |
| 122 | + - **Quota exhaustion** — Azure region quota prevented deployment |
| 123 | +6. Create a GitHub issue: |
| 124 | + |
| 125 | +```bash |
| 126 | +gh issue create --repo microsoft/GitHub-Copilot-for-Azure \ |
| 127 | + --title "Integration test failure: <skill> – <keywords> [<root-cause-category>]" \ |
| 128 | + --label "bug,integration-test" \ |
| 129 | + --body "<body>" |
| 130 | +``` |
| 131 | + |
| 132 | + **Title format:** `Integration test failure: {skill} – {keywords} [{root-cause-category}]` |
| 133 | + - `{keywords}`: 2-4 words from the test name — app type (function app, static web app) + IaC type (Terraform, Bicep) + trigger if relevant |
| 134 | + - `{root-cause-category}`: one of the categories from step 5 in brackets |
| 135 | + |
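The title format can be assembled and checked against the step 5 categories with a small Python sketch like the one below. The helper `issue_title` is hypothetical. The category strings and the en-dash separator are copied from the text above, and rejecting unknown categories is a choice made for this sketch.

```python
ROOT_CAUSE_CATEGORIES = (
    "Skill not invoked",
    "Deployment failure",
    "Timeout",
    "Assertion mismatch",
    "Quota exhaustion",
)


def issue_title(skill: str, keywords: str, root_cause: str) -> str:
    """Build an issue title in the documented format."""
    # Reject anything outside the step 5 categories so titles stay consistent.
    if root_cause not in ROOT_CAUSE_CATEGORIES:
        raise ValueError(f"unknown root-cause category: {root_cause!r}")
    return f"Integration test failure: {skill} – {keywords} [{root_cause}]"
```

Validating the category before building the title keeps issue titles consistent and searchable by the bracketed suffix.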
| 136 | +Issue body template — see [issue-template.md](references/issue-template.md). |
| 137 | + |
| 138 | +> ⚠️ **Note:** Do NOT include the Error Details (JUnit XML) or Agent Metadata sections in the issue body. Keep issues concise with the diagnosis, prompt context, skill report context, and environment sections only. |
| 139 | +
|
| 140 | +> For azure-deploy integration tests, include an "azure-deploy Skill Invocation" section showing whether azure-deploy was invoked (Yes/No), with a note that the full chain is azure-prepare → azure-validate → azure-deploy. For all other integration tests, include a "{skill} Skill Invocation" section showing only whether the primary skill under test was invoked. |
| 141 | + |
| 142 | +## Error Handling |
| 143 | + |
| 144 | +| Error | Cause | Fix | |
| 145 | +|-------|-------|-----| |
| 146 | +| `gh: command not found` | GitHub CLI not installed | Install with `winget install GitHub.cli` or `brew install gh` | |
| 147 | +| `no artifacts found` | Run has no uploadable reports | Verify the run completed the "Export report" step | |
| 148 | +| `HTTP 404` on run view | Invalid run ID or no access | Check the run ID and ensure `gh auth status` is authenticated | |
| 149 | +| `rate limit exceeded` | Too many GitHub API calls | Wait and retry, or use `--limit` on searches | |
| 150 | + |
| 151 | +## References |
| 152 | + |
| 153 | +- [report-format.md](references/report-format.md) — Output report template |
| 154 | +- [issue-template.md](references/issue-template.md) — GitHub issue body template |