Commit 76cccf7

lewing and Copilot authored
Update 3 skills: ci-analysis-flow-analysis-flow-tracing (#4)
* Update 3 skills: ci-analysis-flow-analysis-flow-tracing

  Synced from copilot-skills

* Add maestro MCP server to plugin.json

  Required by flow-analysis and flow-tracing skills for subscription health, build freshness, and codeflow management queries.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review feedback: fix deadlock, timeout, doc accuracy

  - flow-health.cs: Read stdout/stderr async to prevent deadlock on full stderr buffer; handle WaitForExit timeout by killing process
  - Get-SdkVersionTrace.ps1: Normalize preview.1→preview1 for VMR branch names; fix fallback text to reference <Sha> element not attribute; replace MCP tool name with generic guidance
  - servicing-topology.md: Fix Sha attribute→<Sha> element; fix source-manifest.json path to src/source-manifest.json

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* flow-analysis: integrate codeflow_statuses, cross-validation, force-trigger

  Trained from sessions analyzing Maestro bookkeeping bug.

  Changes:
  - Add Step 0: codeflow statuses as fast-path entry point
  - Expand forward flow diagnosis with bookkeeping bug pattern
  - Add cross-validation via validate option on subscription health
  - Update remediation: smart trigger, force-trigger via MCP

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update ci-analysis, flow-analysis, flow-tracing from copilot-skills

  - ci-analysis: remove slop, tighten reference docs, drop ADO MCP dependency
  - flow-analysis: sync latest content
  - flow-tracing: sync latest content

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update ci-analysis: remove binlog-comparison reference (extracted to separate skill)

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review feedback

  - flow-health.cs: properly async (WaitForExitAsync), SemaphoreSlim throttle, gh failure tracking with non-zero exit, auth check at startup, forward-flow ci-red count fix in summary
  - Get-SdkVersionTrace.ps1: Invoke-RestMethod instead of Invoke-WebRequest, single-match constraint in Find-ComponentInManifest, 1xx GA branch fallback to main, gh api fallback text
  - vmr-build-topology.md: HttpClient instead of Invoke-WebRequest in example

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Remove ADO MCP servers from plugin.json

  The Helix MCP (hlx) now provides all the AzDO pipeline and artifact functionality that ci-analysis needs. The separate ado-dnceng and ado-dnceng-public MCP servers are no longer required.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address second round of review feedback

  - flow-health.cs: use try/catch TimeoutException instead of ContinueWith
  - Get-SdkVersionTrace.ps1: URL-encode branch name in gh api call
  - ci-analysis SKILL.md: fix dangling binlog-comparison skill reference

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
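The flow-health.cs fixes listed in the commit message (reading stdout and stderr asynchronously, killing the process on timeout, throttling concurrent calls) follow a general pattern that can be sketched in Python. This is an illustration only, not the commit's C# code: `CommandResult`, `run_command`, and `run_all` are hypothetical names, and `asyncio.Semaphore` stands in for `SemaphoreSlim`.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class CommandResult:
    """Hypothetical result record; returncode is None when the process was killed."""
    returncode: Optional[int]
    stdout: str
    stderr: str


async def run_command(cmd: Sequence[str], sem: asyncio.Semaphore,
                      timeout: float) -> CommandResult:
    # The semaphore caps how many child processes run at once.
    async with sem:
        proc = await asyncio.create_subprocess_exec(
            *cmd,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        try:
            # communicate() drains both pipes concurrently, so a child that
            # fills its stderr buffer cannot block while we wait on stdout.
            out, err = await asyncio.wait_for(proc.communicate(), timeout)
        except asyncio.TimeoutError:
            # On timeout, kill the child and reap it instead of leaving it running.
            proc.kill()
            await proc.wait()
            return CommandResult(None, "", "timed out")
        return CommandResult(
            proc.returncode,
            out.decode(errors="replace"),
            err.decode(errors="replace"),
        )


async def run_all(commands: Sequence[Sequence[str]], limit: int = 4,
                  timeout: float = 30.0) -> List[CommandResult]:
    """Run all commands with at most `limit` in flight; results keep input order."""
    sem = asyncio.Semaphore(limit)
    return await asyncio.gather(*(run_command(c, sem, timeout) for c in commands))
```

A caller would invoke `asyncio.run(run_all(commands))` and treat any result whose `returncode` is `None` or non-zero as a failure to report, which loosely mirrors the "gh failure tracking with non-zero exit" item.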
1 parent 4587349 commit 76cccf7
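Two of the Get-SdkVersionTrace.ps1 items in the commit message above, normalizing `preview.1` to `preview1` and URL-encoding the branch name before a `gh api` call, can be illustrated with a short Python sketch. The function names and the sample branch string are assumptions made for illustration; the script's real branch format is not shown in this diff.

```python
import re
from urllib.parse import quote


def normalize_preview_label(version: str) -> str:
    """Collapse 'preview.N' into 'previewN' (hypothetical helper)."""
    return re.sub(r"preview\.(\d+)", r"preview\1", version)


def branch_api_path(owner: str, repo: str, branch: str) -> str:
    """Build a GitHub REST path for a branch, percent-encoding every reserved character.

    safe="" makes quote() encode '/' as well, so a branch such as
    'release/x' stays a single path segment.
    """
    return f"repos/{owner}/{repo}/branches/{quote(branch, safe='')}"
```

For example, a hypothetical branch `release/10.0.1xx-preview1` would be sent as `release%2F10.0.1xx-preview1`, which keeps the slash from being read as a path separator.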

23 files changed, with 4707 additions and 163 deletions

plugins/dotnet-dnceng/plugin.json

Lines changed: 8 additions & 22 deletions
@@ -15,28 +15,6 @@
   ],
   "skills": "./skills",
   "mcpServers": {
-    "ado-dnceng-public": {
-      "command": "npx",
-      "args": [
-        "-y",
-        "@azure-devops/mcp",
-        "dnceng-public"
-      ],
-      "env": {
-        "AZURE_DEVOPS_AUTH_METHOD": "azure-cli"
-      }
-    },
-    "ado-dnceng": {
-      "command": "npx",
-      "args": [
-        "-y",
-        "@azure-devops/mcp",
-        "dnceng"
-      ],
-      "env": {
-        "AZURE_DEVOPS_AUTH_METHOD": "azure-cli"
-      }
-    },
     "hlx": {
       "command": "dotnet",
       "args": [
@@ -56,6 +34,14 @@
     "mihubot": {
       "type": "http",
       "url": "https://mihubot.xyz/mcp"
+    },
+    "maestro": {
+      "command": "dotnet",
+      "args": [
+        "dnx",
+        "--yes",
+        "lewing.maestro.mcp"
+      ]
     }
   }
 }

plugins/dotnet-dnceng/skills/ci-analysis/SKILL.md

Lines changed: 1 addition & 3 deletions
@@ -53,8 +53,6 @@ For full parameter reference and mode details, see [references/script-modes.md](

 ## Step 0: Gather Context (before running anything)

-Context changes how you interpret every failure. **Don't skip this.**
-
 1. **Read PR metadata** — title, description, author, labels, linked issues
 2. **Classify the PR type**:

@@ -121,7 +119,7 @@ Lead with a 1-2 sentence verdict, then the summary table, then detail bullets (o
 - **Recommendation generation**: [references/recommendation-generation.md](references/recommendation-generation.md)
 - **Analysis workflow (Steps 1–3)**: [references/analysis-workflow.md](references/analysis-workflow.md)
 - **Helix artifacts & binlogs**: [references/helix-artifacts.md](references/helix-artifacts.md)
-- **Binlog comparison**: [references/binlog-comparison.md](references/binlog-comparison.md)
+- **Binlog comparison**: For cross-build binlog diffs, use deep investigation techniques from [references/delegation-patterns.md](references/delegation-patterns.md)
 - **Build progression analysis**: [references/build-progression-analysis.md](references/build-progression-analysis.md)
 - **Subagent delegation**: [references/delegation-patterns.md](references/delegation-patterns.md)
 - **Azure CLI investigation**: [references/azure-cli.md](references/azure-cli.md)

plugins/dotnet-dnceng/skills/ci-analysis/references/azure-cli.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ The AzDO MCP tools handle most pipeline queries directly. This reference covers

 When the CI script and GitHub APIs aren't enough (e.g., investigating internal pipeline definitions or downloading build artifacts), use the Azure CLI with the `azure-devops` extension.

-> 💡 **Prefer `az pipelines` / `az devops` commands over raw REST API calls.** The CLI handles authentication, pagination, and JSON output formatting. Only fall back to manual `Invoke-RestMethod` calls when the CLI doesn't expose the endpoint you need (e.g., build timelines). The CLI's `--query` (JMESPath) and `-o table` flags are powerful for filtering without extra scripting.
+> 💡 Use `az pipelines` before raw REST. `--query` (JMESPath) and `-o table` are useful for filtering.

 ## Checking Authentication


plugins/dotnet-dnceng/skills/ci-analysis/references/binlog-comparison.md

Lines changed: 0 additions & 119 deletions
This file was deleted.

plugins/dotnet-dnceng/skills/ci-analysis/references/build-progression-analysis.md

Lines changed: 2 additions & 4 deletions
@@ -12,9 +12,7 @@ When the current build is failing, the PR's build history can reveal whether the

 ### Step 0: Start with the recent builds

-Don't try to analyze the full build history upfront — especially on large PRs with many pushes. Start with the most recent N builds (5-8), present the progression table, and let the user decide whether to dig deeper into earlier builds.
-
-On large PRs, the user is usually iterating toward a solution. The recent builds are the most relevant. Offer: "Here are the last N builds — the pass→fail transition was between X and Y. Want me to look at earlier builds?"
+Start with the 5-8 most recent builds. Present the progression table and offer to look at earlier builds if needed.

 ### Step 1: List builds for the PR

@@ -24,7 +22,7 @@ On large PRs, the user is usually iterating toward a solution. The recent builds

 Query AzDO for builds on `refs/pull/{PR}/merge` branch, sorted by queue time descending, top 20, in the `public` project. The response includes `triggerInfo` with `pr.sourceSha` — the PR's HEAD commit for each build.

-> 💡 Key parameters: `branchName: "refs/pull/{PR}/merge"`, `queryOrder: "QueueTimeDescending"`, `top: 20`, project `public` (for dnceng-public org).
+> Key parameters: `branchName: "refs/pull/{PR}/merge"`, `queryOrder: "QueueTimeDescending"`, `top: 20`, project `public` (for dnceng-public org).

 **Without MCP (fallback):**
 ```powershell

plugins/dotnet-dnceng/skills/ci-analysis/references/delegation-patterns.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Subagent Delegation Patterns

-CI investigations involve repetitive, mechanical work that burns main conversation context. Delegate data gathering to subagents; keep interpretation in the main agent.
+Delegate data gathering to subagents; keep interpretation in main.

 ## Pattern 1: Scanning Multiple Console Logs


plugins/dotnet-dnceng/skills/ci-analysis/references/failure-interpretation.md

Lines changed: 2 additions & 4 deletions
@@ -2,14 +2,12 @@

 ## Result Categories

-**Known Issues section**: Failures matching existing GitHub issues — these are tracked and being investigated.
+**Known Issues section**: Failures matching existing GitHub issues.

-**Build Analysis check status**: The "Build Analysis" GitHub check is **green** only when *every* failure is matched to a known issue. If it's **red**, at least one failure is unaccounted for — do NOT claim "all failures are known issues" just because some known issues were found. You must verify each failing job is covered by a specific known issue before calling it safe to retry.
+**Build Analysis check status**: Green = *every* failure matched a known issue. Red = at least one unmatched. Verify each failing job is covered before calling it safe to retry.

 **Canceled/timed-out jobs**: Jobs canceled due to earlier stage failures or AzDO timeouts. Dependency-canceled jobs don't need investigation. **Timeout-canceled jobs may have all-passing Helix results** — the "failure" is just the AzDO job wrapper timing out, not actual test failures. To verify: get the Helix job pass/fail summary for each job in the timed-out build (include passed work items). If all work items passed, the build effectively passed.

-> **Don't dismiss timed-out builds.** A build marked "failed" due to a 3-hour AzDO timeout can have 100% passing Helix work items. Check before concluding it failed.
-
 **PR Change Correlation**: Files changed by PR appearing in failures — likely PR-related.

 **Build errors**: Compilation failures need code fixes.
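
The Build Analysis and timed-out-job rules in the failure-interpretation.md hunk above can be expressed as two small checks. The sketch below is a Python illustration under assumed data shapes: `FailingJob` and its fields are hypothetical and do not reflect the CI script's actual output schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FailingJob:
    """Hypothetical record for one failing AzDO job."""
    name: str
    known_issue: Optional[str] = None   # reference to the matched GitHub issue, if any
    timed_out: bool = False             # the AzDO job wrapper hit its timeout
    work_items: Dict[str, bool] = field(default_factory=dict)  # Helix work item -> passed


def unmatched_jobs(jobs: List[FailingJob]) -> List[str]:
    """Names of failing jobs that no known issue covers (these keep the check red)."""
    return [job.name for job in jobs if job.known_issue is None]


def safe_to_retry(jobs: List[FailingJob]) -> bool:
    """True only when every failing job is covered by a known issue."""
    return not unmatched_jobs(jobs)


def helix_all_passed(job: FailingJob) -> bool:
    """True when a timed-out job has Helix results and every work item passed.

    An empty result set is treated as 'not verified' rather than as a pass.
    """
    return job.timed_out and bool(job.work_items) and all(job.work_items.values())
```

Keeping the two checks separate reflects the text: known-issue coverage decides whether a retry is safe, while the Helix summary decides whether a timed-out build effectively passed.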

plugins/dotnet-dnceng/skills/ci-analysis/references/helix-artifacts.md

Lines changed: 0 additions & 2 deletions
@@ -30,8 +30,6 @@ The `Files` array contains artifacts with `FileName` and `Uri` properties.
 - **Standard unit tests** → Console logs only, no binlogs
 - **Crash failures** (exit code 134) → Core dumps may be present

-Always query the specific work item to see what's available rather than assuming a fixed structure.
-
 ## Common Artifact Patterns

 | File Pattern | Purpose | When Useful |

plugins/dotnet-dnceng/skills/ci-analysis/references/manual-investigation.md

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ $logContent | Select-String -Pattern "error|FAIL" -Context 2,5

 ## Search Helix Logs and Artifacts Remotely

-> 💡 **Prefer remote search over download.** Search Helix console logs and uploaded files in place — find errors without downloading first. Only fall back to full log retrieval or file download when remote search isn't sufficient.
+> 💡 Search logs remotely before downloading.

 Use your Helix MCP tools to:
 - **Search a work item's console log** for error patterns (with context lines)

plugins/dotnet-dnceng/skills/ci-analysis/references/recommendation-generation.md

Lines changed: 2 additions & 2 deletions
@@ -16,9 +16,9 @@ Read `recommendationHint` as a starting point, then layer in context:
 | `MERGE_CONFLICTS` | PR has merge conflicts — CI won't run. Tell the user to resolve conflicts. Offer to analyze a previous build by ID. |
 | `NO_BUILDS` | No AzDO builds found (CI not triggered). Offer to check if CI needs to be triggered or analyze a previous build. |

-## Layering Nuance
+## Refining with Context

-Then layer in nuance the heuristic can't capture:
+Refine the recommendation with context the heuristic can't capture:

 - **Mixed signals**: Some failures match known issues AND some correlate with PR changes → separate them. Known issues = safe to retry; correlated = fix first.
 - **Canceled jobs with recoverable results**: If `canceledJobNames` is non-empty, mention that canceled jobs may have passing Helix results (see [failure-interpretation.md](failure-interpretation.md) — Recovering Results).
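
The "Mixed signals" rule above amounts to partitioning the failed jobs before recommending anything. A minimal Python sketch follows, assuming the job names arrive as plain strings. The function name, the bucket names, and the extra `investigate` bucket are illustrative assumptions, as is the choice to let PR correlation take priority when a job matches both signals.

```python
from typing import Dict, Iterable, List, Set


def separate_failures(failed_jobs: Iterable[str],
                      known_issue_jobs: Set[str],
                      pr_correlated_jobs: Set[str]) -> Dict[str, List[str]]:
    """Split failures into retry / fix-first / investigate buckets, keeping input order."""
    buckets: Dict[str, List[str]] = {"fix_first": [], "retry": [], "investigate": []}
    for job in failed_jobs:
        if job in pr_correlated_jobs:
            # Assumption: a PR-correlated failure needs a fix even if it also
            # resembles a known issue.
            buckets["fix_first"].append(job)
        elif job in known_issue_jobs:
            buckets["retry"].append(job)
        else:
            # Neither signal applies; the source does not name this case, so it
            # is flagged for manual investigation here.
            buckets["investigate"].append(job)
    return buckets
```

Reporting the buckets separately keeps a known-issue match on one job from hiding a PR-related failure on another.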
