Canceled AzDO jobs (typically from timeouts) still have pipeline
artifacts containing binlogs. The SendToHelix.binlog contains Helix
job IDs that can be queried directly to recover actual test results.
Discovered while investigating PR #124125 where a 3-hour timeout
caused a WasmBuildTests job to be canceled, but all 226 Helix work
items had actually passed.
`.github/skills/azdo-helix-failures/SKILL.md` (15 additions, 0 deletions)
@@ -111,10 +111,25 @@ The script provides a recommendation at the end, but this is based on heuristics
 - **Manual investigation steps**: See [references/manual-investigation.md](references/manual-investigation.md)
 - **AzDO/Helix details**: See [references/azdo-helix-reference.md](references/azdo-helix-reference.md)
 
+## Recovering Results from Canceled Jobs
+
+Canceled jobs (typically from timeouts) often still have useful artifacts. The Helix work items may have completed successfully even though the AzDO job was killed while waiting to collect results.
+
+**To investigate canceled jobs:**
+
+1. **Download build artifacts**: Use the AzDO artifacts API to get `Logs_Build_*` pipeline artifacts for the canceled job. These contain binlogs even for canceled jobs.
+2. **Extract Helix job IDs**: Use the binlog MCP server to load the `SendToHelix.binlog` and search for `"Sent Helix Job"` messages. Each contains a Helix job ID.
+3. **Query Helix directly**: For each job ID, query `https://helix.dot.net/api/jobs/{jobId}/workitems?api-version=2019-06-17` to get actual pass/fail results.
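Steps 2 and 3 can be sketched in Python, assuming the binlog messages have already been exported to plain text (for example with the binlog MCP server). Only the `"Sent Helix Job"` marker and the URL template come from the steps above. The GUID shape of the job ID, the sample message wording, and the helper names `extract_helix_job_ids` and `workitems_url` are assumptions made for illustration.

```python
import re

# Assumption: Helix job IDs are GUIDs (8-4-4-4-12 hex digits).
GUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def extract_helix_job_ids(log_text):
    """Return unique job IDs, in first-seen order, from 'Sent Helix Job' lines."""
    job_ids = []
    for line in log_text.splitlines():
        if "Sent Helix Job" not in line:
            continue  # ignore unrelated build messages
        for match in GUID_RE.findall(line):
            job_id = match.lower()
            if job_id not in job_ids:
                job_ids.append(job_id)
    return job_ids


def workitems_url(job_id):
    """Build the work-items query URL quoted in step 3."""
    return f"https://helix.dot.net/api/jobs/{job_id}/workitems?api-version=2019-06-17"


if __name__ == "__main__":
    # Hypothetical exported messages; the real wording may differ.
    sample = (
        "Restoring packages\n"
        "Sent Helix Job; see work items at https://helix.dot.net/api/jobs/"
        "01234567-89ab-cdef-0123-456789abcdef/workitems?api-version=2019-06-17\n"
    )
    for job_id in extract_helix_job_ids(sample):
        print(workitems_url(job_id))
```

Matching on the marker text first keeps unrelated GUIDs elsewhere in the log (correlation IDs, package hashes) out of the result.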
+
+**Example**: A `browser-wasm windows WasmBuildTests` job was canceled after 3 hours. The binlog (truncated) still contained 12 Helix job IDs. Querying them revealed all 226 work items passed; the "failure" was purely a timeout in the AzDO wrapper.
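A check like the one in this example could be scripted roughly as follows. This is a sketch under stated assumptions: the response is taken to be a JSON array of work-item objects, and the `Name` and `ExitCode` field names are guesses about its shape, so inspect a real response before relying on them. The helpers `fetch_work_items` and `tally` are hypothetical names.

```python
import json
from urllib.request import urlopen


def fetch_work_items(job_id):
    """Fetch the work-item list for one Helix job (assumed to be a JSON array)."""
    url = f"https://helix.dot.net/api/jobs/{job_id}/workitems?api-version=2019-06-17"
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def tally(work_items_by_job, passed=lambda item: item.get("ExitCode") == 0):
    """Count work items across jobs and collect (job_id, name) for non-passing ones.

    Assumption: the default `passed` predicate treats ExitCode == 0 as a pass;
    supply a different predicate if the real payload reports results differently.
    """
    total = 0
    failures = []
    for job_id, items in work_items_by_job.items():
        for item in items:
            total += 1
            if not passed(item):
                failures.append((job_id, item.get("Name")))
    return total, failures


# Usage (performs network requests, so it is left commented out):
# results = {job_id: fetch_work_items(job_id) for job_id in job_ids}
# total, failures = tally(results)
# print(f"{total - len(failures)}/{total} work items passed")
```

If `failures` comes back empty and `total` matches the expected number of work items, the canceled job's tests passed and only the AzDO wrapper timed out.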
+
+**Key insight**: "Canceled" ≠ "Failed". Always check artifacts before concluding results are lost.
+
 ## Tips
 
 1. Read PR description and comments first for context
 2. Check if same test fails on main branch before assuming transient
 3. Look for `[ActiveIssue]` attributes for known skipped tests
 4. Use `-SearchMihuBot` for semantic search of related issues
 5. Binlogs in artifacts help diagnose MSB4018 task failures
+6. Use the binlog MCP server (`binlog.mcp`) to search binlogs for Helix job IDs, build errors, and properties