plugins/eng/skills/debug/SKILL.md
Lines changed: 4 additions & 4 deletions
@@ -131,9 +131,9 @@ Before starting any work, create these tasks using `TaskCreate`. This makes the

 Create these tasks in order:

-1. **Debug: Phase 1 — triage** — Parse complete error output (every word). Classify bug into one of 9 categories. Load relevant triage playbook. Identify files to investigate from stack trace.
+1. **Debug: Phase 1 — classify bug and load playbook** — Parse complete error output (every word). Classify bug into one of 9 categories. Load relevant triage playbook. Identify files to investigate from stack trace.
 2. **Debug: Phase 2 — reproduce and comprehend** — Reproduce failure reliably. Map relevant code area (30-50 lines context, follow imports, read 2-3 siblings). Check system state. Check git history. Build mental model: expected vs actual behavior. State premises with file:line citations.
-3. **Debug: Phase 3 — investigate** — Present all plausible hypotheses in one batch ranked by confidence. Test each via hypothesis-test-refine cycle (predict before running). Record verdict per hypothesis. Switch strategy after 3 rejections. Escalate after 5 hypotheses.
+3. **Debug: Phase 3 — hypothesis-driven root cause investigation** — Present all plausible hypotheses in one batch ranked by confidence. Test each via hypothesis-test-refine cycle (predict before running). Record verdict per hypothesis. Switch strategy after 3 rejections. Escalate after 5 hypotheses.
 4. **Debug: Phase 4 — classify root cause** — Classify: dev environment/config issue vs code bug vs both. This determines the resolution path.
 5. **Debug: Phase 5 — report and recommend** — Deliver structured findings: root cause summary (file:function:logic + evidence chain), recommended fix strategy, similar patterns, hardening recommendations. Clean up diagnostic artifacts. NO FIX CODE.

@@ -143,9 +143,9 @@ Use `addBlockedBy` to enforce ordering. As each phase begins, mark its task `in_

 | Task | Done when |
 |---|---|
-| Triage | Bug category identified, playbook loaded, relevant files identified from error signal |
+| Classify + load playbook | Bug category identified, playbook loaded, relevant files identified from error signal |
 | Reproduce | Failure reproduced on demand (or documented why it can't be), expected vs actual behavior gap articulated, premises stated with file:line |
-| Investigate | Specific root cause identified with evidence from at least one diagnostic action |
+| Hypothesis-driven investigation | Specific root cause identified with evidence from at least one diagnostic action |
 | Classify | Root cause classified as env/config, code bug, or both |
 | Report | Structured findings delivered with file:line specificity, diagnostic artifacts documented, no fix code written |
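
The ordering rule quoted in the hunk header above (use `addBlockedBy`, mark each task `in_progress` and then `completed`) can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the `Task` class and the `chain`, `start`, and `complete` helpers are hypothetical stand-ins, not the real `TaskCreate` or `addBlockedBy` interface, whose signatures this diff does not show. The `"pending"` initial status is likewise assumed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """Hypothetical stand-in for a task created via TaskCreate."""
    name: str
    status: str = "pending"  # assumed initial status, not taken from the diff
    blocked_by: List["Task"] = field(default_factory=list)


def chain(names: List[str]) -> List[Task]:
    """Create tasks in order, each one blocked by the task before it."""
    tasks: List[Task] = []
    for name in names:
        task = Task(name)
        if tasks:
            task.blocked_by.append(tasks[-1])
        tasks.append(task)
    return tasks


def start(task: Task) -> None:
    """Mark a task in_progress, refusing while any blocker is unfinished."""
    unfinished = [b.name for b in task.blocked_by if b.status != "completed"]
    if unfinished:
        raise RuntimeError(f"{task.name} is blocked by {unfinished}")
    task.status = "in_progress"


def complete(task: Task) -> None:
    """Mark a task completed so the next task in the chain can start."""
    task.status = "completed"


phases = chain(["Phase 1", "Phase 2", "Phase 3", "Phase 4", "Phase 5"])
start(phases[0])
complete(phases[0])
start(phases[1])  # allowed only because Phase 1 is completed
```

Chaining each task to its predecessor is the simplest way to get strict phase order; a real task tool may allow richer dependency graphs.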
-2. **Explore: Investigate** — execute active phases (map surfaces, search and trace, inspect patterns — based on selected lenses)
+2. **Explore: Execute active lenses — map, trace, or inspect** — execute active phases (map surfaces, search and trace, inspect patterns — based on selected lenses)
 3. **Explore: Synthesize** — produce brief in appropriate format (pattern, trace, world model, or combined) with confidence provenance and gap discipline

 Mark each task `in_progress` when starting and `completed` when finished.
plugins/eng/skills/qa/SKILL.md
Lines changed: 2 additions & 2 deletions
@@ -39,7 +39,7 @@ Before starting any work, create these tasks using `TaskCreate`. This makes the
 Create these tasks in order:

 1. **QA: Detect tools and gather context** — Probe for available testing tools (browser/Playwright, desktop/Peekaboo, shell). Document tool gaps. Gather feature context from SPEC.md, PR, or feature description. Build mental model of what was built and what surfaces were touched.
-2. **QA: Derive test plan** — Identify concrete scenarios requiring manual verification. Apply formalization gate to each: if automatable with easy-medium effort, write the formal test instead. Categorize remaining scenarios. Create qa-progress.json with all scenarios in "planned" status (when tmp/ship/ exists) or persist checklist to PR body (standalone mode).
+2. **QA: Derive test plan and apply formalization gate** — Identify concrete scenarios requiring manual verification. Apply formalization gate to each: if automatable with easy-medium effort, write the formal test instead. Categorize remaining scenarios. Create qa-progress.json with all scenarios in "planned" status (when tmp/ship/ exists) or persist checklist to PR body (standalone mode).
 3. **QA: Execute test scenarios** — Work through each scenario using strongest available tool (browser > API > shell > inference). Test happy path first, then break it, then stress it. Record video evidence for browser scenarios. Fix bugs discovered during testing.
 4. **QA: Record results** — Update qa-progress.json for every scenario: set status (validated/failed/blocked/skipped), verifiedVia fidelity level, notes, and evidence URLs.
 5. **QA: Report** — Communicate results to invoker. Total scenarios tested vs passed vs failed vs skipped. Bugs found and fixed. Gaps that could NOT be tested. Judgment call on readiness.
@@ -51,7 +51,7 @@ Use `addBlockedBy` to enforce ordering. As each step begins, mark its task `in_p
 | Task | Done when |
 |---|---|
 | Detect tools + context | Tool inventory documented, feature context understood, mental model built |
-| Derive test plan | All scenarios identified, formalization gate applied, qa-progress.json created with all scenarios in `planned` status |
+| Derive test plan + formalization gate | All scenarios identified, formalization gate applied (automatable scenarios converted to formal tests), qa-progress.json created with all scenarios in `planned` status |
 | Execute | All planned scenarios executed (or marked blocked/skipped with reason), bugs found are fixed or documented |
 | Record results | Every scenario in qa-progress.json has non-`planned` status, verifiedVia populated, notes populated for non-clean-pass scenarios |
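
The "Record results" completion criterion in the table above could be checked mechanically. The sketch below is illustrative only: the diff does not show the schema of qa-progress.json, so the top-level `scenarios` list and the `name` key are assumptions, and treating any status other than `validated` as a "non-clean pass" is one possible reading. The status values and the `verifiedVia` and `notes` fields come from the task descriptions.

```python
from typing import Any, Dict, List

# Final statuses named in the "QA: Record results" task description.
FINAL_STATUSES = {"validated", "failed", "blocked", "skipped"}


def record_results_gaps(progress: Dict[str, Any]) -> List[str]:
    """Return human-readable reasons why "Record results" is not done yet.

    Assumes a hypothetical layout: {"scenarios": [{"name", "status",
    "verifiedVia", "notes"}, ...]}. An empty list means the check passes.
    """
    gaps: List[str] = []
    for scenario in progress.get("scenarios", []):
        name = scenario.get("name", "<unnamed>")
        status = scenario.get("status")
        if status not in FINAL_STATUSES:
            gaps.append(f"{name}: status is {status!r}")
        if not scenario.get("verifiedVia"):
            gaps.append(f"{name}: verifiedVia is empty")
        # Assumption: only a "validated" scenario counts as a clean pass.
        if status != "validated" and not scenario.get("notes"):
            gaps.append(f"{name}: notes are required for a non-clean pass")
    return gaps


sample = {
    "scenarios": [
        {"name": "login", "status": "validated", "verifiedVia": "browser"},
        {"name": "export", "status": "planned"},
    ]
}
for gap in record_results_gaps(sample):
    print(gap)
```

Returning a list of gaps rather than a boolean keeps the reason for an incomplete task visible when reporting back.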
plugins/eng/skills/review/SKILL.md
Lines changed: 2 additions & 2 deletions
@@ -29,7 +29,7 @@ Before starting any work, create these tasks using `TaskCreate`. This makes the

 Create these tasks in order:

-1. **Review: Resolve PR and assess starting state** — Verify gh auth. Detect PR number. Check for unpushed local changes. Fetch existing review feedback. Update PR body if implementation changed.
+1. **Review: Assess starting state — detect PR, fetch feedback, check local changes** — Verify gh auth. Detect PR number. Check for unpushed local changes. Fetch existing review feedback. Update PR body if implementation changed.
 2. **Review: Stage 1 — review feedback loop** — Poll for reviewer feedback (6-min intervals). For each thread: investigate proportionally, evaluate across all dimensions, decide with evidence, resolve thread. Implement accepted changes, run tests, push. Re-poll after every push.
 3. **Review: Stage 2 — CI/CD resolution** — Monitor pipeline. Classify each failure with evidence (PR-caused, pre-existing, flaky, infrastructure, cancelled). Fix PR-caused failures. Re-trigger flaky/infra/cancelled and wait for result.
 4. **Review: Final verification** — Verify: all threads resolved, CI/CD green or documented, no pending runs. If security/auth/multi-subsystem changes, trigger second-pass review.
@@ -40,7 +40,7 @@ Use `addBlockedBy` to enforce ordering. As each stage begins, mark its task `in_

 | Task | Done when |
 |---|---|
-| Resolve PR | Starting state determined, PR body current, existing feedback fetched |
+| Assess starting state | Starting state determined, PR body current, existing feedback fetched |
 | Stage 1 | Every thread resolved with evidence-backed reply, latest changes pushed, re-polled after last push with no new comments |
 | Stage 2 | Pipeline green OR all failures documented as pre-existing/unrelated with `--compare-main` evidence. No cancelled or pending runs. |
 | Final verification | All exit checklist items pass. Second-pass triggered if applicable. |
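
The Stage 2 rule (classify each failure, fix PR-caused ones, re-trigger flaky, infrastructure, and cancelled runs, document the rest) amounts to a small decision table. Here is a minimal sketch of that mapping; the `FailureClass` enum and the action labels `"fix"`, `"retrigger"`, and `"document"` are hypothetical names chosen for illustration, not identifiers from the skill itself.

```python
from enum import Enum


class FailureClass(Enum):
    """The five failure classes listed in the Stage 2 task description."""
    PR_CAUSED = "pr-caused"
    PRE_EXISTING = "pre-existing"
    FLAKY = "flaky"
    INFRASTRUCTURE = "infrastructure"
    CANCELLED = "cancelled"


def next_action(failure: FailureClass) -> str:
    """Map a classified CI failure to the follow-up the skill describes."""
    if failure is FailureClass.PR_CAUSED:
        return "fix"        # fix PR-caused failures
    if failure is FailureClass.PRE_EXISTING:
        return "document"   # document as pre-existing/unrelated, with evidence
    return "retrigger"      # flaky, infrastructure, cancelled: re-run and wait


for failure in FailureClass:
    print(f"{failure.value}: {next_action(failure)}")
```

Note that a cancelled run maps to a re-trigger rather than documentation, which matches the table's requirement that no cancelled or pending runs remain.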
plugins/eng/skills/spec/SKILL.md
Lines changed: 4 additions & 4 deletions
@@ -75,10 +75,10 @@ Before starting any work, create these tasks using `TaskCreate`. This makes the
 Create these tasks in order:

 1. **Spec: Intake — problem framing and stress-test** — Capture seed (what, why, who). Draft problem statement in SCR format. Run all 5 stress-test probes. Produce initial Open Questions list.
-2. **Spec: Scaffold — create artifacts and build world model** — Create SPEC.md, evidence/, meta/_changelog.md. Build product + internal surface-area maps. Dispatch /explore for blast radius. Investigate 3P dependencies. Produce scope hypothesis.
+2. **Spec: Scaffold — create artifacts, investigate system and dependencies** — Create SPEC.md, evidence/, meta/_changelog.md. Build product + internal surface-area maps. Dispatch /explore for blast radius. Investigate 3P dependencies. Produce scope hypothesis.
 3. **Spec: Backlog — extract and prioritize open questions** — Systematically extract OQs via walkthrough, tensions, and negative-space probes. Classify every item. Present priority triage to user for confirmation.
 4. **Spec: Iterate — investigate, decide, cascade** — Run the core loop: investigate P0 items, present decision batches, cascade through SPEC.md, completeness re-sweep every 2-3 iterations.
-5. **Spec: Freeze — scope freeze** — Run adversarial pre-freeze review. Assign resolution status to all decisions. Run resolution completeness gate. Classify Future Work by maturity tier.
+5. **Spec: Freeze — adversarial review, resolution status, completeness gate** — Run adversarial pre-freeze review. Assign resolution status to all decisions. Run resolution completeness gate. Classify Future Work by maturity tier.
 6. **Spec: Verify and finalize — technical accuracy and quality bar** — Refresh codebase. Extract load-bearing technical assertions. Dispatch parallel verification subagents. Present findings (Tier 1 design-affecting, Tier 2 factual corrections). Apply corrections. Run quality bar checklist.

 Use `addBlockedBy` to enforce ordering. As each step begins, mark its task `in_progress`. When the step completes, mark it `completed`.
@@ -88,10 +88,10 @@ Use `addBlockedBy` to enforce ordering. As each step begins, mark its task `in_p
 | Task | Done when |
 |---|---|
 | Intake | SCR problem statement drafted, all 5 stress-test probes run, initial Open Questions list exists |
-| Scaffold | SPEC.md exists on disk, evidence/ directory created, product + internal surface maps drafted, scope hypothesis presented to user |
+| Scaffold (artifacts + investigation) | SPEC.md exists on disk, evidence/ directory created, product + internal surface maps drafted, scope hypothesis presented to user |
 | Backlog | All items extracted (not filtered), classified with P0/P2 tags, user has confirmed priority assignments |
 | Iterate | All P0 open questions resolved, scope stabilized through iterative loop, no pending decision batches |
-| Freeze | All decisions have resolution status (LOCKED/DIRECTED/DELEGATED), all In Scope items pass completeness gate, Future Work classified by maturity tier |
+| Freeze (review + status + gate) | All decisions have resolution status (LOCKED/DIRECTED/DELEGATED), all In Scope items pass completeness gate, Future Work classified by maturity tier |
 | Verify and finalize | Codebase refreshed, all load-bearing assertions verified (CONFIRMED/CONTRADICTED/STALE/UNVERIFIABLE), Tier 1 issues resolved via iterative loop, Tier 2 corrections applied, quality bar checklist passes, SPEC.md finalized |

 **On re-entry:** Check `TaskList` first. If tasks exist, read SPEC.md and `meta/_changelog.md` to determine current state. Resume from the first non-completed task. If no tasks exist, create them and mark completed phases based on SPEC.md content (has SCR? → Intake done. Has surface maps? → Scaffold done. Etc.)
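
The re-entry rule above can be read as two small steps: rebuild task statuses from what SPEC.md already contains, then resume at the first task that is not completed. The following sketch covers only the two inferences the text spells out (SCR implies Intake done, surface maps imply Scaffold done); the rules behind "Etc." are not given and are deliberately left out. The function names, the boolean inputs, and the `"pending"` status are hypothetical, and how SCR or surface maps are detected inside SPEC.md is left to the caller.

```python
from typing import Dict, Optional

# Task names in the order given by the "Done when" table.
PHASES = [
    "Intake",
    "Scaffold",
    "Backlog",
    "Iterate",
    "Freeze",
    "Verify and finalize",
]


def statuses_from_spec(has_scr: bool, has_surface_maps: bool) -> Dict[str, str]:
    """Rebuild task statuses when no tasks exist yet (assumed inputs)."""
    statuses = {phase: "pending" for phase in PHASES}
    if has_scr:
        statuses["Intake"] = "completed"
    if has_surface_maps:
        statuses["Scaffold"] = "completed"
    return statuses


def resume_point(statuses: Dict[str, str]) -> Optional[str]:
    """Return the first non-completed task, or None when all are done."""
    for phase in PHASES:
        if statuses.get(phase) != "completed":
            return phase
    return None


print(resume_point(statuses_from_spec(has_scr=True, has_surface_maps=True)))
```

Because the scan follows task order, a SPEC.md with surface maps but no SCR statement still resumes at Intake, which keeps the `addBlockedBy` ordering intact.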