Commit b7c3f3d

nick-inkeep and claude committed
refactor(skills): sharpen 7 task names for semantic precision
Rename task subjects to more precisely describe the key actions:

- spec #2: "Scaffold — create artifacts and build world model" → "Scaffold — create artifacts, investigate system and dependencies"
- spec #5: "Freeze — scope freeze" → "Freeze — adversarial review, resolution status, completeness gate"
- review #1: "Resolve PR and assess starting state" → "Assess starting state — detect PR, fetch feedback, check local changes"
- qa #2: "Derive test plan" → "Derive test plan and apply formalization gate"
- debug #1: "Phase 1 — triage" → "Phase 1 — classify bug and load playbook"
- debug #3: "Phase 3 — investigate" → "Phase 3 — hypothesis-driven root cause investigation"
- explore #2: "Investigate" → "Execute active lenses — map, trace, or inspect"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 317dc82 commit b7c3f3d

5 files changed, +13 -13 lines changed

plugins/eng/skills/debug/SKILL.md

Lines changed: 4 additions & 4 deletions
@@ -131,9 +131,9 @@ Before starting any work, create these tasks using `TaskCreate`. This makes the

 Create these tasks in order:

-1. **Debug: Phase 1 — triage** — Parse complete error output (every word). Classify bug into one of 9 categories. Load relevant triage playbook. Identify files to investigate from stack trace.
+1. **Debug: Phase 1 — classify bug and load playbook** — Parse complete error output (every word). Classify bug into one of 9 categories. Load relevant triage playbook. Identify files to investigate from stack trace.
 2. **Debug: Phase 2 — reproduce and comprehend** — Reproduce failure reliably. Map relevant code area (30-50 lines context, follow imports, read 2-3 siblings). Check system state. Check git history. Build mental model: expected vs actual behavior. State premises with file:line citations.
-3. **Debug: Phase 3 — investigate** — Present all plausible hypotheses in one batch ranked by confidence. Test each via hypothesis-test-refine cycle (predict before running). Record verdict per hypothesis. Switch strategy after 3 rejections. Escalate after 5 hypotheses.
+3. **Debug: Phase 3 — hypothesis-driven root cause investigation** — Present all plausible hypotheses in one batch ranked by confidence. Test each via hypothesis-test-refine cycle (predict before running). Record verdict per hypothesis. Switch strategy after 3 rejections. Escalate after 5 hypotheses.
 4. **Debug: Phase 4 — classify root cause** — Classify: dev environment/config issue vs code bug vs both. This determines the resolution path.
 5. **Debug: Phase 5 — report and recommend** — Deliver structured findings: root cause summary (file:function:logic + evidence chain), recommended fix strategy, similar patterns, hardening recommendations. Clean up diagnostic artifacts. NO FIX CODE.

@@ -143,9 +143,9 @@ Use `addBlockedBy` to enforce ordering. As each phase begins, mark its task `in_

 | Task | Done when |
 |---|---|
-| Triage | Bug category identified, playbook loaded, relevant files identified from error signal |
+| Classify + load playbook | Bug category identified, playbook loaded, relevant files identified from error signal |
 | Reproduce | Failure reproduced on demand (or documented why it can't be), expected vs actual behavior gap articulated, premises stated with file:line |
-| Investigate | Specific root cause identified with evidence from at least one diagnostic action |
+| Hypothesis-driven investigation | Specific root cause identified with evidence from at least one diagnostic action |
 | Classify | Root cause classified as env/config, code bug, or both |
 | Report | Structured findings delivered with file:line specificity, diagnostic artifacts documented, no fix code written |
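
The Phase 3 rules in the debug skill (rank hypotheses by confidence, record a verdict for each, switch strategy after 3 rejections, escalate after 5 hypotheses) can be read as a small control loop. The following is a minimal Python sketch of one possible reading, not the skill's actual implementation. The `Hypothesis` and `InvestigationResult` types, the `investigate` function, and the choice to reset the rejection counter after each strategy switch are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

SWITCH_AFTER_REJECTIONS = 3    # from the task text: "Switch strategy after 3 rejections"
ESCALATE_AFTER_HYPOTHESES = 5  # from the task text: "Escalate after 5 hypotheses"


@dataclass
class Hypothesis:
    """Hypothetical record for one candidate root cause."""
    description: str
    confidence: float
    verdict: Optional[str] = None  # "confirmed" or "rejected" once tested


@dataclass
class InvestigationResult:
    """Hypothetical summary of one investigation pass."""
    outcome: str  # "root_cause_found", "escalate", or "exhausted"
    confirmed: Optional[Hypothesis]
    strategy_switches: int
    tested: List[Hypothesis] = field(default_factory=list)


def investigate(
    hypotheses: List[Hypothesis],
    test: Callable[[Hypothesis], bool],
) -> InvestigationResult:
    """Test hypotheses in descending confidence order and record each verdict."""
    ranked = sorted(hypotheses, key=lambda h: h.confidence, reverse=True)
    tested: List[Hypothesis] = []
    rejections_since_switch = 0
    switches = 0

    for hypothesis in ranked:
        # Stop and escalate once the hypothesis budget is spent.
        if len(tested) >= ESCALATE_AFTER_HYPOTHESES:
            return InvestigationResult("escalate", None, switches, tested)

        confirmed = test(hypothesis)
        hypothesis.verdict = "confirmed" if confirmed else "rejected"
        tested.append(hypothesis)

        if confirmed:
            return InvestigationResult("root_cause_found", hypothesis, switches, tested)

        rejections_since_switch += 1
        if rejections_since_switch == SWITCH_AFTER_REJECTIONS:
            # Assumption: a strategy switch resets the rejection counter.
            switches += 1
            rejections_since_switch = 0

    outcome = "escalate" if len(tested) >= ESCALATE_AFTER_HYPOTHESES else "exhausted"
    return InvestigationResult(outcome, None, switches, tested)
```

In this sketch the `test` callable stands in for the hypothesis-test-refine cycle; a real run would also record the prediction made before each diagnostic action.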

plugins/eng/skills/explore/SKILL.md

Lines changed: 1 addition & 1 deletion
@@ -67,7 +67,7 @@ When the purpose is not stated, infer from context using this table.
 Before starting, create tasks to track progress through the phases:

 1. **Explore: Scope and load knowledge** — determine target, select lenses, check existing repo knowledge (skills, architecture docs, surface catalogs)
-2. **Explore: Investigate** — execute active phases (map surfaces, search and trace, inspect patterns — based on selected lenses)
+2. **Explore: Execute active lenses — map, trace, or inspect** — execute active phases (map surfaces, search and trace, inspect patterns — based on selected lenses)
 3. **Explore: Synthesize** — produce brief in appropriate format (pattern, trace, world model, or combined) with confidence provenance and gap discipline

 Mark each task `in_progress` when starting and `completed` when finished.

plugins/eng/skills/qa/SKILL.md

Lines changed: 2 additions & 2 deletions
@@ -39,7 +39,7 @@ Before starting any work, create these tasks using `TaskCreate`. This makes the
 Create these tasks in order:

 1. **QA: Detect tools and gather context** — Probe for available testing tools (browser/Playwright, desktop/Peekaboo, shell). Document tool gaps. Gather feature context from SPEC.md, PR, or feature description. Build mental model of what was built and what surfaces were touched.
-2. **QA: Derive test plan** — Identify concrete scenarios requiring manual verification. Apply formalization gate to each: if automatable with easy-medium effort, write the formal test instead. Categorize remaining scenarios. Create qa-progress.json with all scenarios in "planned" status (when tmp/ship/ exists) or persist checklist to PR body (standalone mode).
+2. **QA: Derive test plan and apply formalization gate** — Identify concrete scenarios requiring manual verification. Apply formalization gate to each: if automatable with easy-medium effort, write the formal test instead. Categorize remaining scenarios. Create qa-progress.json with all scenarios in "planned" status (when tmp/ship/ exists) or persist checklist to PR body (standalone mode).
 3. **QA: Execute test scenarios** — Work through each scenario using strongest available tool (browser > API > shell > inference). Test happy path first, then break it, then stress it. Record video evidence for browser scenarios. Fix bugs discovered during testing.
 4. **QA: Record results** — Update qa-progress.json for every scenario: set status (validated/failed/blocked/skipped), verifiedVia fidelity level, notes, and evidence URLs.
 5. **QA: Report** — Communicate results to invoker. Total scenarios tested vs passed vs failed vs skipped. Bugs found and fixed. Gaps that could NOT be tested. Judgment call on readiness.
@@ -51,7 +51,7 @@ Use `addBlockedBy` to enforce ordering. As each step begins, mark its task `in_p
 | Task | Done when |
 |---|---|
 | Detect tools + context | Tool inventory documented, feature context understood, mental model built |
-| Derive test plan | All scenarios identified, formalization gate applied, qa-progress.json created with all scenarios in `planned` status |
+| Derive test plan + formalization gate | All scenarios identified, formalization gate applied (automatable scenarios converted to formal tests), qa-progress.json created with all scenarios in `planned` status |
 | Execute | All planned scenarios executed (or marked blocked/skipped with reason), bugs found are fixed or documented |
 | Record results | Every scenario in qa-progress.json has non-`planned` status, verifiedVia populated, notes populated for non-clean-pass scenarios |
 | Report | Results communicated, gaps documented, readiness judgment stated |
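
The formalization gate and the "Record results" completion criterion in the QA skill can be illustrated with a short Python sketch. This is an assumption-laden illustration: the diff names `qa-progress.json`, the `planned`/`validated`/`failed`/`blocked`/`skipped` statuses, and the `verifiedVia` and notes fields, but it does not show the file's actual schema. The input fields `automatable` and `effort`, the top-level `scenarios` key, and all function names below are hypothetical.

```python
import json
from typing import Dict, List, Tuple

# The task text says: formalize when automatable with easy-medium effort.
FORMALIZABLE_EFFORT = {"easy", "medium"}
# Statuses listed in the "Record results" task.
FINAL_STATUSES = {"validated", "failed", "blocked", "skipped"}


def apply_formalization_gate(scenarios: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Split scenarios into (write a formal test, keep for manual QA)."""
    formal, manual = [], []
    for scenario in scenarios:
        if scenario["automatable"] and scenario["effort"] in FORMALIZABLE_EFFORT:
            formal.append(scenario)
        else:
            manual.append(scenario)
    return formal, manual


def build_progress(manual: List[Dict]) -> Dict:
    """Build a hypothetical qa-progress.json payload with every scenario planned."""
    return {
        "scenarios": [
            {"name": s["name"], "status": "planned", "verifiedVia": None, "notes": ""}
            for s in manual
        ]
    }


def all_recorded(progress: Dict) -> bool:
    """True when no scenario is left in the planned status."""
    return all(s["status"] in FINAL_STATUSES for s in progress["scenarios"])


if __name__ == "__main__":
    candidates = [
        {"name": "login form validation", "automatable": True, "effort": "easy"},
        {"name": "drag-and-drop feel", "automatable": False, "effort": "hard"},
    ]
    formal_tests, manual_scenarios = apply_formalization_gate(candidates)
    print(json.dumps(build_progress(manual_scenarios), indent=2))
```

Keeping the gate as a pure function makes the "Done when" row easy to check: the plan is complete once every candidate has been routed to one of the two lists.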

plugins/eng/skills/review/SKILL.md

Lines changed: 2 additions & 2 deletions
@@ -29,7 +29,7 @@ Before starting any work, create these tasks using `TaskCreate`. This makes the

 Create these tasks in order:

-1. **Review: Resolve PR and assess starting state** — Verify gh auth. Detect PR number. Check for unpushed local changes. Fetch existing review feedback. Update PR body if implementation changed.
+1. **Review: Assess starting state — detect PR, fetch feedback, check local changes** — Verify gh auth. Detect PR number. Check for unpushed local changes. Fetch existing review feedback. Update PR body if implementation changed.
 2. **Review: Stage 1 — review feedback loop** — Poll for reviewer feedback (6-min intervals). For each thread: investigate proportionally, evaluate across all dimensions, decide with evidence, resolve thread. Implement accepted changes, run tests, push. Re-poll after every push.
 3. **Review: Stage 2 — CI/CD resolution** — Monitor pipeline. Classify each failure with evidence (PR-caused, pre-existing, flaky, infrastructure, cancelled). Fix PR-caused failures. Re-trigger flaky/infra/cancelled and wait for result.
 4. **Review: Final verification** — Verify: all threads resolved, CI/CD green or documented, no pending runs. If security/auth/multi-subsystem changes, trigger second-pass review.
@@ -40,7 +40,7 @@ Use `addBlockedBy` to enforce ordering. As each stage begins, mark its task `in_

 | Task | Done when |
 |---|---|
-| Resolve PR | Starting state determined, PR body current, existing feedback fetched |
+| Assess starting state | Starting state determined, PR body current, existing feedback fetched |
 | Stage 1 | Every thread resolved with evidence-backed reply, latest changes pushed, re-polled after last push with no new comments |
 | Stage 2 | Pipeline green OR all failures documented as pre-existing/unrelated with `--compare-main` evidence. No cancelled or pending runs. |
 | Final verification | All exit checklist items pass. Second-pass triggered if applicable. |
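
Stage 2 of the review skill classifies each CI failure into one of five categories and attaches an action to each. A minimal Python sketch of that mapping follows. The enum, the function, and the action labels are hypothetical names chosen for illustration, and treating a pre-existing failure as "document" is inferred from the Stage 2 "Done when" row rather than stated in the task text.

```python
from enum import Enum


class FailureKind(Enum):
    """The five failure categories named in the Stage 2 task."""
    PR_CAUSED = "pr-caused"
    PRE_EXISTING = "pre-existing"
    FLAKY = "flaky"
    INFRASTRUCTURE = "infrastructure"
    CANCELLED = "cancelled"


# Categories the task text says to re-trigger and then wait on.
RETRIGGER_KINDS = {
    FailureKind.FLAKY,
    FailureKind.INFRASTRUCTURE,
    FailureKind.CANCELLED,
}


def next_action(kind: FailureKind) -> str:
    """Map a classified failure to a hypothetical action label."""
    if kind is FailureKind.PR_CAUSED:
        return "fix"
    if kind in RETRIGGER_KINDS:
        return "retrigger-and-wait"
    # Assumption: pre-existing failures are documented with evidence, not fixed here.
    return "document"
```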

plugins/eng/skills/spec/SKILL.md

Lines changed: 4 additions & 4 deletions
@@ -75,10 +75,10 @@ Before starting any work, create these tasks using `TaskCreate`. This makes the
 Create these tasks in order:

 1. **Spec: Intake — problem framing and stress-test** — Capture seed (what, why, who). Draft problem statement in SCR format. Run all 5 stress-test probes. Produce initial Open Questions list.
-2. **Spec: Scaffold — create artifacts and build world model** — Create SPEC.md, evidence/, meta/_changelog.md. Build product + internal surface-area maps. Dispatch /explore for blast radius. Investigate 3P dependencies. Produce scope hypothesis.
+2. **Spec: Scaffold — create artifacts, investigate system and dependencies** — Create SPEC.md, evidence/, meta/_changelog.md. Build product + internal surface-area maps. Dispatch /explore for blast radius. Investigate 3P dependencies. Produce scope hypothesis.
 3. **Spec: Backlog — extract and prioritize open questions** — Systematically extract OQs via walkthrough, tensions, and negative-space probes. Classify every item. Present priority triage to user for confirmation.
 4. **Spec: Iterate — investigate, decide, cascade** — Run the core loop: investigate P0 items, present decision batches, cascade through SPEC.md, completeness re-sweep every 2-3 iterations.
-5. **Spec: Freeze — scope freeze** — Run adversarial pre-freeze review. Assign resolution status to all decisions. Run resolution completeness gate. Classify Future Work by maturity tier.
+5. **Spec: Freeze — adversarial review, resolution status, completeness gate** — Run adversarial pre-freeze review. Assign resolution status to all decisions. Run resolution completeness gate. Classify Future Work by maturity tier.
 6. **Spec: Verify and finalize — technical accuracy and quality bar** — Refresh codebase. Extract load-bearing technical assertions. Dispatch parallel verification subagents. Present findings (Tier 1 design-affecting, Tier 2 factual corrections). Apply corrections. Run quality bar checklist.

 Use `addBlockedBy` to enforce ordering. As each step begins, mark its task `in_progress`. When the step completes, mark it `completed`.
@@ -88,10 +88,10 @@ Use `addBlockedBy` to enforce ordering. As each step begins, mark its task `in_p
 | Task | Done when |
 |---|---|
 | Intake | SCR problem statement drafted, all 5 stress-test probes run, initial Open Questions list exists |
-| Scaffold | SPEC.md exists on disk, evidence/ directory created, product + internal surface maps drafted, scope hypothesis presented to user |
+| Scaffold (artifacts + investigation) | SPEC.md exists on disk, evidence/ directory created, product + internal surface maps drafted, scope hypothesis presented to user |
 | Backlog | All items extracted (not filtered), classified with P0/P2 tags, user has confirmed priority assignments |
 | Iterate | All P0 open questions resolved, scope stabilized through iterative loop, no pending decision batches |
-| Freeze | All decisions have resolution status (LOCKED/DIRECTED/DELEGATED), all In Scope items pass completeness gate, Future Work classified by maturity tier |
+| Freeze (review + status + gate) | All decisions have resolution status (LOCKED/DIRECTED/DELEGATED), all In Scope items pass completeness gate, Future Work classified by maturity tier |
 | Verify and finalize | Codebase refreshed, all load-bearing assertions verified (CONFIRMED/CONTRADICTED/STALE/UNVERIFIABLE), Tier 1 issues resolved via iterative loop, Tier 2 corrections applied, quality bar checklist passes, SPEC.md finalized |

 **On re-entry:** Check `TaskList` first. If tasks exist, read SPEC.md and `meta/_changelog.md` to determine current state. Resume from the first non-completed task. If no tasks exist, create them and mark completed phases based on SPEC.md content (has SCR? → Intake done. Has surface maps? → Scaffold done. Etc.)
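
The ordering and re-entry rules in the spec skill (each task blocked by its predecessor, resume from the first non-completed task) can be modeled in a few lines of Python. This sketch does not call the real `TaskCreate`, `addBlockedBy`, or `TaskList` tools, whose signatures are not shown in this diff. The `Task` dataclass, the `pending` initial status, and the helper names are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Short task names taken from the "Done when" table before this commit's renames.
PHASES = ["Intake", "Scaffold", "Backlog", "Iterate", "Freeze", "Verify and finalize"]


@dataclass
class Task:
    """Hypothetical in-memory stand-in for a tracked task."""
    name: str
    status: str = "pending"  # assumed initial status; then in_progress, completed
    blocked_by: List[str] = field(default_factory=list)


def create_chain(names: List[str]) -> List[Task]:
    """Create tasks in order, each blocked by the one before it."""
    tasks: List[Task] = []
    for index, name in enumerate(names):
        blockers = [names[index - 1]] if index > 0 else []
        tasks.append(Task(name=name, blocked_by=blockers))
    return tasks


def resume_point(tasks: List[Task]) -> Optional[Task]:
    """Return the first non-completed task, or None when all are done."""
    for task in tasks:
        if task.status != "completed":
            return task
    return None


if __name__ == "__main__":
    chain = create_chain(PHASES)
    chain[0].status = "completed"  # e.g. SPEC.md already has an SCR statement
    chain[1].status = "completed"  # e.g. surface maps already exist
    print(resume_point(chain).name)
```

A linear blocked-by chain is the simplest structure that satisfies "enforce ordering"; a real task list could allow several blockers per task.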
