Commit 018d814

refactor: update documentation and skill metadata for improved clarity and workflow alignment
1 parent f8e648f commit 018d814

7 files changed: +44 -11 lines changed

docs/design.md

Lines changed: 4 additions & 2 deletions
@@ -101,7 +101,7 @@ Behavior guarantees:
 
 ### 6.1 pb-init
 
-Audits the repository and produces a **minimal** `AGENTS.md` containing only information that agents cannot discover from the codebase itself. Applies a strict three-part filter: each entry must be (1) not inferrable from code, (2) operationally decisive, and (3) not guessable from industry conventions. The ideal AGENTS.md is empty — every entry represents a codebase smell that should eventually be fixed at the root cause. Re-runs audit existing entries and flag any that are now discoverable.
+Audits the repository and updates a **managed snapshot block** inside `AGENTS.md`. The generated block captures current project context, key file locations, active specs, and an `Architecture Decision Snapshot` that later agents inherit. Re-runs replace only the managed block and preserve all user-authored content outside it.
 
 ### 6.2 pb-plan
 
@@ -122,6 +122,7 @@ Implements tasks sequentially with strict context hygiene and an outside-in doub
 3. Minimal context handoff between subagents.
 4. File-scoped rollback guidance for failed task attempts.
 5. Per-task verification criteria, scenario coverage mapping, and explicit completion status tracking in `tasks.md`.
+6. Managed `AGENTS.md` snapshot updates instead of whole-file rewrites.
 
 ## 8. Testing and Verification
 
@@ -131,6 +132,7 @@ Current automated coverage validates:
 2. Platform path/render behavior across all supported platforms.
 3. End-to-end structure generation for `--ai all`.
 4. Template loading and safety regressions (e.g., malformed wrappers, destructive command checks).
+5. Prompt/skill parity checks for workflow-critical instructions and architecture constraints.
 
 Primary verification commands:
 
@@ -142,5 +144,5 @@ uv run ruff check .
 ## 9. Known Constraints and Follow-ups
 
 1. Platform-specific runtime semantics can evolve; adapter paths/formats should be periodically re-validated against official tool docs.
-2. Prompt/skill content parity is maintained by template discipline, not code generation.
+2. Prompt/skill content parity is maintained by template discipline, and parity is guarded by regression tests for workflow-critical instructions.
 3. Additional platforms should be added only through new adapter classes and test expansion, not conditional sprawl in shared install logic.
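
Reviewer note: the new 6.1 text says re-runs replace only the managed block and preserve user-authored content outside it. The diff does not show how the block is delimited, so the following is only a minimal sketch of that behavior. The marker strings and the `update_managed_block` helper are hypothetical and are not taken from pb-spec.

```python
import re

# Hypothetical delimiters: the real markers used by pb-init are not shown in this diff.
BEGIN = "<!-- BEGIN PB-INIT MANAGED BLOCK -->"
END = "<!-- END PB-INIT MANAGED BLOCK -->"
MANAGED = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)


def update_managed_block(agents_md: str, snapshot: str) -> str:
    """Swap in a fresh snapshot, leaving text outside the markers untouched."""
    block = f"{BEGIN}\n{snapshot.strip()}\n{END}"
    if MANAGED.search(agents_md):
        # A function replacement avoids treating backslashes in the snapshot as escapes.
        return MANAGED.sub(lambda _match: block, agents_md, count=1)
    # No managed block yet: append one after the existing content.
    # Trailing newlines of the existing text are normalized to one blank line.
    prefix = agents_md.rstrip("\n")
    return f"{prefix}\n\n{block}\n" if prefix else f"{block}\n"
```

Running the helper twice on the same file should leave exactly one managed block, which is the property a whole-file rewrite would not guarantee.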

src/pb_spec/platforms/base.py

Lines changed: 3 additions & 3 deletions
@@ -6,8 +6,8 @@
 # Skill metadata: name -> description
 SKILL_METADATA: dict[str, str] = {
     "pb-init": (
-        "Use to audit the repo and produce a minimal AGENTS.md containing only "
-        "undiscoverable gotchas, hard constraints, and non-obvious conventions."
+        "Use to audit the repo and update a managed AGENTS.md snapshot with "
+        "project context, architecture decisions, and non-obvious conventions."
     ),
     "pb-plan": (
         "Use when converting a requirement into a design proposal and executable tasks before coding."
@@ -17,7 +17,7 @@
         "and tasks.md."
     ),
     "pb-build": (
-        "Use when tasks.md is ready and you need sequential TDD implementation with recovery loops."
+        "Use when tasks.md is ready and you need sequential BDD+TDD implementation with recovery loops."
     ),
 }
 
src/pb_spec/templates/prompts/pb-build.prompt.md

Lines changed: 3 additions & 3 deletions
@@ -40,7 +40,7 @@ Never guess `<spec-dir>` from memory. Always resolve from actual directory names
 
 ## Step 2: Parse Unfinished Tasks
 
-Scan for all unchecked items (`- [ ]`). Build an ordered list preserving Phase → Task number order.
+Determine unfinished tasks from each `### Task X.Y:` block in `tasks.md`, then inspect the status and checkbox lines inside that block. Do not treat every `- [ ]` step as a separate task. Build an ordered list of task blocks preserving Phase → Task number order.
 
 **Use Task IDs for state tracking.** Each task has a unique ID in the format `Task X.Y` (e.g., `Task 1.1`, `Task 2.3`). When locating tasks, match on the `### Task X.Y:` heading pattern, not just bare checkboxes.
 
@@ -49,7 +49,7 @@ Scan for all unchecked items (`- [ ]`). Build an ordered list preserving Phase
 - If `tasks.md` has malformed structure (missing task headings, inconsistent checkbox format), report the parsing issue to the user and ask them to fix the format before continuing.
 - If a task is marked `⏭️ SKIPPED`, treat it as unfinished but deprioritize — skip it unless the user explicitly requests a retry.
 
-For execution reliability, represent the queue as explicit task units: `Task ID`, `Task Name`, `Status`, `Scenario Coverage`, `Loop Type`, `BDD Verification`, `Verification`.
+For execution reliability, represent the queue as explicit task-block units: `Task ID`, `Task Name`, `Status`, `Scenario Coverage`, `Loop Type`, `BDD Verification`, and `Verification`.
 
 If all tasks are checked (`- [x]`), report:
 
@@ -282,7 +282,7 @@ You are implementing **Task {{TASK_NUMBER}}: {{TASK_NAME}}**.
 
 ### Your Job
 
-Execute in strict order:
+Execute in strict order. Report concise decisions and evidence for each step:
 
 Before coding, define a compact task contract from the provided task block:
 
src/pb_spec/templates/skills/pb-build/SKILL.md

Lines changed: 2 additions & 2 deletions
@@ -44,7 +44,7 @@ Never guess `<spec-dir>` from memory. Always resolve from actual directory names
 
 ### Step 2: Parse Unfinished Tasks
 
-Scan `tasks.md` for all unchecked task items (`- [ ]`). Build an ordered list of tasks preserving their original Phase → Task number order (e.g., Task 1.1, Task 1.2, Task 2.1, …).
+Determine unfinished tasks from each `### Task X.Y:` block in `tasks.md`, then inspect the status and checkbox lines inside that block. Do not treat every `- [ ]` step as a separate task. Build an ordered list of task blocks preserving their original Phase → Task number order (e.g., Task 1.1, Task 1.2, Task 2.1, …).
 
 **Use Task IDs for state tracking.** Each task has a unique ID in the format `Task X.Y` (e.g., `Task 1.1`, `Task 2.3`). When locating tasks, match on the `### Task X.Y:` heading pattern, not just bare checkboxes.
 
@@ -53,7 +53,7 @@ Scan `tasks.md` for all unchecked task items (`- [ ]`). Build an ordered list of
 - If `tasks.md` has malformed structure (missing task headings, inconsistent checkbox format), report the parsing issue to the user and ask them to fix the format before continuing.
 - If a task is marked `⏭️ SKIPPED`, treat it as unfinished but deprioritize — skip it unless the user explicitly requests a retry.
 
-For execution reliability, represent the queue as explicit task units: `Task ID`, `Task Name`, `Status`, `Scenario Coverage`, `Loop Type`, `BDD Verification`, `Verification`.
+For execution reliability, represent the queue as explicit task-block units: `Task ID`, `Task Name`, `Status`, `Scenario Coverage`, `Loop Type`, `BDD Verification`, and `Verification`.
 
 If all tasks are already checked (`- [x]`), report:
 
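
Reviewer note: the revised Step 2 makes the `### Task X.Y:` block the unit of work rather than each `- [ ]` line. A rough sketch of that parsing rule is below. It assumes a task counts as finished only when its block contains checkboxes and all of them are checked. Status lines such as `⏭️ SKIPPED` are ignored here because their exact format is not shown in this diff, and the `TaskBlock`, `parse_task_blocks`, and `unfinished_tasks` names are hypothetical.

```python
import re
from dataclasses import dataclass, field

TASK_HEADING = re.compile(r"^### Task (\d+)\.(\d+):\s*(.*)$")
OTHER_HEADING = re.compile(r"^#{1,3}\s")  # a heading of level 1-3 closes the current block
CHECKBOX = re.compile(r"^\s*- \[( |x|X)\]")


@dataclass
class TaskBlock:
    phase: int
    number: int
    name: str
    lines: list[str] = field(default_factory=list)

    @property
    def task_id(self) -> str:
        return f"Task {self.phase}.{self.number}"

    @property
    def finished(self) -> bool:
        # Assumption: finished means at least one checkbox and none left unchecked.
        marks = [m.group(1) for line in self.lines if (m := CHECKBOX.match(line))]
        return bool(marks) and all(mark != " " for mark in marks)


def parse_task_blocks(text: str) -> list[TaskBlock]:
    """Group lines under their `### Task X.Y:` heading, ordered by phase then task."""
    blocks: list[TaskBlock] = []
    current = None
    for line in text.splitlines():
        heading = TASK_HEADING.match(line)
        if heading:
            current = TaskBlock(int(heading.group(1)), int(heading.group(2)), heading.group(3).strip())
            blocks.append(current)
        elif OTHER_HEADING.match(line):
            current = None  # e.g. a phase heading between tasks
        elif current is not None:
            current.lines.append(line)
    return sorted(blocks, key=lambda block: (block.phase, block.number))


def unfinished_tasks(text: str) -> list[TaskBlock]:
    return [block for block in parse_task_blocks(text) if not block.finished]
```

With this grouping, a task holding three unchecked steps yields one queue entry instead of three, which is the behavior the new regression test in `tests/test_templates.py` guards at the wording level.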

src/pb_spec/templates/skills/pb-build/references/implementer_prompt.md

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ You are implementing **Task {{TASK_NUMBER}}: {{TASK_NAME}}**.
 
 ## Your Job
 
-Execute the following steps in strict order. **You must output your reasoning for each step.** Do not skip or reorder any step.
+Execute the following steps in strict order. Report concise decisions and evidence for each step. Do not skip or reorder any step.
 
 Before coding, define a compact task contract from the provided task block:
 
tests/test_platforms.py

Lines changed: 6 additions & 0 deletions
@@ -5,6 +5,7 @@
 import pytest
 
 from pb_spec.platforms import get_platform, resolve_targets
+from pb_spec.platforms.base import SKILL_METADATA
 from pb_spec.platforms.claude import ClaudePlatform
 from pb_spec.platforms.codex import CodexPlatform
 from pb_spec.platforms.copilot import CopilotPlatform
@@ -19,6 +20,11 @@ def test_skill_names_returns_four_skills():
     assert platform.skill_names == ["pb-init", "pb-plan", "pb-refine", "pb-build"]
 
 
+def test_skill_metadata_descriptions_match_current_workflow():
+    assert "managed AGENTS.md snapshot" in SKILL_METADATA["pb-init"]
+    assert "BDD+TDD" in SKILL_METADATA["pb-build"]
+
+
 # --- get_skill_path ---
 
 
tests/test_templates.py

Lines changed: 25 additions & 0 deletions
@@ -364,6 +364,21 @@ def test_pb_build_templates_escalate_after_three_failures():
         assert "retry budget" in content
 
 
+def test_pb_build_templates_parse_task_blocks_instead_of_raw_checkboxes():
+    """pb-build should treat Task X.Y blocks as the execution unit, not each checkbox line."""
+    for content in (load_skill_content("pb-build"), load_prompt("pb-build")):
+        assert "Determine unfinished tasks from each `### Task X.Y:` block" in content
+        assert "Do not treat every `- [ ]` step as a separate task." in content
+
+
+def test_pb_build_implementer_templates_require_concise_evidence_not_reasoning_dump():
+    """Implementer templates should ask for concise evidence, not full reasoning traces."""
+    build_refs = load_references("pb-build")
+    for content in (build_refs["implementer_prompt.md"], load_prompt("pb-build")):
+        assert "output your reasoning for each step" not in content
+        assert "Report concise decisions and evidence for each step" in content
+
+
 def test_pb_build_implementer_templates_require_runtime_evidence():
     """Implementer guidance should require runtime log/probe evidence when applicable."""
     build_refs = load_references("pb-build")
@@ -378,3 +393,13 @@ def test_pb_refine_templates_accept_build_block_packets():
     for content in (load_skill_content("pb-refine"), load_prompt("pb-refine")):
         assert "Build-block packets" in content
         assert "🛑 Build Blocked" in content
+
+
+def test_project_design_doc_matches_current_snapshot_workflow():
+    """docs/design.md should describe the same managed-snapshot workflow implemented by templates."""
+    design = Path("docs/design.md").read_text(encoding="utf-8")
+
+    assert "managed snapshot block" in design
+    assert "Architecture Decision Snapshot" in design
+    assert "BDD outer loop" in design
+    assert "parity is guarded by regression tests" in design
