AGENTS.md (11 additions & 1 deletion)
@@ -153,6 +153,7 @@ When introducing new dependencies, prefer these unless compatibility requires a
 - Type check with `ty` — all code must pass `ty check` with zero errors.
 - Use Gherkin + `behave` for outer-loop acceptance tests.
 - Use `pytest` for inner-loop TDD — tests must pass before claiming completion.
+- Use Hypothesis under `tests/` when you need generated coverage for invariants; these tests must run through the normal `uv run pytest` path.
 - Use `pytest-benchmark` for performance-sensitive code.
 - Use `pytest-cov` to track test coverage.
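The Hypothesis bullet in this hunk can be illustrated with a minimal sketch. The file path `tests/test_normalize.py` and the function `normalize_whitespace` are hypothetical and do not come from the repository. The sketch assumes only that a property test is an ordinary function that `pytest` collects, so the normal `uv run pytest` path would run it next to example-based tests.

```python
# tests/test_normalize.py (hypothetical path and function, for illustration only)
from hypothesis import given, strategies as st


def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim both ends."""
    return " ".join(text.split())


def test_normalize_named_example() -> None:
    # Example-based test: one named case with a known answer.
    assert normalize_whitespace("  a \t b\n") == "a b"


@given(st.text())
def test_normalize_is_idempotent(text: str) -> None:
    # Property: normalizing twice gives the same result as normalizing once.
    once = normalize_whitespace(text)
    assert normalize_whitespace(once) == once
```

Both functions follow the `test_` naming convention, so no separate property-test command is needed under this assumption.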
@@ -178,8 +179,12 @@ When fixing failures, identify root cause first, then apply idiomatic fixes inst

 Use outside-in development for behavior changes:

+-**Git Restrictions:** NEVER use `git worktree`. All code modifications MUST be made directly on the current branch in the existing working directory.
 - start with a failing Gherkin scenario under `features/`,
 - drive implementation with failing `pytest` tests,
+- keep example-based `pytest` tests as the default inner loop for named cases and edge cases,
+- add Hypothesis properties under `tests/` when the rule is an invariant instead of a single named example,
+- treat Atheris as conditional planning work rather than baseline template scaffolding,
 - keep step definitions thin and reuse Python domain modules.

 After each feature or bug fix, run:
@@ -198,9 +203,14 @@ If any command fails, report the failure and do not claim completion.

 - BDD scenarios: place Gherkin features under `features/` and step definitions under `features/steps/`.
 - Use BDD to define acceptance behavior first, then use `pytest` for the inner TDD loop.
+- Keep example-based and Hypothesis-based tests together under `tests/`; `just test` must exercise both without a separate property-test command.
 - Unit tests: place in `tests/` mirroring the source structure.
 - Integration tests: place in `tests/integration/`.
-- Performance tests: use `pytest-benchmark` with `@pytest.mark.benchmark`.
+- Fuzz tests: only plan or add Atheris when a module parses hostile text or binary input, decodes file or protocol formats, or wraps native extensions where crash resistance matters.
+- Fuzz workflow: when fuzzing is justified, keep the harness targeted to the affected module instead of adding a default workspace-wide fuzz command to every starter.
+- Performance tests: use `pytest-benchmark` with `@pytest.mark.benchmark` only for code with an explicit latency SLA, throughput target, or hot-path requirement.
+- Treat benchmark work as optional planning scope; keep the standard `pytest` loop as the default when no performance requirement exists.
+- For `/pb-plan` work, mark fuzzing as `conditional` or `N/A` unless the scope explicitly includes parser-like, protocol, binary-decoding, hostile-input, or native-extension-heavy code.
 - Add tests for behavioral changes and public API changes.
 - Use `pytest` fixtures for setup/teardown; avoid `setUp`/`tearDown` methods.
 - Use `pytest.raises` for exception testing; `pytest.approx` for floating-point comparisons.
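The last two context bullets of this hunk (fixtures instead of `setUp`/`tearDown`, `pytest.raises`, `pytest.approx`) might look like the following in practice. This is a sketch under stated assumptions: the `Cart` class and the file path are hypothetical stand-ins, not code from the repository.

```python
# tests/test_cart.py (hypothetical path and domain object, for illustration only)
import pytest


class Cart:
    """Tiny stand-in domain object; not part of the repository."""

    def __init__(self) -> None:
        self.prices: list[float] = []

    def add(self, price: float) -> None:
        if price < 0:
            raise ValueError("price must be non-negative")
        self.prices.append(price)

    def total(self) -> float:
        return sum(self.prices)


@pytest.fixture
def cart() -> Cart:
    # A fixture takes the place of a unittest-style setUp method.
    return Cart()


def test_total_uses_approx(cart: Cart) -> None:
    cart.add(0.1)
    cart.add(0.2)
    # 0.1 + 0.2 is not exactly 0.3 in binary floating point, so compare approximately.
    assert cart.total() == pytest.approx(0.3)


def test_negative_price_raises(cart: Cart) -> None:
    with pytest.raises(ValueError, match="non-negative"):
        cart.add(-1.0)
```

Each test receives a fresh `Cart` from the fixture, so no teardown step is needed in this sketch.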
|**Tooling**| Validation, parsing, and CI integration |`pb-spec validate`, internal parsers, future ecosystem tools |
+
+The **workflow layer** defines the four commands and their execution semantics. The **contract layer** carries the explicit field requirements, state transitions, and scenario coverage rules described in [docs/contract.md](docs/contract.md). The **tooling layer** makes the contract executable through validation commands and reusable parsing libraries.
+
 ## Installation

```bash
@@ -111,6 +123,12 @@ pb-spec update

 Update pb-spec to the latest version (requires `uv`).

+```text
+pb-spec validate <target>
+```
+
+Validate either a generated spec directory or a markdown feedback packet file. For spec directories, the current validator checks `design.md` required sections, `tasks.md` task blocks for required fields, allowed status values, and `DONE` completion evidence, inventories `features/*.feature` scenario names, and verifies `Scenario Coverage` references for `BDD+TDD` tasks. It does not currently fail build eligibility on an otherwise valid spec just because an extra scenario is not yet referenced by a task. For feedback files, it validates `🛑 Build Blocked` and `🔄 Design Change Request` packet completeness and requires concrete failure evidence rather than generic summaries.
+
 ## Workflow

 four agent skills that chain together:
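The `tasks.md` checks named in the `pb-spec validate` paragraph of this hunk (required fields, allowed status values, `DONE` completion evidence) could be sketched roughly as follows. This is not pb-spec's actual validator. The `### Task X.Y` heading shape, the `- **Field:** value` line shape, the `Status` and `Evidence` field names, and every status value other than `DONE` are assumptions made for illustration.

```python
import re

# Assumed values: the source names `DONE` and `Scenario Coverage` only.
ALLOWED_STATUSES = {"TODO", "IN_PROGRESS", "DONE"}
REQUIRED_FIELDS = ("Status", "Scenario Coverage", "Evidence")

# Assumed markdown shapes for task headings and field lines.
TASK_HEADING = re.compile(r"^### Task (\d+\.\d+)", re.MULTILINE)
FIELD_LINE = re.compile(
    r"^- \*\*(?P<name>[^*]+):\*\*[ \t]*(?P<value>.*)$", re.MULTILINE
)


def validate_tasks(markdown: str) -> list[str]:
    """Return human-readable errors for each task block in a tasks.md string."""
    errors: list[str] = []
    headings = list(TASK_HEADING.finditer(markdown))
    for i, heading in enumerate(headings):
        # A task block runs from its heading to the next task heading.
        end = headings[i + 1].start() if i + 1 < len(headings) else len(markdown)
        block = markdown[heading.end():end]
        fields = {m["name"]: m["value"].strip() for m in FIELD_LINE.finditer(block)}
        task = f"Task {heading.group(1)}"
        for name in REQUIRED_FIELDS:
            if name not in fields:
                errors.append(f"{task}: missing field {name!r}")
        status = fields.get("Status")
        if status is not None and status not in ALLOWED_STATUSES:
            errors.append(f"{task}: status {status!r} is not allowed")
        # DONE is only accepted when the evidence field is non-empty.
        if status == "DONE" and not fields.get("Evidence"):
            errors.append(f"{task}: DONE requires completion evidence")
    return errors
```

Returning a list of errors instead of raising on the first one lets a caller report every problem in a single pass, which seems to match the paragraph's description of a static check over one markdown snapshot.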
@@ -176,7 +194,7 @@ Reads `specs/<YYYY-MM-DD-NO-feature-name>/tasks.md` and implements each task seq

 `/pb-build` is now explicitly architecture-bound: it reads the repo's **Architecture Decision Snapshot**, follows the feature's **Architecture Decisions**, re-checks **SRP** and **DIP** during execution, and keeps external dependencies behind interfaces or abstract classes when the design requires that seam. It should not improvise a different Factory, Strategy, Observer, Adapter, or Decorator choice mid-build.

-Before parsing tasks or spawning subagents, `/pb-build` now runs a mandatory Phase 0 validation gate against the existing markdown contract: required design sections, required `Task X.Y` fields, and at least one feature scenario. If any item is missing, the build stops before implementation work starts. Task closure also follows explicit state transitions, so `DONE` is only reachable after scenario, test, and verification evidence are satisfied.
+Before parsing tasks or spawning subagents, `/pb-build` now runs a mandatory Phase 0 validation gate against the existing markdown contract: required design sections, required `Task X.Y` fields, and at least one feature scenario. If any item is missing, the build stops before implementation work starts. The builder still manages explicit task-state transitions during execution, while the current static validator focuses on allowed status values and evidence-backed completion gates rather than reconstructing prior state history from a single markdown snapshot.

 ## Skills Overview

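The Phase 0 gate described in this hunk stops the build when a required design section or a feature scenario is missing. A minimal sketch of that stop-before-work behavior follows. The section names in `REQUIRED_DESIGN_SECTIONS`, the `BuildBlocked` exception, and the `phase0_gate` function are hypothetical; the real requirements live in the project's contract, and the task-field checks are omitted to keep the sketch short.

```python
import re

# Hypothetical section list; the real one is defined by the project's contract.
REQUIRED_DESIGN_SECTIONS = ("Architecture Decisions", "Verification")

HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)
# `Scenario:` and `Scenario Outline:` are standard Gherkin keywords.
SCENARIO = re.compile(r"^[ \t]*Scenario(?: Outline)?:[ \t]*(.+?)[ \t]*$", re.MULTILINE)


class BuildBlocked(Exception):
    """Raised when the gate fails, before any task is executed (hypothetical)."""


def phase0_gate(design_md: str, feature_texts: list[str]) -> list[str]:
    """Return the inventoried scenario names, or raise BuildBlocked."""
    headings = set(HEADING.findall(design_md))
    problems = [
        f"design.md is missing section {name!r}"
        for name in REQUIRED_DESIGN_SECTIONS
        if name not in headings
    ]
    scenarios = [name for text in feature_texts for name in SCENARIO.findall(text)]
    if not scenarios:
        problems.append("no feature scenario found")
    if problems:
        # Stop here: nothing downstream should start on an incomplete contract.
        raise BuildBlocked("; ".join(problems))
    return scenarios
```

Raising a single exception that lists every problem keeps the gate all-or-nothing, which is one way to read "the build stops before implementation work starts".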
@@ -212,6 +230,45 @@ pb-spec's prompt design is inspired by Anthropic's research on [Effective Harnes