| name | specflow |
|---|---|
| description | Spec-driven development with executable contracts |
| version | 1.0.0 |
| author | Hulupeep |
Specs that enforce themselves. Turn requirements into contracts that break the build when violated.
Spec --> [Pre-Flight] --> Contract --> Test --> Code --> Verify
- Write requirements with IDs (AUTH-001 MUST, J-CHECKOUT-001)
- Run pre-flight simulation — catches broken specs before any code is written
- Generate contract YAML with forbidden/required patterns
- Generate tests that scan source code for violations
- Implement code that satisfies contracts
- Violations = build fails = PR blocked
When this skill is active, Claude Code MUST:
- Before modifying any file: Check if it falls under a contract scope in
docs/contracts/*.yml. If yes, read the contract and respect allnon_negotiablerules. - Before closing any work: Run contract tests (
npm test -- contracts) and journey tests. Work is not done if tests fail. - When creating new features: Generate the spec (with REQ IDs), contract YAML, and test files BEFORE implementing code.
- When a contract violation is reported: Read the contract rule, understand why it exists, fix the code to comply. Never work around the test.
- Never modify
non_negotiablerules unless the user explicitly saysoverride_contract: <contract_id>. - Before accepting a ticket as specflow-compliant: Pre-flight must have run and returned
passed,passed_with_warnings, or a human-acknowledgedoverride:*status. A ticket withblocked,stale, or missing pre-flight section is NOT compliant.
A read-only simulation that catches broken specs before any code is written. Pre-flight runs at two points in the pipeline:
| Scope | Trigger | Lenses |
|---|---|---|
| Ticket | When any ticket is created or edited as a specflow ticket | Lenses 1-5 |
| Wave | Between dependency-mapper and sprint-executor, for all wave tickets simultaneously | Lenses 1-6 (includes Lens 6: Concurrent User Scenarios) |
All of the following invoke format-then-simulate, in that order, every time:
- "write this as a specflow ticket"
- "update this ticket as a specflow ticket"
- "edit this ticket as a specflow ticket"
- "make this ticket specflow-compliant"
- Any instruction resulting in specflow-writer creating or modifying a ticket body
A ticket is NOT specflow-compliant until both format AND simulate have completed cleanly.
passed — no CRITICAL findings
passed_with_warnings — P1 findings present but acknowledged
blocked — CRITICAL findings unresolved; ticket cannot enter a wave
stale — ticket or referenced contract updated after last simulation
override:[reason] — human override applied; wave can proceed
Any value outside this enum is treated as blocked by waves-controller. The field is parsed directly — no regex, no interpretation.
After dependency-mapper completes and before sprint-executor fires:
- Any ticket with
blockedorstale→ wave pauses, finding summary output to user, STOP - Any non-enum value → treated as
blocked, STOP - All tickets with
passed,passed_with_warnings, oroverride:*→ sprint-executor proceeds
override_preflight: [ticket-id] reason: [reason text]
Sets simulation_status: override:[reason] on the ticket. Logged to docs/preflight/overrides.md with ticket-id, reason, RFC 3339 UTC timestamp, and user. board-auditor displays overrides distinctly (
- Does not modify source files, contract YAMLs, or migrations
- Does not run tests or Playwright
- Does not fix tickets (that is heal-loop's job on built code)
- Does not call external APIs
- Does not mark a ticket compliant — it returns findings; specflow-writer applies the status
| Type | File Pattern | Enforced By | When |
|---|---|---|---|
| Architecture | feature_architecture.yml |
Pattern scan (Jest/Vitest) | Before build |
| Feature | feature_*.yml |
Pattern scan (Jest/Vitest) | Before build |
| Security | security_defaults.yml |
Pattern scan | Before build |
| Accessibility | accessibility_defaults.yml |
Pattern scan | Before build |
| Journey | journey_*.yml |
Playwright E2E | After build |
contract_meta:
id: auth_feature
version: 1
covers_reqs: [AUTH-001, AUTH-002]
rules:
non_negotiable:
- id: AUTH-001
title: "API endpoints require authMiddleware"
scope: ["src/routes/**/*.ts"]
behavior:
forbidden_patterns:
- pattern: /router\.(get|post).*\/api\//
message: "Route missing authMiddleware"
required_patterns:
- pattern: /authMiddleware/
message: "Must use authMiddleware"
auto_fix:
strategy: "add_import"
import_line: "import { authMiddleware } from '@/middleware/auth'"- forbidden_patterns: Must NOT match in ANY file in scope. If found, violation.
- required_patterns: Must match in AT LEAST ONE file in scope. If absent, violation.
CONTRACT VIOLATION: AUTH-001 - API route missing authMiddleware
File: src/routes/users.ts
Line: 42
Match: router.get('/api/users', async (req, res) => {
Only humans can override non-negotiable rules. User must say:
override_contract: <contract_id>
When overriding: explain what rule is broken, warn about consequences, ask if contract should be updated permanently.
These patterns are enforced as non-negotiable in all src/**/*.{ts,js,tsx,jsx} files (excluding tests).
Forbidden patterns:
/(password|secret|api_key|apikey|token)\s*[:=]\s*['"][^'"]{8,}['"]/i
/sk_live_[a-zA-Z0-9]{20,}/
/sk_test_[a-zA-Z0-9]{20,}/
/-----BEGIN (RSA |EC )?PRIVATE KEY-----/
/ghp_[a-zA-Z0-9]{36}/
/xoxb-[0-9]{10,}-[a-zA-Z0-9]{20,}/
Fix: Use process.env.VAR_NAME instead.
Forbidden patterns:
/query\s*\(\s*['"`].*\$\{/
/query\s*\(\s*['"`].*\+\s*\w/
/execute\s*\(\s*['"`].*\$\{/
Fix: Use parameterized queries ($1, $2).
Forbidden patterns:
/dangerouslySetInnerHTML\s*=\s*\{\s*\{\s*__html:(?!\s*(sanitize|DOMPurify|purify))/
/\.innerHTML\s*=(?!\s*['"`]<)/
Fix: Sanitize with DOMPurify before rendering.
Forbidden patterns:
/\beval\s*\(/
/new\s+Function\s*\(/
Fix: Use JSON.parse or safe alternatives.
Forbidden patterns:
/readFile(Sync)?\s*\(\s*(?!path\.join|path\.resolve|__dirname)/
/writeFile(Sync)?\s*\(\s*(?!path\.join|path\.resolve|__dirname)/
Fix: Use path.join(__dirname, 'safe-dir', path.basename(input)).
Enforced in src/**/*.{tsx,jsx} files.
Forbidden: /<img\s+(?![^>]*\balt\s*=)[^>]*\/?>/
Forbidden: /<button(?![^>]*aria-label)[^>]*>\s*<(?:svg|img|[A-Z]\w*)[^>]*\/?\s*>\s*<\/button>/
Forbidden: /<input(?![^>]*(?:aria-label|aria-labelledby|id\s*=))[^>]*>/
Forbidden: /tabIndex\s*=\s*\{?\s*[1-9]/
When: Before any ticket is accepted as specflow-compliant; before each wave fires (between dependency-mapper and sprint-executor).
Recommended model: sonnet
Process:
- Receive JSON input with scope (
ticketorwave) plus ticket bodies andcontracts_dir - Load all files in
docs/contracts/before running any lens (proof-of-work: list every file loaded) - Run lenses in sequence — Lenses 1-5 for ticket scope, Lenses 1-6 for wave scope:
- Lens 1: Dependency Order — are all upstream dependencies present in the wave or schema?
- Lens 2: Shared State — concurrent writes, global state that should be per-user scoped?
- Lens 3: Schema Reality Check — every field, column, endpoint, and contract ID referenced actually exists?
- Lens 4: Timing and Interval Assumptions — polling intervals, timeouts, SLA thresholds match contracts?
- Lens 5: Partial Failure States — missing rollback, cleanup, or idempotency on step N+1 failure?
- Lens 6: Concurrent User Scenarios (wave scope only) — race conditions, missing locks, shared state not isolated per-session?
- Write findings to
## Pre-flight Findingssection in each ticket body (via specflow-writer for writes) - Return structured report with Lens attribution on every finding
- P2 findings written to
docs/preflight/[ticket-id]-[timestamp].mdwithout blocking
This agent is read-only. It never modifies source files, contract YAMLs, or migrations.
When: New feature needs acceptance criteria, Gherkin, and contracts.
Process:
- Understand the domain -- read existing code, schema, and issues
- Define scope (In Scope / Not In Scope)
- Design data contracts with complete executable SQL
- Define invariants (I-DOMAIN-NNN)
- Generate Gherkin scenarios (happy path, edge cases, error paths)
- Write acceptance criteria as testable checkboxes
- Define user journeys for all UI features
- Create contract YAML and update CONTRACT_INDEX.yml
Output: GitHub issue with Gherkin, data contracts, journey reference, and contract YAML files.
When: After implementation, before closing tickets.
Process:
- Parse ticket sections (scope, data contract, Gherkin, acceptance criteria)
- Trace code paths for each Given/When/Then step
- Validate data contracts against actual migrations
- Check journey coverage for UI-facing issues
- Verify invariants are enforced (DB constraint, app logic, or both)
- Generate validation report with PASS / PARTIAL / FAIL status
Key rule: Issues with UI but no journey contract are PARTIAL at best, never PASS.
When: After implementation, before closing tickets or creating PRs.
Process:
- Detect test framework (Playwright, Jest, Vitest, Cypress)
- Run tests with verbose output
- Parse results, categorize failures (locator, assertion, network, auth, flaky, setup)
- Map failures to source files with root cause candidates
- Generate report with summary, failures, skip reasons, and rerun commands
Mandatory reporting: WHERE tests ran, WHICH tests, HOW MANY passed/failed, SKIPPED with reasons.
When: Contract tests fail. Invoked by orchestrator, never by users directly.
Scope: Only contract violations with enough YAML context to generate a fix. Never journey tests, build errors, or forbidden patterns without auto_fix hints.
Process:
- Parse violation output (rule ID, file, line, matched text)
- Load contract rule and extract patterns, auto_fix hints, examples
- Read the violating file and understand context
- Determine fix strategy from auto_fix or infer from required_patterns
- Generate minimal fix (smallest possible change)
- Apply fix and re-run the specific contract test
- If pass: report and continue. If fail: retry (max 3 attempts)
- After exhaustion: revert last failed attempt, escalate with all strategies tried
Fix strategies: add_import, remove_pattern, wrap_with, replace_with
Route tasks to the optimal model tier for cost efficiency (~40-60% savings).
| Tier | Task Types |
|---|---|
| Haiku | Compliance audits, pattern matching validation, test execution and parsing, coverage checks, issue closing |
| Sonnet | Spec generation, contract YAML creation, test code generation, dependency mapping, component building, orchestration |
| Opus | Deep fix reasoning (heal-loop), complex architectural analysis |
Override in .specflow/config.json:
{
"model_routing": {
"default": "sonnet",
"overrides": {
"heal-loop": "opus",
"test-runner": "haiku"
}
}
}All four gates must pass before work is considered complete.
npm test -- contractsPattern scans source code for forbidden/required patterns. Violations block the build.
npx playwright testE2E tests verify user flows work end-to-end. Critical journeys must pass before release.
SEC-001 through SEC-005 scan for OWASP Top 10 patterns. Non-negotiable.
A11Y-001 through A11Y-004 scan for WCAG AA violations. Non-negotiable.
| Level | Meaning | Release Impact |
|---|---|---|
critical |
Core user flow | Blocks release if failing |
important |
Key feature | Should fix before release |
future |
Planned feature | Can release without |
Never report "ready for release" if any critical journey is failing or not_tested.
Fix patterns are stored in .specflow/fix-patterns.json and scored by historical success rate.
| Tier | Confidence | Behavior |
|---|---|---|
| Platinum | >= 0.95 | Auto-apply immediately |
| Gold | >= 0.85 | Auto-apply, flag in commit message for review |
| Silver | >= 0.75 | Suggest only, do not auto-apply |
| Bronze | < 0.70 | Learning only, track for analysis |
Score rules: New patterns start at 0.50 (Silver). +0.05 per success, -0.10 per failure. Decay -0.01/week after 90 days unused. Below 0.30: archived.
Pattern entry format:
{
"id": "fix-sec-001-hardcoded-secret",
"contract_rule": "SEC-001",
"violation_signature": "Hardcoded secret detected",
"fix_strategy": "replace_with",
"fix_template": {
"find": "const KEY = \"sk_live_...\"",
"replace_pattern": "const KEY = process.env.STRIPE_SECRET_KEY"
},
"confidence": 0.50,
"tier": "silver"
}/specflow Full autonomous loop: spec, contract, test, implement, verify
/specflow verify Contract validation only against existing contracts
/specflow spec Generate spec with REQ IDs for current issue or feature
/specflow heal Run fix loop on failing contract tests
/specflow status Render full execution dashboard (all 5 visualizations)
/specflow compile Compile CSV journeys to YAML contracts + Playwright stubs
- Check if
docs/contracts/exists. If not, create it and install default templates. - Read existing specs and contracts for context.
- For each feature/issue in scope: a. Generate REQ IDs from plain English descriptions b. Create contract YAML with forbidden/required patterns c. Create contract test files d. Implement code that satisfies contracts e. Run contract tests -- fix violations f. Run journey tests if applicable
- Report: which REQs covered, which journeys pass, DOD status.
- Load all contracts from
docs/contracts/*.yml - For each non_negotiable rule, scan scoped files for pattern violations
- Report violations with file:line references
- Report DOD status (critical journeys passing?)
- Interview user about the feature in plain English
- Generate REQ IDs (AUTH-001, SEC-001, J-CHECKOUT-001)
- Create contract YAML
- Create test stubs
- Output summary of what was created
- Run contract tests, capture failures
- For each violation: a. Parse rule ID, file, line b. Read contract YAML for auto_fix hints c. Apply minimal fix d. Re-test
- Report: fixed, escalated, or exhausted
- Find CSV journey files (user provides path, or search for
*.csvwith journey headers) - Validate CSV: journey_id format, sequential steps, owner present, critical yes/no
- Run
node scripts/specflow-compile.js <csv-file> - Report: journeys compiled, contracts generated, test stubs generated
- Remind user to commit the CSV + generated files
Render the full execution dashboard with all 5 mandatory visualizations:
- EXECUTION TIMELINE — where you are, what's done, what's next
- ENFORCEMENT MAP — what gets tested, by what mechanism, and when (the trust layer)
- DEPENDENCY TREE — execution order and what blocks what
- PARALLEL AGENT MODEL — who is working on what right now
- SPRINT SUMMARY TABLE — cumulative totals across all completed waves
This command works at any point during wave execution. See agents/waves-controller.md for full visualization templates.
Core Loop: Spec --> [Pre-Flight] --> Contract --> Test --> Code --> Verify
REQ ID Format: AUTH-001 (MUST), AUTH-010 (SHOULD), J-AUTH-LOGIN
Contract Files: docs/contracts/feature_*.yml, journey_*.yml
Test Files: src/__tests__/contracts/*.test.ts, tests/e2e/*.spec.ts
Commands: npm test -- contracts, npx playwright test
Override: override_contract: <contract_id>
Pre-Flight: simulation_status: passed | passed_with_warnings | blocked | stale | override:[reason]
PF Override: override_preflight: <ticket-id> reason: <reason text>
When setting up Specflow in a new project, create this structure:
docs/
contracts/
feature_architecture.yml # ARCH rules
feature_*.yml # Feature rules
journey_*.yml # User flow DOD
security_defaults.yml # SEC-001..005
accessibility_defaults.yml # A11Y-001..004
CONTRACT_INDEX.yml # Central registry
src/
__tests__/
contracts/
*.test.ts # Contract pattern tests
tests/
e2e/
journey_*.spec.ts # Playwright journey tests
.specflow/
config.json # Model routing, overrides
fix-patterns.json # Fix pattern store
Inspired by the single-file skill packaging of forge by Ikenna N. Okpala.