8-Agent SDLC Pipeline for Claude Code
Transform feature implementation into an automated, quality-enforced workflow. One command runs research → plan → test → implement → review → security → docs → git automation.
AI coding assistants are powerful, but they drift. They build features you didn't ask for. They ignore your architecture. They forget your constraints.
Without guardrails, you get:
- Scope creep - Features that don't align with project goals
- 23% bug rate - Code that wasn't tested first
- 12% security vulnerabilities - No automated auditing
- 67% documentation drift - Docs that don't match code
autonomous-dev keeps AI aligned to YOUR intent. Every feature validates against your project's strategic goals before a single line of code is written.
Define your intent once. Enforce it automatically.
# .claude/PROJECT.md
## GOALS
- Build a secure authentication system
- Maintain sub-100ms API response times
## SCOPE
- User management, session handling
- NOT: Payment processing, analytics
## CONSTRAINTS
- No external auth providers
- Must support offline mode
## ARCHITECTURE
- REST API with JWT tokens
- PostgreSQL for persistenceEvery /implement command validates:
- Does this feature align with GOALS?
- Is it within SCOPE?
- Does it violate CONSTRAINTS?
- Does it follow ARCHITECTURE?
Result: The AI builds what you actually want, not what it thinks you want.
| Metric | Without Pipeline | With /implement |
|---|---|---|
| Scope alignment | Manual review | Automatic validation |
| Bug rate | 23% | 4% |
| Security issues | 12% | 0.3% |
| Documentation drift | 67% | 2% |
| Test coverage | 43% | 94% |
85% of issues caught before commit.
# Install (30 seconds)
bash <(curl -sSL https://raw.githubusercontent.com/akaszubski/autonomous-dev/master/install.sh)
# Restart Claude Code (Cmd+Q / Ctrl+Q), then:
/setup # Guided PROJECT.md creation/implement Add user authentication with JWT tokensautonomous-dev orchestrates 8 specialized AI agents in sequence:
STEP 1: Alignment → Validates against PROJECT.md goals
STEP 2: Research → Finds existing patterns in your codebase
STEP 3: Planning → Designs the architecture
STEP 4: TDD Tests → Writes failing tests FIRST
STEP 5: Implementation → Makes tests pass
STEP 6: Parallel Validation (3 agents simultaneously):
├── Code Review → Checks quality and patterns
├── Security Audit → Scans for OWASP vulnerabilities
└── Documentation → Keeps docs in sync
STEP 7: Git Automation → Commit, push, PR, close issue
Result: Production-ready code in 15-25 minutes.
| Command | Purpose |
|---|---|
/setup |
Interactive PROJECT.md creation wizard |
/implement |
Full pipeline: align, research, plan, test, implement, review, secure, document |
/implement --quick |
Fast mode: implementer agent only (2-5 min) |
/implement --batch |
Process multiple features from file |
/implement --issues 1 2 3 |
Process features from GitHub issues |
/implement --resume |
Continue interrupted batch |
/align |
Validate alignment (project goals, CLAUDE.md, retrofit) |
/create-issue |
Research-backed GitHub issues with duplicate detection |
/advise |
Critical analysis before major decisions |
/audit-tests |
Identify untested code paths |
/audit-claude |
Validate CLAUDE.md structure against best practices |
/health-check |
Validate all plugin components (agents, hooks, commands) |
/sync |
Update plugin from marketplace |
/worktree |
Manage git worktrees for isolated development |
┌─────────────────────────────────────────────────────────────────┐
│ /implement Pipeline │
├─────────────────────────────────────────────────────────────────┤
│ │
│ alignment → researcher → planner → test-master → implementer │
│ ↓ │
│ ┌─────────────────────┴─────────────────────┐│
│ │ Parallel Validation ││
│ │ reviewer + security-auditor + doc-master ││
│ │ (60% faster) ││
│ └───────────────────────────────────────────┘│
│ ↓ │
│ git automation │
│ (commit → push → PR → close issue) │
│ │
├─────────────────────────────────────────────────────────────────┤
│ 8 Pipeline Agents │ 39 Skills │ 73 Hooks │ 132 Libraries │
└─────────────────────────────────────────────────────────────────┘
8 Pipeline Agents (invoked via Task tool):
- researcher-local (Haiku) - Searches codebase for existing patterns
- researcher-web (Sonnet) - Researches best practices and security considerations
- planner (Sonnet) - Designs implementation architecture
- test-master (Sonnet) - Writes comprehensive tests BEFORE implementation (TDD)
- implementer (Sonnet) - Writes production-quality code to make tests pass
- reviewer (Sonnet) - Reviews code quality, patterns, and coverage
- security-auditor (Haiku) - Scans for OWASP vulnerabilities
- doc-master (Haiku) - Updates documentation to match code changes
Utility Agents (invoked via Task tool):
- alignment-validator, commit-message-generator, issue-creator, pr-description-generator, and more
How Agents Work:
- Agents are markdown prompts (not Python files)
- Invoked via Claude's Task tool with
subagent_typeparameter - Each agent has specialized knowledge and tools
- Pipeline runs sequentially (except parallel validation in step 6)
| Tier | Model | Agents | Cost Savings |
|---|---|---|---|
| Tier 1 | Haiku | researcher-local, reviewer, doc-master, security-auditor | 40-60% savings |
| Tier 2 | Sonnet | implementer, test-master, planner, researcher-web | Balanced cost/quality |
| Tier 3 | Opus | (reserved for future complex reasoning tasks) | Deep reasoning when needed |
Hooks run automatically at key moments to enforce quality without manual intervention:
| Hook Type | What It Does |
|---|---|
| PreToolUse | 4-layer security validation (sandboxing → MCP → agent auth → batch permissions) |
| PreCommit | Blocks commits with failing tests, missing docs, or security issues |
| SubagentStop | Triggers git automation after pipeline completion |
| PrePromptSubmit | Enforces workflow discipline and command validation |
| SessionStart | Restores session state after /clear for continuity |
Key Hooks:
- unified_pre_tool.py: 4-layer security (84% reduction in permission prompts)
- ralph_loop_enforcer.py: Self-correcting agent execution with 5 validation strategies
- pre_commit_gate.py: 100% test pass requirement (no bypass)
- enforce_pipeline_order.py: Prevents skipping research/planning/tests
- unified_session_tracker.py: Session state persistence across
/clearoperations
Hook Exit Code Semantics:
EXIT_SUCCESS (0): Hook passed, continue executionEXIT_WARNING (1): Non-blocking warning, log and continueEXIT_BLOCK (2): Critical failure, block operation- Lifecycle hooks (PreToolUse, SubagentStop) must exit 0 (blocking not allowed)
What hooks catch automatically:
- Documentation drift from code changes
- Secrets accidentally staged for commit
git push --forceto protected branches- Missing test coverage on new code
- CLAUDE.md out of sync with codebase
- Path traversal and injection attacks
- Agent execution failures (auto-retry with circuit breaker)
Philosophy: Quality gates should be automatic. If you have to remember to check something, you'll eventually forget.
After pipeline completion, autonomous-dev handles everything:
Feature completes → Generate commit message → Stage & commit
→ Push to remote → Create PR → Close GitHub issue
Enabled by default with first-run consent. Configure via .env:
AUTO_GIT_ENABLED=true # Master switch (default: true)
AUTO_GIT_PUSH=true # Auto-push (default: true)
AUTO_GIT_PR=true # Auto-create PRs (default: true)Features:
- Conventional commit messages generated by AI
- Co-authorship footer included
- PR descriptions with summary, test plan, related issues
- Automatic GitHub issue closing with workflow summary
autonomous-dev enforces its own quality standards. When the plugin detects it's running in the autonomous-dev repository, stricter gates activate automatically:
In autonomous-dev Repository:
- Coverage Threshold: 80% (vs 70% for user projects)
- No Bypass: Cannot override quality gates with
--no-verify - Automatic Enforcement: Enabled without configuration needed
Quality Gates Enforced:
| Gate | Standard | autonomous-dev |
|---|---|---|
| Test Coverage | 70% | 80% |
| TDD Requirement | Suggested | Required |
| Pre-commit Checks | Optional | Mandatory |
| Documentation Sync | Auto-checked | Strictly enforced |
How It Works:
- Hooks detect autonomous-dev repository via
plugins/autonomous-dev/manifest.json - Auto-detect based on file structure (survives worktrees, CI/CD)
- Apply stricter thresholds in enforcement hooks
- Block commits that don't meet standards (no bypass possible)
Philosophy: "We practice what we preach." If autonomous-dev requires 80% coverage from you, it requires it from itself.
Process 50+ features without manual intervention:
# features.txt
Add JWT authentication
Add password reset (requires JWT)
Add email notifications/implement --batch features.txt/implement --issues 72 73 74New: Batch now automatically continues through all features in a single invocation.
/implement --batch features.txt
# Processes Features 1/5 → 2/5 → 3/5 → 4/5 → 5/5 automatically
# No manual `/implement --resume` needed between features
# Manual resume only needed if batch is interrupted (not between features)Why This Matters: Previously, batch processing stopped after each feature requiring manual intervention. Now features auto-continue without interruption. Failed features are recorded but don't stop the batch.
Dependency Analysis: Automatically reorders features based on dependencies (Issue #157)
Original: [tests, auth, email]
Optimized: [auth, email, tests] # Tests run after implementation
- Analyzes feature dependencies via keyword matching
- Topological sort for optimal execution order
- Circular dependency detection with ASCII graph visualization
- Preserves explicit ordering when dependencies are ambiguous
Checkpoint/Resume: Automatic session snapshots with safe resume capability (Issues #276, #277)
# Batch processing automatically creates checkpoints after each feature
# If interrupted, resume from checkpoint:
/implement --resume batch-20260110-143022
# Rollback to previous checkpoint if needed:
/implement --rollback batch-20260110-143022 --previousContext Management:
- Claude auto-compact at ~185K tokens (23% increase from 150K baseline)
- Batch checkpoints after every feature (state + progress saved)
- SessionStart hook auto-resumes after
/clearor auto-compact - Corrupted checkpoint recovery with
.bakfallback - Token tracking:
context_token_deltaper feature for threshold detection
Automatic Retry: Transient failures (network, rate limits) retry automatically. Permanent errors (syntax, type) skip immediately.
Per-Feature Git: Each feature commits separately with conventional messages.
Issue Auto-Close: GitHub issues closed automatically after push with summary comment.
Agents automatically retry failed operations with intelligent validation:
Agent fails → Analyze error → Adjust strategy → Retry (up to 3x)
↓
Circuit breaker prevents infinite loops
5 Validation Strategies:
- pytest: Test-based validation (tests must pass)
- safe_word: AI confirms completion in response
- file_existence: Required files must exist
- regex: Output must match pattern
- json: Output must be valid JSON
Features:
- Token limits prevent runaway costs
- Circuit breaker after 3 failures
- Validation strategy per agent type
- Default: ENABLED for all pipeline agents
Batch processing enforces 100% integrity:
Rules:
- 100% test pass requirement (not 80%)
- Failed features keep GitHub issues OPEN + 'blocked' label
- Retry escalation: 3 attempts with increasing wait times
- Honest summaries: Never fakes success
Why This Matters: Prevents "silent failures" where batch reports success but issues remain broken.
Per-batch worktree creation for parallel development:
/implement --batch features.txt
# Creates: .worktrees/batch-20260128-143022/
# Automatic CWD change to worktree
# Main repo remains untouched until mergeBenefits:
- Concurrent CI jobs without interference
- Per-worktree batch state isolation
- Safe experimentation (discard without affecting main)
- Supports nested worktrees
Session state survives /clear operations:
Stored in .claude/local/SESSION_STATE.json:
- Active tasks and next steps
- Key conventions (repo-specific patterns)
- Recent context (files modified, workflows completed)
Read at SessionStart: Automatic continuity across sessions
All 73 hooks use UV for reproducible execution:
Features:
- PEP 723 metadata blocks (inline dependencies)
- Zero environment setup overhead
- Graceful fallback to
sys.path - Single-file portability
Example (from any hook):
#!/usr/bin/env -S uv run --quiet
# /// script
# dependencies = ["pytest", "coverage"]
# ///Layer 1: Sandbox Enforcer → Command classification (SAFE/BLOCKED/NEEDS_APPROVAL)
Layer 2: MCP Security → Path traversal, injection, SSRF prevention
Layer 3: Agent Authorization → Pipeline agent detection
Layer 4: Batch Approver → Caches user consent for identical operations
Result: 84% reduction in permission prompts (50+ → 8-10).
- CWE-22: Path traversal prevention
- CWE-78: Command injection blocking
- CWE-918: SSRF prevention
- Symlink rejection outside whitelist
- Credential safety (never logged)
- Audit logging to
logs/security_audit.log
| Phase | Improvement |
|---|---|
| Model optimization (Haiku for research) | 3-5 min saved |
| Prompt simplification | 2-4 min saved |
| Parallel validation | 60% faster (5 min → 2 min) |
| Smart agent selection | 95% faster for typos/docs |
Current Performance:
- Full pipeline: 15-25 minutes per feature
- Quick mode: 2-5 minutes
- Typo fixes: <2 minutes (95% faster than full pipeline)
- Automatic context summarization (Claude Code handles 200K token budget)
- Session files log to
docs/sessions/(200 tokens vs 5,000+) /clearrecommended after each feature for optimal performance
| Guide | Description |
|---|---|
| CLAUDE.md | Project instructions and quick reference |
| Architecture | Technical architecture deep-dive |
| Agents | 8-agent pipeline + utility agents |
| Hooks | 73 automation hooks reference |
| Skills | 32 skills and agent integration |
| Workflow Discipline | Why pipelines beat direct implementation |
| Performance | Benchmarks and optimization history |
| Git Automation | Zero manual git operations |
| Batch Processing | Multi-feature workflows with auto-continuation |
| Security | Security model and hardening guide |
- Claude Code 2.0+
- macOS or Linux
- Git
ghCLI (optional, for PR creation and issue management)
Quality should be automatic, not optional. autonomous-dev makes the right thing the easy thing:
- Research happens automatically before implementation
- Tests are written first (TDD enforced)
- Security is scanned on every feature
- Documentation stays in sync automatically
- Git operations are orchestrated end-to-end
- Alignment is validated against your stated intent
You describe what you want. The pipeline handles the rest.
| Layer | Weight | Purpose |
|---|---|---|
| Hooks | 10% | Deterministic blocking (secrets, force push) |
| CLAUDE.md | 30% | Persuasion via data (this README) |
| Convenience | 40% | Quality path is the easy path |
| Skills | 20% | Agent expertise and knowledge |
We don't force quality. We make it the path of least resistance.
MIT
Built for developers who ship.