Confidence: Tier 2 — Validated by multiple production reports and official documentation.
Last updated: February 2026
This is a quick reference for 15 structured development methodologies that have emerged for AI-assisted development in 2025-2026. For hands-on practical workflows, see workflows/.
- Decision Tree
- The 15 Methodologies
- SDD Tools Reference
- Writing Effective Specs
- Combination Patterns
- Sources
```
┌─ "I want quality code" ────────────→ workflows/tdd-with-claude.md
│
├─ "I want to spec before code" ─────→ workflows/spec-first.md
│
├─ "I need to plan architecture" ────→ workflows/plan-driven.md
│
├─ "I'm iterating on something" ─────→ workflows/iterative-refinement.md
│
└─ "I need methodology theory" ──────→ Continue reading below
```
Organized in a 6-tier pyramid from strategic orchestration down to optimization techniques.
| Name | What | Best For | Claude Fit |
|---|---|---|---|
| BMAD | Multi-agent governance with constitution as guardrail | Enterprise 10+ teams, long-term projects | ⭐⭐ Niche but powerful |
| GSD | Meta-prompting 6-phase workflow with fresh contexts per task | Solo devs, Claude Code CLI | ⭐⭐ Similar to patterns in guide |
BMAD (Breakthrough Method for Agile AI-Driven Development) inverts the traditional paradigm: documentation becomes the source of truth, not code. Uses specialized agents (Analyst, PM, Architect, Developer, QA) orchestrated with strict governance. Note: BMAD's role-based agent naming reflects their methodology; see §9.17 Agent Anti-Patterns for scope-focused alternatives.
- Key concept: Constitution.md as strategic guardrail
- When to use: Complex enterprise projects needing governance
- When to avoid: Small teams, MVPs, rapid prototyping
GSD (Get Shit Done) addresses context rot through systematic 6-phase workflow (Initialize → Discuss → Plan → Execute → Verify → Complete) with fresh 200k-token contexts per task. Core concepts (multi-agent orchestration, fresh context management) overlap significantly with existing patterns like Ralph Loop, Gas Town, and BMAD. See resource evaluation for detailed comparison.
Emerging: Ralph Inferno implements autonomous multi-persona workflows (Analyst→PM→UX→Architect→Business) with VM-based execution and self-correcting E2E loops. Experimental but interesting for "vibe coding at scale".
"Once the plan is good, the code is good." — Boris Cherny, creator of Claude Code
Not just a feature (/plan command) — a systematic discipline.
Context Engineering: Thoughtworks designates this broader approach "Context Engineering" in their Technology Radar (Nov 2025)[^1] — the systematic design of information provided to LLMs during inference. Three core techniques: context setup (minimal system prompts, few-shot examples), context management for long-horizon tasks (summarization, external memories, sub-agent architectures), and dynamic information retrieval (JIT context loading). Related patterns in Claude Code: AGENTS.md, MCP Context7, Plan Mode.
The Mental Model:
Planning isn't optional for complex tasks. It's the difference between:
- ❌ 8 iterations of "try → fix → retry → fix again"
- ✅ 1 iteration of "plan → validate → execute cleanly"
When to plan first:
| Task Complexity | Plan First? | Why |
|---|---|---|
| >3 files modified | ✅ Yes | Cross-file dependencies need architecture |
| >50 lines changed | ✅ Yes | Enough complexity for mistakes |
| Architectural changes | ✅ Yes | Impact analysis required |
| Unfamiliar codebase | ✅ Yes | Need exploration before action |
| Typo/obvious fix | ❌ No | Planning overhead > task time |
| Single-line change | ❌ No | Just do it |
How plan-first works:
1. Exploration phase (`/plan` mode):
   - Claude reads files, explores architecture
   - No edits allowed → forces thinking before action
   - Proposes approach with trade-offs
2. Validation phase (you review):
   - Plan exposes assumptions and gaps
   - Easier to correct direction now vs after 100 lines written
   - Plan becomes contract for execution
3. Execution phase (`/execute`):
   - Plan → code becomes mechanical translation
   - Fewer surprises, cleaner implementation
   - Faster overall despite "slower" start
Boris Cherny workflow:
"I run many sessions, start in plan mode, then switch into execution once the plan looks right. The signature upgrade is verification—giving Claude a way to test and confirm its own output."
Benefits over "just start coding":
- Fewer correction iterations: Plan catches issues before they become code
- Better architecture: Forced to think about structure first
- Clearer communication: Plan is shared understanding with team/Claude
- Reduced cost: One clean iteration < multiple messy iterations (even if plan phase costs tokens)
Integration with CLAUDE.md:
Document your team's plan-first triggers:
```markdown
## Planning Policy
- ALWAYS plan first: API changes, database migrations, new features
- OPTIONAL planning: Bug fixes <10 lines, test additions
- NEVER skip: Changes affecting >2 modules
```

See also: Plan Mode documentation for /plan command usage.
| Name | What | Best For | Claude Fit |
|---|---|---|---|
| SDD | Specs before code | APIs, contracts | ⭐⭐⭐ Core pattern |
| Doc-Driven | Docs = source of truth | Cross-team alignment | ⭐⭐⭐ CLAUDE.md native |
| Req-Driven | Rich artifact context (20+ artifacts) | Complex requirements | ⭐⭐ Heavy setup |
| DDD | Domain language first | Business logic | ⭐⭐ Design-time |
SDD (Spec-Driven Development) — Specifications BEFORE code. One well-structured iteration equals 8 unstructured ones. CLAUDE.md IS your spec file.
Doc-Driven Development — Living documentation versioned in git becomes the single source of truth. Changes to specs trigger implementation.
Requirements-Driven Development — Uses CLAUDE.md as comprehensive implementation guide with 20+ structured artifacts.
DDD (Domain-Driven Design) — Aligns software with business language through:
- Ubiquitous Language: Shared vocabulary in code
- Bounded Contexts: Isolated domain boundaries
- Domain Distillation: Core vs Support vs Generic domains
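A minimal sketch of how Ubiquitous Language and Bounded Contexts show up in code: each context owns its own model even when names collide, and a boundary translation keeps them isolated. Class and function names here are illustrative, not a prescribed design.

```python
from dataclasses import dataclass

# --- Sales context: "Order" means a customer purchase ---
@dataclass
class SalesOrder:
    customer_id: str
    total_cents: int

# --- Shipping context: "Order" means a parcel to deliver ---
@dataclass
class ShippingOrder:
    parcel_id: str
    address: str

# Translation at the context boundary (an "anti-corruption layer"
# in DDD terms) keeps the two models from leaking into each other.
def to_shipping(order: SalesOrder, address: str) -> ShippingOrder:
    return ShippingOrder(parcel_id=f"P-{order.customer_id}", address=address)
```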
| Name | What | Best For | Claude Fit |
|---|---|---|---|
| BDD | Given-When-Then scenarios | Stakeholder collaboration | ⭐⭐⭐ Tests & specs |
| ATDD | Acceptance criteria first | Compliance, regulated | ⭐⭐ Process-heavy |
| CDD | API contracts as interface | Microservices | ⭐⭐⭐ OpenAPI native |
BDD (Behavior-Driven Development) — Beyond testing: a collaboration process.
- Discovery: Involve devs and business experts
- Formulation: Write Given-When-Then examples
- Automation: Convert to executable tests (Gherkin/Cucumber)
```gherkin
Feature: Order Management
  Scenario: Cannot buy without stock
    Given product with 0 stock
    When customer attempts purchase
    Then system refuses with error message
```

ATDD (Acceptance Test-Driven Development) — Acceptance criteria defined BEFORE coding, collaboratively ("Three Amigos": Business, Dev, Test).
CDD (Contract-Driven Development) — API contracts (OpenAPI specs) as executable interface between teams. Patterns: Contract as Test, Contract as Stub.
| Name | What | Best For | Claude Fit |
|---|---|---|---|
| FDD | Feature-by-feature delivery | Large teams 10+ | ⭐⭐ Structure |
| Context Eng. | Context as first-class design | Long sessions | ⭐⭐⭐ Fundamental |
FDD (Feature-Driven Development) — Five processes:
- Develop Overall Model
- Build Features List
- Plan by Feature
- Design by Feature
- Build by Feature
Strict iteration: 2 weeks max per feature.
Context Engineering — Treat context as design element:
- Progressive Disclosure: Let agent discover incrementally
- Memory Management: Conversation vs persistent memory
- Dynamic Refresh: Rewrite TODO list before response
| Name | What | Best For | Claude Fit |
|---|---|---|---|
| TDD | Red-Green-Refactor | Quality code | ⭐⭐⭐ Core workflow |
| Eval-Driven | Evals for LLM outputs | AI products | ⭐⭐⭐ Agents |
| Multi-Agent | Orchestrate sub-agents | Complex tasks | ⭐⭐⭐ Task tool |
TDD (Test-Driven Development) — The classic cycle:
- Red: Write failing test
- Green: Minimal code to pass
- Refactor: Clean up, tests stay green
With Claude: Be explicit. "Write FAILING tests that don't exist yet."
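The Red-Green-Refactor cycle fits in a few lines; a sketch in Python where `apply_discount` is a hypothetical function (not a prescribed API):

```python
# Green phase: minimal implementation, written only after the
# assertions below were seen to fail.
def apply_discount(price: float, percent: float) -> float:
    if not 0 <= percent <= 100:
        raise ValueError("percent must be in [0, 100]")
    return round(price * (1 - percent / 100), 2)

# Red phase: written FIRST; failed with NameError until the
# implementation above existed.
assert apply_discount(100.0, 20) == 80.0
assert apply_discount(10.0, 50) == 5.0
try:
    apply_discount(100.0, 150)
    raise AssertionError("expected ValueError")
except ValueError:
    pass  # Refactor phase: clean up while these stay green
```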
Verification Loops — A formalized pattern for autonomous iteration (broader than TDD):
Core principle: Give Claude a mechanism to verify its own output.
Code generated → Verification tool → Feedback loop → Improvement

Why it works (Boris Cherny): "An agent that can 'see' what it has done produces better results."
Verification mechanisms by domain:
| Domain | Verification Tool | What Claude "Sees" |
|---|---|---|
| Frontend | Browser preview (live reload) | Visual rendering, layout, interactions |
| Backend | Tests (unit/integration) | Pass/fail status, error messages |
| Types | TypeScript compiler | Type errors, incompatibilities |
| Style | Linters (ESLint, Prettier) | Style violations, formatting issues |
| Performance | Profilers, benchmarks | Execution time, memory usage |
| Accessibility | axe-core, screen readers | WCAG violations, navigation issues |
| Security | Static analyzers (Semgrep) | Vulnerability patterns |
| UX | User testing, recordings | Usability problems, confusion points |

TDD as canonical example:
- Claude writes tests for the feature
- Claude iterates code until tests pass
- Continue until explicit completion criteria met
Official guidance: "Tell Claude to keep going until all tests pass. It will usually take a few iterations." — Anthropic Best Practices
Implementation patterns:
- Hooks: PostToolUse hook runs verification after each edit
- Browser extension: Claude in Chrome sees rendered output
- Test watchers: Jest/Vitest watch mode provides instant feedback
- CI/CD gates: GitHub Actions runs full validation suite
- Multi-Claude verification: One Claude codes, another reviews
Anti-pattern: Blind iteration without feedback. Without verification mechanism, Claude can't converge toward correct solution—it guesses.
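The hooks pattern above can be expressed in Claude Code's settings. A minimal sketch in `.claude/settings.json` (the `npm test` command is illustrative; verify matcher and field names against the current hooks documentation):

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "npm test --silent" }
        ]
      }
    ]
  }
}
```

With this in place, every file edit triggers the test suite, so Claude sees pass/fail feedback immediately instead of iterating blind.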
Eval-Driven Development — TDD for LLMs. Test agent behaviors via evals:
- Code-based: `output == golden_answer`
- LLM-based: Another Claude evaluates
- Human grading: Reference, slow
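A code-based eval can be as small as an exact-match grader over golden answers. A sketch (the case data and `grade` helper are illustrative, and `model` stands in for a real API call):

```python
EVAL_CASES = [
    {"prompt": "Capital of France?", "golden": "Paris"},
    {"prompt": "2 + 2 = ?", "golden": "4"},
]

def model(prompt: str) -> str:
    # Stand-in for a real model invocation (e.g. via an SDK).
    return {"Capital of France?": "Paris", "2 + 2 = ?": "4"}[prompt]

def grade(cases, run) -> float:
    """Exact-match grading: output == golden_answer."""
    passed = sum(run(c["prompt"]).strip() == c["golden"] for c in cases)
    return passed / len(cases)

print(grade(EVAL_CASES, model))  # → 1.0
```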
Eval Harness — The infrastructure that runs evaluations end-to-end: providing instructions and tools, running tasks concurrently, recording steps, grading outputs, and aggregating results.
See Anthropic's comprehensive guide: Demystifying Evals for AI Agents
Multi-Agent Orchestration — From single assistant to orchestrated team:
```
Meta-Agent (Orchestrator)
├── Analyst (requirements)
├── Architect (design)
├── Developer (code)
└── Reviewer (validation)
```
Pattern: Write plain English ADRs → Feed to implement-adr skill → Execute natively
Architecture Decision Records (ADRs) combined with Claude Code skills create a workflow where architectural decisions drive implementation directly.
Workflow Steps:
- Document decision in ADR format (context, decision, consequences)
- Create implementation skill (generic or specialized `implement-adr`)
- Feed ADR as prompt to skill with clear acceptance criteria
- Claude executes based on architectural guidance in ADR
Example ADR Template:
```markdown
# ADR-001: Database Migration Strategy

## Context
Legacy MySQL schema needs migration to PostgreSQL for better JSON support.

## Decision
Use incremental dual-write pattern with feature flags.

## Consequences
- Positive: Zero-downtime migration
- Negative: Temporary code complexity during transition
```
Implementation Workflow:
```bash
# 1. Write ADR (plain English)
vim docs/adr/001-database-migration.md

# 2. Feed to implementation skill
/implement-adr docs/adr/001-database-migration.md

# 3. Claude executes based on ADR guidance
# → Creates migration scripts
# → Updates ORM configuration
# → Adds feature flags
# → Implements dual-write logic
```

Benefits:
- ✅ Documentation-driven: Architecture and code stay synchronized
- ✅ Native execution: No external frameworks needed
- ✅ Traceable decisions: Clear audit trail from decision to implementation
- ✅ Team alignment: ADRs communicate intent to both humans and AI
Source: Gur Sannikov embedded engineering workflow
| Name | What | Best For | Claude Fit |
|---|---|---|---|
| Iterative Loops | Autonomous refinement | Optimization | ⭐⭐⭐ Core |
| Fresh Context | Reset per task, state in files | Long autonomous sessions | ⭐⭐⭐ Power users |
| Prompt Engineering | Technique foundation | Everything | ⭐⭐⭐ Prerequisite |
Iterative Refinement Loops — Autonomous convergence:
- Execute prompt
- Observe result
- If result ≠ "DONE" → refine and repeat
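The loop above can be sketched as follows, with `run_step` standing in for one model invocation (names and the convergence signal are illustrative):

```python
def refine_until_done(run_step, max_iters: int = 5) -> str:
    result = ""
    for _ in range(max_iters):
        result = run_step(result)      # 1. execute prompt
        if result.endswith("DONE"):    # 2. observe result
            return result              # converged
        # 3. result != "DONE": previous output becomes the
        #    context for the next refinement pass
    raise RuntimeError("did not converge within max_iters")

# Toy step: accumulates one "*" per pass, signals DONE on pass 3.
def toy_step(previous: str) -> str:
    n = previous.count("*") + 1
    return previous + "*" + ("DONE" if n == 3 else "")

print(refine_until_done(toy_step))  # → ***DONE
```

A bounded `max_iters` matters in practice: it converts a non-converging loop into an explicit failure instead of an endless token burn.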
Prompt Engineering — Foundations for ALL Claude usage:
- Zero-Shot Chain of Thought: "Think step by step"
- Few-Shot Learning: 2-3 examples of expected pattern
- Structured Prompts: XML tags for organization
- Position Matters: For long docs, place question at end
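These techniques compose. A sketch of a structured prompt that combines XML tags, few-shot examples, and an end-positioned question (tag names are conventional, not mandated):

```python
document = "...long document text..."

prompt = f"""<document>
{document}
</document>

<examples>
Q: What framework is used? A: vitest
Q: What is the coverage target? A: 80%
</examples>

<instructions>
Think step by step, then answer in one sentence.
</instructions>

Question: What is the project's git branch convention?"""

# Position matters: the question sits last, after the long document.
assert prompt.rstrip().endswith("branch convention?")
```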
Fresh Context Pattern (Ralph Loop) — Solves context rot by spawning fresh agent instances per task. State persists in git + progress files, not chat history. Ideal for long autonomous sessions (migrations, overnight runs). See Ultimate Guide - Fresh Context Pattern for implementation.
Three tools have emerged to formalize Spec-Driven Development:
| Tool | Use Case | Official Docs | Claude Integration |
|---|---|---|---|
| Spec Kit | Greenfield, governance | github.blog/spec-kit | /speckit.constitution, /speckit.specify, /speckit.plan |
| OpenSpec | Brownfield, changes | github.com/Fission-AI/OpenSpec | /openspec:proposal, /openspec:apply, /openspec:archive |
| Specmatic | API contract testing | specmatic.io | MCP agent available |
5-phase workflow:
1. Constitution: `/speckit.constitution` → guardrails
2. Specify: `/speckit.specify` → requirements
3. Plan: `/speckit.plan` → architecture
4. Tasks: `/speckit.tasks` → decomposition
5. Implement: `/speckit.implement` → code
Two-folder architecture:
```
openspec/
├── specs/     ← Current truth (stable)
└── changes/   ← Proposals (temporary)
```
Workflow: Proposal → Review → Apply → Archive
- Contract as Test: Auto-generates 1000s of tests from OpenAPI spec
- Contract as Stub: Mock server for parallel development
- Backward Compatibility: Detects breaking changes
Based on analysis of 2,500+ agent configuration files. Source: Addy Osmani
| Component | What to Include | Example |
|---|---|---|
| Commands | Executable with flags | npm test -- --coverage |
| Testing | Framework, coverage, locations | vitest, 80%, tests/ |
| Project structure | Explicit directories | src/, lib/, tests/ |
| Code style | One example > paragraphs | Show a real function |
| Git workflow | Branch, commit, PR format | feat/name, conventional commits |
| Boundaries | Permission tiers | See below |
| Tier | Symbol | Use For |
|---|---|---|
| Always do | ✅ | Safe actions, no approval (lint, format) |
| Ask first | ⚠️ | High-impact changes (delete, publish) |
| Never do | 🚫 | Hard stops (commit secrets, force push main) |
⚠️ Research shows more instructions = worse adherence to each one.

Solution: Feed only relevant spec sections per task, not the entire document.
| Project Size | Approach |
|---|---|
| Small (<10 files) | Single spec file |
| Medium (10-50 files) | Sectioned spec, feed per task |
| Large (50+ files) | Sub-agent routing by domain |
Recommended stacks by situation:
| Situation | Recommended Stack | Notes |
|---|---|---|
| Solo MVP | SDD + TDD | Minimal overhead, quality focus |
| Team 5-10, greenfield | Spec Kit + TDD + BDD | Governance + quality + collaboration |
| Microservices | CDD + Specmatic | Contract-first, parallel dev |
| Existing SaaS (100+ features) | OpenSpec + BDD | Change tracking, no spec drift |
| Enterprise 10+ | BMAD + Spec Kit + Specmatic | Full governance + contracts |
| LLM-native product | Eval-Driven + Multi-Agent | Self-improving systems |
| Methodology | Level | Primary Focus | Team Size | Learning Curve |
|---|---|---|---|---|
| BMAD | Orchestration | Governance | 10+ | High |
| SDD | Specification | Contracts | Any | Medium |
| Doc-Driven | Specification | Alignment | Any | Low |
| Req-Driven | Specification | Context | 5+ | Medium |
| DDD | Specification | Domain | 5+ | Very High |
| BDD | Behavior | Collaboration | 5+ | Medium |
| ATDD | Behavior | Compliance | 5+ | Medium |
| CDD | Behavior | APIs | 5+ | Medium |
| FDD | Delivery | Features | 10+ | Medium |
| Context Eng. | Delivery | AI sessions | Any | Low |
| TDD | Implementation | Quality | Any | Low |
| Eval-Driven | Implementation | AI outputs | Any | Medium |
| Multi-Agent | Implementation | Complexity | Any | Medium |
| Iterative | Optimization | Refinement | Any | Low |
| Prompt Eng. | Optimization | Foundation | Any | Very Low |
- Anthropic: Claude Code Best Practices
- Anthropic: Effective Context Engineering for AI Agents
- Anthropic: Demystifying Evals for AI Agents
- GitHub: Spec-Driven Development Toolkit
- Microsoft: Spec-Driven Development with Spec Kit
SDD & Spec-First
- Addy Osmani: How to Write Good Specs for AI Agents
- Addy Osmani: My AI Coding Workflow in 2026 — End-to-end workflow: spec-first, context packing, TDD, git checkpoints
- Martin Fowler: SDD Tools Analysis
- InfoQ: Spec-Driven Development
- Kinde: Beyond TDD - Why SDD is the Next Step
- Tessl.io: Spec-Driven Dev with Claude Code
BMAD
- GMO Recruit: The BMAD Method
- Benny Cheung: BMAD - Reclaiming Control in AI Dev
- GitHub: BMAD-AT-CLAUDE
TDD with AI
- Steve Kinney: TDD with Claude
- Nathan Fox: Taming GenAI Agents
- Alex Op: Custom TDD Workflow Claude Code
BDD & DDD
- Alex Soyes: BDD Behavior-Driven Development
- Alex Soyes: DDD Domain-Driven Design
- Inflectra: Behavior-Driven Development
Context Engineering
- Intuition Labs: What is Context Engineering
- Manus.im: Context Engineering for AI Agents
Eval-Driven & Multi-Agent
- Fireworks AI: Eval-Driven Development with Claude Code
- Brandon Casci: Transform into a Dev Team using Claude Code Agents
- The Unwind AI: Claude Code's Multi-Agent Orchestration
- OpenSpec: github.com/Fission-AI/OpenSpec
- Spec Kit: github.com/github/spec-kit
- Specmatic: specmatic.io
- Specmatic Article: Spec-Driven Development with GitHub Spec Kit and Specmatic MCP
- Talent500: Claude Code TDD Guide
- Testlio: Acceptance Test-Driven Development
- Monday.com: Feature-Driven Development
- Paddo.dev: Ralph Wiggum Autonomous Loops
- Walturn: Prompt Engineering for Claude
- AWS: Prompt Engineering with Claude on Bedrock
- workflows/tdd-with-claude.md — Practical TDD guide
- workflows/spec-first.md — Spec-first development
- workflows/plan-driven.md — Using /plan mode
- workflows/iterative-refinement.md — Refinement loops
- ultimate-guide.md#912 — Section 9.12 summary
Footnotes

[^1]: Thoughtworks Technology Radar Vol 33, Nov 2025 (PDF). See also: Macro trends blog post.