lytics · prosdev · Nov 29, 2025 · Nov 29, 2025 · Nov 29, 2025
diff --git a/AGENTS.md b/AGENTS.md
@@ -179,10 +179,13 @@ Specialized agents for development tasks.
 
 MCP server with built-in adapters for AI tools.
 
-**Adapters:**
+**Adapters (9 tools):**
 - **SearchAdapter:** Semantic code search (`dev_search`)
+- **RefsAdapter:** Relationship queries - callers/callees (`dev_refs`)
+- **MapAdapter:** Codebase structure with change frequency (`dev_map`)
+- **HistoryAdapter:** Semantic git commit search (`dev_history`)
 - **StatusAdapter:** Repository status (`dev_status`)
-- **PlanAdapter:** Planning from issues (`dev_plan`)
+- **PlanAdapter:** Context assembly for issues (`dev_plan`)
 - **ExploreAdapter:** Code exploration (`dev_explore`)
 - **GitHubAdapter:** Issue/PR search (`dev_gh`)
 - **HealthAdapter:** Server health checks (`dev_health`)

diff --git a/CLAUDE.md b/CLAUDE.md
@@ -154,16 +154,19 @@ dev mcp install
 
 That's it! Claude Code now has access to all dev-agent capabilities.
 
-### Available Tools in Claude Code & Cursor
-
-Once installed, AI tools gain access to these powerful capabilities:
-
-- **`dev_search`** - Semantic code search across indexed repositories
-- **`dev_status`** - Repository indexing status and health information  
-- **`dev_plan`** - Generate implementation plans from GitHub issues
-- **`dev_explore`** - Explore code patterns, find similar code, analyze relationships
-- **`dev_gh`** - Search GitHub issues and pull requests with semantic context (auto-reloads on index changes)
-- **`dev_health`** - Check MCP server health and component status (vector storage, repository, GitHub index)
+### Available Tools in Claude Code & Cursor (9 tools)
+
+Once installed, AI tools gain access to:
+
+- **`dev_search`** - Semantic code search (USE THIS FIRST for conceptual queries)
+- **`dev_refs`** - Find callers/callees of functions (for specific symbols)
+- **`dev_map`** - Codebase structure with component counts and change frequency
+- **`dev_history`** - Semantic search over git commits (who changed what and why)
+- **`dev_plan`** - Assemble context for GitHub issues (code + history + patterns)
+- **`dev_explore`** - Find similar code, trace relationships
+- **`dev_gh`** - Search GitHub issues/PRs semantically
+- **`dev_status`** - Repository indexing status
+- **`dev_health`** - Server health checks
 
 ### MCP Command Reference
 

diff --git a/PLAN.md b/PLAN.md
@@ -213,7 +213,7 @@ Git history is valuable context that LLMs can't easily access. We add intelligen
 
 > Building on git history with deeper insights.
 
-### Tasks
+### Git Tasks
 
 | Task | Priority | Status |
 |------|----------|--------|
@@ -222,6 +222,28 @@ Git history is valuable context that LLMs can't easily access. We add intelligen
 | Contributor expertise mapping | 🟢 Low | 🔲 Todo |
 | Cross-repo history | 🟢 Low | 🔲 Todo |
 
+### Tool Improvements
+
+| Task | Rationale | Priority | Status |
+|------|-----------|----------|--------|
+| Generalize `dev_plan` → `dev_context` | Currently requires GitHub issue; should work with any task description | 🔴 High | 🔲 Todo |
+| Freeform context assembly | `dev_context "Add rate limiting"` without needing issue # | 🔴 High | 🔲 Todo |
+| Multiple input modes | `--issue 42`, `--file src/auth.ts`, or freeform query | 🟡 Medium | 🔲 Todo |
+
+**Why:** `dev_plan` is really a context assembler but is tightly coupled to GitHub issues. Generalizing it:
+- Works without GitHub
+- Easier to benchmark (no real issues needed)
+- Name matches function (assembles context, doesn't "plan")
+- More useful for ad-hoc implementation tasks
+
+### Benchmark Improvements
+
+| Task | Rationale | Priority | Status |
+|------|-----------|----------|--------|
+| Add implementation task types | Current benchmark only tests exploration; missing `dev_plan`/`dev_gh` coverage | 🟡 Medium | 🔲 Todo |
+| Generic implementation patterns | "Add a new adapter similar to X" — tests pattern discovery | 🟡 Medium | 🔲 Todo |
+| Snapshotted issue tests | Capture real issues for reproducible `dev_plan` testing | 🟢 Low | 🔲 Todo |
+
 ---
 
 ## Future: Extended Intelligence (v0.6+)
@@ -314,23 +336,40 @@ How we know dev-agent is working:
 4. **Daily use:** We actually use it ourselves (dogfooding)
 5. **LLM effectiveness:** Claude/Cursor make better suggestions with dev-agent
 
-### Benchmark Results (v0.4.2)
+### Benchmark Results (v0.4.3)
+
+#### By Task Type
+
+| Task Type | Cost Savings | Time Savings | Why |
+|-----------|--------------|--------------|-----|
+| **Debugging** | **42%** | 37% | Semantic search beats grep chains |
+| **Exploration** | **44%** | 19% | Find code by meaning |
+| **Implementation** | **29%** | 22% | Context bundling via `dev_plan` |
+| **Simple lookup** | ~0% | ~0% | Both approaches are fast |
+
+**Key insight:** Savings scale with task complexity.
+
+#### Why It Saves Money
+
+| What dev-agent does | Manual equivalent | Impact |
+|---------------------|-------------------|--------|
+| Returns code snippets in search | Read entire files | 99% fewer input tokens |
+| `dev_plan` bundles issue + code + commits | 5-10 separate tool calls | 29% cost reduction |
+| Semantic search finds relevant code | grep chains + filtering | 42% cost reduction |
 
-Measured against baseline Claude Code across 5 task types:
+#### Token Analysis (Debugging Task)
 
-| Metric | Baseline | With dev-agent | Improvement |
-|--------|----------|----------------|-------------|
-| Cost per session | $1.82 | $1.02 | **-44%** |
-| Time per session | 14.1 min | 11.5 min | **-19%** |
-| Tool calls | 69 | 40 | **-42%** |
-| Files examined | 23 | 15 | **-35%** |
+| Metric | Without dev-agent | With dev-agent | Difference |
+|--------|-------------------|----------------|------------|
+| Input tokens | 18,800 | 65 | **99.7% less** |
+| Output tokens | 12,200 | 6,200 | **49% less** |
+| Files read | 10 | 5 | **50% less** |
 
 **Trade-offs identified:**
-- Less thorough for debugging (missing diagnostic commands)
-- Fewer code examples in responses
-- Skips test files (baseline reads them)
+- Baseline provides more diagnostic shell commands
+- Baseline reads more files (sometimes helpful for thoroughness)
 
-**Target users:** Mid-to-senior engineers who value speed over exhaustiveness for routine exploration tasks.
+**Target users:** Engineers working on complex exploration, debugging, or implementation tasks in large/unfamiliar codebases.
 
 ---
 
@@ -347,4 +386,4 @@ pnpm test
 
 ---
 
-*Last updated: November 2025*
+*Last updated: November 29, 2025 at 02:30 PST*
diff --git a/website/content/docs/index.mdx b/website/content/docs/index.mdx
@@ -1,35 +1,46 @@
 # Introduction
 
-**dev-agent** provides semantic code search to AI assistants like Cursor and Claude Code via MCP.
+**dev-agent** provides semantic code search and context bundling to AI assistants like Cursor and Claude Code via MCP.
 
-We built this for ourselves. When exploring large codebases, we found AI tools spending too much time grepping through files. dev-agent gives them a faster path: search by meaning, not keywords.
+We built this for ourselves. When exploring large codebases, we found AI tools spending too much time grepping through files and reading entire files to find relevant code. dev-agent gives them a faster path: search by meaning, get code snippets, bundle context.
 
 ## What it does
 
 1. **Indexes your codebase** locally with embeddings (all-MiniLM-L6-v2)
-2. **Exposes 9 MCP tools** for semantic search, code relationships, git history
-3. **Integrates with GitHub** to search issues and PRs semantically
+2. **Returns code snippets** — not just file paths, reducing input tokens by 99%
+3. **Bundles context** — `dev_plan` assembles issue + code + commits in one call
+4. **Integrates with GitHub** to search issues and PRs semantically
 
 ## Measured impact
 
-We benchmarked dev-agent against baseline Claude Code:
+We benchmarked dev-agent against baseline Claude Code across different task types:
 
-| Metric | Baseline | With dev-agent | Change |
-|--------|----------|----------------|--------|
-| Cost | $1.82 | $1.02 | **-44%** |
-| Time | 14.1 min | 11.5 min | **-19%** |
-| Tool calls | 69 | 40 | **-42%** |
+| Task Type | Cost Savings | Time Savings | Why |
+|-----------|--------------|--------------|-----|
+| **Debugging** | **42%** | 37% | Semantic search beats grep chains |
+| **Exploration** | **44%** | 19% | Find code by meaning |
+| **Implementation** | **29%** | 22% | Context bundling via `dev_plan` |
 
-**Trade-off:** Faster but sometimes less thorough. Best for implementation tasks and exploration. For deep debugging, baseline Claude may read more files.
+**Key insight:** Savings scale with task complexity. Simple lookups show no improvement; complex debugging shows 42% cost reduction.
+
+**Trade-off:** Faster but sometimes less thorough. Baseline Claude provides more diagnostic shell commands.
+
+## Why it saves money
+
+| What dev-agent does | Manual equivalent | Impact |
+|---------------------|-------------------|--------|
+| Returns code snippets in search | Read entire files | 99% fewer input tokens |
+| `dev_plan` bundles issue + code + commits | 5-10 separate tool calls | 29% cost reduction |
+| Semantic search finds relevant code | grep chains + filtering | 42% cost reduction |
 
 ## Key Features
 
 | Feature | Description |
 |---------|-------------|
+| **Context Bundling** | `dev_plan` replaces 5-10 tool calls with one |
+| **Code Snippets** | Search returns code, not just file paths |
 | **Semantic Search** | Find code by meaning, not keywords |
-| **Relationship Queries** | What calls this function? What does it call? |
 | **Git History** | Semantic search over commits |
-| **GitHub Integration** | Search issues and PRs semantically |
 | **100% Local** | Your code never leaves your machine |
 
 ## Architecture
@@ -45,4 +56,3 @@ dev-agent is a monorepo:
 
 - [Installation →](/docs/install) — Get dev-agent installed in under 2 minutes
 - [Quickstart →](/docs/quickstart) — Index and search in 5 minutes
-