---
title: "10 days of vibe coding: what I learned building an MCP server"
date: 2025-11-29
description: "How a hackathon project turned into 42% cost savings for AI-assisted development"
---

import { Callout, Steps, Tabs, FileTree } from 'nextra/components'

# 10 days of vibe coding: what I learned building an MCP server

<Callout type="info">
This is the story of building [dev-agent](https://github.com/lytics/dev-agent), an MCP server that gives AI tools semantic code search. What started as a hackathon exploration turned into measurable improvements in my daily workflow.
</Callout>

I was watching Claude read the same file for the third time in a row. It had already found the answer — it just didn't know it.

That's when I decided to spend a week figuring out why.

## What "vibe coding" actually means to me

Vibe coding isn't about letting AI write everything. It's about:

1. Describing intent at a high level
2. Letting AI handle the boilerplate
3. Focusing my attention on the hard parts

The problem? AI can't handle the boilerplate if it doesn't understand the codebase. I was spending more time correcting Claude's assumptions than writing code myself.

## The problem: grep chains

Here's what a typical Claude Code session looked like before I built dev-agent:

```
Task: "Where is rate limiting implemented?"

Claude's approach:
1. grep "rate limit" → 23 matches across 8 files
2. Read packages/mcp-server/src/server/rate-limiter.ts (180 lines)
3. grep "token bucket" → 12 matches
4. Read packages/mcp-server/src/server/index.ts (340 lines)
5. Read packages/core/src/utils/retry.ts (95 lines)
6. ... 5 more file reads

Total: 18 tool calls, 10 files read, ~18,000 input tokens
Time: 45 seconds
```

The answer was in lines 45-62 of the first file. Claude read 10 files to find it.

## The premise

I set aside a week to explore this. My question: **Can I make Claude Code understand my codebase better?**

I started with a `PLAN.md` and a monorepo scaffold. The goal wasn't to build a product — it was to learn how AI tools explore codebases and whether I could improve that experience.

## Day 1-2: The foundation

The first two days were about building the core: a repository scanner and vector storage.

### Why local-first mattered

I wanted embeddings stored locally, not sent to a cloud service. My code stays on my machine. This led me to:

- **LanceDB** for vector storage (embedded, no server)
- **Transformers.js** for embeddings (runs locally, no API calls)
- **ts-morph** for TypeScript parsing (extracts functions, classes, relationships)

```typescript
// What the scanner extracts
interface Component {
  name: string;
  type: 'function' | 'class' | 'interface';
  filePath: string;
  startLine: number;
  endLine: number;
  imports: string[];
  exports: string[];
}
```
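
To make that stack concrete, here's a minimal sketch of how indexing could work with Transformers.js and LanceDB. The model name, the text that gets embedded, and the table layout are my own illustrative assumptions, not dev-agent's actual implementation:

```typescript
// Illustrative sketch only: embeddings are computed locally, nothing leaves the machine.
import { pipeline } from '@xenova/transformers';
import * as lancedb from '@lancedb/lancedb';

// `Component` is the scanner interface shown above.
async function indexComponents(components: Component[]) {
  const embed = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');

  const rows: Record<string, unknown>[] = [];
  for (const c of components) {
    const tensor = await embed(`${c.type} ${c.name} in ${c.filePath}`, {
      pooling: 'mean',
      normalize: true,
    });
    rows.push({ vector: Array.from(tensor.data as Float32Array), ...c });
  }

  // LanceDB is embedded: the "database" is just a directory on disk.
  const db = await lancedb.connect('./.dev-agent/index');
  await db.createTable('components', rows);
}
```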

By day 2, I had a working CLI:

```bash
dev index .        # Index the repository
dev search "auth"  # Semantic search
```

The scanner hit 94% test coverage on day 1. Not because I'm obsessive about coverage, but because testing edge cases revealed bugs in how I was parsing TypeScript.

## Day 3-4: The subagent architecture

I got ambitious. What if I had specialized agents for different tasks?

- **Explorer** — Find similar code, trace relationships
- **Planner** — Analyze GitHub issues, break them into tasks
- **GitHub agent** — Index issues/PRs for semantic search

By day 4, I had 557 tests passing. The subagent coordinator could route messages between agents, share context, and handle graceful shutdown.
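
Roughly, the coordinator reduced to a registry plus message routing. This is a simplified sketch of the shape, not dev-agent's real interfaces:

```typescript
// Simplified sketch; names and signatures are illustrative.
interface SubagentMessage {
  from: string;
  to: string;       // e.g. 'explorer', 'planner', 'github'
  payload: unknown;
}

interface Subagent {
  name: string;
  handle(msg: SubagentMessage): Promise<SubagentMessage | void>;
  shutdown(): Promise<void>; // graceful shutdown hook
}

class SubagentCoordinator {
  private agents = new Map<string, Subagent>();

  register(agent: Subagent) {
    this.agents.set(agent.name, agent);
  }

  // Route a message to its target agent; shared context can ride along in the payload.
  async route(msg: SubagentMessage) {
    const target = this.agents.get(msg.to);
    if (!target) throw new Error(`Unknown subagent: ${msg.to}`);
    return target.handle(msg);
  }

  async shutdown() {
    await Promise.all([...this.agents.values()].map((a) => a.shutdown()));
  }
}
```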

### The decision: context provision, not automation

I originally planned a "PR agent" that would create pull requests automatically. I cut it.

Why? I realized the real value was in **context provision** — giving AI tools better information to work with. Automation can come later. First, solve the information problem.

## Day 5-6: MCP integration

This is where things got interesting.

### Why MCP over HTTP API

My original plan was an HTTP API server. But MCP (Model Context Protocol) was a better fit:

- Works natively with Claude Code and Cursor
- No server management — just a CLI command
- Stdio transport is simple and reliable

```bash
# One command to integrate with Claude Code
dev mcp install
```
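
Under the hood, exposing a tool over stdio is a small amount of code. Here's a sketch using the TypeScript MCP SDK; the schema, server metadata, and the `searchIndex` helper are placeholders I made up for illustration, not dev-agent's code:

```typescript
// Sketch of wiring a semantic-search tool into an MCP server over stdio.
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';

// Hypothetical helper that queries the local index and formats code snippets.
declare function searchIndex(query: string, limit: number): Promise<string>;

const server = new McpServer({ name: 'dev-agent', version: '0.0.0' });

server.tool(
  'dev_search',
  'USE THIS FIRST for code exploration. Semantic search finds code by meaning, not just keywords.',
  { query: z.string(), limit: z.number().optional() },
  async ({ query, limit }) => {
    const snippets = await searchIndex(query, limit ?? 5);
    return { content: [{ type: 'text' as const, text: snippets }] };
  }
);

// Stdio transport: the client (Claude Code, Cursor) just spawns this process.
await server.connect(new StdioServerTransport());
```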

### The "aha" moment

When I first got semantic search working in Claude Code, I noticed something unexpected. Claude was making **fewer file reads**.

Before: Claude would grep, find file paths, then read entire files.

After: My search returned **code snippets**, not just file paths. Claude could see the relevant code without reading the file.

```typescript
// What dev_search returns
// packages/mcp-server/src/server/rate-limiter.ts (score: 0.92)
// Lines 45-62

export class TokenBucketRateLimiter implements RateLimiter {
  private tokens: number;
  private lastRefill: number;

  constructor(private config: RateLimitConfig) {
    this.tokens = config.bucketSize;
    this.lastRefill = Date.now();
  }

  async consume(): Promise<boolean> {
    this.refill();
    if (this.tokens > 0) {
      this.tokens--;
      return true;
    }
    return false;
  }
}
```

This was the insight that would later show up in benchmarks: **93% fewer input tokens**, because Claude doesn't need to read entire files.

## Day 7-8: Richer context

With the foundation working, I added more tools:

- **dev_refs** — Find who calls a function and what it calls
- **dev_map** — Codebase structure with component counts
- **dev_history** — Semantic search over git commits

The git history integration was particularly useful. Claude can now search commits by meaning:

```bash
dev_history query="authentication refactor"
# Returns commits about auth, even if they don't use that exact word
```

### Unified indexing

I consolidated everything into one command:

```bash
dev index .
# Indexes: code → git history → GitHub issues/PRs
```

One command, three types of context. This became important for the `dev_plan` tool, which bundles all three into a single response.
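
To make "bundles all three" concrete, a `dev_plan` response is conceptually shaped like this. The field names are my illustration, not the tool's actual schema:

```typescript
// Conceptual shape of a bundled planning context (illustrative field names).
interface PlanContext {
  issue: { number: number; title: string; body: string };
  relatedCode: Array<{
    filePath: string;
    lines: [number, number];
    snippet: string; // the code itself, so the model doesn't have to read the file
    score: number;   // semantic similarity
  }>;
  relatedCommits: Array<{ sha: string; message: string; date: string }>;
  relatedIssues: Array<{ number: number; title: string; state: string }>;
}
```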

## Day 9-10: Measuring it

I'm an engineer. I had to measure it.

I ran the same tasks with and without dev-agent, tracking time, cost, tool calls, and result quality.

### One real example

**Task:** "Where is rate limiting implemented and how does it work?"

<Tabs items={['Without dev-agent', 'With dev-agent']}>
  <Tabs.Tab>
    ```
    Tool calls: 18
    Files read: 10
    Input tokens: ~18,000
    Time: 45 seconds

    Approach: grep → read → grep → read → grep...
    ```
  </Tabs.Tab>
  <Tabs.Tab>
    ```
    Tool calls: 3
    Files read: 2
    Input tokens: ~1,200
    Time: 28 seconds

    Approach: dev_search → read 2 files for full context
    ```
  </Tabs.Tab>
</Tabs>

Same answer. **93% fewer input tokens.**

### The results across task types

| Task Type | Cost Savings | Time Savings |
|-----------|--------------|--------------|
| Debugging | 42% | 37% |
| Exploration | 44% | 19% |
| Implementation | 29% | 22% |

The 42% cost savings wasn't the goal — it was a side effect of returning code snippets instead of file paths.

### When it helps (and when it doesn't)

The data revealed something important: **savings scale with task complexity**.

- **Simple lookups** (find a specific function): ~0% savings. Claude's grep is fine.
- **Conceptual queries** ("how does auth work"): 44% savings. Semantic search shines.
- **Implementation tasks** (GitHub issues): 29% savings. Context bundling helps.

If your tasks are simple, dev-agent won't help much. If you're doing complex exploration or implementation, it adds up.

## Things that didn't work

### Attempt 1: HTTP API server

I spent half a day building an HTTP server before realizing CLI + MCP was simpler. Lesson: don't add infrastructure you don't need.

### Attempt 2: Automatic PR creation

I built a PR agent that would create PRs automatically. Cut it after day 4. Why? The real problem was context, not automation. I was solving the wrong problem.

### Attempt 3: Complex tool descriptions

My first tool descriptions were paragraphs long. Claude ignored them. Shorter, more prescriptive descriptions worked better:

```typescript
// Before: vague
description: "Search the codebase"

// After: prescriptive
description: "USE THIS FIRST for code exploration. Semantic search finds code by meaning, not just keywords. Better than grep for conceptual queries."
```

### Attempt 4: Too many tools too fast

By day 4, I had 9 tools. That was too many to test properly. I should have started with 3 and added incrementally.

## How my workflow changed

Before dev-agent, vibe coding felt like babysitting. I'd describe what I wanted, watch Claude grep around, then correct its assumptions.

Now it feels more like pair programming. Claude finds the right code faster, which means I spend more time on the interesting decisions and less time on "no, look in *that* file."

The biggest change: **I trust Claude's first answer more often.** When it has the right context, it makes fewer mistakes.

## If you're building an MCP server

1. **Start with one tool.** Don't build 9 tools on day 1.
2. **Return code snippets, not file paths.** This is the biggest win.
3. **Test with real tasks, not synthetic benchmarks.** I waited until day 9 — that was too late.
4. **Tool descriptions matter more than you think.** Be prescriptive.
5. **Measure early.** If I'd measured on day 3, I would have focused on the code-snippet insight sooner.

## What's next

The project is open source:

```bash
npm install -g dev-agent
dev index .
dev mcp install           # For Claude Code
dev mcp install --cursor  # For Cursor
```

I'm using it daily now. The next milestone (v0.5.0) is generalizing `dev_plan` into `dev_context` — a tool that bundles relevant context for any query, not just GitHub issues.

---

## The takeaway

Vibe coding works better when your AI tools have better context. Semantic search, code snippets, and context bundling aren't magic — they're just information retrieval done right.

The 42% cost savings is nice, but the real win is **faster iteration**. When Claude finds the right code on the first try, I spend less time correcting it.

If you're building AI tooling, consider: what context is your tool missing? The answer might be simpler than you think.

---

*Built during a hackathon week in November 2025. [Source code on GitHub](https://github.com/lytics/dev-agent).*