Voice Vibe Coding - Multi-Agent System Architecture

Project Overview

Voice Vibe Coding enables users to build and deploy software through natural phone conversations. Users call an ElevenLabs phone agent, describe what they want to build, and a multi-agent Claude Code system automatically transforms the conversation into a deployed web application.

Core Workflow

  1. User calls ElevenLabs phone number
  2. Natural conversation about desired application
  3. Webhook captures transcript in JSON format
  4. Multi-agent Claude Code pipeline processes transcript
  5. Fully functional web app is built and deployed
  6. User receives SMS with deployment URL

System Architecture

Key Principle: Claude Code SDK as Agent Engine

The system uses the Claude Code SDK (Python) to orchestrate specialized agents. Each agent is invoked via the Task tool with isolated context, enabling clean separation of concerns.

Input (webhook) → Orchestrator (Python/SDK) → Chain of Claude Code Agents → Generated HTML → Deployed App

Ephemeral Storage & Multi-Tenancy Architecture

The system implements zero-persistence local storage with privacy-first multi-tenancy to ensure:

  • Minimal disk usage: Projects only exist locally during pipeline execution
  • Complete user isolation: Anonymized user IDs prevent data collision
  • GitHub as source of truth: All code persisted in GitHub, not locally
  • Unlimited scalability: Support thousands of users without disk concerns

How It Works:

  1. Phone Number Extraction: Every request includes caller's phone number
  2. Ephemeral Workspace: Projects built in projects/.temp/{phone}/{timestamp}/
  3. Anonymized Repos: All GitHub repos named vvc-{user_id}-{project-name}
  4. Automatic Cleanup: Temp workspace deleted after successful GitHub push
  5. GitHub-Centric Registry: Only tracks GitHub URLs, no local paths

Workflow:

Phone Call → Extract Phone → Map to User ID → Create Temp Workspace → Build Project → 
Push to GitHub (anonymized) → Delete Local → Update Registry (GitHub URL only)
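
The "Map to User ID" and anonymized naming steps above can be sketched as follows. This is a minimal illustration, not the orchestrator's actual code: the function names `get_or_create_user_id` and `repo_name`, the in-memory `mapping` dict, and the 6-character hex ID format are all assumptions.

```python
import re
import secrets


def get_or_create_user_id(phone: str, mapping: dict) -> str:
    """Return the anonymous ID for a phone number, creating one if needed.

    `mapping` is a hypothetical phone -> user ID store; a real system
    would persist it somewhere private.
    """
    if phone not in mapping:
        # The ID is random, not derived from the phone number, so it
        # cannot be reversed. Re-draw on the (rare) collision.
        candidate = secrets.token_hex(3)  # 6 hex characters
        while candidate in mapping.values():
            candidate = secrets.token_hex(3)
        mapping[phone] = candidate
    return mapping[phone]


def repo_name(user_id: str, project_name: str) -> str:
    """Build a repository name in the vvc-{user_id}-{project-name} form."""
    slug = re.sub(r"[^a-z0-9]+", "-", project_name.lower()).strip("-")
    return f"vvc-{user_id}-{slug}"


mapping = {}
uid = get_or_create_user_id("+15555550100", mapping)
print(repo_name(uid, "Banana Delivery"))
```

Because the ID is looked up before it is generated, a returning caller keeps the same repository prefix across calls.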

Benefits:

  • No disk bloat: Local storage near-zero (only active pipelines)
  • No user collision: Per-user anonymized ID prefixes keep repository names unique
  • Simple recovery: Everything recoverable from GitHub
  • Stateless processing: Local system is just a build server

Current Implementation Status (January 2025)

  • ✅ Python orchestrator using Claude Code SDK
  • ✅ Agent invocation via Task tool with proper subagent_type
  • ✅ JSON validation after each agent with automatic retry (3 attempts)
  • ✅ Static HTML generator agent created and working
  • ✅ Path clarification added to all agents
  • ✅ Environment variables for security (no hardcoded credentials)
  • Git-based project management (no more timestamps!)
  • GitHub Manager agent for repository operations
  • ✅ State tracking with state.json
  • Next.js/React pipeline fully implemented
  • Dynamic pipeline selection via workflow orchestrator
  • Browser automation via Playwright MCP integrated
  • Real UI/UX testing with self-healing capabilities
  • Multi-project support with ElevenLabs pre-call webhooks
  • Iterative updates for existing projects
  • Ephemeral storage with automatic cleanup
  • Privacy-first multi-tenancy with anonymized user IDs
  • Vercel deployment with automatic protection disabling
  • Repository name matching via VOICE_CONTEXT.md GitHub field
  • SMS notification agent integrated
  • Pipeline resume capability with phase management
  • Parallel agent execution with isolated clients
  • Command-line controls for selective execution

Project Structure (Ephemeral Architecture)

voice-vibe-coding/
├── orchestrator.py              # Main Python orchestrator using Claude Code SDK
├── .claude/
│   └── agents/                 # Agent prompt files (MUST be here for discovery)
│       ├── voice-requirements-analyst.md  # Extracts requirements
│       ├── workflow-orchestrator.md       # Plans execution
│       ├── github-manager.md              # Manages git repositories (phone-aware)
│       ├── content-strategist.md          # Generates content
│       ├── ui-ux-designer.md              # Creates design specs
│       ├── static-html-generator.md       # Generates HTML output
│       ├── project-scaffolder.md          # Initializes Next.js project
│       ├── component-architect.md         # Designs component structure
│       ├── component-builder.md           # Builds React components
│       ├── page-assembler.md              # Assembles pages and routes
│       ├── styling-finisher.md            # Applies final styling
│       ├── nextjs-validator.md            # Validates Next.js app
│       ├── browser-functionality-validator.md  # Browser UI/UX testing
│       ├── vercel-deployer.md              # Deploys to Vercel from GitHub
│       └── notification-agent.md           # Sends SMS notifications via Twilio
├── webhooks/
│   └── latest_raw.json          # Latest ElevenLabs transcript
├── projects/
│   ├── .temp/                  # Ephemeral workspaces (auto-deleted)
│   │   └── {phone}/{timestamp}/ # Temporary build directory
│   ├── .project-registry.json  # GitHub-centric registry (no local paths)
│   ├── {phone}/                # Phone-based organization
│   │   ├── VOICE_CONTEXT.md    # Conversational context for voice agent
│   │   └── {project-name}/     # OPTIONAL: Local cache for active development
│   └── .archive/               # Legacy projects (to be removed)
├── webhook_server.py           # Receives ElevenLabs webhooks
├── twilio_messaging.py         # SMS notifications (uses env vars)
├── requirements.txt            # Python dependencies
├── .env.example               # Environment variable template
├── README.md                  # Setup and usage instructions
└── CLAUDE.md                  # This file (project context)

GitHub Repository Structure:
- github.com/breagent/vvc-{user_id}-{project-name}  # Anonymized user IDs
- Example: github.com/breagent/vvc-a7x9k2-banana-delivery
- Phone numbers mapped to random IDs for privacy
- Ensures complete user isolation without exposing personal info

Agent Definitions

1. Requirements Analyst Agent

Purpose: Extract structured requirements from voice transcript
Input: webhooks/latest_raw.json and VOICE_CONTEXT.md
Output: artifacts/requirements.json with githubRepository field
Key Tasks:

  • Parse conversation transcript
  • Check VOICE_CONTEXT.md for existing projects (no artifact access)
  • Determine if update or new project
  • Set githubRepository field for updates
  • Extract feature requirements
  • Document any ambiguities with assumptions

2. Workflow Orchestrator Agent

Purpose: Intelligently determine execution strategy using AI reasoning
Input: artifacts/requirements.json
Output: artifacts/execution_plan.json with stages and parallelization
Key Approach:

  • Uses principles, not rules
  • Reasons about what makes sense for THIS project
  • Determines what can run in parallel
  • Includes testing for interactive features
  • Explains all decisions for transparency

3. GitHub Manager Agent

Purpose: Handle all git and GitHub operations with privacy-first multi-tenancy
Input: Phone number + artifacts/requirements.json with githubRepository field
Output: artifacts/github_setup.json or artifacts/github_finalize.json
Key Tasks:

  • Extract phone number from task description
  • Map phone to anonymous user ID (or generate new one)
  • Check githubRepository field to determine new vs update
  • Clone existing repo if githubRepository present
  • Create new repo if no githubRepository
  • Use anonymized naming: vvc-{user_id}-{project_name}
  • Generate meaningful commits with user identification
  • Update VOICE_CONTEXT.md with conversational descriptions
  • Maintain GitHub-centric registry (no local paths)

4. Content Strategist Agent

Purpose: Extract and generate all text content
Input: artifacts/requirements.json
Output: artifacts/content.json
Key Tasks:

  • Extract content from transcript
  • Generate marketing copy
  • Create headlines and CTAs
  • Write placeholder content
  • Ensure tone consistency

5. UI/UX Designer Agent

Purpose: Create design specifications
Input: artifacts/requirements.json, artifacts/content.json
Output: artifacts/design_specs.json
Key Tasks:

  • Define color palette
  • Choose typography
  • Create layout specifications
  • Define component styling
  • Specify responsive breakpoints
  • Document interaction patterns

6. Component Architect Agent

Purpose: Plan component structure
Input: artifacts/requirements.json, artifacts/design_specs.json
Output: artifacts/components_map.json
Key Tasks:

  • Break UI into components
  • Define component hierarchy
  • Specify props and state
  • Plan data flow
  • Identify reusable patterns

7. Project Scaffolder Agent

Purpose: Initialize project structure
Input: artifacts/execution_plan.json
Output: Initialized project in src/
Key Tasks:

  • Create Next.js/React project
  • Install dependencies
  • Set up folder structure
  • Configure build tools
  • Initialize git repository

8. Component Builder Agent

Purpose: Build the actual application
Input: All artifacts
Output: Complete application in src/
Key Tasks:

  • Implement all components
  • Add routing
  • Integrate content
  • Ensure responsive design
  • Add accessibility features
  • Create forms and interactions

9. Page Assembler Agent

Purpose: Assemble pages and configure routing
Input: Component implementations
Output: Complete Next.js pages
Key Tasks:

  • Create page components
  • Set up routing
  • Integrate all components
  • Configure navigation
  • Ensure data flow

10. Styling Finisher Agent

Purpose: Apply design system and styles
Input: artifacts/design_specs.json, existing code
Output: Styled application
Key Tasks:

  • Implement Tailwind/CSS
  • Apply color scheme
  • Ensure design consistency
  • Add animations/transitions
  • Optimize for all breakpoints

11. Next.js Validator Agent

Purpose: Ensure code quality
Input: Complete application code
Output: artifacts/validation_report.json
Key Tasks:

  • Run linting
  • Type checking
  • Basic functionality tests
  • Fix identified issues
  • Verify build success

12. Browser Functionality Validator Agent

Purpose: Validate UI/UX through real browser automation
Input: Built and validated Next.js application
Output: artifacts/browser_test_report.json
Key Tasks:

  • Start development server
  • Use Playwright MCP tools for real browser testing
  • Test all forms and interactions
  • Validate responsive design
  • Automatically fix UI/UX issues found
  • Re-test after fixes to confirm resolution

13. Vercel Deployer Agent

Purpose: Deploy projects to Vercel from GitHub repository
Input: artifacts/github_finalize.json (contains GitHub repo info)
Output: artifacts/deployment.json with production URL
Key Tasks:

  • Read GitHub repository details from finalize artifact
  • Create Vercel project linked to GitHub repo
  • Configure automatic deployments
  • Disable deployment protection automatically
  • Deploy from main branch
  • Return production URL for SMS notification

Note: Protection is disabled via API so that sites go live immediately without manual intervention.

14. Documentation Compiler Agent

Purpose: Create comprehensive documentation
Input: All artifacts
Output: Documentation in docs/
Key Tasks:

  • Compile requirements summary
  • Document design decisions
  • List all assumptions made
  • Create README
  • Generate technical specs

15. Notification Agent

Purpose: Notify user of completion via SMS
Input: Phone number from task description + deployment/project info
Output: artifacts/notification.json with SMS status
Key Tasks:

  • Extract deployment URL or GitHub URL
  • Craft personalized, engaging message
  • Send SMS notification via Twilio
  • Log notification status

Note: Uses a conversational tone matching the Voice Vibe experience.

Orchestration Strategy

Current Implementation (Python SDK)

The orchestrator (orchestrator.py) uses the Claude Code SDK to manage agent execution:

# Key features implemented:
- Git-based project management (no more timestamps!)
- Dynamic pipeline selection via workflow-orchestrator
- JSON validation after each agent
- Retry logic (3 attempts per agent)
- State tracking in state.json
- GitHub integration for version control
- Playwright MCP integration for browser testing
- Pipeline resume capability with phase management
- Parallel agent execution with isolated clients
- Command-line controls for selective execution

# MCP Configuration:
from claude_code_sdk import ClaudeCodeOptions

options = ClaudeCodeOptions(
    mcp_servers={
        "playwright": {
            "command": "npx",
            "args": ["-y", "@playwright/mcp@latest"]
        }
    },
    allowed_tools=["Task", "Read", "Write", "Edit", "Bash", "Grep", "Glob", "TodoWrite", "mcp__playwright"]
)

# Agent invocation pattern:
async def invoke_agent(client, agent_name, description, project_dir, expected_outputs=None,
                       create_own_client=False):  # New: isolated client for parallel execution
    """Outline of the invocation pattern (body omitted here).

    - Uses Task tool with subagent_type
    - Validates JSON outputs
    - Auto-corrects filename variants
    - Retries on failure
    - Creates separate client if parallel execution needed
    - MCP tools available to all agents
    """
    ...

# Execution flow:
1. Requirements Analysis (always)
2. GitHub Setup (create/clone repo)
3. Workflow Planning (intelligent decision-making)
4. Execute Pipeline (stages from plan, with parallelization)
5. GitHub Finalize (commit & push)
6. Vercel Deployment (if included in plan)
7. SMS Notification (if phone number available)
8. Summary Generation

# Command-line options:
python orchestrator.py                          # Normal full run
python orchestrator.py --resume                 # Resume from last failure
python orchestrator.py --start-from execution   # Start from specific phase
python orchestrator.py --only deployment        # Run only specific phase
python orchestrator.py --skip github-setup      # Skip specific phases
python orchestrator.py --force                  # Force re-run all phases
python orchestrator.py --status                 # Check pipeline status
python orchestrator.py --dry-run               # Preview execution plan
python orchestrator.py --verbose               # Show detailed output

# Phase names for command-line options:
# - requirements (Requirements Analysis)
# - github-setup (GitHub Workspace Setup)
# - workflow-planning (Workflow Planning)
# - execution (Agent Execution)
# - github-finalize (GitHub Finalization)
# - deployment (Vercel Deployment)
# - notification (SMS Notification)
# - summary (Pipeline Summary)

Command-Line Reference

Resume and Recovery Options

--resume

  • Resumes pipeline from the last failed phase
  • Automatically detects last successful phase from state.json
  • Loads existing artifacts and continues execution
  • Skips already completed phases

--force

  • Forces re-run of all phases even if artifacts exist
  • Overwrites existing state.json
  • Useful for debugging or when artifacts are corrupted

--project-dir <path>

  • Uses existing project directory instead of creating new one
  • Must point to valid project with state.json
  • Useful for resuming specific projects

Selective Execution Options

--start-from <phase>

  • Starts execution from specified phase
  • Skips all phases before the specified one
  • Phase must be one of: requirements, github-setup, workflow-planning, execution, github-finalize, deployment, notification, summary

--only <phases>

  • Runs only specified phases (comma-separated)
  • Example: --only deployment,notification
  • Useful for testing specific phases in isolation

--skip <phases>

  • Skips specified phases (comma-separated)
  • Example: --skip github-setup,deployment
  • Continues with remaining phases

Monitoring and Testing Options

--status

  • Shows current pipeline status and exits
  • Displays phase completion status
  • Shows last error if pipeline failed
  • No execution occurs

--dry-run

  • Shows what would be executed without running
  • Displays phase execution plan
  • Useful for verifying command-line options
  • No actual execution occurs

--verbose

  • Shows detailed output from all agents
  • Includes full agent responses
  • Useful for debugging agent issues
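
How the selective-execution flags might combine can be sketched with `argparse`. This is an assumption-laden illustration: the helper names `build_parser` and `select_phases` are hypothetical, and the precedence shown (apply `--start-from`, then `--only`, then `--skip`) is one plausible reading of the options above, not confirmed behavior of orchestrator.py.

```python
import argparse

# Phase names as listed in this document, in execution order.
PHASES = ["requirements", "github-setup", "workflow-planning", "execution",
          "github-finalize", "deployment", "notification", "summary"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="orchestrator.py")
    parser.add_argument("--start-from", choices=PHASES)
    parser.add_argument("--only", default="", help="comma-separated phases")
    parser.add_argument("--skip", default="", help="comma-separated phases")
    return parser


def select_phases(args: argparse.Namespace) -> list:
    """Return the phases to run, in pipeline order."""
    only = [p for p in args.only.split(",") if p]
    skip = {p for p in args.skip.split(",") if p}
    unknown = (set(only) | skip) - set(PHASES)
    if unknown:
        raise ValueError(f"unknown phases: {sorted(unknown)}")
    phases = list(PHASES)
    if args.start_from:
        # Drop everything before the requested starting phase.
        phases = phases[phases.index(args.start_from):]
    if only:
        phases = [p for p in phases if p in only]
    return [p for p in phases if p not in skip]


args = build_parser().parse_args(["--start-from", "execution", "--skip", "deployment"])
print(select_phases(args))
```

Filtering against the ordered `PHASES` list keeps the result in pipeline order regardless of the order in which phases are typed on the command line.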

Execution Pipelines

The workflow-orchestrator uses intelligent reasoning to create execution plans:

Principle-Based Decision Making

Instead of rigid pipelines, the orchestrator considers:

  • Quality Principles: Test interactive features, validate before deploying
  • Efficiency Principles: Parallelize independent work
  • User Experience: Include browser testing for UI/UX critical features
  • Deployment Strategy: Deploy after validation passes

Example Execution Plan Structure

{
  "pipeline_type": "static_website",
  "reasoning": "User wants a simple landing page with interactive elements",
  "execution_stages": [
    {
      "stage": 1,
      "description": "Content and Design",
      "parallel": true,
      "agents": ["content-strategist", "ui-ux-designer"],
      "rationale": "Independent tasks that can run simultaneously"
    },
    {
      "stage": 2,
      "description": "Implementation",
      "parallel": false,
      "agents": ["static-html-generator"],
      "rationale": "Needs both content and design complete"
    },
    {
      "stage": 3,
      "description": "Browser Testing",
      "parallel": false,
      "agents": ["browser-functionality-validator"],
      "rationale": "Interactive elements need browser validation"
    }
  ],
  "deployment": {
    "required": true,
    "agent": "vercel-deployer",
    "rationale": "User needs production deployment"
  },
  "notification": {
    "required": true,
    "agent": "notification-agent",
    "rationale": "SMS notification for project completion"
  }
}

The orchestrator reasons about each project individually, considering:

  • Technology choice (static HTML vs Next.js)
  • Interactive elements requiring browser testing
  • Parallelization opportunities
  • Deployment needs
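
A plan of this shape could be walked stage by stage, running agents concurrently only where a stage sets `"parallel": true`. The sketch below is an assumption: `run_agent` is a hypothetical stand-in for the real agent invocation, and `run_plan` is not the orchestrator's actual function.

```python
import asyncio


async def run_agent(name: str) -> str:
    """Hypothetical stand-in for invoking one agent; returns its name."""
    await asyncio.sleep(0)  # yield control, as a real invocation would
    return name


async def run_plan(plan: dict) -> list:
    """Run stages in order; agents within a parallel stage run concurrently."""
    finished = []
    for stage in sorted(plan["execution_stages"], key=lambda s: s["stage"]):
        if stage.get("parallel"):
            # gather() returns results in the order the agents were listed.
            finished += await asyncio.gather(*(run_agent(a) for a in stage["agents"]))
        else:
            for agent in stage["agents"]:
                finished.append(await run_agent(agent))
    # Deployment and notification are separate top-level fields in the plan.
    for step in ("deployment", "notification"):
        section = plan.get(step, {})
        if section.get("required"):
            finished.append(await run_agent(section["agent"]))
    return finished


plan = {
    "execution_stages": [
        {"stage": 1, "parallel": True, "agents": ["content-strategist", "ui-ux-designer"]},
        {"stage": 2, "parallel": False, "agents": ["static-html-generator"]},
    ],
    "deployment": {"required": True, "agent": "vercel-deployer"},
    "notification": {"required": True, "agent": "notification-agent"},
}
print(asyncio.run(run_plan(plan)))
```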

Important Implementation Notes

Artifact Access in Ephemeral Storage

  • Problem: Requirements analyst runs BEFORE GitHub clones repos
  • Solution: Use githubRepository field in requirements.json with exact repo name from VOICE_CONTEXT.md
  • Flow: Requirements analyst reads VOICE_CONTEXT.md → Extracts GitHub field → Sets githubRepository → GitHub manager clones that exact repo → All agents have artifacts
  • Key: Requirements analyst CANNOT read artifact files directly (they don't exist locally yet)

Agent Communication

  • Agents DO NOT communicate directly
  • All communication via JSON artifacts in project-specific directories
  • Each agent runs in isolated context via Task tool
  • Orchestrator passes project directory path to each agent
  • Agent names must match exactly (e.g., voice-requirements-analyst not requirements-analyst)

Path Handling

  • Agents receive explicit project directory from orchestrator
  • All artifacts read/written to {project_dir}/artifacts/
  • Webhook data remains at global webhooks/latest_raw.json
  • Static HTML output goes to {project_dir}/src/index.html
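
The artifact convention above can be illustrated with a few small helpers. The names `artifact_path`, `write_artifact`, and `read_artifact` are hypothetical; only the `{project_dir}/artifacts/` layout comes from this document.

```python
import json
from pathlib import Path


def artifact_path(project_dir, name: str) -> Path:
    """Resolve an artifact file under {project_dir}/artifacts/."""
    return Path(project_dir) / "artifacts" / name


def write_artifact(project_dir, name: str, data: dict) -> Path:
    """Write one agent's output as JSON for downstream agents to read."""
    path = artifact_path(project_dir, name)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return path


def read_artifact(project_dir, name: str) -> dict:
    """Read an upstream agent's JSON artifact."""
    return json.loads(artifact_path(project_dir, name).read_text(encoding="utf-8"))
```

Passing `project_dir` explicitly to every call mirrors the rule that the orchestrator never changes its working directory.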

Error Handling

  • JSON validation using validate_json_file() after each agent
  • Automatic retry with exponential backoff (up to 3 attempts)
  • State preserved in state.json even on failure
  • Clear error logging with timestamps
  • Retry logic includes 2-second pause between attempts
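
A minimal sketch of validate-then-retry, assuming a fixed pause between attempts. The document names `validate_json_file()` but does not show its signature, so the one below is a guess; `run_with_retry` and its parameters are hypothetical.

```python
import json
import time
from pathlib import Path
from typing import Callable


def validate_json_file(path: Path) -> bool:
    """Return True if the file exists and parses as JSON (assumed signature)."""
    try:
        json.loads(path.read_text(encoding="utf-8"))
        return True
    except (OSError, ValueError):  # missing/unreadable file or invalid JSON
        return False


def run_with_retry(run_agent: Callable[[], None], output: Path,
                   attempts: int = 3, pause_seconds: float = 2.0) -> bool:
    """Run an agent until its output validates, up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        # Remove stale output so an earlier run cannot mask a failure.
        output.unlink(missing_ok=True)
        try:
            run_agent()
            if validate_json_file(output):
                return True
            print(f"attempt {attempt}: {output.name} missing or invalid")
        except Exception as exc:
            print(f"attempt {attempt} raised: {exc}")
        if attempt < attempts:
            time.sleep(pause_seconds)
    return False
```

The pause is skipped after the final attempt so a failing agent reports its failure without an extra delay.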

Recent Fixes Applied

  1. Agent invocation: Now properly uses Task tool with subagent_type parameter
  2. Path confusion: All agents updated with clear path instructions
  3. JSON validation: Added validation function and retry mechanism
  4. Security: Moved from hardcoded credentials to environment variables
  5. Static output: Created static-html-generator agent for immediate viewable results
  6. Dependencies: Added typing_extensions to requirements.txt
  7. Agent location: Agents MUST be in .claude/agents/ at project root for discovery
  8. Working directory: Orchestrator must NOT change cwd - Claude needs to stay at project root to find agents
  9. Filename consistency: Fixed scaffold_report.json naming (was scaffolding_report.json mismatch)
  10. File naming enforcement: Added explicit filename instructions and auto-correction for variant names
  11. Browser validation: Added browser-functionality-validator agent for UI/UX testing with self-healing
  12. Playwright MCP integration: Configured MCP servers in orchestrator for real browser automation
  13. Parallel execution fix: Resolved concurrent client access error by creating isolated clients for parallel agents
  14. Vercel protection: Auto-disables deployment protection for immediate site access
  15. Resume capability: Added comprehensive phase management for pipeline recovery
  16. Double notification prevention: Separated deployment and notification into dedicated fields in execution_plan.json

Data Flow

  1. Git Repository: Each project lives in its own git repo
  2. Artifacts Directory: Central hub for inter-agent communication
  3. JSON Format: Structured data exchange between agents
  4. State Management: state.json tracks pipeline progress
  5. Version Control: Complete history via git commits
  6. GitHub Backup: All code pushed to GitHub repositories

Prompt Engineering Guidelines

Prompt Template Structure

# [Agent Name] Agent

You are a [role] agent in the Voice Vibe Coding system.

## Context
You are part of a pipeline that transforms voice conversations into deployed web applications.

## Your Specific Task
[Clear description of what this agent does]

## Inputs
- [List of files/artifacts this agent reads]

## Expected Outputs
- [List of files/artifacts this agent produces]

## Guidelines
1. Make reasonable assumptions when information is ambiguous
2. Document every assumption in decisions.log with reasoning
3. Optimize for beautiful, functional, modern web applications
4. Prioritize user experience and accessibility

## Technical Constraints
- Use Next.js/React for frontend
- Use Tailwind CSS for styling
- Ensure mobile responsiveness
- Follow modern web best practices

[Specific instructions for this agent's task]

Key Prompt Principles

  1. Autonomous Decision Making: Agents should make reasonable assumptions rather than fail
  2. Documentation: Every decision and assumption must be logged
  3. Quality Focus: Prioritize beautiful, functional output
  4. Error Recovery: Agents should attempt to fix issues before reporting failure
  5. Context Awareness: Each agent knows its role in the larger pipeline

Implementation Phases

Phase 1: Foundation (Current)

  • Set up basic orchestrator script
  • Create initial prompt templates
  • Test with matchmaking website example
  • Establish artifact structure

Phase 2: Refinement

  • Optimize prompts based on results
  • Add error handling to orchestrator
  • Implement parallel execution where possible
  • Add progress monitoring

Phase 3: Enhancement

  • Add more specialized agents
  • Support multiple framework options
  • Implement advanced deployment options
  • Add testing agents

Phase 4: Scale

  • Support backend development
  • Add database agents
  • Implement API builders
  • Enable full-stack applications

Testing Strategy

Test Case 1: Matchmaking Website

Using the existing transcript in webhooks/latest_raw.json:

  • Modern, elegant design for Southeast Asian professionals
  • Lead generation focus
  • Email capture and consultation scheduling
  • Instagram integration
  • Testimonials section

Success Criteria

  1. Requirements correctly extracted from conversation
  2. Design matches described aesthetic
  3. All mentioned features implemented
  4. Successfully deploys to Vercel
  5. SMS notification sent with URL

Future Expansions

Planned Agent Additions

  • Backend Developer Agent: API and server logic
  • Database Architect Agent: Schema design and setup
  • API Integration Agent: Third-party service connections
  • Testing Agent: Automated test generation
  • SEO Optimizer Agent: Search optimization
  • Performance Agent: Speed and optimization
  • Security Agent: Security best practices
  • Analytics Agent: Add tracking and analytics

Supported Project Types (Roadmap)

  1. Current: Static websites, landing pages
  2. Next: Full-stack web applications
  3. Future: Mobile apps, APIs, microservices

Configuration

Environment Variables

ANTHROPIC_API_KEY=xxx          # Required for Claude Code SDK
ANTHROPIC_MODEL=opus           # Model to use (opus recommended)
TWILIO_ACCOUNT_SID=xxx         # For SMS notifications
TWILIO_AUTH_TOKEN=xxx          # For SMS authentication
TWILIO_PHONE_NUMBER=xxx        # Your Twilio phone number
VERCEL_TOKEN=xxx               # For Vercel deployment
ELEVENLABS_API_KEY=xxx         # For webhook integration
GITHUB_TOKEN=xxx               # Personal access token for GitHub
GITHUB_USERNAME=xxx            # Your GitHub username
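
A startup check for these variables might look like the sketch below. Treating every credential as required is an assumption (the document marks only ANTHROPIC_API_KEY as required), and `missing_env_vars` is a hypothetical helper.

```python
import os

# Credential variables named in this document. Treating all of them as
# required is an assumption made for this sketch.
REQUIRED_VARS = [
    "ANTHROPIC_API_KEY", "TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN",
    "TWILIO_PHONE_NUMBER", "VERCEL_TOKEN", "ELEVENLABS_API_KEY",
    "GITHUB_TOKEN", "GITHUB_USERNAME",
]


def missing_env_vars(env=None) -> list:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```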

Requirements

  • Python 3.10+
  • Claude Code SDK (pip install claude-code-sdk)
  • All dependencies in requirements.txt
  • Sufficient API credits for complex projects
  • Node.js and npx (for Playwright MCP)
  • Playwright MCP installed: claude mcp add playwright -- npx -y @playwright/mcp@latest

Monitoring & Logging

State Tracking

Each project maintains enhanced state.json with phase management:

{
  "project_id": "1555555xxxx-project-name",
  "project_path": "/path/to/project",
  "current_phase": "execution",
  "last_successful_phase": "workflow-planning",
  "phases_completed": {
    "requirements": {
      "time": "2025-01-13T10:00:00Z",
      "artifacts": ["requirements.json"]
    },
    "github-setup": {
      "time": "2025-01-13T10:05:00Z",
      "artifacts": ["github_setup.json"]
    }
  },
  "phases_failed": {
    "execution": {
      "time": "2025-01-13T10:10:00Z",
      "error": "Concurrent client access error"
    }
  },
  "completed_agents": ["voice-requirements-analyst", "github-manager"],
  "errors": [
    {
      "time": "2025-01-13T10:10:00Z",
      "phase": "execution",
      "error": "RuntimeError: read() called while another coroutine..."
    }
  ],
  "start_time": "2025-01-13T10:00:00Z",
  "last_updated": "2025-01-13T10:10:00Z"
}
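
One way `--resume` could pick its starting point from this file is to continue with the phase after `last_successful_phase`. This is a sketch under that assumption; `resume_phase` is a hypothetical helper, not the orchestrator's actual logic.

```python
from typing import Optional

# Phase names as listed in this document, in execution order.
PHASES = ["requirements", "github-setup", "workflow-planning", "execution",
          "github-finalize", "deployment", "notification", "summary"]


def resume_phase(state: dict) -> Optional[str]:
    """Return the phase to resume from, or None if the pipeline finished."""
    last = state.get("last_successful_phase")
    if last is None:
        return PHASES[0]  # nothing succeeded yet: start from the beginning
    nxt = PHASES.index(last) + 1
    return PHASES[nxt] if nxt < len(PHASES) else None


print(resume_phase({"last_successful_phase": "workflow-planning"}))
```

For the sample state above, this resumes at `execution`, the phase recorded under `phases_failed`.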

Decision Logging

All assumptions logged to decisions.log:

[2024-01-07 10:15:23] UI_DESIGNER: Assumed primary color #6366F1 based on "elegant and modern" description
[2024-01-07 10:16:45] DEVELOPER: Used Next.js 14 with App Router for better performance
[2024-01-07 10:18:12] CONTENT: Generated placeholder testimonials as none provided
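
A line in this format could be produced by a small formatter such as the one below. The helper name `format_decision` is hypothetical; the layout is inferred from the sample entries above.

```python
from datetime import datetime


def format_decision(agent: str, message: str, when: datetime) -> str:
    """Format one decisions.log entry: [timestamp] AGENT: message."""
    return f"[{when:%Y-%m-%d %H:%M:%S}] {agent.upper()}: {message}"


print(format_decision("content", "Generated placeholder testimonials as none provided",
                      datetime(2024, 1, 7, 10, 18, 12)))
```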

Ephemeral Storage Details

Cleanup Behavior

  • Temp workspace deleted only on successful completion
  • Failed pipelines leave temp directories for debugging
  • Manual cleanup may be needed: rm -rf projects/.temp/
  • Cleanup triggered by cleanup_ready flag in orchestrator
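
The delete-on-success, keep-on-failure behavior can be sketched as a context manager. This illustrates the behavior only: the orchestrator itself uses a `cleanup_ready` flag, and `temp_workspace` and its parameters are hypothetical.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temp_workspace(temp_root: Path, caller: str, timestamp: str):
    """Yield {temp_root}/{caller}/{timestamp}/ and delete it only on success."""
    workspace = Path(temp_root) / caller / timestamp
    workspace.mkdir(parents=True, exist_ok=True)
    yield workspace
    # Reached only if the with-block finished without raising; on failure
    # the exception propagates from the yield and the directory is kept
    # for debugging.
    shutil.rmtree(workspace)
```

Usage would wrap the whole build-and-push step, for example `with temp_workspace(Path("projects/.temp"), caller, stamp) as ws: ...`, so that any exception before the GitHub push leaves the workspace in place.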

Disk Usage Management

# Check temp directory usage
du -sh projects/.temp/

# Clean up old temp directories
find projects/.temp/ -mindepth 2 -maxdepth 2 -type d -mtime +7 -exec rm -rf {} +

# View active pipelines
ls -la projects/.temp/*/

# Clean up legacy archive (2.1GB) if it exists
# The .archive/ directory contains old timestamped projects from before ephemeral storage
rm -rf projects/.archive  # Safe to delete after verifying GitHub repos exist

Summary

Voice Vibe Coding lets users build applications by describing them in a phone conversation. With the Claude Code SDK as the agent engine, a simple system transforms ideas into deployed applications in minutes rather than days.

The system is designed to be:

  • Simple: Python orchestrator with Claude Code SDK
  • Robust: JSON validation and automatic retry logic
  • Extensible: Easy to add new agents via markdown files
  • Transparent: All decisions documented in logs
  • Reliable: Built on proven Claude Code infrastructure
  • Scalable: From static HTML to full applications

Key Technical Decisions Made

  1. Git-based architecture: Replaced timestamped folders with proper git repositories
  2. Intelligent pipeline selection: AI reasons about optimal execution strategy
  3. GitHub integration: All projects backed up to GitHub with meaningful commits
  4. Multi-project support: Users can update existing projects naturally
  5. Agent isolation: Each agent runs in separate context via Task tool
  6. Artifact-based communication: JSON files for inter-agent data exchange
  7. Retry mechanism: 3 attempts with validation for robustness
  8. Environment variables: All credentials secured via .env
  9. Ephemeral storage: Projects exist locally only during build, then deleted
  10. Privacy-first multi-tenancy: All resources use anonymized user IDs for isolation
  11. GitHub as source of truth: No local persistence, everything in GitHub
  12. VOICE_CONTEXT.md: Conversational project descriptions for voice agent
  13. githubRepository field: Exact repo names from VOICE_CONTEXT.md
  14. Vercel deployment: Automatic deployment with protection disabled
  15. Principle-based orchestration: AI uses judgment, not rigid rules
  16. Parallel execution: Stages can run agents simultaneously with isolated clients
  17. Pipeline resume: Comprehensive phase management for failure recovery
  18. SMS notifications: Personalized messages via Twilio after deployment
  19. Command-line control: Fine-grained execution control via arguments
  20. Phase tracking: Detailed state management for progress visibility
  21. Privacy protection: Phone numbers mapped to random user IDs, never exposed in URLs

What's Working Now

  • Complete pipeline from transcript to deployed application
  • Multi-project support with natural language updates
  • Git-based version control for all projects
  • GitHub integration with automatic repository creation
  • Dynamic pipeline selection (only run necessary agents)
  • Requirements extraction from voice conversations
  • Content generation with proper tone/voice
  • Design system creation with colors, typography, layouts
  • Full Next.js/React application generation
  • Real browser testing with Playwright MCP
  • Self-healing UI/UX validation that auto-fixes issues
  • Static HTML generation with inline CSS
  • Comprehensive state tracking with phase management
  • Pipeline resume capability after failures
  • Parallel agent execution for improved performance
  • Vercel deployment with automatic protection disabling
  • SMS notifications via Twilio after deployment
  • Command-line controls for selective execution
  • Dry-run mode for execution preview
  • Status checking for pipeline progress
  • Repository name resolution via VOICE_CONTEXT.md

Performance Optimization

🚨 CRITICAL: The pipeline currently takes 33.6 minutes to complete. See PERFORMANCE_OPTIMIZATION.md for detailed analysis and optimization plan to reduce this to 6-10 minutes.

Quick Wins Available:

  • Switch 6 agents from Opus to Sonnet (50% speed improvement)
  • Reduce max_turns from 50 to 20 (10% improvement)
  • Optimize agent prompts (20-30% improvement)

Next Steps

  1. URGENT: Implement performance optimizations from PERFORMANCE_OPTIMIZATION.md
  2. Immediate: Full end-to-end testing with live phone calls
  3. Short-term: Optimize parallel execution for more agent combinations
  4. Medium-term: Add backend support with database agents
  5. Long-term: Support for mobile apps and microservices
  6. Future: AI-powered error recovery and self-healing pipelines

Multi-Project Support & Iterative Updates

Overview

Users can call back to modify existing projects. The system maintains context across calls using VOICE_CONTEXT.md files and git repositories, enabling natural conversations like "make the buttons blue on my auction site."

Core Architecture

User Calls → Pre-webhook (context) → Conversation → Post-webhook (transcript) →
Requirements → GitHub Setup → Workflow Planning → Dynamic Pipeline → GitHub Commit → Deploy

The workflow-orchestrator agent intelligently decides which agents to run, while the github-manager handles all version control operations.

Implementation Components

1. Voice Context File (VOICE_CONTEXT.md)

Each phone number has a file at projects/{phone_number}/VOICE_CONTEXT.md containing conversational descriptions of that caller's projects. CRITICAL: Each project entry includes a GitHub: field with the exact repository name, so that later steps never have to guess the name:

# Your Projects

## Bred Auto Auction (Most Recent)
**What it is:** A professional website for luxury car auctions specializing in exotic vehicles. Perfect for serious dealerships looking for rare finds.

**GitHub:** 1555555xxxx-bred-auto-auction

**Recent updates:** Yesterday you asked me to add a new RSVP system for verified dealers only.

**Look and feel:** Bold reds and sleek blacks create a sense of speed and luxury. Very professional but with enough flair to match the exotic cars.

**Original request:** "Create a car auction site for dealerships to find exotic cars"

## Mystery Tech Landing
**What it is:** A futuristic landing page for your tech company with mysterious particle effects floating in the background.

**GitHub:** 1555555xxxx-mystery-tech

**Look and feel:** Dark and mysterious with glowing neon accents, like Blade Runner meets Silicon Valley.
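
A minimal sketch of how the GitHub: field could be read back out of a file in this layout. The function name `parse_voice_context` and the returned dictionary shape are hypothetical; the sketch assumes each project starts with a `## ` heading and carries one `**GitHub:**` line, as in the example above.

```python
import re


def parse_voice_context(markdown_text):
    """Return one {"name", "github"} dict per '## ' project heading."""
    projects = []
    current = None
    for line in markdown_text.splitlines():
        heading = re.match(r"^##\s+(.*)$", line)
        if heading:
            # Drop a trailing "(Most Recent)" marker from the heading text
            name = re.sub(r"\s*\(Most Recent\)\s*$", "", heading.group(1)).strip()
            current = {"name": name, "github": None}
            projects.append(current)
            continue
        field = re.match(r"^\*\*GitHub:\*\*\s*(\S+)", line)
        if field and current is not None:
            current["github"] = field.group(1)
    return projects


sample = """# Your Projects

## Bred Auto Auction (Most Recent)
**What it is:** A professional website for luxury car auctions.

**GitHub:** 1555555xxxx-bred-auto-auction

## Mystery Tech Landing
**GitHub:** 1555555xxxx-mystery-tech
"""

for project in parse_voice_context(sample):
    print(project["name"], "->", project["github"])
```

Reading the repository name from the file, rather than deriving it from the project title, is what keeps the name exact.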

2. ElevenLabs Pre-Call Webhook Integration

Critical Discovery: ElevenLabs supports pre-call webhooks that inject context BEFORE the conversation starts.

How it works:

  1. User calls Twilio number
  2. ElevenLabs requests pre-call context from our webhook with: {"caller_id": "+1234567890", "agent_id": "...", ...}
  3. Webhook reads VOICE_CONTEXT.md and returns its contents as a dynamic variable
  4. Agent has full context before user speaks

webhook_server.py update (add to existing file):

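# Imports used by this endpoint (skip any already present in webhook_server.py)
import os
import re

from flask import jsonify, request

@app.route('/precall', methods=['POST'])
def precall_context():
    """Pre-call webhook: provides project context BEFORE the conversation starts."""
    # Tolerate a missing or non-JSON body instead of raising
    payload = request.get_json(silent=True) or {}
    # Keep digits only: drops the leading '+' and blocks path traversal via caller_id
    caller_id = re.sub(r'\D', '', str(payload.get('caller_id', '')))
    context_file = f"projects/{caller_id}/VOICE_CONTEXT.md"

    # Fall back to a neutral message for first-time callers
    content = "No projects yet."
    if caller_id and os.path.exists(context_file):
        with open(context_file, encoding="utf-8") as f:
            content = f.read()

    return jsonify({
        "dynamic_variables": {
            "projects_context": content
        }
    })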

ElevenLabs Configuration:

  1. In ElevenLabs dashboard, set pre-call webhook URL: https://yourserver.com/precall
  2. Update agent prompt to include: {{projects_context}}

3. Requirements Analyst Updates

The requirements analyst now detects whether this is a new project or an update to an existing one.

Key additions to voice-requirements-analyst.md:

  • Reads VOICE_CONTEXT.md to check existing projects
  • Extracts exact GitHub repository name from GitHub: field
  • Matches conversation to correct project based on descriptions
  • Cannot read artifact files (ephemeral storage)
  • Outputs isUpdate: true and githubRepository with exact repo name (no guessing)
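
The matching itself is done by the agent from the project descriptions. As a rough illustration of the output contract only, the sketch below uses a naive word-overlap heuristic in place of that reasoning. The `resolve_project` helper, the `min_overlap` parameter, and the use of `None` for a new project are assumptions made for the example.

```python
import json
import re


def words(text):
    """Lower-case alphabetic tokens of a string, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def resolve_project(transcript, projects, min_overlap=1):
    """Pick the known project whose name shares the most words with the transcript."""
    spoken = words(transcript)
    best, best_score = None, 0
    for project in projects:
        score = len(words(project["name"]) & spoken)
        if score > best_score:
            best, best_score = project, score
    if best is None or best_score < min_overlap:
        # Nothing matched: treat the call as a request for a new project
        return {"isUpdate": False, "githubRepository": None}
    # Copy the repository name verbatim from the GitHub: field (no guessing)
    return {"isUpdate": True, "githubRepository": best["github"]}


known = [
    {"name": "Bred Auto Auction", "github": "1555555xxxx-bred-auto-auction"},
    {"name": "Mystery Tech Landing", "github": "1555555xxxx-mystery-tech"},
]
update = resolve_project("make the buttons blue on my auction site", known)
fresh = resolve_project("build me a recipe blog", known)
print(json.dumps(update))
print(json.dumps(fresh))
```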

4. Workflow Orchestrator Agent (Central Intelligence)

The workflow-orchestrator agent decides which agents to run based on requirements. This eliminates complex if/else logic from orchestrator.py.

Decision examples:

  • New landing page → Full pipeline
  • "Change text" → Just component-builder + validator
  • "Make it blue" → Just styling-finisher + validator
  • "Add contact form" → Content, architect, builder, assembler, styling, validator

Output (execution_plan.json):

{
  "pipeline_type": "update_content",
  "reason": "User only wants text changes",
  "agents": [
    {"name": "component-builder", "purpose": "Update text in components"},
    {"name": "nextjs-validator", "purpose": "Ensure build succeeds"}
  ]
}
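
A minimal sketch of how orchestrator.py might consume this file, assuming the three top-level keys shown above. The `load_plan` helper and the `KNOWN_AGENTS` set are hypothetical; the set lists only agent names that appear in this document and would need to match the real agent registry.

```python
import json

# Illustrative subset of agent names mentioned in this document (assumption)
KNOWN_AGENTS = {
    "component-builder",
    "styling-finisher",
    "content-strategist",
    "nextjs-validator",
}


def load_plan(raw_json):
    """Parse an execution plan and return the agent names in run order."""
    plan = json.loads(raw_json)
    for key in ("pipeline_type", "reason", "agents"):
        if key not in plan:
            raise ValueError(f"execution plan is missing '{key}'")
    names = [agent["name"] for agent in plan["agents"]]
    unknown = [name for name in names if name not in KNOWN_AGENTS]
    if unknown:
        # Fail before running anything rather than midway through the pipeline
        raise ValueError(f"unknown agents in plan: {unknown}")
    return names


sample_plan = """
{
  "pipeline_type": "update_content",
  "reason": "User only wants text changes",
  "agents": [
    {"name": "component-builder", "purpose": "Update text in components"},
    {"name": "nextjs-validator", "purpose": "Ensure build succeeds"}
  ]
}
"""
print(load_plan(sample_plan))
```

Validating the plan up front keeps a misspelled agent name from surfacing only after earlier agents have already run.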

5. Git-Based Orchestrator Flow

# Step 1: Requirements analyst (always runs)
# Step 2: GitHub manager sets up workspace (new repo or pull latest)
# Step 3: Workflow orchestrator plans execution
# Step 4: Execute only necessary agents
# Step 5: GitHub manager commits and pushes changes
# Step 6: Generate summary

No more timestamped folders - proper version control!
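
The six steps can be sketched as a single function that takes each step as a callable. Every name here (`run_flow`, its parameters, the `update_style` pipeline type, and the stub lambdas) is a hypothetical stand-in for the real agents and git operations, chosen so the sketch runs without the SDK or network access.

```python
def run_flow(analyze, setup_workspace, plan, run_agent, commit_and_push):
    """Run the six steps in order; every callable is a hypothetical stand-in."""
    requirements = analyze()                      # Step 1: always runs
    workspace = setup_workspace(requirements)     # Step 2: new repo or pull latest
    execution_plan = plan(requirements)           # Step 3: choose agents
    agents_run = []
    for agent in execution_plan["agents"]:        # Step 4: only necessary agents
        run_agent(agent["name"], workspace)
        agents_run.append(agent["name"])
    commit_and_push(workspace)                    # Step 5: persist to GitHub
    return {                                      # Step 6: summary
        "repository": requirements["githubRepository"],
        "pipeline_type": execution_plan["pipeline_type"],
        "agents_run": agents_run,
    }


# Stub wiring so the sketch runs without git, network access, or the SDK
log = []
summary = run_flow(
    analyze=lambda: {"isUpdate": True, "githubRepository": "1555555xxxx-mystery-tech"},
    setup_workspace=lambda req: f"/tmp/{req['githubRepository']}",
    plan=lambda req: {
        "pipeline_type": "update_style",
        "agents": [{"name": "styling-finisher"}, {"name": "nextjs-validator"}],
    },
    run_agent=lambda name, ws: log.append(("agent", name)),
    commit_and_push=lambda ws: log.append(("push", ws)),
)
print(summary)
```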

6. Agent Update Modes

Key agents support "update mode" when src/ already exists:

  • component-builder: Makes surgical edits instead of regenerating
  • styling-finisher: Updates only specified styles
  • content-strategist: Merges new content with existing
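
One way the mode switch could be detected, assuming it depends only on whether a src/ directory exists in the project workspace. The `agent_mode` helper and the "create" label for the non-update case are hypothetical names.

```python
import tempfile
from pathlib import Path


def agent_mode(project_dir):
    """Return "update" when src/ already exists in the workspace, else "create"."""
    return "update" if (Path(project_dir) / "src").is_dir() else "create"


with tempfile.TemporaryDirectory() as workspace:
    mode_before = agent_mode(workspace)   # fresh workspace, no src/ yet
    (Path(workspace) / "src").mkdir()
    mode_after = agent_mode(workspace)    # src/ now exists

print(mode_before, mode_after)
```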

Implementation Timeline

Phase 1 (2-3 hours):

  1. Update webhook_server.py with precall endpoint
  2. Simplify orchestrator.py to use workflow orchestrator
  3. Enable and configure workflow-orchestrator.md
  4. Update requirements analyst for multi-project
  5. Add update mode to component-builder
  6. Test complete flow

Why This Approach is Superior

  1. Simplest Solution: One markdown file for context, one agent for decisions
  2. Natural UX: Agent greets with "I see you have your auction site..."
  3. Efficient Updates: Only runs necessary agents
  4. Easily Extensible: Add new agents by updating one prompt
  5. Self-Documenting: execution_plan.json shows exactly what will run

Testing Checklist

  • New user creates first project
  • User with 1 project makes update
  • User with multiple projects - correct matching
  • Content-only changes skip design pipeline
  • Style changes preserve content
  • New features added without full rebuild

Usage Examples

# Run the orchestrator (analyzes requirements and chooses pipeline)
python orchestrator.py

# Resume after failure
python orchestrator.py --resume

# Run only specific phases
python orchestrator.py --only deployment
python orchestrator.py --start-from execution

# Skip phases that are already done
python orchestrator.py --skip requirements,github-setup

# Check pipeline status
python orchestrator.py --status

# Preview what would run
python orchestrator.py --dry-run

# Force re-run everything
python orchestrator.py --force

# After generation, projects are in git repositories:
cd projects/banana-delivery  # or any project name
git log                      # See commit history
git status                   # Check current state

# Run Next.js applications:
cd projects/banana-delivery/src
npm install
npm run dev
# Open http://localhost:3000

# View static HTML:
open projects/banana-delivery/src/index.html
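
A sketch of how these flags could map to phase selection with argparse. The flag names come from the commands above (plus --verbose, used in the testing section); the `PHASES` list contains only the phase names that appear in this document, and the precedence rules in `select_phases` are assumptions rather than the orchestrator's confirmed behavior.

```python
import argparse

# Phase names seen in this document; the real pipeline may define more (assumption)
PHASES = ["requirements", "github-setup", "execution", "deployment"]


def comma_list(value):
    """Split 'a,b,c' into ['a', 'b', 'c'], ignoring empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser():
    parser = argparse.ArgumentParser(prog="orchestrator.py")
    parser.add_argument("--resume", action="store_true")
    parser.add_argument("--only", type=comma_list, default=[])
    parser.add_argument("--start-from", choices=PHASES)
    parser.add_argument("--skip", type=comma_list, default=[])
    parser.add_argument("--status", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--force", action="store_true")
    parser.add_argument("--verbose", action="store_true")
    return parser


def select_phases(args, completed=()):
    """Return the phases to run, in pipeline order."""
    phases = list(PHASES)
    if args.start_from:
        phases = phases[PHASES.index(args.start_from):]
    if args.only:
        phases = [p for p in phases if p in args.only]
    phases = [p for p in phases if p not in args.skip]
    if args.resume and not args.force:
        # Resume skips phases already recorded as completed
        phases = [p for p in phases if p not in completed]
    return phases


parser = build_parser()
print(select_phases(parser.parse_args(["--skip", "requirements,github-setup"])))
print(select_phases(parser.parse_args(["--resume"]), completed=["requirements"]))
```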

Testing with Sample Data

Using Existing Webhook Data

The repository includes sample webhook data in webhooks/latest_raw.json for testing. This data represents a voice conversation about building or updating a project.

Running Tests

Manual Test Run

# Process existing webhook data
python orchestrator.py

Automated Webhook Testing

# Start webhook server (auto-triggers orchestrator)
python webhook_server.py

# In another terminal, send test webhook
curl -X POST http://localhost:5001/elevenlabs-webhook \
  -H "Content-Type: application/json" \
  -d @webhooks/latest_raw.json

Test with Specific Scenarios

# Test resume after failure
python orchestrator.py --dry-run  # Preview
python orchestrator.py --only requirements,github-setup  # Partial run
# Simulate failure, then:
python orchestrator.py --resume  # Continue from failure

# Test deployment only
python orchestrator.py --only deployment

# Test with verbose output
python orchestrator.py --verbose

Monitoring Test Execution

# Check pipeline status
python orchestrator.py --status

# Monitor logs in real-time
tail -f logs/pipeline_*.log

# Check webhook server status
curl http://localhost:5001/status

Expected Test Results

  1. Requirements Extraction: JSON file created in artifacts/
  2. GitHub Operations: Repository created/updated
  3. Agent Execution: All specified agents run successfully
  4. Build Validation: Application builds without errors
  5. Deployment: Vercel URL generated (if configured)
  6. Notification: SMS sent (if Twilio configured)