Voice Vibe Coding - Multi-Agent System Architecture

Project Overview

Voice Vibe Coding enables users to build and deploy software through natural phone conversations. Users call an ElevenLabs phone agent, describe what they want to build, and a multi-agent Claude Code system automatically transforms the conversation into a deployed web application.

Core Workflow

User calls ElevenLabs phone number
Natural conversation about desired application
Webhook captures transcript in JSON format
Multi-agent Claude Code pipeline processes transcript
Fully functional web app is built and deployed
User receives SMS with deployment URL

System Architecture

Key Principle: Claude Code SDK as Agent Engine

The system uses the Claude Code SDK (Python) to orchestrate specialized agents. Each agent is invoked via the Task tool with isolated context, enabling clean separation of concerns.

Input (webhook) → Orchestrator (Python/SDK) → Chain of Claude Code Agents → Generated HTML → Deployed App

Ephemeral Storage & Multi-Tenancy Architecture

The system implements zero-persistence local storage with privacy-first multi-tenancy to ensure:

Minimal disk usage: Projects only exist locally during pipeline execution
Complete user isolation: Anonymized user IDs prevent data collision
GitHub as source of truth: All code persisted in GitHub, not locally
Scalability: Local disk usage stays flat as users are added, so thousands of users can be supported

How It Works:

Phone Number Extraction: Every request includes the caller's phone number
Ephemeral Workspace: Projects are built in projects/.temp/{phone}/{timestamp}/
Anonymized Repos: All GitHub repos named vvc-{user_id}-{project-name}
Automatic Cleanup: Temp workspace deleted after successful GitHub push
GitHub-Centric Registry: Only tracks GitHub URLs, no local paths

Workflow:

Phone Call → Extract Phone → Map to User ID → Create Temp Workspace → Build Project → 
Push to GitHub (anonymized) → Delete Local → Update Registry (GitHub URL only)
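
The workspace lifecycle above can be sketched as a context manager. This is a minimal illustration, not the orchestrator's actual code: the helper name `ephemeral_workspace` and the timestamp format are assumptions, and the GitHub push is represented only by the body of the `with` block.

```python
import shutil
import time
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def ephemeral_workspace(phone: str, root: Path = Path("projects/.temp")):
    """Yield a temp build directory; delete it only if the body succeeds."""
    # Hypothetical layout helper: {root}/{phone}/{timestamp}/
    workspace = root / phone / time.strftime("%Y%m%d-%H%M%S")
    workspace.mkdir(parents=True, exist_ok=False)
    yield workspace
    # Reached only when the `with` body raised nothing, so a failed
    # pipeline keeps its directory for debugging.
    shutil.rmtree(workspace)
```

A caller would build and push inside the block, for example `with ephemeral_workspace(phone) as ws: ...`. If the build or push raises, the directory is left in place.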

Benefits:

No disk bloat: Local storage near-zero (only active pipelines)
No user collision: Each repo name carries the owner's anonymized user ID, so two users' projects cannot clash
Simple recovery: Everything recoverable from GitHub
Stateless processing: Local system is just a build server

Current Implementation Status (January 2025)

✅ Python orchestrator using Claude Code SDK
✅ Agent invocation via Task tool with proper subagent_type
✅ JSON validation after each agent with automatic retry (3 attempts)
✅ Static HTML generator agent created and working
✅ Path clarification added to all agents
✅ Environment variables for security (no hardcoded credentials)
✅ Git-based project management (no more timestamps!)
✅ GitHub Manager agent for repository operations
✅ State tracking with state.json
✅ Next.js/React pipeline fully implemented
✅ Dynamic pipeline selection via workflow orchestrator
✅ Browser automation via Playwright MCP integrated
✅ Real UI/UX testing with self-healing capabilities
✅ Multi-project support with ElevenLabs pre-call webhooks
✅ Iterative updates for existing projects
✅ Ephemeral storage with automatic cleanup
✅ Privacy-first multi-tenancy with anonymized user IDs
✅ Vercel deployment with automatic protection disabling
✅ Repository name matching via VOICE_CONTEXT.md GitHub field
✅ SMS notification agent integrated
✅ Pipeline resume capability with phase management
✅ Parallel agent execution with isolated clients
✅ Command-line controls for selective execution

Project Structure (Ephemeral Architecture)

voice-vibe-coding/
├── orchestrator.py              # Main Python orchestrator using Claude Code SDK
├── .claude/
│   └── agents/                 # Agent prompt files (MUST be here for discovery)
│       ├── voice-requirements-analyst.md  # Extracts requirements
│       ├── workflow-orchestrator.md       # Plans execution
│       ├── github-manager.md              # Manages git repositories (phone-aware)
│       ├── content-strategist.md          # Generates content
│       ├── ui-ux-designer.md              # Creates design specs
│       ├── static-html-generator.md       # Generates HTML output
│       ├── project-scaffolder.md          # Initializes Next.js project
│       ├── component-architect.md         # Designs component structure
│       ├── component-builder.md           # Builds React components
│       ├── page-assembler.md              # Assembles pages and routes
│       ├── styling-finisher.md            # Applies final styling
│       ├── nextjs-validator.md            # Validates Next.js app
│       ├── browser-functionality-validator.md  # Browser UI/UX testing
│       ├── vercel-deployer.md              # Deploys to Vercel from GitHub
│       └── notification-agent.md           # Sends SMS notifications via Twilio
├── webhooks/
│   └── latest_raw.json          # Latest ElevenLabs transcript
├── projects/
│   ├── .temp/                  # Ephemeral workspaces (auto-deleted)
│   │   └── {phone}/{timestamp}/ # Temporary build directory
│   ├── .project-registry.json  # GitHub-centric registry (no local paths)
│   ├── {phone}/                # Phone-based organization
│   │   ├── VOICE_CONTEXT.md    # Conversational context for voice agent
│   │   └── {project-name}/     # OPTIONAL: Local cache for active development
│   └── .archive/               # Legacy projects (to be removed)
├── webhook_server.py           # Receives ElevenLabs webhooks
├── twilio_messaging.py         # SMS notifications (uses env vars)
├── requirements.txt            # Python dependencies
├── .env.example               # Environment variable template
├── README.md                  # Setup and usage instructions
└── CLAUDE.md                  # This file (project context)

GitHub Repository Structure:
- github.com/breagent/vvc-{user_id}-{project-name}  # Anonymized user IDs
- Example: github.com/breagent/vvc-a7x9k2-banana-delivery
- Phone numbers mapped to random IDs for privacy
- Ensures complete user isolation without exposing personal info
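
One way the phone-to-ID mapping and repo naming could work is sketched below. The mapping file location, the 6-character ID length (chosen to resemble the `a7x9k2` example), and both function names are assumptions for illustration only.

```python
import json
import re
import secrets
import string
from pathlib import Path

ID_ALPHABET = string.ascii_lowercase + string.digits


def get_user_id(phone: str, mapping_file: Path) -> str:
    """Return the stored random ID for this phone, creating one on first use."""
    mapping = json.loads(mapping_file.read_text()) if mapping_file.exists() else {}
    if phone not in mapping:
        used = set(mapping.values())
        while True:
            candidate = "".join(secrets.choice(ID_ALPHABET) for _ in range(6))
            if candidate not in used:  # never hand the same ID to two phones
                break
        mapping[phone] = candidate
        mapping_file.write_text(json.dumps(mapping, indent=2))
    return mapping[phone]


def repo_name(user_id: str, project_name: str) -> str:
    """Build vvc-{user_id}-{project-name} with a URL-safe project slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", project_name.lower()).strip("-")
    return f"vvc-{user_id}-{slug}"
```

Because the ID is random rather than derived from the phone number, the repo name reveals nothing about the caller. The mapping file itself would need to stay private.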

Agent Definitions

1. Requirements Analyst Agent

Purpose: Extract structured requirements from voice transcript
Input: webhooks/latest_raw.json and VOICE_CONTEXT.md
Output: artifacts/requirements.json with githubRepository field
Key Tasks:

Parse conversation transcript
Check VOICE_CONTEXT.md for existing projects (no artifact access)
Determine if update or new project
Set githubRepository field for updates
Extract feature requirements
Document any ambiguities with assumptions

2. Workflow Orchestrator Agent

Purpose: Intelligently determine execution strategy using AI reasoning
Input: artifacts/requirements.json
Output: artifacts/execution_plan.json with stages and parallelization
Key Approach:

Uses principles, not rules
Reasons about what makes sense for THIS project
Determines what can run in parallel
Includes testing for interactive features
Explains all decisions for transparency

3. GitHub Manager Agent

Purpose: Handle all git and GitHub operations with privacy-first multi-tenancy
Input: Phone number + artifacts/requirements.json with githubRepository field
Output: artifacts/github_setup.json or artifacts/github_finalize.json
Key Tasks:

Extract phone number from task description
Map phone to anonymous user ID (or generate new one)
Check githubRepository field to determine new vs update
Clone existing repo if githubRepository present
Create new repo if no githubRepository
Use anonymized naming: vvc-{user_id}-{project-name}
Generate meaningful commits with user identification
Update VOICE_CONTEXT.md with conversational descriptions
Maintain GitHub-centric registry (no local paths)

4. Content Strategist Agent

Purpose: Extract and generate all text content
Input: artifacts/requirements.json
Output: artifacts/content.json
Key Tasks:

Extract content from transcript
Generate marketing copy
Create headlines and CTAs
Write placeholder content
Ensure tone consistency

5. UI/UX Designer Agent

Purpose: Create design specifications
Input: artifacts/requirements.json, artifacts/content.json
Output: artifacts/design_specs.json
Key Tasks:

Define color palette
Choose typography
Create layout specifications
Define component styling
Specify responsive breakpoints
Document interaction patterns

6. Component Architect Agent

Purpose: Plan component structure
Input: artifacts/requirements.json, artifacts/design_specs.json
Output: artifacts/components_map.json
Key Tasks:

Break UI into components
Define component hierarchy
Specify props and state
Plan data flow
Identify reusable patterns

7. Project Scaffolder Agent

Purpose: Initialize project structure
Input: artifacts/execution_plan.json
Output: Initialized project in src/
Key Tasks:

Create Next.js/React project
Install dependencies
Set up folder structure
Configure build tools
Initialize git repository

8. Component Builder Agent

Purpose: Build the actual application
Input: All artifacts
Output: Complete application in src/
Key Tasks:

Implement all components
Add routing
Integrate content
Ensure responsive design
Add accessibility features
Create forms and interactions

9. Page Assembler Agent

Purpose: Assemble pages and configure routing
Input: Component implementations
Output: Complete Next.js pages
Key Tasks:

Create page components
Set up routing
Integrate all components
Configure navigation
Ensure data flow

10. Styling Finisher Agent

Purpose: Apply design system and styles
Input: artifacts/design_specs.json, existing code
Output: Styled application
Key Tasks:

Implement Tailwind/CSS
Apply color scheme
Ensure design consistency
Add animations/transitions
Optimize for all breakpoints

11. Next.js Validator Agent

Purpose: Ensure code quality
Input: Complete application code
Output: artifacts/validation_report.json
Key Tasks:

Run linting
Type checking
Basic functionality tests
Fix identified issues
Verify build success

12. Browser Functionality Validator Agent

Purpose: Validate UI/UX through real browser automation
Input: Built and validated Next.js application
Output: artifacts/browser_test_report.json
Key Tasks:

Start development server
Use Playwright MCP tools for real browser testing
Test all forms and interactions
Validate responsive design
Automatically fix UI/UX issues found
Re-test after fixes to confirm resolution

13. Vercel Deployer Agent

Purpose: Deploy projects to Vercel from GitHub repository
Input: artifacts/github_finalize.json (contains GitHub repo info)
Output: artifacts/deployment.json with production URL
Key Tasks:

Read GitHub repository details from finalize artifact
Create Vercel project linked to GitHub repo
Configure automatic deployments
Disable deployment protection automatically
Deploy from main branch
Return production URL for SMS notification

Note: Protection is disabled via API to ensure sites go live immediately without manual intervention.

14. Documentation Compiler Agent

Purpose: Create comprehensive documentation
Input: All artifacts
Output: Documentation in docs/
Key Tasks:

Compile requirements summary
Document design decisions
List all assumptions made
Create README
Generate technical specs

15. Notification Agent

Purpose: Notify user of completion via SMS
Input: Phone number from task description + deployment/project info
Output: artifacts/notification.json with SMS status
Key Tasks:

Extract deployment URL or GitHub URL
Craft personalized, engaging message
Send SMS notification via Twilio
Log notification status

Note: Uses conversational tone matching the Voice Vibe experience.

Orchestration Strategy

Current Implementation (Python SDK)

The orchestrator (orchestrator.py) uses the Claude Code SDK to manage agent execution:

# Key features implemented:
- Git-based project management (no more timestamps!)
- Dynamic pipeline selection via workflow-orchestrator
- JSON validation after each agent
- Retry logic (3 attempts per agent)
- State tracking in state.json
- GitHub integration for version control
- Playwright MCP integration for browser testing
- Pipeline resume capability with phase management
- Parallel agent execution with isolated clients
- Command-line controls for selective execution

# MCP Configuration:
from claude_code_sdk import ClaudeCodeOptions

options = ClaudeCodeOptions(
    mcp_servers={
        "playwright": {
            "command": "npx",
            "args": ["-y", "@playwright/mcp@latest"]
        }
    },
    allowed_tools=["Task", "Read", "Write", "Edit", "Bash", "Grep", "Glob", "TodoWrite", "mcp__playwright"]
)

# Agent invocation pattern:
async def invoke_agent(client, agent_name, description, project_dir, expected_outputs=None,
                      create_own_client=False):  # New: isolated client for parallel execution
    # Uses Task tool with subagent_type
    # Validates JSON outputs
    # Auto-corrects filename variants
    # Retries on failure
    # Creates separate client if parallel execution needed
    # MCP tools available to all agents
    ...  # Outline only; the body is omitted here

# Execution flow:
1. Requirements Analysis (always)
2. GitHub Setup (create/clone repo)
3. Workflow Planning (intelligent decision-making)
4. Execute Pipeline (stages from plan, with parallelization)
5. GitHub Finalize (commit & push)
6. Vercel Deployment (if included in plan)
7. SMS Notification (if phone number available)
8. Summary Generation

# Command-line options:
python orchestrator.py                          # Normal full run
python orchestrator.py --resume                 # Resume from last failure
python orchestrator.py --start-from execution   # Start from specific phase
python orchestrator.py --only deployment        # Run only specific phase
python orchestrator.py --skip github-setup      # Skip specific phases
python orchestrator.py --force                  # Force re-run all phases
python orchestrator.py --status                 # Check pipeline status
python orchestrator.py --dry-run               # Preview execution plan
python orchestrator.py --verbose               # Show detailed output

# Phase names for command-line options:
# - requirements (Requirements Analysis)
# - github-setup (GitHub Workspace Setup)
# - workflow-planning (Workflow Planning)
# - execution (Agent Execution)
# - github-finalize (GitHub Finalization)
# - deployment (Vercel Deployment)
# - notification (SMS Notification)
# - summary (Pipeline Summary)

Command-Line Reference

Resume and Recovery Options

--resume

Resumes pipeline from the last failed phase
Automatically detects last successful phase from state.json
Loads existing artifacts and continues execution
Skips already completed phases

--force

Forces re-run of all phases even if artifacts exist
Overwrites existing state.json
Useful for debugging or when artifacts are corrupted

--project-dir <path>

Uses existing project directory instead of creating new one
Must point to valid project with state.json
Useful for resuming specific projects

Selective Execution Options

--start-from <phase>

Starts execution from specified phase
Skips all phases before the specified one
Phase must be one of: requirements, github-setup, workflow-planning, execution, github-finalize, deployment, notification, summary

--only <phases>

Runs only specified phases (comma-separated)
Example: --only deployment,notification
Useful for testing specific phases in isolation

--skip <phases>

Skips specified phases (comma-separated)
Example: --skip github-setup,deployment
Continues with remaining phases

Monitoring and Testing Options

--status

Shows current pipeline status and exits
Displays phase completion status
Shows last error if pipeline failed
No execution occurs

--dry-run

Shows what would be executed without running
Displays phase execution plan
Useful for verifying command-line options
No actual execution occurs

--verbose

Shows detailed output from all agents
Includes full agent responses
Useful for debugging agent issues
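
The selective execution flags can be read as filters over the ordered phase list. The sketch below shows one plausible way to combine them with `argparse`. The function names and the order in which the filters apply (`--start-from`, then `--only`, then `--skip`) are assumptions, not a description of orchestrator.py.

```python
import argparse
from typing import List

PHASES = ["requirements", "github-setup", "workflow-planning", "execution",
          "github-finalize", "deployment", "notification", "summary"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="orchestrator.py")
    parser.add_argument("--start-from", choices=PHASES)
    parser.add_argument("--only", default="", help="comma-separated phases")
    parser.add_argument("--skip", default="", help="comma-separated phases")
    return parser


def split_phases(value: str) -> List[str]:
    """Parse a comma-separated phase list and reject unknown names."""
    names = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in names if name not in PHASES]
    if unknown:
        raise ValueError(f"unknown phase(s): {', '.join(unknown)}")
    return names


def select_phases(args: argparse.Namespace) -> List[str]:
    """Return the phases to run, always in pipeline order."""
    selected = list(PHASES)
    if args.start_from:
        selected = selected[selected.index(args.start_from):]
    only = split_phases(args.only)
    if only:
        selected = [phase for phase in selected if phase in only]
    skip = split_phases(args.skip)
    return [phase for phase in selected if phase not in skip]
```

For example, `select_phases(build_parser().parse_args(["--only", "deployment,notification"]))` yields just those two phases, in pipeline order.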

Execution Pipelines

The workflow-orchestrator uses intelligent reasoning to create execution plans:

Principle-Based Decision Making

Instead of rigid pipelines, the orchestrator considers:

Quality Principles: Test interactive features, validate before deploying
Efficiency Principles: Parallelize independent work
User Experience: Include browser testing for UI/UX critical features
Deployment Strategy: Deploy after validation passes

Example Execution Plan Structure

{
  "pipeline_type": "static_website",
  "reasoning": "User wants a simple landing page with interactive elements",
  "execution_stages": [
    {
      "stage": 1,
      "description": "Content and Design",
      "parallel": true,
      "agents": ["content-strategist", "ui-ux-designer"],
      "rationale": "Independent tasks that can run simultaneously"
    },
    {
      "stage": 2,
      "description": "Implementation",
      "parallel": false,
      "agents": ["static-html-generator"],
      "rationale": "Needs both content and design complete"
    },
    {
      "stage": 3,
      "description": "Browser Testing",
      "parallel": false,
      "agents": ["browser-functionality-validator"],
      "rationale": "Interactive elements need browser validation"
    }
  ],
  "deployment": {
    "required": true,
    "agent": "vercel-deployer",
    "rationale": "User needs production deployment"
  },
  "notification": {
    "required": true,
    "agent": "notification-agent",
    "rationale": "SMS notification for project completion"
  }
}

The orchestrator reasons about each project individually, considering:

Technology choice (static HTML vs Next.js)
Interactive elements requiring browser testing
Parallelization opportunities
Deployment needs
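
A plan in this shape could be executed roughly as follows. This is a sketch under assumptions: `run_stages` and the `run_agent` callback are hypothetical names, and the callback stands in for the real agent invocation, which in the orchestrator also gives each parallel agent its own isolated client.

```python
import asyncio
from typing import Awaitable, Callable

AgentRunner = Callable[[str], Awaitable[None]]


async def run_stages(plan: dict, run_agent: AgentRunner) -> None:
    """Run execution_stages in order; fan out agents when a stage is parallel."""
    for stage in sorted(plan["execution_stages"], key=lambda s: s["stage"]):
        if stage.get("parallel"):
            # Independent agents start together; gather waits for all of them.
            await asyncio.gather(*(run_agent(name) for name in stage["agents"]))
        else:
            for name in stage["agents"]:
                await run_agent(name)
```

The `deployment` and `notification` fields sit outside `execution_stages`, so a runner like this never triggers them. They would be handled by their own phases.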

Important Implementation Notes

Artifact Access in Ephemeral Storage

Problem: Requirements analyst runs BEFORE GitHub clones repos
Solution: Use githubRepository field in requirements.json with exact repo name from VOICE_CONTEXT.md
Flow: Requirements analyst reads VOICE_CONTEXT.md → Extracts GitHub field → Sets githubRepository → GitHub manager clones that exact repo → All agents have artifacts
Key: Requirements analyst CANNOT read artifact files directly (they don't exist locally yet)
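
The new-versus-update decision that follows from this flow is small enough to show directly. The function name and the return convention are assumptions made for illustration.

```python
from typing import Optional, Tuple


def plan_repo_action(requirements: dict) -> Tuple[str, Optional[str]]:
    """Return ("clone", repo) for an update or ("create", None) for a new project."""
    repo = (requirements.get("githubRepository") or "").strip()
    if repo:
        return "clone", repo
    return "create", None
```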

Agent Communication

Agents DO NOT communicate directly
All communication via JSON artifacts in project-specific directories
Each agent runs in isolated context via Task tool
Orchestrator passes project directory path to each agent
Agent names must match exactly (e.g., voice-requirements-analyst not requirements-analyst)

Path Handling

Agents receive explicit project directory from orchestrator
All artifacts read/written to {project_dir}/artifacts/
Webhook data remains at global webhooks/latest_raw.json
Static HTML output goes to {project_dir}/src/index.html

Error Handling

JSON validation using validate_json_file() after each agent
Automatic retry, up to 3 attempts per agent
State preserved in state.json even on failure
Clear error logging with timestamps
Retry logic includes 2-second pause between attempts
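
The validate-then-retry loop can be sketched as below. Only the name `validate_json_file()` comes from this document. Its signature, the `run_with_retry` helper, and the zero-argument `run_agent` callable are assumptions.

```python
import json
import time
from pathlib import Path
from typing import Callable, List


def validate_json_file(path: Path) -> bool:
    """Return True if the file exists and parses as JSON."""
    try:
        json.loads(path.read_text())
        return True
    except (OSError, ValueError):  # missing/unreadable file, or invalid JSON
        return False


def run_with_retry(run_agent: Callable[[], None], outputs: List[Path],
                   attempts: int = 3, pause_seconds: float = 2.0) -> bool:
    """Run an agent until every expected output is valid JSON, or attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            run_agent()
        except Exception as exc:  # a crashed run counts as a failed attempt
            print(f"attempt {attempt} raised: {exc}")
        else:
            if all(validate_json_file(path) for path in outputs):
                return True
        if attempt < attempts:
            time.sleep(pause_seconds)  # fixed pause between attempts
    return False
```

Returning a boolean rather than raising lets the caller record the failure in state.json before deciding whether to stop the pipeline.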

Recent Fixes Applied

Agent invocation: Now properly uses Task tool with subagent_type parameter
Path confusion: All agents updated with clear path instructions
JSON validation: Added validation function and retry mechanism
Security: Moved from hardcoded credentials to environment variables
Static output: Created static-html-generator agent for immediate viewable results
Dependencies: Added typing_extensions to requirements.txt
Agent location: Agents MUST be in .claude/agents/ at project root for discovery
Working directory: Orchestrator must NOT change cwd - Claude needs to stay at project root to find agents
Filename consistency: Fixed scaffold_report.json naming (was scaffolding_report.json mismatch)
File naming enforcement: Added explicit filename instructions and auto-correction for variant names
Browser validation: Added browser-functionality-validator agent for UI/UX testing with self-healing
Playwright MCP integration: Configured MCP servers in orchestrator for real browser automation
Parallel execution fix: Resolved concurrent client access error by creating isolated clients for parallel agents
Vercel protection: Auto-disables deployment protection for immediate site access
Resume capability: Added comprehensive phase management for pipeline recovery
Double notification prevention: Separated deployment and notification into dedicated fields in execution_plan.json
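
The filename auto-correction mentioned above could be approached with fuzzy matching from the standard library. This is a sketch of one possible technique, not the orchestrator's actual method. The function name and the 0.8 similarity cutoff are assumptions.

```python
import difflib
from pathlib import Path


def correct_variant_name(artifacts_dir: Path, expected: str, cutoff: float = 0.8) -> bool:
    """Rename a near-miss file to the expected name; True if expected exists afterwards."""
    target = artifacts_dir / expected
    if target.exists():
        return True
    candidates = [path.name for path in artifacts_dir.glob("*.json")]
    matches = difflib.get_close_matches(expected, candidates, n=1, cutoff=cutoff)
    if not matches:
        return False
    (artifacts_dir / matches[0]).rename(target)
    return True
```

With this cutoff, a variant such as scaffolding_report.json would be renamed to scaffold_report.json, while unrelated artifacts are left alone.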

Data Flow

Git Repository: Each project lives in its own git repo
Artifacts Directory: Central hub for inter-agent communication
JSON Format: Structured data exchange between agents
State Management: state.json tracks pipeline progress
Version Control: Complete history via git commits
GitHub Backup: All code pushed to GitHub repositories

Prompt Engineering Guidelines

Prompt Template Structure

# [Agent Name] Agent

You are a [role] agent in the Voice Vibe Coding system.

## Context
You are part of a pipeline that transforms voice conversations into deployed web applications.

## Your Specific Task
[Clear description of what this agent does]

## Inputs
- [List of files/artifacts this agent reads]

## Expected Outputs
- [List of files/artifacts this agent produces]

## Guidelines
1. Make reasonable assumptions when information is ambiguous
2. Document every assumption in decisions.log with reasoning
3. Optimize for beautiful, functional, modern web applications
4. Prioritize user experience and accessibility

## Technical Constraints
- Use Next.js/React for frontend
- Use Tailwind CSS for styling
- Ensure mobile responsiveness
- Follow modern web best practices

[Specific instructions for this agent's task]

Key Prompt Principles

Autonomous Decision Making: Agents should make reasonable assumptions rather than fail
Documentation: Every decision and assumption must be logged
Quality Focus: Prioritize beautiful, functional output
Error Recovery: Agents should attempt to fix issues before reporting failure
Context Awareness: Each agent knows its role in the larger pipeline

Implementation Phases

Phase 1: Foundation (Current)

Set up basic orchestrator script
Create initial prompt templates
Test with matchmaking website example
Establish artifact structure

Phase 2: Refinement

Optimize prompts based on results
Add error handling to orchestrator
Implement parallel execution where possible
Add progress monitoring

Phase 3: Enhancement

Add more specialized agents
Support multiple framework options
Implement advanced deployment options
Add testing agents

Phase 4: Scale

Support backend development
Add database agents
Implement API builders
Enable full-stack applications

Testing Strategy

Test Case 1: Matchmaking Website

Using the existing transcript in webhooks/latest_raw.json:

Modern, elegant design for Southeast Asian professionals
Lead generation focus
Email capture and consultation scheduling
Instagram integration
Testimonials section

Success Criteria

Requirements correctly extracted from conversation
Design matches described aesthetic
All mentioned features implemented
Successfully deploys to Vercel
SMS notification sent with URL

Future Expansions

Planned Agent Additions

Backend Developer Agent: API and server logic
Database Architect Agent: Schema design and setup
API Integration Agent: Third-party service connections
Testing Agent: Automated test generation
SEO Optimizer Agent: Search optimization
Performance Agent: Speed and optimization
Security Agent: Security best practices
Analytics Agent: Add tracking and analytics

Supported Project Types (Roadmap)

Current: Static websites, landing pages, Next.js/React applications
Next: Full-stack web applications
Future: Mobile apps, APIs, microservices

Configuration

Environment Variables

ANTHROPIC_API_KEY=xxx          # Required for Claude Code SDK
ANTHROPIC_MODEL=opus           # Model to use (opus recommended)
TWILIO_ACCOUNT_SID=xxx         # For SMS notifications
TWILIO_AUTH_TOKEN=xxx          # For SMS authentication
TWILIO_PHONE_NUMBER=xxx        # Your Twilio phone number
VERCEL_TOKEN=xxx               # For Vercel deployment
ELEVENLABS_API_KEY=xxx         # For webhook integration
GITHUB_TOKEN=xxx               # Personal access token for GitHub
GITHUB_USERNAME=xxx            # Your GitHub username

Requirements

Python 3.10+
Claude Code SDK (pip install claude-code-sdk)
All dependencies in requirements.txt
Sufficient API credits for complex projects
Node.js and npx (for Playwright MCP)
Playwright MCP installed: claude mcp add playwright -- npx -y @playwright/mcp@latest

Monitoring & Logging

State Tracking

Each project maintains enhanced state.json with phase management:

{
  "project_id": "1555555xxxx-project-name",
  "project_path": "/path/to/project",
  "current_phase": "execution",
  "last_successful_phase": "workflow-planning",
  "phases_completed": {
    "requirements": {
      "time": "2025-01-13T10:00:00Z",
      "artifacts": ["requirements.json"]
    },
    "github-setup": {
      "time": "2025-01-13T10:05:00Z",
      "artifacts": ["github_setup.json"]
    }
  },
  "phases_failed": {
    "execution": {
      "time": "2025-01-13T10:10:00Z",
      "error": "Concurrent client access error"
    }
  },
  "completed_agents": ["voice-requirements-analyst", "github-manager"],
  "errors": [
    {
      "time": "2025-01-13T10:10:00Z",
      "phase": "execution",
      "error": "RuntimeError: read() called while another coroutine..."
    }
  ],
  "start_time": "2025-01-13T10:00:00Z",
  "last_updated": "2025-01-13T10:10:00Z"
}
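
A state file in this shape could be maintained with two small helpers. This is a sketch. `record_phase` and `resume_phase` are hypothetical names, and the rule that a resume restarts at the first phase not yet completed is an assumption about how `--resume` behaves.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import List, Optional

PHASES = ["requirements", "github-setup", "workflow-planning", "execution",
          "github-finalize", "deployment", "notification", "summary"]


def record_phase(state_file: Path, phase: str,
                 artifacts: Optional[List[str]] = None,
                 error: Optional[str] = None) -> dict:
    """Record a phase result in state.json and return the updated state."""
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    completed = state.setdefault("phases_completed", {})
    failed = state.setdefault("phases_failed", {})
    errors = state.setdefault("errors", [])
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    if error is None:
        completed[phase] = {"time": stamp, "artifacts": artifacts or []}
        failed.pop(phase, None)  # a later success clears the earlier failure
        state["last_successful_phase"] = phase
    else:
        failed[phase] = {"time": stamp, "error": error}
        errors.append({"time": stamp, "phase": phase, "error": error})
    state["current_phase"] = phase
    state["last_updated"] = stamp
    state_file.write_text(json.dumps(state, indent=2))
    return state


def resume_phase(state: dict) -> Optional[str]:
    """Return the first phase, in pipeline order, that has not completed."""
    for phase in PHASES:
        if phase not in state.get("phases_completed", {}):
            return phase
    return None
```

Writing the file after every phase, including failures, is what keeps the state available for a later resume.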

Decision Logging

All assumptions logged to decisions.log:

[2024-01-07 10:15:23] UI_DESIGNER: Assumed primary color #6366F1 based on "elegant and modern" description
[2024-01-07 10:16:45] DEVELOPER: Used Next.js 14 with App Router for better performance
[2024-01-07 10:18:12] CONTENT: Generated placeholder testimonials as none provided
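
Entries in this format could be written by a helper like the one below. The function names are assumptions. The line layout follows the sample entries above.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def format_decision(agent: str, message: str, when: datetime) -> str:
    """Render one decisions.log line: [timestamp] AGENT: message."""
    return f"[{when:%Y-%m-%d %H:%M:%S}] {agent.upper()}: {message}"


def log_decision(log_file: Path, agent: str, message: str,
                 when: Optional[datetime] = None) -> None:
    """Append a decision entry, defaulting to the current local time."""
    line = format_decision(agent, message, when or datetime.now())
    with log_file.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
```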

Ephemeral Storage Details

Cleanup Behavior

Temp workspace deleted only on successful completion
Failed pipelines leave temp directories for debugging
Manual cleanup may be needed: rm -rf projects/.temp/
Cleanup triggered by cleanup_ready flag in orchestrator

Disk Usage Management

# Check temp directory usage
du -sh projects/.temp/

# Clean up temp workspaces older than 7 days (the {phone}/{timestamp} level only)
find projects/.temp/ -mindepth 2 -maxdepth 2 -type d -mtime +7 -exec rm -rf {} +

# View active pipelines
ls -la projects/.temp/*/

# Clean up legacy archive (2.1GB) if it exists
# The .archive/ directory contains old timestamped projects from before ephemeral storage
rm -rf projects/.archive  # Safe to delete after verifying GitHub repos exist

Summary

Voice Vibe Coding lets users build software by describing it in a phone conversation. With the Claude Code SDK as the agent engine, a single Python orchestrator turns a call transcript into a deployed application in minutes rather than days (currently about 34 minutes per run; see Performance Optimization).

The system is designed to be:

Simple: Python orchestrator with Claude Code SDK
Robust: JSON validation and automatic retry logic
Extensible: Easy to add new agents via markdown files
Transparent: All decisions documented in logs
Reliable: Built on proven Claude Code infrastructure
Scalable: From static HTML to full applications

Key Technical Decisions Made

Git-based architecture: Replaced timestamped folders with proper git repositories
Intelligent pipeline selection: AI reasons about optimal execution strategy
GitHub integration: All projects backed up to GitHub with meaningful commits
Multi-project support: Users can update existing projects naturally
Agent isolation: Each agent runs in separate context via Task tool
Artifact-based communication: JSON files for inter-agent data exchange
Retry mechanism: 3 attempts with validation for robustness
Environment variables: All credentials secured via .env
Ephemeral storage: Projects exist locally only during build, then deleted
Privacy-first multi-tenancy: All resources use anonymized user IDs for isolation
GitHub as source of truth: No local persistence, everything in GitHub
VOICE_CONTEXT.md: Conversational project descriptions for voice agent
githubRepository field: Exact repo names from VOICE_CONTEXT.md
Vercel deployment: Automatic deployment with protection disabled
Principle-based orchestration: AI uses judgment, not rigid rules
Parallel execution: Stages can run agents simultaneously with isolated clients
Pipeline resume: Comprehensive phase management for failure recovery
SMS notifications: Personalized messages via Twilio after deployment
Command-line control: Fine-grained execution control via arguments
Phase tracking: Detailed state management for progress visibility
Privacy protection: Phone numbers mapped to random user IDs, never exposed in URLs

What's Working Now

Complete pipeline from transcript to deployed application
Multi-project support with natural language updates
Git-based version control for all projects
GitHub integration with automatic repository creation
Dynamic pipeline selection (only run necessary agents)
Requirements extraction from voice conversations
Content generation with proper tone/voice
Design system creation with colors, typography, layouts
Full Next.js/React application generation
Real browser testing with Playwright MCP
Self-healing UI/UX validation that auto-fixes issues
Static HTML generation with inline CSS
Comprehensive state tracking with phase management
Pipeline resume capability after failures
Parallel agent execution for improved performance
Vercel deployment with automatic protection disabling
SMS notifications via Twilio after deployment
Command-line controls for selective execution
Dry-run mode for execution preview
Status checking for pipeline progress
Repository name resolution via VOICE_CONTEXT.md

Performance Optimization

🚨 CRITICAL: The pipeline currently takes 33.6 minutes to complete. See PERFORMANCE_OPTIMIZATION.md for detailed analysis and optimization plan to reduce this to 6-10 minutes.

Quick Wins Available:

Switch 6 agents from Opus to Sonnet (50% speed improvement)
Reduce max_turns from 50 to 20 (10% improvement)
Optimize agent prompts (20-30% improvement)
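
As a rough check on these figures, the arithmetic below estimates the combined effect. It rests on two assumptions that are not stated in this document: each percentage is a reduction in wall-clock time, and the reductions compound multiplicatively rather than overlapping.

```python
from typing import List

BASELINE_MINUTES = 33.6


def projected_minutes(baseline: float, reductions: List[float]) -> float:
    """Apply each fractional time reduction to what remains after the previous one."""
    remaining = baseline
    for fraction in reductions:
        remaining *= 1.0 - fraction
    return remaining


# Sonnet switch (50%), lower max_turns (10%), prompt tuning (20% to 30%)
slowest = projected_minutes(BASELINE_MINUTES, [0.50, 0.10, 0.20])
fastest = projected_minutes(BASELINE_MINUTES, [0.50, 0.10, 0.30])
print(f"{fastest:.1f} to {slowest:.1f} minutes")  # prints "10.6 to 12.1 minutes"
```

Under those assumptions the three quick wins alone land at roughly 10.6 to 12.1 minutes, just above the 6-10 minute target.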

Next Steps

URGENT: Implement performance optimizations from PERFORMANCE_OPTIMIZATION.md
Immediate: Full end-to-end testing with live phone calls
Short-term: Optimize parallel execution for more agent combinations
Medium-term: Add backend support with database agents
Long-term: Support for mobile apps and microservices
Future: AI-powered error recovery and self-healing pipelines

Multi-Project Support & Iterative Updates

Overview

Users can call back to modify existing projects. The system maintains context across calls using VOICE_CONTEXT.md files and git repositories, enabling natural conversations like "make the buttons blue on my auction site."

Core Architecture

User Calls → Pre-webhook (context) → Conversation → Post-webhook (transcript) →
Requirements → GitHub Setup → Workflow Planning → Dynamic Pipeline → GitHub Commit → Deploy

The workflow-orchestrator agent intelligently decides which agents to run, while the github-manager handles all version control operations.

Implementation Components

1. Voice Context File (VOICE_CONTEXT.md)

Each phone number has projects/{phone_number}/VOICE_CONTEXT.md containing conversational project descriptions. CRITICAL: Each project includes a GitHub: field with the exact repository name to prevent name mismatches:

# Your Projects

## Bred Auto Auction (Most Recent)
**What it is:** A professional website for luxury car auctions specializing in exotic vehicles. Perfect for serious dealerships looking for rare finds.

**GitHub:** vvc-a7x9k2-bred-auto-auction

**Recent updates:** Yesterday you asked me to add a new RSVP system for verified dealers only.

**Look and feel:** Bold reds and sleek blacks create a sense of speed and luxury. Very professional but with enough flair to match the exotic cars.

**Original request:** "Create a car auction site for dealerships to find exotic cars"

## Mystery Tech Landing
**What it is:** A futuristic landing page for your tech company with mysterious particle effects floating in the background.

**GitHub:** vvc-a7x9k2-mystery-tech

**Look and feel:** Dark and mysterious with glowing neon accents, like Blade Runner meets Silicon Valley.
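
Reading the exact repository names back out of a file in this layout might look like the sketch below. The function name is hypothetical. It assumes every project starts with a `##` heading and carries a single `**GitHub:**` line, as in the example.

```python
import re
from typing import Dict, Optional

PROJECT_HEADING = re.compile(r"^##\s+(.+?)\s*$")
GITHUB_FIELD = re.compile(r"^\*\*GitHub:\*\*\s*(\S+)\s*$")


def github_repos(voice_context: str) -> Dict[str, str]:
    """Map each project heading to the repo named in its **GitHub:** field."""
    repos: Dict[str, str] = {}
    current: Optional[str] = None
    for line in voice_context.splitlines():
        heading = PROJECT_HEADING.match(line)
        if heading:
            # Drop the "(Most Recent)" marker so the key is just the project name.
            current = heading.group(1).removesuffix("(Most Recent)").strip()
            continue
        field = GITHUB_FIELD.match(line)
        if field and current:
            repos[current] = field.group(1)
    return repos
```

Copying the name verbatim from the field, rather than rebuilding it from the project title, is what prevents the name mismatches described above.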

2. ElevenLabs Pre-Call Webhook Integration

Critical Discovery: ElevenLabs supports pre-call webhooks that inject context BEFORE conversation starts.

How it works:

User calls Twilio number
ElevenLabs requests pre-call context from our webhook with: {"caller_id": "+1234567890", "agent_id": "...", ...}
Webhook reads VOICE_CONTEXT.md and returns its contents as a dynamic variable
Agent has full context before user speaks

webhook_server.py update (add to existing file):

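# Imports needed by this endpoint (skip any that webhook_server.py already has)
import os
import re

from flask import jsonify, request

# `app` is the Flask instance already defined in webhook_server.py


@app.route('/precall', methods=['POST'])
def precall_context():
    """Pre-call webhook - provides context BEFORE conversation"""
    # Tolerate a missing or non-JSON body instead of raising
    payload = request.get_json(silent=True) or {}
    # Keep digits only: drops the leading '+' and blocks path traversal
    caller_id = re.sub(r'\D', '', str(payload.get('caller_id', '')))
    context_file = os.path.join('projects', caller_id, 'VOICE_CONTEXT.md')

    if caller_id and os.path.exists(context_file):
        with open(context_file, encoding='utf-8') as f:
            content = f.read()
    else:
        # New caller, or a request without a usable caller ID
        content = "No projects yet."

    return jsonify({
        "dynamic_variables": {
            "projects_context": content
        }
    })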

ElevenLabs Configuration:

In ElevenLabs dashboard, set pre-call webhook URL: https://yourserver.com/precall
Update agent prompt to include: {{projects_context}}

3. Requirements Analyst Updates

The requirements analyst now detects whether this is a new project or an update to an existing one.

Key additions to voice-requirements-analyst.md:

Reads VOICE_CONTEXT.md to check existing projects
Extracts exact GitHub repository name from GitHub: field
Matches conversation to correct project based on descriptions
Does not read artifact files from earlier runs, because local storage is ephemeral
Outputs isUpdate: true and githubRepository with exact repo name (no guessing)
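
The "no guessing" rule can be enforced outside the agent as well. This is a minimal sketch, assuming the caller already has the list of repository names from the GitHub: fields. The function name build_requirements_result is hypothetical, and returning githubRepository as None for a new project is an assumption, since the source only describes the update case.

```python
from typing import List, Optional


def build_requirements_result(known_repos: List[str], matched_repo: Optional[str]) -> dict:
    """Build the isUpdate / githubRepository part of the analyst output."""
    if matched_repo is None:
        # No existing project matched the conversation: treat it as a new project
        return {"isUpdate": False, "githubRepository": None}
    if matched_repo not in known_repos:
        # Reject near-misses instead of guessing a repository name
        raise ValueError(f"Unknown repository: {matched_repo!r}")
    return {"isUpdate": True, "githubRepository": matched_repo}


if __name__ == "__main__":
    repos = ["1555555xxxx-bred-auto-auction", "1555555xxxx-mystery-tech"]
    print(build_requirements_result(repos, "1555555xxxx-mystery-tech"))
    print(build_requirements_result(repos, None))
```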

4. Workflow Orchestrator Agent (Central Intelligence)

The workflow-orchestrator agent decides which agents to run based on requirements. This eliminates complex if/else logic from orchestrator.py.

Decision examples:

New landing page → Full pipeline
"Change text" → Just component-builder + validator
"Make it blue" → Just styling-finisher + validator
"Add contact form" → Content, architect, builder, assembler, styling, validator

Output (execution_plan.json):

{
  "pipeline_type": "update_content",
  "reason": "User only wants text changes",
  "agents": [
    {"name": "component-builder", "purpose": "Update text in components"},
    {"name": "nextjs-validator", "purpose": "Ensure build succeeds"}
  ]
}
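
Before running anything, orchestrator.py could validate the plan. The sketch below assumes the plan always carries the three top-level keys shown above. The names load_agent_names and REQUIRED_KEYS are illustrative, not taken from the existing code.

```python
import json

# Top-level keys seen in the example plan above
REQUIRED_KEYS = {"pipeline_type", "reason", "agents"}


def load_agent_names(plan_text: str) -> list:
    """Parse execution_plan.json content and return agent names in run order."""
    plan = json.loads(plan_text)
    missing = REQUIRED_KEYS - set(plan)
    if missing:
        raise ValueError(f"execution plan is missing keys: {sorted(missing)}")
    names = []
    for agent in plan["agents"]:
        if "name" not in agent:
            raise ValueError(f"agent entry has no name: {agent!r}")
        names.append(agent["name"])
    return names


EXAMPLE_PLAN = """{
  "pipeline_type": "update_content",
  "reason": "User only wants text changes",
  "agents": [
    {"name": "component-builder", "purpose": "Update text in components"},
    {"name": "nextjs-validator", "purpose": "Ensure build succeeds"}
  ]
}"""

if __name__ == "__main__":
    print(load_agent_names(EXAMPLE_PLAN))
```

Failing fast on a malformed plan keeps a bad agent response from triggering a partial pipeline run.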

5. Git-Based Orchestrator Flow

# Step 1: Requirements analyst (always runs)
# Step 2: GitHub manager sets up workspace (new repo or pull latest)
# Step 3: Workflow orchestrator plans execution
# Step 4: Execute only necessary agents
# Step 5: GitHub manager commits and pushes changes
# Step 6: Generate summary

Updates are recorded as git commits rather than as new timestamped folders.
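
The six steps can be sketched as one function that receives each stage as a callable. This is an illustration of the control flow only. The parameter names and the shape of the returned summary are assumptions, and the real orchestrator.py invokes Claude Code agents rather than plain Python functions.

```python
from typing import Callable, List


def run_pipeline(
    analyze: Callable[[], dict],
    setup_workspace: Callable[[dict], None],
    plan: Callable[[dict], List[str]],
    run_agent: Callable[[str], None],
    commit_and_push: Callable[[dict], None],
) -> dict:
    """Run the git-based flow once and return a small summary."""
    requirements = analyze()              # Step 1: requirements analyst (always runs)
    setup_workspace(requirements)         # Step 2: new repo or pull latest
    agent_names = plan(requirements)      # Step 3: workflow orchestrator plans execution
    for name in agent_names:              # Step 4: execute only the planned agents
        run_agent(name)
    commit_and_push(requirements)         # Step 5: commit and push changes
    return {                              # Step 6: summary
        "isUpdate": requirements.get("isUpdate", False),
        "agents_run": agent_names,
    }


if __name__ == "__main__":
    # Stub stages that only print, to show the call order
    summary = run_pipeline(
        analyze=lambda: {"isUpdate": True},
        setup_workspace=lambda req: print("workspace ready"),
        plan=lambda req: ["styling-finisher", "nextjs-validator"],
        run_agent=lambda name: print("running", name),
        commit_and_push=lambda req: print("pushed"),
    )
    print(summary)
```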

6. Agent Update Modes

Key agents support "update mode" when src/ already exists:

component-builder: Makes surgical edits instead of regenerating
styling-finisher: Updates only specified styles
content-strategist: Merges new content with existing
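
The mode check itself is small. A minimal sketch, assuming update mode is decided purely by whether a src/ directory exists in the project workspace; the function name agent_mode and the label "create" for the non-update case are hypothetical.

```python
import tempfile
from pathlib import Path


def agent_mode(project_dir) -> str:
    """Return 'update' when the project already has a src/ directory."""
    return "update" if (Path(project_dir) / "src").is_dir() else "create"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workspace:
        print(agent_mode(workspace))   # fresh workspace, no src/ yet
        (Path(workspace) / "src").mkdir()
        print(agent_mode(workspace))   # src/ now exists
```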

Implementation Timeline

Phase 1 (2-3 hours):

Update webhook_server.py with precall endpoint
Simplify orchestrator.py to use workflow orchestrator
Enable and configure workflow-orchestrator.md
Update requirements analyst for multi-project
Add update mode to component-builder
Test complete flow

Why This Approach is Superior

Simplest Solution: One markdown file for context, one agent for decisions
Natural UX: Agent greets with "I see you have your auction site..."
Efficient Updates: Only runs necessary agents
Extensible: New agents are added by updating one prompt
Self-Documenting: execution_plan.json shows exactly what will run

Testing Checklist

New user creates first project
User with 1 project makes update
User with multiple projects - correct matching
Content-only changes skip design pipeline
Style changes preserve content
New features added without full rebuild

Usage Examples

# Run the orchestrator (analyzes requirements and chooses pipeline)
python orchestrator.py

# Resume after failure
python orchestrator.py --resume

# Run only specific phases
python orchestrator.py --only deployment
python orchestrator.py --start-from execution

# Skip phases that are already done
python orchestrator.py --skip requirements,github-setup

# Check pipeline status
python orchestrator.py --status

# Preview what would run
python orchestrator.py --dry-run

# Force re-run everything
python orchestrator.py --force

# After generation, projects are in git repositories:
cd projects/banana-delivery  # or any project name
git log                      # See commit history
git status                   # Check current state

# Run Next.js applications:
cd projects/banana-delivery/src
npm install
npm run dev
# Open http://localhost:3000

# View static HTML:
open projects/banana-delivery/src/index.html
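
The flags used above could be declared with argparse roughly as follows. This is a sketch under assumptions: the actual argument handling in orchestrator.py is not shown in this document, the help texts are paraphrased from the comments above, and treating --only and --skip as comma-separated lists is inferred from the --skip example.

```python
import argparse


def comma_list(value: str) -> list:
    """Split 'a,b,c' into ['a', 'b', 'c'], ignoring empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="orchestrator.py")
    parser.add_argument("--resume", action="store_true", help="resume after failure")
    parser.add_argument("--only", type=comma_list, default=[], help="run only these phases")
    parser.add_argument("--start-from", metavar="PHASE", help="start from this phase")
    parser.add_argument("--skip", type=comma_list, default=[], help="skip these phases")
    parser.add_argument("--status", action="store_true", help="check pipeline status")
    parser.add_argument("--dry-run", action="store_true", help="preview what would run")
    parser.add_argument("--force", action="store_true", help="force re-run everything")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--skip", "requirements,github-setup", "--dry-run"])
    print(args.skip, args.dry_run)
```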

Testing with Sample Data

Using Existing Webhook Data

The repository includes sample webhook data in webhooks/latest_raw.json for testing. This data represents a voice conversation about building or updating a project.

Running Tests

Manual Test Run

# Process existing webhook data
python orchestrator.py

Automated Webhook Testing

# Start webhook server (auto-triggers orchestrator)
python webhook_server.py

# In another terminal, send test webhook
curl -X POST http://localhost:5001/elevenlabs-webhook \
  -H "Content-Type: application/json" \
  -d @webhooks/latest_raw.json
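
The same request can be sent from Python using only the standard library. A minimal sketch, assuming the endpoint and port from the curl command above; the function names build_webhook_request and send_webhook are illustrative.

```python
import json
import urllib.request

# Endpoint taken from the curl example above
WEBHOOK_URL = "http://localhost:5001/elevenlabs-webhook"


def build_webhook_request(payload: dict, url: str = WEBHOOK_URL) -> urllib.request.Request:
    """Build a JSON POST request equivalent to the curl command."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_webhook(path: str, url: str = WEBHOOK_URL) -> int:
    """Load a saved webhook payload from disk, POST it, and return the HTTP status."""
    with open(path, encoding="utf-8") as f:
        payload = json.load(f)
    with urllib.request.urlopen(build_webhook_request(payload, url)) as response:
        return response.status
```

With the webhook server running, send_webhook("webhooks/latest_raw.json") replays the sample conversation.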

Test with Specific Scenarios

# Test resume after failure
python orchestrator.py --dry-run  # Preview
python orchestrator.py --only requirements,github-setup  # Partial run
# Simulate failure, then:
python orchestrator.py --resume  # Continue from failure

# Test deployment only
python orchestrator.py --only deployment

# Test with verbose output
python orchestrator.py --verbose

Monitoring Test Execution

# Check pipeline status
python orchestrator.py --status

# Monitor logs in real-time
tail -f logs/pipeline_*.log

# Check webhook server status
curl http://localhost:5001/status

Expected Test Results

Requirements Extraction: JSON file created in artifacts/
GitHub Operations: Repository created/updated
Agent Execution: All specified agents run successfully
Build Validation: Application builds without errors
Deployment: Vercel URL generated (if configured)
Notification: SMS sent (if Twilio configured)
