Last Updated: 2025-12-07
For: Future Claude instances working on this codebase
New to this project? Start here:
- Quick Start Guide - Get running in 5 minutes
- Operational Context - Understand your environment (Local Dev vs Production)
- Documentation Index - Complete documentation map
- Essential Context
- Project Overview
- Architecture
- Development Guides
- System Components
- Deployment
- Quick Reference
hostname && pwd

Two operational contexts exist:
- Local Development (VS Code) - Direct code editing, localhost URLs
- Jetson Production - Voice assistant, systemd services, HTTPS URLs
📖 See: OPERATIONAL_CONTEXT.md for complete context guide.
Local Development (uses NVM):
~/.nvm/versions/node/v22.21.1/bin/npm start

Jetson Production (uses Conda):
export PATH=/home/rodrigo/miniconda3/envs/agentic/bin:$PATH
npm run build

Note: Never use bare npm or node commands. Always use full paths or set PATH first.
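When these commands are launched from a Python script rather than typed into a shell, the same PATH rule can be applied per subprocess. The following is a minimal sketch, not part of the project: `env_with_bin` is a hypothetical helper, and the two directory constants are copied from the commands above.

```python
import os
from pathlib import Path

# Bin directories copied from this file; which one applies depends on the context.
LOCAL_NODE_BIN = Path.home() / ".nvm/versions/node/v22.21.1/bin"
JETSON_CONDA_BIN = Path("/home/rodrigo/miniconda3/envs/agentic/bin")


def env_with_bin(bin_dir, base_env=None):
    """Return a copy of the environment with bin_dir placed first on PATH.

    Hypothetical helper: base_env defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("PATH", "")
    env["PATH"] = f"{bin_dir}{os.pathsep}{current}" if current else str(bin_dir)
    return env
```

A call such as `subprocess.run(["npm", "run", "build"], env=env_with_bin(JETSON_CONDA_BIN), check=True)` would then resolve `npm` from the chosen directory first, assuming `npm` is installed there.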
An agentic AI system with Python backend (FastAPI + AutoGen) and React frontend.
- Multi-agent coordination - Nested team agents with orchestration
- Multimodal vision agents - Interpret images from tool responses
- Voice assistant - OpenAI Realtime API with WebRTC bridge
- Mobile voice interface - Smartphone as wireless microphone
- Claude Code integration - Live code self-editing
- Memory management - ChromaDB + MongoDB storage
- Screenshot workflows - Automated UI development
Backend:
- Python 3.x, FastAPI, AutoGen
- ChromaDB 0.4.24, MongoDB 3.6+, SQLite
- OpenAI API, Anthropic API
Frontend:
- React 18, Material-UI
- WebSocket + WebRTC
- Feature-based architecture
/home/rodrigo/agentic/
├── backend/                 # Python FastAPI server
│   ├── main.py              # FastAPI entry point
│   ├── agents/              # Agent JSON configurations
│   ├── tools/               # Custom tool implementations
│   ├── core/                # Agent execution engine
│   ├── api/                 # Voice & Claude Code controllers
│   ├── config/              # Configuration & schemas
│   ├── utils/               # Utility modules
│   └── tests/               # Test suite
│
├── frontend/                # React application
│   └── src/features/        # Feature-based organization
│       ├── agents/          # Agent management
│       ├── tools/           # Tool management
│       └── voice/           # Voice assistant
│
├── debug/                   # Screenshot & export tools
├── docs/                    # Documentation
└── CLAUDE.md                # This file
User Input → WebSocket → FastAPI → Agent Factory → Agent Execution
→ Tool Execution → WebSocket Stream → Frontend → UI Display
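The streaming shape of this flow can be illustrated with a small asyncio sketch. Everything here is hypothetical: `run_agent`, `stream_to_client`, and the event dictionaries stand in for the real agent engine and WebSocket layer, whose actual names and payloads are not shown in this file.

```python
import asyncio


async def run_agent(user_input):
    """Hypothetical agent loop: yields events as they are produced."""
    yield {"type": "agent_message", "content": f"Handling: {user_input}"}
    yield {"type": "tool_call", "name": "my_tool", "args": {"param": user_input}}
    await asyncio.sleep(0)  # stand-in for the time a real tool call would take
    yield {"type": "tool_result", "content": f"Result: {user_input}"}


async def stream_to_client(user_input, send):
    """Forward each event to the client as soon as it exists (the stream step)."""
    async for event in run_agent(user_input):
        await send(event)


async def demo():
    received = []

    async def send(event):  # stand-in for a WebSocket send call
        received.append(event)

    await stream_to_client("hello", send)
    return received
```

The point of the sketch is the ordering: events reach the frontend one by one while the agent is still running, rather than as a single response at the end.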
Local Development:
- Backend: http://localhost:8000
- Frontend: http://localhost:3000/agentic/
Production (Jetson):
- Backend: https://192.168.0.200/api/
- Frontend: https://192.168.0.200/agentic/
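A script that has to work in both contexts can keep these base URLs in one place. This is a minimal sketch: the context labels `"local"` and `"jetson"` and the helper `frontend_url` are illustrative names, and it assumes frontend routes such as `voice` sit under the same `/agentic/` prefix in production as they do locally.

```python
# Frontend base URLs copied from this file; the dictionary keys are illustrative labels.
FRONTEND_BASE = {
    "local": "http://localhost:3000/agentic/",
    "jetson": "https://192.168.0.200/agentic/",
}


def frontend_url(context, route=""):
    """Build a frontend URL for the given context (hypothetical helper)."""
    try:
        base = FRONTEND_BASE[context]
    except KeyError:
        raise ValueError(f"unknown context: {context!r}") from None
    # Strip a leading slash so the route is appended to the /agentic/ prefix.
    return base + route.lstrip("/")
```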
Agent Types:
- looping - Single agent with tool loop
- nested_team - Multi-agent coordination
- multimodal_tools_looping - Vision-capable agent
- dynamic_init_looping - Custom initialization logic
Create New Agent:
- Create JSON in backend/agents/YourAgent.json
- Configure tools, LLM, prompts
- Test via frontend
📖 Complete Guide: Creating New Agents (see full CLAUDE.md)
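Before testing a new agent in the frontend, a quick check that its declared type is one of the four listed above can catch typos early. The sketch below is an assumption-heavy illustration: the key name `agent_type` is hypothetical, since the real agent schema is documented in the full guide rather than here, and it can be overridden through the `type_key` parameter.

```python
import json

# The four agent types listed in this file.
AGENT_TYPES = {
    "looping",
    "nested_team",
    "multimodal_tools_looping",
    "dynamic_init_looping",
}


def check_agent_type(config_text, type_key="agent_type"):
    """Parse agent JSON text and return its type if it is a known one.

    `type_key` is an assumed field name, not confirmed by the project schema.
    """
    config = json.loads(config_text)
    agent_type = config.get(type_key)
    if agent_type not in AGENT_TYPES:
        raise ValueError(
            f"unknown agent type {agent_type!r}; expected one of {sorted(AGENT_TYPES)}"
        )
    return agent_type
```

For a file on disk, something like `check_agent_type(Path("backend/agents/YourAgent.json").read_text())` would apply the same check.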
Tool Structure:
# tools/my_tool.py
from autogen_core.tools import FunctionTool


def my_tool(param: str) -> str:
    """Tool description for LLM"""
    return f"Result: {param}"


my_tool_func = FunctionTool(
    func=my_tool,
    name="my_tool",
    description="Brief description"
)

tools = [my_tool_func]

📖 Complete Guide: Creating New Tools (see full CLAUDE.md)
WebRTC Bridge Architecture:
Browser (WebRTC) ↔ Backend (aiortc) ↔ OpenAI Realtime API (WebRTC)
Start Interactive Session:
# Terminal 1
./start-backend.sh
# Terminal 2
./start-frontend.sh
# Browser: http://localhost:3000/agentic/voice

📖 Complete Guides:
- Voice Quick Start - 1-minute setup
- Voice Interactive Guide - Full walkthrough
- Voice System Overview - Testing guide
Purpose: Use smartphone as wireless microphone
Access: http://[YOUR_IP]:3000/agentic/mobile-voice
📖 Complete Guide: docs/guides/MOBILE_VOICE_GUIDE.md
Two Systems:
- MongoDB - Structured data (Database agent, 10 tools)
- ChromaDB - Vector memory (Memory agent, 8 tools)
📖 Complete Setup: docs/guides/DATABASE_AND_MEMORY_SETUP.md
Screenshot Automation:
~/.nvm/versions/node/v22.21.1/bin/node debug/screenshot.js http://localhost:3000/agentic/voice

Voice Conversation Export:
python3 debug/export_voice_conversations.py

📖 Complete Guide: Debugging Tools & Workflows (see full CLAUDE.md)
Self-editing capabilities via voice:
- Voice model calls: send_to_claude_code({text: "Add feature"})
- Claude Code executes with bypassPermissions mode
- Events stream back via WebSocket
📖 Complete Guide: Claude Code Self-Editor (see full CLAUDE.md)
Server: 192.168.0.200 (ARM64, Ubuntu 18.04.6 LTS)
Quick Access:
# SSH
ssh rodrigo@192.168.0.200
# HTTPS
https://192.168.0.200/agentic/

Environment: Miniconda3 with agentic conda environment (Python 3.11, Node 20.17)
Common Tasks:
# Set PATH for conda environment (required for non-interactive SSH)
export PATH=/home/rodrigo/miniconda3/envs/agentic/bin:$PATH
# Deploy frontend update
cd ~/agentic/frontend
npm install # if new dependencies
npm run build
sudo kill -HUP $(cat ~/nginx.pid)
# Restart backend
sudo systemctl restart agentic-backend
# View logs
sudo journalctl -u agentic-backend -f

📖 Complete Guide: docs/deployment/JETSON_DEPLOYMENT_GUIDE.md
# Start services (local dev)
cd backend && source venv/bin/activate && uvicorn main:app --reload
cd frontend && ~/.nvm/versions/node/v22.21.1/bin/npm start
# Run tests
cd backend && pytest tests/ -v
# Take screenshot
~/.nvm/versions/node/v22.21.1/bin/node debug/screenshot.js http://localhost:3000/agentic/voice
# Export voice data
python3 debug/export_voice_conversations.py
# List agents/tools
curl http://localhost:8000/api/agents
curl http://localhost:8000/api/tools

# Test WebRTC (no network)
pytest tests/integration/test_backend_webrtc_integration.py -v
# Check active sessions
curl http://localhost:8000/api/realtime/conversations
# Stop voice bridge
curl -X DELETE http://localhost:8000/api/realtime/webrtc/bridge/{conversation_id}
# Monitor logs
tail -f logs/backend.log | grep -E "(webrtc|openai)"

| Purpose | Location |
|---|---|
| Agent configs | backend/agents/*.json |
| Tool implementations | backend/tools/*.py |
| Voice controller | backend/api/realtime_voice_webrtc.py |
| OpenAI WebRTC client | backend/api/openai_webrtc_client.py |
| Frontend voice | frontend/src/features/voice/pages/ |
| Screenshot tool | debug/screenshot.js |
| Voice DB exports | debug/db_exports/voice_conversations/ |
| Documentation | docs/ |
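The curl calls in the quick reference can also be issued from Python with only the standard library. This is a sketch under assumptions: `get_json` and `delete_bridge_request` are hypothetical helper names, the backend is assumed to be the local one on port 8000, and the endpoints are assumed to return JSON.

```python
import json
import urllib.request
from urllib.parse import quote

BACKEND = "http://localhost:8000"  # local development backend from this file


def get_json(path, base=BACKEND, timeout=5):
    """GET a backend path such as /api/agents and decode the JSON body.

    Assumes the endpoint answers with JSON; requires a running backend.
    """
    with urllib.request.urlopen(f"{base}{path}", timeout=timeout) as resp:
        return json.load(resp)


def delete_bridge_request(conversation_id, base=BACKEND):
    """Build (but do not send) the DELETE request that stops a voice bridge."""
    # Percent-encode the id so unusual characters cannot alter the path.
    url = f"{base}/api/realtime/webrtc/bridge/{quote(conversation_id, safe='')}"
    return urllib.request.Request(url, method="DELETE")
```

Sending the request is then `urllib.request.urlopen(delete_bridge_request(conversation_id), timeout=5)`. Keeping request construction separate from sending makes the URL easy to inspect without touching a live session.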
- Always Use TodoWrite - Track multi-step tasks
- Read Before Write - Understand context before changes
- Test Changes - Screenshots for UI, exports for backend
- Screenshot Before/After - Visual verification for UI changes
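The before/after screenshot practice can be wrapped in a small script so the two captures always use the same URL and Node binary. This is a minimal sketch, not an existing project tool: `screenshot_cmd` and `capture_before_after` are hypothetical names, the Node path is the local NVM one from this file, and the script is assumed to run from the repository root so that `debug/screenshot.js` resolves.

```python
import subprocess
import time
from pathlib import Path

# Full path to node, per the rule against bare node/npm commands (local dev).
NODE = str(Path.home() / ".nvm/versions/node/v22.21.1/bin/node")


def screenshot_cmd(url, output, node=NODE):
    """Return the argv list for one screenshot capture."""
    return [node, "debug/screenshot.js", url, output]


def capture_before_after(url, make_changes, reload_wait=3.0, run=subprocess.run):
    """Capture before.png, apply changes, wait for hot reload, capture after.png.

    `make_changes` is any callable that performs the edit; `run` is injectable
    so the sequence can be exercised without launching a browser.
    """
    run(screenshot_cmd(url, "before.png"), check=True)
    make_changes()
    time.sleep(reload_wait)  # give the dev server time to hot reload
    run(screenshot_cmd(url, "after.png"), check=True)
```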
# 1. Screenshot before
~/.nvm/versions/node/v22.21.1/bin/node debug/screenshot.js http://localhost:3000/agentic/voice before.png
# 2. Make changes
# 3. Wait for hot reload
sleep 3
# 4. Screenshot after
~/.nvm/versions/node/v22.21.1/bin/node debug/screenshot.js http://localhost:3000/agentic/voice after.png
# 5. Read and verify

- Design agent purpose - What specific task?
- Choose agent type - Looping or nested team?
- Select tools - What capabilities needed?
- Write system prompt - Clear instructions + examples
- Test iteratively - Start simple, add complexity
# Port conflict
lsof -i :8000
# Dependencies
cd backend && pip install -r requirements.txt
# MongoDB not running
sudo systemctl start mongodb

# Always use nvm path!
# (NODE_BIN rather than NODE_PATH, which Node.js itself reads for module lookup)
export NODE_BIN=~/.nvm/versions/node/v22.21.1/bin
# Port conflict
lsof -i :3000
# Reinstall dependencies
cd frontend && $NODE_BIN/npm install

# Check browser console for ICE state
# Expected: new → checking → connected → completed
# Backend logs
tail -f logs/backend.log | grep -i webrtc
# Verify OPENAI_API_KEY
grep OPENAI_API_KEY backend/.env

- QUICK_START.md - Get started in 5 minutes
- OPERATIONAL_CONTEXT.md - Context-specific behavior
- DOCUMENTATION_INDEX.md - Complete doc map
- DATABASE_AND_MEMORY_SETUP.md - MongoDB & ChromaDB
- MULTIMODAL_AGENT_GUIDE.md - Vision agents
- MOBILE_VOICE_GUIDE.md - Smartphone microphone
- Voice System Overview - Architecture & integration
- Voice Quick Start - 5-minute setup
- Voice Interactive Guide - Complete walkthrough
- Voice Commands - Command reference
- Voice Troubleshooting - Debug guide
- Backend Implementation - WebRTC bridge (coming soon)
- Frontend Implementation - React components (coming soon)
- Nested Agents Integration - Agent orchestration
- Audio Fixes Log - Historical audio issues & fixes
- JETSON_DEPLOYMENT_GUIDE.md - Production server
- TV_WEBVIEW_FIX_SUMMARY.md - TV compatibility
- REFACTORING_SUMMARY.md - Backend structure
- FRONTEND_REFACTORING.md - Frontend structure
- Created helper scripts: start-backend.sh, start-frontend.sh, start-webrtc-session.sh
- Added comprehensive WebRTC testing guides
- Dual logging (console + /tmp/agentic-logs/)
- Migrated from Pipecat to pure WebRTC bridge
- Direct OpenAI Realtime API connection
- Comprehensive unit and integration tests
- Extracted detailed content to focused guides
- Created QUICK_START.md for immediate setup
- Streamlined CLAUDE.md with references
- Critical instructions for using full nvm paths
- System node may be outdated/incompatible
- MongoDB + ChromaDB fully operational
- Memory banks migrated from git history
- Comprehensive setup documentation
End of CLAUDE.md
For detailed information on any topic, see the documentation links above.
Last updated: 2025-12-04