CLAUDE.md - Agentic System Guide

Last Updated: 2025-12-07
For: Future Claude instances working on this codebase


🚀 Quick Start

New to this project? Start here:

  1. Quick Start Guide - Get running in 5 minutes
  2. Operational Context - Understand your environment (Local Dev vs Production)
  3. Documentation Index - Complete documentation map

Table of Contents

  1. Essential Context
  2. Project Overview
  3. Architecture
  4. Development Guides
  5. System Components
  6. Deployment
  7. Quick Reference

Essential Context

Detect Your Environment First

hostname && pwd

Two operational contexts exist:

  1. Local Development (VS Code) - Direct code editing, localhost URLs
  2. Jetson Production - Voice assistant, systemd services, HTTPS URLs

📖 See: OPERATIONAL_CONTEXT.md for complete context guide.
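
The same check can be scripted. The following is a minimal sketch, not part of the codebase: it assumes the Jetson can be recognized by its ARM64 architecture together with the conda environment path used elsewhere in this guide. The function names are hypothetical.

```python
import os
import platform

# Path taken from this guide's Jetson instructions. Treating its presence as a
# signal for "production" is an assumption of this sketch, not a project rule.
JETSON_CONDA_BIN = "/home/rodrigo/miniconda3/envs/agentic/bin"


def detect_context(machine: str, conda_bin_exists: bool) -> str:
    """Return 'jetson' or 'local' from two simple signals."""
    if machine == "aarch64" and conda_bin_exists:
        return "jetson"
    return "local"


def current_context() -> str:
    """Apply detect_context() to the machine this script runs on."""
    return detect_context(platform.machine(), os.path.isdir(JETSON_CONDA_BIN))


if __name__ == "__main__":
    print(current_context())
```

Keeping `detect_context` free of system calls makes the decision logic easy to test on either machine.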

⚠️ CRITICAL: Node.js Environment Differences

Local Development (uses NVM):

~/.nvm/versions/node/v22.21.1/bin/npm start

Jetson Production (uses Conda):

export PATH=/home/rodrigo/miniconda3/envs/agentic/bin:$PATH
npm run build

Note: Never use bare npm or node commands - always use full paths or set PATH first.
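
When npm or node has to be launched from a Python script, the same rule can be applied by building the environment explicitly. This is a hedged sketch: the two directories come from the commands above, while `NODE_BIN_DIRS` and `node_env` are hypothetical names.

```python
import os
from typing import Dict, Optional

# Directories copied from the two commands above.
NODE_BIN_DIRS = {
    "local": os.path.expanduser("~/.nvm/versions/node/v22.21.1/bin"),
    "jetson": "/home/rodrigo/miniconda3/envs/agentic/bin",
}


def node_env(context: str, base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the right Node.js bin dir first on PATH."""
    env = dict(os.environ if base_env is None else base_env)
    bin_dir = NODE_BIN_DIRS[context]
    existing = env.get("PATH", "")
    # Prepend so this Node.js wins over any system-wide installation.
    env["PATH"] = bin_dir + os.pathsep + existing if existing else bin_dir
    return env
```

A possible use is `subprocess.run(["npm", "run", "build"], cwd="frontend", env=node_env("jetson"), check=True)`.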


Project Overview

An agentic AI system with Python backend (FastAPI + AutoGen) and React frontend.

Key Features

  • Multi-agent coordination - Nested team agents with orchestration
  • Multimodal vision agents - Interpret images from tool responses
  • Voice assistant - OpenAI Realtime API with WebRTC bridge
  • Mobile voice interface - Smartphone as wireless microphone
  • Claude Code integration - Live code self-editing
  • Memory management - ChromaDB + MongoDB storage
  • Screenshot workflows - Automated UI development

Tech Stack

Backend:

  • Python 3.x, FastAPI, AutoGen
  • ChromaDB 0.4.24, MongoDB 3.6+, SQLite
  • OpenAI API, Anthropic API

Frontend:

  • React 18, Material-UI
  • WebSocket + WebRTC
  • Feature-based architecture

Architecture

Directory Structure

/home/rodrigo/agentic/
├── backend/                    # Python FastAPI server
│   ├── main.py                 # FastAPI entry point
│   ├── agents/                 # Agent JSON configurations
│   ├── tools/                  # Custom tool implementations
│   ├── core/                   # Agent execution engine
│   ├── api/                    # Voice & Claude Code controllers
│   ├── config/                 # Configuration & schemas
│   ├── utils/                  # Utility modules
│   └── tests/                  # Test suite
│
├── frontend/                   # React application
│   └── src/features/           # Feature-based organization
│       ├── agents/             # Agent management
│       ├── tools/              # Tool management
│       └── voice/              # Voice assistant
│
├── debug/                      # Screenshot & export tools
├── docs/                       # Documentation
└── CLAUDE.md                   # This file

Data Flow

User Input → WebSocket → FastAPI → Agent Factory → Agent Execution
    → Tool Execution → WebSocket Stream → Frontend → UI Display
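
The streaming part of this flow can be illustrated without the real stack. The sketch below uses only asyncio. `run_agent`, `handle_message`, and the event type names are hypothetical stand-ins, not the project's actual functions or message schema.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, Dict, List

Event = Dict[str, str]


async def run_agent(user_input: str) -> AsyncIterator[Event]:
    """Stand-in for Agent Factory + Agent Execution: yields events as they occur."""
    yield {"type": "agent_message", "content": f"Working on: {user_input}"}
    yield {"type": "tool_result", "content": "tool output placeholder"}
    yield {"type": "final", "content": "done"}


async def handle_message(user_input: str, send: Callable[[Event], Awaitable[None]]) -> None:
    """Stand-in for the WebSocket handler: forward each event as soon as it exists."""
    async for event in run_agent(user_input):
        await send(event)


async def demo() -> List[Event]:
    """Collect the events a frontend would receive for one user message."""
    received: List[Event] = []

    async def send(event: Event) -> None:
        received.append(event)

    await handle_message("list files", send)
    return received


if __name__ == "__main__":
    for event in asyncio.run(demo()):
        print(event)
```

The point of the sketch is that events are forwarded one at a time rather than after the whole run finishes, which is what allows the UI to update while tools are still executing.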

Important URLs

Local Development:

  • Backend: http://localhost:8000
  • Frontend: http://localhost:3000/agentic/

Production (Jetson):

  • Backend: https://192.168.0.200/api/
  • Frontend: https://192.168.0.200/agentic/

Development Guides

Agent Development

Agent Types:

  • looping - Single agent with tool loop
  • nested_team - Multi-agent coordination
  • multimodal_tools_looping - Vision-capable agent
  • dynamic_init_looping - Custom initialization logic

Create New Agent:

  1. Create JSON in backend/agents/YourAgent.json
  2. Configure tools, LLM, prompts
  3. Test via frontend

📖 Complete Guide: Creating New Agents (see full CLAUDE.md)
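
Before testing a new agent in the frontend, it can help to confirm that every file in backend/agents/ is at least valid JSON. The following sketch makes no assumption about the configuration schema, which is documented in the full guide. `load_agent_configs` is a hypothetical helper.

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_agent_configs(agents_dir: str = "backend/agents") -> Dict[str, Any]:
    """Parse every *.json file in agents_dir, keyed by file name without extension."""
    configs: Dict[str, Any] = {}
    for path in sorted(Path(agents_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            # json.load raises json.JSONDecodeError if the file is malformed.
            configs[path.stem] = json.load(handle)
    return configs


if __name__ == "__main__":
    for name in load_agent_configs():
        print(name)
```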

Tool Development

Tool Structure:

# tools/my_tool.py
from autogen_core.tools import FunctionTool

def my_tool(param: str) -> str:
    """Tool description for LLM"""
    return f"Result: {param}"

my_tool_func = FunctionTool(
    func=my_tool,
    name="my_tool",
    description="Brief description"
)

tools = [my_tool_func]

📖 Complete Guide: Creating New Tools (see full CLAUDE.md)
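
Because the wrapped function is ordinary Python, it can be checked on its own before it is registered as a tool. The sketch below repeats `my_tool` from the example above so that it runs standalone. The test function names are hypothetical.

```python
def my_tool(param: str) -> str:
    """Tool description for LLM"""
    return f"Result: {param}"


def test_my_tool_formats_result() -> None:
    assert my_tool("ping") == "Result: ping"


def test_my_tool_has_docstring() -> None:
    # The docstring carries the description shown in the example above.
    assert my_tool.__doc__


if __name__ == "__main__":
    test_my_tool_formats_result()
    test_my_tool_has_docstring()
    print("ok")
```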

Voice System

WebRTC Bridge Architecture:

Browser (WebRTC) ↔ Backend (aiortc) ↔ OpenAI Realtime API (WebRTC)

Start Interactive Session:

# Terminal 1
./start-backend.sh

# Terminal 2
./start-frontend.sh

# Browser: http://localhost:3000/agentic/voice

📖 Complete Guides:

Mobile Voice Interface

Purpose: Use smartphone as wireless microphone

Access: http://[YOUR_IP]:3000/agentic/mobile-voice

📖 Complete Guide: docs/guides/MOBILE_VOICE_GUIDE.md
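
To fill in [YOUR_IP], the development machine's LAN address is needed. The following is a best-effort sketch using only the standard library. `lan_ip` and `mobile_voice_url` are hypothetical helpers, and the detection may pick the wrong interface on machines with several networks.

```python
import socket


def lan_ip() -> str:
    """Best-effort LAN address of this machine; falls back to loopback."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            # connect() on a UDP socket sends no packets; it only makes the OS
            # choose the outgoing interface, whose address is then read back.
            sock.connect(("10.255.255.255", 1))
            return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"


def mobile_voice_url(ip: str, port: int = 3000) -> str:
    """Build the mobile voice URL in the form given above."""
    return f"http://{ip}:{port}/agentic/mobile-voice"


if __name__ == "__main__":
    print(mobile_voice_url(lan_ip()))
```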


System Components

Database & Memory

Two Systems:

  1. MongoDB - Structured data (Database agent, 10 tools)
  2. ChromaDB - Vector memory (Memory agent, 8 tools)

📖 Complete Setup: docs/guides/DATABASE_AND_MEMORY_SETUP.md

Debugging Tools

Screenshot Automation:

~/.nvm/versions/node/v22.21.1/bin/node debug/screenshot.js http://localhost:3000/agentic/voice

Voice Conversation Export:

python3 debug/export_voice_conversations.py

📖 Complete Guide: Debugging Tools & Workflows (see full CLAUDE.md)

Claude Code Integration

Self-editing capabilities via voice:

  • Voice model calls: send_to_claude_code({text: "Add feature"})
  • Claude Code executes with bypassPermissions mode
  • Events stream back via WebSocket

📖 Complete Guide: Claude Code Self-Editor (see full CLAUDE.md)


Deployment

Jetson Nano Production

Server: 192.168.0.200 (ARM64, Ubuntu 18.04.6 LTS)

Quick Access:

# SSH
ssh rodrigo@192.168.0.200

# HTTPS
https://192.168.0.200/agentic/

Environment: Miniconda3 with agentic conda environment (Python 3.11, Node 20.17)

Common Tasks:

# Set PATH for conda environment (required for non-interactive SSH)
export PATH=/home/rodrigo/miniconda3/envs/agentic/bin:$PATH

# Deploy frontend update
cd ~/agentic/frontend
npm install  # if new dependencies
npm run build
sudo kill -HUP $(cat ~/nginx.pid)

# Restart backend
sudo systemctl restart agentic-backend

# View logs
sudo journalctl -u agentic-backend -f

📖 Complete Guide: docs/deployment/JETSON_DEPLOYMENT_GUIDE.md


Quick Reference

Common Commands

# Start services (local dev; run each command in its own terminal from the repo root)
cd backend && source venv/bin/activate && uvicorn main:app --reload
cd frontend && ~/.nvm/versions/node/v22.21.1/bin/npm start

# Run tests
cd backend && pytest tests/ -v

# Take screenshot
~/.nvm/versions/node/v22.21.1/bin/node debug/screenshot.js http://localhost:3000/agentic/voice

# Export voice data
python3 debug/export_voice_conversations.py

# List agents/tools
curl http://localhost:8000/api/agents
curl http://localhost:8000/api/tools

WebRTC Voice Commands

# Test WebRTC (no network)
cd backend && pytest tests/integration/test_backend_webrtc_integration.py -v

# Check active sessions
curl http://localhost:8000/api/realtime/conversations

# Stop voice bridge
curl -X DELETE http://localhost:8000/api/realtime/webrtc/bridge/{conversation_id}

# Monitor logs
tail -f logs/backend.log | grep -E "(webrtc|openai)"
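
The stop command can also be issued from Python with the standard library. This sketch mirrors the curl call above. The endpoint path comes from this guide, while `stop_bridge_request` and `send` are hypothetical helpers and the response body is not interpreted.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:8000"


def stop_bridge_request(conversation_id: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the DELETE request for one voice bridge."""
    # Percent-encode the id so unusual characters cannot alter the path.
    safe_id = quote(conversation_id, safe="")
    url = f"{base_url}/api/realtime/webrtc/bridge/{safe_id}"
    return urllib.request.Request(url, method="DELETE")


def send(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Separating request construction from sending keeps the URL logic testable without a running backend.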

File Locations

| Purpose | Location |
|---|---|
| Agent configs | backend/agents/*.json |
| Tool implementations | backend/tools/*.py |
| Voice controller | backend/api/realtime_voice_webrtc.py |
| OpenAI WebRTC client | backend/api/openai_webrtc_client.py |
| Frontend voice | frontend/src/features/voice/pages/ |
| Screenshot tool | debug/screenshot.js |
| Voice DB exports | debug/db_exports/voice_conversations/ |
| Documentation | docs/ |

Best Practices

Development Workflow

  1. Always Use TodoWrite - Track multi-step tasks
  2. Read Before Write - Understand context before changes
  3. Test Changes - Screenshots for UI, exports for backend
  4. Screenshot Before/After - Visual verification for UI changes

UI Development

# 1. Screenshot before (full nvm path, per the Node.js note above)
~/.nvm/versions/node/v22.21.1/bin/node debug/screenshot.js http://localhost:3000/agentic/voice before.png

# 2. Make changes

# 3. Wait for hot reload
sleep 3

# 4. Screenshot after
~/.nvm/versions/node/v22.21.1/bin/node debug/screenshot.js http://localhost:3000/agentic/voice after.png

# 5. Read and verify
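
A quick sanity check before reading the images is to confirm that the two screenshots differ at all. The sketch below compares file hashes. It is a coarse check under the assumption that an unchanged page renders byte-identically, and the helper names are hypothetical. It does not replace looking at the screenshots.

```python
import hashlib
from pathlib import Path


def file_digest(path: str) -> str:
    """SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def screenshots_differ(before: str, after: str) -> bool:
    """True if the two files are not byte-identical."""
    return file_digest(before) != file_digest(after)


if __name__ == "__main__":
    print(screenshots_differ("before.png", "after.png"))
```

Identical hashes suggest the change did not reach the page, for example because hot reload had not finished.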

Agent Development

  1. Design agent purpose - What specific task?
  2. Choose agent type - Looping or nested team?
  3. Select tools - What capabilities needed?
  4. Write system prompt - Clear instructions + examples
  5. Test iteratively - Start simple, add complexity

Troubleshooting

Backend Issues

# Port conflict
lsof -i :8000

# Dependencies
cd backend && pip install -r requirements.txt

# MongoDB not running
sudo systemctl start mongodb
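
Where lsof is unavailable, a port conflict can be detected from Python by attempting a TCP connection. This is a minimal sketch with a hypothetical helper name. It reports whether something accepts connections on the port, not which process owns it.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in (8000, 3000):
        print(port, "in use" if port_in_use(port) else "free")
```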

Frontend Issues

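# Always use the nvm Node.js. NODE_PATH is reserved by Node.js for module
# lookup, so put the nvm bin directory on PATH instead.
export PATH="$HOME/.nvm/versions/node/v22.21.1/bin:$PATH"

# Port conflict
lsof -i :3000

# Reinstall dependencies
cd frontend && npm install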

WebRTC Issues

# Check browser console for ICE state
# Expected: new → checking → connected → completed

# Backend logs
tail -f logs/backend.log | grep -i webrtc

# Verify OPENAI_API_KEY is present (prints the number of matching lines, not the key)
grep -c OPENAI_API_KEY backend/.env
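
When ICE states are copied from the browser console, the expected order above can be checked mechanically. The following is a small sketch, not part of the project. `first_unexpected_state` is a hypothetical helper that tolerates repeated or skipped states but flags anything out of order or outside the expected list.

```python
from typing import Iterable, Optional

EXPECTED_ICE_STATES = ["new", "checking", "connected", "completed"]


def first_unexpected_state(observed: Iterable[str]) -> Optional[str]:
    """Return the first state that breaks the expected order, or None if all fit."""
    position = 0
    for state in observed:
        # A state is acceptable if it is the current one or any later one.
        if state not in EXPECTED_ICE_STATES[position:]:
            return state
        position = EXPECTED_ICE_STATES.index(state)
    return None


if __name__ == "__main__":
    print(first_unexpected_state(["new", "checking", "failed"]))
```

A return value such as `failed` or `disconnected` indicates where to start looking in the backend logs.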

Documentation Index

Core Documentation

Feature Guides

Voice System

Voice Technical Details

Deployment

Architecture


Recent Changes

Interactive Session Setup (2025-12-04)

  • Created helper scripts: start-backend.sh, start-frontend.sh, start-webrtc-session.sh
  • Added comprehensive WebRTC testing guides
  • Dual logging (console + /tmp/agentic-logs/)

WebRTC Bridge Migration (2025-12-04)

  • Migrated from Pipecat to pure WebRTC bridge
  • Direct OpenAI Realtime API connection
  • Comprehensive unit and integration tests

Documentation Reorganization (2025-12-04)

  • Extracted detailed content to focused guides
  • Created QUICK_START.md for immediate setup
  • Streamlined CLAUDE.md with references

NVM Node Path Enforcement (2025-12-04)

  • Critical instructions for using full nvm paths
  • System node may be outdated/incompatible

Database & Memory System (2025-12-02)

  • MongoDB + ChromaDB fully operational
  • Memory banks migrated from git history
  • Comprehensive setup documentation

End of CLAUDE.md

For detailed information on any topic, see the documentation links above.

Last updated: 2025-12-04