| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +**LLMHub** is a full-stack AI collaboration platform with computer automation capabilities. It features a Next.js frontend with a FastAPI Python backend that orchestrates multi-agent AI systems capable of browser automation, terminal operations, and desktop control through containerized virtual machines. |
| 8 | + |
| 9 | +## Architecture |
| 10 | + |
| 11 | +### Frontend (Next.js 15 + React 19) |
| 12 | +- **Framework**: Next.js 15 with App Router, TypeScript, Tailwind CSS |
| 13 | +- **State Management**: Zustand stores for chat, models, user, and sessions |
| 14 | +- **Key Libraries**: |
| 15 | + - Vercel AI SDK (`ai`) for streaming LLM responses |
| 16 | + - Radix UI for accessible components |
| 17 | + - Supabase for authentication and database |
| 18 | + - Stripe for billing/subscriptions |
| 19 | +- **Provider System**: Multi-provider AI support (OpenAI, Anthropic, Azure, Google, Mistral, xAI, OpenRouter, Perplexity) |
| 20 | + |
| 21 | +### Backend (Python FastAPI) |
| 22 | +- **Framework**: FastAPI with async/await patterns |
| 23 | +- **Key Services**: |
| 24 | + - `multi_agent_executor.py`: Orchestrates multi-agent task execution with browser, terminal, and desktop agents |
| 25 | + - `vm_control.py`: WebSocket-based VM control with persistent connections and auto-reconnection |
| 26 | + - `database.py`: Supabase integration for user data, chats, and billing |
| 27 | + - `agent_billing.py`: Tracks usage and credits for agent sessions |
| 28 | + - `search.py`: Google Custom Search API integration |
| 29 | +- **API Routes**: `/api/chat`, `/api/models`, `/api/search`, `/api/vm`, `/api/billing`, `/api/files` |
| 30 | + |
| 31 | +### VM Agent System |
| 32 | +- **Architecture**: Docker containers running Ubuntu 22.04 with XFCE desktop |
| 33 | +- **Agent Types**: |
| 34 | + - **Browser Agent**: Web automation using Chrome with remote debugging (search-first strategy) |
| 35 | + - **Terminal Agent**: Command execution and file operations |
| 36 | + - **Desktop Agent**: UI automation with screenshot analysis |
| 37 | +- **Communication**: WebSocket protocol on port 8080 (8081 for localhost) |
| 38 | +- **Tools**: Each agent has specialized tools (browser navigation, terminal commands, desktop controls) |
| 39 | + |
| 40 | +### Key Design Patterns |
| 41 | + |
| 42 | +#### Multi-Agent Execution Flow |
| 43 | +1. **Task Planning**: LLM decomposes user request into sequential subtasks |
| 44 | +2. **Agent Assignment**: Each subtask is assigned to a specialized agent (browser/terminal/desktop) |
| 45 | +3. **Sequential Execution**: Tasks execute in order (there is no dependency system) |
| 46 | +4. **Context Passing**: Summaries of previous tasks are passed to the next task for context |
| 47 | +5. **Streaming**: All execution streams via Server-Sent Events to frontend |
| 48 | + |
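
The execution flow above can be sketched as a plain loop. This is a minimal illustration, not the code in `multi_agent_executor.py`: the `Subtask` dataclass, `run_plan`, and the `execute` callback are hypothetical names, and a real agent run would call an LLM instead of the stand-in function used here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Subtask:
    """One planned step; `agent` is 'browser', 'terminal', or 'desktop'."""
    agent: str
    description: str


def run_plan(
    subtasks: List[Subtask],
    execute: Callable[[Subtask, List[str]], str],
) -> List[str]:
    """Run subtasks strictly in order, passing earlier summaries as context.

    `execute` is a hypothetical callback standing in for a real agent run.
    It receives the subtask plus the summaries of all completed subtasks
    and returns a summary of its own work.
    """
    summaries: List[str] = []
    for subtask in subtasks:
        # Pass a copy so an agent cannot mutate the shared history.
        summary = execute(subtask, list(summaries))
        summaries.append(summary)
    return summaries


def fake_execute(subtask: Subtask, context: List[str]) -> str:
    # Stand-in agent: reports how much context it was given.
    return f"{subtask.agent}: {subtask.description} (saw {len(context)} prior summaries)"


plan = [
    Subtask("browser", "find the download page"),
    Subtask("terminal", "download and unpack the archive"),
]
results = run_plan(plan, fake_execute)
```

Because there is no dependency system, the list order is the only thing that determines which context a subtask sees.
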
| 49 | +#### Provider Architecture |
| 50 | +- Located in `lib/providers/` and `backend/app/providers/` |
| 51 | +- Each provider implements streaming chat with tool calling |
| 52 | +- Frontend providers handle model selection and API routing |
| 53 | +- Backend providers execute tools and manage agent workflows |
| 54 | + |
| 55 | +#### State Management |
| 56 | +- **Chat Store** (`lib/chat-store/`): Manages conversations, messages, attachments |
| 57 | +- **Model Store** (`lib/model-store/`): Available models and provider configurations |
| 58 | +- **User Store** (`lib/user-store/`): User profile and authentication state |
| 59 | +- **VM Store** (`lib/vm-store/`): Virtual machine sessions and connections |
| 60 | + |
| 61 | +## Development Commands |
| 62 | + |
| 63 | +### Frontend Development |
| 64 | + |
| 65 | +```bash |
| 66 | +# Install dependencies |
| 67 | +npm install |
| 68 | + |
| 69 | +# Development server (with Turbopack) |
| 70 | +npm run dev |
| 71 | + |
| 72 | +# Production build |
| 73 | +npm run build |
| 74 | + |
| 75 | +# Start production server |
| 76 | +npm start |
| 77 | + |
| 78 | +# Type checking |
| 79 | +npm run type-check |
| 80 | + |
| 81 | +# Linting |
| 82 | +npm run lint |
| 83 | +``` |
| 84 | + |
| 85 | +### Backend Development |
| 86 | + |
| 87 | +```bash |
| 88 | +# Navigate to backend directory |
| 89 | +cd backend |
| 90 | + |
| 91 | +# Create virtual environment (first time) |
| 92 | +python -m venv venv |
| 93 | + |
| 94 | +# Activate virtual environment |
| 95 | +# Windows: |
| 96 | +venv\Scripts\activate |
| 97 | +# Linux/Mac: |
| 98 | +source venv/bin/activate |
| 99 | + |
| 100 | +# Install dependencies |
| 101 | +pip install -r requirements.txt |
| 102 | + |
| 103 | +# Run development server (from backend directory) |
| 104 | +python main.py |
| 105 | + |
| 106 | +# Or use the helper script |
| 107 | +# Windows: |
| 108 | +.\run_backend.bat |
| 109 | +# Linux/Mac: |
| 110 | +./run_backend.sh |
| 111 | +``` |
| 112 | + |
| 113 | +### Docker Deployment |
| 114 | + |
| 115 | +```bash |
| 116 | +# Build and start all services |
| 117 | +docker-compose up --build |
| 118 | + |
| 119 | +# Start services in detached mode |
| 120 | +docker-compose up -d |
| 121 | + |
| 122 | +# Stop services |
| 123 | +docker-compose down |
| 124 | + |
| 125 | +# View logs |
| 126 | +docker-compose logs -f |
| 127 | + |
| 128 | +# AI Desktop container (separate compose file) |
| 129 | +docker-compose -f docker-compose.ai-desktop.yml up --build |
| 130 | +``` |
| 131 | + |
| 132 | +### Testing |
| 133 | + |
| 134 | +```bash |
| 135 | +# Backend tests |
| 136 | +cd backend |
| 137 | +pytest |
| 138 | + |
| 139 | +# Run specific test file |
| 140 | +pytest tests/test_specific.py |
| 141 | + |
| 142 | +# Run with coverage |
| 143 | +pytest --cov=app tests/ |
| 144 | +``` |
| 145 | + |
| 146 | +## Environment Configuration |
| 147 | + |
| 148 | +### Frontend Environment Variables (.env) |
| 149 | +- `NEXT_PUBLIC_SUPABASE_URL`: Supabase project URL |
| 150 | +- `NEXT_PUBLIC_SUPABASE_ANON_KEY`: Supabase anonymous key |
| 151 | +- `SUPABASE_SERVICE_ROLE`: Supabase service role key (server-side) |
| 152 | +- `CSRF_SECRET`: CSRF protection secret (required) |
| 153 | +- `ENCRYPTION_KEY`: For encrypting user API keys (required for BYOK) |
| 154 | +- `PYTHON_BACKEND_URL`: Backend API URL (default: http://0.0.0.0:8001) |
| 155 | +- `NEXT_PUBLIC_BACKEND_URL`: Public backend URL (default: http://localhost:8001) |
| 156 | +- Azure credentials for VM provisioning (AZURE_*) |
| 157 | +- Stripe keys for billing (STRIPE_*) |
| 158 | +- Google Search API keys (GOOGLE_SEARCH_*) |
| 159 | + |
| 160 | +### Backend Environment Variables (backend/.env) |
| 161 | +- `DEBUG`: Enable debug mode (true/false) |
| 162 | +- `CORS_ORIGINS`: Allowed CORS origins (comma-separated) |
| 163 | +- `SUPABASE_URL`, `SUPABASE_ANON_KEY`, `SUPABASE_SERVICE_ROLE`: Supabase config |
| 164 | +- `CSRF_SECRET`, `ENCRYPTION_KEY`: Security keys (must match frontend) |
| 165 | +- `GOOGLE_SEARCH_KEY`, `GOOGLE_SEARCH_CX`: Google Custom Search API |
| 166 | + |
| 167 | +See `.env.example` and `backend/.env.example` for complete configuration templates. |
| 168 | + |
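
As an illustration of how the backend values described above might be read, the sketch below parses `DEBUG` and the comma-separated `CORS_ORIGINS`. The function names are hypothetical and the actual configuration code in `backend/app/core/` may differ, for example by using a settings library.

```python
from typing import List, Mapping


def parse_cors_origins(env: Mapping[str, str]) -> List[str]:
    """Split the comma-separated CORS_ORIGINS value, dropping blanks."""
    raw = env.get("CORS_ORIGINS", "")
    return [origin.strip() for origin in raw.split(",") if origin.strip()]


def parse_debug(env: Mapping[str, str]) -> bool:
    """Treat DEBUG as enabled only when it is 'true' (case-insensitive)."""
    return env.get("DEBUG", "false").strip().lower() == "true"


# Example values only; in practice pass os.environ instead of a dict.
example_env = {
    "DEBUG": "True",
    "CORS_ORIGINS": "http://localhost:3000, https://example.com",
}
origins = parse_cors_origins(example_env)
debug = parse_debug(example_env)
```
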
| 169 | +## Code Organization |
| 170 | + |
| 171 | +### Frontend Structure |
| 172 | +- `app/`: Next.js app directory with routes and layouts |
| 173 | + - `c/[chatId]/`: Individual chat pages |
| 174 | + - `api/`: API route handlers (Next.js API routes) |
| 175 | + - `auth/`, `billing/`, `account/`: Feature-specific pages |
| 176 | +- `components/`: Reusable React components |
| 177 | + - `ui/`: shadcn/ui components (Radix UI based) |
| 178 | + - `common/`: Shared components (chat interface, message display) |
| 179 | + - `prompt-kit/`: Prompt-related components |
| 180 | +- `lib/`: Business logic and utilities |
| 181 | + - `providers/`: AI provider implementations |
| 182 | + - `chat-store/`, `model-store/`, `user-store/`: Zustand state stores |
| 183 | + - `supabase/`: Database client and queries |
| 184 | + - `services/`: Service layer (API calls, utilities) |
| 185 | + |
| 186 | +### Backend Structure |
| 187 | +- `backend/app/` |
| 188 | + - `api/routes/`: FastAPI route handlers |
| 189 | + - `services/`: Core business logic |
| 190 | + - `multi_agent_executor.py`: Multi-agent orchestration |
| 191 | + - `vm_control.py`: VM WebSocket management |
| 192 | + - `database.py`: Supabase operations |
| 193 | + - `agent_billing.py`: Usage tracking |
| 194 | + - `core/`: Configuration, middleware, logging |
| 195 | + - `models/`: Pydantic data models |
| 196 | + - `providers/`: AI provider integrations |
| 197 | + - `utils/`: Utility functions |
| 198 | + |
| 199 | +### Docker Structure |
| 200 | +- `docker/ai-desktop/`: Ubuntu desktop container with AI agents |
| 201 | + - Includes Chrome, Node.js, Python, automation tools |
| 202 | + - WebSocket server for agent communication |
| 203 | + - VNC server for remote desktop access |
| 204 | + |
| 205 | +## Key Workflows |
| 206 | + |
| 207 | +### Adding a New AI Provider |
| 208 | + |
| 209 | +1. **Frontend**: Create provider in `lib/providers/your-provider.ts` |
| 210 | + - Implement `streamChat()` method with tool calling support |
| 211 | + - Add to `lib/providers/index.ts` |
| 212 | + |
| 213 | +2. **Backend**: Add provider support in `backend/app/providers/` |
| 214 | + - Configure API keys in environment |
| 215 | + - Update model lists in `models.py` |
| 216 | + |
| 217 | +### Creating a New Agent Type |
| 218 | + |
| 219 | +1. Add agent type to `AgentType` enum in `multi_agent_executor.py` |
| 220 | +2. Create agent prompt in `_get_*_agent_prompt()` method |
| 221 | +3. Define agent tools in `_get_*_tools()` method |
| 222 | +4. Update task planner to recognize new agent type |
| 223 | + |
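
The steps above touch three places that must stay in sync: the enum, the prompt, and the tool list. The sketch below shows one way to check that. The enum values, the `RESEARCH` member, the dictionaries, and every tool name except `browser_state` are assumptions for illustration; the real code uses `_get_*_agent_prompt()` and `_get_*_tools()` methods rather than dictionaries.

```python
from enum import Enum
from typing import Dict, List


class AgentType(str, Enum):
    BROWSER = "browser"
    TERMINAL = "terminal"
    DESKTOP = "desktop"
    RESEARCH = "research"  # hypothetical new agent type


# Hypothetical registries standing in for the per-agent prompt and tool methods.
AGENT_PROMPTS: Dict[AgentType, str] = {
    AgentType.BROWSER: "You automate a web browser.",
    AgentType.TERMINAL: "You run shell commands.",
    AgentType.DESKTOP: "You control the desktop UI.",
    AgentType.RESEARCH: "You gather and summarize sources.",
}

AGENT_TOOLS: Dict[AgentType, List[str]] = {
    AgentType.BROWSER: ["browser_state"],
    AgentType.TERMINAL: ["run_command"],
    AgentType.DESKTOP: ["take_screenshot"],
    AgentType.RESEARCH: ["web_search"],
}


def unregistered_agents() -> List[AgentType]:
    """List agent types that are missing a prompt or a tool list."""
    return [
        agent
        for agent in AgentType
        if agent not in AGENT_PROMPTS or agent not in AGENT_TOOLS
    ]


missing = unregistered_agents()
```

A check like this can run in a unit test so that a new enum member without a prompt or tools fails early instead of at task-planning time.
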
| 224 | +### Adding New VM Tools |
| 225 | + |
| 226 | +1. Create tool function in `backend/app/api/routes/chat_vm_tools.py` |
| 227 | +2. Define tool schema (name, description, parameters) |
| 228 | +3. Add tool to appropriate agent's tool list in `multi_agent_executor.py` |
| 229 | +4. Implement tool execution in VM agent server (if needed) |
| 230 | + |
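
A tool definition with a name, description, and parameters might look like the sketch below. The JSON-Schema-style `parameters` layout is a common convention for LLM tool calling and is assumed here; the `echo` tool, the `TOOLS` registry, and `call_tool` are hypothetical and do not reflect the contents of `chat_vm_tools.py`.

```python
from typing import Any, Callable, Dict, Tuple


def echo_tool(message: str) -> Dict[str, Any]:
    """Hypothetical tool: returns its input, standing in for a real VM call."""
    return {"success": True, "output": message}


ECHO_SCHEMA: Dict[str, Any] = {
    "name": "echo",
    "description": "Return the given message unchanged.",
    "parameters": {
        "type": "object",
        "properties": {"message": {"type": "string"}},
        "required": ["message"],
    },
}

# Maps tool name -> (schema, function).
TOOLS: Dict[str, Tuple[Dict[str, Any], Callable[..., Dict[str, Any]]]] = {
    ECHO_SCHEMA["name"]: (ECHO_SCHEMA, echo_tool),
}


def call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Dispatch a tool call, rejecting unknown tools and missing parameters."""
    if name not in TOOLS:
        return {"success": False, "error": f"unknown tool: {name}"}
    schema, func = TOOLS[name]
    missing = [p for p in schema["parameters"]["required"] if p not in arguments]
    if missing:
        return {"success": False, "error": f"missing parameters: {', '.join(missing)}"}
    return func(**arguments)


ok = call_tool("echo", {"message": "hi"})
bad = call_tool("echo", {})
```

Returning an error dictionary instead of raising keeps a malformed tool call from aborting the whole agent run; the model can read the error and retry.
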
| 231 | +## Important Technical Details |
| 232 | + |
| 233 | +### WebSocket Connection Management |
| 234 | +- VM connections are persistent with auto-reconnection |
| 235 | +- Heartbeat mechanism prevents stale connections |
| 236 | +- Connection reuse minimizes latency |
| 237 | +- Password authentication for VNC access |
| 238 | + |
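
Auto-reconnection is often implemented as a retry loop with exponential backoff. The sketch below shows that pattern with the standard library only. It is not the logic in `vm_control.py`: `connect_with_retry`, its parameters, and the `FlakyConnector` stand-in are hypothetical, and a real `connect` callable would open a WebSocket.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def connect_with_retry(
    connect: Callable[[], Awaitable[T]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
) -> T:
    """Call `connect` until it succeeds, doubling the delay after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return await connect()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the last error
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


class FlakyConnector:
    """Stand-in for opening a WebSocket: fails a fixed number of times first."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.calls = 0

    async def __call__(self) -> str:
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("VM not ready")
        return "connected"


connector = FlakyConnector(failures=2)
result = asyncio.run(connect_with_retry(connector, base_delay=0.01))
```
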
| 239 | +### Tool Response Handling |
| 240 | +- Tool responses are truncated to prevent context overflow (5000 chars) |
| 241 | +- `frontendScreenshot` field is preserved for the frontend and is not sent to the model |
| 242 | +- Screenshots are compressed (JPEG, 1280x720 max) before transmission |
| 243 | + |
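
The truncation and screenshot rules above could be combined as follows. This is a sketch under assumptions: it treats the tool response as a dictionary, applies the 5000-character limit to the serialized JSON, and appends a truncation note. The function name and those details are illustrative, not taken from the backend.

```python
import json
from typing import Any, Dict, Optional, Tuple

MAX_TOOL_RESPONSE_CHARS = 5000  # limit stated in the list above
TRUNCATION_NOTE = "... [truncated]"  # assumed suffix, for illustration


def prepare_tool_response(response: Dict[str, Any]) -> Tuple[str, Optional[str]]:
    """Split a tool response into (text for the model, screenshot for the frontend)."""
    payload = dict(response)  # shallow copy; leave the caller's dict untouched
    screenshot = payload.pop("frontendScreenshot", None)
    text = json.dumps(payload)
    if len(text) > MAX_TOOL_RESPONSE_CHARS:
        text = text[:MAX_TOOL_RESPONSE_CHARS] + TRUNCATION_NOTE
    return text, screenshot


model_text, screenshot = prepare_tool_response(
    {"output": "x" * 10_000, "frontendScreenshot": "<base64 jpeg>"}
)
```

Removing the screenshot before measuring the length matters: a base64 image would otherwise consume the whole budget and push the useful text out.
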
| 244 | +### Streaming Architecture |
| 245 | +- All AI responses stream via Server-Sent Events (SSE) |
| 246 | +- Tool calls and results stream separately from text |
| 247 | +- Frontend accumulates chunks and updates UI reactively |
| 248 | +- `finish` event signals completion with full content |
| 249 | + |
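
For reference, a Server-Sent Events frame is plain text: an optional `event:` line, one or more `data:` lines, and a blank line. The sketch below emits one frame per chunk and a final `finish` frame with the full content. Only the `finish` event name comes from the description above; the `text` event name and the `delta` and `content` keys are assumptions.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def sse_event(event: str, data: Dict[str, Any]) -> str:
    """Format one SSE frame: event name, JSON data, terminating blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_response(chunks: Iterable[str]) -> Iterator[str]:
    """Yield a frame per text chunk, then a `finish` frame with the full content."""
    parts: List[str] = []
    for chunk in chunks:
        parts.append(chunk)
        yield sse_event("text", {"delta": chunk})
    yield sse_event("finish", {"content": "".join(parts)})


frames = list(stream_response(["Hel", "lo"]))
```

A generator like `stream_response` can be handed to a streaming HTTP response with the `text/event-stream` media type.
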
| 250 | +### Task Execution Rules |
| 251 | +- Tasks execute sequentially (no parallel execution) |
| 252 | +- Each task receives context from all previous completed tasks |
| 253 | +- Tasks can request user input via `[NEED_USER_INPUT]` markers |
| 254 | +- Execution stops if an agent encounters a critical blocker |
| 255 | + |
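
Detecting the `[NEED_USER_INPUT]` marker can be as simple as the sketch below. It assumes the text after the marker is the question for the user, which the description above does not specify; `extract_user_prompt` is a hypothetical helper.

```python
from typing import Optional

NEED_USER_INPUT = "[NEED_USER_INPUT]"


def extract_user_prompt(agent_output: str) -> Optional[str]:
    """Return the text after the marker, or None if the agent did not ask for input."""
    _, found, after = agent_output.partition(NEED_USER_INPUT)
    if not found:
        return None
    return after.strip()


question = extract_user_prompt(
    "Login page reached. [NEED_USER_INPUT] Which account should I use?"
)
no_question = extract_user_prompt("Downloaded the file successfully.")
```
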
| 256 | +### Browser Agent Strategy |
| 257 | +- **Search-First**: Always use Google Search before opening the browser |
| 258 | +- **Minimal Browsing**: Only open the browser when an action is required (forms, clicks, purchases) |
| 259 | +- **State Validation**: Use `browser_state()` to verify actions |
| 260 | +- **Tab Management**: Reuse tabs instead of excessive navigation |
| 261 | + |
| 262 | +### Security Considerations |
| 263 | +- CSRF protection on all state-changing operations |
| 264 | +- API keys encrypted with `ENCRYPTION_KEY` (BYOK feature) |
| 265 | +- Rate limiting on backend endpoints |
| 266 | +- Supabase Row Level Security (RLS) for data access |
| 267 | +- No credentials stored in VM environments |
| 268 | + |
| 269 | +## Common Development Tasks |
| 270 | + |
| 271 | +### Adding a New Feature |
| 272 | +1. Design API endpoints in `backend/app/api/routes/` |
| 273 | +2. Implement business logic in `backend/app/services/` |
| 274 | +3. Create frontend components in `components/` |
| 275 | +4. Add state management in appropriate store (`lib/*-store/`) |
| 276 | +5. Wire up API calls in `lib/services/` or route handlers |
| 277 | + |
| 278 | +### Debugging VM Agent Issues |
| 279 | +1. Check WebSocket connection status in `vm_control.py` logs |
| 280 | +2. Verify agent tools are registered in `multi_agent_executor.py` |
| 281 | +3. Test tool execution with reduced context |
| 282 | +4. Check container logs: `docker logs <container-id>` |
| 283 | +5. Verify the agent WebSocket connection: `ws://localhost:8081` (localhost) or `ws://<ip>:8080` (remote) |
| 284 | + |
| 285 | +### Optimizing Performance |
| 286 | +1. **Frontend**: Use React.memo for expensive components, lazy load routes |
| 287 | +2. **Backend**: Enable caching in `cache.py`, optimize database queries |
| 288 | +3. **Streaming**: Batch small chunks, compress screenshots |
| 289 | +4. **VM**: Reuse connections, minimize tool calls, truncate responses |
| 290 | + |
| 291 | +### Database Migrations |
| 292 | +- Supabase migrations are handled via the Supabase Dashboard or CLI |
| 293 | +- Schema changes require updating Supabase types in `types/supabase.ts` |
| 294 | +- Run `supabase gen types typescript` to regenerate types |
| 295 | + |
| 296 | +## Ports and Services |
| 297 | + |
| 298 | +- **Frontend**: 3000 (Next.js dev server) |
| 299 | +- **Backend**: 8001 (FastAPI server) |
| 300 | +- **VM Agent WebSocket**: 8080 (remote), 8081 (localhost) |
| 301 | +- **VNC**: 5900 (desktop access) |
| 302 | +- **Supabase**: Hosted service (URLs in .env) |
| 303 | + |
| 304 | +## Additional Notes |
| 305 | + |
| 306 | +- **Frontend uses React Server Components** where applicable for better performance |
| 307 | +- **Backend runs on uvicorn** with auto-reload in development |
| 308 | +- **VM containers** are ephemeral and should be treated as stateless |
| 309 | +- **Billing system** tracks agent usage by session duration |
| 310 | +- **Multi-model support** allows users to switch providers mid-conversation |
| 311 | +- **Screenshot compression** is critical for performance (JPEG, 70% quality) |
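
The 1280x720 screenshot limit mentioned earlier implies a resize step before JPEG encoding. The sketch below computes the target size only, assuming the aspect ratio is preserved and images are never upscaled; both are assumptions, and `fit_within` is a hypothetical helper. The actual encoding (JPEG at 70% quality) would be done by an imaging library.

```python
from typing import Tuple


def fit_within(
    width: int, height: int, max_width: int = 1280, max_height: int = 720
) -> Tuple[int, int]:
    """Scale (width, height) down to fit the bounds, keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Never upscale: cap the scale factor at 1.0.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


full_hd = fit_within(1920, 1080)
small = fit_within(800, 600)
portrait = fit_within(1000, 2000)
```
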