Commit 6f58222

feat: docs initial
1 parent 5beefc6 commit 6f58222

9 files changed

+2257
-21
lines changed

CLAUDE.md

Lines changed: 311 additions & 0 deletions
@@ -0,0 +1,311 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

**LLMHub** is a full-stack AI collaboration platform with computer automation capabilities. It features a Next.js frontend with a FastAPI Python backend that orchestrates multi-agent AI systems capable of browser automation, terminal operations, and desktop control through containerized virtual machines.

## Architecture

### Frontend (Next.js 15 + React 19)
- **Framework**: Next.js 15 with App Router, TypeScript, Tailwind CSS
- **State Management**: Zustand stores for chat, models, user, and sessions
- **Key Libraries**:
  - Vercel AI SDK (`ai`) for streaming LLM responses
  - Radix UI for accessible components
  - Supabase for authentication and database
  - Stripe for billing/subscriptions
- **Provider System**: Multi-provider AI support (OpenAI, Anthropic, Azure, Google, Mistral, xAI, OpenRouter, Perplexity)

### Backend (Python FastAPI)
- **Framework**: FastAPI with async/await patterns
- **Key Services**:
  - `multi_agent_executor.py`: Orchestrates multi-agent task execution with browser, terminal, and desktop agents
  - `vm_control.py`: WebSocket-based VM control with persistent connections and auto-reconnection
  - `database.py`: Supabase integration for user data, chats, and billing
  - `agent_billing.py`: Tracks usage and credits for agent sessions
  - `search.py`: Google Custom Search API integration
- **API Routes**: `/api/chat`, `/api/models`, `/api/search`, `/api/vm`, `/api/billing`, `/api/files`

### VM Agent System
- **Architecture**: Docker containers running Ubuntu 22.04 with XFCE desktop
- **Agent Types**:
  - **Browser Agent**: Web automation using Chrome with remote debugging (search-first strategy)
  - **Terminal Agent**: Command execution and file operations
  - **Desktop Agent**: UI automation with screenshot analysis
- **Communication**: WebSocket protocol on port 8080 (8081 for localhost)
- **Tools**: Each agent has specialized tools (browser navigation, terminal commands, desktop controls)

### Key Design Patterns

#### Multi-Agent Execution Flow
1. **Task Planning**: The LLM decomposes the user request into sequential subtasks
2. **Agent Assignment**: Each subtask is assigned to a specialized agent (browser/terminal/desktop)
3. **Sequential Execution**: Tasks execute in order; there is no dependency system
4. **Context Passing**: Summaries of previous tasks are passed to the next task for context
5. **Streaming**: All execution streams to the frontend via Server-Sent Events

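The execution flow above can be sketched roughly as follows. This is a minimal illustration, not the code in `multi_agent_executor.py`: the `Subtask` class, `run_plan`, and `echo_agent` are hypothetical names, and planning and streaming are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Subtask:
    agent: str        # "browser", "terminal", or "desktop"
    description: str


# Hypothetical agent signature: (task description, prior summaries) -> summary
AgentFn = Callable[[str, List[str]], str]


def run_plan(plan: List[Subtask], agents: Dict[str, AgentFn]) -> List[str]:
    """Run subtasks strictly in order, passing earlier summaries to each one."""
    summaries: List[str] = []
    for task in plan:
        run_agent = agents[task.agent]
        # Pass a copy so an agent cannot mutate the shared history.
        summary = run_agent(task.description, list(summaries))
        summaries.append(summary)
    return summaries


def echo_agent(description: str, context: List[str]) -> str:
    # Stand-in agent that only reports how much context it received.
    return f"{description} (saw {len(context)} prior summaries)"


plan = [Subtask("browser", "find docs"), Subtask("terminal", "install package")]
result = run_plan(plan, {"browser": echo_agent, "terminal": echo_agent})
```

Because each task sees every earlier summary, the second task here receives one summary and the first receives none.
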
#### Provider Architecture
- Located in `lib/providers/` and `backend/app/providers/`
- Each provider implements streaming chat with tool calling
- Frontend providers handle model selection and API routing
- Backend providers execute tools and manage agent workflows

#### State Management
- **Chat Store** (`lib/chat-store/`): Manages conversations, messages, attachments
- **Model Store** (`lib/model-store/`): Available models and provider configurations
- **User Store** (`lib/user-store/`): User profile and authentication state
- **VM Store** (`lib/vm-store/`): Virtual machine sessions and connections

## Development Commands

### Frontend Development

```bash
# Install dependencies
npm install

# Development server (with Turbopack)
npm run dev

# Production build
npm run build

# Start production server
npm start

# Type checking
npm run type-check

# Linting
npm run lint
```

### Backend Development

```bash
# Navigate to backend directory
cd backend

# Create virtual environment (first time)
python -m venv venv

# Activate virtual environment
# Windows:
venv\Scripts\activate
# Linux/Mac:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run development server (from backend directory)
python main.py

# Or use the helper script
# Windows:
.\run_backend.bat
# Linux/Mac:
./run_backend.sh
```

### Docker Deployment

```bash
# Build and start all services
docker-compose up --build

# Start services in detached mode
docker-compose up -d

# Stop services
docker-compose down

# View logs
docker-compose logs -f

# AI Desktop container (separate compose file)
docker-compose -f docker-compose.ai-desktop.yml up --build
```

### Testing

```bash
# Backend tests
cd backend
pytest

# Run specific test file
pytest tests/test_specific.py

# Run with coverage
pytest --cov=app tests/
```

## Environment Configuration

### Frontend Environment Variables (.env)
- `NEXT_PUBLIC_SUPABASE_URL`: Supabase project URL
- `NEXT_PUBLIC_SUPABASE_ANON_KEY`: Supabase anonymous key
- `SUPABASE_SERVICE_ROLE`: Supabase service role key (server-side)
- `CSRF_SECRET`: CSRF protection secret (required)
- `ENCRYPTION_KEY`: For encrypting user API keys (required for BYOK)
- `PYTHON_BACKEND_URL`: Backend API URL (default: http://0.0.0.0:8001)
- `NEXT_PUBLIC_BACKEND_URL`: Public backend URL (default: http://localhost:8001)
- Azure credentials for VM provisioning (AZURE_*)
- Stripe keys for billing (STRIPE_*)
- Google Search API keys (GOOGLE_SEARCH_*)

### Backend Environment Variables (backend/.env)
- `DEBUG`: Enable debug mode (true/false)
- `CORS_ORIGINS`: Allowed CORS origins (comma-separated)
- `SUPABASE_URL`, `SUPABASE_ANON_KEY`, `SUPABASE_SERVICE_ROLE`: Supabase config
- `CSRF_SECRET`, `ENCRYPTION_KEY`: Security keys (must match frontend)
- `GOOGLE_SEARCH_KEY`, `GOOGLE_SEARCH_CX`: Google Custom Search API

See `.env.example` and `backend/.env.example` for complete configuration templates.

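As an illustration of the two value formats named above (a true/false flag and a comma-separated list), the following sketch parses `DEBUG` and `CORS_ORIGINS`. The helper names and the returned dictionary shape are assumptions; the repository's own settings code under `backend/app/core/` may look different.

```python
import os
from typing import Any, Dict, List, Mapping


def parse_bool(value: str) -> bool:
    # Only the literal "true" (any case) enables the flag.
    return value.strip().lower() == "true"


def parse_origins(value: str) -> List[str]:
    # Split on commas and drop empty entries and surrounding whitespace.
    return [origin.strip() for origin in value.split(",") if origin.strip()]


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Hypothetical loader for two of the backend variables."""
    return {
        "debug": parse_bool(env.get("DEBUG", "false")),
        "cors_origins": parse_origins(env.get("CORS_ORIGINS", "")),
    }


settings = load_settings(
    {"DEBUG": "True", "CORS_ORIGINS": "http://localhost:3000, https://example.com"}
)
```

Passing the environment as a mapping keeps the parsing testable without touching the real process environment.
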
## Code Organization

### Frontend Structure
- `app/`: Next.js app directory with routes and layouts
  - `c/[chatId]/`: Individual chat pages
  - `api/`: API route handlers (Next.js API routes)
  - `auth/`, `billing/`, `account/`: Feature-specific pages
- `components/`: Reusable React components
  - `ui/`: shadcn/ui components (Radix UI based)
  - `common/`: Shared components (chat interface, message display)
  - `prompt-kit/`: Prompt-related components
- `lib/`: Business logic and utilities
  - `providers/`: AI provider implementations
  - `chat-store/`, `model-store/`, `user-store/`: Zustand state stores
  - `supabase/`: Database client and queries
  - `services/`: Service layer (API calls, utilities)

### Backend Structure
- `backend/app/`
  - `api/routes/`: FastAPI route handlers
  - `services/`: Core business logic
    - `multi_agent_executor.py`: Multi-agent orchestration
    - `vm_control.py`: VM WebSocket management
    - `database.py`: Supabase operations
    - `agent_billing.py`: Usage tracking
  - `core/`: Configuration, middleware, logging
  - `models/`: Pydantic data models
  - `providers/`: AI provider integrations
  - `utils/`: Utility functions

### Docker Structure
- `docker/ai-desktop/`: Ubuntu desktop container with AI agents
  - Includes Chrome, Node.js, Python, automation tools
  - WebSocket server for agent communication
  - VNC server for remote desktop access

## Key Workflows

### Adding a New AI Provider

1. **Frontend**: Create provider in `lib/providers/your-provider.ts`
   - Implement `streamChat()` method with tool calling support
   - Add to `lib/providers/index.ts`

2. **Backend**: Add provider support in `backend/app/providers/`
   - Configure API keys in environment
   - Update model lists in `models.py`

### Creating a New Agent Type

1. Add the agent type to the `AgentType` enum in `multi_agent_executor.py`
2. Create the agent prompt in a `_get_*_agent_prompt()` method
3. Define the agent tools in a `_get_*_tools()` method
4. Update the task planner to recognize the new agent type

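A rough sketch of steps 1 to 3, with a small check that catches a half-registered agent. Only the `AgentType` name comes from this document; the enum values, the `RESEARCH` member, and the two registries are hypothetical stand-ins for the `_get_*_agent_prompt()` and `_get_*_tools()` methods.

```python
from enum import Enum
from typing import Dict, List


class AgentType(str, Enum):
    BROWSER = "browser"
    TERMINAL = "terminal"
    DESKTOP = "desktop"
    RESEARCH = "research"  # hypothetical new agent type


# Hypothetical registries; the real code uses per-agent methods instead.
AGENT_PROMPTS: Dict[AgentType, str] = {
    AgentType.BROWSER: "You automate a web browser.",
    AgentType.TERMINAL: "You run shell commands.",
    AgentType.DESKTOP: "You control the desktop UI.",
    AgentType.RESEARCH: "You gather and summarize sources.",
}
AGENT_TOOLS: Dict[AgentType, List[str]] = {
    AgentType.BROWSER: ["browser_state"],
    AgentType.TERMINAL: ["run_command"],
    AgentType.DESKTOP: ["take_screenshot"],
    # RESEARCH has no tools yet, so the check below reports it.
}


def missing_registrations() -> List[AgentType]:
    """List agent types that lack either a prompt or a tool list."""
    return [
        agent
        for agent in AgentType
        if agent not in AGENT_PROMPTS or agent not in AGENT_TOOLS
    ]


incomplete = missing_registrations()
```

A check like this can run at startup or in a test, so a new enum member without tools fails early rather than at task time.
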
### Adding New VM Tools

1. Create the tool function in `backend/app/api/routes/chat_vm_tools.py`
2. Define the tool schema (name, description, parameters)
3. Add the tool to the appropriate agent's tool list in `multi_agent_executor.py`
4. Implement tool execution in the VM agent server (if needed)

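For step 2, a tool schema with a name, description, and parameters might look like the sketch below. The tool name `vm_read_file`, its `path` parameter, and the JSON-Schema-style layout are assumptions for illustration; the schema format actually used in `chat_vm_tools.py` is not shown in this document.

```python
from typing import Any, Dict, List

# Hypothetical tool schema in a JSON-Schema-like layout.
READ_FILE_TOOL: Dict[str, Any] = {
    "name": "vm_read_file",
    "description": "Read a text file from the VM and return its contents.",
    "parameters": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Absolute file path"},
        },
        "required": ["path"],
    },
}


def missing_required(schema: Dict[str, Any], arguments: Dict[str, Any]) -> List[str]:
    """Return the required parameter names absent from a tool call."""
    required = schema["parameters"].get("required", [])
    return [name for name in required if name not in arguments]


problems = missing_required(READ_FILE_TOOL, {})
```

Validating arguments against the schema before dispatching to the VM avoids spending a round trip on a call that cannot succeed.
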
## Important Technical Details

### WebSocket Connection Management
- VM connections are persistent with auto-reconnection
- Heartbeat mechanism prevents stale connections
- Connection reuse minimizes latency
- Password authentication for VNC access

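One common way to implement auto-reconnection is retrying with exponential backoff. The sketch below assumes that strategy; this document does not say which one `vm_control.py` uses, and `backoff_delay`, `connect_with_retry`, and the delay values are hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Delay before the next retry: base * 2^attempt, limited to cap seconds."""
    return min(cap, base * (2 ** attempt))


def connect_with_retry(
    connect: Callable[[], T],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds or max_attempts is exhausted."""
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            # Do not sleep after the final failed attempt.
            if attempt < max_attempts - 1:
                sleep(backoff_delay(attempt))
    raise ConnectionError(f"gave up after {max_attempts} attempts") from last_error
```

The `sleep` parameter is injectable so the retry logic can be tested without real waiting; in production code `connect` would open the WebSocket.
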
### Tool Response Handling
- Tool responses are truncated to 5000 characters to prevent context overflow
- The `frontendScreenshot` field is preserved for the frontend and is not sent to the model
- Screenshots are compressed (JPEG, 1280x720 max) before transmission

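The first two rules can be sketched as a single helper that separates the screenshot from the payload and truncates long string fields. The function name, the truncation suffix, and the assumption that responses are flat dictionaries are all illustrative; only the 5000-character limit and the `frontendScreenshot` field name come from the text above.

```python
from typing import Any, Dict, Optional, Tuple

MAX_TOOL_CHARS = 5000
TRUNCATION_SUFFIX = "... [truncated]"  # hypothetical marker


def split_tool_response(
    response: Dict[str, Any], limit: int = MAX_TOOL_CHARS
) -> Tuple[Dict[str, Any], Optional[Any]]:
    """Return (payload for the model, screenshot for the frontend)."""
    screenshot = response.get("frontendScreenshot")
    for_model: Dict[str, Any] = {}
    for key, value in response.items():
        if key == "frontendScreenshot":
            continue  # never forwarded to the model
        if isinstance(value, str) and len(value) > limit:
            value = value[:limit] + TRUNCATION_SUFFIX
        for_model[key] = value
    return for_model, screenshot


raw = {"output": "x" * 6000, "frontendScreenshot": "base64-data"}
model_payload, shot = split_tool_response(raw)
```

The input dictionary is left unchanged, so the full response stays available for logging or for the frontend.
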
### Streaming Architecture
- All AI responses stream via Server-Sent Events (SSE)
- Tool calls and results stream separately from text
- The frontend accumulates chunks and updates the UI reactively
- A `finish` event signals completion and carries the full content

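A minimal sketch of that streaming shape, using the standard SSE wire format (`event:` and `data:` lines followed by a blank line). The `finish` event name is from the list above; the `text` event name and the `delta` and `content` payload keys are assumptions.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def sse_event(event: str, data: Dict[str, Any]) -> str:
    """Format one Server-Sent Event frame."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_reply(chunks: Iterable[str]) -> Iterator[str]:
    """Yield one event per text chunk, then a finish event with the full text."""
    parts: List[str] = []
    for chunk in chunks:
        parts.append(chunk)
        yield sse_event("text", {"delta": chunk})
    yield sse_event("finish", {"content": "".join(parts)})


events = list(stream_reply(["Hel", "lo"]))
```

In a FastAPI route, a generator like `stream_reply` would typically be wrapped in a streaming response with the `text/event-stream` media type.
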
### Task Execution Rules
- Tasks execute sequentially (no parallel execution)
- Each task receives context from all previously completed tasks
- Tasks can request user input via `[NEED_USER_INPUT]` markers
- Execution stops if an agent encounters a critical blocker

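Detecting the `[NEED_USER_INPUT]` marker could look like the sketch below. It assumes the question for the user follows the marker in the agent output, which this document does not specify; `pending_question` is a hypothetical helper.

```python
from typing import Optional

MARKER = "[NEED_USER_INPUT]"


def pending_question(agent_output: str) -> Optional[str]:
    """Return the text after the marker, or None if the agent did not ask."""
    _, found, question = agent_output.partition(MARKER)
    return question.strip() if found else None


question = pending_question("Reached the login page. [NEED_USER_INPUT] Which account?")
```

An executor could pause the task loop whenever this returns a value and resume once the user replies.
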
### Browser Agent Strategy
- **Search-First**: Always use Google Search before opening the browser
- **Minimal Browsing**: Only open the browser when an action is required (forms, clicks, purchases)
- **State Validation**: Use `browser_state()` to verify actions
- **Tab Management**: Reuse tabs instead of navigating excessively

### Security Considerations
- CSRF protection on all state-changing operations
- API keys encrypted with `ENCRYPTION_KEY` (BYOK feature)
- Rate limiting on backend endpoints
- Supabase Row Level Security (RLS) for data access
- No credentials stored in VM environments

## Common Development Tasks

### Adding a New Feature
1. Design API endpoints in `backend/app/api/routes/`
2. Implement business logic in `backend/app/services/`
3. Create frontend components in `components/`
4. Add state management in appropriate store (`lib/*-store/`)
5. Wire up API calls in `lib/services/` or route handlers

### Debugging VM Agent Issues
1. Check WebSocket connection status in `vm_control.py` logs
2. Verify agent tools are registered in `multi_agent_executor.py`
3. Test tool execution with reduced context
4. Check container logs: `docker logs <container-id>`
5. Verify VNC connection: `ws://localhost:8081` (localhost) or `ws://<ip>:8080`

### Optimizing Performance
1. **Frontend**: Use React.memo for expensive components, lazy load routes
2. **Backend**: Enable caching in `cache.py`, optimize database queries
3. **Streaming**: Batch small chunks, compress screenshots
4. **VM**: Reuse connections, minimize tool calls, truncate responses

### Database Migrations
- Supabase migrations are handled via the Supabase Dashboard or CLI
- Schema changes require updating the Supabase types in `types/supabase.ts`
- Run `supabase gen types typescript` to regenerate types

## Ports and Services

- **Frontend**: 3000 (Next.js dev server)
- **Backend**: 8001 (FastAPI server)
- **VM Agent WebSocket**: 8080 (remote), 8081 (localhost)
- **VNC**: 5900 (desktop access)
- **Supabase**: Hosted service (URLs in .env)

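The remote versus localhost port rule for the VM agent WebSocket can be expressed as a small helper. The function name and the set of hostnames treated as local are assumptions; the port numbers are the ones listed above.

```python
REMOTE_AGENT_PORT = 8080
LOCAL_AGENT_PORT = 8081

# Assumption: these hostnames count as "localhost" for port selection.
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def agent_ws_url(host: str) -> str:
    """Build the VM agent WebSocket URL for a given host."""
    port = LOCAL_AGENT_PORT if host in LOCAL_HOSTS else REMOTE_AGENT_PORT
    return f"ws://{host}:{port}"


local_url = agent_ws_url("localhost")
remote_url = agent_ws_url("10.0.0.5")
```
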
## Additional Notes

- **Frontend uses React Server Components** where applicable for better performance
- **Backend runs on uvicorn** with auto-reload in development
- **VM containers** are ephemeral and should be treated as stateless
- **Billing system** tracks agent usage by session duration
- **Multi-model support** allows users to switch providers mid-conversation
- **Screenshot compression** is critical for performance (JPEG, 70% quality)
