CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This is an AI Chat System with support for three distinct operational modes. The system is built as a containerized full-stack application with separate backend (FastAPI) and frontend (React + TypeScript) services.

Key Features:

Multi-turn conversations with context management
File upload and intelligent summarization (txt, md, pdf)
Agent mode with tool calling (MCP protocol + custom skills)
Code mode with OpenCode-inspired development tools (✅ IMPLEMENTED)
Multi-LLM provider support (Gemini, OpenAI, DeepSeek, Kimi, QWen, Claude, Skywork Router, etc.)
Web-based model configuration (stored in localStorage)
Multilingual response constraints
Conversation history persistence (SQLite)
Anime-style UI with pink gradients and glassmorphism effects

Architecture

Three-Mode System

The system supports three independent modes that run in parallel:

Chat Mode (/api/chat): Standard multi-turn conversations for general Q&A
Agent Mode (/api/agent/chat): Streaming responses with tool calling (MCP + custom skills)
Code Mode (/api/code/chat): OpenCode-inspired development agent with 6 core tools (read/write/edit/bash/glob/grep), permission system, and multi-LLM support (✅ IMPLEMENTED - see OPENCODE_COMPLETE_GUIDE.md)

Model Configuration Priority:

Frontend-configured models (stored in localStorage) - HIGHEST PRIORITY
Environment variables in backend/.env - fallback only

Backend Architecture (FastAPI)

Service Layer Pattern:

services/llm_service.py: Handles all LLM provider integrations, context building, and language constraints
services/agent_service.py: Manages agent mode streaming, tool calling iterations, and MCP integration
services/code_service.py: ✅ OpenCode-style development agent with 6 core tools (read/write/edit/bash/glob/grep), iterative tool calling (max 20 iterations), and multi-LLM support (Gemini, OpenAI, DeepSeek, Kimi, QWen, Skywork Router)
services/permission_service.py: ✅ Fine-grained permission control for code operations with wildcard matching (allow/deny/ask rules)
services/file_service.py: File processing with intelligent chunking (uses summarization for files >50KB)
services/conversation_service.py: SQLite-based conversation persistence
services/mcp_client.py: MCP protocol client for external tool servers
services/skill_manager.py: Built-in skill registry (file operations, code execution, etc.)

Important Implementation Details:

LLMService.generate_response() accepts model_config dict to override environment variables
AgentService.generate_stream() yields SSE events: text, tool_call, tool_result, thinking, error, done
CodeService.generate_code_stream() yields SSE events: text, tool_call, tool_result, permission_required, error, done, metadata (✅ IMPLEMENTED)
File service automatically chunks large files and uses LLM-based summarization for context optimization
Conversation service maintains last 10 turns by default for context window management

Frontend Architecture (React + TypeScript)

Component Hierarchy:

App.tsx (mode selector: Chat/Agent/Code)
├── UserMenu.tsx (avatar dropdown)
│   ├── ModelSettings.tsx (modal for configuring LLM providers)
│   ├── LanguageSettings.tsx (response language constraints)
│   └── AboutDialog.tsx
├── ConversationList.tsx (sidebar with history)
├── ChatWindow.tsx (chat mode interface)
│   ├── MessageList.tsx
│   ├── MessageInput.tsx
│   └── FileUpload.tsx
├── CodeWindow.tsx ✅ (code mode interface - 270 lines)
│   ├── Welcome screen with model info and tool cards
│   ├── Real-time tool execution visualization
│   └── Anime-style UI with pink gradients
├── AgentSettings.tsx (enable/disable MCP and skills)
└── CursorEffect.tsx (UI enhancements)

State Management:

services/modelConfig.ts: localStorage-based model configuration persistence
services/languageConfig.ts: language preference persistence
services/api.ts: API client with both REST and SSE support

Critical Frontend Flow:

User configures models via ModelSettings → saved to localStorage
On chat send → modelConfig passes selected model to backend via llm_config field
Backend uses llm_config if present, otherwise falls back to .env

Development Commands

Local Development

Backend:

cd backend
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
python main.py  # Runs on http://localhost:8000

Frontend:

cd frontend
npm install
npm start  # Runs on http://localhost:3000

Docker Development

# Start services
make up

# View logs
make logs              # All logs
make logs-backend      # Backend only
make logs-frontend     # Frontend only

# Rebuild specific service
make rebuild-backend
make rebuild-frontend

# Shell access
make shell-backend
make shell-frontend

# Stop services
make down

Testing

Backend tests (>83% coverage):

cd backend
source venv/bin/activate
pytest                                    # Run all tests
pytest tests/test_code_tools.py           # Code mode tool tests
pytest tests/test_code_api.py             # Code mode API tests
pytest --cov=. --cov-report=html         # With coverage report
./run_tests.sh all                       # Run tests + generate HTML report

Frontend tests:

cd frontend
npm test
npm run test:coverage

Agent mode manual testing:

cd backend
python test_agent.py  # Requires model configured in .env

Key Implementation Patterns

Adding a New LLM Provider

Code Mode and Agent Mode use different service implementations:

For Code Mode (services/code_service.py):

Code service routes providers as: gemini → skywork_router → OpenAI-compatible (openai/deepseek/kimi/qwen/etc.)
Skywork Router requires special authentication (app_key header)
All other providers use OpenAI-compatible API
Example for adding a new OpenAI-compatible provider:

# No changes needed in code_service.py - it automatically supports
# any OpenAI-compatible API via base_url configuration

For Chat/Agent Mode (services/llm_service.py):

Add platform config to PLATFORM_CONFIG dict:

"new_provider": {
    "base_url": "https://api.example.com/v1",
    "default_model": "example-model-name",
}

Update _call_openai_compatible() if API is OpenAI-compatible, or add new method for custom API
Add platform option to frontend/src/components/ModelSettings.tsx
Test with both Chat and Agent modes

Understanding Agent Mode Tool Calling

Agent mode uses an iterative loop (max 10 iterations by default):

Send messages + available tools to LLM
If LLM responds with tool calls:
- Execute tools via MCP client or skill manager
- Append tool results to conversation
- Go to step 1
If LLM responds with text (no tool calls):
- Stream final response and exit

Stream Event Types:

thinking: LLM reasoning about what tool to use
tool_call: LLM decided to call a tool
tool_result: Tool execution result
text: Final text response (streamed word-by-word)
error: Error occurred
done: Stream complete

File Upload Processing Logic

Files go through 3-tier processing based on size:

Small files (<20KB): Included in full in context
Medium files (20-50KB): Text chunked into sections
Large files (>50KB): Sent to LLM for intelligent summarization before adding to context

This optimization prevents context window overflow while preserving important information.

Language Constraint System

When user selects a language (e.g., "简体中文"), the system:

Adds a system message: "You must respond in 简体中文 regardless of input language"
This constraint is included in both Chat and Agent mode
"auto" mode means no language constraint

Code Mode Development (✅ IMPLEMENTED)

Code Mode provides OpenCode-inspired development tools integrated into the existing Chat/Agent system. See OPENCODE_COMPLETE_GUIDE.md for comprehensive documentation.

Core Tools (6 tools implemented)

File Operations:

read: Read file contents with line numbers (cat -n style), supports pagination (2000 lines/page), multiple encodings
write: Create or overwrite files with auto parent directory creation and overwrite warnings
edit: Exact string replacement in files with uniqueness checks and diff statistics
glob: File pattern matching with recursive search (**/*.py), sorted output, relative paths
grep: Content search with regex support, case control, result limits (500 matches)

Command Execution:

bash: Execute shell commands with timeout control (1-300 seconds), workspace isolation, dangerous command blocking

Permission System

Default security rules in services/permission_service.py:

{
    "read": {"*": "allow", "*.env": "ask", "*.key": "ask"},
    "write": {"*": "allow", "/etc/*": "deny"},
    "edit": {"*": "allow", "/etc/*": "deny"},
    "bash": {
        "*": "ask",           # All commands require confirmation by default
        "ls *": "allow",      # Safe commands auto-allowed
        "cat *": "allow",
        "rm *": "deny",       # Dangerous commands blocked
        "sudo *": "deny"
    }
}

Permission Actions:

allow: Execute without confirmation
deny: Block execution completely
ask: Require user confirmation (not yet implemented in UI)

Wildcard Patterns:

*: Matches any single path component (e.g., *.py matches test.py but not dir/test.py)
**: Matches any path recursively (e.g., **/*.py matches dir/subdir/test.py)

Code Service Architecture

Multi-LLM Support:

# services/code_service.py routing logic:
if provider == "gemini":
    response = await self._call_gemini_with_tools(...)
elif provider == "skywork_router":
    response = await self._call_skywork_router_with_tools(...)
else:
    # OpenAI-compatible (openai, deepseek, kimi, qwen, etc.)
    response = await self._call_openai_with_tools(...)

Supported LLM Providers:

Gemini (Google Generative AI)
OpenAI (GPT-3.5, GPT-4, GPT-4o, etc.)
DeepSeek (deepseek-chat, deepseek-coder)
Kimi (Moonshot AI)
QWen (Alibaba Tongyi Qianwen)
Skywork Router (with special app_key authentication)
Any OpenAI-compatible API providers

Iterative Tool Calling:

Max 20 iterations (vs Agent mode's 10)
Each iteration: LLM → tool calls → execute → append results → repeat
Stops when: (1) no tool calls, (2) max iterations reached, (3) error occurs

SSE Stream Events:

text: Assistant's text response content
tool_call: LLM decided to execute a tool
tool_result: Tool execution result
permission_required: User confirmation needed (not yet in UI)
error: Error occurred during execution
done: Stream complete
metadata: Conversation ID and other metadata

Frontend Integration

CodeWindow Component (frontend/src/components/CodeWindow.tsx - 270 lines):

Features:

Welcome screen with current model info, tool cards, and quick examples
Real-time tool execution visualization
Workspace path configuration
SSE streaming with event handling
Model configuration check before sending
Anime-style UI matching existing design

Key UI Elements:

// Model configuration check
const defaultModel = modelConfigService.getDefault();
if (!defaultModel) {
    // Show error message with instructions
}

// Send message with model config
const llmConfig = {
    provider: defaultModel.platform,
    api_key: defaultModel.apiKey,
    model_name: defaultModel.modelName,
    base_url: defaultModel.baseUrl || undefined,
};

await sendCodeMessage({
    message: content,
    conversation_id: conversationId,
    history: messages,
    llm_config: llmConfig,
    language: language,
    workspace_root: workspaceRoot || undefined,
    max_iterations: 20,
}, onEvent);

Styling (frontend/src/components/CodeWindow.css - 520+ lines):

Pink gradient color scheme (#ff69b4, #87ceeb, #ffb6c1)
Glassmorphism effects with backdrop-filter
Smooth animations and transitions
Tool call cards with hover effects
Responsive design for mobile/desktop

Testing Code Mode

Unit Tests (backend/tests/test_code_tools.py - 138 lines):

cd backend
pytest tests/test_code_tools.py -v
# Tests all 6 tools with various scenarios

API Tests (backend/tests/test_code_api.py - 150 lines):

cd backend
pytest tests/test_code_api.py -v
# Tests /api/code/chat and /api/code/tools endpoints

Manual Testing:

Start backend: cd backend && python main.py
Start frontend: cd frontend && npm start
Navigate to http://localhost:3000
Click "Code" mode button
Configure LLM model if not already set
Try example commands:
- "List all Python files"
- "Read the README.md file"
- "Find all TODO comments"
- "Show git status"

Adding New Code Tools

To add a new tool (e.g., format):

Define tool schema in services/code_service.py:

def _tool_format(self) -> dict:
    return {
        "type": "function",
        "function": {
            "name": "format",
            "description": "Format code using a formatter",
            "parameters": {
                "type": "object",
                "properties": {
                    "file_path": {"type": "string"},
                    "formatter": {"type": "string", "enum": ["black", "prettier"]}
                },
                "required": ["file_path", "formatter"]
            }
        }
    }

Implement execution method:

async def execute_format(self, file_path: str, formatter: str):
    # Check permission
    perm = self.permission_service.check("format", file_path)
    if perm == "deny":
        return {"error": "Permission denied"}
    if perm == "ask":
        yield {"type": "permission_required", "action": "format", "target": file_path}
        # Wait for user confirmation (future implementation)

    # Execute formatter
    full_path = os.path.join(self.workspace_root, file_path)
    # ... formatting logic

Add to _get_code_tools() list:

def _get_code_tools(self) -> List[dict]:
    return [
        self._tool_read(),
        self._tool_write(),
        # ... existing tools
        self._tool_format(),  # Add new tool
    ]

Add permission rules to services/permission_service.py:

"format": {
    "*": "allow",
    "*.lock": "deny",  # Don't format lock files
}

Write tests in tests/test_code_tools.py

Common Code Mode Issues

Error: "Model configuration is missing"

Cause: No model configured in frontend
Solution: Click user menu → Model Settings → Add a model

Error: "Workspace not found"

Cause: Invalid workspace_root path
Solution: Enter a valid absolute path or leave empty for current directory

Tool execution hangs:

Cause: Command timeout (default 30s for bash)
Solution: Increase timeout or optimize command

Permission denied errors:

Check permission_service.py default rules
Modify rules or implement "ask" confirmation UI

Why Code Mode is Separate

Code Mode is intentionally separate from Chat/Agent modes because:

Different use case: Development tasks vs conversational AI vs task automation
Different tool set: File operations + bash vs general tools vs MCP + skills
Different iteration limit: 20 (code) vs 10 (agent) vs 1 (chat)
Different permission model: Fine-grained file/command control vs open tool calling
Different UX: Tool visualization vs chat flow vs agent thinking

All three modes share:

Same model configuration system (localStorage)
Same conversation persistence (SQLite)
Same LLM service layer (multi-provider support)
Same UI design language (anime-style)

Common Development Scenarios

Debugging Backend Issues

Check backend logs: make logs-backend or docker-compose logs -f backend
Access API docs: http://localhost:8000/docs (Swagger UI)
Test endpoints manually via /docs interactive interface
Check SQLite database: backend/data/conversations.db

Adding a New Agent Skill

Add skill class to services/skill_manager.py:

class NewSkill(BaseSkill):
    def get_tool_definition(self) -> dict:
        return {
            "type": "function",
            "function": {
                "name": "new_skill_name",
                "description": "What this skill does",
                "parameters": { ... }
            }
        }

    async def execute(self, **kwargs) -> str:
        # Implement skill logic
        return result

self.skills["new_skill_name"] = NewSkill()

Test via agent mode

Using Code Mode for Development Tasks

Code Mode is ideal for:

Reading and analyzing existing code
Making precise edits to files
Searching for patterns across the codebase
Executing development commands (git, npm, pytest, etc.)
Exploring project structure

Best Practices:

Always set a workspace_root to limit file access
Review permission rules before sensitive operations
Use read before edit to verify content
Prefer edit over write for existing files (safer)
Set appropriate bash timeouts for long-running commands

Example workflows:

"Read package.json and show all dependencies"
"Find all TODO comments in Python files"
"Run pytest and show the results"
"Edit main.py to add error handling to the send_message function"
"Search for all imports of llm_service"

Modifying Context Window Size

Edit services/conversation_service.py:

def get_conversation_messages(self, conversation_id: str, limit: int = 10):

Change limit parameter to adjust how many turns are included in context.

Note: Code Mode uses a different context strategy - it includes the full conversation history without a turn limit, relying on the 20-iteration cap to prevent runaway loops.

Adding Code Mode Tools (Future)

When implementing Code Mode tools in services/code_service.py:

Define tool using OpenAI function calling format:

def _tool_read(self) -> dict:
    return {
        "type": "function",
        "function": {
            "name": "read",
            "description": "Read file contents with pagination",
            "parameters": {
                "type": "object",
                "properties": {
                    "file_path": {"type": "string"},
                    "offset": {"type": "integer"},
                    "limit": {"type": "integer"}
                },
                "required": ["file_path"]
            }
        }
    }

Implement execution method:

async def execute_read(self, file_path: str, offset: int = 0, limit: int = 2000):
    # Check permission
    perm = self.permission_service.check("read", file_path)
    if perm == "deny":
        return {"error": "Permission denied"}
    if perm == "ask":
        yield {"type": "permission_required", "action": "read", "target": file_path}
        # Future: wait for user confirmation

    # Execute operation
    full_path = os.path.join(self.workspace_root, file_path)
    # ... read file logic with line numbers (cat -n style)

Register tool in _get_code_tools() method
Test with various permission configurations

See "Code Mode Development" section above for the complete implementation guide of the 6 existing tools.

Testing Strategy

The test suite covers:

API endpoints: Request/response handling, validation, error cases
LLM service: Multiple providers, dynamic config, context building
File service: All formats, size limits, chunking, summarization
Conversation service: CRUD operations, message threading
Agent service: Tool calling flow, streaming (mocked)
Code tools: All 6 tools (read/write/edit/bash/glob/grep) with various scenarios (✅ 17/17 tests passing)
Code API: SSE streaming, model config, tool execution (✅ integrated tests)

Run full test suite before submitting PRs:

cd backend && ./run_tests.sh all
cd ../frontend && npm test

Test file organization:

backend/tests/test_api.py - Main API endpoints
backend/tests/test_llm_service.py - LLM provider tests
backend/tests/test_file_service.py - File processing tests
backend/tests/test_conversation_service.py - Conversation DB tests
backend/tests/test_agent.py - Agent mode tests
backend/tests/test_code_tools.py - Code mode tool tests (138 lines)
backend/tests/test_code_api.py - Code mode API tests (150 lines)

Deployment

Production deployment uses Docker Compose with separate containers:

Backend: FastAPI + Uvicorn (port 8000)
Frontend: React static build served by Nginx (port 3000)
Shared network: chatbot-network (bridge)

Data persistence:

Database: ./backend/data/ mounted as volume
Environment: ./backend/.env mounted read-only

Health checks:

Backend: HTTP GET to /health endpoint
Frontend: HTTP GET to /health (nginx serves static fallback)

Important Notes

Python 3.8-3.13 supported (NOT 3.14 yet)
Frontend model config takes priority over backend .env
Agent mode requires streaming-compatible LLM (tool calling support)
Code mode requires function calling support (all tested LLMs support this)
SQLite database is created automatically on first run
CORS is wide open (allow_origins=["*"]) - restrict in production
API keys in localStorage are NOT encrypted - consider security implications for production
Code mode workspace isolation protects against accidental file operations outside the workspace
Permission system defaults to "ask" for bash commands for security (not yet in UI)

Roadmap & Future Development

Code Mode Enhancements (Future)

Current Status: ✅ Code Mode fully implemented with 6 core tools, permission system, multi-LLM support, and anime-style UI

Potential Future Enhancements:

Permission confirmation UI: Implement "ask" permission dialog in frontend
File tree browser: Visual workspace file explorer
Code syntax highlighting: Enhanced result display with language-specific highlighting
Conversation persistence: Save Code Mode conversations to database (currently in-memory only)
LSP integration: Optional language server protocol for code intelligence (autocomplete, go-to-definition)
More tools: Consider adding find_replace_all, run_tests, git_diff, create_file_from_template

See OPENCODE_COMPLETE_GUIDE.md for detailed implementation history and architecture decisions.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CLAUDE.md

Project Overview

Architecture

Three-Mode System

Backend Architecture (FastAPI)

Frontend Architecture (React + TypeScript)

Development Commands

Local Development

Docker Development

Testing

Key Implementation Patterns

Adding a New LLM Provider

Understanding Agent Mode Tool Calling

File Upload Processing Logic

Language Constraint System

Code Mode Development (✅ IMPLEMENTED)

Core Tools (6 tools implemented)

Permission System

Code Service Architecture

Frontend Integration

Testing Code Mode

Adding New Code Tools

Common Code Mode Issues

Why Code Mode is Separate

Common Development Scenarios

Debugging Backend Issues

Adding a New Agent Skill

Using Code Mode for Development Tasks

Modifying Context Window Size

Adding Code Mode Tools (Future)

Testing Strategy

Deployment

Important Notes

Roadmap & Future Development

Code Mode Enhancements (Future)

FilesExpand file tree

CLAUDE.md

Latest commit

History

CLAUDE.md

File metadata and controls

CLAUDE.md

Project Overview

Architecture

Three-Mode System

Backend Architecture (FastAPI)

Frontend Architecture (React + TypeScript)

Development Commands

Local Development

Docker Development

Testing

Key Implementation Patterns

Adding a New LLM Provider

Understanding Agent Mode Tool Calling

File Upload Processing Logic

Language Constraint System

Code Mode Development (✅ IMPLEMENTED)

Core Tools (6 tools implemented)

Permission System

Code Service Architecture

Frontend Integration

Testing Code Mode

Adding New Code Tools

Common Code Mode Issues

Why Code Mode is Separate

Common Development Scenarios

Debugging Backend Issues

Adding a New Agent Skill

Using Code Mode for Development Tasks

Modifying Context Window Size

Adding Code Mode Tools (Future)

Testing Strategy

Deployment

Important Notes

Roadmap & Future Development

Code Mode Enhancements (Future)