This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
This is an AI Chat System with support for three distinct operational modes. The system is built as a containerized full-stack application with separate backend (FastAPI) and frontend (React + TypeScript) services.
Key Features:
- Multi-turn conversations with context management
- File upload and intelligent summarization (txt, md, pdf)
- Agent mode with tool calling (MCP protocol + custom skills)
- Code mode with OpenCode-inspired development tools (✅ IMPLEMENTED)
- Multi-LLM provider support (Gemini, OpenAI, DeepSeek, Kimi, QWen, Claude, Skywork Router, etc.)
- Web-based model configuration (stored in localStorage)
- Multilingual response constraints
- Conversation history persistence (SQLite)
- Anime-style UI with pink gradients and glassmorphism effects
The system supports three independent modes that run in parallel:
- Chat Mode (`/api/chat`): Standard multi-turn conversations for general Q&A
- Agent Mode (`/api/agent/chat`): Streaming responses with tool calling (MCP + custom skills)
- Code Mode (`/api/code/chat`): OpenCode-inspired development agent with 6 core tools (read/write/edit/bash/glob/grep), a permission system, and multi-LLM support (✅ IMPLEMENTED - see OPENCODE_COMPLETE_GUIDE.md)
Model Configuration Priority:
- Frontend-configured models (stored in localStorage) - highest priority
- Environment variables in `backend/.env` - fallback only
Service Layer Pattern:
- `services/llm_service.py`: Handles all LLM provider integrations, context building, and language constraints
- `services/agent_service.py`: Manages agent mode streaming, tool calling iterations, and MCP integration
- `services/code_service.py`: ✅ OpenCode-style development agent with 6 core tools (read/write/edit/bash/glob/grep), iterative tool calling (max 20 iterations), and multi-LLM support (Gemini, OpenAI, DeepSeek, Kimi, QWen, Skywork Router)
- `services/permission_service.py`: ✅ Fine-grained permission control for code operations with wildcard matching (allow/deny/ask rules)
- `services/file_service.py`: File processing with intelligent chunking (uses summarization for files >50KB)
- `services/conversation_service.py`: SQLite-based conversation persistence
- `services/mcp_client.py`: MCP protocol client for external tool servers
- `services/skill_manager.py`: Built-in skill registry (file operations, code execution, etc.)
Important Implementation Details:
- `LLMService.generate_response()` accepts a `model_config` dict to override environment variables
- `AgentService.generate_stream()` yields SSE events: `text`, `tool_call`, `tool_result`, `thinking`, `error`, `done`
- `CodeService.generate_code_stream()` yields SSE events: `text`, `tool_call`, `tool_result`, `permission_required`, `error`, `done`, `metadata` (✅ IMPLEMENTED)
- File service automatically chunks large files and uses LLM-based summarization for context optimization
- Conversation service maintains last 10 turns by default for context window management
Component Hierarchy:
App.tsx (mode selector: Chat/Agent/Code)
├── UserMenu.tsx (avatar dropdown)
│ ├── ModelSettings.tsx (modal for configuring LLM providers)
│ ├── LanguageSettings.tsx (response language constraints)
│ └── AboutDialog.tsx
├── ConversationList.tsx (sidebar with history)
├── ChatWindow.tsx (chat mode interface)
│ ├── MessageList.tsx
│ ├── MessageInput.tsx
│ └── FileUpload.tsx
├── CodeWindow.tsx ✅ (code mode interface - 270 lines)
│ ├── Welcome screen with model info and tool cards
│ ├── Real-time tool execution visualization
│ └── Anime-style UI with pink gradients
├── AgentSettings.tsx (enable/disable MCP and skills)
└── CursorEffect.tsx (UI enhancements)
State Management:
- `services/modelConfig.ts`: localStorage-based model configuration persistence
- `services/languageConfig.ts`: language preference persistence
- `services/api.ts`: API client with both REST and SSE support
Critical Frontend Flow:
- User configures models via ModelSettings → saved to localStorage
- On chat send → modelConfig passes the selected model to the backend via the `llm_config` field
- Backend uses `llm_config` if present, otherwise falls back to `.env`
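The fallback step can be sketched as a small helper (a sketch only - the `llm_config` shape mirrors the frontend's config object, while the environment variable names here are hypothetical placeholders):

```python
import os
from typing import Optional

def resolve_model_config(llm_config: Optional[dict]) -> dict:
    """Frontend-supplied config wins; otherwise fall back to backend/.env values."""
    if llm_config and llm_config.get("api_key"):
        return llm_config  # localStorage-configured model - highest priority
    # Hypothetical env var names for illustration; see backend/.env for the real ones.
    return {
        "provider": os.getenv("LLM_PROVIDER", "gemini"),
        "api_key": os.getenv("LLM_API_KEY", ""),
        "model_name": os.getenv("LLM_MODEL", ""),
    }
```

An empty or missing `api_key` triggers the `.env` fallback, which matches the documented priority order.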
Backend:

```bash
cd backend
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install -r requirements.txt
python main.py             # Runs on http://localhost:8000
```

Frontend:

```bash
cd frontend
npm install
npm start  # Runs on http://localhost:3000
```

Docker workflow:

```bash
# Start services
make up

# View logs
make logs           # All logs
make logs-backend   # Backend only
make logs-frontend  # Frontend only

# Rebuild specific service
make rebuild-backend
make rebuild-frontend

# Shell access
make shell-backend
make shell-frontend

# Stop services
make down
```

Backend tests (>83% coverage):

```bash
cd backend
source venv/bin/activate
pytest                            # Run all tests
pytest tests/test_code_tools.py   # Code mode tool tests
pytest tests/test_code_api.py     # Code mode API tests
pytest --cov=. --cov-report=html  # With coverage report
./run_tests.sh all                # Run tests + generate HTML report
```

Frontend tests:

```bash
cd frontend
npm test
npm run test:coverage
```

Agent mode manual testing:

```bash
cd backend
python test_agent.py  # Requires a model configured in .env
```

Code Mode and Agent Mode use different service implementations:
For Code Mode (`services/code_service.py`):

- The code service routes providers as: `gemini` → `skywork_router` → OpenAI-compatible (openai/deepseek/kimi/qwen/etc.)
- Skywork Router requires special authentication (`app_key` header)
- All other providers use an OpenAI-compatible API
- Adding a new OpenAI-compatible provider needs no changes in `code_service.py` - it automatically supports any OpenAI-compatible API via `base_url` configuration

For Chat/Agent Mode (`services/llm_service.py`):

1. Add a platform config to the `PLATFORM_CONFIG` dict:

```python
"new_provider": {
    "base_url": "https://api.example.com/v1",
    "default_model": "example-model-name",
}
```

2. Update `_call_openai_compatible()` if the API is OpenAI-compatible, or add a new method for a custom API
3. Add the platform option to `frontend/src/components/ModelSettings.tsx`
4. Test with both Chat and Agent modes
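The lookup pattern behind `PLATFORM_CONFIG` might look like this (a sketch; only the `new_provider` entry comes from the example above, and the helper name is illustrative - the real dict lives in `services/llm_service.py`):

```python
# Illustrative subset; the real PLATFORM_CONFIG contains the actual providers.
PLATFORM_CONFIG = {
    "new_provider": {
        "base_url": "https://api.example.com/v1",
        "default_model": "example-model-name",
    },
}

def platform_settings(provider: str, model_name: str = "") -> dict:
    """Resolve base_url and model for a provider, honoring an explicit model name."""
    cfg = PLATFORM_CONFIG.get(provider)
    if cfg is None:
        raise ValueError(f"Unknown provider: {provider}")
    return {"base_url": cfg["base_url"], "model": model_name or cfg["default_model"]}
```

A user-selected model name overrides `default_model`, which keeps the web-configured model in control.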
Agent mode uses an iterative loop (max 10 iterations by default):
1. Send messages + available tools to the LLM
2. If the LLM responds with tool calls:
   - Execute tools via the MCP client or skill manager
   - Append tool results to the conversation
   - Go to step 1
3. If the LLM responds with text (no tool calls):
   - Stream the final response and exit
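The loop above can be sketched as an async generator (a sketch, not the actual `AgentService` code - `llm_call` and `execute_tool` are stand-ins for the real LLM client and the MCP/skill dispatch, and the event shapes mirror the documented SSE types):

```python
async def run_agent_loop(llm_call, execute_tool, messages, tools, max_iterations=10):
    """Iterative tool-calling loop: call the LLM, run requested tools, repeat."""
    for _ in range(max_iterations):
        reply = await llm_call(messages, tools)
        if not reply.get("tool_calls"):           # plain text: stream and exit
            yield {"type": "text", "content": reply["content"]}
            yield {"type": "done"}
            return
        for call in reply["tool_calls"]:          # execute each requested tool
            yield {"type": "tool_call", "name": call["name"]}
            result = await execute_tool(call)
            yield {"type": "tool_result", "name": call["name"], "result": result}
            messages.append({"role": "tool", "name": call["name"], "content": result})
    yield {"type": "error", "message": "max iterations reached"}
```

The stopping conditions match the description: a text-only reply ends the stream, and the iteration cap guards against runaway tool loops.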
Stream Event Types:
- `thinking`: LLM reasoning about which tool to use
- `tool_call`: LLM decided to call a tool
- `tool_result`: Tool execution result
- `text`: Final text response (streamed word-by-word)
- `error`: Error occurred
- `done`: Stream complete
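On the client side, each SSE `data:` line carries one of these events. A minimal parsing sketch (assuming each payload is a JSON object with a `type` field, as the event list above suggests; the real frontend does this in `services/api.ts`):

```python
import json
from typing import Optional

def parse_sse_line(line: str) -> Optional[dict]:
    """Parse one SSE line into an event dict; return None for non-data lines."""
    if not line.startswith("data: "):
        return None  # comments, blank keep-alives, etc.
    payload = line[len("data: "):].strip()
    return json.loads(payload)
```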
Files go through 3-tier processing based on size:
- Small files (<20KB): Included in full in context
- Medium files (20-50KB): Text chunked into sections
- Large files (>50KB): Sent to LLM for intelligent summarization before adding to context
This optimization prevents context window overflow while preserving important information.
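The tier selection reduces to a simple size check (a sketch; the function name and return labels are illustrative, the thresholds are the documented ones):

```python
def processing_tier(size_bytes: int) -> str:
    """Map a file size to the 3-tier processing strategy described above."""
    if size_bytes < 20 * 1024:
        return "full"        # include verbatim in context
    if size_bytes <= 50 * 1024:
        return "chunked"     # split into sections
    return "summarized"      # LLM-based summarization before adding to context
```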
When user selects a language (e.g., "简体中文"), the system:
- Adds a system message: "You must respond in 简体中文 regardless of input language"
- This constraint is included in both Chat and Agent mode
- "auto" mode means no language constraint
Code Mode provides OpenCode-inspired development tools integrated into the existing Chat/Agent system. See OPENCODE_COMPLETE_GUIDE.md for comprehensive documentation.
File Operations:
- `read`: Read file contents with line numbers (`cat -n` style), supports pagination (2000 lines/page) and multiple encodings
- `write`: Create or overwrite files, with automatic parent directory creation and overwrite warnings
- `edit`: Exact string replacement in files with uniqueness checks and diff statistics
- `glob`: File pattern matching with recursive search (`**/*.py`), sorted output, relative paths
- `grep`: Content search with regex support, case control, and result limits (500 matches)
Command Execution:
- `bash`: Execute shell commands with timeout control (1-300 seconds), workspace isolation, and dangerous command blocking
Default security rules in `services/permission_service.py`:

```python
{
    "read":  {"*": "allow", "*.env": "ask", "*.key": "ask"},
    "write": {"*": "allow", "/etc/*": "deny"},
    "edit":  {"*": "allow", "/etc/*": "deny"},
    "bash": {
        "*": "ask",       # All commands require confirmation by default
        "ls *": "allow",  # Safe commands auto-allowed
        "cat *": "allow",
        "rm *": "deny",   # Dangerous commands blocked
        "sudo *": "deny"
    }
}
```

Permission Actions:
- `allow`: Execute without confirmation
- `deny`: Block execution completely
- `ask`: Require user confirmation (not yet implemented in the UI)

Wildcard Patterns:
- `*`: Matches any single path component (e.g., `*.py` matches `test.py` but not `dir/test.py`)
- `**`: Matches any path recursively (e.g., `**/*.py` matches `dir/subdir/test.py`)
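Note these semantics differ from Python's `fnmatch`, where `*` also crosses `/`. A sketch of a matcher implementing the documented rules (the real logic lives in `services/permission_service.py` and may differ):

```python
import re

def rule_matches(pattern: str, target: str) -> bool:
    """Match a permission rule pattern against a path or command string.

    "*" stays within a single path component; "**" crosses directory boundaries.
    """
    regex = re.escape(pattern)
    regex = regex.replace(r"\*\*", "\0")   # protect "**" before handling "*"
    regex = regex.replace(r"\*", "[^/]*")  # "*" must not cross "/"
    regex = regex.replace("\0", ".*")      # "**" may cross "/"
    return re.fullmatch(regex, target) is not None
```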
Multi-LLM Support:
```python
# services/code_service.py routing logic:
if provider == "gemini":
    response = await self._call_gemini_with_tools(...)
elif provider == "skywork_router":
    response = await self._call_skywork_router_with_tools(...)
else:
    # OpenAI-compatible (openai, deepseek, kimi, qwen, etc.)
    response = await self._call_openai_with_tools(...)
```

Supported LLM Providers:
- Gemini (Google Generative AI)
- OpenAI (GPT-3.5, GPT-4, GPT-4o, etc.)
- DeepSeek (deepseek-chat, deepseek-coder)
- Kimi (Moonshot AI)
- QWen (Alibaba Tongyi Qianwen)
- Skywork Router (with special app_key authentication)
- Any OpenAI-compatible API providers
Iterative Tool Calling:
- Max 20 iterations (vs Agent mode's 10)
- Each iteration: LLM → tool calls → execute → append results → repeat
- Stops when: (1) no tool calls, (2) max iterations reached, (3) error occurs
SSE Stream Events:
- `text`: Assistant's text response content
- `tool_call`: LLM decided to execute a tool
- `tool_result`: Tool execution result
- `permission_required`: User confirmation needed (not yet in the UI)
- `error`: Error occurred during execution
- `done`: Stream complete
- `metadata`: Conversation ID and other metadata
CodeWindow Component (frontend/src/components/CodeWindow.tsx - 270 lines):
Features:
- Welcome screen with current model info, tool cards, and quick examples
- Real-time tool execution visualization
- Workspace path configuration
- SSE streaming with event handling
- Model configuration check before sending
- Anime-style UI matching existing design
Key UI Elements:
```typescript
// Model configuration check
const defaultModel = modelConfigService.getDefault();
if (!defaultModel) {
  // Show error message with instructions
}

// Send message with model config
const llmConfig = {
  provider: defaultModel.platform,
  api_key: defaultModel.apiKey,
  model_name: defaultModel.modelName,
  base_url: defaultModel.baseUrl || undefined,
};

await sendCodeMessage({
  message: content,
  conversation_id: conversationId,
  history: messages,
  llm_config: llmConfig,
  language: language,
  workspace_root: workspaceRoot || undefined,
  max_iterations: 20,
}, onEvent);
```

Styling (frontend/src/components/CodeWindow.css - 520+ lines):
- Pink gradient color scheme (#ff69b4, #87ceeb, #ffb6c1)
- Glassmorphism effects with backdrop-filter
- Smooth animations and transitions
- Tool call cards with hover effects
- Responsive design for mobile/desktop
Unit Tests (backend/tests/test_code_tools.py - 138 lines):

```bash
cd backend
pytest tests/test_code_tools.py -v  # Tests all 6 tools with various scenarios
```

API Tests (backend/tests/test_code_api.py - 150 lines):

```bash
cd backend
pytest tests/test_code_api.py -v  # Tests /api/code/chat and /api/code/tools endpoints
```

Manual Testing:
- Start backend: `cd backend && python main.py`
- Start frontend: `cd frontend && npm start`
- Navigate to http://localhost:3000
- Click "Code" mode button
- Configure LLM model if not already set
- Try example commands:
- "List all Python files"
- "Read the README.md file"
- "Find all TODO comments"
- "Show git status"
To add a new tool (e.g., `format`):

1. Define the tool schema in `services/code_service.py`:

```python
def _tool_format(self) -> dict:
    return {
        "type": "function",
        "function": {
            "name": "format",
            "description": "Format code using a formatter",
            "parameters": {
                "type": "object",
                "properties": {
                    "file_path": {"type": "string"},
                    "formatter": {"type": "string", "enum": ["black", "prettier"]}
                },
                "required": ["file_path", "formatter"]
            }
        }
    }
```

2. Implement the execution method (note: an async generator cannot `return` a value, so errors are yielded as events):

```python
async def execute_format(self, file_path: str, formatter: str):
    # Check permission
    perm = self.permission_service.check("format", file_path)
    if perm == "deny":
        yield {"type": "error", "message": "Permission denied"}
        return
    if perm == "ask":
        yield {"type": "permission_required", "action": "format", "target": file_path}
        # Wait for user confirmation (future implementation)
    # Execute formatter
    full_path = os.path.join(self.workspace_root, file_path)
    # ... formatting logic
```

3. Add it to the `_get_code_tools()` list:

```python
def _get_code_tools(self) -> List[dict]:
    return [
        self._tool_read(),
        self._tool_write(),
        # ... existing tools
        self._tool_format(),  # Add new tool
    ]
```

4. Add permission rules to `services/permission_service.py`:

```python
"format": {
    "*": "allow",
    "*.lock": "deny",  # Don't format lock files
}
```

5. Write tests in `tests/test_code_tools.py`
Error: "Model configuration is missing"
- Cause: No model configured in frontend
- Solution: Click user menu → Model Settings → Add a model
Error: "Workspace not found"
- Cause: Invalid workspace_root path
- Solution: Enter a valid absolute path or leave empty for current directory
Tool execution hangs:
- Cause: Command timeout (default 30s for bash)
- Solution: Increase timeout or optimize command
Permission denied errors:
- Check the default rules in `permission_service.py`
- Modify rules or implement the "ask" confirmation UI
Code Mode is intentionally separate from Chat/Agent modes because:
- Different use case: Development tasks vs conversational AI vs task automation
- Different tool set: File operations + bash vs general tools vs MCP + skills
- Different iteration limit: 20 (code) vs 10 (agent) vs 1 (chat)
- Different permission model: Fine-grained file/command control vs open tool calling
- Different UX: Tool visualization vs chat flow vs agent thinking
All three modes share:
- Same model configuration system (localStorage)
- Same conversation persistence (SQLite)
- Same LLM service layer (multi-provider support)
- Same UI design language (anime-style)
- Check backend logs: `make logs-backend` or `docker-compose logs -f backend`
- Access API docs: http://localhost:8000/docs (Swagger UI)
- Test endpoints manually via the `/docs` interactive interface
- Check the SQLite database: `backend/data/conversations.db`
1. Add a skill class to `services/skill_manager.py`:

```python
class NewSkill(BaseSkill):
    def get_tool_definition(self) -> dict:
        return {
            "type": "function",
            "function": {
                "name": "new_skill_name",
                "description": "What this skill does",
                "parameters": { ... }
            }
        }

    async def execute(self, **kwargs) -> str:
        # Implement skill logic
        return result
```

2. Register it in `SkillManager.__init__()`:

```python
self.skills["new_skill_name"] = NewSkill()
```

3. Test via agent mode
Code Mode is ideal for:
- Reading and analyzing existing code
- Making precise edits to files
- Searching for patterns across the codebase
- Executing development commands (git, npm, pytest, etc.)
- Exploring project structure
Best Practices:
- Always set a workspace_root to limit file access
- Review permission rules before sensitive operations
- Use `read` before `edit` to verify content
- Prefer `edit` over `write` for existing files (safer)
- Set appropriate bash timeouts for long-running commands
Example workflows:
"Read package.json and show all dependencies"
"Find all TODO comments in Python files"
"Run pytest and show the results"
"Edit main.py to add error handling to the send_message function"
"Search for all imports of llm_service"
Edit `services/conversation_service.py`:

```python
def get_conversation_messages(self, conversation_id: str, limit: int = 10):
```

Change the `limit` parameter to adjust how many turns are included in the context.
Note: Code Mode uses a different context strategy - it includes the full conversation history without a turn limit, relying on the 20-iteration cap to prevent runaway loops.
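A sketch of the turn-limit behavior (assuming one turn is a user/assistant message pair, so the last `limit` turns are the last `2 * limit` messages; the helper name is illustrative):

```python
def trim_context(messages: list, limit: int = 10) -> list:
    """Keep only the most recent turns for the context window."""
    return messages[-2 * limit:]  # one turn = user message + assistant reply
```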
When implementing Code Mode tools in services/code_service.py:
1. Define the tool using the OpenAI function calling format:

```python
def _tool_read(self) -> dict:
    return {
        "type": "function",
        "function": {
            "name": "read",
            "description": "Read file contents with pagination",
            "parameters": {
                "type": "object",
                "properties": {
                    "file_path": {"type": "string"},
                    "offset": {"type": "integer"},
                    "limit": {"type": "integer"}
                },
                "required": ["file_path"]
            }
        }
    }
```

2. Implement the execution method (note: an async generator cannot `return` a value, so errors are yielded as events):

```python
async def execute_read(self, file_path: str, offset: int = 0, limit: int = 2000):
    # Check permission
    perm = self.permission_service.check("read", file_path)
    if perm == "deny":
        yield {"type": "error", "message": "Permission denied"}
        return
    if perm == "ask":
        yield {"type": "permission_required", "action": "read", "target": file_path}
        # Future: wait for user confirmation
    # Execute operation
    full_path = os.path.join(self.workspace_root, file_path)
    # ... read file logic with line numbers (cat -n style)
```

3. Register the tool in the `_get_code_tools()` method
4. Test with various permission configurations
See "Code Mode Development" section above for the complete implementation guide of the 6 existing tools.
The test suite covers:
- API endpoints: Request/response handling, validation, error cases
- LLM service: Multiple providers, dynamic config, context building
- File service: All formats, size limits, chunking, summarization
- Conversation service: CRUD operations, message threading
- Agent service: Tool calling flow, streaming (mocked)
- Code tools: All 6 tools (read/write/edit/bash/glob/grep) with various scenarios (✅ 17/17 tests passing)
- Code API: SSE streaming, model config, tool execution (✅ integrated tests)
Run the full test suite before submitting PRs:

```bash
cd backend && ./run_tests.sh all
cd ../frontend && npm test
```

Test file organization:
- `backend/tests/test_api.py` - Main API endpoints
- `backend/tests/test_llm_service.py` - LLM provider tests
- `backend/tests/test_file_service.py` - File processing tests
- `backend/tests/test_conversation_service.py` - Conversation DB tests
- `backend/tests/test_agent.py` - Agent mode tests
- `backend/tests/test_code_tools.py` - Code mode tool tests (138 lines)
- `backend/tests/test_code_api.py` - Code mode API tests (150 lines)
Production deployment uses Docker Compose with separate containers:
- Backend: FastAPI + Uvicorn (port 8000)
- Frontend: React static build served by Nginx (port 3000)
- Shared network: `chatbot-network` (bridge)

Data persistence:
- Database: `./backend/data/` mounted as a volume
- Environment: `./backend/.env` mounted read-only

Health checks:
- Backend: HTTP GET to the `/health` endpoint
- Frontend: HTTP GET to `/health` (nginx serves a static fallback)
- Python 3.8-3.13 supported (NOT 3.14 yet)
- Frontend model config takes priority over backend `.env`
- Agent mode requires a streaming-compatible LLM with tool calling support
- Code mode requires function calling support (all tested LLMs support this)
- SQLite database is created automatically on first run
- CORS is wide open (`allow_origins=["*"]`) - restrict in production
- API keys in localStorage are NOT encrypted - consider the security implications for production
- Code mode workspace isolation protects against accidental file operations outside the workspace
- Permission system defaults to "ask" for bash commands for security (not yet in the UI)
Current Status: ✅ Code Mode fully implemented with 6 core tools, permission system, multi-LLM support, and anime-style UI
Potential Future Enhancements:
- Permission confirmation UI: Implement "ask" permission dialog in frontend
- File tree browser: Visual workspace file explorer
- Code syntax highlighting: Enhanced result display with language-specific highlighting
- Conversation persistence: Save Code Mode conversations to database (currently in-memory only)
- LSP integration: Optional language server protocol for code intelligence (autocomplete, go-to-definition)
- More tools: Consider adding `find_replace_all`, `run_tests`, `git_diff`, `create_file_from_template`
See OPENCODE_COMPLETE_GUIDE.md for detailed implementation history and architecture decisions.