This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Vlt-Bridge (formerly Document-MCP) is a monorepo containing:
- Document-MCP: Multi-tenant Obsidian-like documentation viewer with AI-first workflow
- vlt-cli: AI memory and context retrieval CLI tool with CodeRAG
AI agents write/update documentation via MCP (Model Context Protocol), while humans read and edit through a web UI. The system provides per-user vaults with Markdown notes, full-text search (SQLite FTS5), wikilink resolution, tag indexing, and backlink tracking.
Vlt Oracle: Multi-source intelligent context retrieval for AI coding agents, combining:
- vlt threads: Development history and memory
- Markdown vault: Documentation (Document-MCP)
- CodeRAG: Code understanding with hybrid retrieval (vector + BM25 + graph)
Architecture: Python 3.11+ backend (FastAPI + FastMCP) + React 19 frontend (Vite 7 + shadcn/ui) + vlt-cli (Python CLI)
Vlt-Bridge/
├── backend/ # Document-MCP FastAPI backend
├── frontend/ # Document-MCP React frontend
├── packages/
│ └── vlt-cli/ # vlt CLI tool (memory, threads, oracle, coderag)
├── specs/ # Feature specifications (SpecKit)
└── data/ # Local data (vaults, indexes)
Key Concepts:
- Vault: Per-user filesystem directory containing .md files
- MCP Server: Exposes tools for AI agents (STDIO for local, HTTP for remote with JWT)
- Indexer: SQLite FTS5 for full-text search + separate tables for tags/links/metadata
- Wikilinks: [[Note Name]] resolved via case-insensitive slug matching (prefers same folder, then lexicographic)
- Optimistic Concurrency: Version counter in SQLite (not frontmatter); UI sends if_version, MCP uses last-write-wins
- RAG: LlamaIndex with Gemini embeddings for semantic search over vault content
- TTS: ElevenLabs integration for text-to-speech note reading
# Automated startup (recommended)
./start-dev.sh # Starts backend (8000) + frontend (5173)
./stop-dev.sh # Stop both services
./status-dev.sh # Check running processes
cd backend
# Setup (first time)
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv pip install -e .
uv pip install -e ".[dev]" # Dev dependencies (pytest, httpx)
# Run FastAPI HTTP server (for UI)
uv run uvicorn src.api.main:app --reload --host 0.0.0.0 --port 8000
# Run MCP STDIO server (for Claude Desktop/Code)
uv run python src/mcp/server.py
# Run MCP HTTP server (for remote clients with JWT)
uv run python src/mcp/server.py --http --port 8001
# Tests
uv run pytest # All tests
uv run pytest tests/unit # Unit tests only
uv run pytest tests/integration # Integration tests
uv run pytest -k test_vault_write # Single test pattern
uv run pytest -v # Verbose output
uv run pytest --lf # Last failed tests
cd frontend
# Setup (first time)
npm install
# Development server
npm run dev # Start Vite dev server (http://localhost:5173)
# Build
npm run build # TypeScript compile + Vite build to dist/
# Lint
npm run lint # ESLint check
# Preview production build
npm run preview # Serve dist/ (after npm run build)
# Build and run container locally (mirrors HF Spaces deployment)
docker build -t document-mcp .
docker run -p 7860:7860 -e JWT_SECRET_KEY="dev-secret" document-mcp
# Access at http://localhost:7860
# Backend database is auto-initialized on first run
# Manual reset (WARNING: destroys all data)
cd backend
rm -f ../data/index.db
uv run python -c "from src.services.database import DatabaseService; DatabaseService().initialize()"
3-tier architecture:
- Models (backend/src/models/): Pydantic schemas for validation
  - note.py: Note, NoteMetadata, NoteSummary
  - user.py: User, UserProfile
  - search.py: SearchResult, SearchQuery
  - index.py: IndexHealth
  - auth.py: TokenRequest, TokenResponse
- Services (backend/src/services/): Business logic
  - vault.py: Filesystem operations (read/write/list/delete notes)
    - validate_note_path(): Path security (no .., max 256 chars, Unix separators)
    - sanitize_path(): Resolves and enforces vault root boundary
  - indexer.py: SQLite FTS5 + metadata tracking
    - index_note(): Updates metadata, FTS, tags, links (synchronous on every write)
    - search_notes(): BM25 ranking with title 3x weight, body 1x, recency bonus
    - get_backlinks(): Follows link graph (note → sources that reference it)
  - auth.py: JWT + HF OAuth integration
    - create_access_token(): Issues JWT with sub=user_id, exp=90days
    - verify_token(): Validates JWT and extracts user_id
  - config.py: Env var management (MODE, JWT_SECRET_KEY, VAULT_BASE_DIR, etc.)
  - database.py: SQLite connection manager + schema DDL
- API/MCP (backend/src/api/ and backend/src/mcp/):
  - api/routes/: FastAPI endpoints
    - auth.py: OAuth, JWT, user endpoints
    - notes.py: CRUD operations (with optimistic concurrency)
    - search.py: Full-text search
    - index.py: Index rebuild/health
    - graph.py: Note relationship graph for visualization
    - rag.py: RAG/vector DB queries (LlamaIndex + Gemini)
    - tts.py: Text-to-speech (ElevenLabs)
    - demo.py, system.py: Demo data seeding, system info
  - api/middleware/auth_middleware.py: JWT Bearer token validation
  - mcp/server.py: FastMCP tools (7 tools: list, read, write, delete, search, backlinks, tags)
Critical Path Validation (in vault.py):
- All note paths MUST pass validate_note_path() (returns (bool, str) tuple)
- Then sanitize_path() resolves and ensures no vault escape
- Failure = 400 Bad Request with specific error message
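The two-step check above can be sketched as follows. This is a minimal illustration, not the code in vault.py: only the three documented rules (no `..`, 256-character limit, Unix separators) and the `(bool, str)` return shape come from this document; the error messages, the `vault_root` parameter, and the use of `ValueError` are assumptions.

```python
from pathlib import Path

MAX_PATH_LENGTH = 256  # limit stated in this document


def validate_note_path(note_path: str) -> tuple[bool, str]:
    """Return (ok, message); message is empty when the path passes."""
    if len(note_path) > MAX_PATH_LENGTH:
        return False, f"Path exceeds {MAX_PATH_LENGTH} characters"
    if "\\" in note_path:
        return False, "Use Unix separators (/), not backslashes"
    if ".." in note_path.split("/"):
        return False, "Path must not contain '..'"
    return True, ""


def sanitize_path(vault_root: Path, note_path: str) -> Path:
    """Resolve note_path under vault_root and refuse anything that escapes it."""
    root = vault_root.resolve()
    candidate = (root / note_path).resolve()
    # Assumption: an escape is reported with ValueError; the API layer would
    # translate that into the 400 Bad Request described above.
    if not candidate.is_relative_to(root):
        raise ValueError("Path escapes the vault root")
    return candidate
```

Running both checks is deliberate: the first rejects obviously bad input with a specific message, while the second catches anything that still resolves outside the vault (for example an absolute path).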
5 tables (see backend/src/services/database.py):
- note_metadata: Version tracking, size, timestamps (per note)
- note_fts: Contentless FTS5 with porter tokenizer, prefix='2 3' for autocomplete
- note_tags: Many-to-many (user_id, note_path, tag)
- note_links: Link graph (source_path → target_path, is_resolved flag)
- index_health: Aggregate stats (note_count, last_full_rebuild, last_incremental_update)
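To make the note_fts description concrete, here is a small standalone sketch of a contentless FTS5 table with the porter tokenizer, `prefix='2 3'`, and a 3x/1x title/body BM25 weighting. It assumes the local SQLite build includes FTS5. The column names, the sample rows, and the `search` helper are illustrative and not taken from database.py or indexer.py, and the recency bonus mentioned for search_notes() is not modelled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# content='' makes the table contentless (the index keeps no copy of the text),
# tokenize='porter' enables stemming, prefix='2 3' builds prefix indexes.
conn.execute(
    "CREATE VIRTUAL TABLE note_fts USING fts5("
    "title, body, content='', tokenize='porter', prefix='2 3')"
)

# Contentless tables need an explicit rowid on insert (hypothetical sample data).
sample_notes = [
    (1, "API Design", "notes about endpoints"),
    (2, "Meeting", "we discussed design today"),
    (3, "Groceries", "milk eggs bread and tea"),
    (4, "Reading List", "books to finish soon"),
    (5, "Travel", "pack bags for the trip"),
]
conn.executemany(
    "INSERT INTO note_fts(rowid, title, body) VALUES (?, ?, ?)", sample_notes
)


def search(query: str) -> list[int]:
    """Return matching rowids, best match first (bm25 scores sort ascending)."""
    rows = conn.execute(
        "SELECT rowid FROM note_fts WHERE note_fts MATCH ? "
        "ORDER BY bm25(note_fts, 3.0, 1.0)",  # title weight 3x, body weight 1x
        (query,),
    ).fetchall()
    return [rowid for (rowid,) in rows]
```

Because the table is contentless, a query can return only rowids and ranking data; titles and paths would have to come from a regular table such as note_metadata.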
Indexer Update Flow (in indexer.py):
write_note() → vault.write_note() → indexer.index_note()
↓
[metadata table: version++]
[FTS table: re-insert title+body]
[tags table: clear + re-insert]
[links table: extract wikilinks, resolve, update backlinks]
[health table: note_count++, last_incremental_update=now]
In indexer.py (resolve_wikilink logic):
- Normalize link text to slug: normalize_slug("API Design") → "api-design"
- Find all notes where slug matches normalize_slug(title) or normalize_slug(filename_stem)
- If multiple matches:
  - Prefer same folder as source note
  - Else lexicographically smallest path (ASCII sort)
- Store in note_links table with is_resolved=1 (or 0 if no match)
Broken links are tracked (is_resolved=0) and can be queried for UI "Create note" affordance.
STDIO (python src/mcp/server.py):
- For Claude Desktop/Code local integration
- Uses LOCAL_USER_ID from env (default: "local-dev")
- No authentication
HTTP (python src/mcp/server.py --http --port 8001):
- For remote clients (HF Space deployment)
- Requires Authorization: Bearer <jwt> header
- JWT validated → user_id extracted → scoped to that user's vault
Endpoint: Tools defined in mcp/server.py with FastMCP decorators (@mcp.tool)
Component Hierarchy:
App.tsx (main layout, routing)
├── MainApp.tsx (authenticated app shell)
│ ├── DirectoryTree.tsx (left sidebar: vault explorer)
│ ├── NoteViewer.tsx (read mode: react-markdown rendering)
│ ├── NoteEditor.tsx (edit mode: split view with live preview)
│ ├── SearchBar.tsx (debounced search with dropdown)
│ ├── ChatPanel.tsx (AI chat interface for RAG)
│ ├── GraphView.tsx (note relationship visualization)
│ └── TableOfContents.tsx (heading navigator)
├── Login.tsx (HF OAuth flow)
└── Settings.tsx (token access, preferences)
Key Libraries:
- react-markdown + remark-gfm: Markdown rendering with GFM support
- shadcn/ui: UI components (30+ primitives from Radix UI)
- react-force-graph-2d: Note relationship graph visualization
- react-resizable-panels: Split pane layout
- lib/wikilink.ts: Parse [[...]] + resolve via GET /api/backlinks
- services/api.ts: Fetch wrapper with Bearer token injection
Wikilink Rendering (in NoteViewer.tsx):
- Custom react-markdown renderer for links
- Detect [[Note Name]] pattern → fetch backlinks → resolve to path → make clickable
- Broken links styled differently (e.g., red/dashed underline)
UI Edit Scenario:
- User opens note → GET /api/notes/{path} → receives {..., version: 5}
- User edits → clicks Save → PUT /api/notes/{path} with {"if_version": 5, ...}
- Backend checks: if current version != 5 → return 409 Conflict
- UI shows "Note changed, please reload" message
MCP Write: No version check, always succeeds (last-write-wins).
See .env.example for all variables. Key settings:
- MODE: local (single-user, no OAuth) or space (HF multi-tenant)
- JWT_SECRET_KEY: Generate with python -c "import secrets; print(secrets.token_urlsafe(32))"
- VAULT_BASE_DIR: Where vaults are stored (e.g., ./data/vaults)
- DB_PATH: SQLite database file (e.g., ./data/index.db)
- LOCAL_USER_ID: Default user for local mode (default: local-dev)
HF Space variables (only needed when MODE=space):
- HF_OAUTH_CLIENT_ID, HF_OAUTH_CLIENT_SECRET, HF_SPACE_HOST
Optional integrations:
- GOOGLE_API_KEY: Gemini API for RAG embeddings and LLM
- ELEVENLABS_API_KEY, ELEVENLABS_VOICE_ID, ELEVENLABS_MODEL: TTS integration
- Note size: 1 MiB max (enforced in vault.py)
- Vault limit: 5,000 notes per user (configurable in indexer.py)
- Path length: 256 chars max (validated in vault.py)
- Wikilink syntax: Only [[wikilink]] supported (no aliases like [[link|alias]])
- MCP operations: <500ms for 1,000-note vaults
- UI directory load: <2s
- Note render: <1s
- Search: <1s for 5,000 notes
- Index rebuild: <30s for 1,000 notes
This repo uses the SpecKit methodology for feature planning:
- specs/###-feature-name/: Feature documentation
  - spec.md: User stories, requirements, success criteria
  - plan.md: Tech stack, architecture, structure
  - data-model.md: Entities, schemas, validation
  - contracts/: OpenAPI + MCP tool schemas
  - tasks.md: Implementation task checklist
- Slash commands: /speckit.specify, /speckit.plan, /speckit.tasks, /speckit.implement
- Scripts: .specify/scripts/bash/ (feature scaffolding, context updates)
Implemented features: 001-obsidian-docs-viewer, 002-add-graph-view, 003-ai-chat-window, 004-gemini-vault-chat, 006-ui-polish, 011-coderag-project-init
The vlt CLI includes CodeRAG functionality for indexing and searching codebases with hybrid retrieval (vector + BM25 + graph).
# Interactive project selection
vlt coderag init
# Specify project directly
vlt coderag init --project <project-id>
# Index specific directory
vlt coderag init --project <project-id> --path /path/to/codebase
# Force re-index (overwrite existing)
vlt coderag init --project <project-id> --force
# Run in foreground with progress display
vlt coderag init --project <project-id> --foreground
Notes:
- By default, indexing runs in background via the daemon
- If daemon is not running, you will be prompted to run in foreground
- Existing indexes require --force to overwrite
# Human-readable status
vlt coderag status --project <project-id>
# JSON output for scripting
vlt coderag status --project <project-id> --json
Status values:
- pending: Job queued, waiting for daemon
- running: Indexing in progress
- completed: Indexing finished successfully
- failed: Indexing failed (check error_message)
- cancelled: Job was cancelled by user
# Semantic search
vlt coderag search "function that handles authentication" --project <project-id>
# Limit results
vlt coderag search "error handling" --project <project-id> --limit 5
# Generate overview of codebase structure
vlt coderag map --project <project-id>
# Focus on specific directory
vlt coderag map --project <project-id> --scope src/api/
# Start daemon
vlt daemon start
# Stop daemon
vlt daemon stop
# Check daemon status
vlt daemon status
CodeRAG supports: python, typescript, tsx, javascript, go, rust
Files matching patterns in coderag.toml (or default **/*.py) are indexed.
Place in project root for custom settings:
[coderag]
include = ["**/*.py", "**/*.ts", "**/*.tsx"]
exclude = ["**/node_modules/**", "**/.venv/**", "**/dist/**"]
[coderag.embedding]
batch_size = 10
[coderag.repomap]
max_tokens = 4000
include_signatures = true
Claude Desktop (STDIO, local mode):
{
"mcpServers": {
"document-mcp": {
"command": "uv",
"args": ["run", "python", "src/mcp/server.py"],
"cwd": "/absolute/path/to/Document-MCP/backend"
}
}
}
Remote HTTP (HF Space with JWT):
{
"mcpServers": {
"document-mcp": {
"url": "https://your-space.hf.space/mcp",
"transport": "http",
"headers": {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
}
}
}
Obtain JWT: POST /api/tokens after HF OAuth login.
The app can be embedded in ChatGPT as an iFrame:
- Widget served at /widget.html with special MIME type text/html+skybridge
- MCP endpoint remains accessible for other AI agents simultaneously
- Entry point: frontend/src/widget.tsx
The ANS provides real-time notifications to AI agents during task execution, enabling self-awareness about tool failures, budget limits, and operational issues.
Event Source (oracle_agent.py, tool_executor.py)
│
▼ emit(Event)
EventBus (pub/sub)
│
▼ notify handlers
SubscriberLoader → Subscriber configs (*.toml)
│
▼ filter + batch
NotificationAccumulator
│
▼ render with template
ToonFormatter (Jinja2 + python-toon)
│
▼ yield OracleStreamChunk(type="system")
SSE Stream → Frontend ChatPanel
EventBus (backend/src/services/ans/bus.py):
- Pub/sub pattern for decoupled event emission
- Supports wildcard subscriptions (e.g., tool.*)
- Thread-safe with overflow handling
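A minimal pub/sub sketch shows how wildcard subscriptions such as tool.* can be matched. It is not the code in bus.py: the `Event` dataclass here is a simplified stand-in without severity, matching via `fnmatchcase` is an assumption, and overflow handling is omitted.

```python
import threading
from collections import defaultdict
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import Callable


@dataclass
class Event:
    """Simplified event for illustration only."""
    type: str
    source: str
    payload: dict = field(default_factory=dict)


class EventBus:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._handlers: dict[str, list[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, pattern: str, handler: Callable[[Event], None]) -> None:
        """Register handler for an exact type or a wildcard pattern like 'tool.*'."""
        with self._lock:
            self._handlers[pattern].append(handler)

    def emit(self, event: Event) -> int:
        """Call every handler whose pattern matches; return how many were called."""
        with self._lock:
            matched = [
                handler
                for pattern, handlers in self._handlers.items()
                if fnmatchcase(event.type, pattern)
                for handler in handlers
            ]
        # Handlers run outside the lock so a slow handler cannot block subscribers.
        for handler in matched:
            handler(event)
        return len(matched)
```

Emitters only need to know the event type string, which is what keeps services such as the tool executor decoupled from whoever consumes the notifications.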
Subscribers (backend/src/services/ans/subscribers/*.toml):
- TOML-based configuration for each notification type
- Define event types, severity filters, batching windows
- Reference Jinja2 templates for formatting
Example subscriber config:
[subscriber]
id = "tool_failure"
name = "Tool Failure Notifications"
[events]
types = ["tool.call.failure", "tool.call.timeout"]
severity_filter = "warning"
[output]
priority = "high"
inject_at = "after_tool"
template = "tool_failure.toon.j2"
core = true # Cannot be disabled by user
Templates (backend/src/services/ans/templates/*.toon.j2):
- Jinja2 templates producing TOON (Token-Optimized Object Notation) output
- Compact format optimized for LLM context windows
- Supports batching multiple events into single notification
Notification Injection Points:
- turn_start: Injected before agent receives next prompt (budget warnings)
- after_tool: Injected after tool execution (tool failures)
- immediate: Injected as soon as event occurs (critical alerts)
System messages appear in ChatPanel with distinct styling:
- Yellow/amber left border and background
- AlertCircle icon with "System" attribution
- Rendered inline with agent/user messages
- Persisted in context_nodes.system_messages_json
| Subscriber | Events | Priority | Inject At |
|---|---|---|---|
| tool_failure | tool.call.failure, tool.call.timeout | high | after_tool |
| budget_warning | budget.token.warning, budget.iteration.warning | normal | turn_start |
| budget_exceeded | budget.token.exceeded, budget.iteration.exceeded | critical | immediate |
| loop_detected | agent.loop.detected | high | immediate |
- Create backend/src/services/ans/subscribers/my_subscriber.toml
- Create backend/src/services/ans/templates/my_subscriber.toon.j2
- Emit events from your service code:
# Within backend/src/services/, use relative imports:
from .ans.bus import get_event_bus
from .ans.event import Event, Severity

bus = get_event_bus()
bus.emit(Event(
    type="my.custom.event",
    source="my_service",
    severity=Severity.INFO,
    payload={"message": "Something happened"}
))
Users can toggle non-core subscribers via Settings > Notifications tab. Core subscribers (marked core = true) cannot be disabled. Settings stored in user_settings.disabled_subscribers_json.
The Oracle uses a REPL-centric inference harness. The LLM is given a Python environment where the entire project lives as variables (project, sub_oracle, Final). It writes code to explore and synthesize answers programmatically.
Before (BT Oracle): Query classifier → prompt composer → BT XML signals → multi-turn loop
After (RLM Oracle): REPL environment → LLM writes Python → Final variable terminates loop
- REPLExecutor: RestrictedPython sandbox with 30s timeout; approved modules (re, json, math, datetime, collections, itertools); Final sentinel detection
- ProjectContext: Exposes project files, threads, notes as Python objects in REPL namespace
- SubOracleCallable: Recursive sub-oracle calls (max depth 2, max 3 calls per root session)
- Streaming: progress chunks carry REPL stdout; content chunks carry the Final answer; done chunk carries metadata["iteration_count"]
The Oracle API routes (/api/oracle and /api/oracle/stream) use RLMOracleWrapper:
from backend.src.services.rlm_oracle import RLMOracleWrapper
wrapper = RLMOracleWrapper(
    user_id="user-id",
    api_key="openrouter-api-key",
    project_id="project-id",
    model="deepseek/deepseek-chat-v3-0324",
    max_tokens=4096,
)
async for chunk in wrapper.process_query(query="Hello", context_id=None):
    print(chunk.type, chunk.content)

| Variable | Default | Description |
|---|---|---|
| ORACLE_MAX_TURNS | 25 | Max REPL iterations per root session |
| ORACLE_SUB_MAX_TURNS | 8 | Max iterations per sub-oracle session |
| File | Purpose |
|---|---|
| backend/src/services/rlm_oracle.py | RLMOracleWrapper, RLMSession, RLMPromptBuilder, SubOracleCallable |
| backend/src/services/project_context.py | ProjectContext, TextHandle, FileManifest |
| backend/src/services/repl_executor.py | REPLExecutor, REPLNamespace (RestrictedPython sandbox) |
| backend/src/services/openrouter_client.py | OpenRouter HTTP client (moved from bt/services/) |
| backend/src/api/routes/oracle.py | Oracle API routes (updated to use RLMOracleWrapper) |
- 022-rlm-oracle: Replaced BT Oracle with RLM Oracle harness; LLM writes Python in REPL with project/sub_oracle/Final namespace; deleted entire backend/src/bt/ directory; added REPLExecutor (RestrictedPython), ProjectContext (file/thread/note handles), RLMOracleWrapper; Go symbol extraction + end_line field in CodeRAG repomap
- 018-vlt-mcp-server: Added vlt-mcp unified MCP server (packages/vlt-cli/src/vlt/mcp/) with 17 tools across 5 modules (thread_tools, meta_tools, code_tools, oracle_tools, vault_tools); Oracle toggle backend route (/api/settings/oracle); Oracle tab in Settings.tsx; 164ms cold-start via STDIO; registered as user-scope MCP in Claude Code
- Python 3.11+ (backend only; no frontend changes) (022-rlm-oracle)
- RestrictedPython>=8.0 for REPL sandbox (022-rlm-oracle)
- No new persistence. Ephemeral RLMSession per query. OracleBridge (existing) handles conversation history via existing context_nodes table. (022-rlm-oracle)