Merged
24 changes: 17 additions & 7 deletions AGENTS.md
@@ -13,7 +13,13 @@ This file provides guidance to Claude Code (claude.ai/code), Github Copilot, Roo

CodeWeaver is an extensible MCP (Model Context Protocol) server for semantic code search. It provides intelligent codebase context discovery through a single `find_code` tool interface, supporting multiple embedding providers, vector databases, and data sources through a plugin architecture.

**Current Status**: Alpha Release 1. Most core features complete and relatively stable. Advanced functionality planned in several epics.
**Architecture**: CodeWeaver runs as a daemon with separate transport servers:
- **Daemon**: Background services (indexing, file watching, health monitoring)
- **Management Server**: HTTP endpoints at port 9329 (health, status, metrics)
- **MCP HTTP Server**: FastMCP at port 9328 for HTTP clients
- **MCP stdio**: Lightweight proxy that forwards to the HTTP backend (default transport)

**Current Status**: Alpha Release 2. Most core features complete and relatively stable. Advanced functionality planned in several epics.

> [!IMPORTANT]
> Because it is an MCP server, you can use CodeWeaver while assisting with CodeWeaver development!
@@ -134,10 +140,12 @@ src/codeweaver/
│ ├── context.py # Context exploration commands *scaffolding*
│ ├── doctor.py # Health check and diagnostics
│ ├── index.py # Indexing commands
│ ├── init.py # Project initialization
│ ├── init.py # Project initialization + service persistence
│ ├── list.py # List resources (models, providers, etc.)
│ ├── search.py # Search command (wraps find_code)
│ └── server.py # MCP server management
│ ├── server.py # MCP server management (stdio/HTTP transports)
│ ├── start.py # Start daemon in background (or --foreground)
│ └── stop.py # Stop the running daemon
├── common/ # Shared utilities and infrastructure
│ ├── __init__.py
@@ -346,14 +354,16 @@ src/codeweaver/
│ ├── token_patterns.py # Token pattern matching for cross-language token identification
│ └── types.py # Semantic analysis types
├── server/ # MCP server implementation
├── server/ # Server implementations
│ ├── __init__.py

│ ├── app_bindings.py # Application dependency bindings and http admin endpoints (i.e. /metrics)
│ ├── app_bindings.py # Application dependency bindings and http admin endpoints
│ ├── health_endpoint.py # Health check endpoint
│ ├── health_models.py # Health check data models
│ ├── health_service.py # Health check service
│ └── server.py # Main MCP server entry point
│ ├── management.py # Management server (Starlette, port 9329)
│ ├── mcp_http.py # MCP HTTP server (FastMCP, port 9328)
│ ├── server.py # Main MCP server entry point
│ └── stdio_proxy.py # stdio-to-HTTP proxy for MCP clients
├── tokenizers/ # Token counting for various models
│ ├── __init__.py
78 changes: 75 additions & 3 deletions ARCHITECTURE.md
@@ -9,9 +9,9 @@ SPDX-License-Identifier: MIT OR Apache-2.0

**Purpose**: This document serves as the authoritative reference for CodeWeaver's architectural decisions, design principles, and technical philosophy. It consolidates design decisions scattered across multiple project files into a unified resource.

**Status**: Living document - Updated as architectural decisions evolve
**Version**: 1.1.0
**Last Updated**: 2025-11-23
**Status**: Living document - Updated as architectural decisions evolve
**Version**: 1.2.0
**Last Updated**: 2025-12-02

---

@@ -181,6 +181,77 @@ We transform complexity into clarity using simple modularity with extensible des

## Technical Architecture

### Daemon Architecture

**Design Decision**: Separate background services from MCP transport servers.

CodeWeaver runs as a daemon with distinct server components:

```
┌─────────────────────────────────────────────────────────────────┐
│ CodeWeaver Daemon │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Background Services │ │
│ │ • Indexer (semantic search engine) │ │
│ │ • FileWatcher (real-time index updates) │ │
│ │ • HealthService (system monitoring) │ │
│ │ • Statistics & Telemetry │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────┐ ┌─────────────────────────────┐ │
│ │ Management Server │ │ MCP HTTP Server │ │
│ │ (Starlette) │ │ (FastMCP) │ │
│ │ Port 9329 │ │ Port 9328 │ │
│ │ │ │ │ │
│ │ • /health │ │ • /mcp/ (MCP endpoint) │ │
│ │ • /status │ │ • find_code tool │ │
│ │ • /metrics │ │ │ │
│ │ • /state │ │ │ │
│ └─────────────────────┘ └─────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
┌───────────────────┴───────────────────┐
│ │
┌────────┴────────┐ ┌─────────┴─────────┐
│ stdio Proxy │ │ HTTP Clients │
│ (MCP Clients) │ │ (Direct) │
│ │ │ │
│ Claude Code │ │ curl, httpie │
│ Cursor, VSCode │ │ Custom clients │
└──────────────────┘ └───────────────────┘
```

**Components**:

- **Daemon**: Long-running background process managing all services
- **Management Server** (port 9329): Starlette HTTP server for health checks, status, and metrics
- **MCP HTTP Server** (port 9328): FastMCP server handling MCP protocol over HTTP
- **stdio Proxy**: Lightweight process that proxies MCP stdio to the HTTP backend

**Transport Modes**:

- **stdio (default)**: MCP clients spawn a stdio process that proxies to the daemon's HTTP server. Auto-starts daemon if needed.
- **streamable-http**: Direct HTTP connection to the MCP server (for persistent server deployments)

**CLI Commands**:

```bash
cw start # Start daemon in background
cw start --foreground # Run daemon in current terminal
cw stop # Stop the daemon
cw init service # Install as system service (systemd/launchd)
cw server # Run MCP server (stdio by default)
```

**Rationale**:
- Separates concerns: background indexing vs. request handling
- Enables safe stdio transport: proxy is stateless, daemon handles state
- Management endpoints accessible regardless of MCP transport
- System service installation for production deployments
- Graceful degradation: stdio auto-starts daemon if not running

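The management endpoints listed above can be probed from a small script. The following is a minimal sketch, not CodeWeaver's own client code: the helper names (`management_url`, `is_healthy`, `wait_until_healthy`) are hypothetical, and it assumes that `/health` on port 9329 answers with HTTP 200 when the daemon is healthy.

```python
import time
import urllib.error
import urllib.request


def management_url(path: str, host: str = "127.0.0.1", port: int = 9329) -> str:
    """Build a URL for a management endpoint such as /health or /status."""
    return f"http://{host}:{port}/{path.lstrip('/')}"


def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers with HTTP 200 (assumed to mean healthy)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: treat as not healthy.
        return False


def wait_until_healthy(url: str, attempts: int = 10, delay: float = 0.5) -> bool:
    """Poll the endpoint up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if is_healthy(url):
            return True
        time.sleep(delay)
    return False
```

A caller might run `cw start` and then call `wait_until_healthy(management_url("/health"))` before connecting an MCP client.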
### Span-Based Core

**Design Decision**: Use immutable span-based architecture for precise code location tracking.
@@ -1007,3 +1078,4 @@ This document serves as the historical record of significant architectural decis
**Version History**:
- v1.0.0 (2025-10-21): Initial unified architecture document
- v1.1.0 (2025-11-23): Updated to include recent design decisions and project structure changes.
- v1.2.0 (2025-12-02): Added Daemon Architecture section documenting the new multi-server design with management server, MCP HTTP server, and stdio proxy. Default transport changed to stdio.
22 changes: 21 additions & 1 deletion DOCKER.md
@@ -368,10 +368,14 @@ exclude_patterns = ["node_modules", ".git", "dist", "__pycache__"]

## Architecture

CodeWeaver uses a daemon architecture with stdio as the default transport:

**Standalone/docker-compose mode** (HTTP transport):
```
┌─────────────────────────────────────────────────┐
│ CodeWeaver Container │
│ ├─ MCP Server (port 9328) │
│ ├─ MCP Server (port 9328, HTTP) │
│ ├─ Management Server (port 9329) │
│ ├─ Live File Watcher │
│ ├─ Indexing Engine │
│ └─ Search API │
@@ -387,6 +391,22 @@ exclude_patterns = ["node_modules", ".git", "dist", "__pycache__"]
└─────────────────────────────────────────────────┘
```

**MCP client spawned mode** (STDIO transport - default):
```
┌──────────────────┐ ┌──────────────────────────┐
│ MCP Client │────▶│ Docker Container │
│ (Claude, etc.) │stdio│ └─ STDIO proxy to HTTP │
└──────────────────┘ └───────────┬──────────────┘
│ HTTP
┌──────────────────────────┐
│ CodeWeaver Daemon │
│ (running on host) │
│ ├─ MCP Server :9328 │
│ └─ Management :9329 │
└──────────────────────────┘
```

Copilot AI Dec 2, 2025

This diagram shows "STDIO proxy to HTTP" running in a Docker container, which then connects to a "CodeWeaver Daemon (running on host)". This architecture is problematic:

  1. The stdio proxy in Docker needs to connect to the host daemon, but 127.0.0.1 inside the container refers to the container's localhost, not the host
  2. Network isolation between container and host makes this challenging without special networking configuration
  3. This use case seems uncommon - if you're using Docker, you'd typically run the full daemon inside the container

The diagram should clarify that this mode is for when the stdio proxy is NOT in Docker, or the documentation should explain the network configuration needed (e.g., using host.docker.internal on Docker Desktop).

Suggested change
> **Note:** In this architecture, the STDIO proxy runs inside a Docker container and connects to a CodeWeaver Daemon running on the host. By default, `127.0.0.1` inside the container refers to the container's own localhost, not the host's. To connect to the host daemon, you must configure Docker networking appropriately:
> - On Docker Desktop (Mac/Windows), use `host.docker.internal` as the host address.
> - On Linux, you may need to use the host network mode (`--network=host`) or set up a custom bridge network.
> This setup is uncommon; typically, both the STDIO proxy and daemon run inside the same container for simplicity.

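The networking point in this comment can be illustrated with a short sketch of how a proxy might pick the daemon address. This is only an illustration: `daemon_base_url` and the environment variable `CW_EXAMPLE_DAEMON_HOST` are hypothetical names invented for this example and are not CodeWeaver settings.

```python
import os
from typing import Mapping, Optional


def daemon_base_url(env: Optional[Mapping[str, str]] = None, port: int = 9328) -> str:
    """Return the base URL a stdio proxy would use to reach the daemon.

    CW_EXAMPLE_DAEMON_HOST is a hypothetical override for this sketch. Inside a
    container it could be set to host.docker.internal (Docker Desktop), since
    127.0.0.1 there refers to the container itself, not the host.
    """
    env = os.environ if env is None else env
    host = env.get("CW_EXAMPLE_DAEMON_HOST", "127.0.0.1")
    return f"http://{host}:{port}"
```

With no override the sketch falls back to `127.0.0.1`, which only works when the proxy and the daemon share a network namespace.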
## Performance Optimization

### Large Codebases
15 changes: 8 additions & 7 deletions Dockerfile
@@ -101,14 +101,15 @@ RUN mkdir -p /app/data /app/config /app/.codeweaver && \
# Switch to non-root user
USER codeweaver

# Health check to ensure service is running
# Health check via management server
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
CMD curl -f http://localhost:9328/health/ || exit 1
CMD curl -f http://localhost:9329/health || exit 1

# Expose the MCP server port
EXPOSE 9328
# Expose the MCP HTTP server port (9328) and management server port (9329)
EXPOSE 9328 9329

# Default command: start the CodeWeaver MCP server
# Users can override this with custom config via docker-compose or docker run
# Default command: start the CodeWeaver daemon in foreground mode
# This runs both management server (9329) and MCP HTTP server (9328)
# For stdio-only mode (MCP clients), use: codeweaver server
ENTRYPOINT ["/entrypoint.sh"]
CMD ["codeweaver", "server", "--host", "0.0.0.0", "--port", "9328"]
CMD ["codeweaver", "start", "--foreground"]

Copilot AI Dec 2, 2025

[nitpick] The Dockerfile changes the health check from port 9328 (MCP HTTP server) to port 9329 (management server), which makes sense. However, the default CMD is changed to run codeweaver start --foreground which starts both servers. The docker-compose.yml overrides this to use --transport streamable-http, but this creates inconsistency: the Dockerfile's default command doesn't actually expose a working MCP endpoint on 9328 in the way docker-compose expects. Consider whether the default CMD should explicitly start with HTTP transport for Docker usage, or update documentation to clarify the daemon mode in containers.

Suggested change
CMD ["codeweaver", "start", "--foreground"]
CMD ["codeweaver", "start", "--foreground", "--transport", "streamable-http"]

31 changes: 21 additions & 10 deletions README.md
@@ -98,29 +98,29 @@ cw server

### MCP Configuration

Behind the scenes, CodeWeaver always runs an http server for mcp services, but you can use it with *either* stdio or http transports. Your stdio session bridges to the http server.
CodeWeaver uses **stdio transport by default**, which proxies to the HTTP backend daemon. First start the daemon with `codeweaver start`, then MCP clients can connect via stdio.

`cw init` will add CodeWeaver to your project's `.mcp.json`:

```json "with http:"
```json "with stdio (default):"
{
"mcpServers": {
"codeweaver": {
"type": "http",
"url": "http://127.0.0.1:9328/mcp"
"type": "stdio",
"cmd": "uv",
"args": ["run", "codeweaver", "server"],
"env": {"SOME_API_KEY_FOR_PROVIDERS": "value"}
}
}
}
```

```json "with stdio:"
```json "with http (direct connection):"
{
"mcpServers": {
"codeweaver": {
"type": "stdio",
"cmd": "uv",
"args": ["run", "codeweaver", "server", "--transport", "stdio"],
"env": {"SOME_API_KEY_FOR_PROVIDERS": "value"}
"type": "http",
"url": "http://127.0.0.1:9328/mcp"
}
}
}
@@ -237,7 +237,9 @@ CodeWeaver combines [AST][wiki_ast]-level understanding, semantic relationships,
### CLI Commands

```bash
cw server # Run the MCP server
cw start # Start daemon in background (or --foreground)
cw stop # Stop the daemon
cw server # Run the MCP server (stdio by default)
cw doctor # Full setup diagnostic
cw index # Run indexing without server
cw init # Set up MCP + config
@@ -247,6 +249,15 @@ cw search # Test the search engine
cw config # View resolved configuration
```

#### Running as a System Service

Install CodeWeaver to start automatically on login:

```bash
cw init service # Install and enable (systemd/launchd)
cw init service --uninstall # Remove the service
```

📖 [Full CLI Guide →][cli_guide]


4 changes: 0 additions & 4 deletions cliff.toml
@@ -92,12 +92,10 @@ commit_preprocessors = [
# Input: "Merge pull request #123 from org/branch\n\nPR Description"
# Output: "Merge pull request #123 from org/branch | PR Description"
{ pattern = 'Merge pull request (#[0-9]+) from ([^\n]+)\n+([\s\S]+)', replace = "Merge pull request ${1} from ${2} | ${3}" },

# 2. Mark squash merge commits for easier template detection
# Input: "feat: some feature (#123)"
# Output: "SQUASH | feat: some feature | #123"
{ pattern = '^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\(.+?\))?: (.+) \(#([0-9]+)\)$', replace = "SQUASH | ${1}${2}: ${3} | #${4}" },

# 3. Catch squash merges without conventional commit prefix
# Input: "Some change (#123)"
# Output: "SQUASH | Some change | #123"
@@ -108,7 +106,6 @@ commit_preprocessors = [
commit_parsers = [
# Skip non-PR merge commits (like "Merge branch 'main'")
{ message = "^Merge branch", skip = true },

# === Squash Merge Parsers (conventional commits) ===
# After preprocessing, squash merges are in format: "SQUASH | type: description | #123"
{ message = "^SQUASH \\| feat", group = "Features" },
@@ -122,7 +119,6 @@ commit_parsers = [
{ message = "^SQUASH \\| chore", group = "Maintenance" },
{ message = "^SQUASH \\| revert", group = "Reverts" },
{ message = "^SQUASH \\|", group = "Other Changes" },

# === Regular Merge Commit Parsers (branch-based) ===
# Features - check branch name patterns
{ message = "^Merge pull request #[0-9]+ from .*/feat", group = "Features" },
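
The squash-merge preprocessor in the hunks above (rule 2) can be checked quickly outside git-cliff. This is a rough sketch using Python's `re` module, whose syntax differs slightly from the regex engine git-cliff uses; the function name `mark_squash` is hypothetical, and the pattern is copied from the `cliff.toml` lines shown in this diff.

```python
import re

# Same pattern as preprocessor rule 2 in cliff.toml: conventional-commit subject
# with an optional scope, followed by a trailing "(#PR)" reference.
CONVENTIONAL_SQUASH = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\(.+?\))?: (.+) \(#([0-9]+)\)$"
)


def mark_squash(subject: str) -> str:
    """Rewrite 'type(scope): text (#N)' as 'SQUASH | type(scope): text | #N'."""
    # An unmatched optional scope (group 2) is substituted as an empty string.
    return CONVENTIONAL_SQUASH.sub(r"SQUASH | \1\2: \3 | #\4", subject)


print(mark_squash("feat: some feature (#123)"))  # SQUASH | feat: some feature | #123
```

Subjects that do not match the pattern, such as `Merge branch 'main'`, are returned unchanged, which is why the parsers still need the separate `^Merge branch` skip rule.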
2 changes: 2 additions & 0 deletions docker-compose.yml
@@ -42,6 +42,8 @@ services:
context: .
dockerfile: Dockerfile
container_name: codeweaver-server
# Use HTTP transport for persistent service mode (daemon not needed in Docker)

Copilot AI Dec 2, 2025

[nitpick] Incomplete documentation in comment. The comment states "Use HTTP transport for persistent service mode (daemon not needed in Docker)" but this is misleading. The daemon architecture is still relevant in Docker - it's just that in docker-compose mode, you're running the full daemon with the start --foreground command.

Consider clarifying:

# Use HTTP transport in docker-compose for direct HTTP connections
# The daemon runs in foreground mode, providing both management and MCP HTTP servers
command: ["codeweaver", "server", "--host", "0.0.0.0", "--port", "9328", "--transport", "streamable-http"]
Suggested change
# Use HTTP transport for persistent service mode (daemon not needed in Docker)
# Use HTTP transport in docker-compose for direct HTTP connections
# The daemon runs in foreground mode, providing both management and MCP HTTP servers

command: ["codeweaver", "server", "--host", "0.0.0.0", "--port", "9328", "--transport", "streamable-http"]
ports:
- ${CODEWEAVER_PORT:-9328}:9328
volumes:
10 changes: 5 additions & 5 deletions docs/CLI.md
@@ -123,7 +123,7 @@ $ codeweaver server [OPTIONS]
* `-p, --project`:
* `--host`: *[default: 127.0.0.1]*
* `--port`: *[default: 9328]*
* `-t, --transport`: Transport type for MCP communication (streamable-http or stdio) *[choices: streamable-http, stdio]* *[default: streamable-http]*
* `-t, --transport`: Transport type for MCP communication (stdio or streamable-http) *[choices: stdio, streamable-http]* *[default: stdio]*
* `-v, --verbose, --no-verbose`: Enable verbose logging with timestamps *[default: --no-verbose]*
* `-d, --debug, --no-debug`: Enable debug logging *[default: --no-debug]*

@@ -574,7 +574,7 @@ flag. *[choices: claude_code, claude_desktop, cursor, gemini_cli, vscode, mcpjs
* `--host`: CodeWeaver server host *[default: http://127.0.0.1]*
* `--port`: CodeWeaver server port *[default: 9328]*
* `-f, --force, --no-force`: Force overwrite existing config *[default: --no-force]*
* `-t, --transport`: Transport type (streamable-http or stdio). Streamable default and recommended. *[choices: streamable-http, stdio]* *[default: streamable-http]*
* `-t, --transport`: Transport type (stdio or streamable-http). Stdio is default and recommended. *[choices: stdio, streamable-http]* *[default: stdio]*
* `--config-extension`: *[choices: toml, yaml, yml, json]* *[default: toml]*
* `--config-path`: Custom path for CodeWeaver configuration file
* `--mcp-config-level`: The level of mcp configuration to write to (project or user) *[choices: project, user]* *[default: project]*
@@ -617,8 +617,8 @@ Set up MCP client configuration for CodeWeaver.
This command generates MCP client configuration that allows AI assistants like Claude Code, Cursor, or VSCode to connect
to CodeWeaver's MCP server.

Transport Types: - streamable-http (default): HTTP-based transport for persistent server connections - stdio: Standard
input/output transport that launches CodeWeaver per-session
Transport Types: - stdio (default): Standard input/output transport that proxies to HTTP backend - streamable-http: Direct
HTTP-based transport for persistent server connections

Tip: Set a default MCP config in your CodeWeaver config, then just run cw init mcp --client your_client --client
another_client to generate the config for those clients.
@@ -639,7 +639,7 @@ $ codeweaver init mcp [OPTIONS]
specified. *[choices: claude_code, claude_desktop, cursor, gemini_cli, vscode, mcpjson]*
* `--host`: [http-only] Server host address (default: http://127.0.0.1) *[default: http://127.0.0.1]*
* `--port`: [http-only] Server port (default: 9328) *[default: 9328]*
* `-t, --transport`: Transport type for MCP communication *[choices: streamable-http, stdio]* *[default: streamable-http]*
* `-t, --transport`: Transport type for MCP communication *[choices: stdio, streamable-http]* *[default: stdio]*
* `--timeout`: Timeout in seconds for MCP client connections *[default: 120]*
* `--auth`: Authentication method for MCP client (bearer token, 'oauth', an httpx.Auth object, or None) *[choices: oauth]*
* `--cmd`: [stdio-only] Command to start MCP client process