CodeWeaver Architecture & Design Decisions

Purpose: This document serves as the authoritative reference for CodeWeaver's architectural decisions, design principles, and technical philosophy. It consolidates design decisions scattered across multiple project files into a unified resource.

Status: Living document - Updated as architectural decisions evolve Version: 1.2.0 Last Updated: 2025-12-02

Project Constitution
Core Philosophy
Design Principles
Architectural Goals
Technical Architecture
API Architecture
Provider Architecture
Code Design Patterns
Testing Philosophy
Development Workflow
Brand Voice & Terminology
Key Technical Decisions
Trade-offs & Constraints

Project Constitution

AUTHORITATIVE SOURCE: .specify/memory/constitution.md v2.0.1

The CodeWeaver Constitution is the highest authority for all technical and architectural decisions. This constitution supersedes all other development practices and guidelines.

Constitutional Principles (Non-Negotiable)

I. AI-First Context

Deliver precise codebase context for plain language agent requests. Every feature must enhance the ability of AI agents to understand and work with code through "exquisite context." Design APIs, documentation, and tooling with AI consumption as the primary interface, not an afterthought.

Rationale: CodeWeaver's mission is to bridge the gap between human expectations and AI agent capabilities, making AI-first design essential for success.

II. Proven Patterns

Leverage established abstractions and ecosystem alignment over reinvention. Channel proven architectural patterns from FastAPI, pydantic ecosystem, and established open source projects. Use familiar interfaces that reduce learning curve and increase adoption.

Rationale: Established patterns reduce development risk, accelerate onboarding, and provide battle-tested solutions to common problems.

III. Evidence-Based Development (NON-NEGOTIABLE)

All technical decisions must be supported by verifiable evidence: documentation, testing, metrics, or reproducible demonstrations. No workarounds, mock implementations, or placeholder code without explicit user authorization. "No code beats bad code."

Rationale: Evidence-based development ensures reliability, maintainability, and prevents technical debt from accumulating.

IV. Testing Philosophy

Effectiveness over coverage. Focus on critical behavior affecting user experience, realistic integration scenarios, and input/output validation. Integration testing preferred over unit testing. One solid, realistic test beats ten implementation detail tests.

Rationale: Testing should validate real-world usage patterns and prevent user-affecting bugs, not just achieve coverage metrics.

V. Simplicity Through Architecture

Transform complexity into clarity. Use simple modularity with extensible yet intuitive design where purpose should be obvious. Implement flat structure grouping related modules in packages while avoiding unnecessary nesting.

Rationale: Simplicity enables maintainability, reduces bugs, and makes the codebase accessible to contributors.

Constitutional Governance

Amendment Process: Constitution changes require documentation of rationale, impact analysis, and migration plan for affected code
Enforcement: All code reviews, feature planning, and technical decisions must verify compliance with constitutional principles
Complexity Justification: Any deviation requires documented rationale and consideration of simpler alternatives

Core Philosophy

Mission Statement

Bridge the gap between human expectations and AI agent capabilities through "exquisite context." Create beneficial cycles where AI-first tools enhance both agent and human capabilities.

The Problem We Solve

AI coding agents face too much irrelevant context (70-80% unused), causing token waste, missed patterns, and hallucinations.

Root Causes: Tool confusion (5-20+ discovery tools), context bloat (25K-40K tokens in tool descriptions), proprietary lock-in, search fragmentation.

Our Solution: Single find_code tool + agent-driven curation + hybrid search (text + semantic + AST) + platform extensibility.

Impact: 60-80% context reduction, >90% relevance, 5x cost savings per query.

For Complete Analysis: See PRODUCT.md - The Problem We Solve for detailed user impact, competitive landscape, and market positioning.

Design Principles

CodeWeaver is built on five core principles that guide every technical decision:

1. AI-First Context

Every feature enhances AI agent understanding of code through precise context delivery. We design for AI consumption first, human inspection second.

In Practice:

Single-tool interface (find_code) eliminates tool selection cognitive load
Natural language queries over rigid syntax
Span-based precision (exact line/column references)
Task-aware context ranking

2. Proven Patterns Over Reinvention

We use proven patterns from successful open source projects in the pydantic ecosystem (FastAPI, pydantic-ai, FastMCP). Familiar interfaces reduce learning curve and increase adoption.

In Practice:

FastAPI-style dependency injection and middleware patterns
pydantic models for all structured data
Plugin architecture inspired by VS Code and Babel
Configuration via pydantic-settings

3. Evidence-Based Development

All technical decisions backed by verifiable evidence: documentation, testing, metrics, or reproducible demonstrations. No workarounds, no placeholder code, no "it should work" assumptions.

In Practice:

No NotImplementedError or TODO shortcuts
All features backed by tests or documentation
Architectural decisions documented with rationale
Performance claims backed by benchmarks

4. Effectiveness Over Coverage

Testing focuses on critical behavior affecting user experience. One realistic integration test beats ten implementation detail tests. Code coverage scores don't measure outcomes.

In Practice:

Integration tests over unit tests
Test realistic user workflows
Focus on user-affecting behavior
Coverage as indicator, not goal

5. Simplicity Through Architecture

We transform complexity into clarity using simple modularity with extensible design where purpose is obvious. Flat structure, clear naming, minimal nesting.

In Practice:

Flat project structure with obvious purpose
Single responsibility modules
Clear naming conventions
Extensible through composition, not inheritance

Architectural Goals

Goal 1: Semantically-Rich, Ranked Context Delivery

What: Provide developers and AI agents with weighted, prioritized search results using AST analysis and multiple embedding/reranking models.

How:

ast-grep for semantic code analysis
Support for dozens of embedding and reranking models (local and remote)
Fully pluggable provider architecture
Integration of arbitrary data sources beyond codebase

Goal 2: Eliminate Cognitive Load on Agents

What: Reduce all operations to a single, simple, plain language tool.

How:

Single find_code tool replaces 5-20+ specialized tools
Natural language queries with optional structured filters
MCP sampling for context curation (separate agent instance)
Agent evaluates needs without polluting primary agent's context

Goal 3: Significantly Reduce Context Bloat and Costs

What: Restrict context to only what agents need while maintaining precision.

How:

Span-based code intelligence (exact line/column tracking)
Token-aware context assembly
Agent-driven curation for precision
Background indexing keeps context current

Technical Architecture

Daemon Architecture

Design Decision: Separate background services from MCP transport servers.

CodeWeaver runs as a daemon with distinct server components:

┌─────────────────────────────────────────────────────────────────┐
│                     CodeWeaver Daemon                            │
│                                                                  │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │              Background Services                          │   │
│  │  • Indexer (semantic search engine)                       │   │
│  │  • FileWatcher (real-time index updates)                  │   │
│  │  • HealthService (system monitoring)                      │   │
│  │  • Statistics & Telemetry                                 │   │
│  └──────────────────────────────────────────────────────────┘   │
│                                                                  │
│  ┌─────────────────────┐    ┌─────────────────────────────┐    │
│  │  Management Server   │    │    MCP HTTP Server          │    │
│  │  (Starlette)         │    │    (FastMCP)                │    │
│  │  Port 9329           │    │    Port 9328                │    │
│  │                      │    │                             │    │
│  │  • /health           │    │    • /mcp/ (MCP endpoint)   │    │
│  │  • /status           │    │    • find_code tool         │    │
│  │  • /metrics          │    │                             │    │
│  │  • /state            │    │                             │    │
│  └─────────────────────┘    └─────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────┘
                                        ▲
                                        │
                    ┌───────────────────┴───────────────────┐
                    │                                       │
           ┌────────┴────────┐                    ┌─────────┴─────────┐
           │  stdio Proxy     │                    │   HTTP Clients    │
           │  (MCP Clients)   │                    │   (Direct)        │
           │                  │                    │                   │
           │  Claude Code     │                    │  curl, httpie     │
           │  Cursor, VSCode  │                    │  Custom clients   │
           └──────────────────┘                    └───────────────────┘

Components:

Daemon: Long-running background process managing all services
Management Server (port 9329): Starlette HTTP server for health checks, status, and metrics
MCP HTTP Server (port 9328): FastMCP server handling MCP protocol over HTTP
stdio Proxy: Lightweight process that proxies MCP stdio to the HTTP backend

Transport Modes:

stdio (default): MCP clients spawn a stdio process that proxies to the daemon's HTTP server. Auto-starts daemon if needed.
streamable-http: Direct HTTP connection to the MCP server (for persistent server deployments)

CLI Commands:

cw start              # Start daemon in background
cw start --foreground # Run daemon in current terminal
cw stop               # Stop the daemon
cw init service       # Install as system service (systemd/launchd)
cw server             # Run MCP server (stdio by default)

Rationale:

Separates concerns: background indexing vs. request handling
Enables safe stdio transport: proxy is stateless, daemon handles state
Management endpoints accessible regardless of MCP transport
System service installation for production deployments
Graceful degradation: stdio auto-starts daemon if not running

Span-Based Core

Design Decision: Use immutable span-based architecture for precise code location tracking.

Components:

Span: Represents precise code locations (line/column)
SpanGroup: Composition of spans with set operations (union, intersection, difference)
CodeChunk: Carries spans and metadata for token-aware assembly
CodeMatch: Search results with span-based location tracking

Rationale:

Exact references eliminate "nearby code" confusion
Immutable operations enable thread-safe concurrent processing
Set operations allow accurate merging across search passes
Superior to offset-based or line-only approaches

Semantic Metadata from AST-Aware Indexing

Design Decision: Comprehensive semantic metadata classification system.

Components:

ExtKind: Enumerates language and chunk types
SemanticMetadata: Tracks AST nodes and classifications
Task-based priority ranking for nuanced searches
Support for 26 programming languages

Rationale:

AST awareness improves chunk boundary detection
Semantic classification enables intelligent ranking
Task-aware priorities deliver contextually relevant results

Backup Code-Aware Indexing (170+ Languages)

Design Decision: Sophisticated heuristic fallback for languages without AST support.

Features:

Pattern-based context identification for 170+ languages
Support for legacy codebases (COBOL, Pascal, Fortran, etc.)
Custom language support via configuration
Family pattern association (C-style, Python, ML, Lisp, etc.)

Rationale:

Enables CodeWeaver to work with any codebase, not just modern languages
Maintains value proposition for legacy code maintenance
Graceful degradation when AST unavailable

Hybrid Search Pipeline

Design Decision: Combine multiple search signals with unified ranking.

Architecture:

┌─────────────────────────────────────────┐
│        Hybrid Search Pipeline            │
│                                          │
│  ┌──────────┐  ┌──────────┐  ┌────────┐│
│  │   Text   │  │ Semantic │  │  AST   ││
│  │  Search  │  │Embeddings│  │Analysis││
│  └──────────┘  └──────────┘  └────────┘│
│                                          │
│  ┌────────────────────────────────────┐ │
│  │      Unified Ranking               │ │
│  │   (Span-based Assembly)            │ │
│  └────────────────────────────────────┘ │
└─────────────────────────────────────────┘

Rationale:

Text search: Fast keyword matching
Semantic search: Intent understanding
AST analysis: Structure awareness
Unified ranking: Best of all approaches

API Architecture

Three-Tier API Design

Design Decision: Different interfaces optimized for different users.

1. Human API: Deep Configurability

Interface: Configuration files (TOML/YAML), CLI commands, environment variables

Philosophy: Humans want control, customization, and understanding

Surface Area: Extensive

Provider selection and configuration (10+ embedding providers)
Chunking strategies and parameters
Ranking algorithms and weights
Token budgets and caching policies
Plugin architecture for custom middleware
Integration with existing infrastructure

Example:

[embedding]
provider = "voyageai"
model = "voyage-code-3"
batch_size = 32

[chunking]
strategy = "ast-aware"
max_tokens = 512
overlap_tokens = 50

[ranking]
weights = { semantic = 0.4, keyword = 0.3, ast = 0.3 }

2. User Agent API: Radical Simplicity

Interface: MCP tools

Philosophy: Agents focus on what to find, not how to search

Surface Area: 1-3 tools

find_code(query, intent?, filters?) - primary interface
change_code(...) - (future) for code modification
get_context(...) - (future) for explicit context requests

Example:

Agent uses: find_code
Query: "authentication middleware patterns"
Intent: "implementation" (optional)
Filters: { language: "python", file_type: "code" } (optional)

→ Returns: Precise, ranked results with provenance

3. Context Agent API: Controlled Expansion

Interface: Extended MCP tools for context curation agents

Philosophy: Context agents need more capability than user agents, but still bounded

Surface Area: 3-8 specialized tools

find_code - same as user agent
get_semantic_neighbors - explore related code
get_call_graph - understand execution flow
get_import_tree - track dependencies
analyze_context_coverage - assess completeness

Rationale: Enables sophisticated curation via MCP sampling without exposing complexity to end users

Why This Architecture Matters

Eliminates False Trade-off: Traditional thinking says "powerful features OR simple interface—pick one." CodeWeaver provides both through appropriate interfaces for each user type.

Each API Optimized for Its User:

Humans: Understand, configure, customize, debug, extend
User Agents: Get work done without cognitive overhead
Context Agents: Targeted tools for sophisticated curation

Addresses Different Pain Points:

Human API → Solves "locked into vendor decisions"
User Agent API → Solves "tool confusion and context bloat"
Context Agent API → Enables "intelligent curation without complexity"

Provider Architecture

Design Decision: Pluggable Provider Ecosystem

Philosophy: Infrastructure freedom - use what you already have

Provider Types:

Provider Coverage

Embedding: 10+ providers including VoyageAI, OpenAI, Bedrock, Cohere, Google, Mistral, Hugging Face, fastembed, sentence-transformers

Rerank: VoyageAI, Bedrock, Cohere, fastembed, sentence-transformers

Agent: Full pydantic-ai ecosystem (OpenAI, Anthropic, Google, Mistral, Groq, Hugging Face, Bedrock, and more)

Vector Stores: Qdrant (production), in-memory (development)

Data Sources: Filesystem (with file watching), Tavily, DuckDuckGo | Planned: Context7

Complete Details: See README.md - Providers for installation options and Provider Documentation for configuration guides.

Implementation Pattern:

# Abstract base protocol
class EmbeddingProvider(Protocol):
    async def embed(self, texts: list[str]) -> list[list[float]]: ...
    def get_dimensions(self) -> int: ...

# Concrete implementation
class VoyageAIProvider(EmbeddingProvider):
    def __init__(self, api_key: str, model: str = "voyage-code-3"): ...
    async def embed(self, texts: list[str]) -> list[list[float]]: ...
    def get_dimensions(self) -> int: ...

# Registration
ProviderRegistry.register_embedding_provider("voyageai", VoyageAIProvider)

Rationale:

No vendor lock-in
Use existing infrastructure investments
Community can contribute custom providers
Platform positioning vs. point solution

Configuration Management

Design Decision: Hierarchical configuration via pydantic-settings

Features:

Multi-source: Environment variables, TOML files, defaults
Capability-based provider selection
Dynamic instantiation with health monitoring
All settings documentable and validateable

Rationale:

Proven pattern from pydantic ecosystem
Type-safe configuration with validation
Clear precedence rules
No hardcoded values in implementation

Code Design Patterns

Pydantic Architecture Patterns

Design Decision: Channel FastAPI/pydantic ecosystem patterns

Key Patterns:

Flexible Generics

Few broadly-defined models/protocols/ABCs for wide reuse

class BasedModel(BaseModel):
    """Base for all CodeWeaver models with common config"""
    model_config = ConfigDict(frozen=True, extra="forbid")

Smart Decorators

Extend class/function roles cleanly

@wrap_filters(filterable_fields=DEFAULT_FILTERABLE_FIELDS)
async def find_code(query: str, **dynamic_filters): ...

Dependency Injection

Explicit dependencies, organized pipelines

async def search(
    query: str,
    embedding_provider: EmbeddingProvider = Depends(get_embedding_provider),
    vector_store: VectorStore = Depends(get_vector_store),
): ...

Flat Structure

Group closely related modules, otherwise keep root-level

src/codeweaver/
├── _common.py           # Shared utilities
├── embedding/          # All embedding providers
├── vector_stores/      # All vector store providers
├── services/           # Business logic services
└── middleware/         # FastMCP middleware

Types as Functionality

Types ARE the behavior, not separate from it

class Span:
    """Immutable code location with set operations"""
    def union(self, other: Span) -> SpanGroup: ...
    def intersection(self, other: Span) -> SpanGroup | None: ...

Lazy Evaluation & Immutability

Design Decision: Performance + memory efficiency + fewer debugging headaches

Patterns:

Sequences: Use Generator/AsyncGenerator, tuple over lists
Sets: Use frozenset for set-like objects
Dicts: Read-only dicts use types.MappingProxyType
Models: Use frozen=True for dataclasses/models

Rationale:

Immutable data structures are thread-safe
Lazy evaluation reduces memory footprint
Functional patterns reduce bugs
Easier to reason about code flow

Type System Discipline

Design Decision: Strict typing with opinionated pyright rules

Requirements:

All public functions have type annotations (including -> None)
Use TypedDict, Protocol, NamedTuple, enum.Enum for structured data
Avoid generic types like dict[str, Any] when structure is known
Modern Python ≥3.12 syntax (int | str, typing.Self, type keyword)

Project-Specific Types:

from codeweaver.core.types import BasedModel, BaseEnum, DataclassSerializationMixin

# Use BasedModel instead of BaseModel
class MyModel(BasedModel): ...

# Use BaseEnum instead of Enum
class MyEnum(BaseEnum): ...

# Use pydantic dataclass with serialization mixin
@dataclass
class MyData(DataclassSerializationMixin): ...

Rationale:

Maintainability and self-documentation
Catch errors at development time
Enable better IDE support
Make refactoring safer

Testing Philosophy

Effectiveness Over Coverage

Core Principle: Testing focuses on critical behavior affecting user experience, not coverage metrics.

Priorities:

Realistic integration scenarios
User-affecting behavior
Input/output validation for important functions
Edge cases that could cause failures

Anti-Patterns (What We Don't Do):

Chase 100% coverage for its own sake
Test implementation details
Create barriers to innovation with rigid tests
Maintain low-value tests that don't prevent bugs

Testing Strategy

Integration Testing > Unit Testing: One solid, realistic test beats ten implementation detail tests.

Test Categories (via pytest markers):

unit: Individual component tests (when appropriate)
integration: Component interaction tests (preferred)
e2e: End-to-end workflow tests
benchmark: Performance validation
network/external_api: Tests requiring external services
async_test: Asynchronous test cases

Example: Testing find_code

# Good: Integration test of realistic workflow
@pytest.mark.integration
async def test_find_code_authentication_query():
    """Test that find_code returns relevant authentication code"""
    result = await find_code("how do we handle authentication?")
    
    assert len(result.matches) > 0
    assert any("auth" in match.file_path.lower() for match in result.matches)
    assert result.metadata.token_count < 10000

# Avoid: Testing implementation details
# def test_find_code_internal_query_parser(): ...  # Too low-level

Rationale: This approach ensures tests validate real-world usage and prevent user-affecting bugs, not just achieve arbitrary coverage numbers.

Development Workflow

Code Quality Standards

Docstrings: Google convention (loose), plain language, active voice, present tense

Start with verbs: "Adds numbers" not "This function adds numbers"
Don't waste space explaining the obvious—strong typing makes args/returns clear

Line length: 100 characters

Auto-formatting: Ruff configuration enabled

Python typing: Modern ≥3.12 syntax

Common Linting Patterns

Logging:

No f-strings in log statements (use %s formatting or extra=)
Most errors should use logging.warning to allow them to get handled by codeweaver's UI (logging.exception bypasses our UI)
No print statements in production code

Exception Handling:

Specify exception types (no bare except:)
Use raise from to maintain exception context
Use contextlib.suppress for intentional suppression
Raise to specific CodeWeaver exceptions (codeweaver.exceptions)

Functions:

Type all parameters and returns (including -> None)
Boolean kwargs only: use * separator

def my_function(arg1: str, *, flag: bool = False) -> None:
    pass

Red Flag Protocol for Agents

When to Stop and Investigate:

API behavior differs from expectations
Files/functions aren't where expected
Code behavior contradicts documentation

Response Protocol:

Stop current work immediately
Review understanding and plans systematically
Assess approach:
- Does it comply with Project Constitution?
- Is it consistent with requirements?
- Do you have sufficient information?
Research using available tools
Ask user for clarification if still unclear
Never create workarounds without explicit authorization

Bottom Line: No code beats bad code (Constitutional Principle III).

Brand Voice & Terminology

Mission Alignment

Bridge the gap between human expectations and AI agent capabilities through "exquisite context." This mission directly implements Constitutional Principle I (AI-First Context).

User Terms

Agent/AI Agent (not "model", "LLM", "tool")

Developer's Agent: Focused on developer tasks
Context Agent: Internal agents delivering information

Developer/End User: People using CodeWeaver

Developer User: Uses CodeWeaver as development tool
Platform Developer: Builds with/extends CodeWeaver

Us: First person plural (not "Knitli" or "team")
Contributors: External contributors when distinction needed

Core Values

Simplicity: Transform complexity into clarity, eliminate jargon
Humanity: Enhance human creativity, design people-first
Utility: Solve real problems, meet users where they are
Integration: Power through synthesis, connect disparate elements

Personality

We are: Approachable, thoughtful, clear, empowering, purposeful

We aren't: Intimidating, unnecessarily complex, cold, AI-for-AI's-sake

Communication Style

Plain language accessible to all skill levels
Simple examples with visual aids
Conversational and human, not robotic
Honest about capabilities and limitations
Direct focus on user needs and goals

Key Technical Decisions

Decision: Single-Tool Interface

Context: How many tools should CodeWeaver expose via MCP?

Options Considered:

Multiple specialized tools (search_code, get_definitions, find_references)
Single flexible tool (find_code with intent parameter)
Agent-specific tools (quick_search, deep_analysis)

Decision: Single find_code tool with optional intent parameter for user's agent

Rationale (Constitutional Principles Applied):

Simplicity Through Architecture: One interface reduces cognitive load
AI-First Context: Agents express natural language queries without tool selection
Proven Patterns: FastAPI-style flexible parameters vs. endpoint proliferation
Evidence-Based: User testing showed tool selection added complexity without value

Outcome:

Lower barrier to adoption (one tool to learn)
Cleaner agent prompts (no tool selection logic)
Easier to extend (intent parameter can evolve)
Better user feedback (focus on single interface quality)

Decision: Full Plugin Architecture

Context: Build extensibility now or ship faster with fixed providers?

Options Considered:

Hardcode VoyageAI + Qdrant, ship immediately
Abstract providers minimally, add more later
Full plugin architecture from start

Decision: Full plugin architecture (Constitutional Principle II: Proven Patterns)

Rationale:

FastAPI demonstrates value of dependency injection and extensibility
User research showed diverse infrastructure preferences
Platform thinking: enable ecosystem to extend
Evidence: Successful platforms (VS Code, Babel) prioritize plugins early

Outcome:

10+ embedding providers supported
Vendor independence (no lock-in)
Community can contribute custom providers
Platform positioning vs. point solution

Decision: Span-Based Architecture

Context: How to represent code locations precisely?

Options Considered:

Line numbers only
Character offsets
Span-based with line/column and set operations

Decision: Immutable span-based architecture with set operations

Rationale:

Exact references eliminate "nearby code" confusion
Set operations enable accurate merging across search passes
Immutable operations are thread-safe
Superior composition capabilities

Outcome:

Precise code location tracking
Clean composition of results
Thread-safe concurrent processing
Better than alternatives in every dimension

Decision: Hybrid Search with Unified Ranking

Context: How to combine different search approaches?

Options Considered:

Text search only (fast but imprecise)
Semantic search only (precise but expensive)
Parallel search with simple score combination
Hybrid pipeline with unified span-based ranking

Decision: Hybrid pipeline with unified ranking across all signals

Rationale:

Text search provides speed and keyword matching
Semantic search provides intent understanding
AST analysis provides structural awareness
Unified ranking leverages best of all approaches

Outcome:

Best precision and recall characteristics
Handles both keyword and semantic queries well
Graceful degradation when components unavailable
Extensible to additional signals

Decision: MCP Sampling for Context Curation

Context: How to curate context without polluting agent's context?

Options Considered:

Return all potentially relevant results
Use heuristics to filter results
Use separate agent instance for curation (MCP sampling)

Decision: MCP sampling - separate agent evaluates and curates

Rationale:

AI-First Context: Well-prompted agents can better shape context delivery
Zero context pollution in primary agent
40-60% better precision vs. heuristics
Leverages MCP protocol capability

Outcome:

Superior result quality
No context overhead in primary agent
Natural language curation logic
Innovative use of MCP protocol

Decision: MCP-Independent Context Curation

Context: How to ensure premium context gets delivered in the first turn? (MCP tools activate only after agents call them)

Options Considered:

MCP-only with prompting to encourage early use
Independent agent handling for proactive delivery
Dual approach: MCP sampling + independent agent handling + CLI exposure

Decision: Dual approach with both MCP and independent paths

Rationale:

User Control: Direct context retrieval via CLI before starting agent interactions
Strategic Flexibility: Not locked into MCP protocol evolution
First-Turn Context: Optimal delivery doesn't wait for agent tool selection
Future-Proof: Enables HTTP dashboards, IDE integrations, custom workflows
Proven Stack: pydantic-ai provides production-ready agent handling

Outcome:

Users proactively request tailored context (CLI: codeweaver search "auth patterns")
Agents receive context-rich initial state, biased toward CodeWeaver for follow-ups
MCP protocol changes don't break core functionality
Foundation for Phase 3 platform evolution (MCP orchestration hub)

Trade-off: More implementation complexity → Greater strategic flexibility + better UX

Decision: Only Index Known or User-Defined File Types

Context: Prevent irrelevant files from polluting agent context while maintaining comprehensive coverage.

Options Considered:

Index all text files (simple binary check only)
Constrain to well-known paths only (src/, docs/, etc.)
Comprehensive known-filetype catalog with user extensibility

Decision: Option 3 - Catalog 300+ known extensions across 170+ languages, allow user configuration for additions

Rationale:

Quality over Coverage: Unknown filetypes risk high-noise, low-value context pollution
Smart Inference: Finite set of common filetypes enables confident purpose/importance estimation before indexing
User-Friendly: Doesn't require users to manually configure common cases
Security Benefit: Avoids indexing hidden/unusual files that could contain injection attacks

Outcome:

300+ known code, documentation, and configuration extensions defined
Repository pattern mapping for type/language/purpose/importance classification
User extensibility via simple configuration for edge cases
Optimal balance: comprehensive coverage without noise

Trade-off: More upfront cataloging work → Superior precision and user experience

Decision: Generate Backup Embeddings and Storage

Context: Prevent offline or unexpected vector store unavailability from disabling CodeWeaver's core functions.

Options Considered:

Offer a backup system but don't enable by default
Let CodeWeaver fail, explaining the root issue with the vector store.
Automatic backup-as-a-feature.

Decision: Option 3 -- Create a robust backup/fallback system by default, continually prepared to take over. Use lightweight models and persisted json to keep an effective but relatively lightweight backup using the in-memory provider.

Rationale:

Resilience: CodeWeaver's single point of failure is the vector store. Without a backup, if the vector store is unavailable, such as by being offline or disrupted service, then CodeWeaver can't function. A backup allows for service to continue.
Flexibility: User's don't have to worry about whether CodeWeaver will work or not. It always will be available, with slightly degraded capabilities.
Reduces Confusion: AI Agents don't encounter cryptic errors about CodeWeaver's internals. They can continue to focus on their job.

Trade-off: Increased implementation and complexity; more resource demand (mitigated by smart/low priority resource management)

Strategic Design Principles

These principles guide CodeWeaver's technical decision-making and differentiate our approach from typical MCP servers.

1. Deep Quality as Competitive Moat

Principle: Favor time-intensive, high-quality implementations over fast iterations.

Rationale:

Unknown products in crowded spaces (MCP ecosystem) must differentiate through exceptional quality
Technical debt is expensive for small teams - build maintainable platforms from the start
"Build in the open" benefits from showing polished progress vs. frequent pivots
Quality compounds: each well-built component makes future work easier

In Practice:

Full plugin architecture instead of hardcoded providers (Decision: Full Plugin Architecture)
Span-based architecture with set operations vs. simple line numbers (Decision: Span-Based Architecture)
170+ language support with sophisticated heuristics (not just "the big 10")

Trade-off: Longer time to market → Higher quality at launch → Stronger positioning

2. Principle-Driven vs. Convention-Driven Design

Principle: Design decisions derive from constitutional principles and user needs, not "what others have done."

Rationale:

MCP is young with no established "right way" yet
CodeWeaver is a platform using MCP, not an "MCP tool" - different goals require different approaches
Primary goal: Better development experience through exquisite context (MCP is means, not end)
Following conventions blindly risks optimizing for wrong outcomes

In Practice:

Single-tool interface while others expose 5-20+ tools (Decision: Single-Tool Interface)
MCP-independent agent handling for premium context delivery (Decision: MCP-Independent Context Curation)
Three-tier API architecture vs. one-size-fits-all (Decision: Three-Tier API Design in PRODUCT.md)

Trade-off: Less "idiomatic" MCP → Better user outcomes → Strategic differentiation

Connection to Product Vision

These principles support CodeWeaver's evolution from search tool → context platform → unified MCP orchestration hub (see PRODUCT.md - Product Vision).

Phase 1 (Current): Deep quality in search establishes credibility Phase 2 (2025-2026): Thread integration builds on solid foundation; rudimentary cloud offerings Phase 3 (2026+): Platform play enabled by principle-driven architecture

Trade-offs & Constraints

Trade-off: Complexity vs. Capability

Choice: Rich semantic metadata and span-based architecture

Cost: More complex type system, steeper learning curve

Benefit: Superior precision, extensibility, and composability

Justification: Aligns with Constitutional Principle I (AI-First Context) - precision is the core value proposition. Complexity is managed through clear architecture and documentation.

Trade-off: Configuration Flexibility vs. Simplicity

Choice: Three-tier API architecture

Cost: More surfaces to maintain, more documentation needed

Benefit: Each user type gets optimized interface

Justification: Eliminates false dichotomy between power and simplicity. Maintenance cost is offset by user satisfaction and broader adoption.

Trade-off: Plugin Architecture vs. Time to Market

Choice: Full plugin architecture from the start

Cost: Longer initial development time

Benefit: No vendor lock-in, community extensibility, platform positioning, easier maintenance, less technical debt

Justification: Evidence from successful platforms (VS Code, FastAPI) shows early investment in extensibility pays off. Aligns with Proven Patterns principle. Better long-term to keep technical debt low.

Trade-off: Testing Coverage vs. Development Speed

Choice: Effectiveness over coverage

Cost: Lower coverage metrics, potential for blind spots

Benefit: Faster iteration, focus on user-affecting behavior

Justification: Constitutional Principle IV (Testing Philosophy) - one realistic integration test beats ten implementation tests. Coverage metrics don't prevent user-affecting bugs.

Constraint: Python ≥3.12 Required

Reason: Modern type syntax, performance improvements

Impact: Excludes some legacy environments

Mitigation: Clear documentation, most users on 3.12+ already

Justification: Modern syntax enables better type safety and code clarity (Simplicity Through Architecture)

Constraint: MCP Protocol Dependency

Reason: Primary interface for AI agents

Impact: Tied to MCP ecosystem evolution

Mitigation: Abstract MCP-specific code, design for protocol flexibility (MCP is a transport, not the product)

Justification: MCP is the emerging standard for agent tools. CodeWeaver designed to work with and without MCP.

Constraint: Embedding Provider Costs

Reason: Semantic search requires embedding models

Impact: Cloud providers have API costs

Mitigation: Local embedding options (fastembed), caching strategies, token budgeting

Justification: Value proposition (60-80% context reduction) significantly outweighs embedding costs; free tiers keep use free or very cheap for most independent/open source developers.

References

Primary Documents

Project Constitution - Authoritative governance
README.md - Project overview and quickstart
PRODUCT.md - Product vision and strategy
CODE_STYLE.md - Code style and patterns
AGENTS.md - Agent development guidelines
IMPLEMENTATION_PLAN.md - Technical roadmap

Architecture Decisions Evolution

As the project evolves, architectural decisions should be documented with:

Context: What problem are we solving?
Options: What alternatives were considered?
Decision: What did we choose?
Rationale: Why this choice? Which constitutional principles apply?
Outcome: What were the results?

This document serves as the historical record of significant architectural decisions and should be updated as new patterns emerge.

Document Maintenance: This document should be updated when:

New architectural patterns are established
Significant technical decisions are made
Constitutional principles are amended
Design patterns evolve

Version History:

v1.0.0 (2025-10-21): Initial unified architecture document
v1.1.0 (2025-11-23): Updated to include recent design decisions and project structure changes.
v1.2.0 (2025-12-02): Added Daemon Architecture section documenting the new multi-server design with management server, MCP HTTP server, and stdio proxy. Default transport changed to stdio.

FilesExpand file tree

ARCHITECTURE.md

Latest commit

History

ARCHITECTURE.md

File metadata and controls

CodeWeaver Architecture & Design Decisions

Table of Contents

Project Constitution

Constitutional Principles (Non-Negotiable)

I. AI-First Context

II. Proven Patterns

III. Evidence-Based Development (NON-NEGOTIABLE)

IV. Testing Philosophy

V. Simplicity Through Architecture

Constitutional Governance

Core Philosophy

Mission Statement

The Problem We Solve

Design Principles

1. AI-First Context

2. Proven Patterns Over Reinvention

3. Evidence-Based Development

4. Effectiveness Over Coverage

5. Simplicity Through Architecture

Architectural Goals

Goal 1: Semantically-Rich, Ranked Context Delivery

Goal 2: Eliminate Cognitive Load on Agents

Goal 3: Significantly Reduce Context Bloat and Costs

Technical Architecture

Daemon Architecture

Span-Based Core

Semantic Metadata from AST-Aware Indexing

Backup Code-Aware Indexing (170+ Languages)

Hybrid Search Pipeline

API Architecture

Three-Tier API Design

1. Human API: Deep Configurability

2. User Agent API: Radical Simplicity

3. Context Agent API: Controlled Expansion

Why This Architecture Matters

Provider Architecture

Design Decision: Pluggable Provider Ecosystem

Provider Coverage

Configuration Management

Code Design Patterns

Pydantic Architecture Patterns

Flexible Generics

Smart Decorators

Dependency Injection

Flat Structure

Types as Functionality

Lazy Evaluation & Immutability

Type System Discipline

Testing Philosophy

Effectiveness Over Coverage

Testing Strategy

Development Workflow

Code Quality Standards

Common Linting Patterns

Red Flag Protocol for Agents

Brand Voice & Terminology

Mission Alignment

User Terms

Core Values

Personality

Communication Style

Key Technical Decisions

Decision: Single-Tool Interface

Decision: Full Plugin Architecture

Decision: Span-Based Architecture

Decision: Hybrid Search with Unified Ranking

Decision: MCP Sampling for Context Curation

Decision: MCP-Independent Context Curation

Decision: Only Index Known or User-Defined File Types

Decision: Generate Backup Embeddings and Storage

Strategic Design Principles

1. Deep Quality as Competitive Moat

2. Principle-Driven vs. Convention-Driven Design

Connection to Product Vision