Commit a0807d8

Add adr
1 parent b8e49c5 commit a0807d8

15 files changed (+594, -0 lines)

docs/docs/.pages

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -6,3 +6,4 @@ nav:
66
- "🛡️ Manage": manage
77
- "💻 Development": development
88
- "🧪 Testing": testing
9+
- "📐 Architecture": architecture

docs/docs/architecture/.pages

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
1+
title: Architecture
2+
nav:
3+
- Overview: index.md
4+
- Decision Records: adr

docs/docs/architecture/adr/.pages

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
1+
title: Decision Records
2+
nav:
3+
- 1 Adopt FastAPI + Pydantic: 001-adopt-fastapi-pydantic.md
4+
- 2 Use Async SQLAlchemy ORM: 002-use-async-sqlalchemy-orm.md
5+
- 3 Expose Multi-Transport Endpoints: 003-expose-multi-transport-endpoints.md
6+
- 4 Combine JWT & Basic Auth: 004-combine-jwt-and-basic-auth.md
7+
- 5 Structured JSON Logging: 005-structured-json-logging.md
8+
- 6 Gateway & Tool-Level Rate Limiting: 006-gateway-tool-rate-limiting.md
9+
- 7 Pluggable Cache Backend: 007-pluggable-cache-backend.md
10+
- 8 Federation & Auto-Discovery via DNS-SD: 008-federation-discovery.md
11+
- 9 Built-in Health Checks: 009-built-in-health-checks.md
12+
- 10 Observability via Prometheus: 010-observability-prometheus.md
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
1+
# ADR-0001: Adopt FastAPI + Pydantic
2+
3+
- *Status:* Accepted
4+
- *Date:* 2025-02-01
5+
- *Deciders:* Mihai Criveti
6+
7+
## Context
8+
9+
The MCP Gateway must serve both human and machine clients with low-latency HTTP and WebSocket endpoints. Payloads require runtime validation and schema documentation, while internal data types must align with environment-driven settings and JSON models.
10+
11+
We explored Python-native frameworks that support async-first operation, data validation, OpenAPI generation, and modular service layout.
12+
13+
## Decision
14+
15+
We will adopt:
16+
17+
- **FastAPI** as the core web framework for routing HTTP, WebSocket, and streaming endpoints.
18+
- **Pydantic v2** for all settings, schemas, and typed data models (e.g., `Tool`, `Resource`, `GatewayMetadata`, etc.).
19+
20+
These will form the foundation for the application layer and public API.
21+
22+
## Consequences
23+
24+
- ✨ Strong typing, runtime validation, and auto-generated OpenAPI specs.
25+
- 🧩 Unified model structure across internal logic, external APIs, and config parsing.
26+
- 🚀 Strong async performance on Uvicorn, plus compatibility with the Starlette ecosystem that FastAPI builds on.
27+
- 🔒 Tight coupling to Pydantic means future transitions (e.g., to dataclasses or attrs) would be non-trivial.
28+
29+
## Alternatives Considered
30+
31+
| Option | Why Not |
32+
|--------|---------|
33+
| **Flask + Marshmallow** | Sync-first architecture, weak async support, manual OpenAPI generation. |
34+
| **Django REST Framework** | Heavyweight, monolithic, tightly bound to Django ORM, not async-native. |
35+
| **Tornado or Starlette alone** | More boilerplate to assemble middlewares, validators, and routing. |
36+
| **Node.js + Fastify** | Excellent performance but requires a split language/runtime and loss of shared model code. |
37+
| **Pure `httpx` + `uvicorn` + `pydantic-core`** | Too low-level; duplicating FastAPI features manually. |
38+
39+
## Status
40+
41+
This decision has been implemented in the current architecture.
Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
1+
# ADR-0002: Use Async SQLAlchemy ORM
2+
3+
- *Status:* Accepted
4+
- *Date:* 2025-02-01
5+
- *Deciders:* Mihai Criveti
6+
7+
## Context
8+
9+
The gateway must persist:
10+
11+
- Tool metadata
12+
- Resource configurations
13+
- Usage metrics
14+
- Peer discovery and federation state
15+
16+
We require a relational database with schema evolution, strong typing, and async support. The current codebase already uses SQLAlchemy ORM models with an async engine and declarative mapping style.
17+
18+
## Decision
19+
20+
We will use:
21+
22+
- **SQLAlchemy 2.x (async)** for all data persistence.
23+
- **AsyncSession** with transactions scoped by `async with` blocks.
24+
- **Alembic** for migrations, with autogeneration and CLI support.
25+
- **SQLite** for development; **PostgreSQL or MySQL** for production via `DATABASE_URL`.
26+
27+
This provides consistent, well-understood relational behavior and integrates cleanly with FastAPI.
28+
29+
## Consequences
30+
31+
- 🧱 Mature and reliable ORM with a wide developer base.
32+
- 🔄 Fully async I/O stack without thread pools or blocking calls.
33+
- 🔧 Migrations handled declaratively using Alembic.
34+
- 📄 Pydantic models can be derived from or synchronized with SQLAlchemy models if needed.
35+
36+
## Alternatives Considered
37+
38+
| Option | Why Not |
39+
|--------|---------|
40+
| **Raw asyncpg / aiosqlite** | Manual query strings, error-prone joins, no built-in migrations. |
41+
| **Tortoise ORM / GINO** | Less widely used, more magic, lower confidence in long-term maintainability. |
42+
| **Django ORM** | Not async-native, tightly coupled to Django ecosystem, too heavyweight. |
43+
| **NoSQL (e.g., MongoDB)** | No relational guarantees, weaker query language, major refactor from current SQL-based model. |
44+
45+
## Status
46+
47+
This decision is in place and all gateway persistence uses SQLAlchemy 2.x with async support.
Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
1+
# ADR-0003: Expose Multi-Transport Endpoints (HTTP / WebSocket / SSE / STDIO)
2+
3+
- *Status:* Accepted
4+
- *Date:* 2025-02-01
5+
- *Deciders:* Mihai Criveti
6+
7+
## Context
8+
9+
The MCP Gateway must serve diverse clients: web browsers, CLIs, language-specific SDKs, and headless daemons.
10+
Different use cases require support for both **request/response** and **streaming** interactions.
11+
12+
Requirements:
13+
14+
- Human-readable RPC over HTTP for developers
15+
- Low-latency streaming for long-running tools
16+
- IPC-style invocations for local CLI integration
17+
- Unified business logic regardless of transport
18+
19+
## Decision
20+
21+
The gateway will support the following built-in transports:
22+
23+
- **HTTP JSON-RPC** (primary RPC interface)
24+
- **WebSocket** (bidirectional messaging)
25+
- **SSE (Server-Sent Events)** (for push-only event streaming)
26+
- **STDIO** (optional local CLI / subprocess transport)
27+
28+
Transport selection is dynamic, based on environment (`TRANSPORT_TYPE`) and route grouping. All transports share the same service layer and authentication mechanisms.
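
The shared service layer described above can be sketched as follows. This is a minimal illustration using only the standard library, not the gateway's actual code: `ToolService`, `handle_rpc`, `serve_stdio`, and `http_post` are hypothetical names, and the payloads assume JSON-RPC 2.0 framing.

```python
import io
import json


class ToolService:
    """Hypothetical transport-agnostic service layer."""

    def invoke(self, method: str, params: dict) -> dict:
        if method == "echo":
            return {"echo": params}
        raise KeyError(method)


def handle_rpc(service: ToolService, raw: str) -> str:
    """Decode one JSON-RPC request, call the service, encode the response."""
    request = json.loads(raw)
    response = {"jsonrpc": "2.0", "id": request.get("id")}
    try:
        response["result"] = service.invoke(request["method"], request.get("params", {}))
    except KeyError:
        # -32601 is the JSON-RPC 2.0 "Method not found" error code.
        response["error"] = {"code": -32601, "message": "Method not found"}
    return json.dumps(response)


def serve_stdio(service: ToolService, stdin, stdout) -> None:
    """STDIO adapter: one JSON-RPC message per line."""
    for line in stdin:
        if line.strip():
            stdout.write(handle_rpc(service, line) + "\n")


def http_post(service: ToolService, body: bytes) -> tuple[int, bytes]:
    """HTTP adapter sketch: request body in, (status, response body) out."""
    return 200, handle_rpc(service, body.decode("utf-8")).encode("utf-8")


# Feed the same request through both adapters.
request = '{"jsonrpc": "2.0", "id": 1, "method": "echo", "params": {"a": 1}}'
stdio_out = io.StringIO()
serve_stdio(ToolService(), io.StringIO(request + "\n"), stdio_out)
http_status, http_body = http_post(ToolService(), request.encode("utf-8"))
```

Because each adapter only frames bytes and delegates to `handle_rpc`, adding a WebSocket or SSE adapter in this sketch would not touch `ToolService`.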
29+
30+
## Consequences
31+
32+
- ✅ Maximum client flexibility, supporting modern browsers and legacy CLI tools.
33+
- 🔄 Business logic remains decoupled from transport implementation.
34+
- 📶 Streaming transports (WS, SSE) require timeout, reconnection, and back-pressure handling.
- 🔌 The transport set is easy to extend as new MCP standards emerge.
35+
36+
## Alternatives Considered
37+
38+
| Option | Why Not |
39+
|--------|---------|
40+
| **HTTP-only JSON API** | Poor fit for long-lived streaming tasks; requires polling. |
41+
| **gRPC (HTTP/2)** | Not browser-friendly; requires generated stubs; less discoverable. |
42+
| **Separate microservices per transport** | Code duplication, diverging implementations, and operational complexity. |
43+
| **Single transport abstraction** | Reduces explicitness; transport-specific needs get buried in generic interfaces. |
44+
45+
## Status
46+
47+
All four transports are implemented in the current FastAPI application and are toggleable via configuration.
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
1+
# ADR-0004: Combine JWT & Basic Auth
2+
3+
- *Status:* Accepted
4+
- *Date:* 2025-02-01
5+
- *Deciders:* Core Engineering Team
6+
7+
## Context
8+
9+
The gateway needs to support two types of clients:
10+
11+
- **Browser-based users** using the Admin UI
12+
- **Headless clients** such as scripts, services, and tools
13+
14+
These use cases require different authentication workflows:
15+
16+
- Browsers prefer form-based login and session cookies.
17+
- Automation prefers stateless, token-based access.
18+
19+
The current config exposes both:
20+
21+
- `BASIC_AUTH_USER` and `BASIC_AUTH_PASSWORD`
22+
- `JWT_SECRET_KEY`, `JWT_EXPIRY_SECONDS`, and cookie settings
23+
24+
## Decision
25+
26+
We will combine both authentication modes as follows:
27+
28+
- **Basic Auth** secures access to `/admin`. Upon success, a short-lived **JWT cookie** is issued.
29+
- **JWT Bearer token** (via header or cookie) is required for all API, WebSocket, and SSE requests.
30+
- Tokens are signed using the shared `JWT_SECRET_KEY` and include standard claims (`sub`, `exp`, `scopes`).
31+
- When `AUTH_REQUIRED=false`, the gateway allows unauthenticated access (dev only).
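
To make the token flow concrete, here is a minimal sketch of issuing and verifying an HS256-signed token with the claims listed above, using only the standard library. It is an illustration, not the gateway's implementation: `issue_token` and `verify_token` are hypothetical helpers, the `secret` and `expiry_seconds` arguments stand in for `JWT_SECRET_KEY` and `JWT_EXPIRY_SECONDS`, and a real deployment would more likely rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: str, signing_input: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(secret, sub, scopes, expiry_seconds, now=None) -> str:
    """Build header.claims.signature with sub, exp, and scopes claims."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": sub, "exp": issued_at + expiry_seconds, "scopes": scopes}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return signing_input + "." + _b64url(_sign(secret, signing_input))


def verify_token(secret, token, now=None) -> dict:
    """Return the claims if the signature matches and the token is unexpired."""
    try:
        header_b64, claims_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = _sign(secret, header_b64 + "." + claims_b64)
    # compare_digest avoids leaking how many signature bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(claims_b64))
    if claims["exp"] <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims


token = issue_token("example-secret", "admin", ["tools:read"], expiry_seconds=60, now=1000)
claims = verify_token("example-secret", token, now=1030)
```

The same `verify_token` check would apply whether the token arrives in an `Authorization: Bearer` header or in the cookie issued after Basic Auth on `/admin`.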
32+
33+
## Consequences
34+
35+
- ✅ Developers can log in once via browser and obtain an authenticated session.
36+
- ✅ Scripts can use a generated JWT directly, without storing a username and password.
37+
- ❌ Tokens must be signed, rotated, and verified securely (TLS required).
38+
- 🔄 JWTs expire and must be refreshed periodically by clients.
39+
40+
## Alternatives Considered
41+
42+
| Option | Why Not |
43+
|--------|---------|
44+
| **JWT only** | CLI tools need a pre-acquired token; not friendly for interactive login. |
45+
| **Basic only** | Password sent on every request; cannot easily revoke or expire credentials. |
46+
| **OAuth2 / OpenID Connect** | Too complex for self-hosted setups; requires external identity provider. |
47+
| **mTLS client auth** | Secure but heavy; not usable in browsers or simple HTTP clients. |
48+
49+
## Status
50+
51+
This combined authentication mechanism is implemented and enabled by default in the gateway.
Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
1+
# ADR-0005: Structured JSON Logging
2+
3+
- *Status:* Accepted
4+
- *Date:* 2025-02-21
5+
- *Deciders:* Core Engineering Team
6+
7+
## Context
8+
9+
The gateway must emit logs that:
10+
11+
- Are machine-readable and parseable by tools like ELK, Loki, or Datadog
12+
- Include rich context (e.g., request ID, auth user, duration)
13+
- Can be viewed in plaintext locally and JSON in production
14+
15+
Our configuration supports:
16+
17+
- `LOG_FORMAT`: `json` or `plain`
18+
- `LOG_LEVEL`: standard Python levels
19+
- `LOG_FILE`: optional log file destination
20+
21+
Logs are initialized at startup via `LoggingService`.
22+
23+
## Decision
24+
25+
Use the Python standard `logging` module with:
26+
27+
- A **custom JSON formatter** for structured logs (e.g. `{"level": "INFO", "msg": ..., "request_id": ...}`)
28+
- **Plain text output** when `LOG_FORMAT=plain`
29+
- Per-request context via filters or middleware
30+
- Global setup at app startup to avoid late binding issues
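
The pieces listed above can be sketched with the standard `logging` module alone. This is an illustrative example rather than the actual `LoggingService`: `JsonFormatter`, `RequestContextFilter`, `configure_logging`, and the `gateway_example` logger name are assumptions made for the sketch.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
            # request_id is attached by RequestContextFilter when present.
            "request_id": getattr(record, "request_id", None),
        }
        if record.exc_info:
            payload["exc_info"] = self.formatException(record.exc_info)
        return json.dumps(payload)


class RequestContextFilter(logging.Filter):
    """Attach per-request context to every record that passes through."""

    def __init__(self, request_id: str) -> None:
        super().__init__()
        self.request_id = request_id

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = self.request_id
        return True


def configure_logging(log_format: str = "json", level: str = "INFO", stream=None) -> logging.Logger:
    """One-time setup at startup; log_format mirrors LOG_FORMAT (json or plain)."""
    handler = logging.StreamHandler(stream)
    if log_format == "json":
        handler.setFormatter(JsonFormatter())
    else:
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s"))
    logger = logging.getLogger("gateway_example")
    logger.handlers[:] = [handler]
    logger.setLevel(level)
    logger.propagate = False
    return logger


buffer = io.StringIO()
log = configure_logging("json", "INFO", stream=buffer)
log.addFilter(RequestContextFilter("req-123"))
log.info("tool invoked: %s", "echo")
```

In a web application the filter would typically read the request ID from middleware-managed context instead of a constructor argument; the constructor form just keeps the sketch self-contained.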
31+
32+
## Consequences
33+
34+
- 📋 Easily parsed logs suitable for production observability pipelines
35+
- ⚙️ Compatible with `stdout`, file, or syslog targets
36+
- 🧪 Local development uses plain logs for readability
37+
- 🧱 Minimal dependency footprint (no third-party logging libraries)
38+
39+
## Alternatives Considered
40+
41+
| Option | Why Not |
42+
|--------|---------|
43+
| **loguru** | Elegant syntax, but non-standard; poor compatibility with the rest of the Python logging ecosystem. |
44+
| **structlog** | Adds context pipeline complexity; not needed for current log volume. |
45+
| **External sidecar (e.g. Fluent Bit)** | Useful downstream but doesn't solve app-side structure. |
46+
| **Raw print() statements** | Unstructured, difficult to manage at scale. |
47+
48+
## Status
49+
50+
Structured logging is implemented in `LoggingService`, configurable via environment variables.
Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
1+
# ADR-0006: Gateway & Tool-Level Rate Limiting
2+
3+
- *Status:* Accepted
4+
- *Date:* 2025-02-21
5+
- *Deciders:* Core Engineering Team
6+
7+
## Context
8+
9+
The MCP Gateway may serve hundreds of concurrent clients accessing multiple tools.
10+
Without protection, a single client or misbehaving tool could monopolize resources or overwhelm upstream services.
11+
12+
The configuration includes:
13+
14+
- `TOOL_RATE_LIMIT`: default limit in requests/min per tool/client
15+
- Planned support for Redis-based or database-backed counters
16+
17+
The current implementation is an in-memory token bucket.
18+
19+
## Decision
20+
21+
Implement a **rate limiter at the tool invocation level**, keyed by:
22+
23+
- Tool name
24+
- Authenticated user / client identity (JWT or Basic)
25+
- Time window (per-minute by default)
26+
27+
Backend options:
28+
29+
- **Memory** (default for dev / single instance)
30+
- **Redis** (planned for clustering / shared limits)
31+
- **Database** (eventually consistent fallback)
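
As an illustration of the memory backend, the following sketch keeps one token bucket per (tool, client) pair. It is a simplified example under stated assumptions, not the gateway's code: `TokenBucket` and `ToolRateLimiter` are hypothetical names, `rate_per_minute` stands in for `TOOL_RATE_LIMIT`, and the injectable `clock` exists only to make the behaviour easy to test.

```python
import time


class TokenBucket:
    """Bucket that holds up to rate_per_minute tokens and refills continuously."""

    def __init__(self, rate_per_minute: int, now: float) -> None:
        self.capacity = float(rate_per_minute)
        self.tokens = float(rate_per_minute)
        self.refill_per_second = rate_per_minute / 60.0
        self.updated_at = now

    def try_acquire(self, now: float) -> bool:
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class ToolRateLimiter:
    """In-memory limiter keyed by tool name and client identity."""

    def __init__(self, rate_per_minute: int, clock=time.monotonic) -> None:
        self.rate_per_minute = rate_per_minute
        self.clock = clock
        self.buckets = {}

    def allow(self, tool_name: str, client_id: str) -> bool:
        now = self.clock()
        key = (tool_name, client_id)
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = self.buckets[key] = TokenBucket(self.rate_per_minute, now)
        return bucket.try_acquire(now)


# Fake clock so the example does not depend on wall time.
fake_now = [0.0]
limiter = ToolRateLimiter(rate_per_minute=2, clock=lambda: fake_now[0])
first = limiter.allow("echo", "alice")
second = limiter.allow("echo", "alice")
third = limiter.allow("echo", "alice")  # bucket is empty at this point
```

A caller would translate a `False` result into a `429 Too Many Requests` response. Because the buckets live in process memory, each gateway instance enforces its own limit, which is why a shared backend is needed for clustered deployments.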
32+
33+
## Consequences
34+
35+
- ✅ Prevents abuse, controls cost, and provides predictable fairness
36+
- 📉 Rate-limited requests return `429 Too Many Requests` with retry headers
37+
- ❌ Memory backend does not scale across instances; Redis required for HA
38+
- 🔄 Optional override of limits via config/env for testing
39+
40+
## Alternatives Considered
41+
42+
| Option | Why Not |
43+
|--------|---------|
44+
| **No rate limiting** | Leaves gateway and tools vulnerable to overload or accidental DoS. |
45+
| **Global rate limit only** | Heavy tools can starve lightweight tools; no fine-grained control. |
46+
| **Proxy-level throttling (e.g. NGINX, Envoy)** | Can’t distinguish tools or users inside payload; lacks granularity. |
47+
48+
## Status
49+
50+
Rate limiting is implemented for tool routes, with `TOOL_RATE_LIMIT` as the default policy.
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
1+
# ADR-0007: Pluggable Cache Backend (memory / Redis / database)
2+
3+
- *Status:* Accepted
4+
- *Date:* 2025-02-21
5+
- *Deciders:* Core Engineering Team
6+
7+
## Context
8+
9+
The MCP Gateway uses short-lived caching for:
10+
11+
- Tool responses and resource lookups
12+
- Peer discovery metadata
13+
- Temporary session state and rate-limiting
14+
15+
Different deployments require different caching characteristics:
16+
17+
- Dev mode: no external services (in-memory only)
18+
- Production: clustered and persistent (Redis)
19+
- Air-gapped: embedded fallback (database table)
20+
21+
The config exposes `CACHE_TYPE=memory|redis|database`.
22+
23+
## Decision
24+
25+
Abstract the caching system via a `CacheBackend` interface and support the following pluggable backends:
26+
27+
- `MemoryCacheBackend`: simple `dict` with TTL, for dev and unit tests
28+
- `RedisCacheBackend`: shared, centralized cache for multi-node clusters
29+
- `DatabaseCacheBackend`: uses SQLAlchemy ORM to persist TTL-based records
30+
31+
Selection is driven by the `CACHE_TYPE` environment variable. Code paths use a consistent interface regardless of backend.
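
A minimal sketch of this abstraction is shown below. Only the names `CacheBackend`, `MemoryCacheBackend`, and `CACHE_TYPE` come from this record; the method names (`get`, `set`, `delete`), the synchronous signatures, the `create_cache_backend` factory, and the injectable `clock` are assumptions made for illustration, and the real interface may well be async.

```python
import time
from abc import ABC, abstractmethod
from typing import Any, Optional


class CacheBackend(ABC):
    """Interface that every cache backend implements (method names assumed)."""

    @abstractmethod
    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None if missing or expired."""

    @abstractmethod
    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        """Store a value that expires after ttl_seconds."""

    @abstractmethod
    def delete(self, key: str) -> None:
        """Remove a key if present."""


class MemoryCacheBackend(CacheBackend):
    """Plain dict with per-entry expiry, suitable for dev and unit tests."""

    def __init__(self, clock=time.monotonic) -> None:
        self._clock = clock
        self._entries = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries are evicted lazily on read.
            del self._entries[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._entries[key] = (value, self._clock() + ttl_seconds)

    def delete(self, key: str) -> None:
        self._entries.pop(key, None)


def create_cache_backend(cache_type: str) -> CacheBackend:
    """Pick a backend from a CACHE_TYPE-style value."""
    if cache_type == "memory":
        return MemoryCacheBackend()
    if cache_type in ("redis", "database"):
        raise NotImplementedError(f"{cache_type} backend is omitted from this sketch")
    raise ValueError(f"unknown cache type: {cache_type!r}")


cache = create_cache_backend("memory")
cache.set("tool:echo", {"status": "ok"}, ttl_seconds=30)
cached = cache.get("tool:echo")
```

Callers depend only on `CacheBackend`, so switching `CACHE_TYPE` changes which class the factory returns without touching the code that reads and writes cache entries.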
32+
33+
## Consequences
34+
35+
- 🔄 Easy to switch cache backend per environment or load profile
36+
- 🚀 Redis allows horizontal scaling and persistent shared state
37+
- ❌ Memory cache does not survive restarts or share state
38+
- 🐢 Database cache is slower, but useful in restricted networks
39+
40+
## Alternatives Considered
41+
42+
| Option | Why Not |
43+
|--------|---------|
44+
| **Hardcoded Redis** | Adds operational overhead and a single point of failure for dev. |
45+
| **Memory-only cache** | Incompatible with horizontal scale or restart resilience. |
46+
| **External CDN or HTTP cache** | Doesn’t address in-process sessions, discovery, or tool state. |
47+
| **Disk-based cache (e.g., shelve, pickle)** | Complex invalidation and concurrency issues; not cloud-ready. |
48+
49+
## Status
50+
51+
All three cache backends are implemented and the gateway selects one dynamically based on configuration.
