Production-ready Docker container management server implementing the Model Context Protocol (MCP)
MCP DevBench provides isolated, persistent development workspaces through a secure, audited, and observable container management API. Built for AI assistants like Claude, it enables safe command execution and filesystem operations in Docker containers.
- Container Lifecycle Management - Create, start, stop, and remove Docker containers with fine-grained control
- Secure Filesystem Operations - Read, write, delete files with path validation and ETag-based concurrency control
- Async Command Execution - Non-blocking execution with streaming output and timeout handling
- Enterprise Security - Capability dropping, read-only rootfs, resource limits, and comprehensive audit logging
- Production Observability - Prometheus metrics, structured JSON logging, and system health monitoring
- True Async I/O - All blocking operations wrapped in thread pools for optimal concurrency
- Warm Container Pool - Sub-second container provisioning for instant attach
- Graceful Shutdown - Drain active operations before server termination
- Automatic Recovery - Reconciles Docker state with database on startup
- Image Policy Enforcement - Allow-list validation with digest pinning
- Multi-Transport Support - stdio, SSE, or HTTP-based MCP transports
- Flexible Authentication - None, Bearer token, or OIDC authentication modes
- Python 3.11+
- Docker Engine
- uv package manager
```bash
# Install uv
pip install uv

# Clone the repository
git clone https://github.com/pvliesdonk/mcp-devbench.git
cd mcp-devbench

# Install dependencies
uv sync
```

**Development Mode (stdio)**

```bash
uv run python -m mcp_devbench.server
```

**Production Mode (HTTP)**

```bash
export MCP_TRANSPORT_MODE=streamable-http
export MCP_HOST=0.0.0.0
export MCP_PORT=8000
uv run python -m mcp_devbench.server
```

**Using Docker**

```bash
docker build -t mcp-devbench .
docker run -v /var/run/docker.sock:/var/run/docker.sock \
  -p 8000:8000 \
  -e MCP_TRANSPORT_MODE=streamable-http \
  mcp-devbench
```

**Using Docker Compose**
```bash
docker-compose up -d
```

```
┌───────────────────────────────────────────────────┐
│                 MCP DevBench API                  │
│            (FastMCP Server with Auth)             │
└─────────────────┬─────────────────────────────────┘
                  │
    ┌─────────────┼─────────────┐
    │             │             │
    ▼             ▼             ▼
┌─────────┐  ┌─────────┐  ┌──────────┐
│Container│  │  Exec   │  │Filesystem│
│ Manager │  │ Manager │  │ Manager  │
└────┬────┘  └────┬────┘  └────┬─────┘
     │            │            │
     └────────────┼────────────┘
                  │
        ┌─────────┼─────────┐
        │         │         │
        ▼         ▼         ▼
   ┌────────┐  ┌────┐  ┌─────────┐
   │ Docker │  │ DB │  │  Audit  │
   │ Daemon │  │    │  │ Logger  │
   └────────┘  └────┘  └─────────┘
```
Design Patterns:
- Repository Pattern - Data access abstraction
- Manager Pattern - Business logic encapsulation
- Dependency Injection - Loose coupling via factory functions
- Async/Await - Non-blocking I/O throughout with thread pool for blocking operations
```python
from mcp_devbench.mcp_tools import *

# 1. Spawn a container
result = await spawn(SpawnInput(
    image="python:3.11-slim",
    persistent=False,
    alias="dev-workspace"
))
container_id = result.container_id

# 2. Attach to container
await attach(AttachInput(
    target=container_id,
    client_name="my-client",
    session_id="session-123"
))

# 3. Execute command
exec_result = await exec_start(ExecInput(
    container_id=container_id,
    cmd=["python", "--version"],
    timeout_s=30
))

# 4. Poll for output
output = await exec_poll(ExecPollInput(
    exec_id=exec_result.exec_id,
    after_seq=0
))

# 5. Write a file
await fs_write(FileWriteInput(
    container_id=container_id,
    path="/workspace/hello.py",
    content=b"print('Hello, World!')"
))

# 6. Clean up
await kill(KillInput(
    container_id=container_id,
    force=True
))
```

All configuration is managed through environment variables with the `MCP_` prefix.
| Variable | Default | Description |
|---|---|---|
| `MCP_TRANSPORT_MODE` | `streamable-http` | Transport: `stdio`, `sse`, or `streamable-http` |
| `MCP_HOST` | `0.0.0.0` | Server bind address (HTTP transports only) |
| `MCP_PORT` | `8000` | Server port (HTTP transports only) |
| `MCP_ALLOWED_REGISTRIES` | `docker.io,ghcr.io` | Comma-separated allowed registries |
| `MCP_LOG_LEVEL` | `INFO` | Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR` |
| `MCP_LOG_FORMAT` | `json` | Log format: `json` or `text` |
| Variable | Default | Description |
|---|---|---|
| `MCP_AUTH_MODE` | `none` | Auth mode: `none`, `bearer`, or `oidc` |
| `MCP_BEARER_TOKEN` | - | Bearer token (when `auth_mode=bearer`) |
| `MCP_OAUTH_CLIENT_ID` | - | OIDC client ID |
| `MCP_OAUTH_CLIENT_SECRET` | - | OIDC client secret |
| `MCP_OAUTH_CONFIG_URL` | - | OIDC discovery URL |
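With bearer auth enabled, HTTP clients present the token in the standard `Authorization` header. A minimal sketch (the host, port, and endpoint path are illustrative — match them to your transport configuration):

```bash
# Server side: enable bearer auth
export MCP_AUTH_MODE=bearer
export MCP_BEARER_TOKEN="my-secret-token"

# Client side: send the same token with every request
curl -H "Authorization: Bearer my-secret-token" \
     http://localhost:8000/mcp
```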
| Variable | Default | Description |
|---|---|---|
| `MCP_STATE_DB` | `./state.db` | SQLite database path |
| `MCP_DRAIN_GRACE_S` | `60` | Shutdown grace period (seconds) |
| `MCP_TRANSIENT_GC_DAYS` | `7` | Transient container retention (days) |
| `MCP_WARM_POOL_ENABLED` | `true` | Enable warm container pool |
| `MCP_DEFAULT_IMAGE_ALIAS` | `python:3.11-slim` | Default warm pool image |
**Local Development (stdio)**

```bash
MCP_TRANSPORT_MODE=stdio
MCP_AUTH_MODE=none
MCP_LOG_LEVEL=DEBUG
MCP_LOG_FORMAT=text
```

**Production (HTTP + OIDC)**

```bash
MCP_TRANSPORT_MODE=streamable-http
MCP_HOST=0.0.0.0
MCP_PORT=8000
MCP_AUTH_MODE=oidc
MCP_OAUTH_CLIENT_ID=your-client-id
MCP_OAUTH_CLIENT_SECRET=your-secret
MCP_OAUTH_CONFIG_URL=https://auth.example.com/.well-known/openid-configuration
MCP_LOG_LEVEL=INFO
MCP_LOG_FORMAT=json
```

Create and start a new container.
Input:
- `image` (string) - Docker image reference
- `persistent` (boolean) - Persist across restarts
- `alias` (string, optional) - User-friendly name
- `ttl_s` (integer, optional) - Time-to-live for transient containers
Output:
- `container_id` (string) - Opaque container ID
- `alias` (string) - Container alias
- `status` (string) - Container status
Attach a client to a container for session tracking.
Input:
- `target` (string) - Container ID or alias
- `client_name` (string) - Client identifier
- `session_id` (string) - Session identifier
Output:
- `container_id` (string) - Actual container ID
- `alias` (string) - Container alias
- `roots` (array) - Workspace roots
Stop and remove a container.
Input:
- `container_id` (string) - Container to remove
- `force` (boolean) - Force immediate removal
Output:
- `status` (string) - Operation status
Start command execution in a container.
Input:
- `container_id` (string) - Target container
- `cmd` (array) - Command and arguments
- `cwd` (string) - Working directory (default: `/workspace`)
- `env` (object) - Environment variables
- `as_root` (boolean) - Execute as root
- `timeout_s` (integer) - Execution timeout
- `idempotency_key` (string) - Prevent duplicate execution
Output:
- `exec_id` (string) - Execution ID
- `status` (string) - Initial status
Cancel a running execution.
Poll for execution output and status.
Input:
- `exec_id` (string) - Execution ID
- `after_seq` (integer) - Return messages after sequence number
Output:
- `messages` (array) - Stream messages
- `complete` (boolean) - Execution complete flag
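Given these semantics, a client typically polls in a loop, advancing `after_seq` past each message it has seen until `complete` is true. A minimal sketch of that loop — `exec_poll_stub` below is a stand-in for the real `exec_poll` tool call, and the message shape is illustrative:

```python
import asyncio


# Stand-in for the real exec_poll tool: returns messages newer than
# after_seq plus a completion flag. Swap in the MCP client call in practice.
async def exec_poll_stub(exec_id: str, after_seq: int) -> dict:
    messages = [{"seq": 1, "stream": "stdout", "data": "Python 3.11.9\n"}]
    return {
        "messages": [m for m in messages if m["seq"] > after_seq],
        "complete": True,
    }


async def collect_output(exec_id: str) -> str:
    """Poll until the execution completes, advancing after_seq each round."""
    after_seq, chunks = 0, []
    while True:
        result = await exec_poll_stub(exec_id, after_seq)
        for msg in result["messages"]:
            chunks.append(msg["data"])
            after_seq = max(after_seq, msg["seq"])  # never re-request seen output
        if result["complete"]:
            return "".join(chunks)
        await asyncio.sleep(0.2)  # back off between polls


print(asyncio.run(collect_output("exec-123")))
```

Tracking the highest `seq` seen (rather than counting messages) keeps the loop correct even if a poll returns an empty batch.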
Read a file from container workspace.
Output includes: content, etag, size, mime_type
Write a file to container workspace.
Supports ETag-based concurrency control via `if_match_etag`.
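The ETag flow is a compare-and-swap: read a file and its ETag, then write back with `if_match_etag` so the write is rejected if someone else changed the file in between. The sketch below models that contract with an in-memory stand-in (the class, error type, and ETag derivation are illustrative, not the server's actual implementation):

```python
import hashlib


class StaleEtagError(Exception):
    """Raised when if_match_etag no longer matches the stored file."""


class FakeWorkspace:
    """In-memory stand-in for fs_read/fs_write with ETag checks."""

    def __init__(self):
        self._files = {}  # path -> (content, etag)

    @staticmethod
    def _etag(content: bytes) -> str:
        return hashlib.sha256(content).hexdigest()[:16]

    def read(self, path):
        return self._files[path]  # (content, etag)

    def write(self, path, content, if_match_etag=None):
        if path in self._files and if_match_etag is not None:
            _, current = self._files[path]
            if current != if_match_etag:
                raise StaleEtagError(path)  # file changed since the read
        etag = self._etag(content)
        self._files[path] = (content, etag)
        return etag


ws = FakeWorkspace()
ws.write("/workspace/hello.py", b"print('hi')")
content, etag = ws.read("/workspace/hello.py")
ws.write("/workspace/hello.py", content + b"\n# edited", if_match_etag=etag)
try:
    # The file was just rewritten, so the old ETag is stale:
    ws.write("/workspace/hello.py", b"other", if_match_etag=etag)
except StaleEtagError:
    print("write rejected: file changed since read")
```

On a rejection, a client should re-read the file, reapply its change, and retry with the fresh ETag.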
Delete a file or directory.
Get file/directory metadata.
List directory contents.
Get system health and status.
Output:
- Docker connectivity status
- Active containers/attachments count
- Database status
- Server version
Retrieve Prometheus-formatted metrics.
Manually trigger container reconciliation.
Trigger manual garbage collection.
List all containers or active executions.
- Capability Dropping - All Linux capabilities dropped by default
- Read-Only Root Filesystem - Prevents container modification
- Resource Limits - 512MB memory, 1 CPU, 256 PID limit per container
- Path Validation - Prevents directory traversal attacks
- Image Allow-List - Only approved registries allowed
- Audit Logging - Complete audit trail with PII redaction
- User Isolation - Configurable UID (default 1000)
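For intuition, the hardening defaults above roughly correspond to these `docker run` flags (an illustrative sketch — the server applies the equivalent settings through the Docker API, not the CLI):

```bash
# Drop all capabilities, mount the rootfs read-only, cap memory/CPU/PIDs,
# and run as a non-root UID:
docker run \
  --cap-drop ALL \
  --read-only \
  --memory 512m \
  --cpus 1 \
  --pids-limit 256 \
  --user 1000 \
  python:3.11-slim
```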
- Use OIDC Authentication in production
- Restrict allowed registries to trusted sources only
- Enable audit logging and monitor for suspicious activity
- Run with least privilege - never run as root
- Keep images updated - use digest pinning for reproducibility
- Isolate network access - use Docker network policies
All operations are logged in JSON format with:
- ISO8601 timestamps
- Correlation IDs
- Contextual metadata
- Automatic PII redaction
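A single log record might look like the following (field names and values are illustrative; the exact schema depends on the logging configuration):

```json
{
  "timestamp": "2024-01-15T10:30:00.000Z",
  "level": "INFO",
  "event": "exec_start",
  "correlation_id": "a1b2c3d4",
  "container_id": "c-1234abcd",
  "cmd": ["python", "--version"]
}
```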
Available via the metrics tool:
- `mcp_devbench_container_spawns_total` - Container creation count
- `mcp_devbench_exec_total` - Command execution count
- `mcp_devbench_exec_duration_seconds` - Execution duration histogram
- `mcp_devbench_fs_operations_total` - Filesystem operation count
- `mcp_devbench_active_containers` - Active container gauge
- `mcp_devbench_memory_usage_bytes` - Container memory usage
All operations generate audit events:
- `CONTAINER_SPAWN`, `CONTAINER_ATTACH`, `CONTAINER_KILL`
- `EXEC_START`, `EXEC_CANCEL`
- `FS_READ`, `FS_WRITE`, `FS_DELETE`
- `SYSTEM_RECONCILE`, `SYSTEM_GC`
For detailed development guidelines, see our Project Style Guide.
```bash
# Run all tests
uv run pytest

# Run with coverage
uv run pytest --cov=mcp_devbench --cov-report=html

# Run specific test file
uv run pytest tests/unit/test_container_manager.py

# Run integration tests only
uv run pytest tests/integration/
```

```bash
# Lint with ruff
uv run ruff check .

# Format code
uv run ruff format .

# Type checking (recommended)
uv run pyright src/mcp_devbench/
```
```
├── src/mcp_devbench/
│   ├── config/            # Configuration management
│   ├── models/            # SQLAlchemy ORM models
│   ├── managers/          # Business logic layer
│   ├── repositories/      # Data access layer
│   ├── utils/             # Utilities (logging, Docker, metrics)
│   ├── server.py          # FastMCP server
│   └── mcp_tools.py       # Pydantic models for MCP
├── tests/
│   ├── unit/              # Unit tests
│   └── integration/       # Integration tests
├── alembic/               # Database migrations
└── .github/workflows/     # CI/CD pipelines
```
Current Version: 0.1.0
- ✅ Epic 1: Foundation Layer - Configuration, state store, Docker lifecycle
- ✅ Epic 2: Command Execution - Async exec, streaming, idempotency
- ✅ Epic 3: Filesystem Operations - CRUD, batch ops, import/export
- ✅ Epic 4: MCP Integration - Tools, resources, streaming transport
- ✅ Epic 5: Security - Image policy, hardening, warm pool
- ✅ Epic 6: State Management - Shutdown, recovery, maintenance
- ✅ Epic 7: Observability - Audit logging, metrics, admin tools
- Test Coverage: ~72% (201 tests)
- Code Quality: Zero linting issues (ruff)
- Production Ready: Yes, for small-to-medium deployments
See IMPLEMENTATION_ROADMAP.md for detailed future plans.
We welcome contributions! Please see our Contributing Guide for details on:
- Development workflow
- Testing guidelines
- Code style requirements
- Submission process
Important: We use a main/next branching model. See BRANCHING_STRATEGY.md for details.
1. Fork the repository
2. Create a feature branch from `next`: `git checkout next && git pull && git checkout -b feature/amazing-feature`
3. Make your changes and add tests
4. Run tests: `uv run pytest`
5. Lint code: `uv run ruff check .`
6. Commit with conventional commits: `git commit -m "feat: add amazing feature"`
7. Push and create a Pull Request to the `next` branch
This project is licensed under the MIT License - see the LICENSE file for details.
- Documentation: docs/
- Issue Tracker: GitHub Issues
- Discussions: GitHub Discussions
- Changelog: CHANGELOG.md
- Questions? Open a Discussion
- Bug Reports: File an Issue
- Security Issues: See SECURITY.md
Built with ❤️ using FastMCP, Docker, and modern Python async