@@ -15,6 +15,11 @@ MCP DevBench is a Docker container management server that implements the Model C
 - **Configuration Management**: Environment-based configuration with Pydantic Settings
 - **Structured Logging**: JSON-formatted logging for production observability
 - **Docker Integration**: Secure Docker daemon communication with connection pooling
+- **Audit Logging**: Complete audit trail for all operations with sensitive data redaction
+- **Prometheus Metrics**: Built-in metrics collection for monitoring and alerting
+- **Admin Tools**: System health status, container/exec listing, garbage collection, and reconciliation
+- **Graceful Shutdown**: Drains active operations before shutdown
+- **Automatic Recovery**: Reconciles Docker state with database on startup
 
 ## Requirements
 
@@ -247,6 +252,31 @@ This project has completed **Epic 1: Foundation Layer**, **Epic 2: Command Execu
 - Database vacuuming for optimization
 - Health monitoring and metrics collection
 
+### Epic 7: Observability & Operations ✅
+- [x] Feature 7.1: Structured Audit Logging
+  - AuditLogger with JSON structured logging for all operations
+  - Complete audit trail for container, exec, filesystem, security, and transfer events
+  - Automatic sensitive data redaction (passwords, tokens, keys, secrets)
+  - ISO8601 timestamps and correlation IDs
+  - Configurable detail level
+  - 17 unit tests covering audit functionality
+
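The redaction behaviour listed under Feature 7.1 can be illustrated with a minimal sketch. This is not the project's `AuditLogger`; the function names, the `[REDACTED]` placeholder, and the substring-matching rule are assumptions made for illustration, using only the standard library.

```python
import json
from datetime import datetime, timezone
from typing import Any

# Hypothetical key fragments; the project's real redaction list is not shown here.
SENSITIVE_FRAGMENTS = ("password", "token", "key", "secret")
REDACTED = "[REDACTED]"


def redact(value: Any) -> Any:
    """Recursively replace values whose (string) key contains a sensitive fragment."""
    if isinstance(value, dict):
        return {
            k: REDACTED
            if any(fragment in k.lower() for fragment in SENSITIVE_FRAGMENTS)
            else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def audit_record(event: str, correlation_id: str, details: dict[str, Any]) -> str:
    """Build one JSON audit line with an ISO 8601 UTC timestamp."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event,
        "correlation_id": correlation_id,
        "details": redact(details),
    }
    return json.dumps(record, sort_keys=True)
```

Substring matching is deliberately broad: it also redacts harmless keys such as `keyboard`, which errs on the side of hiding too much rather than too little.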
+- [x] Feature 7.2: Metrics & Monitoring
+  - Prometheus metrics collection via MetricsCollector
+  - Counter metrics: container_spawns_total, exec_total, fs_operations_total
+  - Histogram metrics: exec_duration_seconds, output_bytes
+  - Gauge metrics: active_containers, active_attachments, memory_usage_bytes
+  - `metrics` tool to expose Prometheus-formatted metrics
+  - 14 unit tests covering metrics collection
+
+- [x] Feature 7.3: Debug & Admin Tools
+  - `system_status` tool for overall system health
+  - `list_containers` tool for detailed container information
+  - `list_execs` tool for active execution listing
+  - `garbage_collect` tool for manual cleanup
+  - `reconcile` tool with audit logging (from Epic 6)
+  - Docker connectivity and database status monitoring
+
 ### Current Status
 The project now has:
 - Full container lifecycle management with image policy enforcement
@@ -256,10 +286,13 @@ The project now has:
 - Image allow-list validation and resolution with digest pinning
 - Comprehensive security hardening (capability dropping, resource limits, audit logging)
 - Warm container pool for fast provisioning (<1s attach time)
-- **Graceful shutdown with operation draining**
-- **Boot recovery and automatic reconciliation**
-- **Background maintenance and health monitoring**
-- 170 unit and integration tests passing (100% success rate)
+- Graceful shutdown with operation draining
+- Boot recovery and automatic reconciliation
+- Background maintenance and health monitoring
+- **Complete audit logging for all operations with sensitive data redaction**
+- **Prometheus metrics collection and exposure**
+- **Admin tools for system status, container/exec listing, and manual operations**
+- 201 unit and integration tests passing (100% success rate)
 - Comprehensive error handling and resource management
 
 ## MCP Tools Reference
@@ -460,6 +493,92 @@ This tool performs:
 }
 ```
 
+### Observability & Admin Tools
+
+#### `metrics`
+Get Prometheus metrics for monitoring.
+
+Returns current metrics including:
+- Container spawn counts by image
+- Execution counts and durations
+- Filesystem operation counts
+- Active container and attachment gauges
+- Memory usage by container
+
+**Input:** None
+
+**Output:**
+- `metrics` (string): Prometheus-formatted metrics
+
+**Example metrics output:**
+```
+# HELP mcp_devbench_container_spawns_total Total number of container spawns
+# TYPE mcp_devbench_container_spawns_total counter
+mcp_devbench_container_spawns_total{image="python:3.11"} 5.0
+# HELP mcp_devbench_exec_total Total number of command executions
+# TYPE mcp_devbench_exec_total counter
+mcp_devbench_exec_total{container_id="c_123",status="success"} 10.0
+# HELP mcp_devbench_active_containers Number of active containers
+# TYPE mcp_devbench_active_containers gauge
+mcp_devbench_active_containers 3.0
+```
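A client that receives the `metrics` string may want structured samples rather than raw text. The following is a rough sketch of a parser for the simple lines shown in the example output; `parse_metrics` is a hypothetical helper, and the regular expressions cover only a subset of the Prometheus text format (no escaped quotes, timestamps, or exemplars).

```python
import re

# "name{labels} value" or "name value"; a simplification of the exposition format.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text: str) -> list[tuple[str, dict[str, str], float]]:
    """Return (metric name, labels, value) for every sample line in `text`."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # Ignore anything this simplified pattern cannot read.
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

For production scraping, a maintained parser is a safer choice than hand-written patterns like these.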
+
+#### `system_status`
+Get system health and status information.
+
+**Input:** None
+
+**Output:**
+- `status` (string): Overall system status (healthy, degraded)
+- `docker_connected` (boolean): Docker daemon connectivity
+- `database_initialized` (boolean): Database initialization status
+- `active_containers` (integer): Number of active containers
+- `active_attachments` (integer): Number of active client attachments
+- `version` (string): Server version
+
+**Example:**
+```json
+{
+  "status": "healthy",
+  "docker_connected": true,
+  "database_initialized": true,
+  "active_containers": 3,
+  "active_attachments": 2,
+  "version": "0.1.0"
+}
+```
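On the client side, the `system_status` response can be loaded into a typed object before deciding whether to send further requests. The sketch below assumes the response arrives as a JSON string with exactly the fields listed above; `SystemStatus`, `parse_system_status`, and the readiness rule in `is_ready` are illustrative and not part of the server.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemStatus:
    status: str
    docker_connected: bool
    database_initialized: bool
    active_containers: int
    active_attachments: int
    version: str


def parse_system_status(payload: str) -> SystemStatus:
    """Decode the JSON payload; raises TypeError on missing or unexpected fields."""
    return SystemStatus(**json.loads(payload))


def is_ready(status: SystemStatus) -> bool:
    # Assumption: a client treats the server as usable only when all three hold.
    return (
        status.status == "healthy"
        and status.docker_connected
        and status.database_initialized
    )
```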
+
+#### `garbage_collect`
+Trigger manual garbage collection.
+
+Cleans up:
+- Orphaned transient containers
+- Old completed exec records (>24h)
+- Abandoned attachments
+
+**Input:** None
+
+**Output:**
+- `containers_removed` (integer): Number of containers removed
+- `execs_cleaned` (integer): Number of exec records cleaned
+- `attachments_cleaned` (integer): Number of attachments cleaned
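The 24-hour rule for completed exec records can be pictured as a simple cutoff filter. The sketch below is only an illustration of that rule: the `ExecRecord` shape, its `finished_at` field, and the status strings are hypothetical and may not match the server's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Retention window taken from the ">24h" rule described above.
EXEC_RETENTION = timedelta(hours=24)


@dataclass
class ExecRecord:
    exec_id: str
    status: str  # hypothetical values: "running", "completed"
    finished_at: Optional[datetime]  # None while the exec is still running


def select_expired_execs(records: list[ExecRecord], now: datetime) -> list[ExecRecord]:
    """Return completed records that finished more than 24 hours before `now`."""
    cutoff = now - EXEC_RETENTION
    return [
        record
        for record in records
        if record.status == "completed"
        and record.finished_at is not None
        and record.finished_at < cutoff
    ]
```

Passing `now` in explicitly, instead of reading the clock inside the function, keeps the cutoff logic easy to test.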
+
+#### `list_containers`
+List all containers with detailed information.
+
+**Input:** None
+
+**Output:**
+- `containers` (array): List of container objects with id, docker_id, alias, image, status, persistent, created_at, last_seen
+
+#### `list_execs`
+List active command executions.
+
+**Input:** None
+
+**Output:**
+- `execs` (array): List of execution objects with exec_id, container_id, cmd, as_root, started_at, status
+
 See [mcp-devbench-work-breakdown.md](mcp-devbench-work-breakdown.md) for the complete implementation roadmap.
 
 ## License