Part of Agent OS - Kernel-level governance for AI agents
Production-ready observability stack for the Agent OS kernel.
This package provides metrics, tracing, and dashboards for monitoring Agent OS deployments.
- Prometheus Metrics: Kernel, agent, and CMVK metrics
- OpenTelemetry Tracing: Distributed tracing for agent operations
- Grafana Dashboards: Pre-built dashboards for SOC, ML Ops, and SRE teams
- Prometheus Alerts: Safety, performance, and availability alerts
pip install agent-os-observability

from agent_os_observability import KernelMetrics, KernelTracer

# Initialize metrics
metrics = KernelMetrics()

# Record policy check
with metrics.policy_check_latency():
    result = policy_engine.check(action)

# Record violation
if not result.allowed:
    metrics.record_violation(agent_id, action, policy="data-access", severity="high")
    metrics.record_blocked(agent_id, action)

# CMVK metrics
metrics.record_cmvk_verification(
    result="verified",
    confidence=0.95,
    drift_score=0.08,
    duration_seconds=2.3,
    model_count=3,
)

# Expose /metrics endpoint (FastAPI example)
from fastapi import FastAPI, Response

app = FastAPI()

@app.get("/metrics")
def get_metrics():
    return Response(
        content=metrics.export(),
        media_type=metrics.content_type(),
    )

cd packages/observability
docker-compose up -d
# Open dashboards
open http://localhost:3000 # Grafana (admin/admin)
open http://localhost:16686 # Jaeger
open http://localhost:9090 # Prometheus

| Metric | Type | Description |
|---|---|---|
| agent_os_violations_total | Counter | Policy violations by agent, action, policy, severity |
| agent_os_violations_blocked_total | Counter | Violations blocked (SIGKILL issued) |
| agent_os_violation_rate | Gauge | Violations per 1000 requests |
| agent_os_policy_check_duration_seconds | Histogram | Policy check latency |
| agent_os_signals_total | Counter | Signals sent by type and reason |
| agent_os_sigkill_total | Counter | SIGKILL signals by agent and reason |
| agent_os_mttr_seconds | Histogram | Mean Time To Recovery |
| agent_os_kernel_uptime_seconds | Gauge | Kernel uptime |
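The agent_os_violation_rate gauge is described above as violations per 1000 requests. The sketch below only illustrates that arithmetic; the function name is hypothetical and this is not necessarily how the package computes the gauge.

```python
def violations_per_thousand(violations: int, requests: int) -> float:
    """Return violations per 1000 requests, or 0.0 when there is no traffic."""
    if requests <= 0:
        return 0.0
    return 1000.0 * violations / requests


# Example: 12 violations across 4800 requests
print(violations_per_thousand(12, 4800))  # 2.5
```

Returning 0.0 for an idle window is a design choice made for this sketch; a real exporter might prefer to leave the gauge unset instead.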
| Metric | Type | Description |
|---|---|---|
| agent_os_cmvk_verifications_total | Counter | Verifications by result (verified/flagged/rejected) |
| agent_os_cmvk_consensus_ratio | Gauge | Current model agreement (0.0-1.0) |
| agent_os_cmvk_model_disagreements_total | Counter | Disagreements by model pair |
| agent_os_cmvk_drift_score | Histogram | Drift score distribution |
| agent_os_cmvk_verification_duration_seconds | Histogram | Verification latency |
| agent_os_cmvk_model_latency_seconds | Histogram | Per-model response latency |
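The table gives agent_os_cmvk_consensus_ratio as model agreement on a 0.0-1.0 scale but does not say how agreement is measured. One plausible reading, assumed here purely for illustration, is the fraction of models whose verdict matches the most common verdict. The function name and the verdict strings are hypothetical.

```python
from collections import Counter


def consensus_ratio(verdicts: list[str]) -> float:
    """Fraction of models that agree with the most common verdict."""
    if not verdicts:
        return 0.0
    # most_common(1) returns [(verdict, count)] for the top verdict
    _, top_count = Counter(verdicts).most_common(1)[0]
    return top_count / len(verdicts)


# Three models, two of which agree
print(round(consensus_ratio(["verified", "verified", "flagged"]), 2))  # 0.67
```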
| Metric | Type | Description |
|---|---|---|
| agent_os_agent_llm_calls_total | Counter | LLM API calls by agent and model |
| agent_os_agent_errors_total | Counter | Errors by agent and type |
| agent_os_agent_execution_duration_seconds | Histogram | Task execution time |
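To give a feel for what a scrape of /metrics might return for one of these counters, the sketch below renders a counter family in the Prometheus text exposition format using only the standard library. The label names (agent_id, model) and the helper are assumptions for illustration; the package's actual label set and exporter are not shown in this README.

```python
def render_counter(name: str, help_text: str, samples: list[tuple[dict, float]]) -> str:
    """Render one counter family in the Prometheus text exposition format.

    Label values are assumed not to need escaping in this sketch.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


print(render_counter(
    "agent_os_agent_llm_calls_total",
    "LLM API calls by agent and model",
    [({"agent_id": "agent-1", "model": "model-a"}, 42)],
))
```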
Main dashboard for SOC teams: violation rate, SIGKILL count, latency, throughput.
ML Ops dashboard: consensus rate, drift scores, model latency, verification results.
AMB (Agent Message Bus): throughput, queue depth, backpressure, delivery latency.
CISO dashboard: 30-day violation count.
python scripts/export_dashboards.py

This creates JSON files in grafana/dashboards/ for Grafana provisioning.
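For readers unfamiliar with file-based dashboards, the sketch below writes a minimal dashboard definition as JSON. It is not the contents of scripts/export_dashboards.py; the helper name, the uid, and the title are hypothetical, and a real dashboard would carry panels and further fields.

```python
import json
import tempfile
from pathlib import Path


def write_dashboard(directory: Path, uid: str, title: str) -> Path:
    """Write a minimal, panel-less dashboard definition to <directory>/<uid>.json."""
    directory.mkdir(parents=True, exist_ok=True)
    dashboard = {"uid": uid, "title": title, "panels": []}
    path = directory / f"{uid}.json"
    path.write_text(json.dumps(dashboard, indent=2), encoding="utf-8")
    return path


# Write into a temporary directory so the sketch has no side effects.
with tempfile.TemporaryDirectory() as tmp:
    out = write_dashboard(Path(tmp) / "grafana" / "dashboards", "example", "Example dashboard")
    print(out.name)
```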
Alert rules are defined in alerts/agent-os-alerts.yaml:
- AgentOSHighViolationRate: Violation rate >1%
- AgentOSSIGKILLSpike: >5 SIGKILL in 5 minutes
- AgentOSKernelCrash: Kernel panic
- AgentOSHighPolicyLatency: p99 latency >10ms
- CMVKLowConsensus: Consensus <80%
- CMVKHighDrift: p95 drift >0.25
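The thresholds above can be read as simple predicates over a metrics snapshot. The sketch below mirrors the numeric rules in plain Python to make the comparisons explicit. It is not how Prometheus evaluates alerts/agent-os-alerts.yaml, and the snapshot keys and units (rates as fractions, latency in seconds) are assumptions. AgentOSKernelCrash is omitted because it has no numeric threshold.

```python
# Thresholds copied from the alert list; the snapshot keys are hypothetical.
RULES = {
    "AgentOSHighViolationRate": lambda s: s["violation_rate"] > 0.01,
    "AgentOSSIGKILLSpike": lambda s: s["sigkills_5m"] > 5,
    "AgentOSHighPolicyLatency": lambda s: s["policy_latency_p99_s"] > 0.010,
    "CMVKLowConsensus": lambda s: s["consensus_ratio"] < 0.80,
    "CMVKHighDrift": lambda s: s["drift_p95"] > 0.25,
}


def firing_alerts(snapshot: dict) -> list[str]:
    """Return the names of rules whose threshold the snapshot crosses."""
    return [name for name, check in RULES.items() if check(snapshot)]


snapshot = {
    "violation_rate": 0.02,          # 2% of requests
    "sigkills_5m": 1,                # SIGKILLs in the last 5 minutes
    "policy_latency_p99_s": 0.004,   # 4 ms
    "consensus_ratio": 0.72,
    "drift_p95": 0.10,
}
print(firing_alerts(snapshot))  # ['AgentOSHighViolationRate', 'CMVKLowConsensus']
```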
┌─────────────────────────────────────────────────────────────┐
│                      Your Application                       │
│  ┌──────────────────┐  ┌──────────────────┐                 │
│  │ Agent OS         │  │ KernelMetrics    │                 │
│  │ Kernel           │──│ .export()        │───► /metrics    │
│  └──────────────────┘  └──────────────────┘                 │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                    Docker Compose Stack                     │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐             │
│  │ Prometheus │─►│  Grafana   │  │   Jaeger   │             │
│  │   :9090    │  │   :3000    │  │   :16686   │             │
│  └────────────┘  └────────────┘  └────────────┘             │
│        │               ▲               ▲                    │
│        ▼               │               │                    │
│  ┌────────────┐        │         ┌────────────┐             │
│  │AlertManager│        │         │    OTEL    │             │
│  │   :9093    │        │         │ Collector  │             │
│  └────────────┘        │         └────────────┘             │
│        │               │               ▲                    │
│        ▼               │               │                    │
│[Slack/PagerDuty]       └───────────────┘                    │
└─────────────────────────────────────────────────────────────┘
# Install in development mode
pip install -e ".[dev]"
# Run tests
pytest
# Export dashboards
python scripts/export_dashboards.py

License: MIT