Commit b5478bb

docs: refresh READMEs with updated stats and new features

- Root: add By The Numbers section (6,100+ tests, 7 packages, 12+ integrations)
- Root: update Known Limitations (observability now implemented, behavioral detection done)
- Root: add NIST RFI mapping and benchmarks to Documentation section
- agent-os: update test count 1,680+ -> 2,573+
- agent-mesh: update test count 1,300+ -> 1,669+
- agent-hypervisor: update test count 457+ -> 644+, add behavioral anomaly detection
- agent-sre: update test count 1,089+ -> 1,240+, add PagerDuty/Grafana/OTel to roadmap
- agent-sre: update observability platform count 11 -> 13

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 6992c8c commit b5478bb

5 files changed, +29 -15 lines changed

README.md

Lines changed: 15 additions & 2 deletions
@@ -24,6 +24,17 @@
 > composes with container/VM isolation for defense-in-depth.
 > See [Architecture Notes](#architecture-notes) for details.

+## By The Numbers
+
+| Metric | Value |
+|---|---|
+| **Tests Passing** | 6,100+ across all packages |
+| **Packages** | 7 (kernel, trust mesh, runtime, SRE, compliance, marketplace, lightning) |
+| **Framework Integrations** | 12+ (LangChain, CrewAI, AutoGen, Dify, LlamaIndex, OpenAI Agents, Google ADK, …) |
+| **Policy Eval Latency** | 0.012 ms p50 — [full benchmarks](BENCHMARKS.md) |
+| **OWASP Coverage** | 10/10 Agentic Top 10 risks |
+| **Observability** | Prometheus, OpenTelemetry, PagerDuty, Grafana |
+
 ## Why Agent Governance?

 AI agent frameworks (LangChain, AutoGen, CrewAI, Google ADK, OpenAI Agents SDK) enable agents to call tools, spawn sub-agents, and take real-world actions — but provide **no runtime security model**. The Agent Governance Toolkit provides:
@@ -173,8 +184,10 @@ Full methodology, per-adapter breakdowns, and memory profiling: **[BENCHMARKS.md
 ## Documentation

 - **[Azure Deployment Guides](docs/deployment/README.md)** — AKS, Azure AI Foundry, Container Apps, OpenClaw sidecar
+- **[NIST RFI Mapping](docs/nist-rfi-mapping.md)** — Question-by-question mapping to NIST AI Agent Security RFI (2026-00206)
 - [OWASP Compliance Mapping](docs/OWASP-COMPLIANCE.md)
 - [CSA Agentic Trust Framework Mapping](docs/CSA-ATF-PROPOSAL.md)
+- [Performance Benchmarks](BENCHMARKS.md)
 - [Changelog](CHANGELOG.md)
 - [Contributing Guide](CONTRIBUTING.md)
 - [Security Policy](SECURITY.md)
@@ -218,10 +231,10 @@ Policy enforcement benchmarks are measured on a **30-scenario test suite** cover

 ### Known Limitations & Roadmap

-- **ASI-10 Behavioral Detection**: Fully implemented in Agent SRE — tool-call frequency analysis (z-score spike detection), action entropy scoring, and capability profile violation detection. See [`packages/agent-sre/src/agent_sre/anomaly/`](packages/agent-sre/src/agent_sre/anomaly/) (72 tests passing)
+- **ASI-10 Behavioral Detection**: Fully implemented — tool-call frequency analysis (z-score spike detection), action entropy scoring, capability profile violation detection, and behavioral anomaly detection with ring-distance amplification. See [`packages/agent-sre/src/agent_sre/anomaly/`](packages/agent-sre/src/agent_sre/anomaly/) and [`packages/agent-hypervisor/src/hypervisor/rings/breach_detector.py`](packages/agent-hypervisor/src/hypervisor/rings/breach_detector.py)
 - **Audit Trail Integrity**: Current hash-chain is in-process; external append-only log integration is planned
 - **Framework Integration Depth**: Current adapters wrap agent execution at the function level; deeper hooks into framework-native tool dispatch and sub-agent spawning are planned
-- **Observability**: OpenTelemetry integration for policy decision tracing is planned
+- **Observability**: Prometheus metrics collection, OpenTelemetry span export, PagerDuty alerting, and Grafana dashboards are implemented. See [`packages/agent-hypervisor/src/hypervisor/observability/`](packages/agent-hypervisor/src/hypervisor/observability/) and [`packages/agent-sre/src/agent_sre/integrations/`](packages/agent-sre/src/agent_sre/integrations/)

 ## Contributing
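
Editor's note on the ASI-10 entry in the hunk above: the two statistical techniques it names, z-score spike detection on tool-call frequency and action entropy scoring, can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in `packages/agent-sre/src/agent_sre/anomaly/`. The function names, the per-window call counts, and the threshold of 3.0 are all assumptions made for the example.

```python
import math
import statistics
from collections import Counter


def zscore_spike(history, current, threshold=3.0):
    """Return (z, is_spike) for the current window's call count.

    `history` holds call counts from earlier windows (hypothetical input);
    `threshold` is an assumed cutoff, not a value taken from the toolkit.
    """
    if len(history) < 2:
        return 0.0, False  # not enough data to estimate the spread
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)  # sample standard deviation
    if stdev == 0:
        # Flat baseline: treat any increase as a spike.
        return (math.inf, True) if current > mean else (0.0, False)
    z = (current - mean) / stdev
    return z, z > threshold


def action_entropy(actions):
    """Shannon entropy, in bits, of the observed action distribution."""
    counts = Counter(actions)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A sudden jump in call volume shows up as a large positive z-score, while a collapse or explosion in the variety of actions shows up as an entropy shift relative to the agent's usual profile. How the two signals are combined into a decision is left open here.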

packages/agent-hypervisor/README.md

Lines changed: 6 additions & 6 deletions
@@ -22,7 +22,7 @@
 [![PyPI](https://img.shields.io/pypi/v/agent-hypervisor)](https://pypi.org/project/agent-hypervisor/)
 [![Downloads](https://img.shields.io/pypi/dm/agent-hypervisor)](https://pypi.org/project/agent-hypervisor/)
 [![OWASP](https://img.shields.io/badge/OWASP_Agentic_Top_10-ASI--05,_10-brightgreen)](https://github.com/microsoft/agent-governance-toolkit/blob/master/docs/OWASP-COMPLIANCE.md)
-[![Tests](https://img.shields.io/badge/tests-457%20passing-brightgreen)](https://github.com/microsoft/agent-governance-toolkit)
+[![Tests](https://img.shields.io/badge/tests-644%20passing-brightgreen)](https://github.com/microsoft/agent-governance-toolkit)
 [![Benchmark](https://img.shields.io/badge/latency-268%CE%BCs%20pipeline-orange)](benchmarks/)
 [![Discussions](https://img.shields.io/github/discussions/microsoft/agent-governance-toolkit)](https://github.com/microsoft/agent-governance-toolkit/discussions)

@@ -50,7 +50,7 @@

 <table>
 <tr>
-<td align="center"><h3>457+</h3><sub>Tests Passing</sub></td>
+<td align="center"><h3>644+</h3><sub>Tests Passing</sub></td>
 <td align="center"><h3>4</h3><sub>Execution Rings<br/>(Ring 0–3)</sub></td>
 <td align="center"><h3>268μs</h3><sub>Full Governance<br/>Pipeline Latency</sub></td>
 <td align="center"><h3>v2.0</h3><sub>Saga Compensation<br/>Kill Switch · Rate Limits</sub></td>
@@ -587,7 +587,7 @@ Forensic-grade delta trails — semantic diffs, hash-chained entries, summary co
 <td width="50%">

 ### 📡 Observability
-Structured event bus emits typed events for every action. Causal trace IDs with full delegation tree encoding. Version counters for causal consistency.
+Structured event bus emits typed events for every action. Causal trace IDs with full delegation tree encoding. Version counters for causal consistency. **Prometheus metrics collector** for ring transitions and breaches. **OpenTelemetry span exporter** for saga-to-span mapping with distributed trace context.

 </td>
 </tr>
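
Editor's note on the "hash-chained entries" in the hunk header above and the root README's Audit Trail Integrity limitation: an in-process hash chain can be sketched as below. This is an illustrative sketch, not the hypervisor's implementation. The entry layout, the all-zero genesis value, and the function names `append_entry` and `verify_chain` are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def _digest(prev_hash, payload):
    # Canonical JSON (sorted keys) so the same payload always hashes the same.
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(chain, payload):
    """Append a payload, linking it to the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"prev": prev_hash, "payload": payload,
             "hash": _digest(prev_hash, payload)}
    chain.append(entry)
    return entry


def verify_chain(chain):
    """Recompute every hash and check each link to its predecessor."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(entry["prev"], entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Editing any earlier payload invalidates its stored hash, so tampering with a single entry is detectable. A party that can rewrite the whole in-memory list can still recompute every hash, which is the gap an external append-only log would close.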
@@ -605,7 +605,7 @@ Ring 2 (Standard) — Reversible actions — requires eff_score > 0.60
 Ring 3 (Sandbox) — Read-only / research — default for unknown agents
 ```

-**v2.0 additions:** Dynamic ring elevation (sudo with TTL), ring breach detection with circuit breakers, ring inheritance for spawned agents.
+**v2.0 additions:** Dynamic ring elevation (sudo with TTL), ring breach detection with circuit breakers, ring inheritance for spawned agents, **behavioral anomaly detection** with sliding-window rate analysis and ring-distance amplification.

 ### 🔄 Saga Orchestrator — Deep Dive

@@ -659,7 +659,7 @@ pip install agent-hypervisor
 | `hypervisor.integrations` | Nexus, Verification, IATP cross-module adapters | -- |
 | **Integration** | End-to-end lifecycle, edge cases, security | **24** |
 | **Scenarios** | Cross-module governance pipelines (7 suites) | **18** |
-| **Total** | | **457** |
+| **Total** | | **644** |

 ## Test Suite

@@ -728,7 +728,7 @@ graph TB
 | [Agent OS](https://github.com/microsoft/agent-governance-toolkit) | Policy enforcement kernel | 1,500+ tests |
 | [Agent Mesh](https://github.com/microsoft/agent-governance-toolkit) | Cryptographic trust network | 1,400+ tests |
 | [Agent SRE](https://github.com/microsoft/agent-governance-toolkit) | SLO, chaos, cost guardrails | 1,070+ tests |
-| **Agent Hypervisor** | Session isolation & governance runtime | 644+ tests |
+| **Agent Hypervisor** | Session isolation & governance runtime | 644+ tests |

 ## 🗺️ Roadmap
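
Editor's note on the "sliding-window rate analysis and ring-distance amplification" added to the v2.0 list in this file: one plausible reading is sketched below. It is not the code in `breach_detector.py`. The class name, the weighting of `1 + distance`, and the normalisation by `max_calls` are assumptions made for the example. The only detail taken from the README is that rings run from 0 to 3, with lower numbers being more privileged.

```python
from collections import deque


class SlidingWindowBreachScorer:
    """Score recent calls, weighting attempts that reach above the agent's ring."""

    def __init__(self, window_seconds=60.0, max_calls=10):
        self.window_seconds = window_seconds
        self.max_calls = max_calls  # assumed budget of weighted calls per window
        self._events = deque()      # (timestamp, weight), oldest first

    def record(self, timestamp, agent_ring, target_ring):
        """Record one call and return the updated score.

        Timestamps are assumed to be non-decreasing. A call aimed at a more
        privileged ring than the agent holds is amplified by the ring distance.
        """
        distance = max(0, agent_ring - target_ring)
        self._events.append((timestamp, 1.0 + distance))
        self._evict(timestamp)
        return self.score()

    def _evict(self, now):
        # Drop events that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            self._events.popleft()

    def score(self):
        """Weighted call volume in the window, as a fraction of the budget."""
        return sum(weight for _, weight in self._events) / self.max_calls
```

Under these assumptions a score above 1.0 would mean the weighted call volume exceeded the window budget. A Ring 3 agent probing Ring 0 operations reaches that point four times faster than one staying inside its own ring.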

packages/agent-mesh/README.md

Lines changed: 1 addition & 1 deletion
@@ -62,7 +62,7 @@

 <table>
 <tr>
-<td align="center"><h3>1,300+</h3><sub>Tests Passing</sub></td>
+<td align="center"><h3>1,669+</h3><sub>Tests Passing</sub></td>
 <td align="center"><h3>6</h3><sub>Framework Integrations</sub></td>
 <td align="center"><h3>170K+</h3><sub>Combined Stars of<br/>Integrated Projects</sub></td>
 <td align="center"><h3>4</h3><sub>Protocol Bridges<br/>(A2A · MCP · IATP · AI Card)</sub></td>

packages/agent-os/README.md

Lines changed: 1 addition & 1 deletion
@@ -65,7 +65,7 @@

 <table>
 <tr>
-<td align="center"><h3>1,680+</h3><sub>Tests Passing</sub></td>
+<td align="center"><h3>2,573+</h3><sub>Tests Passing</sub></td>
 <td align="center"><h3>12</h3><sub>Framework Integrations</sub></td>
 <td align="center"><h3>170K+</h3><sub>Combined Stars of<br/>Integrated Projects</sub></td>
 <td align="center"><h3>&lt;0.1ms p99</h3><sub>Governance Latency<br/><a href="benchmarks/results/BENCHMARKS.md">Benchmarks</a></sub></td>

packages/agent-sre/README.md

Lines changed: 6 additions & 5 deletions
@@ -50,9 +50,9 @@ Reliability layer across **170K+ combined GitHub stars** of integrated projects

 <table>
 <tr>
-<td align="center"><h3>1,089+</h3><sub>Tests Passing</sub></td>
+<td align="center"><h3>1,240+</h3><sub>Tests Passing</sub></td>
 <td align="center"><h3>12+</h3><sub>Framework Adapters<br/><sub>LangChain · CrewAI · AutoGen<br/>LangGraph · Dify · more</sub></sub></td>
-<td align="center"><h3>11</h3><sub>Observability Platforms<br/><sub>Langfuse · LangSmith · Arize<br/>Datadog · Prometheus · more</sub></sub></td>
+<td align="center"><h3>13</h3><sub>Observability Platforms<br/><sub>Langfuse · LangSmith · Arize<br/>Datadog · Prometheus · PagerDuty<br/>Grafana · OTel · more</sub></sub></td>
 <td align="center"><h3>OpenTelemetry</h3><sub>Native OTLP Export</sub></td>
 </tr>
 <tr>
@@ -479,7 +479,7 @@ agent-sre/
 ├── operator/ # Kubernetes CRDs (AgentSLO, CostBudget)
 ├── .github/actions/ # GitHub Actions (canary deployment)
 ├── examples/ # 4 runnable demos
-├── tests/ # 1,089 tests
+├── tests/ # 1,240 tests
 ├── docs/ # Getting started, concepts, integration guide
 └── specs/ # SLO templates (coming soon)
 ```
@@ -512,7 +512,7 @@ Agent SRE tells you *if it was within budget* and *what to do about it*.

 ## Status & Maturity

-### ✅ Fully Implemented (20,000+ lines, 1,089 tests)
+### ✅ Fully Implemented (20,000+ lines, 1,240 tests)

 | Component | Status | Description |
 |---|---|---|
@@ -650,7 +650,8 @@ See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
 | Quarter | Milestone |
 |---------|-----------|
 | **Q1 2026** | ✅ Core 7 engines, OTel integration, Prometheus dashboards |
-| **Q2 2026** | Kubernetes operator, PagerDuty/OpsGenie integration |
+| **Q1 2026** | ✅ PagerDuty alerting, Grafana SLO dashboards, org budget enforcement, bounded ErrorBudget events |
+| **Q2 2026** | Kubernetes operator, OpsGenie integration |
 | **Q3 2026** | ML-powered anomaly detection, auto-remediation |
 | **Q4 2026** | Managed cloud service, SOC2 compliance automation |
