A kernel architecture for governing autonomous AI agents
⭐ If this project helps you, please star it! It helps others discover Agent OS.
📦 Install the full stack:
pip install ai-agent-governance[full]
PyPI | GitHub
Quick Start • Documentation • VS Code Extension • Examples • Agent Hypervisor • AgentMesh • Agent SRE
Try Agent OS instantly in your browser - no installation required
| Framework | Stars | Status | Link |
|---|---|---|---|
| Dify | 65K ⭐ | ✅ Merged | dify-plugins#2060 |
| LlamaIndex | 47K ⭐ | ✅ Merged | llama_index#20644 |
| Microsoft Agent-Lightning | 15K ⭐ | ✅ Merged | agent-lightning#478 |
| LangGraph | 24K ⭐ | 📦 Published on PyPI | langgraph-trust |
| OpenAI Agents SDK | - | 📦 Published on PyPI | openai-agents-trust |
| OpenClaw | - | 📦 Published on ClawHub | agentmesh-governance |
Proposals under review at 10+ frameworks:
| Framework | Stars | Proposal |
|---|---|---|
| AutoGen | 54K ⭐ | microsoft/autogen#7242 |
| CrewAI | 44K ⭐ | crewAIInc/crewAI#4502 |
| Haystack | 22K ⭐ | deepset-ai/haystack#10615 |
| Semantic Kernel | 27K ⭐ | microsoft/semantic-kernel#13556 |
| smolagents | 25K ⭐ | ✅ Adapter built: huggingface/smolagents#1989 |
| LangGraph | 24K ⭐ | langchain-ai/langgraph#6824 |
| Google ADK | 18K ⭐ | ✅ Adapter built: google/adk-python#4517 |
| PydanticAI | 15K ⭐ | ✅ Adapter built: pydantic/pydantic-ai#4335 |
| OpenAI Agents SDK | - | openai/openai-agents-python#2515 |
| A2A Protocol | 21K ⭐ | a2aproject/A2A#1501 |
| Oracle Agent Spec | - | oracle/agent-spec#105 |
| AI Card Spec | - | agent-card/ai-card#16 |
The AI agent market is projected to reach $47B by 2030. As enterprises deploy autonomous agents at scale, governance becomes the critical infrastructure layer. Agent OS is the kernel that ensures every agent action is policy-enforced, auditable, and compliant, making AI agents enterprise-ready.
The problem: AI agents can execute arbitrary tools, access sensitive data, and make autonomous decisions, with no built-in governance, audit trails, or policy enforcement.
Our solution: A governance kernel that sits between agents and their actions, providing deterministic policy enforcement in <1ms with zero agent code changes.
| Tool | Focus | When it acts |
|---|---|---|
| LangChain/CrewAI | Building agents | N/A (framework) |
| NeMo Guardrails | Input/output filtering | Before/after LLM call |
| LlamaGuard | Content classification | Before/after LLM call |
| Agent OS | Action interception | During execution |
Agent frameworks build agents. Safety tools filter I/O. Agent OS intercepts actions mid-execution, which makes it the only kernel-style governance layer of the three.
Agent OS + ecosystem covers 8 out of 10 OWASP Agentic Application Security risks:
| Risk | Coverage | Module |
|---|---|---|
| ASI01 Agent Goal Hijack | ✅ Full | GovernancePolicy.blocked_patterns |
| ASI02 Tool Misuse | ✅ Full | MCPGateway: tool filtering, rate limiting, audit |
| ASI03 Identity & Privilege | ✅ Full | require_human_approval, RBAC policies |
| ASI04 Supply Chain | ⚠️ Partial | Tool allowlisting (no deep scanning yet) |
| ASI05 Code Execution | ✅ Full | blocked_patterns, sandbox integration |
| ASI06 Memory Poisoning | ✅ Full | MemoryGuard: hash integrity, injection detection |
| ASI07 Inter-Agent Comms | ✅ Full | AgentMesh trust handshake, HMAC auth |
| ASI08 Cascading Failures | ✅ Full | Agent SRE circuit breakers, cascade detection |
| ASI09 Human-Agent Trust | ✅ Full | Human approval workflows, audit logging |
| ASI10 Rogue Agents | ⚠️ Partial | Agent Hypervisor execution rings, kill switch |
| Layer | Package | Purpose | Install |
|---|---|---|---|
| Kernel | Agent OS | Policy enforcement, action interception | pip install agent-os-kernel |
| Network | AgentMesh | Identity, trust, delegation | pip install agentmesh-platform |
| Reliability | Agent SRE | SLOs, chaos testing, circuit breakers | pip install agent-sre |
| Runtime | Agent Hypervisor | Execution rings, resource limits, saga | pip install agent-hypervisor |
| Full Stack | ai-agent-governance | All of the above | pip install ai-agent-governance[full] |
pip install agent-os-kernel

import asyncio
from agent_os import StatelessKernel, ExecutionContext

async def main():
    # Create a governed agent in 3 lines
    kernel = StatelessKernel()
    # Define execution context with governance policies
    ctx = ExecutionContext(agent_id="demo-agent", policies=["read_only"])
    # Your agent runs with policy enforcement
    result = await kernel.execute(
        action="database_query",
        params={"query": "SELECT * FROM users"},
        context=ctx,
    )
    # ✅ Safe queries execute
    # ❌ "DROP TABLE users" → Blocked by kernel
    return result

# kernel.execute() is a coroutine, so it needs an event loop to run
asyncio.run(main())

That's it! Your agent now has deterministic policy enforcement.
🎬 See all features in action:
git clone https://github.com/imran-siddique/agent-os && python agent-os/demo.py
More examples:
from agent_os import StatelessKernel

kernel = StatelessKernel()
kernel.load_policy_yaml("""
version: "1.0"
name: api-safety
rules:
  - name: block-destructive-sql
    condition: "action == 'database_query'"
    action: deny
    pattern: "DROP|TRUNCATE|DELETE FROM .* WHERE 1=1"
  - name: rate-limit-api
    condition: "action == 'api_call'"
    limit: "100/hour"
""")
# `await` must run inside an async function (or an async REPL)
result = await kernel.execute(action="database_query", params={"query": "DROP TABLE users"})
# ❌ Blocked: Matched rule 'block-destructive-sql'

from agent_os import KernelSpace

kernel = KernelSpace()
# Every kernel action is automatically recorded
result = await kernel.execute(action="read_file", params={"path": "/data/report.csv"})
# Query the flight recorder
entries = kernel.flight_recorder.query(agent_id="agent-001", limit=10)
for entry in entries:
    print(f"{entry.timestamp} | {entry.action} | {entry.outcome}")

from agent_os import KernelSpace
from agent_os.emk import EpisodicMemory

kernel = KernelSpace(policy_file="policies.yaml")
memory = EpisodicMemory(max_turns=50)

@kernel.register
async def chat(message: str, conversation_id: str = "default") -> str:
    history = memory.get_history(conversation_id)
    # call_llm is a placeholder for your own LLM client call
    response = await call_llm(history + [{"role": "user", "content": message}])
    memory.add_turn(conversation_id, message, response)
    return response
# Outputs are checked against content policies; violations trigger SIGSTOP

See examples/ for 20+ runnable demos including SQL agents, GitHub reviewers, and compliance bots.
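To make the YAML policy example more concrete, here is a minimal, self-contained sketch of how a deny rule with a regex `pattern` could be evaluated. It is not the Agent OS implementation: the `Rule` dataclass and `evaluate` function are hypothetical names, and matching case-insensitively against all parameter values is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    """Hypothetical in-memory form of one rule from the YAML policy."""
    name: str
    action: str                     # action type the rule applies to
    pattern: Optional[str] = None   # regex; a match means "deny"


def evaluate(rules, action, params):
    """Return (allowed, rule_name). The first matching deny rule wins."""
    text = " ".join(str(value) for value in params.values())
    for rule in rules:
        if rule.action != action or rule.pattern is None:
            continue
        # Assumption: patterns are matched case-insensitively
        if re.search(rule.pattern, text, flags=re.IGNORECASE):
            return False, rule.name
    return True, None


rules = [
    Rule(
        name="block-destructive-sql",
        action="database_query",
        pattern=r"DROP|TRUNCATE|DELETE FROM .* WHERE 1=1",
    )
]

print(evaluate(rules, "database_query", {"query": "DROP TABLE users"}))
print(evaluate(rules, "database_query", {"query": "SELECT * FROM users"}))
```

The decision depends only on the rule list and the action parameters, which is what makes this style of check deterministic.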
from agent_os import stateless_execute

# 1. Define safety policies (not prompts: actual enforcement)
# 2. Actions are checked against policies before execution
result = await stateless_execute(
    action="database_query",
    params={"query": "SELECT revenue FROM sales"},
    agent_id="analyst-001",
    policies=["read_only"]
)
# ✅ Safe queries execute
# ❌ "DROP TABLE users" → BLOCKED (not by prompt, by kernel)

Result: Defined policies are deterministically enforced by the kernel, not by hoping the LLM follows instructions.
For the full kernel with signals, VFS, and protection rings:
from agent_os import KernelSpace, AgentSignal, AgentVFS

# Requires: pip install agent-os-kernel[full]
kernel = KernelSpace()
ctx = kernel.create_agent_context("agent-001")
await ctx.write("/mem/working/task.txt", "Hello World")

Note: KernelSpace, AgentSignal, and AgentVFS require installing the control-plane module: pip install agent-os-kernel[full]
Agent OS applies operating system concepts to AI agent governance. Instead of relying on prompts to enforce safety ("please don't do dangerous things"), it provides application-level middleware that intercepts and validates agent actions before execution.
Note: This is application-level enforcement (Python middleware), not OS kernel-level isolation. Agents run in the same process. For true isolation, run agents in containers.
┌─────────────────────────────────────────────────────────┐
│                 USER SPACE (Agent Code)                 │
│  Your agent code runs here. The kernel intercepts       │
│  actions before they execute.                           │
├─────────────────────────────────────────────────────────┤
│                      KERNEL SPACE                       │
│  Policy Engine │ Flight Recorder │ Signal Dispatch      │
│  Actions are checked against policies before execution  │
└─────────────────────────────────────────────────────────┘
Prompt-based safety asks the LLM to follow rules. The LLM decides whether to comply.
Kernel-based safety intercepts actions before execution. The policy engine decides, not the LLM.
This is the same principle operating systems use: applications request resources, the kernel grants or denies access based on permissions.
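The request/grant-or-deny principle can be sketched in a few lines of plain Python. This is an illustration of the interception pattern only, not Agent OS code: `governed`, `PolicyViolation`, and the `no_writes` policy are hypothetical names introduced for the example.

```python
import functools


class PolicyViolation(Exception):
    """Raised when the policy check denies an action."""


def governed(policy):
    """Wrap a function so that `policy` is consulted before every call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The policy decides, not the caller and not the wrapped code
            if not policy(func.__name__, args, kwargs):
                raise PolicyViolation(f"{func.__name__} denied by policy")
            return func(*args, **kwargs)
        return wrapper
    return decorator


def no_writes(action, args, kwargs):
    """Hypothetical policy: deny any action whose name starts with 'write'."""
    return not action.startswith("write")


@governed(no_writes)
def read_file(path):
    return f"contents of {path}"


@governed(no_writes)
def write_file(path, data):
    return f"wrote {len(data)} bytes to {path}"
```

Calling `read_file(...)` runs normally, while `write_file(...)` raises `PolicyViolation` before its body executes. As noted above, code that bypasses the wrapper (for example by calling `open` directly) is not covered by this kind of middleware.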
graph TB
subgraph Frameworks["External Frameworks"]
LC[LangChain]
CA[CrewAI]
AG[AutoGen]
OA[OpenAI]
AN[Anthropic]
GE[Gemini]
end
ADAPT[[Adapters]]
subgraph L4["Layer 4: Intelligence"]
SC[Self-Correction]
SA[Semantic Analysis]
end
subgraph L3["Layer 3: Control Plane"]
PE[Policy Engine]
AL[Audit Logging]
end
subgraph L2["Layer 2: Infrastructure"]
NM[Nexus Marketplace]
OB[Observability]
end
subgraph L1["Layer 1: Primitives"]
VE[Verification Engine]
CS[Context Service]
MS[Memory Store]
end
LC & CA & AG & OA & AN & GE --> ADAPT
ADAPT --> L4
SC & SA --> PE
SC & SA --> AL
PE & AL --> NM
PE & AL --> OB
NM & OB --> VE
NM & OB --> CS
NM & OB --> MS
agent-os/
├── src/agent_os/             # Core Python package
│   ├── __init__.py           # Public API (re-exports from all layers)
│   ├── stateless.py          # StatelessKernel (zero-dependency core)
│   ├── base_agent.py         # BaseAgent, ToolUsingAgent classes
│   ├── agents_compat.py      # AGENTS.md parser (OpenAI/Anthropic standard)
│   ├── cli.py                # CLI (agent-os check, review, init, etc.)
│   └── integrations/         # Framework adapters (LangChain, OpenAI, etc.)
├── modules/                  # Kernel Modules (4-layer architecture)
│   ├── primitives/           # Layer 1: Base types and failures
│   ├── cmvk/                 # Layer 1: Verification
│   ├── emk/                  # Layer 1: Episodic memory kernel
│   ├── caas/                 # Layer 1: Context-as-a-Service
│   ├── amb/                  # Layer 2: Agent message bus
│   ├── iatp/                 # Layer 2: Inter-agent trust protocol
│   ├── atr/                  # Layer 2: Agent tool registry
│   ├── observability/        # Layer 3: Prometheus + OpenTelemetry
│   ├── control-plane/        # Layer 3: THE KERNEL (policies, signals)
│   ├── scak/                 # Layer 4: Self-correcting agent kernel
│   ├── mute-agent/           # Layer 4: Face/Hands architecture
│   ├── nexus/                # Experimental: Trust exchange network
│   └── mcp-kernel-server/    # Integration: MCP protocol support
├── extensions/               # IDE & AI Assistant Extensions
│   ├── mcp-server/           # ⭐ MCP Server (Copilot, Claude, Cursor)
│   ├── vscode/               # VS Code extension
│   ├── copilot/              # GitHub Copilot extension
│   ├── jetbrains/            # IntelliJ/PyCharm plugin
│   ├── cursor/               # Cursor IDE extension
│   ├── chrome/               # Chrome extension
│   └── github-cli/           # gh CLI extension
├── examples/                 # Working examples
├── docs/                     # Documentation
├── tests/                    # Test suite (organized by layer)
├── notebooks/                # Jupyter tutorials
├── papers/                   # Research papers
└── templates/                # Policy templates
| Module | Layer | PyPI Package | Description | Status |
|---|---|---|---|---|
| primitives | 1 | agent-primitives | Base failure types, severity levels | ✅ Stable |
| cmvk | 1 | cmvk | Verification, drift detection | ✅ Stable |
| emk | 1 | emk | Episodic memory kernel (append-only ledger) | ✅ Stable |
| caas | 1 | caas-core | Context-as-a-Service, RAG pipeline | ✅ Stable |
| amb | 2 | amb-core | Agent message bus (async pub/sub) | ✅ Stable |
| iatp | 2 | inter-agent-trust-protocol | Sidecar trust protocol, typed IPC pipes | ✅ Stable |
| atr | 2 | agent-tool-registry | Tool registry with LLM schema generation | ✅ Stable |
| control-plane | 3 | agent-control-plane | THE KERNEL: policy engine, signals, VFS | ✅ Stable |
| observability | 3 | agent-os-observability | Prometheus metrics + OpenTelemetry tracing | ⚠️ No tests |
| scak | 4 | scak | Self-correcting agent kernel | ✅ Stable |
| mute-agent | 4 | mute-agent | Decoupled reasoning/execution architecture | ⚠️ No tests |
| nexus | - | Not published | Trust exchange network | 🔬 Prototype |
| mcp-kernel-server | Int | mcp-kernel-server | MCP server for Claude Desktop | ⚠️ No tests |
| hypervisor | - | agent-hypervisor | Runtime supervisor: Execution Rings, Joint Liability, Saga Orchestrator (own repo) | ✅ 184 tests |
Runtime supervisor for multi-agent collaboration β think "VMware for AI agents."
Now its own repo: agent-hypervisor (184 tests, 268μs full pipeline, zero dependencies beyond pydantic).
Just as OS hypervisors isolate virtual machines and enforce resource boundaries, the Agent Hypervisor isolates AI agent sessions and enforces governance boundaries at sub-millisecond latency.
┌────────────────────────────────────────────────────────────┐
│                      AGENT HYPERVISOR                      │
│                                                            │
│  Ring 0 (Root)       │ SRE Witness required                │
│  Ring 1 (Privileged) │ σ_eff > 0.95 + consensus            │
│  Ring 2 (Standard)   │ σ_eff > 0.60                        │
│  Ring 3 (Sandbox)    │ Default for unknown agents          │
│                                                            │
│  ┌───────────┐ ┌───────────┐ ┌────────────────────────┐    │
│  │ Joint     │ │ Semantic  │ │ Hash-Chained           │    │
│  │ Liability │ │ Saga      │ │ Delta Audit Trail      │    │
│  │ Engine    │ │ Orchestr. │ │ (Tamper-Evident)       │    │
│  └───────────┘ └───────────┘ └────────────────────────┘    │
└────────────────────────────────────────────────────────────┘
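The ring thresholds in the diagram can be read as a simple decision function. The sketch below is an interpretation, not the hypervisor's code: `Ring` and `assign_ring` are hypothetical names, the thresholds are treated as strict inequalities, and Ring 0 is left out because the diagram only says it requires an SRE witness. How the effective trust score relates to the raw score passed by callers is not specified here, so the function takes the effective score directly.

```python
from enum import IntEnum


class Ring(IntEnum):
    """Privilege rings from the diagram (lower number = more privilege)."""
    ROOT = 0
    PRIVILEGED = 1
    STANDARD = 2
    SANDBOX = 3


def assign_ring(sigma_eff: float, has_consensus: bool = False) -> Ring:
    """Map an effective trust score to a ring (Ring 0 is never auto-assigned)."""
    if sigma_eff > 0.95 and has_consensus:
        return Ring.PRIVILEGED
    if sigma_eff > 0.60:
        return Ring.STANDARD
    # Unknown or low-trust agents start in the sandbox
    return Ring.SANDBOX


for score, consensus in [(0.97, True), (0.97, False), (0.85, False), (0.30, False)]:
    print(score, consensus, assign_ring(score, consensus).name)
```

Under these assumptions a score of 0.97 without consensus falls back to Ring 2, since the Ring 1 row lists both conditions.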
| Feature | Description | Latency |
|---|---|---|
| Execution Rings | 4-level privilege model (Ring 0-3) based on trust score | 0.3μs |
| Joint Liability | High-trust agents vouch for low-trust agents with bonded reputation | 7μs |
| Saga Orchestrator | Multi-step transactions with timeout, retry, and auto-compensation | 151μs |
| Delta Audit | Hash-chained semantic diffs with blockchain commitment | 27μs |
| Full Pipeline | Session + join + audit + saga + terminate | 268μs |
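For readers unfamiliar with hash chaining, the idea behind the Delta Audit row can be sketched with the standard library. This is a generic illustration, assuming SHA-256 over a JSON encoding of each delta; the function names and entry layout are hypothetical and say nothing about the hypervisor's actual format or its blockchain commitment step.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def entry_hash(prev_hash, delta):
    """Hash the previous entry's hash together with this entry's delta."""
    payload = json.dumps({"prev": prev_hash, "delta": delta}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(chain, delta):
    """Append a delta, linking it to the hash of the previous entry."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"delta": delta, "prev": prev, "hash": entry_hash(prev, delta)})


def verify(chain):
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["delta"]):
            return False
        prev = entry["hash"]
    return True


chain = []
append(chain, {"field": "status", "old": "draft", "new": "sent"})
append(chain, {"field": "owner", "old": "alpha", "new": "beta"})
print(verify(chain))
```

Editing any stored delta changes its recomputed hash, so `verify` returns `False` from that entry onward. That is what makes the trail tamper-evident rather than tamper-proof.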
pip install agent-hypervisor

from hypervisor import Hypervisor, SessionConfig, ConsistencyMode

hv = Hypervisor()
# Create a governed multi-agent session
session = await hv.create_session(
    config=SessionConfig(consistency_mode=ConsistencyMode.EVENTUAL, max_participants=5),
    creator_did="did:mesh:admin",
)
# Agents are automatically assigned privilege rings based on trust score
ring = await hv.join_session(session.sso.session_id, "did:mesh:agent-alpha", sigma_raw=0.85)
# → Ring 2 (Standard): can execute reversible actions
# Multi-step saga with automatic timeout and compensation
saga = session.saga.create_saga(session.sso.session_id)
step = session.saga.add_step(
    saga.saga_id, "draft_email", "did:mesh:agent-alpha",
    execute_api="/api/draft", undo_api="/api/undo-draft",
    timeout_seconds=30, max_retries=2,
)
# Terminate: returns tamper-evident summary hash
summary_hash = await hv.terminate_session(session.sso.session_id)

See the full Hypervisor documentation for details.
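The saga step above pairs an `execute_api` with an `undo_api`. As a rough illustration of what retry plus auto-compensation means, here is a synchronous sketch; `run_saga` is a hypothetical helper, timeouts are omitted for brevity, and the real orchestrator's behaviour may differ.

```python
def run_saga(steps, max_retries=0):
    """Run (name, execute, undo) steps in order.

    Each step is tried up to max_retries + 1 times. If a step still fails,
    the completed steps are undone in reverse order.
    Returns (succeeded, log).
    """
    completed, log = [], []
    for name, execute, undo in steps:
        for attempt in range(max_retries + 1):
            try:
                execute()
                break
            except Exception:
                log.append(f"failed:{name}:attempt{attempt + 1}")
        else:
            # Every attempt failed: compensate in reverse order
            for done_name, done_undo in reversed(completed):
                done_undo()
                log.append(f"undone:{done_name}")
            return False, log
        log.append(f"done:{name}")
        completed.append((name, undo))
    return True, log
```

With a first step that succeeds and a second that always raises, `run_saga(..., max_retries=1)` records two failed attempts and then undoes the first step.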
| Extension | Description | Status |
|---|---|---|
| mcp-server | ⭐ MCP Server: works with Claude, Copilot, Cursor (npx agentos-mcp-server) | ✅ Published (v1.0.1) |
| vscode | VS Code extension with real-time policy checks, enterprise features | ✅ Published (v1.0.1) |
| copilot | GitHub Copilot extension (Vercel/Docker deployment) | ✅ Published (v1.0.0) |
| jetbrains | IntelliJ, PyCharm, WebStorm plugin (Kotlin) | ✅ Built (v1.0.0) |
| cursor | Cursor IDE extension (Composer integration) | ✅ Built (v0.1.0) |
| chrome | Chrome extension for GitHub, Jira, AWS, GitLab | ✅ Built (v1.0.0) |
| github-cli | gh agent-os CLI extension | |
pip install agent-os-kernel

Or with optional components:
pip install agent-os-kernel[cmvk]           # + verification
pip install agent-os-kernel[iatp]           # + inter-agent trust
pip install agent-os-kernel[observability]  # + Prometheus/OpenTelemetry
pip install agent-os-kernel[nexus]          # + trust exchange network
pip install agent-os-kernel[full]           # Everything

macOS/Linux:
curl -sSL https://raw.githubusercontent.com/imran-siddique/agent-os/main/scripts/quickstart.sh | bash

Windows (PowerShell):
iwr -useb https://raw.githubusercontent.com/imran-siddique/agent-os/main/scripts/quickstart.ps1 | iex

from agent_os import stateless_execute

# Execute with policy enforcement
result = await stateless_execute(
    action="database_query",
    params={"query": "SELECT * FROM users"},
    agent_id="analyst-001",
    policies=["read_only"]
)

from agent_os import KernelSpace, AgentSignal, PolicyRule

kernel = KernelSpace()
# Create agent context with VFS
ctx = kernel.create_agent_context("agent-001")
await ctx.write("/mem/working/task.txt", "analyze this data")
# Policy enforcement
from agent_os import PolicyEngine

engine = PolicyEngine()
engine.add_rule(PolicyRule(name="no_sql_injection", pattern="DROP|DELETE|TRUNCATE"))

Agent OS borrows concepts from POSIX operating systems:
| Concept | POSIX | Agent OS |
|---|---|---|
| Process control | SIGKILL, SIGSTOP | AgentSignal.SIGKILL, AgentSignal.SIGSTOP |
| Filesystem | /proc, /tmp | VFS with /mem/working, /mem/episodic |
| IPC | Pipes (\|) | Typed IPC pipes between agents |
| Syscalls | open(), read() | kernel.execute() |
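The process-control row can be pictured as a small state machine. The sketch below is illustrative only: `MiniSignal`, `AgentState`, and `MiniDispatcher` are hypothetical stand-ins, and treating a terminated agent as unable to resume is an assumption borrowed from POSIX semantics.

```python
from enum import Enum


class MiniSignal(Enum):
    SIGSTOP = "stop"
    SIGCONT = "continue"
    SIGKILL = "kill"


class AgentState(Enum):
    RUNNING = "running"
    STOPPED = "stopped"
    TERMINATED = "terminated"


class MiniDispatcher:
    """Tracks one state per agent and applies signals to it."""

    def __init__(self):
        self._states = {}

    def register(self, agent_id):
        self._states[agent_id] = AgentState.RUNNING

    def signal(self, agent_id, sig):
        state = self._states[agent_id]
        if state is AgentState.TERMINATED:
            return state  # terminal: later signals are ignored
        if sig is MiniSignal.SIGKILL:
            state = AgentState.TERMINATED
        elif sig is MiniSignal.SIGSTOP:
            state = AgentState.STOPPED
        elif sig is MiniSignal.SIGCONT:
            state = AgentState.RUNNING
        self._states[agent_id] = state
        return state
```

A policy violation that triggers a stop signal would move the agent to `STOPPED`, where it stays until an operator sends a continue or kill signal.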
# Requires: pip install agent-os-kernel[full]
from agent_os import SignalDispatcher, AgentSignal
dispatcher = SignalDispatcher()
dispatcher.signal(agent_id, AgentSignal.SIGSTOP) # Pause
dispatcher.signal(agent_id, AgentSignal.SIGCONT) # Resume
dispatcher.signal(agent_id, AgentSignal.SIGKILL)  # Terminate

# Requires: pip install agent-os-kernel[full]
from agent_os import AgentVFS

vfs = AgentVFS(agent_id="agent-001")
vfs.write("/mem/working/task.txt", "Current task")
vfs.read("/policy/rules.yaml")  # Read-only from user space

Wrap existing frameworks with Agent OS governance:
# LangChain
from agent_os.integrations import LangChainKernel
governed = LangChainKernel().wrap(my_chain)
# OpenAI Assistants
from agent_os.integrations import OpenAIKernel
governed = OpenAIKernel().wrap_assistant(assistant, client)
# Semantic Kernel
from agent_os.integrations import SemanticKernelWrapper
governed = SemanticKernelWrapper().wrap(sk_kernel)
# CrewAI
from agent_os.integrations import CrewAIKernel
governed = CrewAIKernel().wrap(my_crew)
# AutoGen
from agent_os.integrations import AutoGenKernel
governed = AutoGenKernel().wrap(autogen_agent)
# OpenAI Agents SDK
from agent_os.integrations import OpenAIAgentsSDKKernel
governed = OpenAIAgentsSDKKernel().wrap(agent)

Note: These adapters use lazy interception, so the target framework does not need to be installed until you call .wrap().
See integrations documentation for full details.
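One way to picture lazy interception is an adapter that stores only the framework's module name and imports it inside `wrap()`. The sketch below shows that pattern under stated assumptions: `LazyAdapter`, its `policy` callback, and the choice to patch a single method are all hypothetical, not the shipped adapter code.

```python
import importlib


class LazyAdapter:
    """Imports the target framework on wrap(), not at construction time."""

    def __init__(self, framework_module, policy):
        self.framework_module = framework_module  # module name only
        self.policy = policy
        self.framework = None                     # filled in by wrap()

    def wrap(self, target, method="invoke"):
        # Deferred import: raises ModuleNotFoundError here if it is missing
        self.framework = importlib.import_module(self.framework_module)
        original = getattr(target, method)
        policy = self.policy

        def governed_call(*args, **kwargs):
            # Check the policy before delegating to the real method
            if not policy(method, args, kwargs):
                raise PermissionError(f"{method} denied by policy")
            return original(*args, **kwargs)

        setattr(target, method, governed_call)
        return target
```

Constructing `LazyAdapter("some_framework", policy)` succeeds even when `some_framework` is not installed; the import error only surfaces when `wrap()` is called.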
| Framework | Governance Level | Async Support | Status | Adapter File |
|---|---|---|---|---|
| LangChain | Chain/Agent/Runnable | ✅ ainvoke | ✅ Stable | integrations/langchain_adapter.py |
| OpenAI Assistants | Run/Thread/Tool Call | ✅ Streaming | ✅ Stable | integrations/openai_adapter.py |
| AutoGen | Multi-Agent Orchestration | ❌ Sync only | ✅ Stable | integrations/autogen_adapter.py |
| Semantic Kernel | Function/Plugin/Memory | ✅ Native async | ✅ Stable | integrations/semantic_kernel_adapter.py |
| CrewAI | Crew/Agent/Task | ❌ Sync only | ✅ Stable | integrations/crewai_adapter.py |
| OpenAI Agents SDK | Agent/Tool/Handoff | ✅ Native async | ✅ Stable | integrations/openai_agents_sdk_adapter.py |
The examples/ directory contains demos at various levels:
| Demo | Description | Command |
|---|---|---|
| demo-app | Uses the stateless API (most reliable) | cd examples/demo-app && python demo.py |
| hello-world | Minimal example | cd examples/hello-world && python agent.py |
| quickstart | Quick intro | cd examples/quickstart && python my_first_agent.py |
These examples are self-contained and don't require external Agent OS imports:
| Demo | Description |
|---|---|
| healthcare-hipaa | HIPAA-compliant agent |
| customer-service | Customer support agent |
| legal-review | Legal document analysis |
| crewai-safe-mode | CrewAI with safety wrappers |
| Demo | Description | Command |
|---|---|---|
| carbon-auditor | Multi-model verification | cd examples/carbon-auditor && docker-compose up |
| grid-balancing | Multi-agent coordination | cd examples/grid-balancing && docker-compose up |
| defi-sentinel | Real-time attack detection | cd examples/defi-sentinel && docker-compose up |
| pharma-compliance | Document analysis | cd examples/pharma-compliance && docker-compose up |
Each production demo includes:
- Grafana dashboard on port 300X
- Prometheus metrics on port 909X
- Jaeger tracing on port 1668X
# Run carbon auditor with full observability
cd examples/carbon-auditor
cp .env.example .env # Optional: add API keys
docker-compose up
# Open dashboards
open http://localhost:3000 # Grafana (admin/admin)
open http://localhost:16686 # Jaeger traces

Agent OS includes pre-built safe tools via the Agent Tool Registry:
# Requires: pip install agent-os-kernel[full]
from atr import ToolRegistry, tool

@tool(name="safe_http", description="Rate-limited HTTP requests")
async def safe_http(url: str) -> dict:
    # Tool is automatically registered and sandboxed
    ...

registry = ToolRegistry()
registry.register(safe_http)
# Generate schemas for any LLM
openai_tools = registry.to_openai_schema()
anthropic_tools = registry.to_anthropic_schema()

Connect agents using the async message bus:
# Requires: pip install agent-os-kernel[full]
from amb_core import MessageBus, Message
bus = MessageBus()
await bus.subscribe("tasks", handler)
await bus.publish("tasks", Message(payload={"task": "analyze"}))

Broker adapters are available for Redis, Kafka, and NATS (these require optional dependencies).
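For intuition about the subscribe/publish flow, here is a tiny in-process pub/sub sketch built on `asyncio`. `MiniBus` is a hypothetical teaching aid, not the `amb_core` API: it has no broker, persistence, or dead-letter queue, and it delivers plain payloads to all subscribers of a topic concurrently.

```python
import asyncio
from collections import defaultdict


class MiniBus:
    """In-memory topic -> handlers map with async delivery."""

    def __init__(self):
        self._handlers = defaultdict(list)

    async def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    async def publish(self, topic, payload):
        """Deliver payload to every subscriber; return how many received it."""
        handlers = list(self._handlers[topic])
        await asyncio.gather(*(handler(payload) for handler in handlers))
        return len(handlers)


async def demo():
    received = []

    async def handler(payload):
        received.append(payload)

    bus = MiniBus()
    await bus.subscribe("tasks", handler)
    await bus.publish("tasks", {"task": "analyze"})
    return received


print(asyncio.run(demo()))
```

A broker-backed adapter would replace the in-memory dictionary with a network client while keeping the same two operations.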
Agent OS includes a CLI for terminal workflows:
# Check files for safety violations
agentos check src/app.py
# ✓ src/app.py: No violations
# OR
# ⚠️ 2 violation(s) found in src/app.py:
# Line 12: DROP TABLE users;
# Violation: Destructive SQL: DROP operation detected
# Policy: block-destructive-sql
# Check staged git files (ideal for pre-commit hooks)
agentos check --staged
# ✓ No violations in staged files
# Machine-readable JSON output (for CI pipelines)
agentos check src/app.py --format json
# CI mode (no colors, strict exit codes)
agentos check --staged --ci

# Initialize Agent OS in a project
agentos init
# Initialized Agent OS in .agents/
# - agents.md: Agent instructions (OpenAI/Anthropic standard)
# - security.md: Kernel policies (Agent OS extension)
# - Template: strict
# Choose a permissive or audit-only template
agentos init --template permissive
agentos init --template audit
# Overwrite an existing .agents/ directory
agentos init --force

# Enable kernel governance and verify the configuration
agentos secure
# Securing agents in .
# [PASS] kernel version
# [PASS] signals defined
# [PASS] policies defined
# Security configuration valid.

# Audit agent security configuration
agentos audit
# Auditing .
# [OK] agents.md
# [OK] security.md
# No issues found.
# JSON output for CI
agentos audit --format json

# Show kernel status (version, installed packages)
agentos status
# Agent OS Kernel Status
# ========================================
# Version: 1.2.0
# Status: Installed
# Project: /home/user/myproject
# Agents: Configured (.agents/ found)

# Multi-model code review with CMVK consensus
agentos review src/app.py --cmvk
# Reviewing src/app.py with CMVK...
# Multi-Model Review (3 models):
# ✅ gpt-4: No issues
# ⚠️ claude-sonnet-4: 1 potential issue(s)
# ✅ gemini-pro: No issues
# Consensus: 67%
# Specify models
agentos review src/app.py --cmvk --models "gpt-4,claude-sonnet-4"

# Validate policy YAML files
agentos validate
# Checking .agents/policy.yaml... OK
# ✓ All 1 policy file(s) valid.
# Validate specific files in strict mode
agentos validate policies/*.yaml --strict

# Install git pre-commit hook
agentos install-hooks
# ✓ Installed pre-commit hook: .git/hooks/pre-commit
# Agent OS will now check staged files before each commit.
# Append to an existing hook
agentos install-hooks --append

# Start the HTTP API server
agentos serve --port 8080
# Agent OS API server starting on 0.0.0.0:8080
# Endpoints:
# GET /health Health check
# GET /status Kernel status
# GET /agents List agents
# POST /agents/{id}/execute Execute agent action

# Output Prometheus-style metrics
agentos metrics
# # HELP agentos_policy_violations_total Total policy violations.
# # TYPE agentos_policy_violations_total counter
# agentos_policy_violations_total 0
# ...

Agent OS provides an MCP server that works with any MCP-compatible AI assistant:
# Quick install via npx
npx agentos-mcp-server

npm: agentos-mcp-server
MCP Registry: io.github.imran-siddique/agentos
Add to your config file:
Claude Desktop (%APPDATA%\Claude\claude_desktop_config.json on Windows):
{
  "mcpServers": {
    "agentos": {
      "command": "npx",
      "args": ["-y", "agentos-mcp-server"]
    }
  }
}

Features: 10 tools for agent creation, policy enforcement, compliance checking (SOC 2, GDPR, HIPAA), human-in-the-loop approvals, and audit logging.
See MCP server documentation for full details.
- 5-Minute Quickstart: Get running fast
- 30-Minute Deep Dive: Comprehensive walkthrough
- Building Your First Governed Agent: Complete tutorial
- Using Message Bus Adapters: Connect agents
- Creating Custom Tools: Build safe tools
- Cheatsheet: Quick reference
| Notebook | Description | Time |
|---|---|---|
| Hello Agent OS | Your first governed agent | 5 min |
| Episodic Memory | Agent memory that persists | 15 min |
| Time-Travel Debugging | Replay and debug decisions | 20 min |
| Verification | Detect hallucinations | 15 min |
| Multi-Agent Coordination | Trust between agents | 20 min |
| Policy Engine | Deep dive into policies | 15 min |
- Quickstart Guide: 60 seconds to first agent
- Framework Integrations: LangChain, OpenAI, etc.
- Kernel Internals: How the kernel works
- Architecture Overview: Getting started
- RFC-003: Agent Signals (POSIX-style signals)
- RFC-004: Agent Primitives (core primitives)
This is a research project exploring kernel concepts for AI agent governance.
These components are fully implemented and tested:
| Component | Tests |
|---|---|
| StatelessKernel: zero-dependency policy enforcement (src/agent_os/) | ✅ Full coverage |
| Policy Engine: deterministic rule enforcement | ✅ Tested |
| Flight Recorder: SQLite-based audit logging | ✅ Tested |
| CLI: agent-os check, init, secure, validate | ✅ Tested |
| Framework Adapters: LangChain, OpenAI, Semantic Kernel, CrewAI, AutoGen, OpenAI Agents SDK | ✅ Implemented |
| AGENTS.md Parser: OpenAI/Anthropic standard agent config | ✅ Full coverage |
| Primitives (agent-primitives): failure types, severity levels | ✅ Tested |
| CMVK (cmvk): drift detection, distance metrics (955+ lines) | ✅ Tested |
| EMK (emk): episodic memory with JSONL storage | ✅ 8 test files |
| AMB (amb-core): async message bus, DLQ, tracing | ✅ 6 test files |
| IATP (inter-agent-trust-protocol): sidecar trust, typed IPC | ✅ 9 test files |
| ATR (agent-tool-registry): multi-LLM schema generation | ✅ 6 test files |
| Control Plane (agent-control-plane): signals, VFS, protection rings | ✅ 18 test files |
| SCAK (scak): self-correcting agent kernel | ✅ 23 test files |
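The Flight Recorder row above describes SQLite-based audit logging. A minimal sketch of that idea with the standard `sqlite3` module follows; `MiniRecorder`, its table layout, and its method names are assumptions made for illustration and do not reflect the actual schema.

```python
import sqlite3
import time


class MiniRecorder:
    """Append-only action log stored in SQLite (in-memory by default)."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS entries "
            "(ts REAL, agent_id TEXT, action TEXT, outcome TEXT)"
        )

    def record(self, agent_id, action, outcome):
        # `with self.db` commits on success and rolls back on error
        with self.db:
            self.db.execute(
                "INSERT INTO entries VALUES (?, ?, ?, ?)",
                (time.time(), agent_id, action, outcome),
            )

    def query(self, agent_id, limit=10):
        """Return the most recent (action, outcome) pairs for one agent."""
        cursor = self.db.execute(
            "SELECT action, outcome FROM entries "
            "WHERE agent_id = ? ORDER BY rowid DESC LIMIT ?",
            (agent_id, limit),
        )
        return cursor.fetchall()


recorder = MiniRecorder()
recorder.record("agent-001", "read_file", "allowed")
recorder.record("agent-001", "database_query", "denied")
print(recorder.query("agent-001"))
```

Because the database file is writable by the same process that hosts the agent, a log like this is only as trustworthy as that process.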
| Component | What's Missing |
|---|---|
| Mute Agent (mute-agent) | No tests; all layer dependencies use mock adapters |
| Observability (agent-os-observability) | No tests; Prometheus metrics, Grafana dashboards, OTel tracing implemented |
| MCP Kernel Server (mcp-kernel-server) | No tests; 1173-line implementation |
| GitHub CLI Extension | Single bash script with simulated output |
| Control Plane MCP Adapter | Placeholder: returns canned responses |
| Control Plane A2A Adapter | Placeholder: negotiation accepts all params |
| Component | What's Missing |
|---|---|
| Nexus Trust Exchange | No pyproject.toml, no tests, placeholder cryptography (XOR, not secure), all signature verification stubbed, in-memory storage only |
| Limitation | Impact | Mitigation |
|---|---|---|
| Application-level only | Direct stdlib calls (subprocess, open) bypass kernel | Pair with container isolation for production |
| Blocklist-based policies | Novel attack patterns not in rules will pass | Add AST-level parsing (#32), use defense in depth |
| Shadow Mode single-step | Multi-step agent simulations diverge from reality | Use for single-turn validation only |
| No tamper-proof audit | Flight Recorder SQLite can be modified by compromised agent | Write to external sink for critical audits |
| Provider-coupled adapters | Each SDK needs separate adapter | Abstract interface planned (#47) |
See GitHub Issues for the full roadmap.
Prompt-based safety relies on instructing the LLM to follow rules via system prompts. This approach is probabilistic β the model may still produce unsafe outputs under certain conditions.
Agent OS enforces policies at the middleware layer. Actions are intercepted and validated before execution, making enforcement deterministic rather than dependent on model compliance.
Agent OS can wrap and govern agents built with popular frameworks including LangChain, CrewAI, AutoGen, Semantic Kernel, and the OpenAI SDK. It also supports MCP-based integrations.
Core components such as the StatelessKernel and Policy Engine are production-ready. However, Agent OS provides application-level enforcement. For high-security environments, it should be combined with infrastructure isolation (e.g., containers).
Custom policies can be defined programmatically in Python or declaratively using YAML. Policies define rules that inspect and allow or deny agent actions before execution.
Policy checks are lightweight: enforcement is designed to complete in under 1 ms per action. The actual overhead depends on the number and complexity of the rules configured.
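To get a feel for how rule count affects overhead in your own setup, a rough micro-benchmark along these lines can help. It times plain regex checks with `time.perf_counter`; `mean_check_seconds` is a hypothetical helper, and the numbers it produces describe this toy loop on your machine, not Agent OS itself.

```python
import re
import time


def mean_check_seconds(patterns, text, iterations=10_000):
    """Average wall-clock time to test `text` against all patterns once."""
    compiled = [re.compile(pattern) for pattern in patterns]
    start = time.perf_counter()
    for _ in range(iterations):
        # Stops at the first matching pattern, like a first-match deny rule
        any(regex.search(text) for regex in compiled)
    return (time.perf_counter() - start) / iterations


few = mean_check_seconds(["DROP", "TRUNCATE"], "SELECT * FROM users")
many = mean_check_seconds([f"PATTERN_{i}" for i in range(200)], "SELECT * FROM users")
print(f"2 rules:   {few * 1e6:.2f} microseconds per check")
print(f"200 rules: {many * 1e6:.2f} microseconds per check")
```

Results vary by machine and Python version, so treat them as relative rather than absolute.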
ModuleNotFoundError: No module named 'agent_os'
# Install from source
git clone https://github.com/imran-siddique/agent-os.git
cd agent-os
pip install -e .

Optional modules not available
# Check what's installed
python -c "from agent_os import check_installation; check_installation()"
# Install everything
pip install -e ".[full]"

Permission errors on Windows
# Run PowerShell as Administrator, or use --user flag
pip install --user -e .

Docker not working
# Build with Dockerfile (no Docker Compose needed for simple tests)
docker build -t agent-os .
docker run -it agent-os python examples/demo-app/demo.py

Tests failing with API errors
# Most tests work without API keys: mock mode is the default
pytest tests/ -v
# For real LLM tests, set environment variables
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...

What is the difference between Agent OS and prompt-based guardrails?
Prompt-based guardrails ask the LLM to self-police, which is probabilistic. Agent OS enforces governance at the middleware level using deterministic policy engines and POSIX-inspired access controls. It controls what agents can do (capability-based), not just what they should not do (filter-based).
How does Agent OS work with other frameworks?
Agent OS integrates with 14+ frameworks via adapters. Install the governance layer alongside your existing framework: use langgraph-trust for LangGraph, openai-agents-trust for OpenAI Agents, or the MCP server for any MCP-compatible client. Agent OS acts as a kernel layer underneath your agent framework.
What is the Agent Governance Ecosystem?
Agent OS is part of a suite of four projects: Agent OS (policy kernel), AgentMesh (trust network), Agent Hypervisor (runtime supervisor), and Agent SRE (reliability platform). Together they provide 4,310+ tests across 17 modules.
Can I use Agent OS in production?
Yes. Agent OS has 1,500+ tests, a VS Code extension, PyPI package (pip install agent-os-kernel), and is integrated into production frameworks like Dify (65K stars) and LlamaIndex (47K stars). It supports Python 3.9+ and runs on any platform.
git clone https://github.com/imran-siddique/agent-os.git
cd agent-os
pip install -e ".[dev]"
pytest

MIT. See LICENSE