Every team building AI agents in production hits the same moment. An agent that worked correctly for weeks starts producing subtly wrong outputs. No code changed. No model was updated. The team spends days reconstructing what happened — only to discover that something in the retrieval layer shifted, a memory write accumulated the wrong belief across sessions, or a tool call escalated in a direction nobody had modeled. The execution was a black box. The post-mortem raises more questions than it answers.
The tools on the market each solve one layer. Runtime scanners block injections but generate no compliance evidence. Legacy GRC platforms govern employees and laptops — they have no concept of an autonomous agent, a swarm, or a non-human identity with its own permission lifecycle. General frameworks describe the risk landscape without specifying how to engineer the fix. What is missing in all of them is a governance contract: a formal specification of the complete operating envelope for agentic AI that defines what gets sanitized, what gets logged, how failures are contained, who holds the authority to stop a deployment when it needs to stop, and what evidence satisfies the audit.
That is exactly what this framework is. AI SAFE² v3.0 is the engineering specification for agentic AI that happens to satisfy every major compliance requirement simultaneously — because it was built by reverse-engineering actual failure modes from production deployments, then defining the controls required to prevent them. Version 3.0 adds 23 new pillar controls grounded in validated red-team findings, bringing the total to 161 controls, 151 across five operational pillars. It also introduces 10 cross-pillar governance controls that address what no other framework has yet touched: agent replication governance (the moment one agent can clone itself, four IAM assumptions collapse at once), named kill-switch authority for autonomous deployments, and the first integration of OWASP AIVSS v0.8 amplification scoring into a GRC risk formula.
What users get: Consistency, privacy, security, reliability, and predictability — so AI systems deliver their intended outcomes without silent failures, governance gaps, or compliance surprises.
The framework is organized around 5 Operational Pillars plus a Cross-Pillar Governance Layer introduced in v3.0. Together they form a complete operational contract covering every phase of agentic AI.
| Section | Link | What You'll Find |
|---|---|---|
| Pillar 1: Sanitize & Isolate | 01-sanitize-isolate/ | Input defense, injection coverage, memory governance, no-code security |
| Pillar 2: Audit & Inventory | 02-audit-inventory/ | Tracing, logging, model lineage, RAG integrity |
| Pillar 3: Fail-Safe & Recovery | 03-fail-safe-recovery/ | Circuit breakers, recursion limits, rollback |
| Pillar 4: Engage & Monitor | 04-engage-monitor/ | Detection pipelines, HITL, platform monitoring |
| Pillar 5: Evolve & Educate | 05-evolve-educate/ | Adversarial evaluation, red-team artifacts |
| Cross-Pillar Governance | 00-cross-pillar/ | CP.1-CP.10: ACT tiers, HEAR doctrine, replication governance |
| AISM Layer | AISM/ | Governance, control mapping, operational oversight |
| Research Notes | research/ | Deep-dive evidence for all controls (001-014) |
| Interactive Dashboard | Launch Dashboard | Search, filter, and explore all 161 controls live |
Don't wait for a breach. Choose your path and lock it down.
Download
skill.mdand upload it to Claude Projects > Project Knowledge. Your Claude instance becomes a certified AI SAFE² Architect immediately.
| I am a... | 🛠️ Your Action Plan | ⏱️ Time |
|---|---|---|
| Developer / Engineer | Run the 5-Minute Audit | 5 min |
| Python Builder | Secure Python Implementation | 15 min |
| No-Code / Automation | Secure Make.com & n8n Workflows | 10 min |
| CISO / Compliance | Get the Full GRC Toolkit | Instant |
New in v2.0: The AI SAFE² OpenClaw Core File Standard ships 11 governance files that apply the full five-pillar model to a personal AI agent workspace. Drop them in, fill the placeholders, run the smoke test, and your agent is governed.
OpenClaw is the first widely-deployed, self-hosted autonomous agent with shell access — exactly the class of system AI SAFE² was designed to govern. The integration gives every OpenClaw operator a complete, auditable governance stack in under an afternoon.
| Layer | What | Where |
|---|---|---|
| Internal Governance | 11 core files defining values, rules, memory, identity, and workspace policy | examples/openclaw/core/ |
| External Enforcement | Scanner, gateway, v1 memory vaccine — infrastructure that wraps the agent | examples/openclaw/ |
Both layers are required. Internal governance defines what the agent intends to do. External enforcement ensures nothing harmful escapes even if the agent is deceived.
Quick Start:
cp -r examples/openclaw/core/. ~/my-agent/
# Then open OPENCLAW-AGENT-TEMPLATE.md and follow the checklistQuick Start: 10-Minute Hardening Guide
Full Resources: examples/openclaw/
Most frameworks stop at the model. AI SAFE² v3.0 explicitly models and mandates controls across the entire real-world stack, securing the tools your developers actually use.
| Layer | Scope | Key Controls |
|---|---|---|
| L1: Core Models | LLMs, Fine-Tuned Weights | A2.3 Model Lineage Provenance Ledger |
| L2: Data Infrastructure | Vector DBs, RAG, Knowledge Bases | S1.5 Memory Governance + A2.6 RAG Corpus Diff Tracking |
| L3: System Patterns | MCP, A2A, API Integrations, Protocol Meshes | CP.5 Platform-Specific Profiles + P2.T3.10 Vuln Scanning |
| L4: Agentic AI | Swarms, Orchestration, n8n, LangGraph, CrewAI | F3.2-F3.5 Fail-Safe Suite + CP.9 Agent Replication Governance |
| L5: Non-Human Identities | Service Accounts, Agents, API Keys | CP.4 Agentic Control Plane + CP.10 HEAR Doctrine |
Legend: Green = Dedicated Control | Orange = Cross-Pillar Governance | 🔗 = Inherited Coverage
graph LR;
A[User Input / Agent Action] -->|Interception| B{Pillar 1: Firewall};
B -- "Injection Detected" --> C[BLOCK & LOG];
B -- "Clean" --> D{Pillar 2: Policy Check};
D -- "Violation" --> C;
D -- "Approved" --> E[Model Inference];
E --> F{Pillar 3: Fail-Safe Governor};
F -- "Recursion / Drift" --> G[Contain & Alert];
F -- "Safe" --> H{Pillar 4: Monitor & Detect};
H -- "Anomaly" --> G;
H -- "Clear" --> I[Execute Action];
I --> J{Cross-Pillar: HEAR / Replication};
J -- "Class-H Action" --> K[HEAR Authorization Required];
J -- "Standard" --> L[Complete + Log];
style C fill:#B80000,stroke:#333,stroke-width:2px;
style L fill:#006400,stroke:#333,stroke-width:2px;
style K fill:#cc6600,stroke:#333,stroke-width:2px;
Explore all 161 AI SAFE² controls through our live, interactive taxonomy explorer.
👉 Launch Dashboard 👈
Features:
- 🔍 Real-time search across all control metadata
- 🎨 Pillar-based filtering for strategic domain focus
- 📊 Risk-level visualization (Critical, High, Medium, Low)
- 💼 Executive summaries with business impact statements
- 🏷️ Framework mappings to all 32 compliance standards
- 🆕 v3.0 highlights including CP.1-CP.10 Cross-Pillar controls
- 📱 Responsive design optimized for all devices
A single AI SAFE² v3.0 implementation satisfies the requirements of all 32 frameworks simultaneously, eliminating the need for fragmented governance initiatives.
| Standard | Coverage | Key Mapping |
|---|---|---|
| NIST AI RMF 1.0 / 2.0 | 100% | GOVERN: CP.3, CP.4, CP.8 / MAP: A2.3, A2.4 / MEASURE: M4.x, E5.1 / MANAGE: F3.x |
| ISO/IEC 42001:2023 | 100% | Sec 8.1: P1 / Sec 8.2: P2 / Sec 8.3: P4 / Sec 8.4: P5 / Sec 9: CP.6 |
| OWASP AIVSS v0.8 | 100% (NEW) | All 10 core risks + AAF scoring formula integrated — first framework to do this |
| OWASP Top 10 LLM | 100% | LLM01-LLM10 all mapped including new agentic variants |
| OWASP Agentic Top 10 (ASI) | 100% (NEW) | ASI01-ASI10; CP.9 uniquely addresses ASI03 Identity Abuse; CP.10 addresses ASI09 |
| MITRE ATLAS (Oct 2025) | 100% | All 14 new agent-specific techniques fully mapped |
| MIT AI Risk Repository v4 | 100% | 7 domains, catastrophic risk pathways (CP.8), CBRN risks |
| Google SAIF | 97% | Exceeds SAIF in swarm security, NHI governance, and memory poisoning |
| CSA Agentic Control Plane | 85% | CP.4 covers identity, authorization, orchestration, and runtime trust |
| CSA Zero Trust for LLMs (NEW) | 90% | S1.3 micro-perimeter per agent, CP.4 policy-as-code, A2.5 output trace |
| MAESTRO (CSA 7-Layer) | 95% | Layers 1-7 fully covered via pillars and CP controls |
| Arcanum PI Taxonomy | 95% | Evasion techniques in P1.T1.2, indirect injection in P1.T1.10, cognitive layer S1.6 |
| AIDEFEND (7 Tactics) | 90% | Deceive tactic (CP.7), Evict (F3.5), Harden shift-left (S1.4) |
| AIID Agentic Incidents | 90% | CP.6 incident feedback loop; M4.8 platform monitoring |
| EU AI Act (2024) | Aligned | High-risk AI: CP.3 / GPAI: A2.3 / Transparency: A2.5 / Human oversight: CP.10 |
| International AI Safety Report 2026 (NEW) | Aligned | Catastrophic risk: CP.8 / Loss of control: F3.2-F3.5 / Evaluation: E5.1-E5.4 |
| CSETv1 Harm | 92% | All 8 harm types including physical safety, financial loss, and democratic norms |
| Standard | Coverage | Key Mapping |
|---|---|---|
| HIPAA | Aligned+ | P1.T1.5 PHI masking / P3.T6 disaster recovery §164.308 / S1.5 cross-session PHI |
| PCI-DSS v4.0 | Aligned+ | P1.T1.5 PAN masking / P1.T2 network segmentation Req 1.3 / M4.8 cloud AI Req 6.4 |
| SOC 2 Type II | Aligned+ | CC.6.1-6.6: P1.T2, CP.4 / CC.7.x: P4, M4.x / C.1: S1.5 / CC.7.4: CP.10 HEAR |
| ISO 27001:2022 | Aligned+ | A.5.15 access: P1.T2 / A.8.8 vuln mgmt: M4.8 / A.12.4 logging: A2.5 |
| NIST CSF 2.0 | Aligned+ | GOVERN: CP.x / IDENTIFY: P2 / PROTECT: P1 / DETECT: P4 / RESPOND: P3 + CP.6 |
| NIST SP 800-53 Rev 5 | Aligned+ | AC: P1.T2, CP.4 / AU: P2.T3, A2.5 / IR: CP.6, F3.x / RA: CP.2, CP.3 |
| FedRAMP | Aligned+ | High baseline: full ACT-3/ACT-4 controls / S1.7 for no-code interconnections |
| CMMC 2.0 | Aligned+ | Level 1: P1-P2 / Level 2: P1-P5 + CP.3-CP.4 / Level 3: E5.x + CP.8 |
| CIS Controls v8 | Aligned+ | CIS-1: A2.4 / CIS-3: S1.5 / CIS-6: CP.4 / CIS-8: A2.5 / CIS-17: CP.6 |
| GDPR | Aligned+ | Art.22 automated decisions: E5.2 + P4.T7 / Art.25 design: S1.5 / Art.33: CP.6 |
| CCPA / CPRA | Aligned+ | P1.T1.5 PII in AI inputs / S1.5 cross-session memory / M4.6 decision bias |
| SEC Cyber Disclosure | Aligned+ | Material incident: CP.6 IICR / Board accountability: CP.3, CP.4, CP.10 |
| DORA | Aligned+ | ICT risk: CP.2 / Incident reporting: CP.6 / Resilience testing: E5.1 |
| CVE / CVSS | Integrated | Combined Risk Score: CVSS + (100 - Pillar Score) / 10 + (AAF / 10) |
| Zero Trust | Native | Built on "Never Trust, Always Verify" for Non-Human Identities |
- OWASP AIVSS v0.8: AI SAFE² v3.0 is the first framework to integrate all 10 core agentic risks and the AAF amplification factor into a composite GRC risk formula.
- OWASP Agentic Top 10: CP.9 (Agent Replication Governance) and CP.10 (HEAR Doctrine) address ASI03 and ASI09 — controls no other framework currently provides.
- CVE/CVSS Integration: Unlike static frameworks, AI SAFE² uses technical vulnerability scores adjusted for agentic deployment context. A CVSS 7.5 in an ACT-4 orchestrator with high AAF is a materially different risk than CVSS 7.5 in an ACT-1 read-only agent.
- Foundational Security: ISO 27001 and NIST CSF are treated as the general security foundation, with the AI-specific SAFE² pillars mapping directly into standard enterprise operations.
| Feature / Capability | AI SAFE² v3.0 (The OS) | Legacy GRC | AI Point Tools |
|---|---|---|---|
| Universal Mapping | ✅ 32 frameworks, one implementation | ❌ No compliance evidence | |
| Agentic Awareness | ✅ Native: swarms, loops, orchestration | ❌ Treats AI as generic software | |
| Agent Replication Governance | ✅ CP.9 — first in any framework | ❌ Not defined | ❌ Not defined |
| Named Kill-Switch Authority | ✅ CP.10 HEAR Doctrine | ❌ No individual accountability | ❌ No process defined |
| AIVSS Scoring Integrated | ✅ AAF in risk formula — first | ❌ None | ❌ None |
| Active Deception Defense | ✅ CP.7 canary tokens + honeypots | ❌ None | ❌ None |
| No-Code Platform Security | ✅ S1.7 — first, CVE-2026-25049 covered | ❌ None | ❌ None |
| Non-Human Identity | ✅ First-class citizen with lifecycle | ❌ Human SSO only | |
| Memory & RAG Governance | ✅ Full lifecycle controls | ❌ Zero coverage | |
| Implementation | ✅ 60 minutes with Toolkit | ❌ 6-12 months | ❌ Code integration first |
The Verdict: You can keep looking for a tool that catches up to AI SAFE², or you can adopt the standard that defined the race.
This repository contains the definitions (the "What"). To operationalize this in an enterprise (the "How"), use the Implementation Toolkit.
| Asset | Description | Access |
|---|---|---|
| Framework Taxonomy | Full Markdown definitions of all 151 controls across 5 pillars + 10 cross-pillar governance controls (CP.1-CP.10) | ✅ Free (This Repo) |
| 161-Point Audit Scorecard | Excel calculator with auto-calculated risk scores including the v3.0 AAF formula | 🔒 Get Toolkit |
| Enterprise Governance Policy | Word template with ACT tier assignments, HEAR designation, and CP.9 replication language | 🔒 Get Toolkit |
| AI SAFE² v3.0 Framework Document | Complete framework with all 161 controls, cross-pillar governance, and 32-framework crosswalk | 🔒 Get Toolkit |
| Vendor Risk Questionnaire | Updated for v3.0 protocol-layer supply chain assessment (CP.5) | 🔒 Get Toolkit |
| 30-Day Implementation Roadmap | Week-by-week path from greenfield or v2.1 to full v3.0 compliance | 🔒 Get Toolkit |
| Risk Command Center Dashboard | Interactive v3.0 scorecard with ACT tier visualization and board-ready exports | 🔒 Get Toolkit |
Consultants charge $5,000-$15,000 for equivalent implementation work. One time. $97.
AI SAFE² is a living standard that adapts to the threat landscape.
| Version | Focus | Key Additions | Controls |
|---|---|---|---|
| v3.0 | Swarm Governance + Production Evidence | 23 new pillar controls, 10 cross-pillar governance controls (CP.1-CP.10), AIVSS scoring integration, HEAR Doctrine, Agent Replication Governance | 161 |
| v2.1 | Agentic & Distributed | NHI governance, swarm controls, memory vaccine, OpenSSF OMS | 128 |
| v2.0 | Enterprise Operations | NIST/ISO mapping | 99 |
| v1.0 | Foundational Concepts | 10 core topics | 10 |
👉 Read the Full Evolution History & Changelog
/
├── .github/ # CI/CD Workflows & Dependabot Config
├── 00-cross-pillar/ # Governance OS: CP.1-CP.10 (ACT Tiers, HEAR Doctrine, Replication)
├── 01-sanitize-isolate/ # Pillar 1: Input Filters & Boundaries
├── 02-audit-inventory/ # Pillar 2: Logging & Asset Tracking
├── 03-fail-safe-recovery/ # Pillar 3: Circuit Breakers & Kill Switches
├── 04-engage-monitor/ # Pillar 4: Human-in-the-Loop
├── 05-evolve-educate/ # Pillar 5: Red Teaming & Updates
├── AISM/ # AI Security Management Layer: Governance, Control Mapping, Operational Oversight
├── FORGE-Act/ # The American Marshall Plan for AI economic engine in all 435 congressional districts
├── assets/ # Visual Maps, Badges & Diagrams
├── config/ # Security Configurations (default.yaml)
├── examples/ # 🧪 Real-world usage examples
├── gateway/ # 🛡️ The AI SAFE² Gateway (Runtime Enforcement Layer)
├── guides/ # 📚 Implementation Guides (Python & No-Code)
├── research/ # 🧠 Threat Intelligence & Deep Dive Evidence (001-014)
├── resources/ # Community Tools & Checklists
├── scanner/ # 🕵️ The Audit Scanner CLI (Assessment Engine)
├── ADVANCED_AGENT_THREATS.md # Guide: Swarm & RAG Vulnerabilities
├── Dockerfile # Gateway Build Instruction
├── INTEGRATIONS.md # 🔌 Ecosystem Map (Cursor, n8n, CI/CD)
├── QUICKSTART_5_MIN.md # ⚡ START HERE: 5-Minute Audit
├── docker-compose.yml # Container Orchestration
├── pyproject.toml # Python Dependencies
├── README.md # The Universal GRC Standard (You are here)
└── skill.md # 🧠 The Brain (Context for AI Agents/IDEs)
This isn't just a repo — it's a mission. We recognize and reward the top 1% of security engineers who contribute to the standard.
- ⭐ Star the Repo: Unlock the "Supporter" role
- 💡 Contribute: Submit a PR to earn "Contributor" status
- 🏆 The Vanguard: Earn Priority Beta Access to Agentic Shield (SaaS) by helping harden the framework
Read the Vanguard Program Details
AI SAFE² secures the AI system. It does not secure the human operating it.
An operator who has experienced sufficient cognitive offloading or decision automation capture can be fully compromised — regardless of how well-hardened the AI infrastructure is. That gap has a companion framework.
| AI SAFE² | CSF | |
|---|---|---|
| Layer | Machine | Human |
| Defends | The AI system | The human operator |
| Prevents | Prompt injection, data leakage, unsafe autonomy | Cognitive offloading, attention capture, decision automation capture |
| Ensures | AI stays in its lane | The human stays capable of defining the lane |
→ CSF Learning Hub → Threat Explorer → Full Repository
@misc{aisafe2_framework,
title = {AI SAFE² Framework v3.0: The Universal GRC Standard for Agentic AI},
author = {Sullivan, Vincent and {Cyber Strategy Institute}},
year = {2026},
publisher = {Cyber Strategy Institute},
url = {https://github.com/CyberStrategyInstitute/ai-safe2-framework},
note = {Version 3.0. Swarm Governance and Production Evidence Edition. 161 Controls, 32 Frameworks.}
}
Code (MIT License): Applies to MCP Server scripts, JSON schemas, HTML dashboards, and code snippets. Use commercially, modify freely, close-source your modifications.
Framework/Docs (CC-BY-SA 4.0): Applies to the AI SAFE² methodology text, pillar definitions, and PDF manuals. Share with attribution; public derivatives must share back under this same license.
Copyright © 2025-2026. All Rights Reserved.
