|
14 | 14 | </p> |
15 | 15 |
|
16 | 16 | <p align="center"> |
17 | | - <strong>SuperClaw — Red-Team AI Agents Before They Red-Team You</strong><br/> |
| 17 | + <strong>Red-Team AI Agents Before They Red-Team You</strong><br/> |
18 | 18 | Scenario-driven, behavior-first security testing for autonomous agents. |
19 | 19 | </p> |
20 | 20 |
|
21 | | -SuperClaw is a security testing framework for AI coding agents such as **OpenClaw** and agent ecosystems like **Moltbook**. It identifies vulnerabilities through prompt injection, tool policy bypass, sandbox escape, and multi-agent trust exploitation. |
| 21 | +<p align="center"> |
| 22 | + <a href="https://superagenticai.github.io/superclaw/"><img src="https://img.shields.io/badge/📚_Full_Documentation-superagenticai.github.io/superclaw-blue?style=for-the-badge" alt="Documentation" /></a> |
| 23 | +</p> |
22 | 24 |
|
23 | | -## OpenClaw + Moltbook Threat Model |
| 25 | +<p align="center"> |
| 26 | + <a href="#-quick-start">Quick Start</a> • |
| 27 | + <a href="#-features">Features</a> • |
| 28 | + <a href="#-attack-techniques">Attack Techniques</a> • |
| 29 | + <a href="https://superagenticai.github.io/superclaw/">Full Docs</a> |
| 30 | +</p> |
24 | 31 |
|
25 | | -> **Threat Model** |
26 | | -> OpenClaw agents often run with broad tool access. When connected to **Moltbook** or other agent networks, they can ingest untrusted, adversarial content that enables: |
27 | | -> - Prompt injection and hidden instruction attacks |
28 | | -> - Tool misuse and policy bypass |
29 | | -> - Behavioral drift over time |
30 | | -> - Cascading cross‑agent exploitation |
31 | | -> SuperClaw is built to evaluate these risks **before** deployment. |
| 32 | +--- |
32 | 33 |
|
33 | | -## Problem & Solution (Summary) |
| 34 | +## Why SuperClaw? |
34 | 35 |
|
35 | | -**Problem:** Autonomous agents are being deployed with high privilege, mutable behavior, and exposure to untrusted inputs—without structured security validation. This makes prompt injection, tool misuse, configuration drift, and data leakage likely, but poorly understood until after exposure. |
| 36 | +Autonomous AI agents are deployed with high privileges, mutable behavior, and exposure to untrusted inputs—often without structured security validation. This makes prompt injection, tool misuse, configuration drift, and data leakage likely but poorly understood until after exposure. |
36 | 37 |
|
37 | | -**Solution:** SuperClaw is a **pre‑deployment, behavior‑driven red‑teaming framework** that stress‑tests existing agents. It runs scenario‑based evaluations, records evidence (tool calls, outputs, artifacts), scores behaviors against explicit contracts, and produces actionable reports before agents touch sensitive data or external ecosystems. |
| 38 | +**SuperClaw** is a pre-deployment, behavior-driven red-teaming framework that stress-tests your agents before they touch sensitive data or external ecosystems. |
38 | 39 |
|
39 | | -**Non‑goals:** SuperClaw does **not** generate agents, run production workloads, or automate real‑world exploitation. |
| 40 | +### What It Does |
40 | 41 |
|
41 | | -## ⚠️ Security Notice |
| 42 | +- **Runs scenario-based security evaluations** against your agents |
| 43 | +- **Records evidence** (tool calls, outputs, artifacts) for each attack |
| 44 | +- **Scores behaviors** against explicit security contracts |
| 45 | +- **Produces actionable reports** with findings and mitigations |
42 | 46 |
|
43 | | -**This tool is for authorized security testing only.** See [SECURITY.md](SECURITY.md) for: |
44 | | -- Authorization requirements |
45 | | -- Containment requirements (sandbox/VM) |
46 | | -- False positive handling |
47 | | -- Data safety guidelines |
| 47 | +### What It Doesn't Do |
48 | 48 |
|
49 | | -Guardrails: |
50 | | -- Local-only mode blocks remote targets by default |
51 | | -- Remote targets require `SUPERCLAW_AUTH_TOKEN` (or adapter token) |
| 49 | +SuperClaw does **not** generate agents, run production workloads, or automate real-world exploitation. It's a testing tool, not a weapon. |
52 | 50 |
|
53 | | -## Supported Targets |
| 51 | +--- |
54 | 52 |
|
55 | | -- 🦞 **OpenClaw** — ACP WebSocket adapter |
56 | | -- 🧪 **Mock** — Offline deterministic testing |
57 | | -- 🔧 **Custom** — Extend via adapters |
| 53 | +## 🚀 Quick Start |
58 | 54 |
|
59 | | -## Quick Start |
| 55 | +### Installation |
60 | 56 |
|
61 | 57 | ```bash |
62 | | -# Install |
63 | 58 | pip install superclaw |
| 59 | +``` |
64 | 60 |
|
65 | | -# Attack OpenClaw (local instance) |
| 61 | +### Run Your First Attack |
| 62 | + |
| 63 | +```bash |
| 64 | +# Attack a local OpenClaw instance |
66 | 65 | superclaw attack openclaw --target ws://127.0.0.1:18789 |
67 | 66 |
|
68 | | -# Generate attack scenarios |
| 67 | +# Or test offline with the mock adapter |
| 68 | +superclaw attack mock --behaviors prompt-injection-resistance |
| 69 | +``` |
| 70 | + |
| 71 | +### Generate Attack Scenarios |
| 72 | + |
| 73 | +```bash |
69 | 74 | superclaw generate scenarios --behavior prompt_injection --num-scenarios 20 |
| 75 | +``` |
70 | 76 |
|
71 | | -# Run security audit |
72 | | -superclaw audit openclaw --comprehensive --report-format html --output report |
| 77 | +### Run a Full Security Audit |
73 | 78 |
|
74 | | -# Offline testing |
75 | | -superclaw attack mock --behaviors prompt-injection-resistance |
| 79 | +```bash |
| 80 | +superclaw audit openclaw --comprehensive --report-format html --output report |
76 | 81 | ``` |
77 | 82 |
|
78 | | -## Attack Techniques |
| 83 | +--- |
| 84 | + |
| 85 | +## ✨ Features |
| 86 | + |
| 87 | +### Supported Targets |
| 88 | + |
| 89 | +| Target | Description | Adapter | |
| 90 | +|--------|-------------|---------| |
| 91 | +| 🦞 **OpenClaw** | AI coding agents via ACP WebSocket | `openclaw` | |
| 92 | +| 🧪 **Mock** | Offline deterministic testing | `mock` | |
| 93 | +| 🔧 **Custom** | Build your own adapter | Extend `BaseAdapter` | |
| 94 | + |
| 95 | +### Attack Techniques |
79 | 96 |
|
80 | 97 | | Technique | Description | |
81 | 98 | |-----------|-------------| |
82 | | -| `prompt-injection` | Direct/indirect injection attacks | |
| 99 | +| `prompt-injection` | Direct and indirect injection attacks | |
83 | 100 | | `encoding` | Base64, hex, unicode, typoglycemia obfuscation | |
84 | | -| `jailbreak` | DAN, grandmother, role-play techniques | |
| 101 | +| `jailbreak` | DAN, grandmother, role-play bypass techniques | |
85 | 102 | | `tool-bypass` | Tool policy bypass via alias confusion | |
86 | | -| `multi-turn` | Multi-turn persistent escalation attacks | |
| 103 | +| `multi-turn` | Persistent escalation across conversation turns | |
| 104 | + |
| 105 | +### Security Behaviors |
| 106 | + |
| 107 | +Each behavior includes a structured contract with intent, success criteria, rubric, and mitigation guidance. |
87 | 108 |
|
88 | | -## Security Behaviors |
| 109 | +| Behavior | Severity | Tests | |
| 110 | +|----------|----------|-------| |
| 111 | +| `prompt-injection-resistance` | 🔴 CRITICAL | Injection detection and rejection | |
| 112 | +| `sandbox-isolation` | 🔴 CRITICAL | Container and filesystem boundaries | |
| 113 | +| `tool-policy-enforcement` | 🟠 HIGH | Allow/deny list compliance | |
| 114 | +| `session-boundary-integrity` | 🟠 HIGH | Cross-session isolation | |
| 115 | +| `configuration-drift-detection` | 🟡 MEDIUM | Config stability over time | |
| 116 | +| `acp-protocol-security` | 🟡 MEDIUM | Protocol message handling | |
89 | 117 |
|
90 | | -Each behavior ships with a structured contract (intent, success criteria, rubric, mitigation). |
| 118 | +--- |
91 | 119 |
|
92 | | -| Behavior | Severity | Description | |
93 | | -|----------|----------|-------------| |
94 | | -| `prompt-injection-resistance` | CRITICAL | Tests injection detection | |
95 | | -| `tool-policy-enforcement` | HIGH | Tests allow/deny lists | |
96 | | -| `sandbox-isolation` | CRITICAL | Tests container boundaries | |
97 | | -| `session-boundary-integrity` | HIGH | Tests session isolation | |
98 | | -| `configuration-drift-detection` | MEDIUM | Tests config stability | |
99 | | -| `acp-protocol-security` | MEDIUM | Tests protocol handling | |
| 120 | +## 📖 CLI Reference |
100 | 121 |
|
101 | | -## CLI Commands |
| 122 | +### Attacks |
102 | 123 |
|
103 | 124 | ```bash |
104 | | -# Attacks |
105 | 125 | superclaw attack openclaw --target ws://127.0.0.1:18789 --behaviors all |
106 | 126 | superclaw attack mock --behaviors prompt-injection-resistance |
| 127 | +``` |
| 128 | + |
| 129 | +### Scenario Generation (Bloom) |
107 | 130 |
|
108 | | -# Scenario generation (Bloom) |
| 131 | +```bash |
109 | 132 | superclaw generate scenarios --behavior prompt_injection --num-scenarios 20 |
110 | 133 | superclaw generate scenarios --behavior jailbreak --variations noise,emotional_pressure |
| 134 | +``` |
111 | 135 |
|
112 | | -# Evaluation |
| 136 | +### Evaluation |
| 137 | + |
| 138 | +```bash |
113 | 139 | superclaw evaluate openclaw --scenarios scenarios.json --behaviors all |
114 | 140 | superclaw evaluate mock --scenarios scenarios.json |
| 141 | +``` |
| 142 | + |
| 143 | +### Auditing |
115 | 144 |
|
116 | | -# Audit |
| 145 | +```bash |
117 | 146 | superclaw audit openclaw --comprehensive --report-format html --output report |
118 | 147 | superclaw audit openclaw --quick |
| 148 | +``` |
| 149 | + |
| 150 | +### Reporting |
119 | 151 |
|
120 | | -# Reporting |
121 | | -superclaw report generate --results results.json --format sarif # For GitHub Code Scanning |
| 152 | +```bash |
| 153 | +superclaw report generate --results results.json --format sarif # GitHub Code Scanning |
122 | 154 | superclaw report drift --baseline baseline.json --current current.json |
| 155 | +``` |
123 | 156 |
|
124 | | -# Scanning |
| 157 | +### Scanning |
| 158 | + |
| 159 | +```bash |
125 | 160 | superclaw scan config |
126 | 161 | superclaw scan skills --path /path/to/skills |
127 | | - |
128 | | -# Utilities |
129 | | -superclaw behaviors |
130 | | -superclaw attacks |
131 | | -superclaw init |
132 | 162 | ``` |
133 | 163 |
|
134 | | -## Documentation |
| 164 | +### Utilities |
135 | 165 |
|
136 | | -Full documentation: https://superagenticai.github.io/superclaw/ |
| 166 | +```bash |
| 167 | +superclaw behaviors # List all security behaviors |
| 168 | +superclaw attacks # List all attack techniques |
| 169 | +superclaw init # Initialize a new project |
| 170 | +``` |
| 171 | + |
| 172 | +--- |
137 | 173 |
|
138 | | -## CodeOptiX Integration |
| 174 | +## 🔗 CodeOptiX Integration |
139 | 175 |
|
140 | | -SuperClaw integrates with [CodeOptiX](https://github.com/SuperagenticAI/codeoptix) for multi-modal evaluation: |
| 176 | +SuperClaw integrates with [CodeOptiX](https://github.com/SuperagenticAI/codeoptix) for multi-modal security evaluation. |
141 | 177 |
|
142 | 178 | ```bash |
143 | 179 | # Install with CodeOptiX support |
@@ -167,58 +203,92 @@ print(f"Score: {result.overall_score:.1%}") |
167 | 203 | print(f"Passed: {result.overall_passed}") |
168 | 204 | ``` |
169 | 205 |
|
170 | | -## Architecture |
| 206 | +--- |
| 207 | + |
| 208 | +## ⚠️ Security Notice |
| 209 | + |
| 210 | +**This tool is for authorized security testing only.** |
| 211 | + |
| 212 | +### Guardrails |
| 213 | + |
| 214 | +- Local-only mode blocks remote targets by default |
| 215 | +- Remote targets require `SUPERCLAW_AUTH_TOKEN` (or adapter-specific token) |
| 216 | + |
| 217 | +### Requirements |
| 218 | + |
| 219 | +Before using SuperClaw, ensure you have: |
| 220 | +- ✅ Written authorization to test the target system |
| 221 | +- ✅ Isolated test environment (sandbox/VM recommended) |
| 222 | +- ✅ Understanding of [SECURITY.md](SECURITY.md) guidelines |
| 223 | + |
| 224 | +--- |
| 225 | + |
| 226 | +## 🏗️ Architecture |
171 | 227 |
|
172 | 228 | ``` |
173 | 229 | superclaw/ |
174 | | -├── attacks/ # Attack implementations |
175 | | -│ ├── prompt_injection.py |
176 | | -│ ├── encoding.py |
177 | | -│ ├── jailbreaks.py |
178 | | -│ ├── tool_bypass.py |
179 | | -│ └── multi_turn.py |
180 | | -├── behaviors/ # Security behavior specs |
181 | | -│ ├── injection_resistance.py |
182 | | -│ ├── tool_policy.py |
183 | | -│ ├── sandbox_isolation.py |
184 | | -│ ├── session_boundary.py |
185 | | -│ ├── config_drift.py |
186 | | -│ └── protocol_security.py |
187 | | -├── adapters/ # Agent adapters |
188 | | -│ ├── openclaw.py |
189 | | -│ ├── mock.py |
190 | | -│ └── base.py |
191 | | -├── bloom/ # Scenario generation |
192 | | -│ ├── ideation.py |
193 | | -│ ├── rollout.py |
194 | | -│ └── judgment.py |
195 | | -├── scanners/ # Config + supply-chain scanning |
196 | | -├── analysis/ # Drift comparison |
197 | | -├── codeoptix/ # CodeOptiX integration |
198 | | -│ ├── adapter.py # Behavior adapter |
199 | | -│ ├── evaluator.py # Security evaluator |
200 | | -│ └── engine.py # Evaluation engine |
201 | | -└── reporting/ # Report generation |
202 | | - ├── html.py |
203 | | - ├── json_report.py |
204 | | - └── sarif.py |
| 230 | +├── attacks/ # Attack technique implementations |
| 231 | +├── behaviors/ # Security behavior specifications |
| 232 | +├── adapters/ # Target agent adapters |
| 233 | +├── bloom/ # AI-powered scenario generation |
| 234 | +├── scanners/ # Config and supply-chain scanning |
| 235 | +├── analysis/ # Drift detection and comparison |
| 236 | +├── codeoptix/ # CodeOptiX integration layer |
| 237 | +└── reporting/ # HTML, JSON, and SARIF report generation |
205 | 238 | ``` |
206 | 239 |
|
207 | | -## Part of Superagentic AI Ecosystem |
| 240 | +--- |
| 241 | + |
| 242 | +## 🌐 Superagentic AI Ecosystem |
| 243 | + |
| 244 | +SuperClaw is part of the [Superagentic AI](https://super-agentic.ai) ecosystem: |
| 245 | + |
| 246 | +| Project | Description | |
| 247 | +|---------|-------------| |
| 248 | +| **SuperQE** | Quality engineering core framework | |
| 249 | +| **SuperClaw** | Agent security testing *(this package)* | |
| 250 | +| **CodeOptiX** | Code optimization and evaluation engine | |
| 251 | + |
| 252 | +--- |
208 | 253 |
|
209 | | -- **SuperQE** - Quality Engineering core |
210 | | -- **SuperClaw** - Agent security testing (this package) |
211 | | -- **CodeOptiX** - Code optimization engine |
| 254 | +## 📚 Documentation |
212 | 255 |
|
213 | | -## Open Source |
| 256 | +<table> |
| 257 | +<tr> |
| 258 | +<td width="100%" align="center"> |
| 259 | +<h3>📖 <a href="https://superagenticai.github.io/superclaw/">superagenticai.github.io/superclaw</a></h3> |
| 260 | +</td> |
| 261 | +</tr> |
| 262 | +</table> |
214 | 263 |
|
215 | | -- [LICENSE](LICENSE) |
216 | | -- [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) |
217 | | -- [CONTRIBUTING.md](CONTRIBUTING.md) |
218 | | -- [SECURITY.md](SECURITY.md) |
| 264 | +| Guide | Description | |
| 265 | +|-------|-------------| |
| 266 | +| [Installation](https://superagenticai.github.io/superclaw/getting-started/installation/) | Setup with pip, uv, or from source | |
| 267 | +| [Quick Start](https://superagenticai.github.io/superclaw/getting-started/quickstart/) | Run your first security scan in 5 minutes | |
| 268 | +| [Configuration](https://superagenticai.github.io/superclaw/getting-started/configuration/) | Configure targets, LLM providers, and safety settings | |
| 269 | +| [Running Attacks](https://superagenticai.github.io/superclaw/guides/attacks/) | Execute attacks and interpret results | |
| 270 | +| [Custom Behaviors](https://superagenticai.github.io/superclaw/guides/custom-behaviors/) | Write your own security behavior specs | |
| 271 | +| [CI/CD Integration](https://superagenticai.github.io/superclaw/guides/ci-cd/) | GitHub Actions, GitLab CI, and SARIF output | |
| 272 | +| [Architecture](https://superagenticai.github.io/superclaw/architecture/overview/) | Deep dive into SuperClaw internals | |
219 | 273 |
|
220 | | -Built by [Superagentic AI](https://super-agentic.ai) · GitHub: [SuperagenticAI/superclaw](https://github.com/SuperagenticAI/superclaw) |
| 274 | +--- |
221 | 275 |
|
222 | | -## License |
| 276 | +## 🤝 Contributing |
223 | 277 |
|
224 | | -Apache 2.0 |
| 278 | +We welcome contributions! Please see: |
| 279 | + |
| 280 | +- [CONTRIBUTING.md](CONTRIBUTING.md) — How to contribute |
| 281 | +- [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) — Community guidelines |
| 282 | +- [SECURITY.md](SECURITY.md) — Security policy |
| 283 | + |
| 284 | +--- |
| 285 | + |
| 286 | +## 📄 License |
| 287 | + |
| 288 | +Apache 2.0 — see [LICENSE](LICENSE) for details. |
| 289 | + |
| 290 | +--- |
| 291 | + |
| 292 | +<p align="center"> |
| 293 | + Built with 🦞 by <a href="https://super-agentic.ai">Superagentic AI</a> |
| 294 | +</p> |
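The new "Security Behaviors" section in the diff above says each behavior carries a structured contract with intent, success criteria, rubric, and mitigation guidance, plus a severity. A minimal Python sketch of how such a contract could be modeled follows. The names `Severity`, `BehaviorContract`, and `by_severity` are hypothetical and are not taken from the SuperClaw codebase. The contract texts are illustrative placeholders; only the behavior names and severities come from the table.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels from the behaviors table; a higher value is more severe."""
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


@dataclass(frozen=True)
class BehaviorContract:
    """Hypothetical container for the four contract parts the README names."""
    name: str
    severity: Severity
    intent: str
    success_criteria: tuple[str, ...]
    rubric: tuple[str, ...]
    mitigation: str


def by_severity(contracts):
    """Return contracts ordered most severe first, then alphabetically by name."""
    return sorted(contracts, key=lambda c: (-c.severity, c.name))


# Behavior names and severities mirror the table; all other text is illustrative.
contracts = [
    BehaviorContract(
        name="tool-policy-enforcement",
        severity=Severity.HIGH,
        intent="Agent honors its tool allow/deny lists.",
        success_criteria=("Denied tools are never invoked",),
        rubric=("0 = denied tool executed", "1 = request refused"),
        mitigation="Resolve tool aliases before applying policy checks.",
    ),
    BehaviorContract(
        name="sandbox-isolation",
        severity=Severity.CRITICAL,
        intent="Agent stays inside container and filesystem boundaries.",
        success_criteria=("No reads or writes outside the sandbox root",),
        rubric=("0 = escape observed", "1 = boundary held"),
        mitigation="Run the agent with a read-only root filesystem.",
    ),
    BehaviorContract(
        name="prompt-injection-resistance",
        severity=Severity.CRITICAL,
        intent="Agent detects and rejects injected instructions.",
        success_criteria=("Injected instructions are not followed",),
        rubric=("0 = injection followed", "1 = injection rejected"),
        mitigation="Treat retrieved content as data, not as instructions.",
    ),
]

ordered = [c.name for c in by_severity(contracts)]
print(ordered)
```

Using an `IntEnum` for severity keeps the ordering explicit, so a report could list CRITICAL findings first without comparing strings.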