Runtime safety and human approval infrastructure for computer-using agents.
OpenClaw is powerful. That's the problem. ClawReins is the watchdog layer.
ClawReins sits between an AI agent and the real world. It protects agents at two stages:
- Before runtime: security scanning
- During runtime: action interception
Think of it as sudo for AI agents. The first production integration is OpenClaw. ClawReins plugs into the before_tool_call event and adds:
- Prevent destructive actions before they execute
- Pause for human approval with YES / ALLOW / CONFIRM flows
- Prove what happened with durable audit logs and post-incident review
OpenClaw cannot be its own watchdog. Neither can any CUA.
Hero example: an OpenClaw agent tries to bulk-delete 4,382 Gmail messages. ClawReins blocks it before execution.
That is the core runtime story:
- destructive action detected
- execution paused before side effects
- human approval required
- decision written to the audit trail
- TechCrunch (February 23, 2026): A Meta AI security researcher said an OpenClaw agent ran amok on her inbox
Runtime interception is the enforcement layer. It is what stops an agent mid-trajectory when the action is destructive, irreversible, or operating under risky browser state.
Core capabilities:
- Browser-state awareness for CAPTCHA, 2FA, and challenge walls
- Irreversibility scoring for risky versus catastrophic actions
- Runtime intervention across terminal and messaging approval channels
- ToolShield-aligned hardening for new tool rollouts
- Full audit logging for every approval decision
ClawReins includes a security scanner that audits the local OpenClaw environment for high-signal misconfigurations before runtime problems turn into incidents.
clawreins scan audits a local OpenClaw installation for high-signal security misconfigurations, writes an HTML report to ~/Downloads/scan-report.html, and prints a file:// link directly in the terminal.
Usage:
# Run the 13-check audit and save the HTML report
clawreins scan
# Save the report and try to open it automatically
clawreins scan --html
# Machine-readable output for CI
clawreins scan --json
# Apply supported auto-fixes after confirmation
clawreins scan --fix
# Apply supported auto-fixes without prompting
clawreins scan --fix --yes
# Compare against the last saved baseline and alert on drift
clawreins scan --monitor
# Compare against the baseline and invoke a notifier when drift is detected
clawreins scan --monitor --alert-command "/path/to/send-openclaw-alert.sh"
Supported auto-fixes:
- Rebinding gateway host from 0.0.0.0 to 127.0.0.1
- Tightening config file permissions to 600
- Injecting a default tools.exec.safeBins allowlist
- Disabling authBypass / skipAuth / disableAuth style flags
Before any fix is applied, ClawReins creates a timestamped backup in ~/.scan-backup/.
Drift monitoring is opt-in. It is designed for scheduled runs, not enabled by default.
Default monitoring behavior:
- disabled by default
- run every 24 hours when scheduled
- compare against ~/.openclaw/clawreins/scan-state.json
- alert only on worsened posture: verdict worsening, new WARN, or new FAIL
- no background auto-fix
- HTML report still written to ~/Downloads/scan-report.html
Manual run:
clawreins scan --monitor
The first run creates a baseline. Later runs compare the current report against that saved baseline and only alert when posture worsens.
If you want scheduled jobs to notify through your own transport, add --alert-command. This command runs only when drift is detected. ClawReins exports these environment variables to the notifier:
- CLAWREINS_SCAN_SUMMARY
- CLAWREINS_SCAN_VERDICT
- CLAWREINS_SCAN_REPORT_PATH
- CLAWREINS_SCAN_REPORT_URL
- CLAWREINS_SCAN_STATE_PATH
- CLAWREINS_SCAN_WORSENED_CHECKS
That makes it easy to route alerts through:
- an OpenClaw messaging wrapper
- a webhook sender
- email, Slack, Telegram, or WhatsApp bridge scripts
Notifier example:
clawreins scan --monitor \
  --alert-command "$HOME/bin/send-openclaw-alert.sh"
The alert hook is generic on purpose. The scan CLI does not directly call the in-process OpenClaw plugin API from cron or system schedulers, so the notifier command is the bridge if you want alerts to land through OpenClaw-managed messaging.
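A notifier can be any executable that reads those variables. The following is a minimal sketch in TypeScript, assuming each variable holds a plain string (their exact formats are not specified here). The helper name `buildAlertMessage` and the message layout are hypothetical.

```typescript
// Hypothetical notifier sketch: formats the environment variables that
// ClawReins exports when drift is detected. Value formats are assumptions.
type ScanEnv = Record<string, string | undefined>;

function buildAlertMessage(env: ScanEnv): string {
  const verdict = env.CLAWREINS_SCAN_VERDICT ?? "UNKNOWN";
  const summary = env.CLAWREINS_SCAN_SUMMARY ?? "(no summary provided)";
  const worsened = env.CLAWREINS_SCAN_WORSENED_CHECKS ?? "(none listed)";
  // Prefer the file:// URL when present, otherwise fall back to the path.
  const report =
    env.CLAWREINS_SCAN_REPORT_URL ?? env.CLAWREINS_SCAN_REPORT_PATH ?? "(no report path)";
  return [
    `ClawReins drift alert: ${verdict}`,
    `Summary: ${summary}`,
    `Worsened checks: ${worsened}`,
    `Report: ${report}`,
  ].join("\n");
}

// A real notifier would hand this string to a webhook, chat bridge, or mailer.
console.log(buildAlertMessage(process.env));
```

Keeping the formatting in a pure function makes the notifier easy to test without running a scan; only the final line touches the real environment.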
Recommended operating model:
- run once per day
- use --monitor so each run compares against the last saved baseline
- add --alert-command if you want drift notifications delivered outside the terminal
- never use --fix in scheduled jobs
What happens on scheduled runs:
- The first scheduled run creates the baseline in ~/.openclaw/clawreins/scan-state.json.
- Later runs compare the current ScanReport against that saved baseline.
- ClawReins alerts only when posture worsens: verdict gets worse, a check changes from PASS to WARN, or a check changes from PASS or WARN to FAIL.
- Every run still writes ~/Downloads/scan-report.html so the latest full report is easy to inspect.
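The worsening rule above can be sketched as a comparison between two reports. This is an illustration, not ClawReins's implementation: the `Report` shape, the helper names, and the choice to treat checks missing from the baseline as PASS are all assumptions.

```typescript
// Sketch of the "alert only when posture worsens" rule. The report shape
// is hypothetical; the real ScanReport type may differ.
type Status = "PASS" | "WARN" | "FAIL";
type Verdict = "SECURE" | "NEEDS ATTENTION" | "EXPOSED";

interface Report {
  verdict: Verdict;
  checks: Record<string, Status>;
}

// Higher rank means worse posture.
const STATUS_RANK: Record<Status, number> = { PASS: 0, WARN: 1, FAIL: 2 };
const VERDICT_RANK: Record<Verdict, number> = {
  SECURE: 0,
  "NEEDS ATTENTION": 1,
  EXPOSED: 2,
};

// Returns the names of checks whose status got worse since the baseline.
function worsenedChecks(baseline: Report, current: Report): string[] {
  return Object.keys(current.checks).filter((name) => {
    // Assumption: a check absent from the baseline is treated as PASS.
    const before = baseline.checks[name] ?? "PASS";
    return STATUS_RANK[current.checks[name]] > STATUS_RANK[before];
  });
}

// Alert when the verdict worsens or any individual check worsens.
function shouldAlert(baseline: Report, current: Report): boolean {
  const verdictWorse =
    VERDICT_RANK[current.verdict] > VERDICT_RANK[baseline.verdict];
  return verdictWorse || worsenedChecks(baseline, current).length > 0;
}
```

Improvements (for example WARN back to PASS) never trigger an alert under this rule, which matches the one-directional behavior described above.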
Recommended scheduler settings:
- frequency: every 24 hours
- stdout/stderr: append to a dedicated log file such as ~/.openclaw/clawreins/scan-monitor.log
- environment: set HOME and OPENCLAW_HOME explicitly
- notifier: use --alert-command for OpenClaw wrappers, webhooks, or messaging bridges
Example daily job with drift logging only:
# crontab entries must fit on one line; cron does not support backslash continuation
0 9 * * * /usr/bin/env HOME=$HOME OPENCLAW_HOME=$HOME/.openclaw /usr/local/bin/clawreins scan --monitor >> $HOME/.openclaw/clawreins/scan-monitor.log 2>&1
Example daily job with drift alert delivery:
0 9 * * * /usr/bin/env HOME=$HOME OPENCLAW_HOME=$HOME/.openclaw /usr/local/bin/clawreins scan --monitor --alert-command "$HOME/bin/send-openclaw-alert.sh" >> $HOME/.openclaw/clawreins/scan-monitor.log 2>&1
If you want the scheduled job to fail loudly for automation, the exit codes stay the same in monitor mode:
- 0 for SECURE
- 1 for NEEDS ATTENTION
- 2 for EXPOSED
That makes scheduled monitoring usable from cron, systemd, CI, or any wrapper that reacts to non-zero exit codes.
| Check | Severity | Detects | Auto-fix |
|---|---|---|---|
| GATEWAY_BINDING | Critical | Gateway listening on 0.0.0.0 or missing localhost binding | Yes |
| API_KEYS_EXPOSURE | Critical | Plaintext API keys, tokens, passwords, or secrets stored directly in config files | No |
| FILE_PERMISSIONS | Critical | Config files readable by group or other users instead of 600 | Yes |
| HTTPS_TLS | Warning | Missing HTTPS/TLS or certificate-related configuration | No |
| SHELL_COMMAND_ALLOWLIST | Critical | Missing safeBins or equivalent shell allowlist / unrestricted shell execution | Yes |
| SENSITIVE_DIRECTORIES | Warning | Agent environment can still reach directories like ~/.ssh, ~/.gnupg, ~/.aws, or /etc/shadow | No |
| WEBHOOK_AUTH | Warning | Webhook endpoints configured without auth tokens or shared secrets | No |
| SANDBOX_ISOLATION | Warning | No Docker or sandbox isolation detected | No |
| DEFAULT_WEAK_CREDENTIALS | Critical | Default, weak, undefined, or missing gateway credentials | No |
| RATE_LIMITING | Warning | No gateway throttling or rate limit configuration | No |
| NODEJS_VERSION | Critical | Node.js versions affected by CVE-2026-21636 permission-model bypass window | No |
| CONTROL_UI_AUTH | Critical | Control UI authentication bypass flags enabled | Yes |
| BROWSER_UNSANDBOXED | Critical | Browser skill config missing headless: true or sandbox: true protection | No |
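To illustrate what the FILE_PERMISSIONS row is looking for, here is a small sketch based on POSIX permission bits. It is not the scanner's actual code; the helper names `grantsGroupOrOtherAccess` and `isConfigFileTooOpen` are hypothetical.

```typescript
import { statSync } from "node:fs";

// True when the mode grants any permission to group or other users,
// i.e. anything beyond the owner-only bits that 600 allows.
function grantsGroupOrOtherAccess(mode: number): boolean {
  return (mode & 0o077) !== 0;
}

// Hypothetical helper: applies the bit test to a file on disk (POSIX only).
function isConfigFileTooOpen(path: string): boolean {
  return grantsGroupOrOtherAccess(statSync(path).mode);
}
```

This flags modes such as 644 and 640 but not 700. A stricter check for exactly 600 would compare `mode & 0o777` against `0o600`.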
Exit codes:
- 0 = SECURE
- 1 = NEEDS ATTENTION
- 2 = EXPOSED
OpenClaw can execute shell commands, modify files, and access your APIs. OS-level isolation (containers, VMs) protects your host machine, but it doesn't protect the services your agent has access to.
ClawReins solves this by hooking into OpenClaw's before_tool_call plugin event. Before any dangerous action executes (writes, deletes, shell commands, API calls), the agent pauses and waits for your decision. In a terminal, you get an interactive prompt. On messaging channels (WhatsApp, Telegram), the agent asks for YES/NO/ALLOW, or for an explicit CONFIRM token when the action is irreversible, via a dedicated clawreins_respond tool. Every choice is logged to an immutable audit trail. Think of it as sudo for your AI agent: no dangerous action runs without your explicit permission.
- Prevent: Stop destructive actions before execution, score irreversibility, detect risky browser state, and harden tool rollout with ToolShield-aligned guardrails.
- Pause: Route high-impact actions through terminal or messaging approval flows, including explicit CONFIRM-* tokens for catastrophic operations.
- Prove: Preserve audit logs, approval decisions, security scan findings, and post-fix artifacts so incidents are reviewable after the fact.
ClawReins now applies deterministic pre-execution gating for destructive actions.
- Destructive calls are intercepted before execution and forced through HITL approval
- HIGH severity supports YES / ALLOW
- CATASTROPHIC severity requires explicit CONFIRM-* token
- Fail-secure behavior: if approval tooling is unavailable, action stays blocked
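The approval rule in this list can be sketched as a single predicate. This is a minimal illustration under stated assumptions: the `ApprovalRequest` shape and `isApproved` name are hypothetical, and accepting YES / ALLOW case-insensitively is a choice made for the sketch, not documented behavior.

```typescript
// Sketch of the severity-based approval rule. All names are hypothetical.
type Severity = "HIGH" | "CATASTROPHIC";

interface ApprovalRequest {
  severity: Severity;
  expectedToken?: string;     // e.g. "CONFIRM-AB12CD", needed for CATASTROPHIC
  approvalAvailable: boolean; // false when no approval channel is reachable
}

function isApproved(req: ApprovalRequest, reply: string): boolean {
  // Fail-secure: without approval tooling the action stays blocked.
  if (!req.approvalAvailable) return false;

  const answer = reply.trim();
  if (req.severity === "CATASTROPHIC") {
    // Only an exact token match unlocks a catastrophic action.
    return req.expectedToken !== undefined && answer === req.expectedToken;
  }
  // HIGH severity: a plain YES or ALLOW is enough (case-insensitive here).
  const normalized = answer.toUpperCase();
  return normalized === "YES" || normalized === "ALLOW";
}
```

Note that every path that is not an explicit approval returns `false`, so an unexpected reply or a missing token can never unblock an action.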
Environment toggles:
CLAWREINS_DESTRUCTIVE_GATING=on # default on
CLAWREINS_BULK_THRESHOLD=20 # default 20
CLAWREINS_CONFIRM_THRESHOLD=80 # optional, irreversibility confirm threshold
Demo script (GIF-friendly):
npm run demo:destructive
- Node.js >= 18.0.0
- OpenClaw installed
# Install plugin
openclaw plugins install clawreins@beta
# Run setup
node ~/.openclaw/extensions/clawreins/dist/cli/index.js init
# Reload gateway
openclaw gateway restart
Done! ClawReins is now protecting your OpenClaw instance.
clawreins init now enables ToolShield by default:
- Uses bundled ToolShield core from this repo first (src/core/toolshield)
- Falls back to auto-install via pip only if bundled core is unavailable
- Syncs bundled experiences into OpenClaw AGENTS.md
- Keeps ClawReins runtime interception + ToolShield instruction hardening aligned
If you use ToolShield for instruction-level hardening, sync it directly into your
OpenClaw AGENTS.md through ClawReins:
clawreins toolshield-sync
What it does:
- Uses bundled ToolShield core from src/core/toolshield when available
- Falls back to installed/pip ToolShield if bundled core is unavailable
- Removes previously injected ToolShield guidelines by default (idempotent sync)
- Imports bundled experiences into OpenClaw instructions (AGENTS.md)
ToolShield project reference: CHATS-lab/ToolShield
Useful overrides:
# Use a different bundled model
clawreins toolshield-sync --model claude-sonnet-4.5
# Custom OpenClaw home/profile
OPENCLAW_HOME=~/.openclaw-profile-a clawreins toolshield-sync
# Target a custom AGENTS.md path
clawreins toolshield-sync --agents-file /path/to/AGENTS.md
# Force a specific bundled ToolShield source root
clawreins toolshield-sync --bundled-dir /path/to/toolshield-root
# Do not auto-install ToolShield (fail if missing)
clawreins toolshield-sync --no-install
# Append without unloading existing ToolShield section
clawreins toolshield-sync --append
Agent calls tool: write('/etc/passwd', 'hacked')
  → before_tool_call hook fires
  → ClawReins checks policy: write = ASK
  → Interactive prompt:
    ┌─────────────────────────────────────┐
    │ CLAWREINS SECURITY ALERT            │
    │                                     │
    │ Module: FileSystem                  │
    │ Method: write                       │
    │ Args: ["/etc/passwd", "hacked"]     │
    │                                     │
    │ ❯ Approve                           │
    │   Reject                            │
    └─────────────────────────────────────┘
  → You reject → { block: true }
  → Decision logged to audit trail
Agent calls tool: bash('rm -rf /tmp/data')
  → before_tool_call → policy = ASK → blocked (pending approval)
  → Agent asks user for approval (or explicit token for irreversible actions)
User replies YES (normal risk):
  → Agent calls clawreins_respond({ decision: "yes" })
  → before_tool_call intercepts → approves pending entry
  → Agent retries bash('rm -rf /tmp/data') → approved
User replies NO:
  → Agent calls clawreins_respond({ decision: "no" })
  → before_tool_call intercepts → denies pending entry
  → Agent does NOT retry → cancelled
For high irreversibility actions:
  → ClawReins returns token requirement (e.g. CONFIRM-AB12CD)
  → Agent calls clawreins_respond({ decision: "confirm", confirmation: "CONFIRM-AB12CD" })
  → Retry proceeds only after token match
The clawreins_respond tool is registered automatically via api.registerTool() when the gateway supports it. It accepts four decisions: yes, no, allow, and confirm.
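The channel flow above amounts to a small state machine: a blocked call creates a pending entry, a reply resolves it, and the retry consumes the approval. The sketch below illustrates that idea only. The class name `PendingApprovals`, the string call key, and treating `allow` the same as `yes` are assumptions; the real ApprovalQueue may key, expire, or persist entries differently.

```typescript
// Sketch of channel-mode approval state. All names are hypothetical.
type Decision = "yes" | "no" | "allow" | "confirm";
type Outcome = "approved" | "denied" | "pending" | "unknown";

interface Entry {
  approved: boolean;
  requiredToken?: string; // set for high-irreversibility actions
}

class PendingApprovals {
  private entries = new Map<string, Entry>();

  // Called when a tool call hits an ASK policy. Returns true only when a
  // prior approval exists; otherwise records the call as pending.
  check(callKey: string, requiredToken?: string): boolean {
    const entry = this.entries.get(callKey);
    if (entry?.approved) {
      this.entries.delete(callKey); // one approval covers one retry
      return true;
    }
    if (!entry) this.entries.set(callKey, { approved: false, requiredToken });
    return false;
  }

  // Called when the agent relays the user's reply.
  respond(callKey: string, decision: Decision, confirmation?: string): Outcome {
    const entry = this.entries.get(callKey);
    if (!entry) return "unknown";
    if (entry.approved) return "approved";
    if (decision === "no") {
      this.entries.delete(callKey);
      return "denied";
    }
    if (entry.requiredToken !== undefined) {
      const tokenOk = decision === "confirm" && confirmation === entry.requiredToken;
      if (!tokenOk) return "pending"; // wrong or missing token: stay blocked
    }
    entry.approved = true;
    return "approved";
  }
}
```

Deleting the entry once the retry succeeds means each approval authorizes exactly one execution, so a later identical call has to be approved again.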
Before execution, ClawReins now evaluates accumulated session memory and predicts
high-risk turn N+1 trajectories.
Signals:
- Drift score: semantic drift from initial intent to current trajectory
- Salami index: low-risk looking steps composing into a harmful chain
- Commitment creep: rising irreversibility and narrowing rollback options
When memory trajectory risk crosses threshold, ClawReins escalates to HITL before execution and includes predicted next-step danger paths in the approval summary.
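How the three signals combine is not specified here, so the following is purely illustrative. It assumes each signal is normalized to the range 0 to 1 and that escalation triggers when the strongest signal reaches a threshold. The field names, the `shouldEscalate` helper, and the default threshold of 0.7 are all hypothetical.

```typescript
// Purely illustrative: the real scoring, scales, and threshold may differ.
interface TrajectorySignals {
  driftScore: number;      // 0..1, semantic drift from the initial intent
  salamiIndex: number;     // 0..1, small steps composing into a harmful chain
  commitmentCreep: number; // 0..1, rising irreversibility, fewer rollbacks
}

// Escalate to human review when any single signal reaches the threshold.
function shouldEscalate(signals: TrajectorySignals, threshold = 0.7): boolean {
  const risk = Math.max(
    signals.driftScore,
    signals.salamiIndex,
    signals.commitmentCreep,
  );
  return risk >= threshold;
}
```

Taking the maximum rather than an average means one strong signal is enough to pause the agent, which errs on the side of asking the human.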
ClawReins uses three decision types:
| Policy | Behavior |
|---|---|
| ALLOW | Execute immediately (e.g., file reads) |
| ASK | Prompt for approval (e.g., file writes) |
| DENY | Block automatically (e.g., file deletes) |
Default policy (Balanced):
- FileSystem: read=ALLOW, write=ASK, delete=DENY
- Shell: bash=ASK, exec=ASK
- Browser: screenshot=ALLOW, navigate/click/type/evaluate=ASK
- Gateway: sendMessage=ASK
- Network: fetch=ASK, request=ASK
- Everything else: ASK (fail-secure default)
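The Balanced defaults can be written out as data with a fail-secure lookup. This is a sketch, not the shipped configuration: the `Policy` object layout and the `evaluate` helper are assumptions, and policy.json may be structured differently.

```typescript
type Action = "ALLOW" | "ASK" | "DENY";

interface Policy {
  defaultAction: Action;
  modules: Record<string, Record<string, Action>>;
}

// The Balanced defaults listed above. The layout is an assumption.
const balanced: Policy = {
  defaultAction: "ASK",
  modules: {
    FileSystem: { read: "ALLOW", write: "ASK", delete: "DENY" },
    Shell: { bash: "ASK", exec: "ASK" },
    Browser: {
      screenshot: "ALLOW",
      navigate: "ASK",
      click: "ASK",
      type: "ASK",
      evaluate: "ASK",
    },
    Gateway: { sendMessage: "ASK" },
    Network: { fetch: "ASK", request: "ASK" },
  },
};

// Own-property check so inherited keys like "constructor" never match.
function hasOwn(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Looks up module.method and falls back to the default for anything unknown.
function evaluate(policy: Policy, moduleName: string, method: string): Action {
  if (hasOwn(policy.modules, moduleName)) {
    const rules = policy.modules[moduleName];
    if (hasOwn(rules, method)) return rules[method];
  }
  return policy.defaultAction;
}
```

For example, `evaluate(balanced, "FileSystem", "delete")` yields `"DENY"`, while an unlisted pair such as `evaluate(balanced, "Calendar", "create")` falls back to `"ASK"`.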
clawreins init # Interactive setup wizard
clawreins configure # Alias for init (OpenClaw configure entrypoint)
clawreins configure --non-interactive --json # Automation-friendly machine output
clawreins policy # Manage security policies
clawreins stats # View statistics
clawreins audit # View decision history
clawreins reset # Reset statistics
clawreins disable # Temporarily disable
clawreins enable # Re-enable
clawreins toolshield-sync # Sync ToolShield guardrails into AGENTS.md
clawreins upgrade # Reinstall latest clawreins@beta in OpenClaw + restart gateway
clawreins update # Alias for upgrade
clawreins scan # Run 13 security checks and save an HTML report
clawreins scan --fix # Backup config and apply supported remediations
clawreins scan --monitor # Compare with the last baseline and alert on drift
clawreins scan --monitor --alert-command "/path/to/notifier.sh" # Run a notifier on drift
$ clawreins audit --lines 5
16:05:00 | FileSystem.read | ALLOWED | 0.0s
16:06:00 | FileSystem.write | APPROVED | 3.5s (human)
16:07:00 | Shell.bash | REJECTED | 1.2s (human)
16:08:00 | FileSystem.delete | BLOCKED | 0.0s - Policy: DENY
$ clawreins stats
ClawReins Statistics
Total Calls: 142
Decisions:
  Allowed: 35 (24.6%)
  Approved: 89 (62.7%) - by user
  Rejected: 12 (8.5%) - by user
  Blocked: 6 (4.2%) - by policy
Average Decision Time: 2.8s
All data stored in ~/.openclaw/clawreins/:
~/.openclaw/clawreins/
├── policy.json            # Your security rules
├── decisions.jsonl        # Audit trail (append-only)
├── stats.json             # Statistics
├── scan-state.json        # Last drift-monitoring baseline
├── browser-sessions.json  # Encrypted persistent browser auth/session state
└── clawreins.log          # Application logs
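An append-only JSON Lines file such as decisions.jsonl stores one JSON object per line, so new decisions are added without rewriting earlier ones. The sketch below shows that pattern with Node's standard `fs` module. The `DecisionRecord` fields are hypothetical; only the four decision labels are taken from the audit output shown earlier.

```typescript
import { appendFileSync, readFileSync } from "node:fs";

// Hypothetical record shape; the real decisions.jsonl fields may differ.
interface DecisionRecord {
  timestamp: string; // ISO 8601
  tool: string;      // e.g. "FileSystem.write"
  decision: "ALLOWED" | "APPROVED" | "REJECTED" | "BLOCKED";
  decisionTimeMs: number;
}

// Appends one JSON object per line; existing lines are never rewritten.
function appendDecision(path: string, record: DecisionRecord): void {
  appendFileSync(path, JSON.stringify(record) + "\n", "utf8");
}

// Reads the log back in order, skipping blank lines.
function readDecisions(path: string): DecisionRecord[] {
  return readFileSync(path, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as DecisionRecord);
}
```

Because `JSON.stringify` escapes newlines inside strings, each record is guaranteed to occupy exactly one line, which keeps the file safe to tail or parse incrementally.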
import { Interceptor, createToolCallHook } from 'clawreins';
// Create interceptor with default policy
const interceptor = new Interceptor();
// Create a hook handler for OpenClaw's before_tool_call event
const hook = createToolCallHook(interceptor);
// Register with the OpenClaw plugin API
api.on('before_tool_call', hook);
ClawReins intercepts every tool mapped in TOOL_TO_MODULE:
- FileSystem: read, write, edit, glob
- Shell: bash, exec
- Browser: navigate, screenshot, click, type, evaluate
- Network: fetch, request, webhook, download
- Gateway: listSessions, listNodes, sendMessage
Any unmapped tool falls through to defaultAction (ASK by default).
src/
├── core/
│   ├── Interceptor.ts           # Policy evaluation engine
│   ├── Arbitrator.ts            # Human-in-the-loop (TTY prompt / channel queue)
│   ├── ApprovalQueue.ts         # In-memory approval state for channel mode
│   ├── MemoryRiskForecaster.ts  # Drift/salami/commitment pre-turn forecasting
│   ├── toolshield/              # Bundled ToolShield core used for default sync
│   └── Logger.ts                # Winston-based logging
├── plugin/
│   ├── index.ts                 # Plugin entry point (hook + tool registration)
│   ├── tool-interceptor.ts      # before_tool_call handler + clawreins_respond intercept
│   └── config-manager.ts        # OpenClaw config management (register/unregister)
├── storage/                     # Persistence (PolicyStore, DecisionLog, StatsTracker)
├── cli/                         # Command-line interface
├── toolshield/                  # ToolShield sync integration helpers
├── types.ts                     # TypeScript definitions
└── config.ts                    # Default policies
# Clone repo
git clone https://github.com/pegasi-ai/clawreins
cd clawreins
# Install dependencies
npm install
# Build
npm run build
# Test CLI locally
node dist/cli/index.js init
# Link for global testing
npm link
clawreins --help
- Zero Trust: Every action evaluated
- Synchronous Blocking: Agent waits for approval
- No Bypass: Plugin hooks intercept all tool calls
- Immutable Audit: JSON Lines append-only format
- Human Authority: Critical decisions need approval
- Fail Secure: Unknown actions default to ASK/DENY
We believe in safe AI. PRs welcome!
- Fork the repo
- Create your feature branch: git checkout -b feature/amazing
- Commit changes: git commit -m 'Add amazing feature'
- Push: git push origin feature/amazing
- Open a Pull Request
See CONTRIBUTING.md for details.
Apache 2.0 - See LICENSE for details.
- Built for OpenClaw agents
- ToolShield methodology and implementation from CHATS-lab/ToolShield
- Inspired by the need for human oversight in AI systems
- Thanks to the AI safety community
Built with ❤️ for a safer AI future.


