🦞 + 🪢 ClawReins

Runtime safety and human approval infrastructure for computer-using agents.

github.com/pegasi-ai/clawreins

License: Apache 2.0 · TypeScript · Node.js >= 18.0.0

OpenClaw is powerful. That's the problem.

ClawReins is the watchdog layer for computer-using agents: it sits between an AI agent and the real world and protects the agent at two stages:

  • Before runtime → security scanning
  • During runtime → action interception

Think of it as sudo for AI agents. The first production integration is OpenClaw, where ClawReins plugs into the before_tool_call event so that you can:

  • Prevent destructive actions before they execute
  • Pause for human approval with YES / ALLOW / CONFIRM flows
  • Prove what happened with durable audit logs and post-incident review

OpenClaw cannot be its own watchdog. Neither can any other computer-using agent (CUA).

Demo

Hero example: an OpenClaw agent tries to bulk-delete 4,382 Gmail messages. ClawReins blocks it before execution.

That is the core runtime story:

  • destructive action detected
  • execution paused before side effects
  • human approval required
  • decision written to the audit trail

Runtime Interception

Runtime interception is the enforcement layer. It is what stops an agent mid-trajectory when the action is destructive, irreversible, or operating under risky browser state.

Core capabilities:

  • Browser-state awareness for CAPTCHA, 2FA, and challenge walls
  • Irreversibility scoring for risky versus catastrophic actions
  • Runtime intervention across terminal and messaging approval channels
  • ToolShield-aligned hardening for new tool rollouts
  • Full audit logging for every approval decision

Security Scan

ClawReins includes a security scanner, clawreins scan, that audits a local OpenClaw installation for high-signal security misconfigurations before they turn into runtime incidents. It writes an HTML report to ~/Downloads/scan-report.html and prints a file:// link to that report in the terminal.

Usage:

# Run the 13-check audit and save the HTML report
clawreins scan

# Save the report and try to open it automatically
clawreins scan --html

# Machine-readable output for CI
clawreins scan --json

# Apply supported auto-fixes after confirmation
clawreins scan --fix

# Apply supported auto-fixes without prompting
clawreins scan --fix --yes

# Compare against the last saved baseline and alert on drift
clawreins scan --monitor

# Compare against the baseline and invoke a notifier when drift is detected
clawreins scan --monitor --alert-command "/path/to/send-openclaw-alert.sh"

Supported auto-fixes:

  • Rebinding gateway host from 0.0.0.0 to 127.0.0.1
  • Tightening config file permissions to 600
  • Injecting a default tools.exec.safeBins allowlist
  • Disabling authBypass / skipAuth / disableAuth style flags

Before any fix is applied, ClawReins creates a timestamped backup in ~/.scan-backup/.

Drift Monitoring

Drift monitoring is opt-in and designed for scheduled runs.

Default monitoring behavior:

  • disabled by default
  • run every 24 hours when scheduled
  • compare against ~/.openclaw/clawreins/scan-state.json
  • alert only on worsened posture: verdict worsening, new WARN, or new FAIL
  • no background auto-fix
  • HTML report still written to ~/Downloads/scan-report.html

Manual run:

clawreins scan --monitor

The first run creates a baseline. Later runs compare the current report against that saved baseline and only alert when posture worsens.

If you want scheduled jobs to notify through your own transport, add --alert-command. This command runs only when drift is detected. ClawReins exports these environment variables to the notifier:

  • CLAWREINS_SCAN_SUMMARY
  • CLAWREINS_SCAN_VERDICT
  • CLAWREINS_SCAN_REPORT_PATH
  • CLAWREINS_SCAN_REPORT_URL
  • CLAWREINS_SCAN_STATE_PATH
  • CLAWREINS_SCAN_WORSENED_CHECKS

That makes it easy to route alerts through:

  • an OpenClaw messaging wrapper
  • a webhook sender
  • email, Slack, Telegram, or WhatsApp bridge scripts

Notifier example:

clawreins scan --monitor \
  --alert-command "$HOME/bin/send-openclaw-alert.sh"

The alert hook is generic on purpose. The scan CLI does not directly call the in-process OpenClaw plugin API from cron or system schedulers, so the notifier command is the bridge if you want alerts to land through OpenClaw-managed messaging.
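
As an illustration of what a notifier might do with those variables, here is a minimal TypeScript sketch. Only the CLAWREINS_SCAN_* names come from the list above; the buildAlertMessage helper, the message layout, and the "(not set)" fallback are hypothetical, and the format of each variable's value is not assumed.

```typescript
// Hypothetical notifier helper. It only reads the documented variable names
// and treats every value as an opaque string.
type NotifierEnv = Record<string, string | undefined>;

function buildAlertMessage(env: NotifierEnv): string {
  // Fall back to a visible marker so a missing variable is easy to spot.
  const get = (name: string): string => env[name] ?? "(not set)";
  return [
    `ClawReins drift detected: ${get("CLAWREINS_SCAN_VERDICT")}`,
    `Summary: ${get("CLAWREINS_SCAN_SUMMARY")}`,
    `Worsened checks: ${get("CLAWREINS_SCAN_WORSENED_CHECKS")}`,
    `Report: ${get("CLAWREINS_SCAN_REPORT_URL")}`,
  ].join("\n");
}

// In a real notifier script you would pass process.env and hand the result
// to your own transport (webhook, email, chat bridge):
//   const message = buildAlertMessage(process.env);
```

Keeping the formatting in a pure function makes the notifier easy to test without running a scan.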

Scheduled Runs

Recommended operating model:

  • run once per day
  • use --monitor so each run compares against the last saved baseline
  • add --alert-command if you want drift notifications delivered outside the terminal
  • never use --fix in scheduled jobs

What happens on scheduled runs:

  1. The first scheduled run creates the baseline in ~/.openclaw/clawreins/scan-state.json.
  2. Later runs compare the current ScanReport against that saved baseline.
  3. ClawReins alerts only when posture worsens: verdict gets worse, a check changes from PASS to WARN, or a check changes from PASS or WARN to FAIL.
  4. Every run still writes ~/Downloads/scan-report.html so the latest full report is easy to inspect.
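
The comparison in step 3 can be sketched as follows. This is an illustration of the stated rule, not the actual ClawReins implementation: the ScanSnapshot shape is hypothetical (the real scan-state.json schema is not documented here), and treating a check that is absent from the baseline as PASS is an assumption.

```typescript
type CheckStatus = "PASS" | "WARN" | "FAIL";
type Verdict = "SECURE" | "NEEDS ATTENTION" | "EXPOSED";

// Hypothetical snapshot shape, used only for this sketch.
interface ScanSnapshot {
  verdict: Verdict;
  checks: Record<string, CheckStatus>;
}

const STATUS_RANK: Record<CheckStatus, number> = { PASS: 0, WARN: 1, FAIL: 2 };
const VERDICT_RANK: Record<Verdict, number> = {
  SECURE: 0,
  "NEEDS ATTENTION": 1,
  EXPOSED: 2,
};

// Checks whose status moved in the worse direction (PASS→WARN, PASS/WARN→FAIL).
function worsenedChecks(baseline: ScanSnapshot, current: ScanSnapshot): string[] {
  return Object.entries(current.checks)
    .filter(([id, status]) => {
      const before: CheckStatus = baseline.checks[id] ?? "PASS"; // assumption
      return STATUS_RANK[status] > STATUS_RANK[before];
    })
    .map(([id]) => id);
}

// Alert only when posture worsens; improvements and unchanged runs stay quiet.
function shouldAlert(baseline: ScanSnapshot, current: ScanSnapshot): boolean {
  const verdictWorse = VERDICT_RANK[current.verdict] > VERDICT_RANK[baseline.verdict];
  return verdictWorse || worsenedChecks(baseline, current).length > 0;
}
```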

Recommended scheduler settings:

  • frequency: every 24 hours
  • stdout/stderr: append to a dedicated log file such as ~/.openclaw/clawreins/scan-monitor.log
  • environment: set HOME and OPENCLAW_HOME explicitly
  • notifier: use --alert-command for OpenClaw wrappers, webhooks, or messaging bridges

Example daily job with drift logging only:

# A crontab entry must stay on one line; cron does not support backslash line continuation.
0 9 * * * /usr/bin/env HOME="$HOME" OPENCLAW_HOME="$HOME/.openclaw" /usr/local/bin/clawreins scan --monitor >> "$HOME/.openclaw/clawreins/scan-monitor.log" 2>&1

Example daily job with drift alert delivery:

# A crontab entry must stay on one line; cron does not support backslash line continuation.
0 9 * * * /usr/bin/env HOME="$HOME" OPENCLAW_HOME="$HOME/.openclaw" /usr/local/bin/clawreins scan --monitor --alert-command "$HOME/bin/send-openclaw-alert.sh" >> "$HOME/.openclaw/clawreins/scan-monitor.log" 2>&1

If you want the scheduled job to fail loudly for automation, the exit codes stay the same in monitor mode:

  • 0 for SECURE
  • 1 for NEEDS ATTENTION
  • 2 for EXPOSED

That makes scheduled monitoring usable from cron, systemd, CI, or any wrapper that reacts to non-zero exit codes.
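
A wrapper that reacts to those exit codes might look like the sketch below. The exit-code-to-verdict mapping comes from the list above; the pipelineAction policy (warn on NEEDS ATTENTION, fail on EXPOSED or any unexpected code) is a hypothetical choice, not something ClawReins prescribes.

```typescript
type ScanVerdict = "SECURE" | "NEEDS ATTENTION" | "EXPOSED";

// Map the documented exit codes to verdicts; anything else is unknown.
function verdictFromExitCode(code: number): ScanVerdict | "UNKNOWN" {
  switch (code) {
    case 0:
      return "SECURE";
    case 1:
      return "NEEDS ATTENTION";
    case 2:
      return "EXPOSED";
    default:
      return "UNKNOWN";
  }
}

// Hypothetical pipeline policy built on top of the verdict.
function pipelineAction(code: number): "pass" | "warn" | "fail" {
  const verdict = verdictFromExitCode(code);
  if (verdict === "SECURE") return "pass";
  if (verdict === "NEEDS ATTENTION") return "warn";
  return "fail"; // EXPOSED or an unexpected exit code: fail closed
}
```

Failing closed on unknown exit codes means a crashed or missing scanner is never mistaken for a clean result.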

Security Checks

| Check | Severity | Detects | Auto-fix |
| --- | --- | --- | --- |
| GATEWAY_BINDING | Critical | Gateway listening on 0.0.0.0 or missing localhost binding | Yes |
| API_KEYS_EXPOSURE | Critical | Plaintext API keys, tokens, passwords, or secrets stored directly in config files | No |
| FILE_PERMISSIONS | Critical | Config files readable by group or other users instead of 600 | Yes |
| HTTPS_TLS | Warning | Missing HTTPS/TLS or certificate-related configuration | No |
| SHELL_COMMAND_ALLOWLIST | Critical | Missing safeBins or equivalent shell allowlist / unrestricted shell execution | Yes |
| SENSITIVE_DIRECTORIES | Warning | Agent environment can still reach directories like ~/.ssh, ~/.gnupg, ~/.aws, or /etc/shadow | No |
| WEBHOOK_AUTH | Warning | Webhook endpoints configured without auth tokens or shared secrets | No |
| SANDBOX_ISOLATION | Warning | No Docker or sandbox isolation detected | No |
| DEFAULT_WEAK_CREDENTIALS | Critical | Default, weak, undefined, or missing gateway credentials | No |
| RATE_LIMITING | Warning | No gateway throttling or rate limit configuration | No |
| NODEJS_VERSION | Critical | Node.js versions affected by CVE-2026-21636 permission-model bypass window | No |
| CONTROL_UI_AUTH | Critical | Control UI authentication bypass flags enabled | Yes |
| BROWSER_UNSANDBOXED | Critical | Browser skill config missing headless: true or sandbox: true protection | No |

Exit codes:

  • 0 = SECURE
  • 1 = NEEDS ATTENTION
  • 2 = EXPOSED
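
To make one of the checks concrete, the FILE_PERMISSIONS condition ("readable by group or other users instead of 600") can be expressed as a bitmask test on the file mode. This is a sketch of the general technique, not the scanner's actual code, and the function name is hypothetical.

```typescript
// True when any group or other permission bit is set, which is what
// distinguishes modes such as 644 or 640 from the owner-only 600.
function isExposedToGroupOrOthers(mode: number): boolean {
  return (mode & 0o077) !== 0;
}

// With Node.js you would obtain the mode from fs.statSync(path).mode.
// The mask above ignores the file-type bits that stat includes.
```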

Why?

OpenClaw can execute shell commands, modify files, and access your APIs. OS-level isolation (containers, VMs) protects your host machine, but it doesn't protect the services your agent has access to.

ClawReins addresses this by hooking into OpenClaw's before_tool_call plugin event. Before any dangerous action executes (writes, deletes, shell commands, API calls), the agent pauses and waits for your decision. In a terminal, you get an interactive prompt. On messaging channels (WhatsApp, Telegram), the agent asks for YES, NO, or ALLOW, or for an explicit CONFIRM token when the action is irreversible, and relays your answer through a dedicated clawreins_respond tool. Every choice is logged to an append-only audit trail. Think of it as sudo for your AI agent: no dangerous action runs without your explicit permission.

Features

  • Prevent: Stop destructive actions before execution, score irreversibility, detect risky browser state, and harden tool rollout with ToolShield-aligned guardrails.
  • Pause: Route high-impact actions through terminal or messaging approval flows, including explicit CONFIRM-* tokens for catastrophic operations.
  • Prove: Preserve audit logs, approval decisions, security scan findings, and post-fix artifacts so incidents are reviewable after the fact.

Destructive Action Intercept (Pre-Execution)

ClawReins now applies deterministic pre-execution gating for destructive actions.

  • Destructive calls are intercepted before execution and forced through HITL approval
  • HIGH severity supports YES / ALLOW
  • CATASTROPHIC severity requires explicit CONFIRM-* token
  • Fail-secure behavior: if approval tooling is unavailable, action stays blocked

Environment toggles:

CLAWREINS_DESTRUCTIVE_GATING=on   # default on
CLAWREINS_BULK_THRESHOLD=20       # default 20
CLAWREINS_CONFIRM_THRESHOLD=80    # optional, irreversibility confirm threshold
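
The sketch below shows one way these toggles could drive the HIGH versus CATASTROPHIC split. It is illustrative only: the variable names and the values 20 and 80 come from the block above, but the function names, the "off" value, the use of 80 as a fallback, and the exact meaning of each threshold (item count for bulk actions, irreversibility score for CONFIRM) are assumptions.

```typescript
interface GatingConfig {
  enabled: boolean;
  bulkThreshold: number;
  confirmThreshold: number;
}

// Read the toggles from an environment-like object, falling back to the
// values shown in the README when a variable is missing or not numeric.
function readGatingConfig(env: Record<string, string | undefined>): GatingConfig {
  const num = (value: string | undefined, fallback: number): number => {
    if (value === undefined || value.trim() === "") return fallback;
    const parsed = Number(value);
    return Number.isFinite(parsed) ? parsed : fallback;
  };
  return {
    enabled: (env["CLAWREINS_DESTRUCTIVE_GATING"] ?? "on") !== "off", // assumption
    bulkThreshold: num(env["CLAWREINS_BULK_THRESHOLD"], 20),
    confirmThreshold: num(env["CLAWREINS_CONFIRM_THRESHOLD"], 80),
  };
}

type RequiredApproval = "NO_EXTRA_GATE" | "YES_OR_ALLOW" | "CONFIRM_TOKEN";

// Hypothetical classification: catastrophic actions need a CONFIRM-* token,
// other destructive or bulk actions need YES / ALLOW.
function requiredApproval(
  action: { destructive: boolean; itemCount: number; irreversibility: number },
  config: GatingConfig,
): RequiredApproval {
  if (!config.enabled) return "NO_EXTRA_GATE";
  if (action.irreversibility >= config.confirmThreshold) return "CONFIRM_TOKEN";
  if (action.destructive || action.itemCount >= config.bulkThreshold) return "YES_OR_ALLOW";
  return "NO_EXTRA_GATE";
}
```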

Demo script (GIF-friendly):

npm run demo:destructive

Quick Start

Prerequisites

  • Node.js >= 18.0.0
  • OpenClaw installed

Installation

# Install plugin
openclaw plugins install clawreins@beta

# Run setup
node ~/.openclaw/extensions/clawreins/dist/cli/index.js init

# Reload gateway
openclaw gateway restart

Done! ClawReins is now protecting your OpenClaw instance.

clawreins init now enables ToolShield by default:

  • Uses bundled ToolShield core from this repo first (src/core/toolshield)
  • Falls back to auto-install via pip only if bundled core is unavailable
  • Syncs bundled experiences into OpenClaw AGENTS.md
  • Keeps ClawReins runtime interception + ToolShield instruction hardening aligned

ToolShield Sync (One Command)

If you use ToolShield for instruction-level hardening, sync it directly into your OpenClaw AGENTS.md through ClawReins:

clawreins toolshield-sync

What it does:

  • Uses bundled ToolShield core from src/core/toolshield when available
  • Falls back to installed/pip ToolShield if bundled core is unavailable
  • Removes previously injected ToolShield guidelines by default (idempotent sync)
  • Imports bundled experiences into OpenClaw instructions (AGENTS.md)

ToolShield project reference: CHATS-lab/ToolShield

Useful overrides:

# Use a different bundled model
clawreins toolshield-sync --model claude-sonnet-4.5

# Custom OpenClaw home/profile
OPENCLAW_HOME=~/.openclaw-profile-a clawreins toolshield-sync

# Target a custom AGENTS.md path
clawreins toolshield-sync --agents-file /path/to/AGENTS.md

# Force a specific bundled ToolShield source root
clawreins toolshield-sync --bundled-dir /path/to/toolshield-root

# Do not auto-install ToolShield (fail if missing)
clawreins toolshield-sync --no-install

# Append without unloading existing ToolShield section
clawreins toolshield-sync --append

How It Works

Terminal Mode (TTY)

Agent calls tool: write('/etc/passwd', 'hacked')
  → before_tool_call hook fires
  → ClawReins checks policy: write = ASK
  → Interactive prompt:
    ┌─────────────────────────────────────┐
    │ 🦞 CLAWREINS SECURITY ALERT         │
    │                                     │
    │ Module: FileSystem                  │
    │ Method: write                       │
    │ Args: ["/etc/passwd", "hacked"]     │
    │                                     │
    │ ❯ ✓ Approve                         │
    │   ✗ Reject                          │
    └─────────────────────────────────────┘
  → You reject → { block: true }
  → Decision logged to audit trail

Channel Mode (WhatsApp / Telegram)

Agent calls tool: bash('rm -rf /tmp/data')
  → before_tool_call → policy = ASK → blocked (pending approval)
  → Agent asks user for approval (or explicit token for irreversible actions)

User replies YES (normal risk):
  → Agent calls clawreins_respond({ decision: "yes" })
  → before_tool_call intercepts → approves pending entry
  → Agent retries bash('rm -rf /tmp/data') → approved ✓

User replies NO:
  → Agent calls clawreins_respond({ decision: "no" })
  → before_tool_call intercepts → denies pending entry
  → Agent does NOT retry → cancelled ✓

For high irreversibility actions:
  → ClawReins returns token requirement (e.g. CONFIRM-AB12CD)
  → Agent calls clawreins_respond({ decision: "confirm", confirmation: "CONFIRM-AB12CD" })
  → Retry proceeds only after token match ✓

The clawreins_respond tool is registered automatically via api.registerTool() when the gateway supports it. It accepts four decisions: yes, no, allow, and confirm.
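
The block, respond, and retry cycle above can be modeled with a small in-memory queue. The sketch below is not the real ApprovalQueue.ts: the class name, the call-key scheme, the one-shot approval, and treating allow the same as yes are all assumptions made for illustration.

```typescript
type ChannelDecision = "yes" | "no" | "allow" | "confirm";
type EntryState = "pending" | "approved" | "denied";

interface PendingEntry {
  state: EntryState;
  requiredToken: string | undefined; // set for high-irreversibility actions
}

class ApprovalQueueSketch {
  private entries = new Map<string, PendingEntry>();

  // Called on each tool call: blocks until the matching entry is approved.
  check(callKey: string, requiredToken?: string): { block: boolean } {
    const entry = this.entries.get(callKey);
    if (entry === undefined) {
      this.entries.set(callKey, { state: "pending", requiredToken });
      return { block: true };
    }
    if (entry.state === "approved") {
      this.entries.delete(callKey); // approval is consumed by the retry (assumption)
      return { block: false };
    }
    return { block: true }; // still pending, or denied
  }

  // Called when the agent relays the user's reply.
  respond(callKey: string, decision: ChannelDecision, confirmation?: string): EntryState {
    const entry = this.entries.get(callKey);
    if (entry === undefined) return "denied"; // nothing pending: fail secure
    if (entry.state !== "pending") return entry.state;
    if (decision === "no") {
      entry.state = "denied";
    } else if (entry.requiredToken === undefined) {
      entry.state = "approved"; // yes / allow on a normal-risk action
    } else if (decision === "confirm" && confirmation === entry.requiredToken) {
      entry.state = "approved"; // token must match exactly
    }
    return entry.state;
  }
}
```

A plain YES never satisfies a token-gated entry in this sketch, which mirrors the rule that a retry proceeds only after a token match.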

Memory-Aware Pre-Turn Forecasting

Before execution, ClawReins now evaluates accumulated session memory and predicts high-risk turn N+1 trajectories.

Signals:

  • Drift score: semantic drift from initial intent to current trajectory
  • Salami index: low-risk looking steps composing into a harmful chain
  • Commitment creep: rising irreversibility and narrowing rollback options

When memory trajectory risk crosses threshold, ClawReins escalates to HITL before execution and includes predicted next-step danger paths in the approval summary.
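
To give a feel for how low-risk steps can compose into a risky chain, here is a toy sketch. It is not how MemoryRiskForecaster.ts computes its signals: the formula, the 0 to 1 scale, the 0.7 threshold, and all names are hypothetical.

```typescript
// Toy "salami" aggregation: treat each step's risk as an independent
// probability of harm and compute the chance that at least one step harms.
// Individually small risks add up quickly across a long chain.
function salamiIndex(stepRisks: number[]): number {
  const allSafe = stepRisks.reduce((safe, risk) => safe * (1 - risk), 1);
  return 1 - allSafe;
}

interface TrajectorySignals {
  driftScore: number; // assumed to be in [0, 1]
  salamiIndex: number; // assumed to be in [0, 1]
  commitmentCreep: number; // assumed to be in [0, 1]
}

// Escalate to human review when any single signal crosses the threshold.
function shouldEscalate(signals: TrajectorySignals, threshold = 0.7): boolean {
  const risk = Math.max(signals.driftScore, signals.salamiIndex, signals.commitmentCreep);
  return risk >= threshold;
}
```

Taking the maximum rather than an average keeps one strong signal from being diluted by two quiet ones.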

Security Policies

ClawReins uses three decision types:

| Policy | Behavior |
| --- | --- |
| ALLOW | Execute immediately (e.g., file reads) |
| ASK | Prompt for approval (e.g., file writes) |
| DENY | Block automatically (e.g., file deletes) |

Default policy (Balanced):

  • FileSystem: read=ALLOW, write=ASK, delete=DENY
  • Shell: bash=ASK, exec=ASK
  • Browser: screenshot=ALLOW, navigate/click/type/evaluate=ASK
  • Gateway: sendMessage=ASK
  • Network: fetch=ASK, request=ASK
  • Everything else: ASK (fail-secure default)
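
The Balanced policy above amounts to a lookup with a fail-secure fallback. The sketch below encodes the listed defaults; the Module.method key format, the BALANCED_POLICY name, and the decide function are illustrative assumptions, not the actual policy.json schema or Interceptor API.

```typescript
type PolicyDecision = "ALLOW" | "ASK" | "DENY";

// Defaults copied from the Balanced policy list; keys use "Module.method".
const BALANCED_POLICY = new Map<string, PolicyDecision>([
  ["FileSystem.read", "ALLOW"],
  ["FileSystem.write", "ASK"],
  ["FileSystem.delete", "DENY"],
  ["Shell.bash", "ASK"],
  ["Shell.exec", "ASK"],
  ["Browser.screenshot", "ALLOW"],
  ["Browser.navigate", "ASK"],
  ["Browser.click", "ASK"],
  ["Browser.type", "ASK"],
  ["Browser.evaluate", "ASK"],
  ["Gateway.sendMessage", "ASK"],
  ["Network.fetch", "ASK"],
  ["Network.request", "ASK"],
]);

// Anything not listed falls through to the default action (ASK).
function decide(
  moduleName: string,
  method: string,
  defaultAction: PolicyDecision = "ASK",
): PolicyDecision {
  return BALANCED_POLICY.get(`${moduleName}.${method}`) ?? defaultAction;
}
```

Using a Map instead of a plain object avoids accidental matches on inherited property names such as toString.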

CLI Commands

clawreins init        # Interactive setup wizard
clawreins configure   # Alias for init (OpenClaw configure entrypoint)
clawreins configure --non-interactive --json  # Automation-friendly machine output
clawreins policy      # Manage security policies
clawreins stats       # View statistics
clawreins audit       # View decision history
clawreins reset       # Reset statistics
clawreins disable     # Temporarily disable
clawreins enable      # Re-enable
clawreins toolshield-sync  # Sync ToolShield guardrails into AGENTS.md
clawreins upgrade     # Reinstall latest clawreins@beta in OpenClaw + restart gateway
clawreins update      # Alias for upgrade
clawreins scan        # Run 13 security checks and save an HTML report
clawreins scan --fix  # Backup config and apply supported remediations
clawreins scan --monitor  # Compare with the last baseline and alert on drift
clawreins scan --monitor --alert-command "/path/to/notifier.sh"  # Run a notifier on drift

Example: View Audit Trail

$ clawreins audit --lines 5

16:05:00 | FileSystem.read              | ALLOWED    |   0.0s
16:06:00 | FileSystem.write             | APPROVED   |   3.5s (human)
16:07:00 | Shell.bash                   | REJECTED   |   1.2s (human)
16:08:00 | FileSystem.delete            | BLOCKED    |   0.0s - Policy: DENY

Example: View Statistics

$ clawreins stats

📊 ClawReins Statistics

Total Calls:    142

Decisions:
  ✅ Allowed:      35 (24.6%)
  ✅ Approved:     89 (62.7%) - by user
  ❌ Rejected:     12 (8.5%)  - by user
  🚫 Blocked:       6 (4.2%)  - by policy

Average Decision Time: 2.8s

Data Storage

All data is stored in ~/.openclaw/clawreins/:

~/.openclaw/clawreins/
├── policy.json       # Your security rules
├── decisions.jsonl   # Audit trail (append-only)
├── stats.json        # Statistics
├── scan-state.json   # Last drift-monitoring baseline
├── browser-sessions.json  # Encrypted persistent browser auth/session state
└── clawreins.log          # Application logs
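
Because decisions.jsonl is a JSON Lines file, it can be post-processed one line at a time. The sketch below tallies records by decision. The record schema is not documented here, so the decision field name is an assumption, as is the tallyDecisions helper itself.

```typescript
// Count audit records per decision value. Assumes each line is a JSON
// object with a string "decision" field (hypothetical field name).
function tallyDecisions(jsonl: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines
    let record: unknown;
    try {
      record = JSON.parse(line);
    } catch {
      continue; // skip malformed lines instead of aborting the tally
    }
    const decision =
      typeof record === "object" && record !== null
        ? (record as { decision?: unknown }).decision
        : undefined;
    const key = typeof decision === "string" ? decision : "UNKNOWN";
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// With Node.js you would read the file contents first, for example with
// fs.readFileSync(path, "utf8"), and pass the resulting string in.
```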

Use as a Library

import { Interceptor, createToolCallHook } from 'clawreins';

// Create interceptor with default policy
const interceptor = new Interceptor();

// Create a hook handler for OpenClaw's before_tool_call event
const hook = createToolCallHook(interceptor);

// Register with the OpenClaw plugin API.
// `api` is the plugin API object supplied by OpenClaw; it is not defined in this snippet.
api.on('before_tool_call', hook);

Protected Tools

ClawReins intercepts every tool mapped in TOOL_TO_MODULE:

  • FileSystem: read, write, edit, glob
  • Shell: bash, exec
  • Browser: navigate, screenshot, click, type, evaluate
  • Network: fetch, request, webhook, download
  • Gateway: listSessions, listNodes, sendMessage

Any unmapped tool falls through to defaultAction (ASK by default).

Architecture

src/
├── core/
│   ├── Interceptor.ts    # Policy evaluation engine
│   ├── Arbitrator.ts     # Human-in-the-loop (TTY prompt / channel queue)
│   ├── ApprovalQueue.ts  # In-memory approval state for channel mode
│   ├── MemoryRiskForecaster.ts  # Drift/salami/commitment pre-turn forecasting
│   ├── toolshield/       # Bundled ToolShield core used for default sync
│   └── Logger.ts         # Winston-based logging
├── plugin/
│   ├── index.ts              # Plugin entry point (hook + tool registration)
│   ├── tool-interceptor.ts   # before_tool_call handler + clawreins_respond intercept
│   └── config-manager.ts     # OpenClaw config management (register/unregister)
├── storage/        # Persistence (PolicyStore, DecisionLog, StatsTracker)
├── cli/            # Command-line interface
├── toolshield/     # ToolShield sync integration helpers
├── types.ts        # TypeScript definitions
└── config.ts       # Default policies

Development

# Clone repo
git clone https://github.com/pegasi-ai/clawreins
cd clawreins

# Install dependencies
npm install

# Build
npm run build

# Test CLI locally
node dist/cli/index.js init

# Link for global testing
npm link
clawreins --help

Security Guarantees

✅ Zero Trust - Every action evaluated
✅ Synchronous Blocking - Agent waits for approval
✅ No Bypass - Plugin hooks intercept all tool calls
✅ Immutable Audit - JSON Lines append-only format
✅ Human Authority - Critical decisions need approval
✅ Fail Secure - Unknown actions default to ASK/DENY

Contributing

We believe in safe AI. PRs welcome!

  1. Fork the repo
  2. Create your feature branch: git checkout -b feature/amazing
  3. Commit changes: git commit -m 'Add amazing feature'
  4. Push: git push origin feature/amazing
  5. Open a Pull Request

See CONTRIBUTING.md for details.

License

Apache 2.0 - See LICENSE for details.

Acknowledgments

  • Built for OpenClaw agents
  • ToolShield methodology and implementation from CHATS-lab/ToolShield
  • Inspired by the need for human oversight in AI systems
  • Thanks to the AI safety community

Built with ❤️ for a safer AI future.
