
feat(tools): three-tier tool safety model #96

Merged
priyanshujain merged 21 commits into master from sandboxing-bash
Mar 18, 2026

Conversation

@priyanshujain
Collaborator

Summary

  • Three-tier tool classification: Tier 1 (Free — no gate: file_read, dir_explore, content_search), Tier 2 (Approved — user confirms: bash, file_write, file_edit), Tier 3 (Sandboxed — OS sandbox: sandbox_exec)
  • Allowlist + approval for bash: Switches from a blocklist (provably broken) to a soft allowlist. Commands on InteractiveAllowlist auto-run; unknown commands prompt the user via GuardedAction. Scheduled tasks use a strict allowlist (obk, sqlite3 only)
  • Approval gates on file_write/file_edit: Both now require user approval when an Interactor is provided, with session-scoped auto-approve after repeated approvals of the same path
  • New Tier 1 tools: dir_explore (ls/tree/find/glob in pure Go) and content_search (regex search in pure Go) — safe read-only tools registered in both interactive and scheduled registries
  • OS-level sandboxing: SeatbeltRuntime (macOS sandbox-exec) and BwrapRuntime (Linux bubblewrap) with read-only filesystem, no network, 30s timeout. The sandbox_exec tool wraps these; no approval is needed because the OS sandbox itself is the safety boundary
  • CLIInteractor: Extracts terminal y/n approval prompts into a reusable tools.Interactor implementation
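As a rough illustration of the classification in the summary, the tiers can be thought of as a lookup from tool name to gate. This is only a sketch: the `Tier` type, `toolTiers` map, `tierOf`, and `needsApproval` are hypothetical names, not the identifiers used in this PR, and treating unknown tools as approval-gated is an assumption made here.

```go
package main

import "fmt"

// Tier is a hypothetical enum for the three safety tiers in the summary.
type Tier int

const (
	TierFree      Tier = iota + 1 // Tier 1: no gate
	TierApproved                  // Tier 2: user confirms
	TierSandboxed                 // Tier 3: runs inside the OS sandbox
)

func (t Tier) String() string {
	switch t {
	case TierFree:
		return "free"
	case TierApproved:
		return "approved"
	case TierSandboxed:
		return "sandboxed"
	default:
		return "unknown"
	}
}

// toolTiers mirrors the tool lists given in the PR summary.
var toolTiers = map[string]Tier{
	"file_read":      TierFree,
	"dir_explore":    TierFree,
	"content_search": TierFree,
	"bash":           TierApproved,
	"file_write":     TierApproved,
	"file_edit":      TierApproved,
	"sandbox_exec":   TierSandboxed,
}

// tierOf reports the tier of a tool and whether the tool is known.
func tierOf(tool string) (Tier, bool) {
	t, ok := toolTiers[tool]
	return t, ok
}

// needsApproval fails closed: unknown tools are treated as approval-gated.
// That default is an assumption of this sketch, not something the PR states.
func needsApproval(tool string) bool {
	t, ok := tierOf(tool)
	return !ok || t == TierApproved
}

func main() {
	for _, tool := range []string{"file_read", "bash", "sandbox_exec", "mystery_tool"} {
		t, _ := tierOf(tool)
		fmt.Printf("%-13s tier=%-9s needsApproval=%v\n", tool, t, needsApproval(tool))
	}
}
```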

Test plan

  • go test ./agent/tools/... -count=1 — all pass (135+ tests including 35 new)
  • go test ./internal/cli/... -count=1 — all pass (5 tests including 1 new)
  • go build ./... — clean build
  • Three-outcome filter: allow, deny, prompt for soft allowlist
  • Bash approval flow: prompt with/without interactor, auto-approve rules
  • File write/edit approval: approved, denied, auto-approve via rules
  • Dir explore: all actions (ls, tree, find, glob), error cases
  • Content search: regex matching, glob filter, binary skip, edge cases
  • Sandbox: profile assertions, runtime detection, exec errors
  • Registry: standard vs scheduled tool sets, interactor wiring
  • Prompt: conditional sections for dir_explore and sandbox_exec
  • Sanitize: new tools in untrusted output list

Closes #94

Add FilterResult (Allow/Deny/Prompt), CheckWithResult method, and
NewSoftAllowlistFilter for interactive mode approval prompts.
Also add InteractiveAllowlist of safe commands.
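A minimal sketch of the three-outcome filter described in this commit, for readers who want the shape without opening the diff. `FilterPrompt`, `CheckWithResult`, and `NewSoftAllowlistFilter` are named in the PR; everything else (`FilterAllow`, `FilterDeny`, `AllowlistFilter`, `NewStrictAllowlistFilter`, first-word matching, and the metacharacter check) is an assumption of this example and may differ from the real implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// FilterResult is the three-outcome decision: allow, deny, or prompt.
type FilterResult int

const (
	FilterAllow FilterResult = iota
	FilterDeny
	FilterPrompt
)

func (r FilterResult) String() string {
	return [...]string{"allow", "deny", "prompt"}[r]
}

// AllowlistFilter is a hypothetical filter type for this sketch.
type AllowlistFilter struct {
	allowed map[string]bool
	soft    bool // soft: unknown commands prompt; strict: unknown commands are denied
}

func newFilter(commands []string, soft bool) *AllowlistFilter {
	allowed := make(map[string]bool, len(commands))
	for _, c := range commands {
		allowed[c] = true
	}
	return &AllowlistFilter{allowed: allowed, soft: soft}
}

// NewSoftAllowlistFilter builds the interactive-mode filter.
func NewSoftAllowlistFilter(commands []string) *AllowlistFilter { return newFilter(commands, true) }

// NewStrictAllowlistFilter is a hypothetical constructor for scheduled tasks.
func NewStrictAllowlistFilter(commands []string) *AllowlistFilter { return newFilter(commands, false) }

// shellMeta lists characters that can chain or substitute commands.
const shellMeta = ";|&$`<>()\n"

// CheckWithResult auto-allows only a plain allowlisted command. Anything
// containing shell metacharacters falls through, so "ls && rm -rf /" is not
// auto-run just because it starts with "ls".
func (f *AllowlistFilter) CheckWithResult(command string) FilterResult {
	fields := strings.Fields(command)
	if len(fields) == 0 {
		return FilterDeny
	}
	if f.allowed[fields[0]] && !strings.ContainsAny(command, shellMeta) {
		return FilterAllow
	}
	if f.soft {
		return FilterPrompt
	}
	return FilterDeny
}

func main() {
	interactive := NewSoftAllowlistFilter([]string{"ls", "cat", "git"})
	scheduled := NewStrictAllowlistFilter([]string{"obk", "sqlite3"})
	for _, cmd := range []string{"ls -la", "curl example.com", "ls && rm -rf /"} {
		fmt.Printf("%-20q interactive=%s scheduled=%s\n",
			cmd, interactive.CheckWithResult(cmd), scheduled.CheckWithResult(cmd))
	}
}
```

The point of the third outcome is that an unknown command is no longer a silent pass (as with a blocklist) or a hard failure (as with a strict allowlist); it becomes a question for the user.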
BashTool now uses CheckWithResult and prompts user via GuardedAction
when FilterPrompt is returned. Add bash pattern extraction for
approval rule grouping.
CLIInteractor: extracted from chat.go's inline cliInteractor into a standalone type
with a compile-time interface check and tests.
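For context, a compile-time interface check in Go is the `var _ Iface = (*Impl)(nil)` idiom. The sketch below shows it on a terminal y/n prompt. The single-method `Interactor` interface, the `Approve` signature, and the `approveFromString` helper are assumptions made for illustration; the real tools.Interactor may look different.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// Interactor is a hypothetical stand-in for tools.Interactor.
type Interactor interface {
	Approve(prompt string) (bool, error)
}

// CLIInteractor asks for y/n approval over a reader/writer pair.
type CLIInteractor struct {
	in  *bufio.Reader
	out io.Writer
}

// Compile-time check: the build breaks if *CLIInteractor stops
// satisfying Interactor.
var _ Interactor = (*CLIInteractor)(nil)

func NewCLIInteractor(in io.Reader, out io.Writer) *CLIInteractor {
	return &CLIInteractor{in: bufio.NewReader(in), out: out}
}

// Approve prints the prompt and treats only "y" or "yes" as approval.
// An empty answer defaults to "no".
func (c *CLIInteractor) Approve(prompt string) (bool, error) {
	fmt.Fprintf(c.out, "%s [y/N]: ", prompt)
	line, err := c.in.ReadString('\n')
	if err != nil && line == "" {
		return false, err // input closed before any answer was given
	}
	answer := strings.ToLower(strings.TrimSpace(line))
	return answer == "y" || answer == "yes", nil
}

// approveFromString is a small helper for demos and tests: it feeds a
// canned answer instead of reading the real terminal.
func approveFromString(input string) (bool, error) {
	return NewCLIInteractor(strings.NewReader(input), io.Discard).Approve("Run command?")
}

func main() {
	for _, input := range []string{"y\n", "n\n", "\n"} {
		ok, err := approveFromString(input)
		fmt.Printf("%q -> approved=%v err=%v\n", input, ok, err)
	}
}
```

Taking an `io.Reader` and `io.Writer` rather than hard-coding the terminal is what makes a type like this straightforward to test.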
NewStandardRegistry now accepts Interactor and ApprovalRuleSet.
With interactor: bash uses soft allowlist, file tools require approval.
Without interactor: backward-compatible blocklist behavior.
Updated all call sites.
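One way to picture the ApprovalRuleSet that the registry now receives is as a session-scoped list of (tool, pattern) rules with duplicate prevention and simple wildcard matching. The sketch below is an assumption-heavy illustration: the field names, the trailing-`*` wildcard semantics, and the configurable approval threshold are all hypothetical, since the PR text does not spell them out.

```go
package main

import (
	"fmt"
	"strings"
)

// ApprovalRule auto-approves a tool call whose target matches Pattern.
// A trailing "*" in Pattern means "any target with this prefix".
type ApprovalRule struct {
	Tool    string
	Pattern string
}

// ApprovalRuleSet holds session-scoped rules (hypothetical shape).
type ApprovalRuleSet struct {
	threshold int                  // approvals of the same target before a rule is added
	counts    map[ApprovalRule]int // per-target approval counts for this session
	rules     []ApprovalRule
}

func NewApprovalRuleSet(threshold int) *ApprovalRuleSet {
	return &ApprovalRuleSet{threshold: threshold, counts: map[ApprovalRule]int{}}
}

// Add appends a rule unless an identical one exists; it reports whether
// the rule was actually added.
func (s *ApprovalRuleSet) Add(r ApprovalRule) bool {
	for _, existing := range s.rules {
		if existing == r {
			return false
		}
	}
	s.rules = append(s.rules, r)
	return true
}

// Matches reports whether any rule for the tool covers the target.
func (s *ApprovalRuleSet) Matches(tool, target string) bool {
	for _, r := range s.rules {
		if r.Tool != tool {
			continue
		}
		if strings.HasSuffix(r.Pattern, "*") {
			if strings.HasPrefix(target, strings.TrimSuffix(r.Pattern, "*")) {
				return true
			}
			continue
		}
		if r.Pattern == target {
			return true
		}
	}
	return false
}

// RecordApproval counts a manual approval and promotes the exact target
// to a rule once the threshold is reached.
func (s *ApprovalRuleSet) RecordApproval(tool, target string) {
	r := ApprovalRule{Tool: tool, Pattern: target}
	s.counts[r]++
	if s.counts[r] >= s.threshold {
		s.Add(r)
	}
}

func main() {
	rules := NewApprovalRuleSet(2)
	rules.Add(ApprovalRule{Tool: "bash", Pattern: "git *"})
	rules.RecordApproval("file_write", "/tmp/out/report.md")
	fmt.Println(rules.Matches("file_write", "/tmp/out/report.md")) // one approval so far
	rules.RecordApproval("file_write", "/tmp/out/report.md")
	fmt.Println(rules.Matches("file_write", "/tmp/out/report.md")) // threshold reached
	fmt.Println(rules.Matches("bash", "git status"))
}
```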
dir_explore: pure Go implementation (ls, tree, find, glob) with no subprocess.
Tier 1 (free): no approval needed.
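To show what "pure Go, no subprocess" can look like for this kind of tool, here is a reduced sketch covering only the ls and glob actions with the standard library. The `explore` function, its parameters, and the error messages are hypothetical; the tool in this PR also supports tree and find and presumably takes JSON input.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// explore handles two read-only actions without spawning a process.
func explore(action, dir, pattern string) ([]string, error) {
	info, err := os.Stat(dir)
	if err != nil {
		return nil, fmt.Errorf("stat %s: %w", dir, err)
	}
	if !info.IsDir() {
		return nil, fmt.Errorf("%s is not a directory", dir)
	}
	switch action {
	case "ls":
		entries, err := os.ReadDir(dir) // sorted by filename
		if err != nil {
			return nil, err
		}
		names := make([]string, 0, len(entries))
		for _, e := range entries {
			name := e.Name()
			if e.IsDir() {
				name += "/" // mark directories, like `ls -p`
			}
			names = append(names, name)
		}
		return names, nil
	case "glob":
		if pattern == "" {
			return nil, errors.New("glob requires a pattern")
		}
		matches, err := filepath.Glob(filepath.Join(dir, pattern))
		if err != nil {
			return nil, err
		}
		sort.Strings(matches)
		return matches, nil
	default:
		return nil, fmt.Errorf("unknown action %q", action)
	}
}

// demoExplore builds a throwaway directory and runs both actions on it.
func demoExplore() (ls []string, globCount int, err error) {
	dir, err := os.MkdirTemp("", "dir-explore-demo")
	if err != nil {
		return nil, 0, err
	}
	defer os.RemoveAll(dir)
	for _, name := range []string{"a.go", "b.txt"} {
		if err := os.WriteFile(filepath.Join(dir, name), []byte("x"), 0o600); err != nil {
			return nil, 0, err
		}
	}
	if err := os.Mkdir(filepath.Join(dir, "sub"), 0o700); err != nil {
		return nil, 0, err
	}
	if ls, err = explore("ls", dir, ""); err != nil {
		return nil, 0, err
	}
	matches, err := explore("glob", dir, "*.go")
	return ls, len(matches), err
}

func main() {
	ls, n, err := demoExplore()
	fmt.Println(ls, n, err)
}
```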
content_search: pure Go implementation with glob filter and binary file detection.
Tier 1 (free): no approval needed.
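A sketch of how regex search with a glob filter and binary skipping can be done with only the standard library. Treating a file as binary when a NUL byte appears in its first 8000 bytes is a common heuristic chosen for this example; the PR does not say which detection rule it uses, and `searchContent`, `looksBinary`, `Match`, and `demoSearch` are hypothetical names.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"regexp"
)

// Match is one matching line.
type Match struct {
	Path string
	Line int
	Text string
}

// looksBinary uses a NUL-byte heuristic on the first 8000 bytes.
func looksBinary(data []byte) bool {
	if len(data) > 8000 {
		data = data[:8000]
	}
	return bytes.IndexByte(data, 0) >= 0
}

// searchContent walks root and returns lines matching pattern. If glob is
// non-empty, only files whose base name matches it are searched.
func searchContent(root, pattern, glob string) ([]Match, error) {
	if pattern == "" {
		return nil, errors.New("pattern is required")
	}
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, fmt.Errorf("invalid pattern: %w", err)
	}
	var matches []Match
	err = filepath.WalkDir(root, func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr // includes a nonexistent root
		}
		if !d.Type().IsRegular() {
			return nil // skip directories, symlinks, devices
		}
		if glob != "" {
			ok, err := filepath.Match(glob, d.Name())
			if err != nil {
				return err
			}
			if !ok {
				return nil
			}
		}
		data, err := os.ReadFile(path)
		if err != nil || looksBinary(data) {
			return nil // skip unreadable and binary files
		}
		for i, line := range bytes.Split(data, []byte("\n")) {
			if re.Match(line) {
				matches = append(matches, Match{Path: path, Line: i + 1, Text: string(line)})
			}
		}
		return nil
	})
	return matches, err
}

// demoSearch searches a throwaway directory for "TODO" in *.go files.
func demoSearch() ([]Match, error) {
	dir, err := os.MkdirTemp("", "content-search-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	files := map[string][]byte{
		"main.go":   []byte("package main\n// TODO: tidy up\n"),
		"notes.txt": []byte("TODO: excluded by the *.go glob\n"),
		"blob.go":   {'T', 'O', 'D', 'O', 0, 1, 2}, // skipped as binary
	}
	for name, data := range files {
		if err := os.WriteFile(filepath.Join(dir, name), data, 0o600); err != nil {
			return nil, err
		}
	}
	return searchContent(dir, `TODO`, "*.go")
}

func main() {
	matches, err := demoSearch()
	fmt.Println(len(matches), err)
}
```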
SandboxRuntime interface with macOS Seatbelt and Linux bubblewrap
implementations. DetectRuntime auto-selects available sandbox.
sandbox_exec: Tier 3 tool. No approval is needed because the OS sandbox itself is the safety boundary.
Supports python, bash, node, ruby. Degrades gracefully when no
sandbox runtime is available.
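The runtime selection and the degrade-when-unavailable behaviour described in the two commits above might be sketched as follows. `SandboxRuntime`, `SeatbeltRuntime`, `BwrapRuntime`, and `DetectRuntime` are names from the PR, but the interface methods, the `Wrap` and `Exec` helpers, the Seatbelt profile text, and the exact bwrap flags are assumptions of this example; the real profile is likely stricter. "Degrades" is interpreted here as returning an error rather than running unsandboxed.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"runtime"
	"time"
)

const sandboxTimeout = 30 * time.Second // the 30s limit from the summary

var ErrNoSandbox = errors.New("no sandbox runtime available")

// SandboxRuntime turns a command into its sandboxed command line.
// The method set is hypothetical.
type SandboxRuntime interface {
	Name() string
	Wrap(argv []string) []string
}

type SeatbeltRuntime struct{}
type BwrapRuntime struct{}

// seatbeltProfile is an illustrative profile, not the one in this PR.
const seatbeltProfile = "(version 1)\n(allow default)\n(deny network*)\n(deny file-write*)\n"

func (SeatbeltRuntime) Name() string { return "seatbelt" }
func (SeatbeltRuntime) Wrap(argv []string) []string {
	return append([]string{"sandbox-exec", "-p", seatbeltProfile}, argv...)
}

func (BwrapRuntime) Name() string { return "bwrap" }
func (BwrapRuntime) Wrap(argv []string) []string {
	// Read-only root, fresh /dev and /proc, no network namespace sharing.
	return append([]string{"bwrap", "--ro-bind", "/", "/", "--dev", "/dev",
		"--proc", "/proc", "--unshare-net", "--die-with-parent"}, argv...)
}

// detectRuntime is the testable core: the OS name and the PATH lookup
// are injected.
func detectRuntime(goos string, lookPath func(string) (string, error)) SandboxRuntime {
	switch goos {
	case "darwin":
		if _, err := lookPath("sandbox-exec"); err == nil {
			return SeatbeltRuntime{}
		}
	case "linux":
		if _, err := lookPath("bwrap"); err == nil {
			return BwrapRuntime{}
		}
	}
	return nil
}

// DetectRuntime selects a sandbox for the current host, or nil.
func DetectRuntime() SandboxRuntime { return detectRuntime(runtime.GOOS, exec.LookPath) }

// Exec refuses to run when no sandbox exists instead of falling back to
// unsandboxed execution.
func Exec(rt SandboxRuntime, workDir string, argv []string) ([]byte, error) {
	if rt == nil {
		return nil, ErrNoSandbox
	}
	if len(argv) == 0 {
		return nil, errors.New("empty command")
	}
	ctx, cancel := context.WithTimeout(context.Background(), sandboxTimeout)
	defer cancel()
	full := rt.Wrap(argv)
	cmd := exec.CommandContext(ctx, full[0], full[1:]...)
	cmd.Dir = workDir
	return cmd.CombinedOutput()
}

// Fake PATH lookups for exercising detectRuntime.
func foundAll(name string) (string, error)  { return "/usr/bin/" + name, nil }
func foundNone(name string) (string, error) { return "", errors.New(name + " not found") }

func main() {
	rt := DetectRuntime()
	if rt == nil {
		fmt.Println("no sandbox runtime; sandbox_exec would report itself unavailable")
		return
	}
	fmt.Println(rt.Name(), rt.Wrap([]string{"python3", "-c", "print(1)"}))
}
```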
Add dir_explore, content_search, sandbox_exec to registries.
Mark new tools as untrusted output. Add tool guidance to prompt.

Document the three-tier model, soft allowlist, OS-level sandboxing,
and updated defense layer numbering.

DefaultBlocklist is still used for subagent registries (no interactor).
Updated comment to reflect that interactive mode now uses InteractiveAllowlist.

Tests cover no-interactor, approved, and denied scenarios for
both FileWriteTool and FileEditTool.
…cases

Cover empty command, invalid JSON, FilterPrompt with/without interactor,
allowlist auto-run, workdir option, approval rules auto-approve, and
combined stdout+stderr output.
…rules

Cover FileEditTool approved path, FileReadTool nonexistent file,
invalid JSON for file tools, approval rules with file_write/edit
patterns, wildcard matching, duplicate rule prevention, sandbox
runtime exec error, and sandbox invalid JSON.
…lore

Cover seatbelt profile empty workDir/SSH deny/write dir, content_search
empty pattern/nonexistent path/invalid JSON, dir_explore nonexistent
dir/glob without pattern/invalid JSON.
…ests

Cover dir_explore/sandbox_exec conditional prompt sections and
NewStandardRegistry with non-nil interactor.
TrimSpace before empty check so "   " is caught.
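The fix is the usual trim-then-check ordering. The commit message does not say which input it applies to, so `validateInput` below is a hypothetical helper that only illustrates the pattern.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errEmptyInput = errors.New("input is required")

// validateInput trims first, so whitespace-only input is rejected
// the same way as an empty string.
func validateInput(raw string) (string, error) {
	trimmed := strings.TrimSpace(raw)
	if trimmed == "" {
		return "", errEmptyInput
	}
	return trimmed, nil
}

func main() {
	for _, raw := range []string{"", "   ", "  ls -la  "} {
		v, err := validateInput(raw)
		fmt.Printf("%q -> %q, err=%v\n", raw, v, err)
	}
}
```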
@priyanshujain priyanshujain merged commit bf4f144 into master Mar 18, 2026
8 checks passed

Development

Successfully merging this pull request may close these issues.

shell safety
