feat(tools): three-tier tool safety model #96
Merged
priyanshujain merged 21 commits into master on Mar 18, 2026
Add FilterResult (Allow/Deny/Prompt), CheckWithResult method, and NewSoftAllowlistFilter for interactive mode approval prompts. Also add InteractiveAllowlist of safe commands.
BashTool now uses CheckWithResult and prompts user via GuardedAction when FilterPrompt is returned. Add bash pattern extraction for approval rule grouping.
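The three-way filter decision described in the two commits above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: only `FilterResult`, `FilterPrompt`, and `CheckWithResult` are named in the source, while `FilterAllow`, `FilterDeny`, `softAllowlistFilter`, and the first-word matching rule are assumptions made for the example. A real filter would also need to handle pipes, command chaining, and subshells.

```go
package main

import (
	"fmt"
	"strings"
)

// FilterResult models the Allow/Deny/Prompt outcomes named in the PR.
// The constant names FilterAllow and FilterDeny are assumed; only
// FilterPrompt appears in the commit messages.
type FilterResult int

const (
	FilterAllow FilterResult = iota
	FilterDeny
	FilterPrompt
)

func (r FilterResult) String() string {
	switch r {
	case FilterAllow:
		return "allow"
	case FilterDeny:
		return "deny"
	default:
		return "prompt"
	}
}

// softAllowlistFilter is a hypothetical stand-in for whatever
// NewSoftAllowlistFilter returns in the PR.
type softAllowlistFilter struct {
	allowed map[string]bool
}

func newSoftAllowlistFilter(cmds []string) *softAllowlistFilter {
	m := make(map[string]bool, len(cmds))
	for _, c := range cmds {
		m[c] = true
	}
	return &softAllowlistFilter{allowed: m}
}

// CheckWithResult classifies a command by its first word (an assumed,
// deliberately naive rule): allowlisted commands run, blank commands
// are denied, and everything else is escalated to the user.
func (f *softAllowlistFilter) CheckWithResult(command string) FilterResult {
	fields := strings.Fields(command)
	if len(fields) == 0 {
		return FilterDeny
	}
	if f.allowed[fields[0]] {
		return FilterAllow
	}
	return FilterPrompt
}

func main() {
	f := newSoftAllowlistFilter([]string{"ls", "cat", "git"})
	for _, cmd := range []string{"ls -la", "rm -rf build", "   "} {
		fmt.Printf("%q -> %s\n", cmd, f.CheckWithResult(cmd))
	}
}
```

The point of the third outcome is that an unknown command is neither silently run nor rejected outright; the caller decides what to do with `FilterPrompt`, for example by asking the user when an interactor exists.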
Extracted from chat.go's inline cliInteractor into a standalone type with compile-time interface check and tests.
NewStandardRegistry now accepts Interactor and ApprovalRuleSet. With interactor: bash uses soft allowlist, file tools require approval. Without interactor: backward-compatible blocklist behavior. Updated all call sites.
Pure Go implementation (ls, tree, find, glob) with no subprocess. Tier 1 (free) — no approval needed.
Pure Go implementation with glob filter and binary file detection. Tier 1 (free) — no approval needed.
SandboxRuntime interface with macOS Seatbelt and Linux bubblewrap implementations. DetectRuntime auto-selects available sandbox.
Tier 3 tool — no approval needed, the OS sandbox IS the safety. Supports python, bash, node, ruby. Gracefully degrades when sandbox runtime is unavailable.
Add dir_explore, content_search, sandbox_exec to registries. Mark new tools as untrusted output. Add tool guidance to prompt.
Document the three-tier model, soft allowlist, OS-level sandboxing, and updated defense layer numbering.
DefaultBlocklist is still used for subagent registries (no interactor). Updated comment to reflect that interactive mode now uses InteractiveAllowlist.
Tests cover no-interactor, approved, and denied scenarios for both FileWriteTool and FileEditTool.
…cases Cover empty command, invalid JSON, FilterPrompt with/without interactor, allowlist auto-run, workdir option, approval rules auto-approve, and combined stdout+stderr output.
…rules Cover FileEditTool approved path, FileReadTool nonexistent file, invalid JSON for file tools, approval rules with file_write/edit patterns, wildcard matching, duplicate rule prevention, sandbox runtime exec error, and sandbox invalid JSON.
…lore Cover seatbelt profile empty workDir/SSH deny/write dir, content_search empty pattern/nonexistent path/invalid JSON, dir_explore nonexistent dir/glob without pattern/invalid JSON.
…ests Cover dir_explore/sandbox_exec conditional prompt sections and NewStandardRegistry with non-nil interactor.
TrimSpace before empty check so " " is caught.
Summary
- … `file_read`, `dir_explore`, `content_search`), Tier 2 (Approved; user confirms: `bash`, `file_write`, `file_edit`), Tier 3 (Sandboxed; OS sandbox: `sandbox_exec`)
- … `InteractiveAllowlist` auto-run; unknown commands prompt the user via `GuardedAction`. Scheduled tasks use a strict allowlist (`obk`, `sqlite3` only)
- … `Interactor` is provided, with session-scoped auto-approve after repeated approvals of the same path
- `dir_explore` (ls/tree/find/glob in pure Go) and `content_search` (regex search in pure Go): safe read-only tools registered in both interactive and scheduled registries
- `SeatbeltRuntime` (macOS sandbox-exec) and `BwrapRuntime` (Linux bubblewrap) with read-only filesystem, no network, 30s timeout. The `sandbox_exec` tool wraps these; no approval is needed since the sandbox is the safety
- … `tools.Interactor` implementation

Test plan
- `go test ./agent/tools/... -count=1`: all pass (135+ tests including 35 new)
- `go test ./internal/cli/... -count=1`: all pass (5 tests including 1 new)
- `go build ./...`: clean build

Closes #94