MCP bridge argument guards: per-server arg validation via mcp.json#469
Merged
Conversation
….12.31) Third-party MCP servers resolve path arguments inside their own pod: a download_file call with save_directory=/tmp succeeds but the file lands pod-local, invisible to the agent and script pods sharing /rockbot/shared — while the tool reports success (hit during the 2026-06-11 patrol). Add per-server argGuards declared in mcp.json and enforced by the bridge before forwarding a tool call: - IMcpArgGuard handlers resolved by NAME from a DI registry (the TokenProviderRegistry pattern) — never Type.GetType(): mcp.json is LLM-writable via register_mcp_server, so config-driven type loading would be an arbitrary-code-execution channel. - Built-in "path-prefix" handler: lexical Linux-semantics normalization, Ordinal boundary-aware prefix match, rejects relative paths and escaping traversal; optional requireArgs closes the omitted-argument hole. Reject-only; messages teach the model to self-correct in one turn. - Guards run on the model's original arguments, after the invoke_tool unwrap and before the attachment gateway mutates them. Rejection is ToolError invalid_arguments (not retryable), matching the attachment failure precedent. - Fail closed: invalid/unknown guard config refuses the server connection (before _serverConfigs is populated) and register_mcp_server returns a descriptive error. Re-registering an existing name preserves guards so the LLM channel cannot strip operator policy. - ArgGuards excluded from CanonicalIdentity (policy about how the server is invoked, not which server it is) — pinned by regression test. Docs: docs/tools.md "Argument guards" section + design/mcp-arg-guards.md recording the security decisions. 60 new unit tests; full suite green. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Third-party MCP servers resolve path arguments inside their own pod. During the 2026-06-11 patrol, a subagent called the OneDrive server's
download_filewithsave_directory: "/tmp"— the file saved pod-locally, invisible to the agent and script pods that share therockbot-sharedPVC at/rockbot/shared, while the tool reported success. The subagent flailed for minutes hunting for output that was truthfully claimed to be saved. We can't patch third-party servers; we own the bridge every MCP call flows through.Feature
Per-server
argGuardsdeclared in mcp.json, enforced by the bridge before forwarding a tool call:Design highlights
Type.GetType()— mcp.json is LLM-writable viaregister_mcp_server; config-driven type loading would be an arbitrary-code-execution channel into the bridge process. Handlers register in DI (McpArgGuardRegistration+McpArgGuardRegistry, theTokenProviderRegistrypattern) and config selects from that closed set.path-prefixhandler — lexical Linux-semantics normalization (notPath.GetFullPath; target FS is the server pod),Ordinalboundary-aware prefix matching (/rockbot/shared≠/rockbot/shared-evil), rejects relative paths and escaping traversal. OptionalrequireArgsrejects calls that omit the argument (server-side defaults are pod-local too). Reject-only — rejection messages name the argument, prefixes, and the why, so the model self-corrects in one turn.invoke_toolunwrap, before the attachment gateway mutates anything. Rejection publishesToolError { invalid_arguments, IsRetryable = false }, the attachment-failure precedent._serverConfigsis populated, so invokes get server-not-found);register_mcp_serverreturns a descriptive error; invoke-time unresolvable guards reject. Re-registering an existing name preserves its guards, closing the LLM channel for stripping operator policy (note:Attachments/Authhave the same exposure — flagged in the design doc as follow-up).CanonicalIdentity()likeattachments— policy about how the server is invoked, not which server it is. Pinned by regression test.tool.invoke.mcpto the one bridge handler; the transparent reconnect-retry reuses already-validated arguments, so no bypass.Testing
PathPrefixArgGuardTests(normalization, traversal, boundaries, case, requireArgs, options validation),McpArgGuardEvaluatorTests(tool filtering, fail-closed paths, short-circuit, mutable-args contract),McpArgGuardRegistryTests,McpArgGuardConfigTests(mcp.json deserialization + persistence round-trip), plus theCanonicalIdentityexclusion regression.Docs
docs/tools.md— operator-facing "Argument guards" section after attachment passthrough.design/mcp-arg-guards.md— security rationale that must survive refactors.Version bumped to 0.12.31 (agent image rebuild required — the bridge is embedded in the agent).
🤖 Generated with Claude Code