MCP bridge argument guards: per-server arg validation via mcp.json #469

Merged
rockfordlhotka merged 1 commit into main from feature/mcp-arg-guards on Jun 12, 2026

@rockfordlhotka

Problem

Third-party MCP servers resolve path arguments inside their own pod. During the 2026-06-11 patrol, a subagent called the OneDrive server's download_file with save_directory: "/tmp". The file was saved pod-locally, invisible to the agent and script pods that share the rockbot-shared PVC at /rockbot/shared, while the tool reported success. The subagent spent minutes hunting for output that the tool had truthfully reported as saved. We can't patch third-party servers, but we own the bridge that every MCP call flows through.

Feature

Per-server argGuards declared in mcp.json, enforced by the bridge before forwarding a tool call:

"onedrive-personal": {
  "type": "sse",
  "url": "...",
  "argGuards": [
    { "handler": "path-prefix",
      "tools": ["download_file"],
      "options": { "args": ["save_directory"], "allowedPrefixes": ["/rockbot/shared"], "requireArgs": true } }
  ]
}
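
As a rough illustration of how an entry like this could be evaluated, here is a minimal Python sketch (the bridge itself is .NET; this is not its code). The handler name is looked up in a closed, code-registered table, guards are filtered by tool name, and an unknown handler rejects the call rather than being skipped. The names `HANDLERS`, `evaluate_guards`, and the toy `require-args` handler are hypothetical, and treating `tools` as an explicit list is an assumption.

```python
from typing import Callable, Optional

# Assumed guard signature: (tool arguments, guard options) -> None to allow,
# or a rejection message. Hypothetical, for illustration only.
GuardFn = Callable[[dict, dict], Optional[str]]


def require_args(args: dict, options: dict) -> Optional[str]:
    """Toy stand-in handler: reject when a listed argument is missing."""
    missing = [a for a in options.get("args", []) if a not in args]
    if missing:
        return "missing required argument(s): " + ", ".join(missing)
    return None


# Closed set of handlers, registered in code. Config can only pick a name
# from this table; it can never name a type to load.
HANDLERS: dict[str, GuardFn] = {"require-args": require_args}


def evaluate_guards(server_config: dict, tool: str, args: dict) -> Optional[str]:
    """Return the first rejection message, or None if the call may be forwarded."""
    for guard in server_config.get("argGuards", []):
        if tool not in guard.get("tools", []):
            continue  # this guard does not apply to the tool being invoked
        handler = HANDLERS.get(guard.get("handler"))
        if handler is None:
            # Fail closed: an unresolvable guard rejects instead of being skipped.
            return f"argument guard '{guard.get('handler')}' is not registered"
        message = handler(args, guard.get("options", {}))
        if message is not None:
            return message  # short-circuit on the first rejection
    return None
```

Because the lookup is a plain dictionary read, a config value such as a fully qualified type name is just an unknown key and produces a rejection.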

Design highlights

  • Named handlers, never Type.GetType() — mcp.json is LLM-writable via register_mcp_server; config-driven type loading would be an arbitrary-code-execution channel into the bridge process. Handlers register in DI (McpArgGuardRegistration + McpArgGuardRegistry, the TokenProviderRegistry pattern) and config selects from that closed set.
  • Built-in path-prefix handler: lexical Linux-semantics normalization (not Path.GetFullPath, because the target filesystem is the server pod's), Ordinal boundary-aware prefix matching (/rockbot/shared does not match /rockbot/shared-evil), rejects relative paths and traversal that escapes the allowed prefix. Optional requireArgs rejects calls that omit the argument (server-side defaults are pod-local too). Reject-only: rejection messages name the argument, the allowed prefixes, and the reason, so the model can self-correct in one turn.
  • Runs on original arguments — after the invoke_tool unwrap, before the attachment gateway mutates anything. Rejection publishes ToolError { invalid_arguments, IsRetryable = false }, the attachment-failure precedent.
  • Fail closed — unknown handler or invalid options refuses the server connection (before _serverConfigs is populated, so invokes get server-not-found); register_mcp_server returns a descriptive error; invoke-time unresolvable guards reject. Re-registering an existing name preserves its guards, closing the LLM channel for stripping operator policy (note: Attachments/Auth have the same exposure — flagged in the design doc as follow-up).
  • Excluded from CanonicalIdentity() like attachments — policy about how the server is invoked, not which server it is. Pinned by regression test.
  • Single enforcement point — all callers (agent, subagents, workers, wisps, A2A, self-repair) flow through tool.invoke.mcp to the one bridge handler; the transparent reconnect-retry reuses already-validated arguments, so no bypass.
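
The path-prefix behaviour described in the list above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the handler's actual code: the function names are hypothetical, the message wording is invented, and rejecting a `..` that climbs above `/` is a choice made here for the sketch.

```python
from typing import Optional


def normalize_linux_path(path: str) -> Optional[str]:
    """Lexically normalize an absolute Linux path without touching any filesystem.

    Returns None for relative paths and for '..' that would climb above '/'.
    """
    if not path.startswith("/"):
        return None  # relative paths are rejected
    parts: list[str] = []
    for segment in path.split("/"):
        if segment in ("", "."):
            continue  # collapse duplicate slashes and '.' segments
        if segment == "..":
            if not parts:
                return None  # assumption: climbing above root is rejected
            parts.pop()
        else:
            parts.append(segment)
    return "/" + "/".join(parts)


def is_under_prefix(path: str, prefix: str) -> bool:
    """Case-sensitive match on whole path segments."""
    prefix = prefix.rstrip("/") or "/"
    if prefix == "/":
        return True
    return path == prefix or path.startswith(prefix + "/")


def check_path_prefix(args: dict, options: dict) -> Optional[str]:
    """Return None to allow the call, or a rejection message for the model."""
    allowed = options["allowedPrefixes"]
    for name in options["args"]:
        if name not in args:
            if options.get("requireArgs", False):
                return (f"'{name}' is required and must be under one of {allowed}; "
                        "the server's default location is not on shared storage")
            continue
        normalized = normalize_linux_path(str(args[name]))
        if normalized is None or not any(is_under_prefix(normalized, p) for p in allowed):
            return (f"'{name}' must be an absolute path under one of {allowed}; "
                    "other locations are local to the server pod")
    return None
```

With the options from the mcp.json example, `/rockbot/shared/downloads` passes, while `/tmp`, `/rockbot/shared-evil`, and `/rockbot/shared/../../etc` are all rejected, the last because it normalizes to `/etc` before the prefix comparison.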

Testing

  • 60 new unit tests: PathPrefixArgGuardTests (normalization, traversal, boundaries, case, requireArgs, options validation), McpArgGuardEvaluatorTests (tool filtering, fail-closed paths, short-circuit, mutable-args contract), McpArgGuardRegistryTests, McpArgGuardConfigTests (mcp.json deserialization + persistence round-trip), plus the CanonicalIdentity exclusion regression.
  • Full solution test suite green (integration tests skipped as designed without RabbitMQ).

Docs

  • docs/tools.md — operator-facing "Argument guards" section after attachment passthrough.
  • design/mcp-arg-guards.md — security rationale that must survive refactors.

Version bumped to 0.12.31 (agent image rebuild required — the bridge is embedded in the agent).

🤖 Generated with Claude Code

….12.31)

Third-party MCP servers resolve path arguments inside their own pod: a
download_file call with save_directory=/tmp succeeds but the file lands
pod-local, invisible to the agent and script pods sharing /rockbot/shared
— while the tool reports success (hit during the 2026-06-11 patrol).

Add per-server argGuards declared in mcp.json and enforced by the bridge
before forwarding a tool call:

- IMcpArgGuard handlers resolved by NAME from a DI registry (the
  TokenProviderRegistry pattern) — never Type.GetType(): mcp.json is
  LLM-writable via register_mcp_server, so config-driven type loading
  would be an arbitrary-code-execution channel.
- Built-in "path-prefix" handler: lexical Linux-semantics normalization,
  Ordinal boundary-aware prefix match, rejects relative paths and
  escaping traversal; optional requireArgs closes the omitted-argument
  hole. Reject-only; messages teach the model to self-correct in one turn.
- Guards run on the model's original arguments, after the invoke_tool
  unwrap and before the attachment gateway mutates them. Rejection is
  ToolError invalid_arguments (not retryable), matching the attachment
  failure precedent.
- Fail closed: invalid/unknown guard config refuses the server connection
  (before _serverConfigs is populated) and register_mcp_server returns a
  descriptive error. Re-registering an existing name preserves guards so
  the LLM channel cannot strip operator policy.
- ArgGuards excluded from CanonicalIdentity (policy about how the server
  is invoked, not which server it is) — pinned by regression test.

Docs: docs/tools.md "Argument guards" section + design/mcp-arg-guards.md
recording the security decisions. 60 new unit tests; full suite green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@rockfordlhotka rockfordlhotka merged commit 72c2fcd into main Jun 12, 2026
2 checks passed
@rockfordlhotka rockfordlhotka deleted the feature/mcp-arg-guards branch June 12, 2026 16:16