Skip to content

fix(security): unify SSRF protection for WASM host calls#1060

Open
ferr079 wants to merge 1 commit intoRightNow-AI:mainfrom
ferr079:fix/unify-ssrf-protection
Open

fix(security): unify SSRF protection for WASM host calls#1060
ferr079 wants to merge 1 commit intoRightNow-AI:mainfrom
ferr079:fix/unify-ssrf-protection

Conversation

@ferr079
Copy link
Copy Markdown

@ferr079 ferr079 commented Apr 15, 2026

Summary

The WASM sandbox host_net_fetch() has its own SSRF implementation (is_ssrf_target) that diverged from the canonical check_ssrf() in web_fetch.rs. This creates a security gap where WASM agents can bypass protections that builtin tools correctly enforce.

Issues found in host_functions.rs::is_ssrf_target():

Gap Impact
Missing 6 blocked hostnames (ip6-localhost, [::1], ::1, 0.0.0.0, Alibaba 100.100.100.200, Azure 192.0.0.192) WASM agent can reach cloud metadata endpoints
No is_metadata_ip() check on resolved IPs DNS rebinding can bypass hostname blocklist to reach IMDS
ssrf_allowed_hosts from config.toml completely ignored Self-hosted deployments that whitelist internal services get inconsistent behavior between builtin tools and WASM
No IPv6 bracket notation support [::1]:8080 not parsed correctly
Duplicate is_private_ip() and extract_host_from_url() Maintenance burden, divergence risk

Changes

  • Remove is_ssrf_target(), is_private_ip(), and extract_host_from_url() from host_functions.rs
  • Delegate to web_fetch::check_ssrf() which has the complete implementation
  • Propagate ssrf_allowed_hosts through SandboxConfigGuestState → host call
  • Promote extract_host() to pub(crate) for reuse
  • Update tests to exercise the unified code path with new coverage for IPv6 and cloud metadata endpoints

Result

  • -42 lines net (56 added, 98 removed)
  • All 908 runtime tests pass
  • Zero clippy warnings
  • Cloud metadata endpoints now blocked consistently across all code paths
  • ssrf_allowed_hosts config now respected by WASM host calls

Test plan

  • cargo check -p openfang-runtime -p openfang-kernel -p openfang-types — compiles clean
  • cargo test -p openfang-runtime — 908 tests pass
  • cargo clippy -p openfang-runtime -p openfang-kernel -- -D warnings — zero warnings
  • Manual: deploy with ssrf_allowed_hosts = ["192.168.1.0/24"] and verify WASM agent can reach allowed internal services

@jaberjaber23
Copy link
Copy Markdown
Member

Reviewed this end-to-end — this is exactly the kind of cleanup we want:

  • Removes the duplicated SSRF logic from host_functions.rs and delegates to the canonical web_fetch::check_ssrf.
  • WASM host path now inherits IPv6 ([::1]) blocking, metadata.aws.internal / instance-data / 100.64.0.0/10 AWS-link-local coverage, and respects ssrf_allowed_hosts from [web.fetch] config — all of which were previously missing on the WASM call path.
  • Tests added for IPv6 and the allowlist.
  • Net -42 LOC despite adding functionality.

Status: mergeable but CI rollup is empty (no checks registered — likely branch pre-dates current workflow triggers). Can you rebase on latest main (we just merged #1041 which upgraded wasmtime/rumqttc; sandbox.rs was touched there too, so there may be a small conflict on the GuestState struct)? Once rebased and CI runs green, this is a straightforward merge.

The WASM sandbox host_net_fetch() had its own SSRF implementation
(is_ssrf_target) that was incomplete compared to the canonical
check_ssrf() in web_fetch.rs:

- Missing 6 blocked hostnames (ip6-localhost, Alibaba/Azure IMDS,
  0.0.0.0, ::1, [::1])
- Missing cloud metadata IP detection (is_metadata_ip)
- Missing IPv6 bracket notation support
- Ignoring ssrf_allowed_hosts from config.toml entirely
- Duplicate is_private_ip() and extract_host_from_url() functions

This meant a WASM agent could bypass SSRF protections that the
builtin web_fetch tool correctly enforced.

Changes:
- Remove duplicated is_ssrf_target(), is_private_ip(), and
  extract_host_from_url() from host_functions.rs
- Delegate to web_fetch::check_ssrf() which has the complete
  implementation with allowlist, CIDR matching, and metadata
  IP detection
- Add ssrf_allowed_hosts to SandboxConfig and GuestState so the
  config propagates to WASM host calls
- Make extract_host() pub(crate) for reuse
- Update tests to exercise the unified code path, including new
  coverage for IPv6 and cloud metadata endpoints

All 908 runtime tests pass. Zero clippy warnings.
@ferr079 ferr079 force-pushed the fix/unify-ssrf-protection branch from a493df6 to de8a692 Compare April 18, 2026 12:28
@ferr079
Copy link
Copy Markdown
Author

ferr079 commented Apr 18, 2026

Rebased on latest main (v0.5.10, including #1041). No conflicts — GuestState changes merged cleanly.

Verified locally:

  • cargo check -p openfang-runtime -p openfang-kernel -p openfang-types — clean
  • cargo test -p openfang-runtime929 tests pass (21 new tests from main since the original PR)
  • cargo clippy -p openfang-runtime -p openfang-kernel -- -D warnings — zero warnings (only a future-incompat note on imap-proto, unrelated third-party dep)

Ready for CI + merge when you're satisfied.

pbranchu pushed a commit to pbranchu/openfang-upstream that referenced this pull request Apr 18, 2026
- RwLock: replace Arc<Mutex<String>> with Arc<RwLock<String>> for
  A2A_TASK_PROGRESS — eliminates exclusive lock in read-heavy path

- SSE parsing: extract pure parse_sse_data_line fn + SseLineOutcome enum;
  refactor both send_task_streaming and send_task_streaming_with_progress
  to use it; add timeout: Option<Duration> to both; 8 unit tests covering
  all outcomes (final, update, empty, malformed, error, unknown)

- Bounded maps: add MAX_CONCURRENT_ASYNC_TASKS=256 cap with clear error;
  AsyncTaskEntry{handle,inserted_at} tracks task age; OnceLock-based
  background sweep (10 min interval, 2 h TTL) aborts stale handles;
  TaskCleanupGuard RAII ensures cleanup on panic

- Prompt injection: rewrite inject_async_callback to deliver async results
  via structural ToolUse+ToolResult pair instead of text framing; remote
  agent content lands in a ToolResult block where LLM API semantics enforce
  the data boundary; add prepend_turns: Option<Vec<Message>> to both
  run_agent_loop and run_agent_loop_streaming so the synthetic ToolUse is
  inserted AFTER validate_and_repair (prevents orphan removal of the
  ToolResult user turn)

- Strip @default suffix from agent IDs in tool_agent_send (pre-existing fix)
- Update test to correctly verify prune_failed_tool_turns behavior

SSRF (RightNow-AI#1060): already using canonical crate::web_fetch::check_ssrf
Thread context (RightNow-AI#1054): context.thread_id passed through; will work
correctly once smart-thread sets it

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ferr079
Copy link
Copy Markdown
Author

ferr079 commented Apr 22, 2026

Hey @jaberjaber23 — just a friendly ping. The rebase on latest main is done, 929 tests pass, zero clippy warnings. Anything else needed on my end before this can be merged? Happy to address any additional feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants