
feat(DATAGO-126152): Secure Tool Runtime (STR) sandbox worker #1086

Draft
mo-radwan1 wants to merge 4 commits into main from mradwan/DATAGO-126152-str-broker-executor


@mo-radwan1
Collaborator

Summary

  • Add Secure Tool Runtime (STR) sandbox worker for executing user-uploaded Python tools in isolated containers
  • Add generic object storage abstraction supporting S3, GCS, and Azure Blob Storage
  • Add sandboxed executor for broker-based tool invocation with timeout and resource limits

Details

Sandbox Worker

  • Dockerized worker that syncs tool packages from object storage and executes tools in isolated processes
  • Manifest-driven tool discovery and hot-reload on manifest changes
  • Broker-based communication protocol (request/response via Solace topics)
  • Resource limits, timeout enforcement, and graceful shutdown
  • Context facade providing get_config() and get_secret() for tool authors
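
From a tool author's point of view, the context facade might be used roughly as follows. This is only a sketch: `FakeToolContext`, the argument shapes of `get_config()` and `get_secret()`, and the `weather_lookup` tool are illustrative assumptions, not the API defined in this PR.

```python
from typing import Any, Optional


class FakeToolContext:
    """Hypothetical stand-in for the context facade handed to a tool."""

    def __init__(self, config: dict, secrets: dict):
        self._config = config
        self._secrets = secrets

    def get_config(self, key: str, default: Any = None) -> Any:
        # Assumed signature: look up a per-tool config value.
        return self._config.get(key, default)

    def get_secret(self, key: str) -> Optional[str]:
        # Assumed signature: look up a secret by name.
        return self._secrets.get(key)


def weather_lookup(city: str, ctx: FakeToolContext) -> dict:
    """Hypothetical tool that reads its settings from ctx, not from globals."""
    units = ctx.get_config("units", "metric")
    api_key = ctx.get_secret("WEATHER_API_KEY")
    if api_key is None:
        return {"ok": False, "error": "missing secret WEATHER_API_KEY"}
    # A real tool would call an external service here.
    return {"ok": True, "city": city, "units": units}
```

Routing configuration and secrets through the context object keeps the tool body free of direct environment access, which fits running it inside a restricted sandbox.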

Sandboxed Executor

  • New executor type (sandboxed_executor.py) for SAM agents to invoke STR-hosted tools
  • Sends tool invocation requests via the broker and waits for the sandbox worker's response
  • Configurable timeout per tool, automatic retry on transient failures
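
The timeout-and-retry behaviour described above could look something like the sketch below. The names `invoke_with_retry` and `TransientError`, the attempt count, and the choice to treat a timeout as retryable are assumptions for illustration; the actual logic in `sandboxed_executor.py` may differ.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class TransientError(Exception):
    """Hypothetical marker for failures that are safe to retry."""


async def invoke_with_retry(
    send: Callable[[], Awaitable[dict]],
    timeout_s: float,
    max_attempts: int = 3,
    backoff_s: float = 0.0,
) -> dict:
    """Call send() with a per-attempt timeout, retrying transient failures."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            # Bound each request/response round trip by the tool's timeout.
            return await asyncio.wait_for(send(), timeout=timeout_s)
        except (asyncio.TimeoutError, TransientError) as exc:
            last_error = exc
            if attempt < max_attempts:
                # Linear backoff between attempts (assumed policy).
                await asyncio.sleep(backoff_s * attempt)
    raise RuntimeError(
        f"tool invocation failed after {max_attempts} attempts"
    ) from last_error
```

Retrying after a timeout can cause a tool to run more than once, so a real implementation would likely restrict retries to idempotent tools or to failures known to occur before execution starts.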

Object Storage Abstraction

  • services/platform/storage/ — async clients for S3, GCS, Azure (platform service side)
  • sandbox/storage/ — sync clients for S3, GCS, Azure (sandbox worker side)
  • Factory pattern with provider auto-detection from environment variables
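
A factory with provider auto-detection might be structured along these lines. The environment variable names used for detection, the `OBJECT_STORAGE_TYPE` override, and the stub client classes are assumptions; the PR does not state which variables it inspects.

```python
import os
from typing import Mapping, Optional


class S3Client:
    provider = "s3"


class GCSClient:
    provider = "gcs"


class AzureBlobClient:
    provider = "azure"


# Hypothetical registry: provider name -> (client class, env var that implies it).
_PROVIDERS = {
    "s3": (S3Client, "AWS_ACCESS_KEY_ID"),
    "gcs": (GCSClient, "GOOGLE_APPLICATION_CREDENTIALS"),
    "azure": (AzureBlobClient, "AZURE_STORAGE_CONNECTION_STRING"),
}


def detect_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the provider name implied by the environment."""
    env = os.environ if env is None else env
    explicit = env.get("OBJECT_STORAGE_TYPE")  # hypothetical explicit override
    if explicit:
        name = explicit.lower()
        if name not in _PROVIDERS:
            raise ValueError(f"unknown storage provider: {explicit}")
        return name
    matches = [name for name, (_, var) in _PROVIDERS.items() if env.get(var)]
    if len(matches) != 1:
        # Refuse to guess when zero or several providers look configured.
        raise ValueError(f"cannot auto-detect storage provider, candidates: {matches}")
    return matches[0]


def create_client(env: Optional[Mapping[str, str]] = None):
    """Instantiate the client class for the detected provider."""
    client_cls, _ = _PROVIDERS[detect_provider(env)]
    return client_cls()
```

Failing loudly on ambiguous environments is one possible design choice; a fixed precedence order would be another.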

A2A Protocol Extensions

  • Common A2A protocol types for tool invocation messages

Test plan

  • Unit tests for object storage abstraction
  • Unit tests for tool sync service
  • Unit tests for sandbox storage factory
  • E2E: Upload tool package, deploy agent, invoke tool via sam task send
  • E2E: Verify sandbox worker syncs manifest and tools from S3
  • E2E: Verify tool config values flow through to ctx.get_config()

Implement broker-based remote tool execution via bubblewrap-sandboxed
worker containers. Agents delegate tool calls over Solace to a sandbox
worker that executes customer-uploaded Python tools in isolated
namespaces with resource limits.

Key components:
- sandbox-worker/: Container image (Dockerfile, entrypoint, build/run scripts)
- sandbox/app.py, component.py: Worker application and message handling
- sandbox/sandbox_runner.py: bwrap subprocess lifecycle and artifact I/O
- sandbox/tool_runner.py: In-sandbox tool function executor with type-aware injection
- sandbox/context_facade.py: ToolContextFacade API backed by local FS and named pipes
- sandbox/manifest.py: YAML tool manifest with auto-reload and wheel installation
- sandbox/protocol.py: JSON-RPC 2.0 request/response models
- sandbox/storage/: S3/GCS/Azure sync clients for tool file distribution
- sandbox/tool_sync_service.py: Background ETag-based incremental sync
- agent/tools/executors/sandboxed_executor.py: Agent-side executor (sam_remote)
- common/a2a/protocol.py: Solace topic helpers for invoke/response/status routing
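
For orientation, a JSON-RPC 2.0 request/response pair of the kind `sandbox/protocol.py` models could be built as sketched here. The envelope fields follow the JSON-RPC 2.0 specification; the `ToolRequest` class, the helper names, and the numeric value chosen for `RESOURCE_EXHAUSTED` are assumptions, not taken from the PR.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional

INVALID_REQUEST = -32600     # defined by the JSON-RPC 2.0 spec
RESOURCE_EXHAUSTED = -32001  # hypothetical value in the server-defined range


@dataclass
class ToolRequest:
    id: Optional[str]
    method: str
    params: dict


def parse_request(raw: str) -> ToolRequest:
    """Validate the JSON-RPC 2.0 envelope and extract the request fields."""
    msg = json.loads(raw)
    if (
        not isinstance(msg, dict)
        or msg.get("jsonrpc") != "2.0"
        or not isinstance(msg.get("method"), str)
    ):
        raise ValueError("not a JSON-RPC 2.0 request")
    return ToolRequest(id=msg.get("id"), method=msg["method"], params=msg.get("params", {}))


def success(request_id: Optional[str], result: Any) -> dict:
    """Build a success response; 'result' and 'error' are mutually exclusive."""
    return {"jsonrpc": "2.0", "id": request_id, "result": result}


def failure(request_id: Optional[str], code: int, message: str, data: Any = None) -> dict:
    """Build an error response with the spec's code/message/data object."""
    error = {"code": code, "message": message}
    if data is not None:
        error["data"] = data
    return {"jsonrpc": "2.0", "id": request_id, "error": error}
```

The `id` is echoed back unchanged so the agent-side executor can correlate a response arriving over the broker with the request that produced it.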
…ecks

- Add HTTP health server with K8s startup/readiness/liveness probes
- Switch from --ro-bind / / to whitelist filesystem mounts (only /usr,
  /lib, /bin, /sbin, select /etc files, and tool source directory)
- Add --unshare-user to run sandboxed code as nobody (uid 65534)
- Add --tmpfs /var/run/secrets to hide K8s service account tokens
- Add path traversal protection for artifact filenames
- Add RLIMIT_NPROC to all sandbox profiles (fork bomb prevention)
- Add disk space checks and stale work directory cleanup
- Move auth validation before request parsing in component
- Add RESOURCE_EXHAUSTED error code to protocol
- Remove curl from final image to reduce attack surface
- Add sandbox unit tests
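
Several of the hardening items above (whitelist mounts, `--unshare-user`, the tmpfs over `/var/run/secrets`, and `RLIMIT_NPROC`) can be pictured as the argument list handed to `bwrap`. The sketch below is an approximation: the function names, the specific `/etc` entries, and the process limit are assumptions, and the real `sandbox_runner.py` may assemble its command differently.

```python
import resource
from typing import List

NOBODY_ID = 65534  # uid/gid of "nobody", as stated in the commit message

# Hypothetical selection; the commit only says "select /etc files".
ETC_ALLOWLIST = ["/etc/resolv.conf", "/etc/ssl/certs"]


def build_bwrap_argv(tool_dir: str, command: List[str]) -> List[str]:
    """Assemble a bwrap command line that mounts only whitelisted paths."""
    argv = [
        "bwrap",
        "--unshare-user",
        "--uid", str(NOBODY_ID),
        "--gid", str(NOBODY_ID),
        "--die-with-parent",
    ]
    # Read-only system directories instead of binding the whole root.
    for path in ("/usr", "/lib", "/bin", "/sbin"):
        argv += ["--ro-bind", path, path]
    # --ro-bind-try skips entries that do not exist on the host.
    for path in ETC_ALLOWLIST:
        argv += ["--ro-bind-try", path, path]
    # An empty tmpfs hides any Kubernetes service account token.
    argv += ["--tmpfs", "/var/run/secrets"]
    # The tool source directory, read-only.
    argv += ["--ro-bind", tool_dir, tool_dir]
    return argv + list(command)


def limit_processes(max_procs: int = 64) -> None:
    """Cap process creation; intended as a subprocess preexec_fn."""
    resource.setrlimit(resource.RLIMIT_NPROC, (max_procs, max_procs))
```

A caller could then launch the tool with something like `subprocess.Popen(build_bwrap_argv(tool_dir, cmd), preexec_fn=limit_processes)`, so the limit is applied in the child before `bwrap` starts.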