Problem
The LocalCommandLineCodeExecutor writes LLM-generated code to disk and executes it via asyncio.create_subprocess_exec with full host privileges. The only safeguard is a suppressible UserWarning.
Attack scenario: indirect prompt injection causes the LLM to generate malicious code (e.g., exfiltrating SSH keys, installing backdoors). The code runs with full filesystem, network, and process access.
The Docker executor provides real isolation but is opt-in and adds significant overhead. There is no lightweight, local sandboxing option between "completely unsandboxed" and "full Docker container."
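To make the problem concrete, here is a minimal sketch of the write-then-exec pattern described above. It is not AutoGen's actual implementation; `run_unsandboxed`, `PROBE`, and `demo` are hypothetical names, and the probe script stands in for LLM output. The child process inherits the parent's user, home directory, and environment with nothing in between.

```python
import asyncio
import sys
import tempfile
from pathlib import Path


async def run_unsandboxed(code: str, work_dir: Path) -> tuple[int, str]:
    """Write `code` to disk and run it as an ordinary child process."""
    script = work_dir / "generated.py"
    script.write_text(code)
    proc = await asyncio.create_subprocess_exec(
        sys.executable, str(script),
        cwd=work_dir,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    out, _ = await proc.communicate()
    return proc.returncode, out.decode()


# Stand-in for LLM-generated code: it only reports what it can reach.
PROBE = (
    "import os\n"
    "print(os.getuid() if hasattr(os, 'getuid') else 'n/a')\n"
    "print(os.path.expanduser('~'))\n"
)


def demo() -> tuple[int, str]:
    """Run the probe in a throwaway directory and return (exit code, output)."""
    with tempfile.TemporaryDirectory() as tmp:
        return asyncio.run(run_unsandboxed(PROBE, Path(tmp)))
```

The probe prints the same uid and home directory as the calling process, which is exactly the access an injected payload would have.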
Proposal
Add sandlock as a new code executor backend. Sandlock is a lightweight Linux process sandbox using Landlock, seccomp-bpf, and user namespaces.
What sandlock provides
| Layer | Protection |
|---|---|
| Landlock | Filesystem path whitelisting (read-only / read-write), network domain + port restrictions |
| seccomp-bpf | Syscall filtering at kernel level: blocks ptrace, mount, unshare, kexec_load, bpf, etc. |
| User namespaces | Privilege escalation prevention without root |
| Resource limits | Memory, process count, CPU, open file caps (no cgroups needed) |
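
The "Resource limits" row can be approximated with POSIX rlimits from the standard library alone. The sketch below is an illustration of the general technique, not sandlock's API or implementation; `run_limited` and `_cap` are hypothetical helpers, and the defaults are arbitrary. It is POSIX-only and does not cover the process-count cap.

```python
import resource
import subprocess
import sys


def _cap(limit: int, value: int) -> None:
    """Lower one rlimit, never asking for more than the current hard limit."""
    _, hard = resource.getrlimit(limit)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(limit, (value, value))


def run_limited(
    code: str,
    max_memory_mb: int = 512,
    cpu_seconds: int = 5,
    max_open_files: int = 64,
) -> subprocess.CompletedProcess:
    """Run `code` in a child interpreter with address-space, CPU and fd caps."""

    def apply_limits() -> None:
        # Runs in the child between fork() and exec(), so the parent is unaffected.
        _cap(resource.RLIMIT_AS, max_memory_mb * 1024 * 1024)
        _cap(resource.RLIMIT_CPU, cpu_seconds)
        _cap(resource.RLIMIT_NOFILE, max_open_files)

    return subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
        timeout=30,
    )
```

Rlimits need no root and no cgroup hierarchy, but on their own they do nothing about filesystem or network access; that is what the Landlock and seccomp layers are for.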
Why this fits AutoGen
- ~20ms startup vs ~200ms+ for Docker. Low enough to be a reasonable default.
- No root, no daemon. Unlike Docker, just `pip install sandlock`. No Docker socket needed.
- Fits the executor interface. AutoGen already has a clean `CodeExecutor` abstraction. A `SandlockCommandLineCodeExecutor` would implement the same interface as `LocalCommandLineCodeExecutor` but with kernel-enforced confinement.
- Aligns with Proposal: Standardized Safety Sandbox for Agent Tool Execution #7230. The proposed `ToolSafetyPolicy` interface (allow_network, allow_filesystem, max_memory) maps directly to sandlock's capabilities.
- Self-hosted. No Azure subscription needed. Works in air-gapped environments.
- Linux primary. Non-Linux users can continue using the Docker executor.
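
As a rough illustration of the `ToolSafetyPolicy` mapping mentioned above, the sketch below translates the three named fields into the keyword arguments proposed in this issue. Everything here is an assumption: #7230 only names the fields, so their types, the unit of `max_memory`, the default read paths, and the `to_sandlock_kwargs` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Sequence


@dataclass(frozen=True)
class ToolSafetyPolicy:
    # Hypothetical shape: only the field names come from #7230.
    allow_network: bool = False
    allow_filesystem: bool = False
    max_memory: int = 512  # assumed to be MiB


def to_sandlock_kwargs(
    policy: ToolSafetyPolicy,
    work_dir: str,
    allowed_hosts: Sequence[str] = (),
) -> dict[str, Any]:
    """Translate a policy into the keyword arguments proposed in this issue."""
    return {
        "work_dir": work_dir,
        # Interpreter and library paths stay readable so Python can start.
        "fs_read": ["/usr/lib/python3", "/usr/local/lib"],
        # Assumption: allow_filesystem gates write access to the work dir only.
        "fs_write": [work_dir] if policy.allow_filesystem else [],
        # Assumption: hosts are passed through in whatever format sandlock expects.
        "net_allow": list(allowed_hosts) if policy.allow_network else [],
        "max_memory_mb": policy.max_memory,
    }
```

With the default policy this yields an empty `net_allow` and an empty `fs_write`, so denying access is the starting point and every permission is an explicit opt-in.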
Where it fits in the executor hierarchy
LocalCommandLineCodeExecutor (unsandboxed) → SandlockCodeExecutor (lightweight OS-level) → DockerCodeExecutor (container isolation) → AzureCodeExecutor (cloud isolation)
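
One way a caller might walk this hierarchy at runtime is sketched below. The helper names (`landlock_enabled`, `pick_tier`, `detect_tier`) are hypothetical, the Azure tier is left out because it needs credentials rather than local detection, and the check assumes securityfs is mounted so that `/sys/kernel/security/lsm` lists the active LSMs.

```python
import shutil
import sys
from pathlib import Path


def landlock_enabled(lsm_file: str = "/sys/kernel/security/lsm") -> bool:
    """True if the kernel lists landlock among its active LSMs."""
    try:
        return "landlock" in Path(lsm_file).read_text().strip().split(",")
    except OSError:
        # File missing or unreadable: non-Linux host or securityfs not mounted.
        return False


def pick_tier(platform: str, has_landlock: bool, has_docker: bool) -> str:
    """Prefer the lightweight sandbox, then Docker, then the unsandboxed executor."""
    if platform.startswith("linux") and has_landlock:
        return "sandlock"
    if has_docker:
        return "docker"
    return "local"


def detect_tier() -> str:
    """Apply pick_tier to the current host."""
    return pick_tier(
        sys.platform,
        landlock_enabled(),
        shutil.which("docker") is not None,
    )
```

Falling back to `"local"` keeps the sketch short; a real integration might prefer to raise instead of silently running unsandboxed.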
Example usage
from autogen_ext.code_executors.sandlock import SandlockCommandLineCodeExecutor

executor = SandlockCommandLineCodeExecutor(
    work_dir="./coding",
    fs_read=["/usr/lib/python3", "/usr/local/lib"],
    fs_write=["./coding"],
    net_allow=[],  # no network by default
    max_memory_mb=512,
    max_procs=5,
)
Relation to existing issues
- [Security] LocalCommandLineCodeExecutor executes LLM-generated code without sandboxing #7462 (LocalCommandLineCodeExecutor unsandboxed): sandlock provides a drop-in replacement with kernel enforcement
- Proposal: Standardized Safety Sandbox for Agent Tool Execution #7230 (ToolSafetyPolicy proposal): sandlock's per-sandbox policy maps directly to the proposed interface
- Security: Fix Path Traversal in LocalCommandLineCodeExecutor #7181 (path traversal in LocalCommandLineCodeExecutor): Landlock filesystem whitelisting prevents writes outside allowed paths at the kernel level
- MCP tool poisoning can enable arbitrary code execution via unsigned tool definitions #7427 (MCP tool poisoning): sandboxed tool execution limits blast radius of malicious MCP tools
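
For contrast with the kernel-level enforcement mentioned for #7181, this is roughly what a userspace containment check looks like. It is a generic sketch, not the fix applied in that issue, and `is_inside` is a hypothetical helper.

```python
from pathlib import Path


def is_inside(candidate: str, allowed_root: str) -> bool:
    """True if `candidate`, taken relative to `allowed_root`, stays under it."""
    root = Path(allowed_root).resolve()
    # An absolute candidate replaces the root when joined, so it is judged on its own.
    target = (root / candidate).resolve()
    return target == root or root in target.parents
```

A check like this validates file names the executor itself chooses, but it cannot constrain what the generated code opens once it is running. A Landlock rule applies to every filesystem access the sandboxed process makes, which is why the two are complementary, not interchangeable.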
Alternatives considered
- Docker-only: adds complexity and overhead; not available in all environments (CI, minimal VMs, air-gapped)
- Hardening LocalCommandLineCodeExecutor with blocklists: fundamentally limited; new bypasses will always emerge
- firejail: requires root or setuid binary
- bubblewrap (bwrap): lower-level, no Python API, no resource limits without cgroups
- gVisor: heavy, requires Docker or dedicated kernel
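
The blocklist limitation can be shown in a few lines. The sketch below uses a deliberately naive substring blocklist (hypothetical, not anything AutoGen ships) and two harmless snippets that do the same thing; only one of them is caught.

```python
import contextlib
import io

# A naive source-level blocklist of the kind the alternative would rely on.
BLOCKLIST = ("import os", "import subprocess", "os.system")


def passes_blocklist(source: str) -> bool:
    """True if none of the blocked patterns appears verbatim in the source."""
    return not any(pattern in source for pattern in BLOCKLIST)


def run(source: str) -> str:
    """Execute trusted demo source and return what it printed."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(source, {})
    return buf.getvalue()


DIRECT = "import os\nprint(os.getcwd())"
# Same behavior, but the module name is assembled at runtime.
OBFUSCATED = "m = __import__('o' + 's')\nprint(m.getcwd())"
```

`passes_blocklist(DIRECT)` is false while `passes_blocklist(OBFUSCATED)` is true, even though `run` produces identical output for both. Extending the pattern list only invites the next rewrite, whereas a syscall or filesystem rule does not depend on how the source is spelled.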