Add lightweight OS-level sandboxing via sandlock for LocalCommandLineCodeExecutor #7475

@congwang-mk

Problem

The LocalCommandLineCodeExecutor writes LLM-generated code to disk and executes it via asyncio.create_subprocess_exec with full host privileges. The only safeguard is a suppressible UserWarning.
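The execution path described above can be reproduced in a few lines. This is a minimal sketch, not the executor's actual source: the helper name `run_unsandboxed` and the file name `generated.py` are hypothetical, and only the write-to-disk and `asyncio.create_subprocess_exec` steps come from the description. It shows that the child process runs under the same user ID as the host process, with nothing restricting what it can reach.

```python
import asyncio
import os
import sys
import tempfile
from pathlib import Path


async def run_unsandboxed(code: str) -> str:
    """Write code to disk and run it in a plain subprocess (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as work_dir:
        script = Path(work_dir) / "generated.py"  # illustrative file name
        script.write_text(code)
        # No namespace, seccomp filter, or Landlock ruleset is applied here:
        # the child inherits the parent's user, filesystem view, and network.
        proc = await asyncio.create_subprocess_exec(
            sys.executable,
            str(script),
            cwd=work_dir,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        stdout, _ = await proc.communicate()
        return stdout.decode()


if __name__ == "__main__":
    # The "generated" code reports the UID it runs as; it matches the host UID.
    child_uid = asyncio.run(run_unsandboxed("import os; print(os.getuid())"))
    print("host uid:", os.getuid(), "child uid:", child_uid.strip())
```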

Attack scenario: indirect prompt injection causes the LLM to generate malicious code (e.g., exfiltrating SSH keys, installing backdoors). The code runs with full filesystem, network, and process access.

The Docker executor provides real isolation but is opt-in and adds significant overhead. There is no lightweight, local sandboxing option between "completely unsandboxed" and "full Docker container."

Proposal

Add sandlock as a new code executor backend. Sandlock is a lightweight Linux process sandbox using Landlock, seccomp-bpf, and user namespaces.

What sandlock provides

| Layer | Protection |
|---|---|
| Landlock | Filesystem path whitelisting (read-only / read-write), network domain + port restrictions |
| seccomp-bpf | Syscall filtering at kernel level: blocks ptrace, mount, unshare, kexec_load, bpf, etc. |
| User namespaces | Privilege escalation prevention without root |
| Resource limits | Memory, process count, CPU, open file caps (no cgroups needed) |
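To make the "no cgroups needed" row concrete, here is a sketch of how per-process caps can be applied with POSIX rlimits from the Python standard library. This is an assumption about the general mechanism, not a description of how sandlock implements its limits; `run_with_limits` and its parameter names are hypothetical. It is Unix-only.

```python
import resource
import subprocess
import sys


def _cap(limit: int, value: int) -> None:
    """Lower one rlimit, never asking for more than the current hard limit."""
    _, hard = resource.getrlimit(limit)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(limit, (value, value))


def run_with_limits(code, max_memory_mb=512, max_open_files=64, cpu_seconds=5):
    """Run code in a child Python process with rlimits applied (hypothetical helper)."""

    def apply_limits():
        # Runs in the child after fork() and before exec(), so the limits
        # apply only to the untrusted process, not to the caller.
        _cap(resource.RLIMIT_AS, max_memory_mb * 1024 * 1024)  # address space
        _cap(resource.RLIMIT_NOFILE, max_open_files)           # open files
        _cap(resource.RLIMIT_CPU, cpu_seconds)                 # CPU seconds

    return subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
        timeout=30,
    )


if __name__ == "__main__":
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"
    print(run_with_limits(probe).stdout.strip())
```

Process-count limits are left out of the sketch because `RLIMIT_NPROC` counts processes per user rather than per sandbox, so it needs more care than a single `setrlimit` call.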

Why this fits AutoGen

  • Fast startup. About 20 ms versus 200 ms or more for Docker, which is low enough to be a reasonable default.
  • No root, no daemon. Unlike Docker, sandlock needs no Docker socket or background service; installation is just pip install sandlock.
  • Fits the executor interface. AutoGen already has a clean CodeExecutor abstraction. A SandlockCommandLineCodeExecutor would implement the same interface as LocalCommandLineCodeExecutor but with kernel-enforced confinement.
  • Aligns with Proposal: Standardized Safety Sandbox for Agent Tool Execution #7230. The proposed ToolSafetyPolicy interface (allow_network, allow_filesystem, max_memory) maps directly to sandlock's capabilities.
  • Self-hosted. No Azure subscription needed. Works in air-gapped environments.
  • Linux-first. Landlock, seccomp-bpf, and user namespaces are Linux kernel features, so non-Linux users can continue using the Docker executor.
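The mapping from the ToolSafetyPolicy fields to executor settings could look roughly like the sketch below. Everything here is an assumption for illustration: the field types of `ToolSafetyPolicy` (booleans and a memory cap in MiB), the helper `to_sandlock_kwargs`, and the `network_hosts` parameter are hypothetical, and the keyword names are taken from the example usage in this proposal rather than from an existing API.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ToolSafetyPolicy:
    """Hypothetical shape of the policy from #7230; field types are assumed."""

    allow_network: bool = False
    allow_filesystem: bool = False
    max_memory: int = 512  # assumed to be MiB


def to_sandlock_kwargs(
    policy: ToolSafetyPolicy,
    work_dir: str,
    network_hosts: Sequence[str] = (),
) -> dict:
    """Translate a policy into keyword arguments for the proposed executor."""
    return {
        "work_dir": work_dir,
        # Read-only paths the interpreter needs; copied from the example usage.
        "fs_read": ["/usr/lib/python3", "/usr/local/lib"],
        # Writes are confined to the work directory, and only if allowed.
        "fs_write": [work_dir] if policy.allow_filesystem else [],
        # Network stays closed unless the policy allows it AND hosts are listed.
        "net_allow": list(network_hosts) if policy.allow_network else [],
        "max_memory_mb": policy.max_memory,
    }


if __name__ == "__main__":
    print(to_sandlock_kwargs(ToolSafetyPolicy(), "./coding"))
```

Requiring an explicit host list even when `allow_network` is true keeps the translation deny-by-default, which matches the `net_allow=[]` default in the example usage.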

Where it fits in the executor hierarchy

LocalCommandLineCodeExecutor  →  SandlockCodeExecutor  →  DockerCodeExecutor  →  AzureCodeExecutor
      (unsandboxed)               (lightweight             (container            (cloud
                                   OS-level)                isolation)            isolation)
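One way an application might choose a tier from this hierarchy is sketched below. The function `pick_executor_tier` and its string return values are hypothetical and not part of AutoGen; the sketch only encodes the rule stated in this proposal (sandlock on Linux, Docker elsewhere) and refuses to fall back silently to the unsandboxed executor.

```python
import shutil
import sys
from typing import Optional


def pick_executor_tier(
    platform: str = sys.platform,
    docker_available: Optional[bool] = None,
) -> str:
    """Return the name of the lightest sandboxed tier usable on this host."""
    if docker_available is None:
        # Presence of the CLI is only a rough signal that Docker is usable.
        docker_available = shutil.which("docker") is not None
    if platform.startswith("linux"):
        # Simplification: a real check would also probe kernel support
        # for Landlock, seccomp, and unprivileged user namespaces.
        return "sandlock"
    if docker_available:
        return "docker"
    raise RuntimeError("no sandboxed executor available on this platform")


if __name__ == "__main__":
    print(pick_executor_tier())
```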

Example usage

# Proposed API: this module does not exist in autogen_ext yet.
from autogen_ext.code_executors.sandlock import SandlockCommandLineCodeExecutor

executor = SandlockCommandLineCodeExecutor(
    work_dir="./coding",
    fs_read=["/usr/lib/python3", "/usr/local/lib"],  # read-only paths
    fs_write=["./coding"],  # read-write paths
    net_allow=[],  # no network by default
    max_memory_mb=512,
    max_procs=5,
)

Alternatives considered

  • Docker-only: adds complexity and overhead; not available in all environments (CI, minimal VMs, air-gapped)
  • Hardening LocalCommandLineCodeExecutor with blocklists: fundamentally limited; new bypasses will always emerge
  • firejail: requires root or setuid binary
  • bubblewrap (bwrap): lower-level, no Python API, no resource limits without cgroups
  • gVisor: heavy, requires Docker or dedicated kernel
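The limitation of blocklists noted in the list above is easy to demonstrate. The following sketch uses a deliberately naive substring filter (the `BLOCKLIST` patterns and `passes_blocklist` are hypothetical, not anything in AutoGen) and shows that trivially obfuscated code with the same effect passes it. The snippets are only inspected as strings, never executed.

```python
# Hypothetical substring blocklist of "dangerous" patterns.
BLOCKLIST = ("import os", "import subprocess", "os.system", "rm -rf")


def passes_blocklist(code: str) -> bool:
    """Return True if no blocked pattern appears in the source text."""
    return not any(pattern in code for pattern in BLOCKLIST)


# Straightforward malicious code: caught by the filter.
direct = "import os\nos.system('id')"

# Same behavior, but the module and attribute names are assembled at
# runtime, so none of the blocked substrings appear in the source.
obfuscated = "m = __import__('o' + 's')\ngetattr(m, 'sys' + 'tem')('id')"

if __name__ == "__main__":
    print("direct passes filter:", passes_blocklist(direct))
    print("obfuscated passes filter:", passes_blocklist(obfuscated))
```

Kernel-level enforcement avoids this problem because it restricts what the process can do, regardless of how the source code is written.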
