Skip to content

An implementation of a Docker-based Sandbox for code-writing LangGraph agents

Notifications You must be signed in to change notification settings

MatteoFalcioni/LangGraph-Sandbox

Repository files navigation

LangGraph Sandbox – Docker-Based Code Execution Environment

A production-ready Docker sandbox for executing Python code within LangGraph agentic systems. Provides secure, isolated code execution with persistent state, dataset management, and automatic artifact storage.

Table of Contents

Overview

This sandbox replaces services like E2B by running isolated Docker containers for each conversation session. It provides:

  • Session-pinned containers: One container per conversation with persistent Python state
  • Multiple storage modes: RAM (TMPFS) or disk (BIND) for session data
  • Flexible dataset access: No datasets, local read-only mounts, or API-based staging
  • Automatic artifact management: Files are deduplicated, stored securely, and accessible via signed URLs
  • Enhanced debugging: Complete execution logs and state inspection in BIND mode
  • Docker Compose setup out-of-the-box: Possibility to choose network-oriented compose or simple docker image runs

Key Features

  • βœ… Resource isolation: CPU, memory, and timeout limits
  • βœ… Session state persistence: Variables and imports persist across tool calls
  • βœ… Six execution modes: From simple code execution to full dataset analysis
  • βœ… Automatic artifact pipeline: Deduplication, metadata tracking, and secure downloads
  • βœ… Enhanced debugging: Complete execution logs and state snapshots (BIND mode)
  • βœ… Production-ready: Secure token system, configurable timeouts, and error handling

Execution Modes

The system offers eight execution modes based on two independent configuration knobs:

Mode Session Storage Dataset Access Description Best For
TMPFS_NONE RAM (TMPFS) None Simple sandbox in memory Algorithms, calculations, demos
BIND_NONE Disk (BIND) None Persistent sandbox on disk Debugging, persistent development
TMPFS_LOCAL RAM (TMPFS) Local RO Memory + host datasets Fast analysis with large datasets
BIND_LOCAL Disk (BIND) Local RO Persistent + host datasets Full development with datasets
TMPFS_API RAM (TMPFS) API Memory + API datasets Multi-tenant, cloud datasets
BIND_API Disk (BIND) API Persistent + API datasets Development with API datasets
TMPFS_HYBRID RAM (TMPFS) Hybrid Memory + local + API datasets Best of both worlds
BIND_HYBRID Disk (BIND) Hybrid Persistent + local + API datasets Full development with mixed datasets

Recommended Defaults

  • Simple coding: TMPFS_NONE - Perfect for general-purpose coding without datasets
  • Production demos: TMPFS_API - Multi-tenant with dynamic dataset loading
  • Local development: BIND_LOCAL - Full debugging with local datasets
  • Mixed datasets: TMPFS_HYBRID - Local datasets + API fetching for maximum flexibility

Quick Start

Option 1: Manual Installation

# Clone the repository and build the Docker Image
git clone https://github.com/MatteoFalcioni/LangGraph-Sandbox
cd LangGraph-Sandbox

# Install Dependencies
pip install -r requirements.txt

# Set up the sandbox environment (copies required files)
python sandbox-setup

you should see:

Setting up LangGraph Sandbox...
βœ“ Copied Dockerfile.sandbox
βœ“ Copied docker-compose.yml
βœ“ Copied docker-compose.override.yml
βœ“ Copied sandbox.env.example
βœ“ Copied sandbox/ directory
#...

then:

# Set up configuration
cp sandbox.env.example sandbox.env
# Edit sandbox.env and add your OPENAI_API_KEY

# Build the Docker image
docker build -t sandbox:latest -f Dockerfile.sandbox .

Run a simple example:

# Run Simple Example
langgraph-sandbox

Option 2: Python Package Installation

# Clone and install the package
git clone https://github.com/MatteoFalcioni/LangGraph-Sandbox
cd LangGraph-Sandbox

# Ensure you have Python >= 3.11
python --version  # Should show Python 3.11.x or higher

# Install the package in development mode
pip install -e .

# Set up the sandbox environment (copies required files)
sandbox-setup

you should see:

Setting up LangGraph Sandbox...
βœ“ Copied Dockerfile.sandbox
βœ“ Copied docker-compose.yml
βœ“ Copied docker-compose.override.yml
βœ“ Copied sandbox.env.example
βœ“ Copied sandbox/ directory
#...

then:

# Set up configuration
cp sandbox.env.example sandbox.env
# Edit sandbox.env and add your OPENAI_API_KEY

# Build the Docker image
docker build -t sandbox:latest -f Dockerfile.sandbox .

Run the example:

# Run the simple sandbox
langgraph-sandbox

For Using in Other Projects:

Once you have cloned the repo to your <path_to_LangGraph-Sandbox>:

# Install the package (development mode)
pip install -e <path_to_LangGraph-Sandbox>

then follow the same steps of option 2 above

Project Structure After Setup

your-project/
β”œβ”€β”€ sandbox.env                    # Your configuration (copy from sandbox.env.example)
β”œβ”€β”€ Dockerfile.sandbox             # Copied from package
β”œβ”€β”€ docker-compose.yml             # Copied from package
β”œβ”€β”€ docker-compose.override.yml    # Copied from package
β”œβ”€β”€ sandbox.env.example            # Template (copied from package)
β”œβ”€β”€ sandbox/                       # Sandbox runtime files
β”œβ”€β”€ your_code.py                   # Your Python code
└── ...

Configuration

Configuration is managed through environment variables with sensible defaults:

Core Settings

# --- Core knobs ---
SESSION_STORAGE=TMPFS        # TMPFS | BIND
DATASET_ACCESS=API           # API | LOCAL_RO | NONE | HYBRID

# --- Host paths ---
SESSIONS_ROOT=./sessions
BLOBSTORE_DIR=./blobstore
ARTIFACTS_DB=./artifacts.db

# Required ONLY if DATASET_ACCESS=LOCAL_RO
# DATASETS_HOST_RO=./example_llm_data

# Required ONLY if DATASET_ACCESS=HYBRID
# HYBRID_LOCAL_PATH=./heavy_llm_data

# --- Docker / runtime ---
SANDBOX_IMAGE=sandbox:latest
TMPFS_SIZE_MB=1024

# --- Artifact display options ---
IN_CHAT_URL=false            # true | false (default: false)

# --- Network configuration ---
# "host" strategy: Uses port mapping, works everywhere (recommended)
# "container" strategy: Uses Docker network DNS, requires Docker Compose
SANDBOX_ADDRESS_STRATEGY=host  # "container" for Docker network DNS, "host" for port mapping
COMPOSE_NETWORK=langgraph-network  # Docker network name (optional, only used with container strategy)
HOST_GATEWAY=host.docker.internal  # Gateway hostname for host strategy (auto-detected in WSL2)

# --- External API keys ---
# OPENAI_API_KEY=your_openai_api_key_here

# --- Artifacts service (optional) ---
# ARTIFACTS_SECRET=fixed_artifact_secret           # use this if you want to standardize keys, (otherwise generated randomly)
# ARTIFACTS_PUBLIC_BASE_URL=http://localhost:8000  # default: http://localhost:8000
# ARTIFACTS_TOKEN_TTL_SECONDS=600                  # default: 600 seconds

Quick Configuration Examples

Simple execution:

In sandbox.env, set DATASET_ACCESS=NONE, then:

langgraph-sandbox

Local development:

In sandbox.env, set SESSION_STORAGE=BIND, DATASET_ACCESS=LOCAL_RO and DATASETS_HOST_RO=./, then:

langgraph-sandbox

Hybrid mode (local + API datasets):

In sandbox.env, set DATASET_ACCESS=HYBRID and HYBRID_LOCAL_PATH=./heavy_llm_data, then:

langgraph-sandbox

Production (default -> TMPFS+API):

langgraph-sandbox

Docker Compose:

# Start container
docker-compose up -d

# Run langgraph-sandbox on HOST machine (connects to container)
langgraph-sandbox

# Or customize in docker-compose.yml:
# SANDBOX_ADDRESS_STRATEGY=container
# COMPOSE_NETWORK=langgraph-network

Artifact Display Options

The IN_CHAT_URL setting controls how generated artifacts (plots, files, etc.) are displayed:

  • IN_CHAT_URL=false (default): Artifacts are not displayed in chat, but are still returned as artifacts in tools and can be displayed in main app; see main.py for an example.

  • IN_CHAT_URL=true: Artifacts are displayed directly in the chat with download links. Useful for interactive sessions where you want immediate access to generated files.

Note: careful if adding artifacts URLs to chat, because they might confuse your agent. For bigger, smarter models it's fine, but smaller models may run off track seeing urls.

Network Configuration

Host Strategy (default, recommended):

SANDBOX_ADDRESS_STRATEGY=host
HOST_GATEWAY=host.docker.internal
  • Uses host port mapping and gateway
  • Compatible with traditional Docker deployment
  • URL: http://host.docker.internal:{mapped_port} (or http://localhost:{mapped_port} in WSL2)
  • Auto-detects environment: Automatically uses localhost in WSL2, host.docker.internal in Docker Desktop
  • This is the default configuration - works out of the box everywhere

Container Strategy (for Docker Compose deployments):

SANDBOX_ADDRESS_STRATEGY=container
COMPOSE_NETWORK=langgraph-network
  • Containers communicate via Docker network DNS
  • No port mapping needed
  • URL: http://sbox-{session_id}:9000
  • Requires Docker Compose setup (see Docker Compose Setup below)

Docker Compose Setup

The project includes a docker-compose.yml file with pre-configured networking for the container strategy:

βœ… Container Strategy Networking (Already Configured):

The docker-compose.yml file already includes:

  • Custom Docker network: langgraph-network
  • Proper network configuration for container strategy
  • Default environment variables set correctly

Note: The default sandbox.env.example uses Host Strategy. To use Container Strategy with Docker Compose, change SANDBOX_ADDRESS_STRATEGY=container in your sandbox.env file.

Manual Network Creation (if needed):

If you encounter network issues, you can manually create the network:

docker network create langgraph-network

Docker Compose Example

The project includes a docker-compose.yml file demonstrating both strategies:

Container Strategy (default):

services:
  app:
    environment:
      SANDBOX_ADDRESS_STRATEGY: container
      COMPOSE_NETWORK: langgraph-network
    networks:
      - langgraph-network

networks:
  langgraph-network:
    driver: bridge

Host Strategy (override):

# docker-compose.override.yml
services:
  app:
    environment:
      SANDBOX_ADDRESS_STRATEGY: host
      HOST_GATEWAY: host.docker.internal
    ports:
      - "9000:9000"  # Sandbox REPL

Run with:

# Container strategy (default)
docker-compose up -d
langgraph-sandbox  # Run on HOST machine

# Host strategy (with override)
docker-compose -f docker-compose.yml -f docker-compose.override.yml up -d
langgraph-sandbox  # Run on HOST machine

Tool Factory

The system provides production-ready LangGraph tools out of the box through simple factory functions. These tools handle all the complexity of container management, session persistence, and artifact processing automatically.

πŸš€ Ready-to-Use Tools

make_code_sandbox_tool - Execute Python code with persistent state

  • Maintains variables and imports across executions
  • Automatic artifact detection and ingestion
  • Memory cleanup to prevent container bloat
  • Session-pinned containers for consistency

make_select_dataset_tool - Load datasets on-demand (API and HYBRID modes)

  • Fetches datasets from your API sources
  • Stages them directly into the sandbox
  • Smart caching with PENDING/LOADED/FAILED status tracking
  • Client wrapper pattern for easy integration

make_export_datasets_tool - Export modified datasets to host filesystem

  • Timestamped exports to prevent overwrites
  • Automatic artifact ingestion for download URLs
  • Works with any file in /data/

make_list_datasets_tool - List available datasets in the sandbox

  • API mode: Lists datasets loaded in /data (dynamically loaded datasets)
  • LOCAL_RO mode: Lists statically mounted files in /data (host-mounted datasets)
  • HYBRID mode: Lists both local mounted files and API-loaded datasets in /data
  • NONE mode: Returns message indicating no datasets are available
  • Provides detailed file information including size, modification time, and paths
# Example: Create tools in 4 lines
def get_session_key():
    return "my_session_id"  # Implement your session management logic if needed

code_tool = make_code_sandbox_tool(session_manager=sm, session_key_fn=get_session_key)
dataset_tool = make_select_dataset_tool(session_manager=sm, fetch_fn=my_fetch, client=my_client)
export_tool = make_export_datasets_tool(session_manager=sm, session_key_fn=get_session_key)
list_tool = make_list_datasets_tool(session_manager=sm, session_key_fn=get_session_key)

Session Key Management: The session_key_fn parameter is crucial for maintaining session isolation. Each unique session key gets its own container, so implement proper session management in your application (e.g., user ID, conversation ID, or thread ID). You can see in our main.py example that we did it by generating a random session key.

Note: Notice that the make_select_dataset_tool expects the fetch_fn parameter to be a function that, given your dataset_id, returns the dataset bytes through your API client.

If in your implementation you need to pass your API client to the select_dataset tool (as we did in our custom example), you can do so by simply specifing client as an input parameter. It will be automatically wrapped without any needed changes.

Artifact System

Files saved to /session/artifacts/ are automatically processed:

  1. Ingestion: Files are copied from the container
  2. Deduplication: SHA-256 based content addressing
  3. Storage: Saved in blobstore/ with metadata in SQLite
  4. Access: Secure download URLs with time-limited tokens
  5. Cleanup: Original files removed from session directory

Artifact Features

  • Automatic deduplication: Identical files share storage
  • Secure downloads: Signed URLs with configurable expiration
  • Metadata tracking: Size, MIME type, creation time, session links
  • Content addressing: SHA-256 based blob storage
  • API access: REST endpoints for artifact management

Session Management

Session Lifecycle

  1. Creation: Container started with session-specific configuration
  2. Execution: Code runs with persistent Python state
  3. Artifact Processing: Files automatically ingested after each run
  4. Cleanup: Container stopped, resources released

Session State

  • Variables: All Python variables persist between executions
  • Imports: Module imports remain available
  • Working Directory: Consistent /session working directory
  • Environment: Isolated Python environment with common packages

Development & Debugging

BIND Mode Enhanced Features

When using SESSION_STORAGE=BIND, the system provides comprehensive debugging capabilities:

Session Directory Structure

sessions/<session_id>/
β”œβ”€β”€ session.log                    # Complete execution history
β”œβ”€β”€ session_metadata.json          # Session info and statistics
β”œβ”€β”€ python_state.json             # Current Python state snapshot
└── artifacts/                    # Ingested at runtime

Session Viewer Tool

# View complete session information
python langgraph_sandbox/sandbox/session_viewer.py sessions/<session_id>

# View last 10 log entries
python langgraph_sandbox/sandbox/session_viewer.py sessions/<session_id> --limit 10

# Skip state and artifacts display
python langgraph_sandbox/sandbox/session_viewer.py sessions/<session_id> --no-state --no-artifacts

Debugging Benefits

  • Complete execution history: See all code executed and results
  • State inspection: View variables, imports, and Python environment
  • Artifact tracking: Monitor file creation and processing
  • Container information: Track container lifecycle and configuration
  • Performance analysis: Execution timing and resource usage

API Reference

Artifact API Endpoints

  • GET /artifacts/{artifact_id}?token={token} - Download artifact
  • HEAD /artifacts/{artifact_id}?token={token} - Get artifact metadata

Configuration API

  • Config.from_env() - Load configuration from environment
  • Config.from_env(env_file_path) - Load from specific file

Session Management

  • SessionManager.start(session_id) - Start new session
  • SessionManager.exec(session_id, code) - Execute code
  • SessionManager.stop(session_id) - Stop session

Next Steps

  • Quotas: Add per-session size limits and retention policies
  • Scalability: Migrate to Postgres + S3/MinIO for production
  • Monitoring: Add metrics and health checks

For more examples and detailed documentation, see the usage_examples/ directory.

About

An implementation of a Docker-based Sandbox for code-writing LangGraph agents

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages