fix(docker): bundle Python runtime for portable /agent-server #2678

Closed
simonrosenberg wants to merge 1 commit into main from fix/portable-agent-server-python

@simonrosenberg simonrosenberg commented Apr 2, 2026

Summary

  • Bundle the Python interpreter, stdlib, and libpython from the builder stage into /agent-server/.python/ and repoint the venv symlinks at it
  • This makes /agent-server fully self-contained — eval images can COPY it onto any base image without needing Python at /usr/local/bin/python3
  • Keeps --python-preference only-system to avoid the seccomp/executable-stack issue with python-build-standalone

Problem

SDK v1.15.0 (commit 06b9186) switched from uv-managed Python to --python-preference only-system, which creates a venv with symlinks to /usr/local/bin/python3 (from the python:3.13-bookworm builder). When this venv is COPYed onto commit0 base images (Ubuntu 22.04, Python at /usr/bin/python3), the symlink is broken and the container fails to start:

exec: "/agent-server/.venv/bin/python": stat /usr/local/bin/python3: no such file or directory

This causes all commit0 evaluation pods to get stuck in "pending" status (OpenHands/benchmarks#607).

SWE-bench was unaffected because its base images derive from Python Docker images that have /usr/local/bin/python3.
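
The failure mode can be reproduced without Docker. The sketch below is illustrative only: temporary directories stand in for the builder and target images, and the fake interpreter is a placeholder script, not anything from the real build. It shows that an absolute symlink resolves while the builder's interpreter exists and dangles once only the /agent-server tree is carried over.

```shell
#!/bin/sh
# Illustrative only: temporary directories stand in for the builder and target images.
set -eu
work=$(mktemp -d)

# "Builder": an interpreter outside the venv, linked from the venv by absolute path.
mkdir -p "$work/builder/usr/local/bin" "$work/agent-server/.venv/bin"
printf '#!/bin/sh\necho fake-python\n' > "$work/builder/usr/local/bin/python3"
chmod +x "$work/builder/usr/local/bin/python3"
ln -s "$work/builder/usr/local/bin/python3" "$work/agent-server/.venv/bin/python"

link="$work/agent-server/.venv/bin/python"
# [ -e ] follows the symlink, so it is true only while the link target exists.
if [ -e "$link" ]; then builder_state=resolves; else builder_state=dangling; fi

# "Target": only the agent-server tree survives; the builder's interpreter is gone.
rm -rf "$work/builder"
if [ -e "$link" ]; then target_state=resolves; else target_state=dangling; fi

echo "builder: $builder_state, target: $target_state"
# prints: builder: resolves, target: dangling
rm -rf "$work"
```

A relative link into a directory that travels with the venv (the approach this PR takes with .python/) does not have this problem, because the link target is copied along with the link.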

Test plan

  • Built image locally with commit0 base (docker.io/wentingzhao/tinydb:v0) — container starts and Python resolves correctly
  • Verified venv symlinks point to bundled .python/ directory
  • Confirmed bundled Python runs: Python: /agent-server/.venv/bin/python 3.13.12
  • Rebuild commit0 eval images and run evaluation
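
The symlink check in the test plan could be scripted along the following lines. This is a sketch, not the verification that was actually run: the function name `check_venv_links` is hypothetical, and it assumes a `readlink` that supports `-f` (GNU coreutils or BusyBox).

```shell
# check_venv_links ROOT: succeed only if ROOT/.venv/bin contains at least one
# python* symlink and every such symlink resolves to a path under ROOT/.python/.
check_venv_links() {
  root=$(readlink -f "$1") || return 1
  found=0
  for f in "$root"/.venv/bin/python*; do
    # Skip regular files and the unexpanded glob pattern itself.
    [ -L "$f" ] || continue
    found=1
    target=$(readlink -f "$f") || return 1
    case "$target" in
      "$root"/.python/*) ;;
      *) echo "not bundled: $f -> $target" >&2; return 1 ;;
    esac
  done
  [ "$found" -eq 1 ]
}

# Example invocation inside the built image:
#   check_venv_links /agent-server && echo "venv links are self-contained"
```

Resolving both the root and each link with `readlink -f` keeps the comparison valid even when the directory is itself reached through a symlink.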

Fixes OpenHands/benchmarks#607

🤖 Generated with Claude Code


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
|---------|---------------|------------|-------------|
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22-slim | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:d7b6700-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-d7b6700-python \
  ghcr.io/openhands/agent-server:d7b6700-python

All tags pushed for this build

ghcr.io/openhands/agent-server:d7b6700-golang-amd64
ghcr.io/openhands/agent-server:d7b6700-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:d7b6700-golang-arm64
ghcr.io/openhands/agent-server:d7b6700-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:d7b6700-java-amd64
ghcr.io/openhands/agent-server:d7b6700-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:d7b6700-java-arm64
ghcr.io/openhands/agent-server:d7b6700-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:d7b6700-python-amd64
ghcr.io/openhands/agent-server:d7b6700-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:d7b6700-python-arm64
ghcr.io/openhands/agent-server:d7b6700-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:d7b6700-golang
ghcr.io/openhands/agent-server:d7b6700-java
ghcr.io/openhands/agent-server:d7b6700-python

About Multi-Architecture Support

  • Each variant tag (e.g., d7b6700-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., d7b6700-python-amd64) are also available if needed

After building the venv with system Python, copy the interpreter binary,
standard library, and libpython shared objects into /agent-server/.python/.
Re-point the venv symlinks and pyvenv.cfg at the bundled copy so that the
entire /agent-server directory is self-contained.

This means eval images (and any other consumer) can COPY /agent-server onto
any base image — even one without Python at /usr/local/bin — and the
entrypoint will resolve.

Without this fix, commit0 eval images (Ubuntu 22.04 base, Python at
/usr/bin/python3) fail to start because the venv symlinks point to the
builder's /usr/local/bin/python3 which doesn't exist in the target image.

Fixes OpenHands/benchmarks#607

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions bot commented Apr 2, 2026

Python API breakage checks — ✅ PASSED

Result: PASSED

Action log

simonrosenberg pushed a commit to OpenHands/benchmarks that referenced this pull request Apr 2, 2026
Update SDK submodule to include the Python runtime bundling fix
(OpenHands/software-agent-sdk#2678). The agent-server Dockerfile now
bundles Python into /agent-server/.python/, making the venv portable
across base images.

This fixes commit0 evaluations where all runtime pods were stuck in
pending because the agent-server container couldn't start — the venv
symlinked to /usr/local/bin/python3 which doesn't exist in commit0's
Ubuntu 22.04 base images.

Fixes #607

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions bot commented Apr 2, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: PASSED

Action log

@all-hands-bot all-hands-bot left a comment

Taste Rating: 🟡 Acceptable - Pragmatic fix for a real production issue, but adds build complexity without automated verification.

Key Insight: This solves the immediate portability problem, but the solution's robustness depends on several implicit assumptions about Python's installation structure that could break in future Debian releases.

Comment on lines +60 to +64
RUN set -eux; \
REAL_PYTHON=$(readlink -f .venv/bin/python3); \
PY_VER=$("${REAL_PYTHON}" -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')"); \
PYTHON_PREFIX=$("${REAL_PYTHON}" -c "import sys; print(sys.base_prefix)"); \
# --- copy interpreter binary ------------------------------------------------- \

🟠 Important - Robustness: This loop assumes libpython*.so* files exist and follow Debian's naming convention. If no libraries match (unlikely but possible in minimal Python builds), the loop silently succeeds.

Consider adding a verification step after the loop:

if ! ls .python/lib/libpython*.so* 1>/dev/null 2>&1; then \
    echo "ERROR: No libpython shared libraries found"; \
    exit 1; \
fi;

Comment on lines +70 to +77
mkdir -p .python/lib; \
cp -a "${PYTHON_PREFIX}/lib/python${PY_VER}" ".python/lib/python${PY_VER}"; \
rm -rf ".python/lib/python${PY_VER}/test" \
".python/lib/python${PY_VER}/tests" \
".python/lib/python${PY_VER}/idle_test" \
".python/lib/python${PY_VER}/idlelib"; \
# --- copy shared libraries (libpython) --------------------------------------- \
for lib in "${PYTHON_PREFIX}"/lib/libpython*.so*; do \

🟡 Suggestion - Defensive Programming: This loop assumes all python* files in .venv/bin/ are symlinks (the || continue handles non-symlinks). While this should be true for uv-created venvs, it's an implicit assumption.

The subsequent explicit checks (lines 80-81) are good defensive programming, but consider documenting this assumption in a comment.

[ -L .venv/bin/python3 ] || ln -s "../../.python/bin/python3" .venv/bin/python3; \
sed -i "s|^home = .*|home = /agent-server/.python/bin|" .venv/pyvenv.cfg; \
# --- quick smoke-test inside the builder ------------------------------------- \
.venv/bin/python -c "import sys; print('bundled python:', sys.executable, sys.version)"

🟠 Important - Testing Gap: The smoke test only verifies the interpreter runs and can import sys. It doesn't verify:

  • Critical stdlib modules (e.g., import ssl, json, urllib)
  • That the bundled libpython is actually being used
  • That the venv can install/import packages

Consider a more comprehensive smoke test:

.venv/bin/python -c "import sys, ssl, json, urllib.request; import openhands.agent_server; print('✓ bundled python:', sys.executable, sys.version)"
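
For the second bullet (confirming that the bundled libpython is the one actually loaded), one possible check is to inspect what the dynamic loader resolves. The sketch below assumes a glibc-based image where `ldd` is available and an interpreter that is dynamically linked against libpython; both helper names are hypothetical and are not part of the PR.

```shell
# Print the path that libpython resolves to, reading `ldd` output on stdin.
# Resolved entries look like: "libNAME.so.N => /path/to/libNAME.so.N (0x...)".
resolved_libpython() {
  awk '$1 ~ /^libpython/ && $2 == "=>" { print $3 }'
}

# Succeed only if that path is under the bundle directory given as $1.
libpython_is_bundled() {
  path=$(resolved_libpython)
  case "$path" in
    "$1"/*) return 0 ;;
    *) echo "libpython resolves to '${path:-nothing}', expected under $1" >&2
       return 1 ;;
  esac
}

# Example invocation inside the image, using the paths from this PR:
#   ldd /agent-server/.python/bin/python3 | libpython_is_bundled /agent-server/.python/lib
```

Splitting the parsing from the `ldd` call keeps the helper testable against captured output, without needing a real interpreter on the machine running the test.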

FROM base-image AS source
ARG USERNAME
COPY --chown=${USERNAME}:${USERNAME} --from=builder /agent-server /agent-server
# Bundled Python's libpython*.so lives under /agent-server/.python/lib

🟡 Suggestion - Documentation: The LD_LIBRARY_PATH addition is critical for the bundled libpython to work. Consider adding a comment explaining:

  • Why this is needed (bundled libpython.so)
  • Security consideration (directory is owned by ${USERNAME}, not world-writable)
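
One detail relevant to the security point: the dynamic loader treats a zero-length entry in LD_LIBRARY_PATH as the current working directory, so a stray leading or trailing colon widens the search path unintentionally. The Dockerfile line that sets the variable is not shown in this excerpt, so the following is only a sketch of how a value could be composed safely; `prepend_path` is a hypothetical helper name.

```shell
# Print DIR ($1) prepended to an existing search path ($2, may be empty).
# The ${2:+:$2} expansion emits the ":" separator only when $2 is non-empty,
# so an empty existing value never produces a trailing ":".
prepend_path() {
  printf '%s%s\n' "$1" "${2:+:$2}"
}

prepend_path /agent-server/.python/lib ""          # prints /agent-server/.python/lib
prepend_path /agent-server/.python/lib /opt/lib    # prints /agent-server/.python/lib:/opt/lib
```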

Comment on lines +49 to +92
# Bundle the Python runtime inside /agent-server so that the entire directory
# is self-contained and portable. Eval images (and any other consumer) can
# COPY /agent-server onto *any* base image without requiring that base image
# to ship a compatible system Python.
#
# What we copy:
#   .python/bin/python3.13     – the interpreter binary
#   .python/lib/python3.13/    – the standard library (minus tests)
#   .python/lib/libpython*.so* – shared libraries (Debian builds --enable-shared)
#
# We then repoint the venv's symlinks and pyvenv.cfg at the bundled copy.
RUN set -eux; \
    REAL_PYTHON=$(readlink -f .venv/bin/python3); \
    PY_VER=$("${REAL_PYTHON}" -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')"); \
    PYTHON_PREFIX=$("${REAL_PYTHON}" -c "import sys; print(sys.base_prefix)"); \
    # --- copy interpreter binary ------------------------------------------------- \
    mkdir -p .python/bin; \
    cp "${REAL_PYTHON}" ".python/bin/python${PY_VER}"; \
    ln -s "python${PY_VER}" .python/bin/python3; \
    ln -s "python${PY_VER}" .python/bin/python; \
    # --- copy standard library (skip test suite to save ~30 MB) ------------------ \
    mkdir -p .python/lib; \
    cp -a "${PYTHON_PREFIX}/lib/python${PY_VER}" ".python/lib/python${PY_VER}"; \
    rm -rf ".python/lib/python${PY_VER}/test" \
           ".python/lib/python${PY_VER}/tests" \
           ".python/lib/python${PY_VER}/idle_test" \
           ".python/lib/python${PY_VER}/idlelib"; \
    # --- copy shared libraries (libpython) --------------------------------------- \
    for lib in "${PYTHON_PREFIX}"/lib/libpython*.so*; do \
        [ -e "$lib" ] && cp -a "$lib" .python/lib/; \
    done; \
    # --- repoint venv at the bundled Python -------------------------------------- \
    for f in .venv/bin/python*; do \
        [ -L "$f" ] || continue; \
        name=$(basename "$f"); \
        rm "$f"; \
        ln -s "../../.python/bin/${name}" "$f"; \
    done; \
    # Ensure canonical names resolve (some venvs only create python3 + python) \
    [ -L .venv/bin/python ] || ln -s "../../.python/bin/python" .venv/bin/python; \
    [ -L .venv/bin/python3 ] || ln -s "../../.python/bin/python3" .venv/bin/python3; \
    sed -i "s|^home = .*|home = /agent-server/.python/bin|" .venv/pyvenv.cfg; \
    # --- quick smoke-test inside the builder ------------------------------------- \
    .venv/bin/python -c "import sys; print('bundled python:', sys.executable, sys.version)"

🟡 Suggestion - Maintainability: This 45-line shell script could be extracted into a separate script file (e.g., bundle-python.sh) and copied/executed. Benefits:

  • Easier to read the Dockerfile
  • Easier to test the script in isolation
  • Easier to maintain and debug

Not blocking, but worth considering for future refactoring.

@simonrosenberg

Closing in favor of #2676 which includes the same Dockerfile fix plus regression tests (Dockerfile.portability-test + unit tests). Added root cause context to #2676.


Development

Successfully merging this pull request may close these issues.

commit0: runtime pods stuck in pending, all instances fail

2 participants