fix: always install Node 22 for ACP packages in agent-server images #2652

Open
simonrosenberg wants to merge 4 commits into main from fix/always-install-node22-for-acp

@simonrosenberg (Collaborator) commented Apr 1, 2026

Summary

  • Always install Node.js 22 from nodesource in the agent-server Dockerfile, removing the conditional `if ! command -v npm` check
  • SWE-bench base images ship with old NVM-managed Node.js versions (v8–v14) that already put `npm` on PATH, so the conditional skipped the Node 22 installation
  • ACP packages (claude-agent-acp, codex-acp, gemini-cli) installed into the old Node.js crash with a SyntaxError on modern ES module syntax (e.g. `import ... with { type: "json" }`)
  • This is the root cause of ACP agent failures on swebenchmultimodal (and potentially other SWE-bench benchmarks)

Root cause analysis

The Dockerfile had:

if ! command -v npm >/dev/null 2>&1; then
    # install Node 22
fi
npm install -g @zed-industries/claude-agent-acp ...

SWE-bench images are built from repos that often include .nvmrc or NVM-managed Node.js versions:

  • p5.js: Node 14.17.3
  • marked: Node 12.22.12
  • wp-calypso: Node 8.9.3
  • react-pdf: Node 8.17.0

Since npm exists in PATH via the old Node.js, the conditional skipped Node 22 installation. ACP packages were installed into the old Node.js and crashed immediately on startup.

Chart.js (Node 21.6.2) worked because its Node.js version is new enough.
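
To illustrate why an existence check on `npm` is not enough, here is a minimal sketch that inspects the Node.js major version instead. This is not what the PR does (the PR removes the conditional entirely); the helper names `node_major` and `needs_newer_node` are hypothetical, and the threshold of 22 is an assumption that simply mirrors the version this PR installs. Since Node 21.6.2 reportedly worked, the true minimum is likely lower.

```shell
#!/bin/sh
# Hypothetical helper: print the major version from `node --version`-style
# output such as "v14.17.3".
node_major() {
    version="${1#v}"                  # strip the leading "v"
    printf '%s\n' "${version%%.*}"    # keep everything before the first dot
}

# Hypothetical helper: succeed (exit 0) when the given version is older than
# the required major. The default of 22 is an assumption, not a verified minimum.
needs_newer_node() {
    [ "$(node_major "$1")" -lt "${2:-22}" ]
}

# Versions taken from the PR description, plus a Node 22 example.
for v in v14.17.3 v12.22.12 v8.9.3 v8.17.0 v22.0.0; do
    if needs_newer_node "$v"; then
        echo "$v: npm is present, but Node is too old for the ACP packages"
    else
        echo "$v: ok"
    fi
done
```

A check like `command -v npm` would succeed for every version in the loop, which is why the original conditional never reached the Node 22 install on these images.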

Related

Test plan

  • Build an agent-server image from a SWE-bench base that has old Node.js (e.g. p5.js with Node 14)
  • Verify node --version shows v22.x inside the built image
  • Verify claude-agent-acp starts without SyntaxError
  • Run swebenchmultimodal eval with ACP agent on previously-failing instances

🤖 Generated with Claude Code


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant  Architectures  Base Image                                       Docs / Tags
java     amd64, arm64   eclipse-temurin:17-jdk                           Link
python   amd64, arm64   nikolaik/python-nodejs:python3.13-nodejs22-slim  Link
golang   amd64, arm64   golang:1.21-bookworm                             Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:612ef5a-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-612ef5a-python \
  ghcr.io/openhands/agent-server:612ef5a-python

All tags pushed for this build

ghcr.io/openhands/agent-server:612ef5a-golang-amd64
ghcr.io/openhands/agent-server:612ef5a-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:612ef5a-golang-arm64
ghcr.io/openhands/agent-server:612ef5a-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:612ef5a-java-amd64
ghcr.io/openhands/agent-server:612ef5a-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:612ef5a-java-arm64
ghcr.io/openhands/agent-server:612ef5a-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:612ef5a-python-amd64
ghcr.io/openhands/agent-server:612ef5a-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:612ef5a-python-arm64
ghcr.io/openhands/agent-server:612ef5a-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:612ef5a-golang
ghcr.io/openhands/agent-server:612ef5a-java
ghcr.io/openhands/agent-server:612ef5a-python

About Multi-Architecture Support

  • Each variant tag (e.g., 612ef5a-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 612ef5a-python-amd64) are also available if needed

SWE-bench base images often ship with NVM-managed old Node.js versions
(v8–v14) that already have `npm` in PATH. The previous conditional
install (`if ! command -v npm`) would skip Node 22, causing ACP packages
(claude-agent-acp, codex-acp, gemini-cli) to be installed into the old
Node.js where they crash with SyntaxError on modern ES module syntax.

Remove the conditional and always install Node 22 from nodesource so ACP
subprocess servers can start reliably regardless of the base image.

Fixes: OpenHands/runtime-api#458

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@github-actions bot (Contributor) commented Apr 1, 2026

Python API breakage checks — ✅ PASSED

Result: PASSED

Action log

@github-actions bot (Contributor) commented Apr 1, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: PASSED

Action log

@all-hands-bot (Collaborator) left a comment

🟢 Good taste - Eliminates broken conditional that failed on old Node versions. Fix is correct: always installing Node 22 ensures ACP packages work on all base images.

⚠️ Note for maintainer: Changes runtime environment (Node.js version). Recommend lightweight evals to verify no unintended benchmark impact before merge.

Worth merging - Fixes real production bug with sound solution.

Debug Agent and others added 3 commits April 1, 2026 17:35
Instead of replacing the system Node.js with nodesource, download a
Node 22 binary tarball to /opt/acp-node/ and install ACP packages there.
Wrapper scripts in /usr/local/bin/ prepend the ACP Node to PATH only
for the ACP subprocess, so the repo's own Node.js (needed for test
suites) stays untouched.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
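
The wrapper-script approach described in the commit above can be sketched roughly as follows. This is a self-contained simulation under stated assumptions, not the PR's actual Dockerfile content: a temporary directory stands in for /opt/acp-node/ and /usr/local/bin/, the `node` binaries are fake scripts that only print a version, and `acp-tool` is a hypothetical stand-in for an ACP entry point.

```shell
#!/bin/sh
set -eu

# Temp dir standing in for the image filesystem (assumption for this sketch).
root="$(mktemp -d)"
trap 'rm -rf "$root"' EXIT
mkdir -p "$root/acp-node/bin" "$root/old-node/bin" "$root/bin"

# Fake "node" binaries that only report a version.
printf '#!/bin/sh\necho v22.0.0\n'  > "$root/acp-node/bin/node"
printf '#!/bin/sh\necho v14.17.3\n' > "$root/old-node/bin/node"

# Hypothetical ACP entry point installed alongside the dedicated Node.
printf '#!/bin/sh\nnode --version\n' > "$root/acp-node/bin/acp-tool"

# Wrapper: prepend the dedicated Node to PATH for this process only,
# then hand over to the real entry point.
cat > "$root/bin/acp-tool" <<EOF
#!/bin/sh
PATH="$root/acp-node/bin:\$PATH"
export PATH
exec "$root/acp-node/bin/acp-tool" "\$@"
EOF
chmod +x "$root/acp-node/bin/node" "$root/old-node/bin/node" \
         "$root/acp-node/bin/acp-tool" "$root/bin/acp-tool"

# The repo's environment keeps its own (old) Node first on PATH.
PATH="$root/old-node/bin:$PATH"
repo_node="$(node --version)"          # what the repo's test suite would see
acp_node="$("$root/bin/acp-tool")"     # what the ACP subprocess would see
echo "repo sees $repo_node, ACP tool sees $acp_node"
# prints "repo sees v14.17.3, ACP tool sees v22.0.0"
```

Because the PATH change happens inside the wrapper process, the calling shell's PATH is left as it was, which is the property the commit relies on to keep the repo's own Node.js untouched.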
The POST /api/conversations request triggers lazy agent initialization
(ACP subprocess startup, model config loading, tool registration) which
routinely takes >60s on heavy SWE-bench multimodal images.  The 60s
default caused httpx.ReadTimeout on 4/5 test instances even after the
Node.js fix landed.

Align with the parent RemoteWorkspace.read_timeout default (600s) and
the legacy code which used 120-600s for this value.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>