fix(acp): show the real model in logs and serialized state by simonrosenberg · Pull Request #2881 · OpenHands/software-agent-sdk

simonrosenberg · 2026-04-18T20:57:53Z

Summary

Split the ACP sentinel LLM's two concerns. Today LLM(model="acp-managed") is both the placeholder shown everywhere AND the signal that tells title_utils "don't call this — it has no credentials." That means logs, serialized agent state, and any tool that reads agent.llm.model sees "acp-managed" even when a real acp_model is configured.

This PR:

Stamps the sentinel marker on llm.usage_id = "acp-managed" instead — a field that is internal, never shown to the ACP server, and independent of the model name.
Overwrites llm.model with self.acp_model in ACPAgent.model_post_init() when acp_model is set, so the rest of the system sees the actual model in logs and state dumps.
Switches title_utils to detect the sentinel via usage_id instead of model.

Zero behavior change when acp_model isn't set — llm.model stays "acp-managed".

Test plan

uv run pytest tests/sdk/agent/test_acp_agent.py — 158 passed
uv run pytest tests/agent_server/test_conversation_service.py::TestAutoTitle::test_autotitle_falls_back_for_acp_managed_llm — passes with the new usage_id-based signal
Added test_acp_model_propagated_to_llm_model and test_sentinel_usage_id_without_acp_model to lock in the contract.

Part of #2471.

🤖 Generated with Claude Code

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.13-nodejs22-slim`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:55faae1-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-55faae1-python \
  ghcr.io/openhands/agent-server:55faae1-python

All tags pushed for this build

ghcr.io/openhands/agent-server:55faae1-golang-amd64
ghcr.io/openhands/agent-server:55faae1-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:55faae1-golang-arm64
ghcr.io/openhands/agent-server:55faae1-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:55faae1-java-amd64
ghcr.io/openhands/agent-server:55faae1-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:55faae1-java-arm64
ghcr.io/openhands/agent-server:55faae1-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:55faae1-python-amd64
ghcr.io/openhands/agent-server:55faae1-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:55faae1-python-arm64
ghcr.io/openhands/agent-server:55faae1-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:55faae1-golang
ghcr.io/openhands/agent-server:55faae1-java
ghcr.io/openhands/agent-server:55faae1-python

About Multi-Architecture Support

Each variant tag (e.g., 55faae1-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 55faae1-python-amd64) are also available if needed

The ACPAgent sentinel LLM used model="acp-managed" as both the placeholder name and the signal that told title_utils "don't call this — it has no credentials". That conflated two concerns and meant logs, serialized agent state, and any tooling that reads agent.llm.model saw "acp-managed" even when a real acp_model was configured. Split them: - The sentinel marker now lives on llm.usage_id="acp-managed", which is independent of the model name and never shown to the ACP server. - ACPAgent.model_post_init() now overwrites llm.model with acp_model (when provided), so the rest of the system sees the actual model. - title_utils detects the sentinel via usage_id instead of model. Part of #2471. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-04-18T20:58:18Z

Python API breakage checks — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-04-18T20:58:29Z

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-04-18T21:00:15Z

Coverage Report •

File	Stmts	Miss	Cover	Missing
openhands-sdk/openhands/sdk/agent
acp_agent.py	551	40	92%	294–296, 426–427, 460, 462, 466, 470, 478, 536–537, 542, 609, 750, 753–754, 771–772, 779, 784, 815, 825, 830, 841–844, 953–956, 960–964, 966, 1450–1451
openhands-sdk/openhands/sdk/conversation
title_utils.py	56	2	96%	35, 75
TOTAL	23488	5683	75%

all-hands-bot

🟢 Good taste - Clean separation of concerns.

KEY INSIGHT: Moving the sentinel marker from model (user-visible) to usage_id (internal) elegantly solves the dual requirement: downstream code can still detect "don't call this LLM" while logs/state show the actual model name. This makes debugging significantly easier without changing any agent behavior.

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟢 LOW

No changes to agent decision logic, prompts, or tool execution. Only affects which field holds the sentinel marker for title generation fallback. Well-tested with clear backward compatibility path.

VERDICT: ✅ Worth merging - Solves a real usability issue (confusing "acp-managed" in logs) with a simple, principled design.

all-hands-bot

✅ QA Report: PASS

Functionally verified that ACP agents now display the real model name in logs and serialized state while preserving sentinel detection via usage_id.

Does this PR achieve its stated goal?

Yes. The PR successfully splits the ACP sentinel LLM's two concerns. Previously, model="acp-managed" served dual purposes: (1) the placeholder shown everywhere and (2) the signal to title_utils not to call the LLM. Now the sentinel marker lives on usage_id="acp-managed" while llm.model shows the real model name when acp_model is configured. Verified via before/after testing, serialization round-trips, and title generation fallback behavior.

Phase	Result
Environment Setup	✅ uv environment ready
CI & Tests	✅ 17/27 checks passing, 146/146 ACP tests pass
Functional Verification	✅ All behaviors verified with before/after comparison

Functional Verification

Test 1: Before/After Comparison — Model Display

Step 1 — Baseline on main branch (before fix):

Checked out main and created an ACPAgent with acp_model="gpt-4o":

agent = ACPAgent(
    acp_command=["echo", "test"],
    acp_model="gpt-4o",
    workspace=LocalWorkspace(working_dir="/tmp/test"),
    tools=[]
)

Output:

agent.acp_model: gpt-4o
agent.llm.model: acp-managed  # ← Shows sentinel, not the real model
agent.llm.usage_id: default

This confirms the bug: even though acp_model is set to "gpt-4o", the llm.model field still shows the sentinel "acp-managed". Logs and serialized state would show the wrong model.

Step 2 — Apply PR changes:

Checked out PR branch fix/acp-sentinel-show-real-model at commit 55faae19.

Step 3 — Re-run with the fix:

Created the same ACPAgent with acp_model="gpt-4o":

Output:

agent.acp_model: gpt-4o
agent.llm.model: gpt-4o        # ← Now shows the real model ✓
agent.llm.usage_id: acp-managed # ← Sentinel moved here ✓
agent.llm.metrics.model_name: gpt-4o

This confirms the fix works: llm.model now shows the real model name, while the sentinel marker lives on usage_id.

Test 2: Without acp_model — Sentinel Preserved

Created an ACPAgent without specifying acp_model:

agent = ACPAgent(
    acp_command=["echo", "test"],
    workspace=LocalWorkspace(working_dir="/tmp/test"),
    tools=[]
)

Output:

agent.acp_model: None
agent.llm.model: acp-managed     # ← Sentinel
agent.llm.usage_id: acp-managed  # ← Sentinel

✓ PASS: When no acp_model is configured, both fields show the sentinel as expected (zero behavior change).

Test 3: Serialization Round-Trip

Serialized and deserialized an ACPAgent with acp_model="gpt-4o":

serialized = agent.model_dump()
restored = ACPAgent.model_validate(serialized)

Results:

Serialized llm.model: gpt-4o
Serialized llm.usage_id: acp-managed
Restored llm.model: gpt-4o
Restored llm.usage_id: acp-managed

✓ PASS: Both fields are correctly preserved through serialization/deserialization. Persisted agent state will show the real model.

Test 4: title_utils Sentinel Detection

Verified that title_utils.generate_title_from_message() still correctly detects the ACP sentinel via usage_id (not model):

from openhands.sdk.conversation.title_utils import generate_title_from_message

agent = ACPAgent(
    acp_command=["echo", "test"],
    acp_model="gpt-4o",
    workspace=LocalWorkspace(working_dir="/tmp/test"),
    tools=[]
)

title = generate_title_from_message(
    "Fix the authentication bug in the login system that causes crashes",
    llm=agent.llm,
    max_length=50
)

Output:

Agent configuration:
  llm.model: gpt-4o
  llm.usage_id: acp-managed

Generated title: 'Fix the authentication bug in the login system ...'
Title length: 50

✓ PASS: The title was truncated to 50 characters, confirming that title_utils detected the sentinel via usage_id and fell back to truncation (did not attempt to call the LLM) even though llm.model now shows "gpt-4o".

Test 5: New Unit Tests

Ran the two new tests added by this PR:

uv run pytest tests/sdk/agent/test_acp_agent.py \
  -k "test_acp_model_propagated_to_llm_model or test_sentinel_usage_id_without_acp_model" \
  -v

Output:

tests/sdk/agent/test_acp_agent.py::TestACPAgentInstantiation::test_acp_model_propagated_to_llm_model PASSED
tests/sdk/agent/test_acp_agent.py::TestACPAgentInstantiation::test_sentinel_usage_id_without_acp_model PASSED
2 passed

✓ PASS: Both tests verify the new contract (real model in llm.model, sentinel in usage_id).

Test 6: Full Test Suite

Ran the complete ACP agent test suite:

uv run pytest tests/sdk/agent/test_acp_agent.py

Result: 146 passed in 6.73s

✓ PASS: All existing tests continue to pass with the new behavior.

Test 7: Autotitle Fallback Test

Ran the autotitle test that verifies ACP agents fall back to truncation:

uv run pytest tests/agent_server/test_conversation_service.py::TestAutoTitle::test_autotitle_falls_back_for_acp_managed_llm -v

Result: PASSED in 0.20s

✓ PASS: The test was correctly updated to check usage_id instead of model, and the fallback behavior works as expected.

Issues Found

None.

Now that ACPAgent.model_post_init() writes acp_model into llm.model (merged in #2881), llm.vision_is_active() gives an authoritative answer for explicitly-configured models via LiteLLM. Use it when the user has set acp_model — even if they pick a non-vision variant, that choice wins over the server-identity default. The provider-identity fallback only kicks in when acp_model is unset and we can't know which default model the server will pick; for all three providers we support, that default is vision-capable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

simonrosenberg requested a review from all-hands-bot April 18, 2026 20:59

simonrosenberg self-assigned this Apr 18, 2026

all-hands-bot approved these changes Apr 18, 2026

View reviewed changes

all-hands-bot reviewed Apr 18, 2026

View reviewed changes

simonrosenberg merged commit 0937073 into main Apr 18, 2026
36 of 38 checks passed

simonrosenberg deleted the fix/acp-sentinel-show-real-model branch April 18, 2026 21:19

This was referenced Apr 18, 2026

fix(acp): derive vision capability from ACP provider, not sentinel LLM #2879

Closed

ACP agent cannot use LLM for auxiliary tasks (title generation, vision check, sub-agents, etc.) #2471

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(acp): show the real model in logs and serialized state#2881

fix(acp): show the real model in logs and serialized state#2881
simonrosenberg merged 1 commit intomainfrom
fix/acp-sentinel-show-real-model

simonrosenberg commented Apr 18, 2026 •

edited by github-actions bot

Loading

Uh oh!

github-actions bot commented Apr 18, 2026

Uh oh!

github-actions bot commented Apr 18, 2026

Uh oh!

github-actions bot commented Apr 18, 2026

Uh oh!

all-hands-bot left a comment

Uh oh!

all-hands-bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

simonrosenberg commented Apr 18, 2026 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

github-actions bot commented Apr 18, 2026

Python API breakage checks — ✅ PASSED

Uh oh!

github-actions bot commented Apr 18, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Uh oh!

github-actions bot commented Apr 18, 2026

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

✅ QA Report: PASS

Does this PR achieve its stated goal?

Test 1: Before/After Comparison — Model Display

Test 2: Without acp_model — Sentinel Preserved

Test 3: Serialization Round-Trip

Test 4: title_utils Sentinel Detection

Test 5: New Unit Tests

Test 6: Full Test Suite

Test 7: Autotitle Fallback Test

Issues Found

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

simonrosenberg commented Apr 18, 2026 •

edited by github-actions bot

Loading