fix(acp): show the real model in logs and serialized state#2881
fix(acp): show the real model in logs and serialized state#2881simonrosenberg merged 1 commit intomainfrom
Conversation
The ACPAgent sentinel LLM used model="acp-managed" as both the placeholder name and the signal that told title_utils "don't call this — it has no credentials". That conflated two concerns and meant logs, serialized agent state, and any tooling that reads agent.llm.model saw "acp-managed" even when a real acp_model was configured. Split them: - The sentinel marker now lives on llm.usage_id="acp-managed", which is independent of the model name and never shown to the ACP server. - ACPAgent.model_post_init() now overwrites llm.model with acp_model (when provided), so the rest of the system sees the actual model. - title_utils detects the sentinel via usage_id instead of model. Part of #2471. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED |
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
Coverage Report •
|
||||||||||||||||||||||||||||||
all-hands-bot
left a comment
There was a problem hiding this comment.
🟢 Good taste - Clean separation of concerns.
KEY INSIGHT: Moving the sentinel marker from model (user-visible) to usage_id (internal) elegantly solves the dual requirement: downstream code can still detect "don't call this LLM" while logs/state show the actual model name. This makes debugging significantly easier without changing any agent behavior.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟢 LOW
No changes to agent decision logic, prompts, or tool execution. Only affects which field holds the sentinel marker for title generation fallback. Well-tested with clear backward compatibility path.
VERDICT: ✅ Worth merging - Solves a real usability issue (confusing "acp-managed" in logs) with a simple, principled design.
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Functionally verified that ACP agents now display the real model name in logs and serialized state while preserving sentinel detection via usage_id.
Does this PR achieve its stated goal?
Yes. The PR successfully splits the ACP sentinel LLM's two concerns. Previously, model="acp-managed" served dual purposes: (1) the placeholder shown everywhere and (2) the signal to title_utils not to call the LLM. Now the sentinel marker lives on usage_id="acp-managed" while llm.model shows the real model name when acp_model is configured. Verified via before/after testing, serialization round-trips, and title generation fallback behavior.
| Phase | Result |
|---|---|
| Environment Setup | ✅ uv environment ready |
| CI & Tests | ✅ 17/27 checks passing, 146/146 ACP tests pass |
| Functional Verification | ✅ All behaviors verified with before/after comparison |
Functional Verification
Test 1: Before/After Comparison — Model Display
Step 1 — Baseline on main branch (before fix):
Checked out main and created an ACPAgent with acp_model="gpt-4o":
agent = ACPAgent(
acp_command=["echo", "test"],
acp_model="gpt-4o",
workspace=LocalWorkspace(working_dir="/tmp/test"),
tools=[]
)Output:
agent.acp_model: gpt-4o
agent.llm.model: acp-managed # ← Shows sentinel, not the real model
agent.llm.usage_id: default
This confirms the bug: even though acp_model is set to "gpt-4o", the llm.model field still shows the sentinel "acp-managed". Logs and serialized state would show the wrong model.
Step 2 — Apply PR changes:
Checked out PR branch fix/acp-sentinel-show-real-model at commit 55faae19.
Step 3 — Re-run with the fix:
Created the same ACPAgent with acp_model="gpt-4o":
Output:
agent.acp_model: gpt-4o
agent.llm.model: gpt-4o # ← Now shows the real model ✓
agent.llm.usage_id: acp-managed # ← Sentinel moved here ✓
agent.llm.metrics.model_name: gpt-4o
This confirms the fix works: llm.model now shows the real model name, while the sentinel marker lives on usage_id.
Test 2: Without acp_model — Sentinel Preserved
Created an ACPAgent without specifying acp_model:
agent = ACPAgent(
acp_command=["echo", "test"],
workspace=LocalWorkspace(working_dir="/tmp/test"),
tools=[]
)Output:
agent.acp_model: None
agent.llm.model: acp-managed # ← Sentinel
agent.llm.usage_id: acp-managed # ← Sentinel
✓ PASS: When no acp_model is configured, both fields show the sentinel as expected (zero behavior change).
Test 3: Serialization Round-Trip
Serialized and deserialized an ACPAgent with acp_model="gpt-4o":
serialized = agent.model_dump()
restored = ACPAgent.model_validate(serialized)Results:
Serialized llm.model: gpt-4o
Serialized llm.usage_id: acp-managed
Restored llm.model: gpt-4o
Restored llm.usage_id: acp-managed
✓ PASS: Both fields are correctly preserved through serialization/deserialization. Persisted agent state will show the real model.
Test 4: title_utils Sentinel Detection
Verified that title_utils.generate_title_from_message() still correctly detects the ACP sentinel via usage_id (not model):
from openhands.sdk.conversation.title_utils import generate_title_from_message
agent = ACPAgent(
acp_command=["echo", "test"],
acp_model="gpt-4o",
workspace=LocalWorkspace(working_dir="/tmp/test"),
tools=[]
)
title = generate_title_from_message(
"Fix the authentication bug in the login system that causes crashes",
llm=agent.llm,
max_length=50
)Output:
Agent configuration:
llm.model: gpt-4o
llm.usage_id: acp-managed
Generated title: 'Fix the authentication bug in the login system ...'
Title length: 50
✓ PASS: The title was truncated to 50 characters, confirming that title_utils detected the sentinel via usage_id and fell back to truncation (did not attempt to call the LLM) even though llm.model now shows "gpt-4o".
Test 5: New Unit Tests
Ran the two new tests added by this PR:
uv run pytest tests/sdk/agent/test_acp_agent.py \
-k "test_acp_model_propagated_to_llm_model or test_sentinel_usage_id_without_acp_model" \
-vOutput:
tests/sdk/agent/test_acp_agent.py::TestACPAgentInstantiation::test_acp_model_propagated_to_llm_model PASSED
tests/sdk/agent/test_acp_agent.py::TestACPAgentInstantiation::test_sentinel_usage_id_without_acp_model PASSED
2 passed
✓ PASS: Both tests verify the new contract (real model in llm.model, sentinel in usage_id).
Test 6: Full Test Suite
Ran the complete ACP agent test suite:
uv run pytest tests/sdk/agent/test_acp_agent.pyResult: 146 passed in 6.73s
✓ PASS: All existing tests continue to pass with the new behavior.
Test 7: Autotitle Fallback Test
Ran the autotitle test that verifies ACP agents fall back to truncation:
uv run pytest tests/agent_server/test_conversation_service.py::TestAutoTitle::test_autotitle_falls_back_for_acp_managed_llm -vResult: PASSED in 0.20s
✓ PASS: The test was correctly updated to check usage_id instead of model, and the fallback behavior works as expected.
Issues Found
None.
Now that ACPAgent.model_post_init() writes acp_model into llm.model (merged in #2881), llm.vision_is_active() gives an authoritative answer for explicitly-configured models via LiteLLM. Use it when the user has set acp_model — even if they pick a non-vision variant, that choice wins over the server-identity default. The provider-identity fallback only kicks in when acp_model is unset and we can't know which default model the server will pick; for all three providers we support, that default is vision-capable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Split the ACP sentinel LLM's two concerns. Today
LLM(model="acp-managed")is both the placeholder shown everywhere AND the signal that tellstitle_utils"don't call this — it has no credentials." That means logs, serialized agent state, and any tool that readsagent.llm.modelsees"acp-managed"even when a realacp_modelis configured.This PR:
llm.usage_id = "acp-managed"instead — a field that is internal, never shown to the ACP server, and independent of the model name.llm.modelwithself.acp_modelinACPAgent.model_post_init()whenacp_modelis set, so the rest of the system sees the actual model in logs and state dumps.title_utilsto detect the sentinel viausage_idinstead ofmodel.Zero behavior change when
acp_modelisn't set —llm.modelstays"acp-managed".Test plan
uv run pytest tests/sdk/agent/test_acp_agent.py— 158 passeduv run pytest tests/agent_server/test_conversation_service.py::TestAutoTitle::test_autotitle_falls_back_for_acp_managed_llm— passes with the newusage_id-based signaltest_acp_model_propagated_to_llm_modelandtest_sentinel_usage_id_without_acp_modelto lock in the contract.Part of #2471.
🤖 Generated with Claude Code
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:55faae1-pythonRun
All tags pushed for this build
About Multi-Architecture Support
55faae1-python) is a multi-arch manifest supporting both amd64 and arm6455faae1-python-amd64) are also available if needed