fix: synchronize ACP telemetry and refresh remote final state #2460
simonrosenberg merged 3 commits into main
Python API breakage checks: ✅ PASSED
REST API breakage checks (OpenAPI): ✅ PASSED
all-hands-bot left a comment
Taste Rating: 🟡 Acceptable - Core telemetry fix is solid, but bundled breaking changes need attention
Verdict: ❌ Needs documentation - The synchronization fix is correct, but two undocumented breaking changes (retry removal + hook regression) should be called out in the PR description or split into separate PRs.
Key Insight: Moving cost recording to a single synchronized path after UsageUpdate receipt is the right fix for zero-cost telemetry. The per-session tracking is cleaner than global state. However, removing ~150 lines of retry logic and changing hook behavior are significant changes that deserve explicit justification.
all-hands-bot left a comment
Additional feedback on test coverage:
Addressed the remaining review items:
Verification:
all-hands-bot left a comment
Taste Rating: 🟢 Good taste - Elegant fix that eliminates race conditions through proper synchronization
Verdict: ✅ Approved - Solid architecture, comprehensive tests, real problem solved
Key Insight: Moving cost recording to a single synchronized path after UsageUpdate receipt is exactly the right fix. The asyncio.Event-based synchronization is clean, the per-session tracking handles concurrent sessions correctly, and the final state refresh ensures accurate reporting. Well done.
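The synchronization pattern praised above can be illustrated with a minimal sketch. This is not the PR's actual code: `UsageTracker`, `SessionUsage`, `on_usage_update`, and `record_after_update` are hypothetical names, and the one-second timeout is an assumption. It only shows the general idea of keeping one `asyncio.Event` per session and recording cost on a single path after the usage update has arrived.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class SessionUsage:
    """Hypothetical per-session slot holding the latest usage report."""
    received: asyncio.Event = field(default_factory=asyncio.Event)
    cost: float = 0.0


class UsageTracker:
    """Hypothetical tracker: cost is recorded only after the update arrives."""

    def __init__(self) -> None:
        self._sessions: dict[str, SessionUsage] = {}
        self.total_cost = 0.0

    def _slot(self, session_id: str) -> SessionUsage:
        # One slot per session, so concurrent sessions never share state.
        return self._sessions.setdefault(session_id, SessionUsage())

    def on_usage_update(self, session_id: str, cost: float) -> None:
        # Called by the (assumed) notification handler when usage arrives.
        slot = self._slot(session_id)
        slot.cost = cost
        slot.received.set()

    async def record_after_update(self, session_id: str, timeout: float = 1.0) -> float:
        # Single recording path: wait for the update, then record once.
        slot = self._slot(session_id)
        try:
            await asyncio.wait_for(slot.received.wait(), timeout)
        except asyncio.TimeoutError:
            return 0.0  # no update arrived; nothing is recorded
        slot.received.clear()  # ready for the next turn of this session
        self.total_cost += slot.cost
        return slot.cost
```

In this sketch a late update is never double-counted, because recording happens in exactly one place, and a missing update degrades to a recorded cost of zero rather than blocking forever.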
✅ Telemetry Fix Validated
Ran a SWE-bench evaluation (50 instances, ACP Claude Code agent) on this branch. Telemetry is now working correctly.
Validation Results
Total cost tracked: $40.94 across 50 instances (mean $0.82/instance)
Methodology
Links
Fixes #2375
This implements the fix direction from the issue discussion:
• ACPAgent waits for the session_update() UsageUpdate before recording cost/tokens/latency
• RemoteConversation refreshes its final state before run() returns
Why:
Latest zero-cost ACP benchmark rows were caused by two separate correctness problems:
• RemoteConversation could return from REST fallback with stale cached state, leaving conversation_stats at zero even when the server had final stats.
Tests:
PYTHONPATH=/tmp/sdk-issue-2375/openhands-sdk${PYTHONPATH:+:$PYTHONPATH} pytest tests/sdk/agent/test_acp_agent.py tests/sdk/conversation/remote/test_remote_conversation.py
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
• eclipse-temurin:17-jdk
• nikolaik/python-nodejs:python3.13-nodejs22
• golang:1.21-bookworm
Pull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:544db68-python
Run
All tags pushed for this build
About Multi-Architecture Support
• Each variant tag (e.g., 544db68-python) is a multi-arch manifest supporting both amd64 and arm64
• Architecture-specific tags (e.g., 544db68-python-amd64) are also available if needed