fix(sdk): surface iteration budgets gracefully by enyst · Pull Request #2739 · OpenHands/software-agent-sdk

enyst · 2026-04-07T12:57:19Z

Summary

Fixes #2406 by making the per-run iteration budget visible to the LLM, warning the model when it is about to run out of steps, and ending max-iteration runs with a dedicated iteration-limit status/event instead of a generic error.

This PR:

adds an <iteration_budget> block to the dynamic system prompt so the LLM sees the run budget up front
injects late-stage wrap-up warnings when only 2 steps remain and when the final step begins
emits ConversationExecutionStatus.ITERATION_LIMIT plus a dedicated ConversationIterationLimitEvent when the budget is exhausted
adds focused SDK tests for the new prompt context, warning injection, and status parsing/serialization
confirms that the Commit0 End Status discussed on the issue comes from benchmarks/utils/console_logging.py in the benchmarks repo, which derives it from conversation.state.execution_status plus FinishAction usage rather than from agent-server REST/WebSocket metadata
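
The behavior listed above can be sketched roughly as follows. This is an illustrative sketch only, not the SDK's code: `ExecutionStatus`, `render_iteration_budget`, `wrap_up_warning`, `run`, and the message wording are all hypothetical stand-ins, and only the `<iteration_budget>` tag, the "2 steps remain" / "final step" thresholds, and the `ITERATION_LIMIT` status name come from the description.

```python
from enum import Enum
from typing import Callable, List, Optional, Tuple


class ExecutionStatus(str, Enum):
    """Hypothetical stand-in for ConversationExecutionStatus."""

    FINISHED = "finished"
    ITERATION_LIMIT = "iteration_limit"


def render_iteration_budget(max_iterations: int) -> str:
    """Build a budget block for the system prompt (wording is assumed)."""
    return (
        "<iteration_budget>\n"
        f"You have at most {max_iterations} steps in this run.\n"
        "</iteration_budget>"
    )


def wrap_up_warning(step: int, max_iterations: int) -> Optional[str]:
    """Return a warning to inject before 1-based `step`, or None."""
    remaining = max_iterations - step + 1  # steps left, including this one
    if remaining == 2:
        return "Only 2 steps remain. Start wrapping up."
    if remaining == 1:
        return "This is your final step. Summarize and finish now."
    return None


def run(
    max_iterations: int,
    agent_step: Callable[[int, List[str]], bool],
) -> Tuple[ExecutionStatus, List[str]]:
    """Drive `agent_step` until it reports completion or the budget runs out.

    `agent_step` is a hypothetical callback that returns True when the
    agent has finished its task.
    """
    messages = [render_iteration_budget(max_iterations)]
    for step in range(1, max_iterations + 1):
        warning = wrap_up_warning(step, max_iterations)
        if warning is not None:
            messages.append(warning)
        if agent_step(step, messages):
            return ExecutionStatus.FINISHED, messages
    # Budget exhausted: report a dedicated status instead of a generic error.
    return ExecutionStatus.ITERATION_LIMIT, messages
```

In this sketch, a run that never finishes within 3 steps ends with `ExecutionStatus.ITERATION_LIMIT` and has seen both warnings, while a run that finishes early returns `ExecutionStatus.FINISHED` without any warning being injected.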

Checklist

If the PR is changing/adding functionality, are there tests to reflect this?
If there is an example, have you run the example to make sure that it works? (N/A: no example changes)
If there are instructions on how to run the code, have you followed the instructions and made sure that it works? (Validated with targeted SDK pytest coverage.)
If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name? (N/A: no public API or docs changes.)
Is the GitHub CI passing?

This PR was created by an AI assistant (OpenHands) on behalf of the user.


Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.13-nodejs22-slim`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:a3fb7ff-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-a3fb7ff-python \
  ghcr.io/openhands/agent-server:a3fb7ff-python

All tags pushed for this build

ghcr.io/openhands/agent-server:a3fb7ff-golang-amd64
ghcr.io/openhands/agent-server:a3fb7ff-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:a3fb7ff-golang-arm64
ghcr.io/openhands/agent-server:a3fb7ff-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:a3fb7ff-java-amd64
ghcr.io/openhands/agent-server:a3fb7ff-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:a3fb7ff-java-arm64
ghcr.io/openhands/agent-server:a3fb7ff-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:a3fb7ff-python-amd64
ghcr.io/openhands/agent-server:a3fb7ff-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:a3fb7ff-python-arm64
ghcr.io/openhands/agent-server:a3fb7ff-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:a3fb7ff-golang
ghcr.io/openhands/agent-server:a3fb7ff-java
ghcr.io/openhands/agent-server:a3fb7ff-python

About Multi-Architecture Support

Each variant tag (e.g., a3fb7ff-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., a3fb7ff-python-amd64) are also available if needed

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-04-07T12:57:43Z

Python API breakage checks — ✅ PASSED

Result: ✅ PASSED


github-actions · 2026-04-07T12:57:55Z

REST API breakage checks (OpenAPI) — ❌ FAILED

Result: ❌ FAILED

⚠️ Breaking REST API changes or policy violations detected.

Log excerpt (first 1000 characters)


::notice title=openhands-agent-server REST API::Additive oneOf/anyOf expansion detected in response schemas. This is expected for extensible discriminated-union APIs and does not break backward compatibility.
  - added '#/components/schemas/PatternSecurityAnalyzer-Output, #/components/schemas/PolicyRailSecurityAnalyzer-Output, #/components/schemas/EnsembleSecurityAnalyzer-Output' to the '/items/anyOf[subschema #1: ACPConversationInfo]/security_analyzer/anyOf[#/components/schemas/SecurityAnalyzerBase-Output]/' response property 'oneOf' list for the response status '200'
  - added '#/components/schemas/PatternSecurityAnalyzer-Output, #/components/schemas/PolicyRailSecurityAnalyzer-Output, #/components/schemas/EnsembleSecurityAnalyzer-Output' to the 'security_analyzer/anyOf[#/components/schemas/SecurityAnalyzerBase-Output]/' response property 'oneOf' list for the response status '200'
  - added '#/components/schemas/PatternSecurityAnalyzer-Output, #/components/schemas/PolicyRailSecurityA


github-actions · 2026-04-07T12:58:08Z

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

github-actions · 2026-04-07T13:02:10Z

🧪 Integration Tests Results

Overall Success Rate: 96.7%
Total Cost: $0.88
Models Tested: 4
Timestamp: 2026-04-07 13:02:02 UTC

📁 Detailed Logs & Artifacts

Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.

litellm_proxy_gemini_3.1_pro_preview: 📥 View & Download Logs
litellm_proxy_anthropic_claude_sonnet_4_6: 📥 View & Download Logs
litellm_proxy_moonshot_kimi_k2_thinking: 📥 View & Download Logs
litellm_proxy_deepseek_deepseek_reasoner: 📥 View & Download Logs

📊 Summary

Model	Overall	Tests Passed	Skipped	Total	Cost	Tokens
litellm_proxy_gemini_3.1_pro_preview	100.0%	8/8	0	8	$0.31	245,496
litellm_proxy_anthropic_claude_sonnet_4_6	87.5%	7/8	0	8	$0.44	252,809
litellm_proxy_moonshot_kimi_k2_thinking	100.0%	7/7	1	8	$0.09	328,306
litellm_proxy_deepseek_deepseek_reasoner	100.0%	7/7	1	8	$0.05	694,008
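
The 96.7% overall figure is consistent with dividing total passed tests by total non-skipped tests across the four models. The sketch below reproduces that arithmetic from the summary table; the formula is an assumption inferred from the numbers, not taken from the integration-test tooling, and `overall_success_rate` is a hypothetical helper.

```python
# (passed, skipped, total) per model, copied from the summary table.
results = {
    "litellm_proxy_gemini_3.1_pro_preview": (8, 0, 8),
    "litellm_proxy_anthropic_claude_sonnet_4_6": (7, 0, 8),
    "litellm_proxy_moonshot_kimi_k2_thinking": (7, 1, 8),
    "litellm_proxy_deepseek_deepseek_reasoner": (7, 1, 8),
}


def overall_success_rate(per_model: dict) -> float:
    """Percentage of passed tests among tests that were actually run."""
    passed = sum(p for p, _, _ in per_model.values())
    attempted = sum(total - skipped for _, skipped, total in per_model.values())
    return 100.0 * passed / attempted


# 29 passed out of 30 attempted.
print(f"{overall_success_rate(results):.1f}%")  # prints "96.7%"
```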

📋 Detailed Results

litellm_proxy_gemini_3.1_pro_preview

Success Rate: 100.0% (8/8)
Total Cost: $0.31
Token Usage: prompt: 241,489, completion: 4,007, cache_read: 122,291, reasoning: 2,324
Run Suffix: litellm_proxy_gemini_3.1_pro_preview_792626f_gemini_3_1_pro_run_N8_20260407_125831

litellm_proxy_anthropic_claude_sonnet_4_6

Success Rate: 87.5% (7/8)
Total Cost: $0.44
Token Usage: prompt: 247,371, completion: 5,438, cache_read: 165,559, cache_write: 81,572, reasoning: 1,042
Run Suffix: litellm_proxy_anthropic_claude_sonnet_4_6_792626f_claude_sonnet_4_6_run_N8_20260407_125832

Failed Tests:

t02_add_bash_hello: Shell script is not executable (Cost: $0.05)

litellm_proxy_moonshot_kimi_k2_thinking

Success Rate: 100.0% (7/7)
Total Cost: $0.09
Token Usage: prompt: 322,670, completion: 5,636, cache_read: 263,424
Run Suffix: litellm_proxy_moonshot_kimi_k2_thinking_792626f_kimi_k2_thinking_run_N8_20260407_125829
Skipped Tests: 1

Skipped Tests:

t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

litellm_proxy_deepseek_deepseek_reasoner

Success Rate: 100.0% (7/7)
Total Cost: $0.05
Token Usage: prompt: 678,329, completion: 15,679, cache_read: 595,968, reasoning: 6,954
Run Suffix: litellm_proxy_deepseek_deepseek_reasoner_792626f_deepseek_v3_2_reasoner_run_N8_20260407_125832
Skipped Tests: 1

Skipped Tests:

t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

fix(sdk): surface iteration budgets gracefully

792626f

Co-authored-by: openhands <openhands@all-hands.dev>

enyst added the integration-test label ("Runs the integration tests and comments the results") on Apr 7, 2026, with OpenHands AI

openhands-ai bot mentioned this pull request Apr 7, 2026

proposal: make Max Iteration Limit visible to the LLM and have a Graceful Termination Message #2406

Open

enyst closed this Apr 7, 2026

enyst wants to merge 1 commit into main from openhands/issue-2406-iteration-limit
