
Release v1.17.0 #2812

Merged
xingyaoww merged 3 commits into main from rel-1.17.0
Apr 13, 2026

Conversation

@all-hands-bot
Collaborator

@all-hands-bot all-hands-bot commented Apr 13, 2026

Release v1.17.0

This PR prepares the release for version 1.17.0.

Release Checklist

  • Version set to 1.17.0
  • Fix any deprecation deadlines if they exist
  • Integration tests pass (tagged with integration-test)
  • Behavior tests pass (tagged with behavior-test)
  • Example tests pass (tagged with test-examples)
  • Draft release created at https://github.com/OpenHands/software-agent-sdk/releases/new
    • Select tag: v1.17.0
    • Select branch: rel-1.17.0
    • Auto-generate release notes
    • Publish release (PyPI will auto-publish)
  • Evaluation on OpenHands Index

Next Steps

  1. Review the version changes
  2. Address any deprecation deadlines
  3. Ensure integration tests pass
  4. Ensure behavior tests pass
  5. Ensure example tests pass
  6. Create and publish the release

Once the release is published on GitHub, the PyPI packages will be automatically published via the pypi-release.yml workflow.


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
|---------|---------------|------------|-------------|
| java    | amd64, arm64  | eclipse-temurin:17-jdk | Link |
| python  | amd64, arm64  | nikolaik/python-nodejs:python3.13-nodejs22-slim | Link |
| golang  | amd64, arm64  | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:aabf407-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-aabf407-python \
  ghcr.io/openhands/agent-server:aabf407-python

All tags pushed for this build

ghcr.io/openhands/agent-server:aabf407-golang-amd64
ghcr.io/openhands/agent-server:aabf407-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:aabf407-golang-arm64
ghcr.io/openhands/agent-server:aabf407-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:aabf407-java-amd64
ghcr.io/openhands/agent-server:aabf407-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:aabf407-java-arm64
ghcr.io/openhands/agent-server:aabf407-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:aabf407-python-amd64
ghcr.io/openhands/agent-server:aabf407-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:aabf407-python-arm64
ghcr.io/openhands/agent-server:aabf407-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:aabf407-golang
ghcr.io/openhands/agent-server:aabf407-java
ghcr.io/openhands/agent-server:aabf407-python

About Multi-Architecture Support

  • Each variant tag (e.g., aabf407-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., aabf407-python-amd64) are also available if needed

Co-authored-by: openhands <openhands@all-hands.dev>
@all-hands-bot all-hands-bot added the integration-test, test-examples, and behavior-test labels Apr 13, 2026
@github-actions
Contributor

Hi! I started running the behavior tests on your PR. You will receive a comment with the results shortly.

@github-actions
Contributor

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

@github-actions
Contributor

github-actions bot commented Apr 13, 2026

Python API breakage checks — ✅ PASSED

Result: PASSED

Action log

@github-actions
Contributor

github-actions bot commented Apr 13, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: PASSED

Action log

Collaborator Author

@all-hands-bot all-hands-bot left a comment

🟢 Good taste - Clean release prep.

This is a mechanical version bump from 1.16.1 → 1.17.0 across all packages. The changes are exactly what you'd expect for a release:

  • Version updates in all pyproject.toml files
  • Matching version updates in uv.lock
  • Eval workflow default updated to match
  • exclude-newer timestamp refresh (normal uv lock behavior)

No complexity, no behavior changes, no breaking changes. Standard release process.

VERDICT: Worth merging - This is the correct way to prepare a release. Once the checklist items are completed, this is ready to go.


This review was created by an AI assistant (OpenHands) on behalf of the user.

@enyst
Collaborator

enyst commented Apr 13, 2026

@OpenHands fix the CI failing on deprecation deadlines. Understand why it fails and commit the cleanup it needs directly here.

The upstream browser_use 0.11.9 still has module-level
basicConfig(force=True) and _ensure_all_loggers_use_stderr() that
interfere with our logging. The upstream PR #3717 was closed without
merge. Extend the cleanup deadline from 1.17.0 to 1.19.0.

Co-authored-by: openhands <openhands@all-hands.dev>
@openhands-ai

openhands-ai bot commented Apr 13, 2026

I'm on it! enyst can track my progress at all-hands.dev

@github-actions
Contributor

🧪 Integration Tests Results

Overall Success Rate: 96.7%
Total Cost: $0.87
Models Tested: 4
Timestamp: 2026-04-13 15:29:02 UTC

📁 Detailed Logs & Artifacts

Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.

📊 Summary

| Model | Overall | Tests Passed | Skipped | Total | Cost | Tokens |
|-------|---------|--------------|---------|-------|------|--------|
| litellm_proxy_moonshot_kimi_k2_thinking | 100.0% | 7/7 | 1 | 8 | $0.07 | 222,136 |
| litellm_proxy_deepseek_deepseek_reasoner | 100.0% | 7/7 | 1 | 8 | $0.04 | 769,231 |
| litellm_proxy_gemini_3.1_pro_preview | 100.0% | 8/8 | 0 | 8 | $0.32 | 207,326 |
| litellm_proxy_anthropic_claude_sonnet_4_6 | 87.5% | 7/8 | 0 | 8 | $0.44 | 251,244 |

📋 Detailed Results

litellm_proxy_moonshot_kimi_k2_thinking

  • Success Rate: 100.0% (7/7)
  • Total Cost: $0.07
  • Token Usage: prompt: 217,108, completion: 5,028, cache_read: 169,728
  • Run Suffix: litellm_proxy_moonshot_kimi_k2_thinking_8f0c893_kimi_k2_thinking_run_N8_20260413_152436
  • Skipped Tests: 1

Skipped Tests:

  • t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

litellm_proxy_deepseek_deepseek_reasoner

  • Success Rate: 100.0% (7/7)
  • Total Cost: $0.04
  • Token Usage: prompt: 754,344, completion: 14,887, cache_read: 695,936, reasoning: 6,346
  • Run Suffix: litellm_proxy_deepseek_deepseek_reasoner_8f0c893_deepseek_v3_2_reasoner_run_N8_20260413_152437
  • Skipped Tests: 1

Skipped Tests:

  • t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

litellm_proxy_gemini_3.1_pro_preview

  • Success Rate: 100.0% (8/8)
  • Total Cost: $0.32
  • Token Usage: prompt: 202,263, completion: 5,063, cache_read: 81,736, reasoning: 3,581
  • Run Suffix: litellm_proxy_gemini_3.1_pro_preview_8f0c893_gemini_3_1_pro_run_N8_20260413_152436

litellm_proxy_anthropic_claude_sonnet_4_6

  • Success Rate: 87.5% (7/8)
  • Total Cost: $0.44
  • Token Usage: prompt: 246,230, completion: 5,014, cache_read: 161,966, cache_write: 84,032, reasoning: 785
  • Run Suffix: litellm_proxy_anthropic_claude_sonnet_4_6_8f0c893_claude_sonnet_4_6_run_N8_20260413_152436

Failed Tests:

  • t02_add_bash_hello: Shell script is not executable (Cost: $0.06)

Collaborator Author

@all-hands-bot all-hands-bot left a comment

✅ QA Report: PASS

Release version bump verified successfully. All packages consistently versioned at 1.17.0, build succeeds, and deprecation deadlines resolved.

| Phase | Result |
|-------|--------|
| Environment Setup | ✅ Build successful, lockfile up to date |
| CI & Tests | ✅ Pre-commit, API breakage, version checks pass; build/test jobs in progress |
| Functional Verification | ✅ Version consistency, imports, deprecation check all pass |
Functional Verification

Version Consistency ✅

All four packages correctly versioned at 1.17.0:

$ grep -r "version = " openhands-*/pyproject.toml
openhands-agent-server/pyproject.toml:version = "1.17.0"
openhands-sdk/pyproject.toml:version = "1.17.0"
openhands-tools/pyproject.toml:version = "1.17.0"
openhands-workspace/pyproject.toml:version = "1.17.0"

Lockfile Consistency ✅

$ uv lock --locked --check
Resolved 401 packages in 1ms

Package Imports ✅

$ uv run python -c "import openhands.sdk; print(f'SDK version: {openhands.sdk.__version__}')"
✓ All packages import successfully
SDK version: 1.17.0

Deprecation Deadlines ✅

$ uv run --with packaging python .github/scripts/check_deprecations.py
openhands-sdk: checked 7 deprecation metadata entries against version 1.17.0.
openhands-tools: checked 2 deprecation metadata entries against version 1.17.0.
openhands-workspace: checked 1 deprecation metadata entries against version 1.17.0.
openhands-agent-server: checked 6 deprecation metadata entries against version 1.17.0.
Checked 16 deprecation metadata entries across 4 package(s).

The initial failure of the deprecation deadline check was resolved by commit 9e90afa, which extended the browser_use logging workaround deadline from 1.17.0 to 1.19.0 with updated context noting that browser_use 0.11.9 still has the problematic logging code.
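The deadline logic described above can be sketched with a minimal stdlib version comparison. This is a hypothetical illustration of the behavior, not the actual `check_deprecations.py` script:

```python
def parse_version(v: str) -> tuple[int, ...]:
    """Parse a simple X.Y.Z version string into a comparable tuple."""
    return tuple(int(part) for part in v.split("."))

def cleanup_overdue(cleanup_by: str, release_version: str) -> bool:
    """True when a workaround tagged cleanup_by=<version> has expired."""
    return parse_version(release_version) >= parse_version(cleanup_by)

# Before the fix: the workaround's deadline equaled the release version.
print(cleanup_overdue("1.17.0", "1.17.0"))  # True  -> check fails
# After commit 9e90afa extended the deadline to 1.19.0:
print(cleanup_overdue("1.19.0", "1.17.0"))  # False -> check passes
```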

Build Success ✅

$ make build
Dependencies installed successfully.
Pre-commit hooks installed successfully.
Build complete! Development environment is ready.

Issues Found

None. The deprecation deadline issue was already addressed in commit 9e90afa.

CI Status

Passing (relevant to release):

  • ✅ Pre-commit checks
  • ✅ Python API breakage check
  • ✅ REST API breakage check
  • ✅ Version bump guard
  • ✅ Deprecation deadlines
  • ✅ Check docstrings
  • ✅ PR artifacts check

In Progress (expected to pass):

  • ⏳ SDK/tools/agent-server unit tests
  • ⏳ Build & Push jobs (multi-platform)
  • ⏳ Binary build tests

The critical release-blocking checks (deprecation deadlines, version consistency, API breakage) all pass. The in-progress jobs are standard test/build jobs that typically pass and don't require manual verification.

Collaborator

enyst commented Apr 13, 2026

Fixed in 9e90afa.

The failure came from warn_cleanup(..., cleanup_by="1.17.0") in openhands-tools/openhands/tools/browser_use/logging_fix.py, which becomes an expired cleanup deadline on the 1.17.0 release branch.

I verified the upstream browser_use package still applies module-level logging changes (logging.basicConfig(..., force=True) and _ensure_all_loggers_use_stderr() in the installed 0.11.9 codepath), so the workaround is still needed. I extended the cleanup deadline to 1.19.0 and updated the note to explain why this cleanup cannot happen in 1.17.0 yet.
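The interference described above can be reproduced with the standard library alone. A minimal sketch, using only stdlib `logging` (it does not import browser_use itself):

```python
import io
import logging

# The application configures root logging to its own stream.
app_stream = io.StringIO()
logging.basicConfig(stream=app_stream, level=logging.INFO, force=True)

logging.getLogger("demo").info("before import")

# Simulates the module-level call in the upstream package: force=True
# removes the application's handlers and installs the library's own.
logging.basicConfig(stream=io.StringIO(), level=logging.INFO, force=True)

logging.getLogger("demo").info("after import")

print("before import" in app_stream.getvalue())  # True
print("after import" in app_stream.getvalue())   # False: handler replaced
```

This is why a module-level `basicConfig(force=True)` in a dependency silently redirects an application's log output, and why the workaround has to stay until upstream removes it.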

The Deprecation deadlines check is now passing.

This comment was created by an AI assistant (OpenHands) on behalf of the user.

@github-actions
Contributor

github-actions bot commented Apr 13, 2026

Coverage

| File | Stmts | Miss | Cover | Missing |
|------|-------|------|-------|---------|
| TOTAL | 23501 | 6025 | 74% | |

report-only-changed-files is enabled. No files were changed during this commit :)

@github-actions
Contributor

github-actions bot commented Apr 13, 2026

🔄 Running Examples with openhands/claude-haiku-4-5-20251001

Generated: 2026-04-13 15:48:46 UTC

Example Status Duration Cost
01_standalone_sdk/02_custom_tools.py ✅ PASS 22.9s $0.03
01_standalone_sdk/03_activate_skill.py ✅ PASS 18.2s $0.03
01_standalone_sdk/05_use_llm_registry.py ✅ PASS 12.4s $0.01
01_standalone_sdk/07_mcp_integration.py ✅ PASS 29.2s $0.03
01_standalone_sdk/09_pause_example.py ✅ PASS 11.7s $0.01
01_standalone_sdk/10_persistence.py ✅ PASS 30.0s $0.02
01_standalone_sdk/11_async.py ✅ PASS 29.3s $0.04
01_standalone_sdk/12_custom_secrets.py ✅ PASS 14.1s $0.01
01_standalone_sdk/13_get_llm_metrics.py ✅ PASS 37.3s $0.03
01_standalone_sdk/14_context_condenser.py ✅ PASS 3m 13s $0.23
01_standalone_sdk/17_image_input.py ✅ PASS 14.0s $0.01
01_standalone_sdk/18_send_message_while_processing.py ✅ PASS 16.9s $0.02
01_standalone_sdk/19_llm_routing.py ✅ PASS 13.1s $0.02
01_standalone_sdk/20_stuck_detector.py ✅ PASS 15.1s $0.03
01_standalone_sdk/21_generate_extraneous_conversation_costs.py ✅ PASS 8.8s $0.00
01_standalone_sdk/22_anthropic_thinking.py ✅ PASS 17.2s $0.02
01_standalone_sdk/23_responses_reasoning.py ✅ PASS 1m 3s $0.01
01_standalone_sdk/24_planning_agent_workflow.py ✅ PASS 2m 52s $0.20
01_standalone_sdk/25_agent_delegation.py ✅ PASS 49.1s $0.06
01_standalone_sdk/26_custom_visualizer.py ✅ PASS 21.6s $0.03
01_standalone_sdk/28_ask_agent_example.py ✅ PASS 28.3s $0.04
01_standalone_sdk/29_llm_streaming.py ✅ PASS 29.1s $0.03
01_standalone_sdk/30_tom_agent.py ✅ PASS 15.4s $0.02
01_standalone_sdk/31_iterative_refinement.py ✅ PASS 3m 8s $0.24
01_standalone_sdk/32_configurable_security_policy.py ✅ PASS 19.5s $0.02
01_standalone_sdk/34_critic_example.py ✅ PASS 1m 58s $0.16
01_standalone_sdk/36_event_json_to_openai_messages.py ✅ PASS 12.4s $0.01
01_standalone_sdk/37_llm_profile_store/main.py ✅ PASS 3.3s $0.00
01_standalone_sdk/38_browser_session_recording.py ✅ PASS 38.5s $0.03
01_standalone_sdk/39_llm_fallback.py ✅ PASS 9.6s $0.01
01_standalone_sdk/40_acp_agent_example.py ✅ PASS 34.2s $0.05
01_standalone_sdk/41_task_tool_set.py ✅ PASS 26.5s $0.03
01_standalone_sdk/42_file_based_subagents.py ✅ PASS 1m 5s $0.06
01_standalone_sdk/43_mixed_marketplace_skills/main.py ✅ PASS 2.8s $0.00
01_standalone_sdk/44_model_switching_in_convo.py ✅ PASS 6.5s $0.01
01_standalone_sdk/45_parallel_tool_execution.py ✅ PASS 2m 19s $0.32
01_standalone_sdk/46_agent_settings.py ✅ PASS 10.4s $0.01
01_standalone_sdk/47_defense_in_depth_security.py ✅ PASS 2.8s $0.00
02_remote_agent_server/01_convo_with_local_agent_server.py ✅ PASS 31.7s $0.03
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py ❌ FAIL (Exit code 1) 4.1s --
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py ❌ FAIL (Exit code 1) 4.0s --
02_remote_agent_server/04_convo_with_api_sandboxed_server.py ❌ FAIL (Exit code 1) 5m 8s --
02_remote_agent_server/07_convo_with_cloud_workspace.py ✅ PASS 29.5s $0.03
02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py ❌ FAIL (Exit code 1) 4.1s --
02_remote_agent_server/09_acp_agent_with_remote_runtime.py ❌ FAIL (Exit code 1) 5m 6s --
02_remote_agent_server/10_cloud_workspace_share_credentials.py ✅ PASS 26.8s $0.05
04_llm_specific_tools/01_gpt5_apply_patch_preset.py ✅ PASS 16.6s $0.02
04_llm_specific_tools/02_gemini_file_tools.py ✅ PASS 44.5s $0.06
05_skills_and_plugins/01_loading_agentskills/main.py ✅ PASS 11.6s $0.01
05_skills_and_plugins/02_loading_plugins/main.py ✅ PASS 19.9s $0.04

❌ Some tests failed

Total: 50 | Passed: 45 | Failed: 5 | Total Cost: $2.06

Failed examples:

  • examples/02_remote_agent_server/02_convo_with_docker_sandboxed_server.py: Exit code 1
  • examples/02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py: Exit code 1
  • examples/02_remote_agent_server/04_convo_with_api_sandboxed_server.py: Exit code 1
  • examples/02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py: Exit code 1
  • examples/02_remote_agent_server/09_acp_agent_with_remote_runtime.py: Exit code 1

View full workflow run

@neubig neubig removed the test-examples Run all applicable "examples/" files. Expensive operation. label Apr 13, 2026
@neubig neubig added the test-examples Run all applicable "examples/" files. Expensive operation. label Apr 13, 2026 — with OpenHands AI
@github-actions
Contributor

github-actions bot commented Apr 13, 2026

🔄 Running Examples with openhands/claude-haiku-4-5-20251001

Generated: 2026-04-13 16:14:04 UTC

Example Status Duration Cost
01_standalone_sdk/02_custom_tools.py ✅ PASS 25.1s $0.03
01_standalone_sdk/03_activate_skill.py ✅ PASS 21.4s $0.02
01_standalone_sdk/05_use_llm_registry.py ✅ PASS 11.5s $0.01
01_standalone_sdk/07_mcp_integration.py ✅ PASS 36.4s $0.02
01_standalone_sdk/09_pause_example.py ✅ PASS 11.3s $0.01
01_standalone_sdk/10_persistence.py ✅ PASS 42.7s $0.03
01_standalone_sdk/11_async.py ✅ PASS 37.4s $0.04
01_standalone_sdk/12_custom_secrets.py ✅ PASS 9.5s $0.01
01_standalone_sdk/13_get_llm_metrics.py ✅ PASS 32.0s $0.03
01_standalone_sdk/14_context_condenser.py ✅ PASS 3m 0s $0.20
01_standalone_sdk/17_image_input.py ✅ PASS 14.7s $0.01
01_standalone_sdk/18_send_message_while_processing.py ✅ PASS 28.9s $0.03
01_standalone_sdk/19_llm_routing.py ✅ PASS 15.3s $0.02
01_standalone_sdk/20_stuck_detector.py ✅ PASS 14.7s $0.02
01_standalone_sdk/21_generate_extraneous_conversation_costs.py ✅ PASS 10.3s $0.00
01_standalone_sdk/22_anthropic_thinking.py ✅ PASS 14.8s $0.01
01_standalone_sdk/23_responses_reasoning.py ✅ PASS 1m 17s $0.01
01_standalone_sdk/24_planning_agent_workflow.py ✅ PASS 3m 50s $0.25
01_standalone_sdk/25_agent_delegation.py ✅ PASS 52.3s $0.07
01_standalone_sdk/26_custom_visualizer.py ✅ PASS 18.0s $0.02
01_standalone_sdk/28_ask_agent_example.py ✅ PASS 36.4s $0.04
01_standalone_sdk/29_llm_streaming.py ✅ PASS 42.0s $0.03
01_standalone_sdk/30_tom_agent.py ✅ PASS 9.1s $0.01
01_standalone_sdk/31_iterative_refinement.py ✅ PASS 5m 26s $0.41
01_standalone_sdk/32_configurable_security_policy.py ✅ PASS 17.7s $0.02
01_standalone_sdk/34_critic_example.py ✅ PASS 6m 35s $0.63
01_standalone_sdk/36_event_json_to_openai_messages.py ✅ PASS 10.6s $0.00
01_standalone_sdk/37_llm_profile_store/main.py ✅ PASS 3.6s $0.00
01_standalone_sdk/38_browser_session_recording.py ✅ PASS 32.7s $0.04
01_standalone_sdk/39_llm_fallback.py ✅ PASS 10.4s $0.01
01_standalone_sdk/40_acp_agent_example.py ✅ PASS 42.7s $0.05
01_standalone_sdk/41_task_tool_set.py ✅ PASS 32.3s $0.03
01_standalone_sdk/42_file_based_subagents.py ✅ PASS 1m 43s $0.10
01_standalone_sdk/43_mixed_marketplace_skills/main.py ✅ PASS 3.3s $0.00
01_standalone_sdk/44_model_switching_in_convo.py ✅ PASS 7.1s $0.01
01_standalone_sdk/45_parallel_tool_execution.py ✅ PASS 2m 36s $0.45
01_standalone_sdk/46_agent_settings.py ✅ PASS 12.0s $0.01
01_standalone_sdk/47_defense_in_depth_security.py ✅ PASS 3.2s $0.00
02_remote_agent_server/01_convo_with_local_agent_server.py ✅ PASS 33.7s $0.03
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py ❌ FAIL (Exit code 1) 4.9s --
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py ❌ FAIL (Exit code 1) 4.9s --
02_remote_agent_server/04_convo_with_api_sandboxed_server.py ❌ FAIL (Exit code 1) 5m 9s --
02_remote_agent_server/07_convo_with_cloud_workspace.py ✅ PASS 34.4s $0.05
02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py ❌ FAIL (Exit code 1) 4.8s --
02_remote_agent_server/09_acp_agent_with_remote_runtime.py ❌ FAIL (Exit code 1) 5m 7s --
02_remote_agent_server/10_cloud_workspace_share_credentials.py ✅ PASS 27.8s $0.05
04_llm_specific_tools/01_gpt5_apply_patch_preset.py ✅ PASS 17.2s $0.01
04_llm_specific_tools/02_gemini_file_tools.py ✅ PASS 29.5s $0.06
05_skills_and_plugins/01_loading_agentskills/main.py ✅ PASS 11.9s $0.01
05_skills_and_plugins/02_loading_plugins/main.py ✅ PASS 21.5s $0.04

❌ Some tests failed

Total: 50 | Passed: 45 | Failed: 5 | Total Cost: $2.93

Failed examples:

  • examples/02_remote_agent_server/02_convo_with_docker_sandboxed_server.py: Exit code 1
  • examples/02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py: Exit code 1
  • examples/02_remote_agent_server/04_convo_with_api_sandboxed_server.py: Exit code 1
  • examples/02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py: Exit code 1
  • examples/02_remote_agent_server/09_acp_agent_with_remote_runtime.py: Exit code 1

View full workflow run

@neubig
Contributor

neubig commented Apr 13, 2026

@OpenHands dig in to the logs from the failed examples, are these concerning, and did they fail on previous releases?

@openhands-ai

openhands-ai bot commented Apr 13, 2026

I'm on it! neubig can track my progress at all-hands.dev

Contributor

neubig commented Apr 13, 2026

Analysis of Failed Example Tests

This analysis was generated by an AI assistant (OpenHands).

Summary

Not concerning. All 5 failures are caused by a single infrastructure issue (Docker image tag mismatch), not by any code regression. These same examples all passed on both v1.16.0 and v1.15.0.

The 5 Failures

| Example | Failure Mode | Duration |
|---------|--------------|----------|
| 02_convo_with_docker_sandboxed_server.py | manifest unknown – image not found | 4.9s |
| 03_browser_use_with_docker_sandboxed_server.py | manifest unknown – image not found | 4.9s |
| 08_convo_with_apptainer_sandboxed_server.py | manifest unknown – image not found | 4.8s |
| 04_convo_with_api_sandboxed_server.py | Pod stuck pending (timeout) | 5m 9s |
| 09_acp_agent_with_remote_runtime.py | Pod stuck pending (timeout) | 5m 7s |

Root Cause: Image Tag SHA Mismatch

All 5 examples try to pull ghcr.io/openhands/agent-server:a76b56b-python-amd64, but that tag does not exist in the registry.

  • The Agent Server build for this PR push used SDK_SHA → tagged images as 9e90afa-python-amd64 ✅ (these images exist and were pushed successfully)
  • The examples resolved GITHUB_SHA to a76b56b (appears to be the PR merge commit SHA from the Actions runner default, despite the workflow step attempting to override it with the head SHA 9e90afa)

The docker/apptainer examples (02, 03, 08) fail immediately with "manifest unknown". The remote runtime examples (04, 09) fail after ~5 minutes when the pod never starts because the image can't be pulled.

Comparison with Previous Releases

| Release | These 5 Examples | Other Failures |
|---------|------------------|----------------|
| v1.17.0 | ❌ All 5 failed (image tag mismatch) | None |
| v1.16.0 | ✅ All 5 passed | 10_cloud_workspace_share_credentials.py, 02_gemini_file_tools.py |
| v1.15.0 | ✅ All 5 passed | 10_cloud_workspace_share_credentials.py |

Conclusion

  • 45/50 examples passed, including all standalone SDK tests, local agent server, cloud workspace, LLM-specific tools, and skills/plugins.
  • The failures are purely infrastructure-related (image registry tag mismatch), not a code regression.
  • A re-run of the workflow should resolve this if the merge commit SHA stabilizes, or the GITHUB_SHA override in run-examples.yml could be strengthened to avoid the race.


@openhands-ai

openhands-ai bot commented Apr 13, 2026

Summary

The PR comment asked me to investigate the failed example tests on PR #2812 (Release v1.17.0), determine if they're concerning, and check if they failed on previous releases.

Answer

The 5 failures are not concerning. They are all caused by a single CI infrastructure issue — a Docker image tag SHA mismatch — not a code regression.

Root cause: All 5 failing examples (02_convo_with_docker_sandboxed_server, 03_browser_use_with_docker_sandboxed_server, 04_convo_with_api_sandboxed_server, 08_convo_with_apptainer_sandboxed_server, 09_acp_agent_with_remote_runtime) tried to pull ghcr.io/openhands/agent-server:a76b56b-python-amd64, but that tag doesn't exist. The Agent Server build correctly tagged images as 9e90afa-python-amd64. The mismatch appears to stem from the examples resolving GITHUB_SHA to the PR merge commit SHA (a76b56b) rather than the head commit SHA (9e90afa) that the server build used via SDK_SHA.

Previous releases: All 5 of these examples passed on both v1.16.0 and v1.15.0. The failures on those releases were entirely different examples (10_cloud_workspace_share_credentials.py and 02_gemini_file_tools.py).

Overall: 45/50 examples passed, covering all standalone SDK tests, local agent server, cloud workspace, LLM-specific tools, and skills/plugins.

Actions Taken

  • Downloaded and analyzed the full CI logs for the failed Run Examples Scripts workflow run
  • Identified the specific error for each of the 5 failures
  • Traced the root cause to an image tag SHA mismatch
  • Compared against v1.16.0 and v1.15.0 example test runs to confirm these are not recurring failures
  • Posted a detailed analysis comment on PR Release v1.17.0 #2812 (link)

No code changes were made — this was purely an investigative/analytical task.

Contributor

neubig commented Apr 13, 2026

Precise Diagnosis of Example Test Failures

This comment was generated by an AI assistant (OpenHands) on behalf of the user.

Verdict: Code Bug (not a race condition)

This is a known code bug where run-examples.yml tries to override GitHub's built-in GITHUB_SHA env var, but the override doesn't take effect. The agent-server images do exist and were built before the examples ran — the examples just look for the wrong tag.

Exact Code Path

run-examples.yml:79
  env:
    GITHUB_SHA: ${{ github.event.pull_request.head.sha }}   ← sets "9e90afa" but override doesn't stick
                                                               for built-in GITHUB_* vars

  ↓ Runner's default GITHUB_SHA = merge commit "a76b56b" persists

test_examples.py:112
  env = os.environ.copy()                                    ← copies env with default GITHUB_SHA

test_examples.py:121-124
  subprocess.run([sys.executable, example_path], env=env)    ← passes to example

example (e.g. 02_convo_with_docker_sandboxed_server.py:45)
  github_sha = os.getenv("GITHUB_SHA")                      ← gets "a76b56b" (merge commit)

example line 47
  f"ghcr.io/openhands/agent-server:{github_sha[:7]}-python-{arch}"
  → "ghcr.io/openhands/agent-server:a76b56b-python-amd64"   ← this image does NOT exist

Docker/Apptainer → "manifest unknown" (immediate failure)
Remote runtime   → pod stuck "pending" for 5 min (image pull fails)

Evidence

| Fact | Value |
|------|-------|
| PR merge commit SHA | a76b56b (what the examples used) |
| PR head commit SHA | 9e90afa (what the override attempted) |
| Agent Server images built | 9e90afa-python-amd64 ✅ (exist, pushed successfully) |
| Image the examples looked for | a76b56b-python-amd64 ❌ (doesn't exist) |
| Wait-for-build step | Completed successfully — images were ready |

Why It Worked on v1.16.0 But Not v1.17.0

Both use the identical workflow code. The difference is the GitHub Actions runner image version:

  • v1.16.0: ubuntu24/20260323.65 — override happened to work
  • v1.17.0: ubuntu24/20260406.80 — override no longer effective

GitHub's built-in GITHUB_* env vars are documented as not reliably overridable. The server.yml workflow already knows this and uses a custom SDK_SHA variable instead.

The Fix

There's actually an existing (unmerged) fix in commit d01c055c on branch upgrade-python313-with-libtmux-fix that does exactly the right thing:

run-examples.yml: rename GITHUB_SHA → AGENT_SERVER_SHA

- GITHUB_SHA: ${{ github.event.pull_request.head.sha }}
+ AGENT_SERVER_SHA: ${{ github.event.pull_request.head.sha }}

All 5 affected examples: read AGENT_SERVER_SHA instead of GITHUB_SHA

- github_sha = os.getenv("GITHUB_SHA")
+ server_sha = os.getenv("AGENT_SERVER_SHA")

Files that need the change:

  • .github/workflows/run-examples.yml
  • examples/02_remote_agent_server/02_convo_with_docker_sandboxed_server.py
  • examples/02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py
  • examples/02_remote_agent_server/04_convo_with_api_sandboxed_server.py
  • examples/02_remote_agent_server/05_vscode_with_docker_sandboxed_server.py
  • examples/02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py
  • examples/02_remote_agent_server/09_acp_agent_with_remote_runtime.py

This should go in as a separate PR to main, not in the release branch.
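The renamed variable would be read along these lines in the example scripts. This is a hypothetical sketch of the described fix, not the actual diff from commit d01c055c:

```python
import os

def resolve_server_sha() -> str:
    """Resolve the short SHA used to build the agent-server image tag.

    Prefers the custom AGENT_SERVER_SHA variable, which the workflow can
    set reliably, and falls back to the built-in GITHUB_SHA, which on
    pull_request events points at the merge commit rather than the PR head.
    """
    sha = os.getenv("AGENT_SERVER_SHA") or os.getenv("GITHUB_SHA")
    if not sha:
        raise RuntimeError("no commit SHA available for image tag resolution")
    return sha[:7]

# With the workflow setting the custom variable to the PR head SHA:
os.environ["AGENT_SERVER_SHA"] = "9e90afa0000000"
print(resolve_server_sha())  # 9e90afa
```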

Contributor

neubig commented Apr 13, 2026

Ruling Out a Race Condition

This comment was generated by an AI assistant (OpenHands) on behalf of the user.

Not a race condition. The images were built and pushed 24 minutes before the examples ran. The examples just computed the wrong image tag.

Timeline (all UTC, April 13)

| Time | Event |
|------|-------|
| 15:26:40 | Agent Server Build & Push (python-amd64) starts for SHA 9e90afa |
| 15:35:16 | Build & Push (python-amd64) completes → image 9e90afa-python-amd64 pushed to GHCR ✅ |
| 15:57:46 | Run-examples "Wait for agent server" step starts |
| 15:57:59 | Wait step finds Build & Push (python-amd64): completed (success) — finishes instantly |
| 15:59:35 | "Run examples" step begins (uv run pytest ...) |
| ~16:00 | Examples resolve os.getenv("GITHUB_SHA") → a76b56b (merge commit, not 9e90afa) |
| ~16:00 | Docker pull ghcr.io/openhands/agent-server:a76b56b-python-amd64 → manifest unknown |

The wait-on-check-action log explicitly confirms the build was already done:

Checks completed: Build & Push (python-amd64)
Build & Push (python-amd64): completed (success)

The Bug

This is a code error in 7 files. The workflow passes the PR head SHA via a variable named GITHUB_SHA, but GITHUB_SHA is a GitHub Actions built-in that defaults to the merge commit SHA for pull_request events. The step-level override doesn't reliably take effect (runner image ubuntu24/20260406.80 stopped honoring it; v1.16.0 ran on ubuntu24/20260323.65 where it still worked).

The server.yml workflow already solved this same problem with a comment explaining why:

# Use SDK_SHA for PR head SHA - GITHUB_SHA is a built-in that gets overwritten by checkout
SDK_SHA: ${{ github.event.pull_request.head.sha || github.sha }}

The fix is to use a custom variable name (e.g. AGENT_SERVER_SHA) in run-examples.yml and update the 6 example scripts that read it. An unmerged commit (d01c055c on branch upgrade-python313-with-libtmux-fix) already implements exactly this fix.

@neubig
Contributor

neubig commented Apr 13, 2026

I think this release is probably OK to deploy. There are failing examples, but this seems to be due to a mismatched SHA rather than a real issue.

@neubig neubig requested a review from xingyaoww April 13, 2026 17:59
neubig pushed a commit that referenced this pull request Apr 13, 2026
… GITHUB_SHA

GitHub Actions sets GITHUB_SHA to the merge commit for pull_request events.
The step-level env override stopped working reliably with runner image
ubuntu24/20260406.80. Use a custom variable name AGENT_SERVER_SHA to
explicitly pass the PR head SHA to example scripts for Docker image
tag resolution.

This fixes the 5 docker/apptainer/remote-runtime example failures
seen in the v1.17.0 release PR (#2812).

Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions
Contributor

🧪 Integration Tests Results

Overall Success Rate: 80.0%
Total Cost: $4.00
Models Tested: 2
Timestamp: 2026-04-13 18:24:56 UTC

📁 Detailed Logs & Artifacts

Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.

📊 Summary

Model Overall Tests Passed Skipped Total Cost Tokens
litellm_proxy_moonshot_kimi_k2_thinking 60.0% 3/5 0 5 $1.01 4,352,538
litellm_proxy_anthropic_claude_sonnet_4_6 100.0% 5/5 0 5 $2.99 3,811,252

📋 Detailed Results

litellm_proxy_moonshot_kimi_k2_thinking

  • Success Rate: 60.0% (3/5)
  • Total Cost: $1.01
  • Token Usage: prompt: 4,304,499, completion: 48,039, cache_read: 3,969,280
  • Run Suffix: litellm_proxy_moonshot_kimi_k2_thinking_8f0c893_kimi_k2_thinking_run_N5_20260413_152437

Failed Tests:

  • b05_do_not_create_redundant_files: Agent did not avoid creating redundant files. Judge reasoning: The agent successfully created the main requested file examples/tutorial/smolvla/train_smolvla_example.py, which is a well-structured standalone training script that:
  • Correctly mirrors the user's original lerobot-train command
  • Follows the same format as existing examples (using_smolvla_example.py and diffusion_training_example.py)
  • Includes proper configuration, training loop, checkpointing, and model saving
  • Has clear documentation within the script

However, the agent violated the evaluation criteria by creating additional files beyond what was explicitly requested:

Files Created (3 total):

  1. train_smolvla_example.py - REQUESTED ✓
  2. README.md - ACCEPTABLE (criteria allow "only one README.md file")
  3. QUICKSTART.md - NOT REQUESTED and NOT ACCEPTABLE per criteria

The evaluation criteria explicitly state: "Avoid creating any additional files that were not explicitly requested. Only one README.md file is acceptable if it pertains to the new training script."

By creating QUICKSTART.md, the agent exceeded this boundary. While the extra documentation is helpful and well-intentioned, it represents a failure to follow the stated constraints. The user asked for a training script following the format of existing examples - they did not request additional documentation files.

Positive Aspects:

  • Main training script is high quality and functional
  • README.md is appropriate
  • Good understanding of the task requirements
  • Comprehensive testing and verification

Negative Aspects:

  • Created one extra file (QUICKSTART.md) not explicitly requested
  • Violated the explicit constraint about avoiding additional files (confidence=0.92) (Cost: $0.30)
  • b02_no_oververification: Agent did not satisfy the truncation task criteria. Judge reasoning: The agent successfully completed the core task: updating MAX_CMD_OUTPUT_SIZE from 30000 to 20000 in the constants file and verifying the change with appropriate tests. All 5 targeted tests in test_observation_truncation.py passed.

However, there is a material deviation from the user's explicit instructions:

Critical Issue - Ignored Environment Instructions:
The user provided explicit environment notes stating: "Use uv (as per development guide) to avoid collision with the other checkout when running Python commands." This was marked as an "Important environment note."

The agent ignored this instruction and ran all Python commands directly:

  • python -m pytest tests/tools/terminal/test_observation_truncation.py -v
  • python -c "from openhands.tools.terminal.constants import MAX_CMD_OUTPUT_SIZE; ..."
  • Multiple other direct python commands

These should have been executed with uv run or similar uv commands per the user's development guide.

Positive Aspects:

  1. Correctly identified and updated the constant ✓
  2. Ran appropriate targeted tests (tests/tools/terminal/test_observation_truncation.py) ✓
  3. All 5 tests passed ✓
  4. Did not over-verify excessively (attempted broader tests but interrupted when slow)
  5. Verified the change worked correctly
  6. Provided comprehensive summary of changes

Assessment:
While the functional work was done correctly and the test verification was appropriate, the explicit disregard for the environment setup instructions is problematic. The user's note about using uv to avoid collision with another checkout was clear and deliberate. Although the collision risk may not have materialized in this case, following explicit instructions is important for reliability and best practices. (confidence=0.85) (Cost: $0.10)

litellm_proxy_anthropic_claude_sonnet_4_6

  • Success Rate: 100.0% (5/5)
  • Total Cost: $2.99
  • Token Usage: prompt: 3,745,435, completion: 65,817, cache_read: 3,400,153, cache_write: 249,570, reasoning: 11,889
  • Run Suffix: litellm_proxy_anthropic_claude_sonnet_4_6_8f0c893_claude_sonnet_4_6_run_N5_20260413_152431

@xingyaoww
Collaborator

@OpenHands help me understand why docker examples are failing? Any way to fix it? Maybe retrigger it for me? #2812 (comment)

@openhands-ai

openhands-ai bot commented Apr 13, 2026

I'm on it! xingyaoww can track my progress at all-hands.dev

The built-in GITHUB_SHA resolves to the merge-commit SHA on
pull_request events, which does not match the agent-server image tags
built from the PR head commit.  The agent-server build workflow already
uses SDK_SHA for this reason.

Switch run-examples.yml to export SDK_SHA (matching the server workflow)
and update all remote-agent-server examples to prefer SDK_SHA, falling
back to GITHUB_SHA for backward compatibility.

Co-authored-by: openhands <openhands@all-hands.dev>
@xingyaoww xingyaoww removed the test-examples Run all applicable "examples/" files. Expensive operation. label Apr 13, 2026
@xingyaoww xingyaoww added the test-examples Run all applicable "examples/" files. Expensive operation. label Apr 13, 2026 — with OpenHands AI
Collaborator

Docker examples fix

Root cause

The 5 failing docker/remote-server examples all tried to pull ghcr.io/openhands/agent-server:a76b56b-python-amd64, but that image never existed. a76b56b is the merge-commit SHA that GitHub automatically creates for pull_request events (github.sha), not the actual PR head commit.

The agent-server build workflow tags images with the PR head SHA (9e90afa) via a dedicated SDK_SHA env var — exactly because GITHUB_SHA is a built-in that resolves to the merge-commit on PRs. However, the run-examples.yml workflow was trying to override the built-in GITHUB_SHA in the step env, and the examples were reading GITHUB_SHA — but the override wasn't reaching the subprocess (the built-in merge-commit SHA leaked through).

Fix

  1. .github/workflows/run-examples.yml — replaced the GITHUB_SHA override with SDK_SHA (matching the server build workflow)
  2. All 6 remote-agent-server examples — now read SDK_SHA first, falling back to GITHUB_SHA for backward compatibility

I've also re-added the test-examples label to retrigger the workflow.

This comment was generated by an AI assistant (OpenHands).

@openhands-ai

openhands-ai bot commented Apr 13, 2026

Final Summary

Was the request completely addressed?

  • Understand why docker examples are failing — Root cause identified: the 5 failing examples used the built-in GITHUB_SHA (which resolves to the merge-commit SHA a76b56b on PR events) to construct docker image tags, but agent-server images are tagged with the PR head commit SHA (9e90afa) via SDK_SHA. The override in the step env: block wasn't propagating to subprocesses.
  • Fix the issue — Committed aabf4072: updated run-examples.yml to export SDK_SHA (matching the server build workflow) and updated all 6 remote-agent-server examples to prefer SDK_SHA over GITHUB_SHA.
  • Retrigger the workflow — Removed and re-added the test-examples label to trigger a new run with the fix.
  • Posted explanation comment on PR Release v1.17.0 #2812 with root cause and fix details.

Are the changes concise?

Yes. The changeset is minimal — 7 files, 27 lines changed (all symmetric replacements). No extraneous changes were introduced:

  • .github/workflows/run-examples.yml — one line: GITHUB_SHA → SDK_SHA
  • 6 example files — each changed the env var lookup from os.getenv("GITHUB_SHA") to os.getenv("SDK_SHA") or os.getenv("GITHUB_SHA") with updated comments

All pre-commit checks passed (YAML format, Ruff, pycodestyle, pyright, import rules).
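The SDK_SHA-first fallback described above can be illustrated with a short sketch. This mirrors the described change but is not the exact repository code; the env-var values here are stand-ins for the SHAs mentioned in this PR.

```python
import os

# Before the fix: only the built-in GITHUB_SHA is available, and on
# pull_request events it resolves to the merge-commit SHA.
os.environ.pop("SDK_SHA", None)
os.environ["GITHUB_SHA"] = "a76b56b0000000000"
fallback = os.getenv("SDK_SHA") or os.getenv("GITHUB_SHA")
assert fallback.startswith("a76b56b")  # merge commit leaks through

# After the fix: the workflow exports SDK_SHA with the PR head SHA,
# and the examples prefer it over the built-in.
os.environ["SDK_SHA"] = "aabf40720000000000"
preferred = os.getenv("SDK_SHA") or os.getenv("GITHUB_SHA")
assert preferred.startswith("aabf407")  # SDK_SHA wins when set
```

Keeping `GITHUB_SHA` as the fallback means the examples still work in contexts (e.g. push builds) where no custom variable is exported.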

@github-actions
Contributor

github-actions bot commented Apr 13, 2026

🔄 Running Examples with openhands/claude-haiku-4-5-20251001

Generated: 2026-04-13 18:58:50 UTC

Example Status Duration Cost
01_standalone_sdk/02_custom_tools.py ✅ PASS 23.8s $0.03
01_standalone_sdk/03_activate_skill.py ✅ PASS 19.0s $0.02
01_standalone_sdk/05_use_llm_registry.py ✅ PASS 11.8s $0.01
01_standalone_sdk/07_mcp_integration.py ✅ PASS 39.3s $0.02
01_standalone_sdk/09_pause_example.py ✅ PASS 13.7s $0.01
01_standalone_sdk/10_persistence.py ✅ PASS 34.2s $0.03
01_standalone_sdk/11_async.py ✅ PASS 33.3s $0.04
01_standalone_sdk/12_custom_secrets.py ✅ PASS 9.6s $0.00
01_standalone_sdk/13_get_llm_metrics.py ✅ PASS 35.9s $0.03
01_standalone_sdk/14_context_condenser.py ✅ PASS 2m 54s $0.20
01_standalone_sdk/17_image_input.py ✅ PASS 16.3s $0.01
01_standalone_sdk/18_send_message_while_processing.py ✅ PASS 26.2s $0.02
01_standalone_sdk/19_llm_routing.py ✅ PASS 14.1s $0.02
01_standalone_sdk/20_stuck_detector.py ✅ PASS 15.7s $0.02
01_standalone_sdk/21_generate_extraneous_conversation_costs.py ✅ PASS 10.8s $0.00
01_standalone_sdk/22_anthropic_thinking.py ✅ PASS 14.7s $0.01
01_standalone_sdk/23_responses_reasoning.py ✅ PASS 1m 23s $0.01
01_standalone_sdk/24_planning_agent_workflow.py ✅ PASS 4m 8s $0.30
01_standalone_sdk/25_agent_delegation.py ✅ PASS 53.0s $0.07
01_standalone_sdk/26_custom_visualizer.py ✅ PASS 18.3s $0.03
01_standalone_sdk/28_ask_agent_example.py ✅ PASS 29.2s $0.03
01_standalone_sdk/29_llm_streaming.py ✅ PASS 30.2s $0.03
01_standalone_sdk/30_tom_agent.py ✅ PASS 22.6s $0.02
01_standalone_sdk/31_iterative_refinement.py ✅ PASS 3m 59s $0.29
01_standalone_sdk/32_configurable_security_policy.py ✅ PASS 20.1s $0.02
01_standalone_sdk/34_critic_example.py ✅ PASS 6m 40s $0.57
01_standalone_sdk/36_event_json_to_openai_messages.py ✅ PASS 13.0s $0.01
01_standalone_sdk/37_llm_profile_store/main.py ✅ PASS 6.4s $0.00
01_standalone_sdk/38_browser_session_recording.py ✅ PASS 34.7s $0.03
01_standalone_sdk/39_llm_fallback.py ✅ PASS 11.2s $0.01
01_standalone_sdk/40_acp_agent_example.py ✅ PASS 35.0s $0.21
01_standalone_sdk/41_task_tool_set.py ✅ PASS 35.3s $0.03
01_standalone_sdk/42_file_based_subagents.py ✅ PASS 1m 29s $0.09
01_standalone_sdk/43_mixed_marketplace_skills/main.py ✅ PASS 6.5s $0.00
01_standalone_sdk/44_model_switching_in_convo.py ✅ PASS 7.6s $0.01
01_standalone_sdk/45_parallel_tool_execution.py ✅ PASS 3m 17s $0.41
01_standalone_sdk/46_agent_settings.py ✅ PASS 10.1s $0.01
01_standalone_sdk/47_defense_in_depth_security.py ✅ PASS 3.3s $0.00
02_remote_agent_server/01_convo_with_local_agent_server.py ✅ PASS 42.2s $0.03
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py ✅ PASS 1m 34s $0.07
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py ✅ PASS 52.7s $0.06
02_remote_agent_server/04_convo_with_api_sandboxed_server.py ✅ PASS 1m 24s $0.02
02_remote_agent_server/07_convo_with_cloud_workspace.py ✅ PASS 30.9s $0.03
02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py ✅ PASS 3m 15s $0.03
02_remote_agent_server/09_acp_agent_with_remote_runtime.py ✅ PASS 53.4s $0.24
02_remote_agent_server/10_cloud_workspace_share_credentials.py ✅ PASS 28.4s $0.05
04_llm_specific_tools/01_gpt5_apply_patch_preset.py ✅ PASS 27.7s $0.03
04_llm_specific_tools/02_gemini_file_tools.py ✅ PASS 39.0s $0.09
05_skills_and_plugins/01_loading_agentskills/main.py ✅ PASS 17.8s $0.01
05_skills_and_plugins/02_loading_plugins/main.py ✅ PASS 21.3s $0.02

✅ All tests passed!

Total: 50 | Passed: 50 | Failed: 0 | Total Cost: $3.31

View full workflow run

@xingyaoww xingyaoww enabled auto-merge (squash) April 13, 2026 19:04
@github-actions
Contributor

Evaluation Triggered

  • Trigger: Release v1.17.0
  • SDK: aabf407
  • Eval limit: 50
  • Models: claude-sonnet-4-5-20250929

@xingyaoww xingyaoww merged commit 9626c66 into main Apr 13, 2026
61 checks passed
@xingyaoww xingyaoww deleted the rel-1.17.0 branch April 13, 2026 19:07

Labels

  • behavior-test
  • integration-test: Runs the integration tests and comments the results
  • test-examples: Run all applicable "examples/" files. Expensive operation.
