fix(tools): return browser timeout as observation #2455
Catch browser action timeouts at the browser tool boundary so a hung browser call returns a BrowserObservation error instead of bubbling up as a fatal conversation TimeoutError. Also format empty browser exceptions more clearly to avoid blank 'Browser operation failed:' messages. Co-authored-by: openhands <openhands@all-hands.dev>
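As a rough illustration of the "blank message" problem described above: an exception raised with no arguments stringifies to an empty string, so naive formatting yields a bare 'Browser operation failed:' line. The sketch below shows one way to guard against that. The function name `format_operation_error` and the choice to fall back to the exception class name are assumptions made for illustration; the body of the PR's actual helper is not shown on this page.

```python
def format_operation_error(
    exc: BaseException, prefix: str = "Browser operation failed"
) -> str:
    """Build a non-blank error message even when str(exc) is empty.

    Hypothetical helper: the name and the fallback rule are assumptions,
    not the PR's implementation.
    """
    detail = str(exc).strip()
    if not detail:
        # An exception created without arguments has an empty message,
        # so fall back to its class name (for example "TimeoutError").
        detail = type(exc).__name__
    return f"{prefix}: {detail}"


if __name__ == "__main__":
    print(format_operation_error(TimeoutError()))
    print(format_operation_error(ValueError("bad selector")))
```

With this shape, an argument-less `TimeoutError()` still produces a message that names the failure, while exceptions that carry text pass through unchanged.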
Python API breakage checks: ✅ PASSED
REST API breakage checks (OpenAPI): ✅ PASSED
Bump openhands-agent-server from 1.14.0 to 1.15.0 so the REST API breakage workflow reflects the current breaking API surface already present on main. This is intentionally separate from PR #2455, whose failing REST API check is unrelated to the browser timeout fix. Co-authored-by: openhands <openhands@all-hands.dev>
@OpenHands test this live by creating a web service that does not respond in the allotted time and creating an example, following the examples folder, that asks the agent to query this web service. Run the example before and after this fix and demonstrate that it's better. Fix any failing tests as well.
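A web service "that does not respond in the allotted time" can be approximated with the standard library alone. The sketch below is not the example added in this PR; the delay, the client timeout, and all names are assumed values chosen for illustration. It starts a local server that stalls before answering and queries it with a shorter client-side timeout, turning the failure into a string rather than an unhandled exception.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

DELAY_SECONDS = 1.5   # how long the service stalls (assumed value)
CLIENT_TIMEOUT = 0.3  # client-side time budget (assumed value)


class SlowHandler(BaseHTTPRequestHandler):
    """Handler that deliberately answers later than the client will wait."""

    def do_GET(self):
        time.sleep(DELAY_SECONDS)
        try:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"too late\n")
        except OSError:
            pass  # the client has already given up and closed the socket

    def log_message(self, *args):
        pass  # keep the demo output quiet


def query_slow_service() -> str:
    """Start the slow service on a free port and query it once."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/"
    try:
        with urllib.request.urlopen(url, timeout=CLIENT_TIMEOUT) as resp:
            return resp.read().decode()
    except OSError as exc:
        # Timeouts surface as TimeoutError or URLError depending on where
        # they occur; both are OSError subclasses.
        return f"request failed: {type(exc).__name__}"
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(query_slow_service())
```

The same idea, a dependency that is slower than the caller's budget, is what a live timeout regression test needs; only the client differs (a browser tool instead of `urllib`).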
I'm on it! neubig can track my progress at all-hands.dev
Co-authored-by: openhands <openhands@all-hands.dev>
Addressed in 4010efb. What changed:
Validation:
Before/after demo using the new example:
I also checked local targeted tests after the rebase; the new regression coverage is green.
Done — the new changes since the last summary are complete and pushed. Checklist:
Conciseness check:
Validation run:
Pushed commit:
PR update comment:
Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
🟢 Good taste - Clean, minimal fix that converts fatal timeouts into recoverable observations.
Verdict: ✅ Code is solid and well-tested. Waiting on eval confirmation.
all-hands-bot
left a comment
Taste Rating: 🟢 Good taste - Exactly the right approach.
Key Insight: This eliminates a special case (fatal timeout) by normalizing it into the standard error observation flow. The core change is 3 lines in __call__: catch TimeoutError, return BrowserObservation. Simple, direct, and well-tested with a real slow HTTP service.
Verdict: ✅ Code is solid and ready. Waiting on eval verification (as noted in previous review) since this changes tool execution behavior.
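The pattern the review describes (catch `TimeoutError` in `__call__`, return an error observation) can be sketched as follows. This is a minimal stand-in, not the SDK's code: `Observation`, `ToolExecutor`, the `run_action` callback, the 30-second default, and the message wording are all hypothetical. Only the names `action_timeout_seconds`, `__call__`, and the built-in `TimeoutError` come from the PR text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Observation:
    """Hypothetical stand-in for BrowserObservation."""

    text: str
    is_error: bool = False


class ToolExecutor:
    """Hypothetical stand-in for BrowserToolExecutor."""

    def __init__(
        self,
        run_action: Callable[[str, float], str],
        action_timeout_seconds: float = 30.0,  # assumed default
    ) -> None:
        # Validate the configurable timeout up front.
        if action_timeout_seconds <= 0:
            raise ValueError("action_timeout_seconds must be positive")
        self._run_action = run_action
        self.action_timeout_seconds = action_timeout_seconds

    def __call__(self, action: str) -> Observation:
        try:
            result = self._run_action(action, self.action_timeout_seconds)
        except TimeoutError:
            # Normalize the timeout into the ordinary error-observation
            # flow instead of letting it abort the whole conversation.
            return Observation(
                text=(
                    f"Browser operation '{action}' timed out after "
                    f"{self.action_timeout_seconds:g} seconds"
                ),
                is_error=True,
            )
        return Observation(text=result)


def hung_action(action: str, timeout: float) -> str:
    """Simulate a browser call that exceeds its time budget."""
    raise TimeoutError()


def quick_action(action: str, timeout: float) -> str:
    """Simulate a browser call that completes in time."""
    return f"ok: {action}"


if __name__ == "__main__":
    print(ToolExecutor(hung_action, action_timeout_seconds=2.0)("navigate"))
    print(ToolExecutor(quick_action)("navigate"))
```

Because the caller always receives an observation, the agent loop can treat a hung browser call like any other tool error and decide what to do next.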
VascoSch92
left a comment
Just one suggestion for a change.
Otherwise, LGTM.
@OpenHands fix the remaining review comment
I'm on it! neubig can track my progress at all-hands.dev
Co-authored-by: openhands <openhands@all-hands.dev>
Done. New work since the last substantive update:
Conciseness check:
In short: the remaining review feedback has been fully addressed, the thread was resolved, and the review-thread CI gate now passes.
Summary
• Return browser action timeouts from BrowserToolExecutor.__call__ as BrowserObservation errors instead of bubbling up a fatal conversation TimeoutError
• Format empty browser exceptions more clearly to avoid blank 'Browser operation failed: ...' messages
• […] check-examples
Changes Made
• Add _format_browser_operation_error() so blank exceptions and timeout errors produce stable observation text
• Make BrowserToolExecutor use a configurable action_timeout_seconds, validate it, and catch builtins.TimeoutError in __call__
• Add regression tests in tests/tools/browser_use/test_browser_executor.py, including a live slow-service timeout case
• Remove examples/01_standalone_sdk/45_browser_timeout_observation.py and its now-unneeded exclusion entry because undocumented examples fail the docs sync check
Testing
• uv run pre-commit run --files tests/examples/test_examples.py
• uv run pytest tests/tools/browser_use/test_browser_executor.py -q
• uv run pytest tests/tools/browser_use/test_browser_executor.py -k live_service -vv
• DOCS_PATH=/workspace/project/software-agent-sdk/.agent_tmp/docs-check uv run python .github/scripts/check_documented_examples.py
Evidence
• Live local SDK run with an actual LLM (exact BrowserToolExecutor.__call__ timeout path)
• Hosted OpenHands Cloud conversation (real hosted agent recovery)
Conversation: https://app.all-hands.dev/conversations/f4b7a646ad6241e1bfe90ac206019391
The local SDK run above uses a real LLM plus BrowserToolSet(action_timeout_seconds=2.0) to exercise the exact timeout-to-BrowserObservation path on this branch. The hosted OpenHands Cloud conversation shows that a real hosted agent also recovered from a browser hang observation and continued with pwd/finish instead of crashing the conversation.
Checklist
• […] check-examples.)
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
• eclipse-temurin:17-jdk
• nikolaik/python-nodejs:python3.13-nodejs22
• golang:1.21-bookworm
Pull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:1d59870-python
Run
All tags pushed for this build
About Multi-Architecture Support
• Each variant tag (e.g. 1d59870-python) is a multi-arch manifest supporting both amd64 and arm64
• Architecture-specific tags (e.g. 1d59870-python-amd64) are also available if needed