Fix three critical bugs causing bash/tmux session leaks#2804
Fix three critical bugs causing bash/tmux session leaks#2804
Conversation
- Register DelegateTool in register_default_tools() and get_default_tools() so it is available in the default tool set alongside terminal, file_editor, and task_tracker. - Call register_builtins_agents() in agent-server's tool_router.py at startup so built-in sub-agent definitions (default, bash, explore) are available for the delegate tool to target. - Add 10 stress tests in tests/agent_server/test_delegation_stress.py that exercise ~10 concurrent sub-agent delegations with mocked LLM/run calls: spawn, parallel delegation, simulated latency, mixed success/failure, metrics merging, thread independence, max_children limits, nonexistent agents, repeated rounds, and typed agent variants. Co-authored-by: openhands <openhands@all-hands.dev>
1. EventService.close() fire-and-forget (CRITICAL): Added missing await to ensure LocalConversation.close() actually runs during shutdown 2. No tmux session cleanup on startup (MEDIUM): Added cleanup_stale_tmux_sessions() to kill orphaned sessions from previous server runs on startup 3. Task manager sub-conversations use delete_on_close=False (MEDIUM): Changed to delete_on_close=True in both locations to ensure tool cleanup runs These fixes prevent unbounded accumulation of orphaned tmux sessions and their bash children processes that were never being terminated. Co-authored-by: openhands <openhands@all-hands.dev>
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED |
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
|
Closing this PR to create a clean version without unrelated commits. See new PR with clean fix. |
all-hands-bot
left a comment
There was a problem hiding this comment.
🔴 Needs improvement - This PR has critical scope and design issues that must be addressed.
The PR description claims to fix tmux session leaks, but includes ~350 lines of unrelated DelegateTool changes and a massive stress test file. The actual tmux cleanup logic is overly aggressive and lacks test coverage.
| register_builtins_agents(enable_browser=True) | ||
| register_gemini_tools(enable_browser=True) | ||
| register_planning_tools() | ||
| register_builtins_agents() |
There was a problem hiding this comment.
🔴 Critical: Duplicate registration call. register_builtins_agents(enable_browser=True) is already called on line 16. This second call with different parameters will either be redundant or cause double-registration issues.
Remove this duplicate line.
| async def cleanup_stale_tmux_sessions() -> None: | ||
| """Clean up any stale tmux sessions on server startup. | ||
|
|
||
| Tmux sessions live in a separate process that survives agent-server restarts. | ||
| This function kills all existing sessions on the openhands socket to prevent | ||
| accumulation of orphaned sessions. Reconnecting conversations will create | ||
| fresh sessions as needed. | ||
| """ | ||
| try: | ||
| import libtmux | ||
|
|
||
| # Connect to the dedicated OpenHands tmux server | ||
| server = libtmux.Server(socket_name="openhands") | ||
|
|
||
| # Get all sessions on this server | ||
| sessions = server.sessions | ||
| if not sessions: | ||
| logger.debug("No tmux sessions found on openhands socket") | ||
| return | ||
|
|
||
| logger.info("Cleaning up %d stale tmux session(s) on startup", len(sessions)) | ||
|
|
||
| # Kill all sessions - they're all stale since we're starting up | ||
| for session in sessions: | ||
| try: | ||
| logger.debug("Killing tmux session: %s", session.name) | ||
| session.kill() | ||
| except Exception as e: | ||
| logger.warning("Failed to kill tmux session %s: %s", session.name, e) | ||
|
|
||
| logger.info("Tmux cleanup completed") | ||
|
|
||
| except ImportError: | ||
| logger.debug("libtmux not available, skipping tmux cleanup") | ||
| except Exception as e: | ||
| # Don't let tmux cleanup failures prevent server startup | ||
| logger.warning("Failed to cleanup tmux sessions: %s", e) |
There was a problem hiding this comment.
🟠 Important: This cleanup logic is overly aggressive - it kills ALL sessions on the "openhands" socket without checking if they're actually stale.
Problem: What if multiple agent-server instances share the same tmux socket? What if there are legitimate long-running sessions that should persist across server restarts?
Better approach: Track session ownership (e.g., PID file, session metadata) and only kill sessions that belong to this server instance or are truly orphaned.
| from openhands.tools.delegate import DelegateTool | ||
| from openhands.tools.file_editor import FileEditorTool | ||
| from openhands.tools.task_tracker import TaskTrackerTool | ||
| from openhands.tools.terminal import TerminalTool | ||
|
|
||
| logger.debug(f"Tool: {TerminalTool.name} registered.") | ||
| logger.debug(f"Tool: {FileEditorTool.name} registered.") | ||
| logger.debug(f"Tool: {TaskTrackerTool.name} registered.") |
There was a problem hiding this comment.
🔴 Critical - Scope Creep: These DelegateTool changes are completely unrelated to "fixing tmux session leaks" as described in the PR title and description.
This PR mixes unrelated changes:
- Tmux cleanup (described)
- Task manager delete_on_close fix (described)
- DelegateTool registration (NOT described)
- 328-line delegation stress test (NOT described)
Split the DelegateTool changes into a separate PR with proper description and justification.
| @@ -0,0 +1,328 @@ | |||
| """Stress tests for delegation in the agent-server context. | |||
There was a problem hiding this comment.
🔴 Critical - Scope Creep: This entire 328-line test file is unrelated to the PR's stated purpose of "fixing tmux session leaks".
The PR should include tests for:
- The
cleanup_stale_tmux_sessions()function - The task manager
delete_on_close=Truebehavior
Instead, this adds an unrelated delegation stress test suite. Move this to a separate PR focused on delegation testing.
all-hands-bot
left a comment
There was a problem hiding this comment.
QA Report
Summary
The core bug fixes work correctly, but the PR mixes bug fixes with an unrelated feature and has a merge conflict artifact.
Environment Setup
✅ SUCCESS - Environment setup completed successfully:
- uv 0.11.6 installed (requirement: 0.8.13+)
- Dependencies installed via
uv run - All imports successful
- Python 3.13.13
CI & Test Status
✅ PASSING - Key CI checks:
pre-commit: SUCCESSsdk-tests: SUCCESSagent-server-tests: SUCCESScross-tests: SUCCESS- Python/REST API breakage checks: SUCCESS
Functional Verification
✅ Bug Fix #1: EventService.close() await
Status: Already fixed in codebase (as stated in PR description)
✅ Bug Fix #2: Tmux cleanup on startup
Tested:
uv run python -c "from openhands.agent_server.api import cleanup_stale_tmux_sessions; print('Import successful')"Result: Import successful. Function is well-implemented with:
- Proper try/except error handling
- Won't block server startup on failure
- Logs cleanup progress
- Handles missing libtmux gracefully
Code: openhands-agent-server/openhands/agent_server/api.py:52-88
✅ Bug Fix #3: Task manager delete_on_close
Tested:
uv run pytest tests/tools/task/test_task_manager.py -vResult: ✅ 47/47 tests PASSED in 0.66s
Verified changes in:
openhands-tools/openhands/tools/task/manager.py:201(delete_on_close=True)openhands-tools/openhands/tools/task/manager.py:288(delete_on_close=True)
✅ Additional Feature: DelegateTool in defaults
Tested:
uv run python -c "from openhands.tools.preset.default import get_default_tools; print([t.name for t in get_default_tools(enable_browser=False)])"Result: ['terminal', 'file_editor', 'task_tracker', 'delegate']
Stress Tests:
uv run pytest tests/agent_server/test_delegation_stress.py -vResult: ✅ 10/10 tests PASSED in 0.83s
Issues Found
🟠 Important: Duplicate function call (merge conflict artifact) - see inline comment
🟡 Minor: PR organization - This PR contains two distinct changes:
- Bug fixes (commit d0289a6) - tmux cleanup + delete_on_close fixes
- Feature addition (commit 0fd0ea6) - DelegateTool in defaults + stress tests
The PR description only mentions the bug fixes. Consider updating the title/description to reflect both changes, or splitting into separate PRs.
Verdict
The bug fixes work correctly and all tests pass. However:
- There is a duplicate function call that should be removed
- The PR mixes bug fixes with an unrelated feature
The code is functional and safe to merge after addressing the duplicate call.
| register_builtins_agents(enable_browser=True) | ||
| register_gemini_tools(enable_browser=True) | ||
| register_planning_tools() | ||
| register_builtins_agents() |
There was a problem hiding this comment.
🟠 Important: Duplicate function call.
This line calls register_builtins_agents() again, but line 16 already calls register_builtins_agents(enable_browser=True). This is a merge conflict artifact - commit 0fd0ea6 added this call, then a merge from main added the call on line 16, but this one wasn't removed.
While harmless (the function is idempotent via register_agent_if_absent), it's wasteful to register agents twice on every server startup.
Recommendation: Remove this line, keep only line 16.
Coverage Report •
|
||||||||||||||||||||||||||||||||||||||||
Problem
This PR fixes three critical bugs that cause unbounded accumulation of orphaned tmux sessions and bash processes:
awaitmeant LocalConversation.close() never actually ran during shutdownSolution
Bug 1 - EventService.close() await fix
openhands-agent-server/openhands/agent_server/event_service.py:659awaitbeforeloop.run_in_executor(None, self._conversation.close)Bug 2 - Tmux cleanup on startup
openhands-agent-server/openhands/agent_server/api.pycleanup_stale_tmux_sessions()function called during server startupBug 3 - Task manager delete_on_close fix
openhands-tools/openhands/tools/task/manager.py:201openhands-tools/openhands/tools/task/manager.py:288delete_on_close=Falsetodelete_on_close=TrueTesting
Impact
These fixes prevent resource exhaustion from unbounded session accumulation that could crash the agent-server over time.
Closes: Fixes critical memory/process leaks in agent-server
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:d0289a6-pythonRun
All tags pushed for this build
About Multi-Architecture Support
d0289a6-python) is a multi-arch manifest supporting both amd64 and arm64d0289a6-python-amd64) are also available if needed