Merged
Changes from 20 commits
Commits
23 commits
e286b4b
fix: send hook_config to server in RemoteConversation
xingyaoww Feb 18, 2026
80ba8c6
test: add test for hook_config being sent to server
xingyaoww Feb 18, 2026
a790935
feat: Add HookExecutionEvent for hook observability
xingyaoww Feb 18, 2026
8acdc2f
Update examples/02_remote_agent_server/01_convo_with_local_agent_serv…
xingyaoww Feb 18, 2026
16caa43
Apply suggestion from @xingyaoww
xingyaoww Feb 18, 2026
a8eae38
feat: Add hook_config to ConversationState and SystemPromptEvent
xingyaoww Feb 18, 2026
2444bd1
test: add stop hook and verify HookExecutionEvent for both hooks
openhands-agent Feb 19, 2026
9990f4e
Merge branch 'main' into fix/remote-conversation-hook-config
xingyaoww Mar 2, 2026
ecacb88
fix: avoid circular import when rebuilding SystemPromptEvent
openhands-agent Mar 2, 2026
008a9d7
fix: truncate HookExecutionEvent logs to 50k chars
openhands-agent Mar 2, 2026
988b89b
docs: fix RemoteConversation PostToolUse hook comment
openhands-agent Mar 2, 2026
071702b
test: update example to demonstrate on_stop hook with syntax validation
xingyaoww Mar 2, 2026
69f37a7
test: fix agent instruction to demonstrate full on_stop hook cycle
xingyaoww Mar 2, 2026
fd8a2af
Merge main into fix/remote-conversation-hook-config
xingyaoww Mar 6, 2026
32c781e
Add example run log after merge from main
xingyaoww Mar 6, 2026
4c14b6e
Add JSON mode example run log
xingyaoww Mar 6, 2026
cfc7445
Rename pre_commit_check.sh to pycompile_check.sh
xingyaoww Mar 6, 2026
70ebee0
Remove old log files from .pr folder
xingyaoww Mar 6, 2026
ce7f1e9
Merge branch 'main' into fix/remote-conversation-hook-config
xingyaoww Mar 9, 2026
e346ea2
fix: make all hooks server-only in RemoteConversation
openhands-agent Mar 9, 2026
ee3a7c0
refactor: remove hook_config from SystemPromptEvent
openhands-agent Mar 9, 2026
aa0129f
chore: Remove PR-only artifacts [automated]
Mar 9, 2026
e68d1ee
Merge branch 'main' into fix/remote-conversation-hook-config
xingyaoww Mar 9, 2026
611 changes: 611 additions & 0 deletions .pr/example_run_json_20260306_135350.log

Large diffs are not rendered by default.

97 changes: 97 additions & 0 deletions .pr/on_stop_hook_demo_summary.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,97 @@
# On-Stop Hook Demo Summary

## Test Scenario

The example (`examples/02_remote_agent_server/01_convo_with_local_agent_server.py`) was modified to demonstrate the `on_stop` hook feature:

1. **Hook Configuration**: A Stop hook is configured to run `pre_commit_check.sh` (later renamed to `pycompile_check.sh`), which validates Python syntax in the workspace
2. **Agent Instruction**: The agent is asked to create a Python file with a syntax error and then finish
3. **Expected Flow**:
   - Agent creates broken Python file
   - Agent tries to finish
   - Stop hook runs syntax check, finds error, returns `deny` with feedback
   - Agent receives feedback and continues
   - Agent attempts to fix the error
   - Cycle repeats until successful or max retries
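
The expected flow above can be sketched as a small standalone simulation. This is an illustrative sketch only, not the SDK's implementation: `stop_hook`, `run_until_allowed`, and the `fix` callback are hypothetical names, and Python's built-in `compile()` stands in for the shell script's syntax check.

```python
from typing import Callable


def stop_hook(source: str) -> tuple[str, str]:
    """Hypothetical stop hook: return ("allow", "") or ("deny", feedback)."""
    try:
        compile(source, "test_broken.py", "exec")
    except SyntaxError as exc:
        return "deny", f"SyntaxError: {exc.msg}"
    return "allow", ""


def run_until_allowed(
    source: str,
    fix: Callable[[str, str], str],
    max_retries: int = 10,
) -> tuple[str, int]:
    """Retry until the hook allows stopping; return (final source, denial count)."""
    denials = 0
    for _ in range(max_retries):
        decision, feedback = stop_hook(source)
        if decision == "allow":
            return source, denials
        denials += 1
        # The "agent" receives the feedback and attempts a fix.
        source = fix(source, feedback)
    raise RuntimeError("stop hook still denying after max retries")


broken = "def broken(:\n    pass\n"
fixed, denials = run_until_allowed(
    broken, lambda src, feedback: src.replace("(:", "():")
)
print(f"denials before success: {denials}")
```

The retry cap mirrors the "max retries" step: without it, a hook that can never be satisfied would keep the loop running indefinitely.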

## Evidence from Test Runs

### Hook Configuration Sent to Server ✅

From the logs:
```
ConversationStateUpdate(key=hook_config, value={'pre_tool_use': [], 'post_tool_use': [], 'user_prompt_submit': [], 'session_start': [], 'session_end': [], 'stop': [{'matcher': '*', 'hooks': [{'type': 'command', 'command': '/mnt/data/software-agent-sdk/examples/02_remote_agent_server/hook_scripts/pre_commit_check.sh', 'timeout': 60, 'async_': False}]}]})
```
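
The logged value is a plain dictionary keyed by hook event type. A minimal sketch of listing the configured commands per event type follows; the literal is abbreviated from the log above (the command path is shortened to a relative path for readability), and `configured_commands` is a hypothetical helper, not an SDK function.

```python
# Abbreviated copy of the logged hook_config value (path shortened for illustration).
hook_config = {
    "pre_tool_use": [],
    "post_tool_use": [],
    "user_prompt_submit": [],
    "session_start": [],
    "session_end": [],
    "stop": [
        {
            "matcher": "*",
            "hooks": [
                {
                    "type": "command",
                    "command": "hook_scripts/pre_commit_check.sh",
                    "timeout": 60,
                    "async_": False,
                }
            ],
        }
    ],
}


def configured_commands(config: dict) -> dict[str, list[str]]:
    """Map each event type that has hooks to the list of its commands."""
    return {
        event: [hook["command"] for matcher in matchers for hook in matcher["hooks"]]
        for event, matchers in config.items()
        if matchers
    }


print(configured_commands(hook_config))
```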

### Stop Hook Denied Stopping ✅

From server logs:
```
"Stop hook denied stopping: Blocked by hook"
"Stop hook denied agent stopping"
```

### Feedback Sent to Agent ✅

From client logs:
```
[Stop hook feedback]
SyntaxError: invalid syntax
```

### Agent Continued Running After Denial ✅

State transitions observed:
```
execution_status: running -> finished -> running -> finished -> running -> ...
```

This pattern repeated 5+ times, showing the hook successfully denying the stop and the agent continuing to work.
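
In this test scenario, each `finished -> running` transition corresponds to one denied stop. A sketch of counting them, assuming the statuses were recorded as plain strings (the `count_denied_stops` helper and the sample sequence are hypothetical, and in other scenarios a new user message could cause the same transition):

```python
def count_denied_stops(statuses: list[str]) -> int:
    """Count transitions from "finished" back to "running"."""
    return sum(
        1
        for previous, current in zip(statuses, statuses[1:])
        if previous == "finished" and current == "running"
    )


# Hypothetical recorded sequence: two denials, then a final successful finish.
observed = ["running", "finished", "running", "finished", "running", "finished"]
print(count_denied_stops(observed))
```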

## Key Findings

1. **The `hook_config` IS being properly sent to the server** in the conversation creation payload
2. **The Stop hook IS executing on the server side** when the agent tries to finish
3. **The hook denial IS being communicated** back to the agent as feedback
4. **The agent IS continuing to run** after receiving the denial feedback
5. **The agent successfully fixes issues and completes** - the full cycle works end-to-end

## Successful Test Run (17:29)

The full cycle was demonstrated:
1. ✅ Agent created `test_broken.py` with syntax error
2. ✅ Stop hook denied stopping ("Blocked by hook")
3. ✅ Agent received feedback about the error
4. ✅ Agent fixed the syntax error in `test_broken.py`
5. ✅ Agent finished successfully

From the logs:
```
Create a Python file called 'test_broken.py'
[File test_broken.py edited with 1 changes.]
Stop hook denied stopping: Blocked by hook
I'll check and correct the syntax error in test_broken.py.
[File test_broken.py edited with 1 changes.] (fix applied)
finished!
```

## Files Modified

- `examples/02_remote_agent_server/01_convo_with_local_agent_server.py` - Updated to use Stop hook
- `examples/02_remote_agent_server/hook_scripts/pre_commit_check.sh` - New script for syntax validation (later renamed to `pycompile_check.sh`)

## Log Files

- `example_run_output.log` - First test run
- `example_run_output2.log` - Second test run
- `example_run_output3.log` - Third test run with retry logic
- `test_run_20260302_172931.log` - Final successful test run with complete cycle

## Conclusion

The PR's fix to send `hook_config` to the server in RemoteConversation is working correctly. The Stop hook demonstrates the complete feedback loop where:
1. Hooks run on the server
2. Hook results (allow/deny) affect agent behavior
3. Feedback from denied hooks is sent back to the agent
4. The agent continues working to address the feedback
5. **The agent successfully completes after fixing issues** ✅
148 changes: 133 additions & 15 deletions examples/02_remote_agent_server/01_convo_with_local_agent_server.py
@@ -4,16 +4,21 @@
import tempfile
import threading
import time
from pathlib import Path

from pydantic import SecretStr

from openhands.sdk import LLM, Conversation, RemoteConversation, Workspace, get_logger
from openhands.sdk.event import ConversationStateUpdateEvent
from openhands.sdk.event import ConversationStateUpdateEvent, HookExecutionEvent
from openhands.sdk.hooks import HookConfig, HookDefinition, HookMatcher
from openhands.tools.preset.default import get_default_agent


logger = get_logger(__name__)

# Hook script directory for this example
HOOK_SCRIPTS_DIR = Path(__file__).parent / "hook_scripts"


def _stream_output(stream, prefix, target_stream):
"""Stream output from subprocess to target stream with prefix."""
@@ -168,42 +173,155 @@ def event_callback(event):
)
Collaborator

🟡 Suggestion: This example has grown significantly in complexity. The original showed basic RemoteConversation usage (~50 lines). Now it's a comprehensive hook testing harness (~350 lines) that deliberately creates broken Python files to test stop hook denial and retry logic.

For a "01_" introductory example, consider splitting this into:

  • 01_convo_with_local_agent_server.py - basic usage (~50 lines)
  • 02_hooks_with_stop_validation.py - comprehensive hook testing (current content)

The comprehensive demonstration is valuable for showing hook capabilities, but might be better as a separate, clearly-labeled advanced example.

Not blocking - this is an organizational suggestion to improve discoverability for users learning the basics.

logger.info(f"Output: {result.stdout}")

# Configure hooks - demonstrating the hooks system with RemoteConversation
# Server-side hooks (PreToolUse, PostToolUse, UserPromptSubmit, Stop) are
# executed by the agent server. Client-side hooks (SessionStart, SessionEnd)
# are executed locally.

hook_config = HookConfig(
    # Stop hook - run Python syntax check before allowing agent to finish.
    # If any Python file has syntax errors, the hook returns "deny" with the
    # error output, which gets sent back to the agent as feedback, and the
    # agent continues working to fix the issue.
    stop=[
        HookMatcher(
            matcher="*",  # Match all stop reasons
            hooks=[
                HookDefinition(
                    command=str(HOOK_SCRIPTS_DIR / "pycompile_check.sh"),
                    timeout=60,
                )
            ],
        )
    ],
)

conversation = Conversation(
agent=agent,
workspace=workspace,
callbacks=[event_callback],
hook_config=hook_config,
)
assert isinstance(conversation, RemoteConversation)

# Track hook execution events
hook_events: list[HookExecutionEvent] = []

def hook_event_tracker(event):
    """Additional callback to track hook execution events."""
    if isinstance(event, HookExecutionEvent):
        hook_events.append(event)
        logger.info(f"🪝 HookExecutionEvent captured: {event.hook_event_type}")

# Append our hook tracker to the existing callbacks
conversation._callbacks.append(hook_event_tracker)

try:
logger.info(f"\n📋 Conversation ID: {conversation.state.id}")

# Send first message and run
logger.info("📝 Sending first message...")
# Test scenario: Ask the agent to create a Python file with syntax errors
# The stop hook should detect the syntax error and send feedback back
# to the agent to fix it
logger.info("📝 Sending message to test on_stop hook with syntax check...")
conversation.send_message(
"Read the current repo and write 3 facts about the project into FACTS.txt."
"Create a Python file called 'test_broken.py' in the current directory "
"with an obvious syntax error (like 'def broken(:\n pass' - missing "
"closing parenthesis). After creating the file, immediately use the "
"finish action. If you receive any feedback about errors, fix them and "
"try to finish again."
)

# Generate title using a specific LLM
title = conversation.generate_title(max_length=60, llm=title_gen_llm)
logger.info(f"Generated conversation title: {title}")

logger.info("🚀 Running conversation...")
conversation.run()
logger.info(
"Expected behavior: Agent creates broken .py file -> tries to finish "
"-> stop hook runs syntax check -> check fails -> hook sends feedback "
"-> agent fixes the syntax error -> tries to finish again -> passes"
)

logger.info("✅ First task completed!")
logger.info(f"Agent status: {conversation.state.execution_status}")
# Keep running until the agent actually finishes
# When a stop hook denies, the state goes: running -> finished -> running
# The client's run() may return when it sees 'finished', so we need to
# check if the agent is still running and continue
max_runs = 10  # Allow enough retries for agent to fix issues
run_count = 0
while run_count < max_runs:
    run_count += 1
    logger.info(f"🔄 Run attempt #{run_count}")
    conversation.run()
    current_status = conversation.state.execution_status
    logger.info(f" After run(), status = {current_status}")

    # Small delay to let any pending state updates arrive
    time.sleep(0.5)
    current_status = conversation.state.execution_status
    logger.info(f" After delay, status = {current_status}")

    if current_status.value == "finished":
        logger.info(" ✅ Agent finished!")
        break
    elif current_status.value == "running":
        logger.info(" Agent still running (hook denied stop), continuing...")
    else:
        logger.info(f" Unexpected status: {current_status}, stopping")
        break

logger.info("✅ Task completed!")
logger.info(f"Final agent status: {conversation.state.execution_status}")

# Wait for events to stop coming (no events for 2 seconds)
logger.info("⏳ Waiting for events to stop...")
while time.time() - event_tracker["last_event_time"] < 2.0:
time.sleep(0.1)
logger.info("✅ Events have stopped")

logger.info("🚀 Running conversation again...")
conversation.send_message("Great! Now delete that file.")
conversation.run()
logger.info("✅ Second task completed!")
# Analyze hook execution events
logger.info("\n" + "=" * 50)
logger.info("📊 Hook Execution Events Analysis")
logger.info("=" * 50)

logger.info(f"Total HookExecutionEvents received: {len(hook_events)}")
for i, he in enumerate(hook_events, 1):
logger.info(f"\n Hook Event #{i}:")
logger.info(f" Type: {he.hook_event_type}")
logger.info(f" Command: {he.hook_command}")
logger.info(f" Success: {he.success}")
logger.info(f" Blocked: {he.blocked}")
logger.info(f" Exit Code: {he.exit_code}")
if he.additional_context:
# Truncate for readability
ctx = (
he.additional_context[:500] + "..."
if len(he.additional_context) > 500
else he.additional_context
)
logger.info(f" Additional Context: {ctx}")
if he.error:
logger.info(f" Error: {he.error}")

# Count stop hooks that were denied (pre-commit failed)
stop_events = [e for e in hook_events if e.hook_event_type == "Stop"]
denied_stops = [e for e in stop_events if e.blocked]

logger.info(f"\nStop hook events: {len(stop_events)}")
logger.info(f"Denied stops (pre-commit failures): {len(denied_stops)}")

if denied_stops:
logger.info(
"\n✅ SUCCESS: Stop hook denied at least once due to "
"pre-commit failure!"
)
logger.info(
" The agent should have received feedback and fixed the issue."
)
else:
logger.info(
"\n⚠️ No denied stops detected. Either pre-commit passed on first "
"try or the hook didn't work as expected."
)

# Demonstrate state.events functionality
logger.info("\n" + "=" * 50)
@@ -214,18 +332,18 @@ def event_callback(event):
total_events = len(conversation.state.events)
logger.info(f"📈 Total events in conversation: {total_events}")

# Get recent events (last 5) using state.events
logger.info("\n🔍 Getting last 5 events using state.events...")
# Get recent events (last 10) using state.events
logger.info("\n🔍 Getting last 10 events using state.events...")
all_events = conversation.state.events
recent_events = all_events[-5:] if len(all_events) >= 5 else all_events
recent_events = all_events[-10:] if len(all_events) >= 10 else all_events

for i, event in enumerate(recent_events, 1):
event_type = type(event).__name__
timestamp = getattr(event, "timestamp", "Unknown")
logger.info(f" {i}. {event_type} at {timestamp}")

# Let's see what the actual event types are
logger.info("\n🔍 Event types found:")
logger.info("\n🔍 Event types found in recent events:")
event_types = set()
for event in recent_events:
event_type = type(event).__name__
27 changes: 27 additions & 0 deletions examples/02_remote_agent_server/hook_scripts/pycompile_check.sh
@@ -0,0 +1,27 @@
#!/bin/bash
# Stop hook: Run Python syntax check on all .py files in the workspace
# Returns deny if any Python file has syntax errors, with the error output as feedback
#
# This hook validates that the agent hasn't broken any Python files.
# Environment variable CHECK_DIR can override the default working directory.

CHECK_DIR="${CHECK_DIR:-.}"

# Find all Python files and check for syntax errors
ERRORS=""
while IFS= read -r -d '' file; do
    # Run python syntax check
    result=$(python3 -m py_compile "$file" 2>&1)
    if [ $? -ne 0 ]; then
        ERRORS="${ERRORS}\n${result}"
    fi
done < <(find "$CHECK_DIR" -name "*.py" -print0 2>/dev/null)

if [ -n "$ERRORS" ]; then
    # Escape the output for JSON
    ESCAPED_OUTPUT=$(echo -e "$ERRORS" | head -50 | python3 -c 'import json,sys; print(json.dumps(sys.stdin.read()))')
    echo "{\"decision\": \"deny\", \"additionalContext\": $ESCAPED_OUTPUT}"
    exit 2
fi

exit 0
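
For readers more comfortable in Python, the same check can be sketched with the standard library alone. This is a hedged port for illustration, not part of the PR: `find_syntax_errors` and `run_check` are hypothetical names, while the `decision` and `additionalContext` keys, the 50-line cap, and the exit code 2 are taken from the script above. The demo at the bottom writes a deliberately broken file into a temporary directory.

```python
import json
import py_compile
import tempfile
from pathlib import Path


def find_syntax_errors(check_dir: str) -> list[str]:
    """Return one error message per .py file under check_dir that fails to compile."""
    errors = []
    for path in sorted(Path(check_dir).rglob("*.py")):
        try:
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError as exc:
            errors.append(exc.msg)
    return errors


def run_check(check_dir: str, max_lines: int = 50) -> tuple[int, str]:
    """Return (exit_code, stdout): (2, deny JSON) on errors, (0, "") otherwise."""
    errors = find_syntax_errors(check_dir)
    if not errors:
        return 0, ""
    # Cap the feedback at max_lines lines, like `head -50` in the shell script.
    lines = "\n".join(errors).splitlines()[:max_lines]
    payload = {"decision": "deny", "additionalContext": "\n".join(lines)}
    return 2, json.dumps(payload)


# Demo: a temporary workspace containing one file with a syntax error.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "test_broken.py").write_text("def broken(:\n    pass\n")
    exit_code, response = run_check(tmp)

print(exit_code)
print(response)
```

Building the payload with `json.dumps` avoids the manual quoting and `echo -e` escape handling that the shell version needs. A real hook script would print the response and exit with the returned code.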
3 changes: 2 additions & 1 deletion openhands-sdk/openhands/sdk/__init__.py
@@ -22,7 +22,7 @@
RemoteConversation,
)
from openhands.sdk.conversation.conversation_stats import ConversationStats
from openhands.sdk.event import Event, LLMConvertibleEvent
from openhands.sdk.event import Event, HookExecutionEvent, LLMConvertibleEvent
from openhands.sdk.event.llm_convertible import MessageEvent
from openhands.sdk.io import FileStore, LocalFileStore
from openhands.sdk.llm import (
@@ -99,6 +99,7 @@
"MCPToolDefinition",
"MCPToolObservation",
"MessageEvent",
"HookExecutionEvent",
"create_mcp_tools",
"get_logger",
"Conversation",
2 changes: 2 additions & 0 deletions openhands-sdk/openhands/sdk/agent/agent.py
@@ -203,6 +203,8 @@ def init_state(
dynamic_context=TextContent(text=dynamic_context)
if dynamic_context
else None,
# Include hook_config from state for observability
hook_config=state.hook_config,
)
on_event(event)

6 changes: 6 additions & 0 deletions openhands-sdk/openhands/sdk/conversation/base.py
@@ -30,6 +30,7 @@
if TYPE_CHECKING:
from openhands.sdk.agent.base import AgentBase
from openhands.sdk.conversation.state import ConversationExecutionStatus
from openhands.sdk.hooks import HookConfig


CallbackType = TypeVar(
@@ -95,6 +96,11 @@ def stats(self) -> ConversationStats:
"""The conversation statistics."""
...

@property
def hook_config(self) -> "HookConfig | None":
"""The hook configuration for this conversation."""
...


class BaseConversation(ABC):
"""Abstract base class for conversation implementations.
@@ -395,6 +395,9 @@ def _ensure_plugins_loaded(self) -> None:

# Set up hook processor with the combined config
if final_hook_config is not None:
# Store final hook_config in state for observability
self._state.hook_config = final_hook_config

self._hook_processor, self._on_event = create_hook_callback(
hook_config=final_hook_config,
working_dir=str(self.workspace.working_dir),