
Commit fdf88a0

Merge branch 'main' into openhands/use-default-preset-for-integration-tests
2 parents 0c1c46a + 13bcf02 commit fdf88a0

45 files changed

Lines changed: 3897 additions & 38 deletions

Some content is hidden

Lines changed: 4 additions & 1 deletion
@@ -15,9 +15,11 @@ You have permission to **APPROVE** or **COMMENT** on PRs. Do not use REQUEST_CHA
 
 **Default to APPROVE**: If your review finds no issues at "important" level or higher, approve the PR. Minor suggestions or nitpicks alone are not sufficient reason to withhold approval.
 
+**IMPORTANT: If you determine a PR is worth merging, you should approve it.** Don’t just say a PR is "worth merging" or "ready to merge" without actually submitting an approval. Your words and actions should be consistent.
+
 ### When to APPROVE
 
-Approve PRs that are straightforward and low-risk:
+Examples of straightforward and low-risk PRs you should approve (non-exhaustive):
 
 - **Configuration changes**: Adding models to config files, updating CI/workflow settings
 - **CI/Infrastructure changes**: Changing runner types, fixing workflow paths, updating job configurations
@@ -70,6 +72,7 @@ Do not leave comments for:
 - **Good behavior observed**: Don't comment just to praise code that follows best practices - this adds noise. Simply approve if the code is good.
 - **Suggestions for additional tests on simple changes**: For straightforward PRs (config changes, model additions, etc.), don't suggest adding test coverage unless tests are clearly missing for new logic
 - **Obvious or self-explanatory code**: Don't ask for comments on code that is already clear
+- **`.pr/` directory artifacts**: Files in the `.pr/` directory are temporary PR-specific documents (design notes, analysis, scripts) that are automatically cleaned up when the PR is approved. Do not comment on their presence or suggest removing them.
 
 If a PR is approvable, just approve it. Don't add "one small suggestion" or "consider doing X" comments that delay merging without adding real value.
 

.openhands/skills/debug-test-examples-workflow/SKILL.md renamed to .agents/skills/debug-test-examples-workflow/SKILL.md

File renamed without changes.
File renamed without changes.

.github/run-eval/resolve_model_config.py

Lines changed: 7 additions & 0 deletions
@@ -145,6 +145,13 @@
             "disable_vision": True,
         },
     },
+    "glm-5": {
+        "id": "glm-5",
+        "display_name": "GLM-5",
+        "llm_config": {
+            "model": "litellm_proxy/openrouter/z-ai/glm-5",
+        },
+    },
     "qwen3-coder-next": {
         "id": "qwen3-coder-next",
         "display_name": "Qwen3 Coder Next",
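
The entries in this hunk share one shape: a dictionary key, a matching `id`, a `display_name`, and an `llm_config` with a `model` string. A minimal sketch of a consistency check for that shape follows. The names `MODELS` and `find_invalid_entries` are hypothetical, and the rule that the key must equal `id` is an assumption inferred from the two entries visible here, not something `resolve_model_config.py` is known to enforce.

```python
# Hypothetical registry shaped like the entries in the hunk above.
# The real variable name in resolve_model_config.py is not shown in the diff.
MODELS = {
    "glm-5": {
        "id": "glm-5",
        "display_name": "GLM-5",
        "llm_config": {"model": "litellm_proxy/openrouter/z-ai/glm-5"},
    },
}


def find_invalid_entries(models: dict[str, dict]) -> list[str]:
    """Return one human-readable problem per malformed field."""
    problems: list[str] = []
    for key, entry in models.items():
        # Assumed invariant: the dictionary key and the "id" field agree.
        if entry.get("id") != key:
            problems.append(f"{key}: 'id' does not match the dictionary key")
        if not entry.get("display_name"):
            problems.append(f"{key}: missing 'display_name'")
        if not entry.get("llm_config", {}).get("model"):
            problems.append(f"{key}: missing 'llm_config.model'")
    return problems


if __name__ == "__main__":
    print(find_invalid_entries(MODELS))
```

A check like this could run in CI whenever a model is added, so a typo in a new entry is caught before an evaluation run starts.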

.github/workflows/pr-review-by-openhands.yml

Lines changed: 6 additions & 5 deletions
@@ -8,7 +8,8 @@ on:
   # 2. A draft PR is marked as ready for review, OR
   # 3. A maintainer adds the 'review-this' label, OR
   # 4. A maintainer requests openhands-agent or all-hands-bot as a reviewer
-  # Only users with write access can add labels or request reviews, ensuring security.
+  # Adding labels and requesting new reviewers requires write access. GitHub may also allow PR authors
+  # to re-request review from a previous reviewer.
   # The PR code is explicitly checked out for review, but secrets are only accessible
   # because the workflow runs in the base repository context
   pull_request_target:
@@ -22,14 +23,14 @@ permissions:
 jobs:
   pr-review:
     # Run when one of the following conditions is met:
-    # 1. A new non-draft PR is opened by a trusted contributor, OR
-    # 2. A draft PR is converted to ready for review by a trusted contributor, OR
+    # 1. A new non-draft PR is opened by a non-first-time contributor, OR
+    # 2. A draft PR is converted to ready for review by a non-first-time contributor, OR
     # 3. 'review-this' label is added, OR
     # 4. openhands-agent or all-hands-bot is requested as a reviewer
     # Note: FIRST_TIME_CONTRIBUTOR PRs require manual trigger via label/reviewer request
     if: |
-      (github.event.action == 'opened' && github.event.pull_request.draft == false && github.event.pull_request.author_association != 'FIRST_TIME_CONTRIBUTOR') ||
-      (github.event.action == 'ready_for_review' && github.event.pull_request.author_association != 'FIRST_TIME_CONTRIBUTOR') ||
+      (github.event.action == 'opened' && github.event.pull_request.draft == false && github.event.pull_request.author_association != 'FIRST_TIME_CONTRIBUTOR' && github.event.pull_request.author_association != 'NONE') ||
+      (github.event.action == 'ready_for_review' && github.event.pull_request.author_association != 'FIRST_TIME_CONTRIBUTOR' && github.event.pull_request.author_association != 'NONE') ||
       github.event.label.name == 'review-this' ||
       github.event.requested_reviewer.login == 'openhands-agent' ||
       github.event.requested_reviewer.login == 'all-hands-bot'
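
The updated `if:` expression is easier to reason about when written as a plain function over the event payload. The sketch below mirrors the four conditions from the hunk above. The function name `should_run_review` and the two constant names are hypothetical, and the payload is assumed to be a dictionary with the same field names the workflow reads from `github.event`.

```python
# Hypothetical helper mirroring the workflow's `if:` expression.
# It is an illustration only and is not part of the repository.
BOT_REVIEWERS = {"openhands-agent", "all-hands-bot"}
UNTRUSTED_ASSOCIATIONS = {"FIRST_TIME_CONTRIBUTOR", "NONE"}


def should_run_review(event: dict) -> bool:
    """Return True when the payload satisfies the workflow's trigger condition."""
    action = event.get("action")
    pr = event.get("pull_request", {})
    trusted = pr.get("author_association") not in UNTRUSTED_ASSOCIATIONS

    # 1. A new non-draft PR opened by an author with an existing association.
    if action == "opened" and not pr.get("draft", False) and trusted:
        return True
    # 2. A draft PR marked ready for review by such an author.
    if action == "ready_for_review" and trusted:
        return True
    # 3. The 'review-this' label was added (no author check applies).
    if event.get("label", {}).get("name") == "review-this":
        return True
    # 4. One of the bot accounts was requested as a reviewer.
    return event.get("requested_reviewer", {}).get("login") in BOT_REVIEWERS


if __name__ == "__main__":
    payload = {
        "action": "opened",
        "pull_request": {"draft": False, "author_association": "NONE"},
    }
    print(should_run_review(payload))
```

With the `NONE` check added in this commit, the automatic paths (1 and 2) skip authors who have no prior association with the repository, while the label and reviewer-request paths (3 and 4) still work for any PR.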

.github/workflows/server.yml

Lines changed: 1 addition & 1 deletion
@@ -708,7 +708,7 @@ jobs:
             echo 'EOF'
           } >> $GITHUB_OUTPUT
 
-      - name: Update PR description with comprehensive docker information
+      - name: Update PR description with docker image details
         uses: nefrob/pr-description@v1.2.0
         with:
           content: ${{ steps.generate_description.outputs.pr_content }}

.github/workflows/todo-management.yml

Lines changed: 1 addition & 1 deletion
@@ -219,7 +219,7 @@ jobs:
           echo "Available files:"
           ls -la
 
-          # Run the agent with comprehensive logging
+          # Run the agent with detailed logging
           echo "Starting agent execution..."
           set +e  # Don't exit on error, we want to capture it
           uv run python agent.py "$TODO_JSON" 2>&1 | tee agent_output.log

AGENTS.md

Lines changed: 19 additions & 0 deletions
@@ -116,6 +116,10 @@ When reviewing code, provide constructive feedback:
 - If it is a single-line string, you can break it into a multi-line string by doing "ABC" -> ("A"\n"B"\n"C")
 - If it is a long multi-line string (e.g., docstring), you should just add type ignore AFTER the ending """. You should NEVER ADD IT INSIDE the docstring.
 
+# PyInstaller Data Files
+
+When adding non-Python files (JS, templates, etc.) loaded at runtime, add them to `openhands-agent-server/openhands/agent_server/agent-server.spec` using `collect_data_files`.
+
 </DEV_SETUP>
 
 <PR_ARTIFACTS>
@@ -280,6 +284,21 @@ git push -u origin <feature-name>
 ```
 </DOCUMENTATION_WORKFLOW>
 
+<AGENT_TMP_DIRECTORY>
+# Agent Temporary Directory Convention
+
+When tools need to store observation files (e.g., browser session recordings, task tracker data), use `.agent_tmp` as the directory name for consistency.
+
+The browser session recording tool saves recordings to `.agent_tmp/observations/recording-{timestamp}/`.
+
+This convention ensures tool-generated observation files are stored in a predictable location that can be easily:
+- Added to `.gitignore`
+- Cleaned up after agent sessions
+- Identified as agent-generated artifacts
+
+Note: This is separate from `persistence_dir` which is used for conversation state persistence.
+</AGENT_TMP_DIRECTORY>
+
 <REPO>
 <PROJECT_STRUCTURE>
 - `openhands-sdk/` core SDK; `openhands-tools/` built-in tools; `openhands-workspace/` workspace management; `openhands-agent-server/` server runtime; `examples/` runnable patterns; `tests/` split by domain (`tests/sdk`, `tests/tools`, `tests/agent_server`, etc.).
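
The `.agent_tmp` convention added to AGENTS.md in this commit describes a path layout of `.agent_tmp/observations/recording-{timestamp}/`. A minimal sketch of how a tool might build such a directory is shown below. The helper name `new_recording_dir` is hypothetical, and the timestamp format is an assumption, since the diff does not show the format the browser tool actually uses.

```python
from __future__ import annotations

from datetime import datetime, timezone
from pathlib import Path

AGENT_TMP_DIR = ".agent_tmp"  # directory name taken from the AGENTS.md convention


def new_recording_dir(workspace: Path, now: datetime | None = None) -> Path:
    """Create and return <workspace>/.agent_tmp/observations/recording-<timestamp>/."""
    now = now or datetime.now(timezone.utc)
    # Assumed timestamp format; the real tool may format it differently.
    stamp = now.strftime("%Y%m%d-%H%M%S")
    path = workspace / AGENT_TMP_DIR / "observations" / f"recording-{stamp}"
    path.mkdir(parents=True, exist_ok=True)
    return path


if __name__ == "__main__":
    print(new_recording_dir(Path.cwd()))
```

Keeping every tool under one top-level directory means a single `.agent_tmp/` line in `.gitignore` covers all of them, and cleanup after a session is one directory removal.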
Lines changed: 178 additions & 0 deletions
@@ -0,0 +1,178 @@
+"""Browser Session Recording Example
+
+This example demonstrates how to use the browser session recording feature
+to capture and save a recording of the agent's browser interactions using rrweb.
+
+The recording can be replayed later using rrweb-player to visualize the agent's
+browsing session.
+
+The recording will be automatically saved to the persistence directory when
+browser_stop_recording is called. You can replay it with:
+- rrweb-player: https://github.com/rrweb-io/rrweb/tree/master/packages/rrweb-player
+- Online viewer: https://www.rrweb.io/demo/
+"""
+
+import json
+import os
+
+from pydantic import SecretStr
+
+from openhands.sdk import (
+    LLM,
+    Agent,
+    Conversation,
+    Event,
+    LLMConvertibleEvent,
+    get_logger,
+)
+from openhands.sdk.tool import Tool
+from openhands.tools.browser_use import BrowserToolSet
+from openhands.tools.browser_use.definition import BROWSER_RECORDING_OUTPUT_DIR
+
+
+logger = get_logger(__name__)
+
+# Configure LLM
+api_key = os.getenv("LLM_API_KEY")
+assert api_key is not None, "LLM_API_KEY environment variable is not set."
+model = os.getenv("LLM_MODEL", "anthropic/claude-sonnet-4-5-20250929")
+base_url = os.getenv("LLM_BASE_URL")
+llm = LLM(
+    usage_id="agent",
+    model=model,
+    base_url=base_url,
+    api_key=SecretStr(api_key),
+)
+
+# Tools - including browser tools with recording capability
+cwd = os.getcwd()
+tools = [
+    Tool(name=BrowserToolSet.name),
+]
+
+# Agent
+agent = Agent(llm=llm, tools=tools)
+
+llm_messages = []  # collect raw LLM messages
+
+
+def conversation_callback(event: Event):
+    if isinstance(event, LLMConvertibleEvent):
+        llm_messages.append(event.to_llm_message())
+
+
+# Create conversation with persistence_dir set to save browser recordings
+conversation = Conversation(
+    agent=agent,
+    callbacks=[conversation_callback],
+    workspace=cwd,
+    persistence_dir="./.conversations",
+)
+
+# The prompt instructs the agent to:
+# 1. Start recording the browser session
+# 2. Browse to a website and perform some actions
+# 3. Stop recording (auto-saves to file)
+PROMPT = """
+Please complete the following task to demonstrate browser session recording:
+
+1. First, use `browser_start_recording` to begin recording the browser session.
+
+2. Then navigate to https://docs.openhands.dev/ and:
+- Get the page content
+- Scroll down the page
+- Get the browser state to see interactive elements
+
+3. Next, navigate to https://docs.openhands.dev/openhands/usage/cli/installation and:
+- Get the page content
+- Scroll down to see more content
+
+4. Finally, use `browser_stop_recording` to stop the recording.
+Events are automatically saved.
+"""
+
+print("=" * 80)
+print("Browser Session Recording Example")
+print("=" * 80)
+print("\nTask: Record an agent's browser session and save it for replay")
+print("\nStarting conversation with agent...\n")
+
+conversation.send_message(PROMPT)
+conversation.run()
+
+print("\n" + "=" * 80)
+print("Conversation finished!")
+print("=" * 80)
+
+# Check if the recording files were created
+# Recordings are saved in BROWSER_RECORDING_OUTPUT_DIR/recording-{timestamp}/
+if os.path.exists(BROWSER_RECORDING_OUTPUT_DIR):
+    # Find recording subdirectories (they start with "recording-")
+    recording_dirs = sorted(
+        [
+            d
+            for d in os.listdir(BROWSER_RECORDING_OUTPUT_DIR)
+            if d.startswith("recording-")
+            and os.path.isdir(os.path.join(BROWSER_RECORDING_OUTPUT_DIR, d))
+        ]
+    )
+
+    if recording_dirs:
+        # Process the most recent recording directory
+        latest_recording = recording_dirs[-1]
+        recording_path = os.path.join(BROWSER_RECORDING_OUTPUT_DIR, latest_recording)
+        json_files = sorted(
+            [f for f in os.listdir(recording_path) if f.endswith(".json")]
+        )
+
+        print(f"\n✓ Recording saved to: {recording_path}")
+        print(f"✓ Number of files: {len(json_files)}")
+
+        # Count total events across all files
+        total_events = 0
+        all_event_types: dict[int | str, int] = {}
+        total_size = 0
+
+        for json_file in json_files:
+            filepath = os.path.join(recording_path, json_file)
+            file_size = os.path.getsize(filepath)
+            total_size += file_size
+
+            with open(filepath) as f:
+                events = json.load(f)
+
+            # Events are stored as a list in each file
+            if isinstance(events, list):
+                total_events += len(events)
+                for event in events:
+                    event_type = event.get("type", "unknown")
+                    all_event_types[event_type] = all_event_types.get(event_type, 0) + 1
+
+            print(f" - {json_file}: {len(events)} events, {file_size} bytes")
+
+        print(f"✓ Total events: {total_events}")
+        print(f"✓ Total size: {total_size} bytes")
+        if all_event_types:
+            print(f"✓ Event types: {all_event_types}")
+
+        print("\nTo replay this recording, you can use:")
+        print(
+            " - rrweb-player: "
+            "https://github.com/rrweb-io/rrweb/tree/master/packages/rrweb-player"
+        )
+    else:
+        print(f"\n✗ No recording directories found in: {BROWSER_RECORDING_OUTPUT_DIR}")
+        print(" The agent may not have completed the recording task.")
+else:
+    print(f"\n✗ Observations directory not found: {BROWSER_RECORDING_OUTPUT_DIR}")
+    print(" The agent may not have completed the recording task.")
+
+print("\n" + "=" * 100)
+print("Conversation finished.")
+print(f"Total LLM messages: {len(llm_messages)}")
+print("=" * 100)
+
+# Report cost
+cost = conversation.conversation_stats.get_combined_metrics().accumulated_cost
+print(f"Conversation ID: {conversation.id}")
+print(f"EXAMPLE_COST: {cost}")
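
The example above only counts events per file. To replay a session, the chunks would need to be combined into one event list first. The sketch below is one way to do that under stated assumptions: the helper name `load_recording_events` is hypothetical, each `.json` file is assumed to hold a list of event objects (as the example's own comment says), and each event is assumed to carry the numeric `timestamp` field that rrweb events normally have.

```python
from __future__ import annotations

import json
from pathlib import Path


def load_recording_events(recording_dir: str | Path) -> list[dict]:
    """Combine every JSON chunk in a recording directory into one event list."""
    events: list[dict] = []
    for chunk in sorted(Path(recording_dir).glob("*.json")):
        data = json.loads(chunk.read_text(encoding="utf-8"))
        # Each chunk is expected to hold a list of event objects; skip anything else.
        if isinstance(data, list):
            events.extend(e for e in data if isinstance(e, dict))
    # Order by the assumed "timestamp" field; events lacking one sort first.
    events.sort(key=lambda e: e.get("timestamp", 0))
    return events


if __name__ == "__main__":
    import sys

    merged = load_recording_events(sys.argv[1] if len(sys.argv) > 1 else ".")
    print(f"{len(merged)} events loaded")
```

Sorting by timestamp instead of relying on file names keeps the result correct even if the chunk naming scheme does not sort chronologically.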
