Add Browser Env Integration #732
Co-Authored-By: Claude Opus 4.5 <[email protected]>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
    if self.mode == "cua":
        # Filter screenshots to manage context size
        messages = self._mode_impl.filter_screenshots_in_messages(list(messages))
Screenshot filtering ineffective due to wrong placement
High Severity
The filter_screenshots_in_messages call in env_response doesn't actually filter screenshots from the context sent to the model. In MultiTurnEnv.get_prompt_messages, the original unfiltered messages variable is used in the final concat_messages([messages, env_response]) call. The filtered version only affects what's passed to super().env_response() for tool call extraction, which only examines messages[-1] anyway. Screenshots will accumulate unbounded in the conversation, potentially causing context length issues despite the filtering code existing.
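One way to address this is to apply the filter to the message list that is actually sent to the model, not to a temporary copy. The sketch below is illustrative only: `keep_recent_screenshots` and `build_prompt` are hypothetical helpers, not the PR's `filter_screenshots_in_messages` or the library's `get_prompt_messages`, and the message shape (OpenAI-style content parts with `"type": "image_url"`) is an assumption.

```python
from copy import deepcopy
from typing import Any

Message = dict[str, Any]

PLACEHOLDER = {"type": "text", "text": "[screenshot removed]"}


def _is_image_part(part: Any) -> bool:
    # Assumption: screenshots are stored as OpenAI-style image_url content parts.
    return isinstance(part, dict) and part.get("type") == "image_url"


def keep_recent_screenshots(messages: list[Message], keep: int = 2) -> list[Message]:
    """Return a copy in which all but the newest `keep` screenshots are replaced."""
    filtered = deepcopy(messages)  # never mutate the caller's history
    remaining = keep
    # Walk from newest to oldest so the most recent screenshots survive.
    for message in reversed(filtered):
        content = message.get("content")
        if not isinstance(content, list):
            continue
        for index in range(len(content) - 1, -1, -1):
            if not _is_image_part(content[index]):
                continue
            if remaining > 0:
                remaining -= 1
            else:
                content[index] = dict(PLACEHOLDER)
    return filtered


def build_prompt(history: list[Message], env_response: list[Message], keep: int = 2) -> list[Message]:
    # Hypothetical stand-in for the prompt-building step: the filter is applied
    # to the combined list that becomes the model context.
    return keep_recent_screenshots(history + env_response, keep=keep)
```

Because the filter runs on the concatenated list, the number of screenshots in the context stays bounded by `keep` regardless of how many turns have accumulated.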
        browserbase_api_key=self.api_key,
        browserbase_project_id=self.project_id,
        model_api_key=api_key,
    )
Missing synchronization causes resource leak in concurrent rollouts
Medium Severity
DOMMode._create_session lacks synchronization when creating the shared stagehand_client. If multiple rollouts call _create_session concurrently when stagehand_client is None, each creates a new AsyncStagehand instance. Only the last one is stored; the others are orphaned with unclosed connections. CUAMode correctly uses _thread_lock and _client_lock to protect its shared client creation, but DOMMode has no such protection.
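A common pattern for this kind of lazy shared-client creation is a double-checked `asyncio.Lock`. The following is a minimal sketch under stated assumptions: `FakeClient` and `SessionFactory` are hypothetical stand-ins, not `AsyncStagehand` or `DOMMode`, and it only covers coroutines sharing one event loop. If rollouts can run on several threads, a threading lock would also be needed, as the comment notes `CUAMode` does with `_thread_lock`.

```python
import asyncio
from typing import Optional


class FakeClient:
    """Hypothetical stand-in for a shared browser client."""

    instances_created = 0

    def __init__(self) -> None:
        FakeClient.instances_created += 1


class SessionFactory:
    """Hypothetical holder of a lazily created, shared client."""

    def __init__(self) -> None:
        self._client: Optional[FakeClient] = None
        self._client_lock = asyncio.Lock()

    async def get_client(self) -> FakeClient:
        # Fast path: the client already exists, no locking needed.
        if self._client is not None:
            return self._client
        async with self._client_lock:
            # Re-check inside the lock: another coroutine may have created
            # the client while this one was waiting.
            if self._client is None:
                await asyncio.sleep(0)  # simulate async setup that yields control
                self._client = FakeClient()
        return self._client


async def run_concurrent_rollouts(n: int = 5) -> int:
    # The factory is created inside the running loop so the lock belongs to it.
    factory = SessionFactory()
    clients = await asyncio.gather(*(factory.get_client() for _ in range(n)))
    assert all(client is clients[0] for client in clients)
    return FakeClient.instances_created
```

Without the lock, every coroutine that observes `self._client is None` before the first creation finishes would build its own client, and all but the last assignment would be orphaned.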
Description
Adds `BrowserEnv` - a unified browser automation integration for the verifiers library supporting two operational modes:
- DOM Mode (`mode="dom"`): `navigate`, `observe`, `act`, `extract` - Stagehand's AI-driven primitives
- CUA Mode (`mode="cua"`): `click`, `double_click`, `type_text`, `keypress`, `scroll`, `goto`, `back`, `forward`, `wait`, `screenshot`

Both modes support local browser execution or Browserbase cloud infrastructure.

What's included:
- `verifiers/envs/integrations/browser_env/` - Core integration (`BrowserEnv`, `DOMMode`, `CUAMode`)
- `verifiers/envs/integrations/browser_env/cua-server/` - TypeScript server for CUA mode
- `environments/browser_dom_example/` - Minimal DOM mode example
- `environments/browser_cua_example/` - Minimal CUA mode example
- `[browser]` extra: `uv add 'verifiers[browser]'`

Benchmarks (GAIA, WebVoyager, Mind2Web) have been pushed to Prime Hub under the `browserbase/` namespace.

Type of Change
Testing
`uv run pytest` locally.
Checklist
Additional Notes
Future work:
browserbase/org~
Note
Adds a unified browser automation integration with two modes and supporting assets.
- `BrowserEnv` in `verifiers/envs/integrations/browser_env` with `mode="dom"` (Stagehand tools) and `mode="cua"` (vision primitives + screenshots); default system prompts; env var validation; custom tool call handling for multipart CUA responses; screenshot filtering
- `BrowserEnv` via `verifiers/__init__.py` and integration package `__init__.py` (lazy imports)
- `environments/browser_dom_example` and `environments/browser_cua_example` with minimal datasets, judge rubric, README, and `pyproject.toml`
- `browser_env/cua-server/` (actions API, session management, README, scripts, env templates)
- `BrowserEnv` to `docs/environments.md` and `integrations/README.md`, including install extras and mode descriptions
- `[project.optional-dependencies].browser` extra (`stagehand`, `aiohttp`, `python-dotenv`)
- `tests/test_browser_env.py` covering env var checks, prompts, CUA formatting/filtering, DOM LLM config, example datasets; update `tests/test_envs.py` to skip new browser examples and `mcp_env`

Written by Cursor Bugbot for commit 906a836. This will update automatically on new commits.