Add Browser Env Integration #732

filip-michalsky · 2026-01-15T11:46:19Z

Description

Adds BrowserEnv, a unified browser automation integration for the verifiers library that supports two operational modes:

DOM Mode (mode="dom")

Uses the Stagehand Python SDK for natural language browser control
Tools: navigate, observe, act, extract (Stagehand's AI-driven primitives)
Ideal for tasks that benefit from semantic understanding of page elements

CUA Mode (mode="cua")

Vision-based primitives for Computer Use Agent workflows
Tools: click, double_click, type_text, keypress, scroll, goto, back, forward, wait, screenshot
Requires a companion TypeScript server (included), which provides the CDP connection via Stagehand internals
Automatic screenshot management with context trimming for VLM input
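
For reviewers unfamiliar with the idea, context trimming for VLM input can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function name trim_screenshots, the keep_last parameter, and the placeholder text are hypothetical, and the messages are assumed to use OpenAI-style content parts where screenshots appear as {"type": "image_url", ...}.

```python
from copy import deepcopy
from typing import Any

Message = dict[str, Any]


def _is_image_part(part: Any) -> bool:
    # Assumption: screenshots are OpenAI-style "image_url" content parts.
    return isinstance(part, dict) and part.get("type") == "image_url"


def _has_image(message: Message) -> bool:
    content = message.get("content")
    return isinstance(content, list) and any(_is_image_part(p) for p in content)


def trim_screenshots(
    messages: list[Message],
    keep_last: int = 2,
    placeholder: str = "[screenshot removed]",
) -> list[Message]:
    """Return a copy where only the newest `keep_last` image-bearing
    messages keep their images; older images become text placeholders."""
    trimmed = deepcopy(messages)  # never mutate the caller's history
    image_indices = [i for i, m in enumerate(trimmed) if _has_image(m)]
    # A slice ending at -0 would be empty, so handle keep_last == 0 explicitly.
    stale = image_indices[:-keep_last] if keep_last > 0 else image_indices
    for i in stale:
        trimmed[i]["content"] = [
            {"type": "text", "text": placeholder} if _is_image_part(p) else p
            for p in trimmed[i]["content"]
        ]
    return trimmed
```

Replacing stale screenshots with a short text part, rather than dropping the whole message, keeps the tool call and response pairing intact while bounding the number of images sent to the model.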

Both modes support local browser execution or Browserbase cloud infrastructure.
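
Because the two backends need different credentials, a pre-flight check along these lines may help when reviewing the env var validation. This is a minimal sketch under stated assumptions: the backend labels ("local", "browserbase"), the REQUIRED_ENV table, and the helper name missing_env_vars are hypothetical, and the variable names are assumed rather than taken from this PR.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Hypothetical requirements table; the variable names are assumptions,
# not the list that BrowserEnv actually validates.
REQUIRED_ENV: dict[str, list[str]] = {
    "local": ["MODEL_API_KEY"],
    "browserbase": [
        "MODEL_API_KEY",
        "BROWSERBASE_API_KEY",
        "BROWSERBASE_PROJECT_ID",
    ],
}


def missing_env_vars(
    backend: str, env: Mapping[str, str] | None = None
) -> list[str]:
    """Return the required variables that are unset or empty for `backend`."""
    if backend not in REQUIRED_ENV:
        raise ValueError(f"unknown backend: {backend!r}")
    source = os.environ if env is None else env
    # Treat empty strings as missing so a blank .env entry is caught too.
    return [name for name in REQUIRED_ENV[backend] if not source.get(name)]
```

Returning the full list of missing names, instead of failing on the first one, lets a single error message tell the user everything that still needs to be set.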

What's included:

verifiers/envs/integrations/browser_env/ - Core integration (BrowserEnv, DOMMode, CUAMode)
verifiers/envs/integrations/browser_env/cua-server/ - TypeScript server for CUA mode
environments/browser_dom_example/ - Minimal DOM mode example
environments/browser_cua_example/ - Minimal CUA mode example
New [browser] extra: uv add 'verifiers[browser]'

Benchmarks (GAIA, WebVoyager, Mind2Web) have been pushed to Prime Hub under the browserbase/ namespace.

Type of Change

New feature (non-breaking change which adds functionality)

Testing

# DOM mode
prime eval run browserbase/browser-dom-example -m openai/gpt-4.1-mini

# CUA mode (start server first: cd verifiers/envs/integrations/browser_env/cua-server && ./start.sh)
prime eval run browserbase/browser-cua-example -m qwen/qwen3-vl-30b-a3b-instruct

All existing tests pass when running uv run pytest locally.
New tests have been added to cover the changes.

Checklist

My code follows the style guidelines of this project as outlined in AGENTS.md
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
Any dependent changes have been merged and published

Additional Notes

Future work:

Compile CUA TypeScript server to binary to remove Node.js dependency
Additional benchmark environments available on Prime Hub under browserbase/ org

Note

Adds a unified browser automation integration with two modes and supporting assets.

New BrowserEnv in verifiers/envs/integrations/browser_env with mode="dom" (Stagehand tools) and mode="cua" (vision primitives + screenshots); default system prompts; env var validation; custom tool call handling for multipart CUA responses; screenshot filtering
Exports BrowserEnv via verifiers/__init__.py and integration package __init__.py (lazy imports)
Examples: environments/browser_dom_example and environments/browser_cua_example with minimal datasets, judge rubric, README, and pyproject.toml
CUA server: TypeScript Fastify service under browser_env/cua-server/ (actions API, session management, README, scripts, env templates)
Docs: add BrowserEnv to docs/environments.md and integrations/README.md, including install extras and mode descriptions
Deps: new [project.optional-dependencies].browser extra (stagehand, aiohttp, python-dotenv)
Tests: tests/test_browser_env.py covering env var checks, prompts, CUA formatting/filtering, DOM LLM config, example datasets; update tests/test_envs.py to skip new browser examples and mcp_env
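
The lazy imports mentioned above keep the browser dependencies optional for users who never touch BrowserEnv. A common way to get that behavior is a module-level __getattr__ (PEP 562). The sketch below illustrates the pattern and is not necessarily how this PR implements it; make_lazy_getattr is a hypothetical helper, and the module path in the usage comment is inferred from the directory layout in the description.

```python
import importlib
from typing import Any, Callable


def make_lazy_getattr(
    module_name: str, lazy_attrs: dict[str, str]
) -> Callable[[str], Any]:
    """Build a PEP 562 module-level __getattr__ that imports on first access.

    `lazy_attrs` maps an exported name to the module that defines it.
    """

    def _lazy_getattr(name: str) -> Any:
        if name not in lazy_attrs:
            raise AttributeError(
                f"module {module_name!r} has no attribute {name!r}"
            )
        try:
            module = importlib.import_module(lazy_attrs[name])
        except ImportError as exc:
            # Point the user at the optional extra instead of a bare failure.
            raise ImportError(
                f"{name} requires optional dependencies; "
                "try: uv add 'verifiers[browser]'"
            ) from exc
        return getattr(module, name)

    return _lazy_getattr


# Hypothetical usage inside a package __init__.py:
# __getattr__ = make_lazy_getattr(
#     __name__, {"BrowserEnv": "verifiers.envs.integrations.browser_env"}
# )
```

With this pattern, importing the package itself does not import stagehand or aiohttp; the import cost and any missing-dependency error are deferred until BrowserEnv is first accessed.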

Written by Cursor Bugbot for commit 906a836. This will update automatically on new commits.

CLAassistant · 2026-01-15T11:46:34Z

All committers have signed the CLA.