
feat: browser-use tool migration [0.9.0]#352

Open
shanyu-strix wants to merge 40 commits into usestrix:main from shanyu-strix:feat/browseruse

Conversation


@shanyu-strix shanyu-strix commented Mar 11, 2026

feat: browser-use tool migration

TL;DR

Ripped out the old Playwright browser tool and replaced it with browser-use. Agents can now browse the web with natural language ("go to example.com and find the login form") or precise commands (click element #5, set a cookie, take a screenshot). Works both inside the sandbox container and against your local Chrome.

[screen recording attached: 2026-03-11 11 21 13]

What's new

Two ways to use the browser:

  • NEW 🌟 action="run" + a natural language task → browser-use Agent handles it end-to-end
  • action="click", "screenshot", "cookies", etc. → direct CDP commands, no LLM needed

Two modes:

  • Sandboxed (default) — Chromium runs inside the Docker container, traffic goes through Caido proxy
  • Local (use_local=True) — attaches to your system Chrome, keeps your login sessions and cookies [⚠️EXPERIMENTAL]

Self-healing sessions:

  • If Chromium crashes (it will), a watchdog restarts it and the Python layer reconnects automatically. A crash mid-task means ~15-30s of recovery, not a dead session.
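
On the Python side, that reconnect logic is essentially a backoff poll against the CDP endpoint. A minimal sketch, with the caveat that `wait_for_cdp` and its parameters are hypothetical and not the PR's actual code:

```python
import time


def wait_for_cdp(probe, attempts: int = 10, delay: float = 1.0, backoff: float = 1.5) -> int:
    """Poll until a CDP endpoint answers again, with exponential backoff.

    `probe` is any zero-arg callable that returns True once Chromium's
    debug port is reachable (e.g. an HTTP GET of /json/version).
    Returns the number of attempts used, or raises TimeoutError.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        time.sleep(delay)
        delay *= backoff  # back off so a restarting Chromium isn't hammered
    raise TimeoutError("CDP endpoint did not come back")
```

With one-second initial delay and 1.5x backoff, ten attempts span roughly the 15-30s recovery window described above.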

TODO

  • Add ability for the agent to select extra outputs from the agentic run
  • Add authentication to CDP
  • Harden authentication for tool server
  • Add timeouts

Notable decisions

socat for CDP — Chromium only binds its debug port to 127.0.0.1, even if you tell it not to. Docker port mapping needs 0.0.0.0. socat forwards between the two. Standard workaround.
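
What the forwarder does can be sketched in pure asyncio for illustration; the PR itself uses socat (and later a dedicated auth proxy), so none of this code is from the PR:

```python
import asyncio


async def forward(listen_host: str, listen_port: int,
                  target_host: str, target_port: int) -> asyncio.AbstractServer:
    """Accept connections on an externally visible interface and pipe
    bytes to the loopback-only CDP port, like the socat hop described
    above. Illustrative only."""

    async def pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
        try:
            while data := await reader.read(65536):
                writer.write(data)
                await writer.drain()
        finally:
            writer.close()

    async def handle(client_reader, client_writer):
        upstream_reader, upstream_writer = await asyncio.open_connection(
            target_host, target_port
        )
        # Shuttle bytes in both directions until either side closes.
        await asyncio.gather(
            pipe(client_reader, upstream_writer),
            pipe(upstream_reader, client_writer),
            return_exceptions=True,
        )

    return await asyncio.start_server(handle, listen_host, listen_port)
```

socat collapses all of this into a one-liner, which is why it is the standard workaround.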

WebSocket URL rewriting — Chromium reports its WebSocket URL with the container-internal address. We rewrite it to the Docker-mapped host:port so connections from outside the container actually work.
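
The rewrite amounts to swapping the netloc in the webSocketDebuggerUrl field that CDP's /json/version endpoint returns. A minimal sketch; the function name and signature are illustrative, not the PR's actual helper:

```python
import json
from urllib.parse import urlsplit, urlunsplit


def rewrite_ws_url(version_json: str, public_host: str, public_port: int) -> str:
    """Rewrite the webSocketDebuggerUrl reported by CDP so it points at
    the Docker-mapped host:port instead of the container-internal
    address. Illustrative sketch, not the PR's code."""
    info = json.loads(version_json)
    parts = urlsplit(info["webSocketDebuggerUrl"])
    # Replace only the host:port; scheme and /devtools/... path survive.
    rewritten = parts._replace(netloc=f"{public_host}:{public_port}")
    return urlunsplit(rewritten)
```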

LiteLLM adapter — browser-use expects a specific chat model interface. Rather than hardcoding a provider, ChatLiteLLM routes through litellm so any model string (anthropic/claude-sonnet-4-20250514, openrouter/google/gemini-2.0-flash-001, etc.) just works.
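
litellm model strings carry the provider as a prefix, which is why any string routes without adapter changes. A tiny illustration — the helper name is ours, and litellm performs this parsing internally; the adapter simply passes the full string through:

```python
def split_model_string(model: str) -> tuple[str, str]:
    """Split a litellm model string into (provider, model-id).

    Some providers nest a second level, e.g.
    'openrouter/google/gemini-2.0-flash-001' -> provider 'openrouter'.
    Shown only to illustrate the routing; ChatLiteLLM forwards the
    unmodified string to litellm.
    """
    provider, _, rest = model.partition("/")
    return provider, rest
```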

Contributor

greptile-apps bot commented Mar 11, 2026

Greptile Summary

This PR replaces the Playwright-based browser tool with browser-use, adding a natural-language agent mode (action="run") alongside all the existing direct CDP commands. It introduces a clean ChatLiteLLM adapter so any litellm-compatible model string works out of the box, wires up an authenticated CDP proxy (proxy.py) instead of the previous raw socat forward, adds a self-healing Chromium watchdog in the entrypoint, and plumbs browser_cdp_port all the way from DockerRuntime through SandboxInfo to the tool. The overall architecture is solid and the credential-hygiene improvement (token read from env instead of CLI arg) is a nice touch.

Key findings:

  • Resource leak in _launch_local_browser race: The race-condition fix applied to _launch_browser (closing the discarded browser via a background task) was not applied to _launch_local_browser. When two coroutines race, the losing browser from Browser.from_system_chrome() is silently dropped without being closed.
  • Agent cleanup double-close: The finally block in run_agent iterates over ("close", "stop") with no break/return after a successful call, so both methods are always invoked on the agent. The coro := fn() walrus assignment also calls the function as a side effect before the coroutine check. The equivalent helper _close_browser avoids this with an explicit return after success.
  • Non-constant-time token comparison in proxy.py: Bearer token validation uses Python ==, which is not timing-safe. Using hmac.compare_digest is the standard fix.

Confidence Score: 3/5

  • Safe to merge after fixing the local-browser resource leak; the other two issues are low-severity cleanup concerns.
  • The PR is a substantial, well-structured migration. The resource leak in _launch_local_browser is a real bug (it's the same class of issue already fixed for _launch_browser), but it only triggers under concurrent-launch conditions for the experimental local-Chrome mode. The double-close issue is low-risk because all errors are caught. The timing attack is minor given the Docker-internal context. No data-loss, auth-bypass, or crash-path issues were found in the primary sandboxed flow.
  • strix/tools/browser/browser_manager.py: _launch_local_browser needs the same race-condition fix already applied to _launch_browser.

Important Files Changed

  • strix/tools/browser/browser_manager.py: Core session lifecycle management for browser-use; the race-condition fix applied to _launch_browser was not applied symmetrically to _launch_local_browser, leaving a resource leak when two coroutines race to create a local session.
  • strix/tools/browser/browser_actions.py: Implements all browser actions (agent run + direct CDP commands); agent cleanup loop has no early-exit after first successful close()/stop() call, causing both methods to always be invoked.
  • containers/proxy.py: New authenticated CDP proxy (replaces socat); correctly strips the auth token before forwarding and handles both HTTP and WebSocket upgrades; token comparison uses == rather than a constant-time digest.
  • containers/docker-entrypoint.sh: Adds full Chromium lifecycle management (launch, CDP readiness wait, auth-proxy startup, 10-second watchdog loop with configurable restart cap); --token flag removed from tool-server invocation in favour of env-var injection, improving credential hygiene.
  • containers/healthcheck.sh: New healthcheck that probes tool server, Caido, and Chromium CDP on the internal port (no auth needed); correctly avoids the unauthenticated-proxy problem by using the internal port directly.
  • strix/tools/browser/litellm/chat.py: Clean LiteLLM adapter implementing BaseChatModel; covers structured output via response_format, maps all major litellm error types to browser-use exceptions, and handles extended thinking via reasoning_content.
  • strix/tools/browser/litellm/serializer.py: Serializes browser-use message types to litellm-compatible OpenAI format; handles UserMessage, SystemMessage, and AssistantMessage (including tool_calls); unknown message types are silently skipped with no warning (previously flagged).
  • strix/runtime/tool_server.py: Added /cdp/version authenticated endpoint and _check_cdp_health() helper; removed --token CLI arg in favour of env-var; _run_tool still uses asyncio.to_thread which would mishandle async tools if ever called via the HTTP endpoint, but browser_actions (async) has sandbox_execution=False so it is never forwarded to the tool server in normal flow.
  • strix/runtime/docker_runtime.py: Added _browser_cdp_port allocation and recovery; browser_cdp_port is now properly included in SandboxInfo and propagated through to agents; no issues found.
  • containers/Dockerfile: Adds socat to apt package list (kept for compatibility despite socat being replaced by the Python auth proxy), copies healthcheck.sh and proxy.py, and wires up HEALTHCHECK; straightforward and correct.
  • strix/interface/tool_components/browser_renderer.py: New Textual renderer for browser_actions; covers all actions with distinct visual styles and concise summaries; no logic issues found.

Comments Outside Diff (2)

  1. strix/tools/browser/browser_actions.py, lines 243-249

    Agent cleanup always calls both close() and stop()

    The for name in ("close", "stop") loop has no break or return after a successful cleanup, so both methods are always attempted. More importantly, coro := fn() calls fn() as a side effect in the condition expression — if close() is synchronous, it executes immediately but asyncio.iscoroutine returns False, the loop falls through, and stop() is called as well.

    Compare with _close_browser in browser_manager.py (line 198–212), which has an explicit return after a successful close to stop trying further methods.

    The fix is to break (or return) after successfully closing:

```python
for name in ("close", "stop"):
    if fn := getattr(agent, name, None):
        try:
            if asyncio.iscoroutine(coro := fn()):
                await asyncio.wait_for(coro, timeout=5)
        except Exception:  # noqa: BLE001,S110
            pass
        else:
            break  # stop after the first successful cleanup call
```
  2. containers/proxy.py, lines 16-19

    Non-constant-time token comparison

    Using Python's == operator for bearer-token comparison is susceptible to timing attacks — an attacker can distinguish valid-prefix guesses from fully-wrong guesses by measuring response latency. The standard fix is hmac.compare_digest, which is constant-time.

    While the proxy listens only inside the Docker network, defence-in-depth is worthwhile for an auth gate.

Prompt To Fix All With AI
This is a comment left during a code review.
Path: strix/tools/browser/browser_manager.py
Line: 329-330

Comment:
**Local browser leaked on concurrent-launch race**

`_launch_local_browser` has the same race condition that was fixed for `_launch_browser` (lines 307-310), but the fix was not applied here.

Between the first guard check (line 319) and this second guard check (line 329), another coroutine may have already created and stored a session for the same `agent_id`. When that happens, the newly-created `browser = Browser.from_system_chrome(...)` is silently dropped without being closed, leaking its connection to the system Chrome instance.

`_launch_browser` closes the discarded browser via a background task:
```python
# _launch_browser (lines 307–310) — existing fix
if session := _manager.get(agent_id):
    task = asyncio.ensure_future(_close_browser(browser, label="race-discarded"))
    _manager.background_tasks.add(task)
    task.add_done_callback(_manager.background_tasks.discard)
    return session
```

`_launch_local_browser` should do the same:
```python
if session := _manager.get(agent_id):
    task = asyncio.ensure_future(_close_browser(browser, label="race-discarded-local"))
    _manager.background_tasks.add(task)
    task.add_done_callback(_manager.background_tasks.discard)
    return session
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: strix/tools/browser/browser_actions.py
Line: 243-249

Comment:
**Agent cleanup always calls both `close()` and `stop()`**

The `for name in ("close", "stop")` loop has no `break` or `return` after a successful cleanup, so both methods are always attempted. More importantly, `coro := fn()` calls `fn()` as a side effect in the condition expression — if `close()` is synchronous, it executes immediately but `asyncio.iscoroutine` returns `False`, the loop falls through, and `stop()` is called as well.

Compare with `_close_browser` in `browser_manager.py` (line 198–212), which has an explicit `return` after a successful close to stop trying further methods.

The fix is to `break` (or `return`) after successfully closing:
```python
for name in ("close", "stop"):
    if fn := getattr(agent, name, None):
        try:
            if asyncio.iscoroutine(coro := fn()):
                await asyncio.wait_for(coro, timeout=5)
        except Exception:  # noqa: BLE001,S110
            pass
        else:
            break  # stop after the first successful cleanup call
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: containers/proxy.py
Line: 16-19

Comment:
**Non-constant-time token comparison**

Using Python's `==` operator for bearer-token comparison is susceptible to timing attacks — an attacker can distinguish valid-prefix guesses from fully-wrong guesses by measuring response latency. The standard fix is `hmac.compare_digest`, which is constant-time:

```suggestion
import hmac

def _is_authorized(req: web.Request) -> bool:
    auth_header: str = req.headers.get("Authorization", "")
    token_param: str | None = req.query.get("token")
    expected = TOKEN.encode()
    header_ok = auth_header.startswith("Bearer ") and hmac.compare_digest(
        auth_header[len("Bearer "):].encode(), expected
    )
    param_ok = token_param is not None and hmac.compare_digest(token_param.encode(), expected)
    return header_ok or param_ok
```

While the proxy listens only inside the Docker network, defence-in-depth is worthwhile for an auth gate.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: e827141

@shanyu-strix shanyu-strix marked this pull request as draft March 11, 2026 19:09
@shanyu-strix
Author

@greptileai please review

Enhances browser command robustness by gracefully handling cases where
the DOM state is not yet available, returning an informative error
instead of asserting.

Updates the container healthcheck to directly probe Chromium's internal
CDP port, simplifying the check by removing previous indirection.
@shanyu-strix
Author

@greptileai fixed, please check

@shanyu-strix
Author

@greptileai review

@shanyu-strix shanyu-strix marked this pull request as ready for review March 12, 2026 17:49
@shanyu-strix
Author

@greptileai check please

@shanyu-strix
Author

@greptileai please check

pyproject.toml Outdated
Comment on lines +271 to +285
"strix/tools/browser/**/*.py" = [
"ARG002", # Unused method argument (interface methods may not use all args)
"PLR0911", # Too many return statements (dispatchers and complex browser logic)
"PLR0912", # Too many branches (dispatch functions)
"PLR0915", # Too many statements (complex browser task handling)
]
"strix/interface/tool_components/browser_renderer.py" = [
"PLR0911", # Too many return statements (action dispatcher)
"PLR0912", # Too many branches
"PLR0915", # Too many statements
]
"strix/telemetry/tracer.py" = [
"PLR0912", # Too many branches (save_run_data is legitimately complex)
"PLR0915", # Too many statements
]
Member

remove those

pyproject.toml Outdated
Comment on lines +158 to +163
"browser_use",
"browser_use.*",
"cdp_use",
"cdp_use.*",
"aiohttp",
"aiohttp.*",
Member

why duplicated? keep ones with * only

```python
from .python import *  # noqa: F403
from .terminal import *  # noqa: F403

if not DISABLE_BROWSER:
```
Member

revert this edit

Comment on lines +108 to +113
```python
# Propagate agent_id so tools can scope per-agent resources (e.g. browser instances).
# NOTE: This is needed for browser_use tools to work correctly.
agent_id = getattr(agent_state, "agent_id", None) if agent_state else None
if agent_id:
    set_current_agent_id(agent_id)
```

Member

isn't this already done? I think we already provide "state" obj to all tools, which has the agent id

Author

Let me check
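
For background on the mechanism under discussion: per-agent scoping is commonly implemented with a ContextVar, roughly as below. The helper names mirror the diff above, but the implementation shown is an illustrative sketch, not the PR's code:

```python
import contextvars
from typing import Optional

# One ContextVar per process; each asyncio task sees its own value, so
# concurrent agents don't clobber each other's id.
_current_agent_id: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
    "current_agent_id", default=None
)


def set_current_agent_id(agent_id: str) -> None:
    """Record which agent owns the current call chain."""
    _current_agent_id.set(agent_id)


def get_current_agent_id() -> Optional[str]:
    """Tools call this to scope per-agent resources (e.g. one browser per agent)."""
    return _current_agent_id.get()
```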

Comment on lines +397 to +406
```python
def llm_supports_vision() -> bool:
    try:
        import litellm

        from strix.config.config import resolve_llm_config

        model, _, _ = resolve_llm_config()
        return bool(model and litellm.supports_vision(model))
    except Exception:  # noqa: BLE001
        return False
```
Member

Not a good idea to make a function for this...

Comment on lines +88 to +114
```python
def _parse_usage(response: Any) -> ChatInvokeUsage | None:
    """Extract token usage from a litellm response."""
    usage = getattr(response, "usage", None)
    if usage is None:
        return None

    prompt_tokens = getattr(usage, "prompt_tokens", 0) or 0
    completion_tokens = getattr(usage, "completion_tokens", 0) or 0

    prompt_cached = getattr(usage, "cache_read_input_tokens", None)
    cache_creation = getattr(usage, "cache_creation_input_tokens", None)

    if prompt_cached is None:
        details = getattr(usage, "prompt_tokens_details", None)
        if details:
            prompt_cached = getattr(details, "cached_tokens", None)

    return ChatInvokeUsage(
        prompt_tokens=prompt_tokens,
        prompt_cached_tokens=int(prompt_cached) if prompt_cached is not None else None,
        prompt_cache_creation_tokens=int(cache_creation)
        if cache_creation is not None
        else None,
        prompt_image_tokens=None,
        completion_tokens=completion_tokens,
        total_tokens=prompt_tokens + completion_tokens,
    )
```
Member

Do we need to get usage data in case of having browser use internally?
I think we can just get rid of it

Author

Don't we want to be tracking the token spend from the browser agent?

Comment on lines +25 to +26
```python
def _strip_token(url: str) -> str:
    return re.sub(r"[?&]token=[^&]*", "", url).replace("?&", "?").rstrip("?")
```
Member

why are we parsing it manually? why not just do it like the tool server, and use uvicorn/fastapi for the whole file

f"{CONTAINER_CAIDO_PORT}/tcp": self._caido_port,
},
cap_add=["NET_ADMIN", "NET_RAW"],
shm_size="256m",
Member

I think this should be removed
