
Conversation

@rgarcia rgarcia commented Jan 7, 2026

Summary

Replace per-request tsx subprocess spawning with a persistent Node.js daemon that maintains a warm CDP connection to the browser.

Performance improvement: ~94% faster for simple operations

  • Before: ~750ms per request (tsx startup + module load + CDP connect)
  • After: ~45ms per request (reuses daemon + warm CDP connection)

Changes

  • Add playwright-daemon.ts: persistent daemon that listens on Unix socket, maintains warm CDP connection, uses esbuild for fast TypeScript transformation
  • Update playwright.go: communicate with daemon via Unix socket instead of spawning tsx subprocess; lazy-start daemon on first request
  • Update Dockerfiles: install esbuild, pre-compile daemon to JavaScript during build

Architecture

HTTP Request → Go API → Unix Socket → Playwright Daemon → CDP → Chromium
                              ↑
                    (persistent, warm connection)
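
A minimal sketch of the daemon half of this picture, assuming a hypothetical request shape and handler (the real implementation is `server/runtime/playwright-daemon.ts` and differs in detail):

```ts
// Illustrative sketch only: a Unix-socket daemon with a warm CDP connection and
// esbuild-based TypeScript transformation. The real daemon handles framing,
// timeouts, and errors more carefully.
import net from 'node:net';
import { transform } from 'esbuild';
import { chromium, type Browser } from 'playwright-core';

const SOCKET_PATH = process.env.PLAYWRIGHT_DAEMON_SOCKET || '/tmp/playwright-daemon.sock';
const CDP_ENDPOINT = process.env.CDP_ENDPOINT || 'ws://127.0.0.1:9222';

let browser: Browser | null = null;

// Connect once and reuse; if Chromium restarted, the stale connection is dropped
// and a fresh one is established on the next request.
async function getBrowser(): Promise<Browser> {
  if (!browser || !browser.isConnected()) {
    browser = await chromium.connectOverCDP(CDP_ENDPOINT);
  }
  return browser;
}

net.createServer((socket) => {
  // Framing simplified here: assume one complete JSON request per 'data' event.
  socket.on('data', async (buf) => {
    try {
      const { code } = JSON.parse(buf.toString()); // hypothetical request shape
      const js = await transform(code, { loader: 'ts' }); // strip types, no type-check
      const b = await getBrowser();
      const context = b.contexts()[0] ?? (await b.newContext());
      const page = context.pages()[0] ?? (await context.newPage());
      const AsyncFunction = Object.getPrototypeOf(async () => {}).constructor;
      const result = await AsyncFunction('page', 'context', 'browser', js.code)(page, context, b);
      socket.write(JSON.stringify({ success: true, result }) + '\n');
    } catch (err) {
      socket.write(JSON.stringify({ success: false, error: String(err) }) + '\n');
    }
  });
}).listen(SOCKET_PATH);
```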

Test plan

  • Build headless Docker image
  • Run performance benchmarks comparing before/after
  • Test TypeScript syntax support (type annotations, interfaces, async/await)
  • Verify existing Go tests pass
  • Manual testing in production environment

Note

Introduces a persistent Playwright execution daemon and switches the API to use it over a Unix socket, removing per-request tsx subprocesses.

  • New daemon: Adds server/runtime/playwright-daemon.ts (compiled to /usr/local/lib/playwright-daemon.js) that listens on /tmp/playwright-daemon.sock, keeps a warm CDP connection, transforms TS via esbuild, and reconnects on browser restarts.
  • API changes: api/playwright.go now lazy-starts the daemon, communicates via Unix socket with request/response JSON, and returns structured errors; ApiService gains fields for daemon process control.
  • Dockerfiles: Install esbuild, remove tsx from globals, and compile+embed the daemon in both headful/headless images; update neko base image reference.
  • Tests: Move Playwright execute test into e2e_playwright_test.go and add a recovery test that restarts Chromium and verifies daemon reconnection.

Written by Cursor Bugbot for commit b4be529.

@@ -0,0 +1,266 @@
/**
* Persistent Playwright Executor Daemon

if i want to add tracing functionality as a first class citizen, is this the place to add that given the architecture?

Contributor Author

potentially! Currently there are two main ways users can execute playwright code against a Kernel browser:
(1) spin up a kernel browser, and then use the CDP url from a playwright script. Here we have no control over playwright tracing config since it's user code
(2) use the playwright execute API (https://www.kernel.sh/docs/browsers/playwright-execution), which is what this PR modifies. This is an API that takes TypeScript and executes it against the browser

i'd say most people use (1) right now so if we added a parameter to enable tracing in this endpoint it might not be useful in practice

Replace per-request tsx subprocess with a persistent Node.js daemon that
maintains a warm CDP connection and uses esbuild for fast TypeScript
transformation.

Performance improvement: ~94% faster for simple operations
- Before: ~750ms per request (tsx startup + module load + CDP connect)
- After: ~45ms per request (reuses daemon + warm connection)

The daemon is lazy-started on first request and communicates via Unix
socket using newline-delimited JSON.
@rgarcia rgarcia force-pushed the playwright-daemon-perf branch from 7ce91a2 to eba83f5 on January 7, 2026 at 23:02
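
The wire exchange could look roughly like this (client shown in TypeScript for illustration; the real client is `api/playwright.go`, and the field names are guesses, not the actual schema):

```ts
// Sketch of the newline-delimited JSON exchange over the Unix socket.
import net from 'node:net';
import readline from 'node:readline';

async function execute(code: string, timeoutMs: number): Promise<unknown> {
  const socket = net.createConnection('/tmp/playwright-daemon.sock');
  socket.write(JSON.stringify({ code, timeoutMs }) + '\n'); // one request per line
  const rl = readline.createInterface({ input: socket });
  for await (const line of rl) {
    socket.end();
    return JSON.parse(line); // one response per line, e.g. { success, result, error }
  }
  throw new Error('daemon closed the connection without responding');
}
```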

const SOCKET_PATH = process.env.PLAYWRIGHT_DAEMON_SOCKET || '/tmp/playwright-daemon.sock';
const CDP_ENDPOINT = process.env.CDP_ENDPOINT || 'ws://127.0.0.1:9222';
const USE_PATCHRIGHT = process.env.PLAYWRIGHT_ENGINE !== 'playwright-core';

Default Playwright engine changed from playwright-core to patchright

High Severity

The PLAYWRIGHT_ENGINE environment variable logic was inverted. The old executor checked === 'patchright' to use patchright, defaulting to playwright-core when unset. The new daemon checks !== 'playwright-core', which means when PLAYWRIGHT_ENGINE is undefined (the common case), patchright is now used instead of playwright-core. This silently changes the default browser automation library for all existing deployments.
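
Spelled out, the inversion the bot is describing (variable names here are illustrative):

```ts
// Old executor: opt in to patchright explicitly; playwright-core is the default.
const usePatchrightBefore = process.env.PLAYWRIGHT_ENGINE === 'patchright';

// New daemon: patchright unless playwright-core is explicitly requested.
const usePatchrightAfter = process.env.PLAYWRIGHT_ENGINE !== 'playwright-core';

// With PLAYWRIGHT_ENGINE unset: before is false (playwright-core), after is true (patchright).
```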


Contributor Author

intentional

const result = await Promise.race([
userFunction(page, context, browserInstance),
timeoutPromise,
]);

Timeout timer causes unhandled rejection and memory leak

Medium Severity

The timeoutPromise uses setTimeout without storing the timer reference. When the user function completes before the timeout, Promise.race returns but the timer continues running. When it fires, it rejects a promise that nothing is listening to, causing an unhandled rejection. In Node.js 22, this generates warnings in logs and can crash the daemon depending on configuration. The timer reference needs to be stored and cleared via clearTimeout after the race completes.
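
The usual shape of the fix, mirroring the snippet above (a sketch, not the committed code):

```ts
// userFunction, page, context, browserInstance, and timeoutMs come from the
// surrounding daemon code; only the timer handling changes.
let timer: NodeJS.Timeout | undefined;
const timeoutPromise = new Promise<never>((_, reject) => {
  timer = setTimeout(() => reject(new Error('execution timed out')), timeoutMs);
});

try {
  const result = await Promise.race([
    userFunction(page, context, browserInstance),
    timeoutPromise,
  ]);
  // ... use result
} finally {
  clearTimeout(timer); // the rejection can no longer fire unobserved
}
```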


const result = await Promise.race([
userFunction(page, context, browserInstance),
timeoutPromise,
]);

Timed-out user code continues running and corrupts page state

High Severity

When a timeout occurs, Promise.race returns immediately but the user function continues executing with access to the shared page object. Since the daemon reuses the same page for subsequent requests (lines 148-149 prefer existing pages), a timed-out request's code can still be running when the next request starts, causing race conditions and unpredictable behavior. The old subprocess-based executor was killed by the OS on timeout via exec.CommandContext, providing clean isolation. The daemon lacks any mechanism to abort user code when timeout fires.


Contributor Author

Acknowledged. This is an inherent tradeoff of the persistent daemon architecture. A few mitigating factors:

  1. The Go API serializes all requests with a mutex, so concurrent execution isn't possible
  2. Timed-out code is typically blocked on I/O (awaiting network), not CPU-bound, so it won't starve subsequent requests
  3. The practical risk of page state corruption is low - most timeouts occur during navigation/network issues where the page is already in a bad state

True isolation would require per-request subprocesses (the old model) or worker threads with their own CDP connections, both of which sacrifice the ~94% latency improvement this PR provides.

We accept this tradeoff for the performance gains. If it becomes a problem in practice, we can revisit with a "fresh page after timeout" mitigation.
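
If that mitigation ever becomes necessary, it could be as small as this sketch (function name is hypothetical):

```ts
import type { BrowserContext, Page } from 'playwright-core';

// "Fresh page after timeout": discard the page the timed-out code may still be
// touching and hand the next request a brand-new one.
async function resetPageAfterTimeout(context: BrowserContext, stalePage: Page): Promise<Page> {
  await stalePage.close().catch(() => {}); // best effort; the stale code loses its target
  return context.newPage();                // the next request starts from a clean page
}
```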

Contributor

should we just manage the daemon via supervisord instead? that way we can also automatically resume from failures, etc. wdyt?

Contributor Author

since CDP connections present some bot detection risk i figured we should only run the daemon if the user is using this playwright endpoint, so that's why it's not running by default

Contributor

sg! exciting stuff!


// If we can't parse the output, return a generic error
success := false
errorMsg := fmt.Sprintf("execution failed: %v", err)
Contributor

might be worth distinguishing daemon vs code execution errors to make debugging easier
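
One way to support that would be for the daemon to tag its responses so the Go side can propagate the distinction (sketch; the errorType field is hypothetical, not part of the current protocol):

```ts
// Hypothetical response shape that separates daemon failures from user-code failures.
type DaemonResponse =
  | { success: true; result: unknown }
  | { success: false; errorType: 'daemon' | 'execution'; error: string };

function executionError(err: unknown): DaemonResponse {
  // The user's script threw: surface it as an execution error.
  return { success: false, errorType: 'execution', error: String(err) };
}

function daemonError(err: unknown): DaemonResponse {
  // The daemon itself failed (socket, CDP connection, esbuild transform, ...).
  return { success: false, errorType: 'daemon', error: String(err) };
}
```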

Contributor

@Sayan- Sayan- left a comment

sweet! qq on daemon management, lgtm otherwise

@rgarcia rgarcia merged commit e30df23 into main Jan 8, 2026
5 checks passed
@rgarcia rgarcia deleted the playwright-daemon-perf branch January 8, 2026 00:02