Make CDP server more authoritative with respect to IDs #451

karlseguin · 2025-02-28T10:58:23Z

The TL;DR is that this commit enforces the use of correct IDs, introduces a BrowserContext, and adds some CDP tests.

These are the ids we need to be aware of when talking about CDP:

id
browserContextId
targetId
sessionId
loaderId
frameId

The id is the only one that should originate from the driver. It's attached to most messages and it's how we maintain a request -> response flow: when the server responds to a specific message, it echoes back the id from the request message (as opposed to out-of-band events sent from the server, which won't have an id). When I say "id" from this point forward, I mean every id except for this req->res id.

Every other id is created by the browser.

Prior to this commit, we didn't really check incoming ids from the driver. If the driver said "attachToTarget" and included a targetId, we just assumed that this was the current targetId. This was aided by the fact that we only used hard-coded ids. If we only "create" a frameId of "FRAME-1", then it's tempting to think the driver will only ever send a frameId of "FRAME-1".

The issue with this approach is that if the browser and driver fall out of sync and there's only ever 1 browserContextId, 1 sessionId and 1 frameId, it's easy to imagine cases where we act on the wrong thing.

Imagine this flow:

Driver asks for a new BrowserContext
Browser says OK, your browserContextId is 1
Driver, for whatever reason, says close browserContextId 2
Browser says, OK, but it doesn't check the id and just closes the only BrowserContext it knows about (which is 1)

By both re-using the same hard-coded ids, and not verifying that the ids sent from the client correspond to the correct ids, any issues are going to be hard to debug.
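To make the flow above concrete, here is a minimal sketch of the kind of check being described. The `Browser` type, the `disposeBrowserContext` function, and `error.UnknownBrowserContextId` are hypothetical names chosen for illustration; they are not taken from this PR's code.

```zig
const std = @import("std");

// Hypothetical browser state holding at most one BrowserContext.
const Browser = struct {
    // id of the live BrowserContext, or null if none exists
    browser_context_id: ?[]const u8 = null,

    // Closes the context only if the driver-supplied id matches the id
    // the browser created; any other id is reported as an error.
    fn disposeBrowserContext(self: *Browser, id: []const u8) !void {
        const current = self.browser_context_id orelse return error.UnknownBrowserContextId;
        if (!std.mem.eql(u8, current, id)) return error.UnknownBrowserContextId;
        self.browser_context_id = null;
    }
};

test "a mismatched browserContextId is rejected" {
    var browser = Browser{ .browser_context_id = "1" };

    // Driver asks to close "2": the browser refuses and keeps "1" alive.
    try std.testing.expectError(error.UnknownBrowserContextId, browser.disposeBrowserContext("2"));
    try std.testing.expect(browser.browser_context_id != null);

    // Driver asks to close "1": the ids match, so the context is closed.
    try browser.disposeBrowserContext("1");
    try std.testing.expect(browser.browser_context_id == null);
}
```

With a check like this, an out-of-sync driver gets an explicit error instead of silently closing the only context the browser knows about.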

Currently LOADER_ID and FRAME_ID are still hard-coded. Baby steps.

…utting down

Previously, we could have multiple in-flight messages from the server to a single client. This isn't safe and can lead to message interleaving. While write / send are atomic, they are only atomic for the N bytes which they write, which may not be the entire buffer. Consider this writeAll function:

```
pub fn writeAll(socket: socket_t, bytes: []const u8) !void {
    var index: usize = 0;
    while (index < bytes.len) {
        index += try posix.write(socket, bytes[index..]);
    }
}
```

If we're trying to send "abc123", this could take anywhere from 1 to 6 calls to posix.write (it would take 6 calls, for example, if every call to posix.write only wrote a single byte). Now if you're trying to write other data to this same socket at the same time, messages _will_ get interleaved.

In order for this to work, the client now has a send_queue (doubly linked list). When one message is sent, it sends the next.

In addition to the above change, the Client is now self-contained with respect to its lifetime. This is necessary so that completions which come in AFTER our concept of its lifetime ends, can still be processed. I think all types that receive completions need to follow this model. This relies on the fact that kqueue (which I know for a fact) and io_uring (which people seem to imply) handle socket shutdown properly.

It's still a bit messy because of timeout and not wanting to wait until timeout to accept new connections, but needing to wait until timeout to cleanup the client.

The self-contained nature of Client makes it difficult to test as a generic. I removed Client(T). Tests now use real sockets. Some tests had to be removed because they're too difficult to test over a real connection :(
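The send-queue idea in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `FakeSocket` is a hypothetical stand-in for a socket that accepts only a few bytes per write, and `SendQueue` uses a small fixed array rather than the doubly linked list the commit mentions. The point it shows is that only the message at the head of the queue is ever being written, so partial writes cannot interleave two messages.

```zig
const std = @import("std");

// Hypothetical stand-in for a socket: each write accepts at most
// `max_per_write` bytes, mimicking a partial posix.write.
// The 64-byte buffer is only large enough for this illustration.
const FakeSocket = struct {
    buf: [64]u8 = undefined,
    len: usize = 0,
    max_per_write: usize,

    fn write(self: *FakeSocket, bytes: []const u8) usize {
        const n = @min(bytes.len, self.max_per_write);
        @memcpy(self.buf[self.len..][0..n], bytes[0..n]);
        self.len += n;
        return n;
    }

    fn written(self: *const FakeSocket) []const u8 {
        return self.buf[0..self.len];
    }
};

// Hypothetical fixed-capacity queue of pending messages.
const SendQueue = struct {
    items: [8][]const u8 = undefined,
    head: usize = 0, // index of the message currently being written
    tail: usize = 0, // index one past the last queued message
    offset: usize = 0, // bytes of items[head] already written

    fn push(self: *SendQueue, msg: []const u8) !void {
        if (self.tail == self.items.len) return error.QueueFull;
        self.items[self.tail] = msg;
        self.tail += 1;
    }

    // Performs one write attempt for the head message only.
    // Returns false once the queue has been fully drained.
    fn step(self: *SendQueue, socket: *FakeSocket) bool {
        if (self.head == self.tail) return false;
        const msg = self.items[self.head];
        self.offset += socket.write(msg[self.offset..]);
        if (self.offset == msg.len) {
            // Head message fully sent: move on to the next one.
            self.head += 1;
            self.offset = 0;
        }
        return true;
    }
};

test "queued messages are written whole, in order" {
    var socket = FakeSocket{ .max_per_write = 1 };
    var queue = SendQueue{};
    try queue.push("abc");
    try queue.push("123");
    while (queue.step(&socket)) {}
    try std.testing.expectEqualStrings("abc123", socket.written());
}
```

Even when every write moves a single byte, the second message does not start until the first has been written in full.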

krichprollsch · 2025-03-10T13:37:17Z

closed in favor of #459 to clean git history.

karlseguin and others added 14 commits February 28, 2025 18:40

send attach events before result

e3858b3

allow Target.getTargetInfo to be called without parameters

7ed6925

Don't send CDP result when message is forward to inspector.

ca51b41

Rely on inspector to send the result, otherwise we'll send 2 responses to the same message (one ourselves and one from the inspector), which Playwright does not like.

upgrade vendor/zig-js-runtime

c545610

upgrade vendor/zig-async-io

47961c1

Add Set-Cookie parsing

48ecfb6

add test for Storage shed, use map.getOrPut

0a02ae5

add cookie jar

b091c01

Cookie with SameSite=None is only valid when Secure

467442c

fix typo, improve comment, add 1 test case

353964b

Update README.md

5dba5e8

A really important visual change in the readme :)

Merge branch 'main' into browser_context

a5cf14e

krichprollsch approved these changes Mar 10, 2025

krichprollsch mentioned this pull request Mar 10, 2025

Make CDP server more authoritative with respect to IDs #459

Merged

krichprollsch closed this Mar 10, 2025

github-actions bot locked and limited conversation to collaborators Mar 10, 2025

karlseguin deleted the browser_context branch March 11, 2025 02:18
