Skip to content

Conversation

@karlseguin
Copy link
Collaborator

@karlseguin karlseguin commented Aug 25, 2025

Previously, the IO loop was doing three things:
1 - Managing timeouts (either from scripts or for our own needs)
2 - Handling browser IO events (page/script/xhr)
3 - Handling CDP events (accept, read, write, timeout)

With the libcurl merge, 1 was moved to an in-process scheduler and 2 was moved to libcurl's own event loop. That means the entire loop code, including the dependency on tigerbeetle-io existed for handling a single TCP client. Not only is that a lot of code, there was also friction between the two loops (the libcurl one and our IO loop), which would result in latency - while one loop is waiting for the events, any events on the other loop go un-processed.

This PR removes our IO loop. To accomplish this:

1 - The main accept loop is blocking. This is simpler and works perfectly well, given we only allow 1 active connection.
2 - The client socket is passed to libcurl - yes, libcurl's loop can take arbitrary FDs and poll them along with its own.

In addition to having one less dependency, the CDP code is quite a bit simpler, especially around shutdowns and writes. This also removes some of the latency caused by the friction between page process and CDP processing. Specifically, when CDP now blocks for input, http page events (script loading, xhr, ...) will still be processed.

There's still friction. For one, the reverse isn't true: when the page is waiting for events, CDP events aren't going to be processed. But the page.wait already have some sensitivity to this (e.g. the page.request_intercepted flag). Also, when CDP waits, while we will process network events, page timeouts are still not processed. Because of both these remaining issues, we still need to jump between the two loops - but being able to block on CDP (even for a short time) WITHOUT stopping the page's network I/O, should reduce some latency.

Previously, the IO loop was doing three things:
1 - Managing timeouts (either from scripts or for our own needs)
2 - Handling browser IO events (page/script/xhr)
3 - Handling CDP events (accept, read, write, timeout)

With the libcurl merge, 1 was moved to an in-process scheduler and 2 was moved
to libcurl's own event loop. That means the entire loop code, including
the dependency on tigerbeetle-io existed for handling a single TCP client.
Not only is that a lot of code, there was also friction between the two loops
(the libcurl one and our IO loop), which would result in latency - while one
loop is waiting for the events, any events on the other loop go un-processed.

This PR removes our IO loop. To accomplish this:

1 - The main accept loop is blocking. This is simpler and works perfectly well,
given we only allow 1 active connection.
2 - The client socket is passed to libcurl - yes, libcurl's loop can take
arbitrary FDs and poll them along with its own.

In addition to having one less dependency, the CDP code is quite a bit simpler,
especially around shutdowns and writes. This also removes _some_ of the latency
caused by the friction between page process and CDP processing. Specifically,
when CDP now blocks for input, http page events (script loading, xhr, ...) will
still be processed.

There's still friction. For one, the reverse isn't true: when the page is
waiting for events, CDP events aren't going to be processed. But the page.wait
already have some sensitivity to this (e.g. the page.request_intercepted flag).
Also, when CDP waits, while we will process network events, page timeouts are
still not processed. Because of both these remaining issues, we still need to
jump between the two loops - but being able to block on CDP (even for a short
time) WITHOUT stopping the page's network I/O, should reduce some latency.
Copy link
Contributor

@sjorsdonkers sjorsdonkers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to see how low impact this is on the code.
A great bonus payoff from the libcurl switch.

Copy link
Member

@krichprollsch krichprollsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@krichprollsch krichprollsch merged commit 041e014 into main Aug 26, 2025
16 of 18 checks passed
@krichprollsch krichprollsch deleted the remove_loop branch August 26, 2025 16:17
@github-actions github-actions bot locked and limited conversation to collaborators Aug 26, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants