Skip to content

Conversation

@karlseguin
Copy link
Collaborator

The mix of sync and async HTTP requests requires care to avoid deadlocks. Previously, it was possible for async requests to use up all available HTTP state objects duration a navigation flow (either directly, or via an internal redirect (e.g. click, submit, ...)). This would block the navigation, which, because everything is single thread, would block the I/O loop, resulting in a deadlock.

The correct solution seems to be to remove all synchronous I/O. And I tried to do that, but I ran into a wall with module-loading, which is initiated from V8. V8 says "give me the source for this module", and I don't see a great way to tell it: wait a bit.

So I went back to trying to make this work with the hybrid model, despite last weeks failures to get it to work. I changed two things:

1 - The http client will only directly initiate an async request if there's
at least 2 free state objects available (1 for the request, and leaving 1
free for any synchronous requests)

2 - Delayed navigation retries until there's at least 1 free http state object
available.

Commits from last week did help with this. First, we're now guaranteed to have a single sync-request at a time (previously, we could have had 2). Secondly, the async connection is now async end-to-end (previously, it could have blocked on an empty state pool).

We could probably make this a bit more obviously by reserving 1 state object for synchronous requests. But, since the long term solution is probably having no synchronous requests, I'm happy with anything that lets me move past this issue.

@karlseguin karlseguin force-pushed the fix_internal_navigation_deadlocks branch from 91b1177 to ad92ae2 Compare June 11, 2025 10:09
The mix of sync and async HTTP requests requires care to avoid deadlocks.
Previously, it was possible for async requests to use up all available HTTP
state objects duration a navigation flow (either directly, or via an internal
redirect (e.g. click, submit, ...)). This would block the navigation, which,
because everything is single thread, would block the I/O loop, resulting in a
deadlock.

The correct solution seems to be to remove all synchronous I/O. And I tried to
do that, but I ran into a wall with module-loading, which is initiated from V8.
V8 says "give me the source for this module", and I don't see a great way to
tell it: wait a bit.

So I went back to trying to make this work with the hybrid model, despite last
weeks failures to get it to work. I changed two things:

1 - The http client will only directly initiate an async request if there's
    at least 2 free state objects available (1 for the request, and leaving 1
    free for any synchronous requests)

2 - Delayed navigation retries until there's at least 1 free http state object
    available.

Commits from last week did help with this. First, we're now guaranteed to have
a single sync-request at a time (previously, we could have had 2). Secondly,
the async connection is now async end-to-end (previously, it could have blocked
on an empty state pool).

We could probably make this a bit more obviously by reserving 1 state object
for synchronous requests. But, since the long term solution is probably having
no synchronous requests, I'm happy with anything that lets me move past this
issue.
@karlseguin karlseguin force-pushed the fix_internal_navigation_deadlocks branch from ad92ae2 to 97c769e Compare June 12, 2025 04:35
karlseguin and others added 2 commits June 12, 2025 23:01
In #767 I tried to call loop.run
from within a loop.run (spoiler, it didn't work), in order to make sure aborted
connections were properly cleaned up before starting a new navigation. That
resulted in having loop.run no longer wait for timeouts for fear of having to
wait on a long timeout. The ended up breaking page.wait (used in the fetch
command).

This commit brings back the original behavior where loop.run() waits for all
completions. Which is now safe to do since the nested loop.run() call has
been removed.
@karlseguin karlseguin merged commit e3afa29 into main Jun 13, 2025
11 checks passed
@karlseguin karlseguin deleted the fix_internal_navigation_deadlocks branch June 13, 2025 01:38
@github-actions github-actions bot locked and limited conversation to collaborators Jun 13, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants