Add emscripten_poll_with_callback and unify fd readiness on one poll wait-queue#27181

Open
guybedford wants to merge 2 commits into
emscripten-core:mainfrom
guybedford:async-readiness

Conversation

@guybedford

Collaborator

Adds emscripten_poll_with_callback, a non-blocking single-fd poll:

typedef void (*em_poll_callback)(int fd, int revents);
int emscripten_poll_with_callback(int fd, int events, int timeout, em_poll_callback callback);

cb(fd, revents) fires when the fd is ready or the timeout elapses; revents is by value. It does not suspend the caller, so it works without ASYNCIFY/JSPI. Returns -EBADF for a bad fd and -EPERM if the descriptor type can't deliver readiness callbacks (checked before arming, even when ready, like epoll_ctl); closing an fd wakes its waiters with POLLNVAL. events is a bitmask — register several conditions, fire once on whichever is ready first, re-arm to continue.

It is meant as an integration point for async runtimes and event loops that need to await I/O readiness without a blocking call or a stack switch — e.g. waiting for a socket to become readable/writable and dispatching when ready. In ASYNCIFY/JSPI builds it complements blocking poll()/select(); in plain synchronous builds it is the way to wait on an fd without spinning a poll loop. Unlike the global socket event callbacks, registration is per-fd and opt-in: you are only woken for the (fd, events) you armed, once.
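To make the contract described above concrete, here is a minimal JavaScript model of it, not the actual implementation. The names `pollWithCallback`, `table`, `wake`, `setReady`, and `close` are invented for this sketch. It omits the timeout, assumes a success return value of 0, and assumes an already-ready fd fires its callback immediately; the description states none of these. The flag and errno values are the usual Linux/musl ones.

```javascript
// Toy stand-in for the described contract; not the real implementation.
const POLLIN = 0x001, POLLOUT = 0x004, POLLNVAL = 0x020; // Linux/musl values
const EPERM = 1, EBADF = 9;

const table = new Map(); // fd -> { canNotify, ready, waiters: [] } (hypothetical layout)

function pollWithCallback(fd, events, callback) {
  const f = table.get(fd);
  if (!f) return -EBADF;
  if (!f.canNotify) return -EPERM; // gate runs before arming, even if ready
  f.waiters.push({ events, callback });
  wake(fd, f.ready); // assumption: fire right away if already ready
  return 0;          // assumption: 0 means "armed"
}

// Fire and remove every waiter whose mask intersects `flags` (one-shot).
function wake(fd, flags) {
  const f = table.get(fd);
  for (const w of f.waiters.slice()) {
    const revents = flags & (w.events | POLLNVAL);
    if (!revents) continue;
    f.waiters.splice(f.waiters.indexOf(w), 1);
    w.callback(fd, revents);
  }
}

function setReady(fd, flags) {
  const f = table.get(fd);
  f.ready = flags;
  wake(fd, flags);
}

function close(fd) {
  wake(fd, POLLNVAL); // closing wakes pending waiters with POLLNVAL
  table.delete(fd);
}

table.set(3, { canNotify: true, ready: 0, waiters: [] });
table.set(4, { canNotify: false, ready: POLLIN, waiters: [] });

const log = [];
const onReady = (fd, revents) => {
  log.push(revents);
  if (revents & POLLNVAL) return; // fd is gone: do not re-arm
  table.get(fd).ready = 0;        // pretend the handler consumed the readiness
  pollWithCallback(fd, POLLIN | POLLOUT, onReady); // re-arm to continue
};

const rcBad = pollWithCallback(99, POLLIN, onReady); // unknown fd
const rcPerm = pollWithCallback(4, POLLIN, onReady); // type cannot notify
pollWithCallback(3, POLLIN | POLLOUT, onReady);
setReady(3, POLLOUT);
setReady(3, POLLIN);
close(3);
console.log(rcBad, rcPerm, log); // -9 -1 [ 4, 1, 32 ]
```

The handler re-arms itself after each wake, which is the "fire once, re-arm to continue" pattern, and it stops once it sees POLLNVAL.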

To support it, fd readiness now uses a single wait-queue on the file node, replacing three separate mechanisms (the socket event callbacks, the pipe readable handlers, and the blocking-poll notifier):

  • stream_ops.poll(stream) is now pure readiness derivation — it no longer registers notifications.
  • producers wake waiters via $notifyPollCallback(node, flags): SOCKFS.emit bridges socket events, PIPEFS writes wake the read end.
  • consumers register via $addPollCallback(node, cb): the async __syscall_poll registers one waiter per fd and re-derives the set on any wake, and emscripten_poll_with_callback registers a single-fd waiter.
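The three roles in the list above can be sketched as a small standalone model. This is an illustration under assumptions, not the code in the PR: `makeNode`, `waitOnce`, and the removal function returned by `addPollCallback` are hypothetical, and only the `pollCallbacks` array on the node mirrors what the diff snippets show.

```javascript
// Illustrative model only; not the Emscripten library code.
const POLLIN = 0x001, POLLOUT = 0x004; // Linux/musl values

// Producers call this when a node's readiness may have changed.
function notifyPollCallback(node, flags) {
  // Copy first: a woken waiter may remove itself while we iterate.
  for (const cb of node?.pollCallbacks?.slice() ?? []) cb(flags);
}

// Consumers register a waiter. Returning an unregister function is an
// assumption of this sketch.
function addPollCallback(node, cb) {
  if (!node.pollCallbacks) node.pollCallbacks = [];
  node.pollCallbacks.push(cb);
  return () => {
    const i = node.pollCallbacks.indexOf(cb);
    if (i >= 0) node.pollCallbacks.splice(i, 1);
  };
}

// A toy pipe-like node: poll() only derives readiness from current state
// and never registers anything.
function makeNode() {
  const node = { buffer: [] };
  node.poll = () => (node.buffer.length ? POLLIN : 0) | POLLOUT;
  node.write = (byte) => {
    node.buffer.push(byte);
    notifyPollCallback(node, POLLIN); // producer side
  };
  return node;
}

// One-shot consumer: call `done` once when any requested event is ready.
function waitOnce(node, events, done) {
  const ready = node.poll() & events;
  if (ready) return done(ready);
  const remove = addPollCallback(node, () => {
    const revents = node.poll() & events; // re-derive on every wake
    if (revents) {
      remove();
      done(revents);
    }
  });
}

const node = makeNode();
const seen = [];
waitOnce(node, POLLIN, (revents) => seen.push(revents));
node.write(42); // wakes the waiter
node.write(43); // waiter already removed, so no second callback
console.log(seen); // [ 1 ]
```

Keeping `poll()` free of side effects is what lets both a blocking poll and a single-fd callback share the same queue: each waiter simply re-derives readiness when it is woken.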

A consequence of routing sockets through the same seam: blocking poll()/select() on a socket is now woken by incoming data. Previously sock_ops.poll() ignored the notifier, so a blocking poll() on a socket could only time out.

Tests:

  • test_poll_callback — callback readiness on a socket, the -EPERM/-EBADF capability gate, POLLNVAL on close.
  • test_poll_socket_blocking — a blocking poll() woken by a send that arrives only after it has blocked (sender thread under pthreads, timer under JSPI). It hangs on the pre-unification machinery and passes after, so it actually exercises the wake path.
  • Core poll/ppoll/select/pipe blocking suites, including PROXY_TO_PTHREAD.

Size/perf: wasm unchanged; +~70 B JS on socket builds, none otherwise. No hot-path change beyond a short-circuiting notify on socket events / pipe writes.

…_callback

Adds emscripten_poll_with_callback(fd, events, timeout, cb): a non-blocking
single-fd poll that invokes cb(fd, revents) when the fd is ready or the timeout
elapses. revents is passed by value. It does not suspend the caller, so it works
without ASYNCIFY/JSPI. Returns -EBADF for a bad fd and -EPERM if the descriptor
type can't deliver readiness callbacks (checked before arming, even when ready);
closing an fd wakes its waiters with POLLNVAL.

It is meant as an integration point for async runtimes and event loops that need
to await I/O readiness without a blocking call or a stack switch: e.g. waiting
for a socket to become readable/writable, or for an async-completion fd, and
dispatching when ready. In ASYNCIFY/JSPI builds it complements blocking
poll()/select(); in plain synchronous builds it is the only way to wait on an fd
without spinning a poll loop.

To support it, fd readiness now uses a single wait-queue on the file node,
replacing three separate mechanisms (the socket event callbacks, the pipe
readable handlers, and the blocking-poll notifier):

- stream_ops.poll(stream) is now pure derivation; it no longer registers.
- producers wake waiters via $notifyPollCallback(node, flags): SOCKFS.emit
  bridges socket events, PIPEFS writes wake the read end.
- consumers register via $addPollCallback(node, cb): the async __syscall_poll
  registers one waiter per fd (re-deriving the set on wake), and
  emscripten_poll_with_callback registers a single-fd waiter.

Sockets now feed the same seam, so blocking poll()/select() on a socket is woken
by incoming data; previously sock_ops.poll() ignored the notifier.

Tests: test_poll_callback (callback readiness, -EPERM/-EBADF gate, POLLNVAL
close), test_poll_socket_blocking (blocking poll() woken by a delayed send;
hangs before this change, passes after), and the core poll/ppoll/select/pipe
blocking suites, including PROXY_TO_PTHREAD variants.

@sbc100 sbc100 left a comment

Collaborator

In general, I like the idea of exposing something like emscripten_poll_with_callback to userspace.

However, there are several ways to expose it:

  1. As a callback-based API.
  2. As a promise-based API that returns a promise_t to C/C++.
  3. As a promise-based API that uses JSPI/asyncify.

Maybe more?

I think all the above use cases are valid, and I think it would be nice to expose them all, and ideally we would have a nice way to write just one of these and derive the rest of them.

We have been trying to come up with a unified scheme for how to go about this for a while, but so far it's been kind of ad hoc. This might be a good opportunity to define and use a policy for creating new async-any-way-you-like functions.

I think the ideal solution is that the JS library author writes a single async JS function in the most idiomatic way (i.e. async foo()) and then the native C/C++ developer should be able to automatically call that function in any of the above ways.
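As a rough illustration of the "write one, derive the rest" idea, the sketch below takes a single async JavaScript function and derives a callback-style wrapper from it, while the promise-style call is just the function itself. `waitReadable`, `toCallbackStyle`, and the `source` object are hypothetical names for this example. The JSPI/asyncify variant is not shown because it depends on the toolchain rather than on plain JavaScript.

```javascript
// Written once, in the idiomatic async style (hypothetical example function).
async function waitReadable(source) {
  const revents = await source.nextReadable();
  return revents;
}

// Hypothetical adapter: derive a Node-style callback entry point.
function toCallbackStyle(asyncFn) {
  return (...args) => {
    const callback = args.pop(); // last argument is the callback
    asyncFn(...args).then(
      (value) => callback(null, value),
      (err) => callback(err),
    );
  };
}

const waitReadableCb = toCallbackStyle(waitReadable);

// Stand-in readiness source that resolves with a revents-like value.
const source = { nextReadable: () => Promise.resolve(1) };

const results = [];
// Promise style: call the async function directly.
waitReadable(source).then((v) => results.push(['promise', v]));
// Callback style: same function, derived wrapper.
waitReadableCb(source, (err, v) => results.push(['callback', v]));
```

Both calls deliver their result asynchronously, so `results` is still empty right after the last line and is filled in once the promises settle.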

Having said all of that, it seems like this PR is really in two parts:

  1. An internal refactoring of how poll works.
  2. Exposing a new one-off poll function to userspace.

Would it be worth landing (1) while we figure out the best shape for (2) asynchronously?

Comment thread src/lib/libsockfs.js
'close': {{{ cDefs.POLLIN }}} | {{{ cDefs.POLLHUP }}},
'error': {{{ cDefs.POLLERR }}},
}[event];
if (flags) notifyPollCallback(FS.getStream(fd)?.node, flags);

Can flags ever be undefined here? i.e. are there events that are not listed in the flags mapping above?

Maybe assert(flags, "unhandled event .. ") instead?

Comment thread src/lib/libsockfs.js
var sock = stream.node.sock;
// Wake any pending poll-callback waiters: the fd is going away (POLLNVAL),
// so they complete and release their keepalive rather than hang.
notifyPollCallback(stream.node, {{{ cDefs.POLLNVAL }}});

How is this different from the notifyPollCallback for close in the emit method above?

Comment thread src/lib/libsyscall.js
} else {
flags = {{{ cDefs.POLLIN | cDefs.POLLOUT }}};
}
flags = stream.stream_ops.poll ? stream.stream_ops.poll(stream) : ({{{ cDefs.POLLIN | cDefs.POLLOUT }}});

Maybe expand this ternary for readability (closure compiler et al can always restore I think?)

Comment thread src/lib/libsyscall.js
// Copy first: a woken waiter removes itself as it completes.
node?.pollCallbacks?.slice().forEach((cb) => cb(flags));
},
$addPollCallback: (node, cb) => {

Mark these new helpers as __internal ?

Comment thread src/lib/libsyscall.js
// are entirely here.
$notifyPollCallback: (node, flags) => {
// Copy first: a woken waiter removes itself as it completes.
node?.pollCallbacks?.slice().forEach((cb) => cb(flags));

Use for...of over forEach? IIRC our convention is to prefer for...of loops.

Comment thread src/lib/libpipefs.js
#if PTHREADS || ASYNCIFY
pipe.notifyReadableHandlers();
#endif
notifyPollCallback(pipe.readNode, {{{ cDefs.POLLRDNORM }}} | {{{ cDefs.POLLIN }}});

I wonder if we can come up with a better name? Something like nodeStateChanged ? Or notifyNodeListeners?

Also, I wonder if this should be a method on the FS global? I guess it's only needed for PIPEFS and SOCKFS, so maybe not a great idea?

Comment thread src/lib/libsyscall.js
// readiness through $notifyPollCallback can be waited on. `pollAsync` is a
// flag, or a predicate when an instance (e.g. a listening socket) can't.
var pollAsync = stream.stream_ops.pollAsync;
if (typeof pollAsync == 'function') pollAsync = pollAsync(stream);

Can these two lines be written as just var pollAsync = stream.stream_ops.pollAsync?.(stream)?
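Whether the one-line form behaves the same depends on what `pollAsync` can hold. The code comment in the snippet describes it as a flag or a predicate, and an optional call (`?.()`) only short-circuits on `null`/`undefined`, so a plain `true` would throw. A small sketch with made-up stream objects shows the difference:

```javascript
// Two-line form from the snippet: accepts either a flag or a predicate.
function resolveExplicit(stream) {
  let pollAsync = stream.stream_ops.pollAsync;
  if (typeof pollAsync == 'function') pollAsync = pollAsync(stream);
  return pollAsync;
}

// Suggested one-line form: optional call.
function resolveOptionalCall(stream) {
  return stream.stream_ops.pollAsync?.(stream);
}

// Hypothetical streams covering the three cases.
const flagStream = { stream_ops: { pollAsync: true } };
const predicateStream = {
  listening: true,
  stream_ops: { pollAsync: (s) => !s.listening },
};
const plainStream = { stream_ops: {} };

console.log(resolveExplicit(flagStream));          // true
console.log(resolveExplicit(predicateStream));     // false
console.log(resolveOptionalCall(predicateStream)); // false
console.log(resolveOptionalCall(plainStream));     // undefined

// A non-nullish, non-callable value is still called, which throws.
let threw = false;
try {
  resolveOptionalCall(flagStream);
} catch (e) {
  threw = e instanceof TypeError;
}
console.log(threw); // true
```

So the shorter form is a drop-in replacement only if every `pollAsync` is either a function or absent.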

@sbc100 sbc100 left a comment

Just to be clear, I'm excited about the general direction here.

Regarding the specific shape of the callback-based C APIs, I'm not sure about the timeout part. In general, I think the callback-based APIs we have today do not have timeouts but rather some kind of cancellation mechanism. It should be easy enough to then build your own timeout in userspace.
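One way to read this suggestion: if registration returned a cancel handle instead of taking a timeout, a timeout could be layered on in user code. The sketch below models that in plain JavaScript. `arm`, `armWithTimeout`, and the bare `queue` array are hypothetical, and reporting `0` on timeout is an assumption borrowed from how poll() reports "no events".

```javascript
// Hypothetical cancellable registration: arm() returns a cancel function.
function arm(queue, callback) {
  const waiter = (flags) => {
    cancel(); // one-shot: unregister before delivering
    callback(flags);
  };
  const cancel = () => {
    const i = queue.indexOf(waiter);
    if (i >= 0) queue.splice(i, 1);
  };
  queue.push(waiter);
  return cancel;
}

// Timeout built in user code on top of cancellation.
function armWithTimeout(queue, ms, callback) {
  const cancel = arm(queue, (flags) => {
    clearTimeout(timer);
    callback(flags);
  });
  const timer = setTimeout(() => {
    cancel();
    callback(0); // assumption: 0 means "timed out, nothing ready"
  }, ms);
  return () => {
    clearTimeout(timer);
    cancel();
  };
}

// Stand-in producer that wakes every registered waiter.
const queue = [];
const notify = (flags) => {
  for (const w of queue.slice()) w(flags);
};

const events = [];
const cancelFirst = armWithTimeout(queue, 1000, (flags) => events.push(flags));
cancelFirst(); // cancelled before anything became ready
notify(1);     // no waiter left, nothing fires

armWithTimeout(queue, 1000, (flags) => events.push(flags));
notify(1);     // fires once and clears its own timer
console.log(events); // [ 1 ]
```

With this shape the core API stays timeout-free, and callers that want a deadline pay for the timer only when they ask for one.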
