You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
dont drop user fd until node reads a synthetic EOF (#93)
turns out #51 reduced the race rate
but did not eliminate it completely. we still see the data race bug very
rarely and it is exacerbated on high-numbers of repeats in the tests
and/or high nodejs event loop utilization
my old theory was:
- the user fd is O_NONBLOCK clear so any writes there should block until
its made its way into the 4kb intermediary buffer
- as a result, when the .wait returns, we know that the program output
has at least made it fully into the the user fd input buffer
- because we poll until `controller_inq == 0 && controller_outq == 0 &&
user_inq == 0 && user_outq == 0`, in theory we should never call the js
side exit code until theres truly nothing left in the pipe (i.e. the
readstream on the nodejs side has it in its own buffer)
HOWEVER, with some logging i saw a few cases where in fact
`controller_inq == 0 && controller_outq == 0 && user_inq == 0 &&
user_outq == 0`, the program had exited, yet the nodejs side still
missed some data
what i think is _actually_ happening is that
1. one of libuv's io threads reads the data from the controller side and
queues a data event on the stream
2. polling exits as now theres no more data in the queues we immediately
`drop(user_fd);` which [sets TTY_OTHER_CLOSED
synchronously](https://github.com/torvalds/linux/blob/4ff71af020ae59ae2d83b174646fc2ad9fcd4dc4/drivers/tty/pty.c#L66)
and we get an error event queued on the stream
3. if the nodejs event loop happens to read error before data, we emit
'end' and mark the data as read even though technically nodejs hasnt
processed the data event yet so we drop data :(
how we fix it:
1. axe poll_pty_fds_until_read
2. make Pty struct own user_fd and expose a method for the js side to
drop this fd when its done
3. when child.wait finishes, write a synthetic 'EOF' that is actually a
custom OSC terminal control sequence (`\x1B]7878\x1B\\`, 7878 is RUST on
the phonepad :)) to the user fd (a cursory search shows no results, it
seems _very_ unlikely for this sequence to appear randomly)
4. on the nodejs wrapper side, create a transform stream that parses out
the synthetic EOF and emits it as a custom event when it happens
5. when the nodejs side hits this EOF, we know we are actually at the
end of the data stream and can safely drop user_fd
node-pty had the [same
problem](microsoft/node-pty#72) and did the
:grug: brain thing and [added a wait
250ms](microsoft/vscode@9464b54
) so im calling it slightly more ok
0 commit comments