Historically, when making changes to the code we would forget to send
something to the Pager Worker channel before closing it. This would
cause the `recv().unwrap()` in QueryPager to panic, because the channel
would be closed without having sent anything.
To harden against such logic bugs, we added the PageSendAttemptedProof
abstraction, which enforces that at least one item is sent
to the channel before it is closed, or else `worker.work()` can't
return. With this abstraction in place, we long believed the `unwrap()`
to be safe.
However, there are two more cases in which `recv()` can return None:
1) when the runtime is being shut down;
2) when the worker task that owns the channel's sending half
panics.
In both cases, the worker task terminates, and the channel is closed
without sending anything.
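The second case can be reproduced in isolation. The sketch below is only an analogy: it uses a standard-library thread and `std::sync::mpsc` instead of the driver's async worker and channel, and the helper name `recv_after_worker_panic` is hypothetical. It shows that when the owner of the sending half panics, the sender is dropped during unwinding and the receiver sees a closed channel with nothing in it.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: spawns a worker that panics before sending
/// anything and returns what the receiving side observes.
fn recv_after_worker_panic() -> Option<u32> {
    let (tx, rx) = mpsc::channel::<u32>();

    let worker = thread::spawn(move || {
        // The worker owns the sending half; it is dropped when the
        // panic unwinds this closure.
        let _tx = tx;
        panic!("worker failed before sending a page");
    });

    // With std's channel a closed, empty channel yields Err; an async
    // channel whose recv() returns an Option would yield None instead.
    let received = rx.recv().ok();

    // Join so the thread does not outlive the helper; the panic payload
    // is deliberately ignored here.
    let _ = worker.join();
    received
}
```

Calling `recv_after_worker_panic()` returns `None`, which is exactly the situation an unconditional `unwrap()` cannot survive.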
This commit handles both cases by executing special recovery logic
when `recv()` returns None. This allows for graceful handling of runtime
shutdown scenarios, without panicking, as well as correct panic
propagation.
We are sure that we won't introduce silent errors this way, because
if we get None, the only possible explanation is that the runtime
is indeed being shut down or that the worker task panicked. The logic
bugs on the side of the Pager Worker are already prevented by the
PageSendAttemptedProof abstraction.
If a panic is detected, it is propagated using `std::panic::resume_unwind`.
If runtime shutdown is detected, we await a never-ending future
to avoid returning from this function while the runtime is being
shut down. To help debugging, we also emit a tracing info-level message
in this case.
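The panic-propagation half of this logic can be sketched as follows. This is a simplified illustration under stated assumptions, not the driver's code: it uses a standard-library thread and `std::sync::mpsc` rather than an async task and channel, and `next_page` and `propagated_panic_message` are hypothetical names. The runtime-shutdown branch is not modelled, since it needs an async runtime.

```rust
use std::panic;
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for the receiving path: returns the next item,
/// or re-raises the worker's panic if the channel closed without one.
fn next_page(rx: &mpsc::Receiver<u32>, worker: thread::JoinHandle<()>) -> u32 {
    match rx.recv() {
        Ok(page) => page,
        // Channel closed with nothing sent: find out why the worker ended.
        Err(_) => match worker.join() {
            // The worker panicked: continue unwinding with its payload.
            Err(payload) => panic::resume_unwind(payload),
            // Assumed impossible here, mirroring the guarantee that a
            // worker cannot return without attempting a send.
            Ok(()) => unreachable!("worker returned without sending"),
        },
    }
}

/// Hypothetical driver for the sketch: runs a panicking worker and
/// returns the message of the panic observed on the receiving side.
fn propagated_panic_message() -> Option<String> {
    let (tx, rx) = mpsc::channel::<u32>();
    let worker = thread::spawn(move || {
        let _tx = tx;
        panic!("worker panicked");
    });

    // catch_unwind is used only so the sketch can inspect the payload.
    let result = panic::catch_unwind(panic::AssertUnwindSafe(|| next_page(&rx, worker)));

    result.err().and_then(|payload| {
        payload
            .downcast_ref::<&str>()
            .map(|s| s.to_string())
            .or_else(|| payload.downcast_ref::<String>().cloned())
    })
}
```

Using `resume_unwind` rather than a fresh `panic!` keeps the original payload, so the caller sees the worker's own message instead of a generic "channel closed" error.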
Fixes: #1435