Fix null sqe crash in ReadAsync when io_uring submission queue is full#14521
Open
archang19 wants to merge 1 commit intofacebook:mainfrom
Open
Fix null sqe crash in ReadAsync when io_uring submission queue is full#14521archang19 wants to merge 1 commit intofacebook:mainfrom
archang19 wants to merge 1 commit intofacebook:mainfrom
Conversation
Summary: ReadAsync calls io_uring_get_sqe() without checking for nullptr. When the io_uring submission queue is full (outstanding completions not yet reaped), io_uring_get_sqe returns NULL and the subsequent io_uring_prep_readv dereferences it, causing a segfault. MultiRead already handles this correctly by using io_uring_sq_space_left() to cap submissions. ReadAsync submits exactly one SQE per call so a simple null check with error return is sufficient. On null sqe, clean up the already-allocated Posix_IOHandle and return IOStatus::Busy so the caller can retry after reaping completions. The io_uring queue depth is kIoUringDepth (256), and each thread gets its own io_uring instance via thread-local storage. In practice the SQ rarely fills because ReadAsync calls io_uring_submit() after each io_uring_get_sqe(), immediately flushing the SQE to the kernel. The null SQE would only occur under unusual kernel backpressure where the kernel cannot consume from the SQ ring fast enough. IOStatus::Busy was chosen (over IOError) because this is a transient condition. The caller has two options: 1. Call Poll() to reap outstanding completions from the CQ, then retry ReadAsync. This mirrors how MultiRead handles queue pressure internally by capping submissions and reaping between batches. 2. Fall back to synchronous Read(). Existing callers (FilePrefetchBuffer, IODispatcher) already have synchronous fallback paths for non-OK ReadAsync status, so IOStatus::Busy naturally triggers that fallback without additional code changes. Given the rarity of this condition, the synchronous fallback is pragmatic and avoids adding retry complexity. Also adds a TEST_SYNC_POINT_CALLBACK on io_uring_get_sqe to enable test injection, and a new ReadAsyncQueueFull unit test that uses SyncPoint to force a null SQE and verifies the Busy return, handle cleanup, and no crash. Differential Revision: D98533853
|
@archang19 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D98533853. |
✅ clang-tidy: No findings on changed linesCompleted in 130.1s. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary:
ReadAsync calls io_uring_get_sqe() without checking for nullptr. When the
io_uring submission queue is full (outstanding completions not yet reaped),
io_uring_get_sqe returns NULL and the subsequent io_uring_prep_readv
dereferences it, causing a segfault.
MultiRead already handles this correctly by using io_uring_sq_space_left()
to cap submissions. ReadAsync submits exactly one SQE per call so a simple
null check with error return is sufficient.
On null sqe, clean up the already-allocated Posix_IOHandle and return
IOStatus::Busy so the caller can retry after reaping completions.
The io_uring queue depth is kIoUringDepth (256), and each thread gets its
own io_uring instance via thread-local storage. In practice the SQ rarely
fills because ReadAsync calls io_uring_submit() after each io_uring_get_sqe(),
immediately flushing the SQE to the kernel. The null SQE would only occur
under unusual kernel backpressure where the kernel cannot consume from the
SQ ring fast enough.
IOStatus::Busy was chosen (over IOError) because this is a transient
condition. The caller has two options:
ReadAsync. This mirrors how MultiRead handles queue pressure internally
by capping submissions and reaping between batches.
IODispatcher) already have synchronous fallback paths for non-OK
ReadAsync status, so IOStatus::Busy naturally triggers that fallback
without additional code changes. Given the rarity of this condition,
the synchronous fallback is pragmatic and avoids adding retry complexity.
Also adds a TEST_SYNC_POINT_CALLBACK on io_uring_get_sqe to enable test
injection, and a new ReadAsyncQueueFull unit test that uses SyncPoint to
force a null SQE and verifies the Busy return, handle cleanup, and no crash.
Differential Revision: D98533853