Commit 28aabff

io_uring/sqpoll: close race on waiting for sqring entries

When an application uses SQPOLL, it must wait for the SQPOLL thread to consume SQE entries, if it fails to get an sqe when calling io_uring_get_sqe(). It can do so by calling io_uring_enter(2) with the flag value of IORING_ENTER_SQ_WAIT. In liburing, this is generally done with io_uring_sqring_wait(). There's a natural expectation that once this call returns, a new SQE entry can be retrieved, filled out, and submitted.

However, the kernel uses the cached sq head to determine if the SQRING is full or not. If the SQPOLL thread is currently in the process of submitting SQE entries, it may have updated the cached sq head, but not yet committed it to the SQ ring. Hence the kernel may find that there are SQE entries ready to be consumed, and return successfully to the application. If the SQPOLL thread hasn't yet committed the SQ ring entries by the time the application returns to userspace and attempts to get a new SQE, it will fail getting a new SQE.

Fix this by having io_sqring_full() always use the user visible SQ ring head entry, rather than the internally cached one.

Cc: [email protected] # 5.10+
Link: axboe/liburing#1267
Reported-by: Benedek Thaler <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
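
The race described above can be illustrated with a small user-space model. This is only a sketch: `struct sq_model` and the helpers `full_cached()`, `full_visible()`, `grab()`, `commit()` and `app_can_get_sqe()` are hypothetical names invented for illustration, not kernel or liburing code. The model assumes the consumer first advances a private head and only later publishes it to the head that the application reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the SQ ring indices; not the kernel's structures. */
struct sq_model {
	uint32_t tail;        /* advanced by the application */
	uint32_t head;        /* user-visible head, published by the consumer */
	uint32_t cached_head; /* consumer's private head, may run ahead of head */
	uint32_t entries;     /* ring size */
};

/* Pre-fix style check: compares against the private, uncommitted head. */
static bool full_cached(const struct sq_model *m)
{
	return m->tail - m->cached_head == m->entries;
}

/* Post-fix style check: compares against the head the application sees. */
static bool full_visible(const struct sq_model *m)
{
	return m->tail - m->head == m->entries;
}

/* Consumer takes n entries privately, without publishing them yet. */
static void grab(struct sq_model *m, uint32_t n)
{
	assert(n <= m->tail - m->cached_head);
	m->cached_head += n;
}

/* Consumer publishes its progress to the user-visible head. */
static void commit(struct sq_model *m)
{
	m->head = m->cached_head;
}

/* Application-side view: is there room for another SQE? */
static bool app_can_get_sqe(const struct sq_model *m)
{
	return m->tail - m->head < m->entries;
}
```

In this model, after `grab()` but before `commit()`, `full_cached()` reports that the ring has room while `app_can_get_sqe()` still fails. That is the window in which a wait based on the cached head could return too early. A check based on the visible head agrees with the application's view at every step.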
1 parent f7c9134 commit 28aabff

1 file changed, +8 -1 lines changed

io_uring/io_uring.h

Lines changed: 8 additions & 1 deletion
@@ -284,7 +284,14 @@ static inline bool io_sqring_full(struct io_ring_ctx *ctx)
 {
 	struct io_rings *r = ctx->rings;
 
-	return READ_ONCE(r->sq.tail) - ctx->cached_sq_head == ctx->sq_entries;
+	/*
+	 * SQPOLL must use the actual sqring head, as using the cached_sq_head
+	 * is race prone if the SQPOLL thread has grabbed entries but not yet
+	 * committed them to the ring. For !SQPOLL, this doesn't matter, but
+	 * since this helper is just used for SQPOLL sqring waits (or POLLOUT),
+	 * just read the actual sqring head unconditionally.
+	 */
+	return READ_ONCE(r->sq.tail) - READ_ONCE(r->sq.head) == ctx->sq_entries;
 }
 
 static inline unsigned int io_sqring_entries(struct io_ring_ctx *ctx)
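
The fullness test in the hunk above subtracts the head from the tail and compares the result with the ring size. A minimal sketch of why that subtraction stays correct once the indices wrap, assuming free-running 32-bit unsigned indices (`ring_used()` and `ring_full()` are hypothetical helper names, not kernel functions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Number of occupied slots. Unsigned subtraction is performed modulo 2^32,
 * so the result is still correct after tail has wrapped past zero.
 */
static uint32_t ring_used(uint32_t head, uint32_t tail)
{
	return tail - head;
}

/* The ring is full when every slot is occupied. */
static bool ring_full(uint32_t head, uint32_t tail, uint32_t entries)
{
	assert(ring_used(head, tail) <= entries);
	return ring_used(head, tail) == entries;
}
```

For example, with a head of `0xFFFFFFFE` and a tail of `2`, `ring_used()` yields 4, so a four-entry ring is reported as full even though the tail is numerically smaller than the head.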
