Commit a0727c7

isilence authored and axboe committed
io_uring: improve cqe !tracing hot path
While looking at io_fill_cqe_req()'s asm I stumbled on our trace points
turning into the chunk below:

trace_io_uring_complete(req->ctx, req, req->cqe.user_data,
			req->cqe.res, req->cqe.flags,
			req->extra1, req->extra2);

io_uring/io_uring.c:898: 	trace_io_uring_complete(req->ctx, req, req->cqe.user_data,
	movq	232(%rbx), %rdi	# req_44(D)->big_cqe.extra2, _5
	movq	224(%rbx), %rdx	# req_44(D)->big_cqe.extra1, _6
	movl	84(%rbx), %r9d	# req_44(D)->cqe.D.81184.flags, _7
	movl	80(%rbx), %r8d	# req_44(D)->cqe.res, _8
	movq	72(%rbx), %rcx	# req_44(D)->cqe.user_data, _9
	movq	88(%rbx), %rsi	# req_44(D)->ctx, _10
./arch/x86/include/asm/jump_label.h:27: 	asm_volatile_goto("1:"
	1:jmp .L1772 # objtool NOPs this 	#
	...

It does a jump_label for the actual tracing, but those 6 moves stay
in the hottest io_uring path even when tracing is off. As an
optimisation, add a trace_io_uring_complete_enabled() check, which also
uses jump_labels and tricks the compiler into behaving. It removes the
junk without changing anything else in the hot path.

Note: apparently I'm not the only one noticing this, and other people
are also working around it. We should remove the check once the problem
is solved generically or tracing is reworked.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/555d8312644b3776f4be7e23f9b92943875c4bc7.1692916914.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
1 parent 04d9244 commit a0727c7

1 file changed, +5 -4 lines changed

io_uring/io_uring.h

Lines changed: 5 additions & 4 deletions
@@ -145,10 +145,11 @@ static inline bool io_fill_cqe_req(struct io_ring_ctx *ctx, struct io_kiocb *req
 	if (unlikely(!cqe))
 		return false;
 
-	trace_io_uring_complete(req->ctx, req, req->cqe.user_data,
-				req->cqe.res, req->cqe.flags,
-				(req->flags & REQ_F_CQE32_INIT) ? req->extra1 : 0,
-				(req->flags & REQ_F_CQE32_INIT) ? req->extra2 : 0);
+	if (trace_io_uring_complete_enabled())
+		trace_io_uring_complete(req->ctx, req, req->cqe.user_data,
+					req->cqe.res, req->cqe.flags,
+					(req->flags & REQ_F_CQE32_INIT) ? req->extra1 : 0,
+					(req->flags & REQ_F_CQE32_INIT) ? req->extra2 : 0);
 
 	memcpy(cqe, &req->cqe, sizeof(*cqe));
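
The effect of the guard can be illustrated outside the kernel with a rough userspace analogue. This is only a sketch under stated assumptions: `trace_on`, `arg_loads`, `struct fake_req`, `load_user_data()`, `trace_complete()`, `trace_complete_enabled()` and the two `complete_*()` helpers are hypothetical names invented for the illustration, and a plain `bool` stands in for the static key (jump label) that real tracepoints use. In the kernel the cost is six register loads scheduled ahead of the jump label; here the argument setup is made observable by counting it.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static bool trace_on;            /* plays the role of the tracepoint's static key */
static unsigned long arg_loads;  /* counts how often a trace argument is computed */

struct fake_req {
	unsigned long long user_data;
	int res;
	unsigned int flags;
};

/* Make the argument load observable by bumping a counter. */
static unsigned long long load_user_data(const struct fake_req *req)
{
	arg_loads++;
	return req->user_data;
}

/* Analogue of trace_io_uring_complete_enabled(). */
static bool trace_complete_enabled(void)
{
	return trace_on;
}

/* Analogue of the tracepoint itself: does nothing unless enabled. */
static void trace_complete(unsigned long long user_data, int res,
			   unsigned int flags)
{
	if (!trace_on)
		return;
	printf("complete: user_data=%llu res=%d flags=%u\n",
	       user_data, res, flags);
}

/* Unguarded: the argument is computed even when tracing is off. */
static void complete_unguarded(const struct fake_req *req)
{
	trace_complete(load_user_data(req), req->res, req->flags);
}

/* Guarded: argument setup is skipped entirely when tracing is off. */
static void complete_guarded(const struct fake_req *req)
{
	if (trace_complete_enabled())
		trace_complete(load_user_data(req), req->res, req->flags);
}
```

With `trace_on` left false, calling `complete_unguarded()` still increments `arg_loads`, while `complete_guarded()` leaves it untouched. That mirrors the intent of the patch: the argument setup moves behind the enabled check, so the disabled path pays only for the check itself.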
