Skip to content

Log errno and thread ID on CreateIOUring failure (#14520)#14520

Closed
archang19 wants to merge 1 commit intofacebook:mainfrom
archang19:export-D98526792
Closed

Log errno and thread ID on CreateIOUring failure (#14520)#14520
archang19 wants to merge 1 commit intofacebook:mainfrom
archang19:export-D98526792

Conversation

@archang19
Copy link
Copy Markdown
Contributor

@archang19 archang19 commented Mar 27, 2026

Summary:

CreateIOUring() silently returns nullptr on failure, discarding the errno
from io_uring_queue_init. This makes it impossible to diagnose why
io_uring initialization fails on specific threads (e.g. ENOMEM from
memlock limits, EINVAL from unsupported flags, EMFILE from fd exhaustion).

Add a fprintf(stderr, ...) that logs strerror, errno, and pthread thread
ID when io_uring_queue_init fails, so failures are diagnosable from logs
without needing to reproduce.

Reviewed By: xingbowang

Differential Revision: D98526792

@meta-cla meta-cla bot added the CLA Signed label Mar 27, 2026
@meta-codesync
Copy link
Copy Markdown

meta-codesync bot commented Mar 27, 2026

@archang19 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D98526792.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 27, 2026

✅ clang-tidy: No findings on changed lines

Completed in 66.1s.

@meta-codesync meta-codesync bot changed the title Log errno and thread ID on CreateIOUring failure Log errno and thread ID on CreateIOUring failure (#14520) Mar 27, 2026
archang19 added a commit to archang19/rocksdb that referenced this pull request Mar 27, 2026
Summary:

CreateIOUring() silently returns nullptr on failure, discarding the errno
from io_uring_queue_init. This makes it impossible to diagnose why
io_uring initialization fails on specific threads (e.g. ENOMEM from
memlock limits, EINVAL from unsupported flags, EMFILE from fd exhaustion).

Add a fprintf(stderr, ...) that logs strerror, errno, and pthread thread
ID when io_uring_queue_init fails, so failures are diagnosable from logs
without needing to reproduce.

Differential Revision: D98526792
@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 27, 2026

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit b2ccf9e


Code Review: PR #14520 — Log errno and thread ID on CreateIOUring failure

PR: 2 files changed, 8 insertions, 1 deletion
Author: archang19 | Reviewed By: xingbowang


Summary

This PR improves diagnostic logging when io_uring_queue_init fails in CreateIOUring(). It adds errno details, human-readable error string, and thread ID to the failure message, and removes a less-informative fprintf from the ReadAsync() caller. The change is correct, focused, and follows existing codebase patterns.


Findings

F1: Startup probe will print error to stderr on systems where io_uring fails at runtime

Severity: Medium
File: env/io_posix.h:349 (new fprintf) / env/fs_posix.cc:1345 (caller)

The PosixFileSystem constructor (fs_posix.cc:1338-1352) calls CreateIOUring() as a one-time probe to test io_uring support. With this PR, systems where io_uring is compiled in (ROCKSDB_IOURING_PRESENT) but fails at runtime will now print an error to stderr during normal startup:

CreateIOUring failed: Function not implemented (errno=38), thread=140123456789

This can happen in Docker containers with restricted syscalls, systems with low RLIMIT_MEMLOCK, or older kernels with liburing installed. The old code had no stderr output for this probe path.

Impact: Users may interpret this as an error when it's actually expected behavior (graceful fallback to pread). One-time per process, so log volume is minimal.

Suggestion: Consider wording that indicates fallback is normal, e.g., "CreateIOUring failed (falling back to pread): ...".


F2: pthread_t cast assumes glibc type definition

Severity: Low
File: env/io_posix.h:351

static_cast<unsigned long>(pthread_self()) is safe on Linux glibc where pthread_t is unsigned long. On musl libc it's a struct pointer. Since ROCKSDB_IOURING_PRESENT is Linux-only and targets glibc, this is safe in practice. No action needed now.


F3: Correctness, error handling, and memory safety — all verified clean

  • -ret correctly converts negative errno return ✓
  • errnoStr(-ret).c_str() temporary lifetime is safe per C++ standard ✓
  • fprintf(stderr) is thread-safe per POSIX ✓
  • Memory: new_io_uring properly deleted before returning nullptr ✓
  • errnoStr() heap allocation acceptable on cold path ✓

F4: Include adds transitive dependency

Severity: Low#include "util/string_util.h" inside ROCKSDB_IOURING_PRESENT guard is correct. string_util.h is lightweight and widely included.

F5: No new tests needed — diagnostic-only change on cold path with no behavioral changes.


Verdict

The PR is a net improvement. The only substantive concern is F1 — the startup probe will now emit a stderr message on systems where io_uring fails at runtime, which was previously silent. Consider whether the message should clarify that fallback behavior is normal.


ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase

Summary:

CreateIOUring() silently returns nullptr on failure, discarding the errno
from io_uring_queue_init. This makes it impossible to diagnose why
io_uring initialization fails on specific threads (e.g. ENOMEM from
memlock limits, EINVAL from unsupported flags, EMFILE from fd exhaustion).

Add a fprintf(stderr, ...) that logs strerror, errno, and pthread thread
ID when io_uring_queue_init fails, so failures are diagnosable from logs
without needing to reproduce.

Reviewed By: xingbowang

Differential Revision: D98526792
@meta-codesync
Copy link
Copy Markdown

meta-codesync bot commented Mar 28, 2026

This pull request has been merged in 2af3820.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants