-
Notifications
You must be signed in to change notification settings - Fork 494
UCT/GDA: Move peermem check outside the loop. #11001
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UCT/GDA: Move peermem check outside the loop. #11001
Conversation
WalkthroughAdded a one-time global cached Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant uct_gdaki_query_tl_devices as GDAKI
opt First-call: compute static peermem_loaded
Caller->>GDAKI: invoke
GDAKI-->>GDAKI: derive peermem_loaded from md->super.reg_mem_types & CUDA type
GDAKI-->>Caller: (diagnostic if unsupported)
end
Caller->>GDAKI: invoke
alt peermem_loaded == 0
GDAKI--xCaller: return UCS_ERR_NO_DEVICE
else peermem_loaded == 1
GDAKI->>GDAKI: iterate GPUs / MDs (per-GPU processing)
GDAKI-->>Caller: return status (success or other)
end
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
Comment |
4c70512 to
9a3058d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/uct/ib/mlx5/gdaki/gdaki.c (1)
700-700: Optional: Remove trailing whitespace.Minor formatting nit: Line 700 appears to have trailing whitespace after the
out:label.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/uct/ib/mlx5/gdaki/gdaki.c(2 hunks)
🔇 Additional comments (2)
src/uct/ib/mlx5/gdaki/gdaki.c (2)
618-630: LGTM! Peermem check correctly moved outside the loop.The lazy initialization pattern using a static variable is correct and achieves the stated goal of checking peermem support once rather than repeatedly inside the GPU loop. The diagnostic message will now only be printed once on first invocation when peermem is not loaded.
Note: The static variable lacks explicit synchronization, which could result in multiple concurrent threads initializing
peermem_loadedsimultaneously. However, this race is benign (multiple assignments of the same value, potentially duplicate diagnostics), and the pattern is consistent with the existinguar_supportedvariable.
632-635: Good optimization: early return prevents unnecessary work.The early return when peermem is not loaded is correct and efficient. Since peermem is required for any GPU to be usable with this transport, returning immediately avoids the overhead of GPU enumeration and device allocation when support is unavailable.
UCT/GDA: Move outside the loop.
What?
Move nvidia peermem driver check outside the gpu loop.
Why?
Improve code.
Summary by CodeRabbit
Bug Fixes
Performance