
[libc] Simplify slab waiting in GPU memory allocator #152872


Merged
1 commit merged into llvm:main from the Waiting branch on Aug 11, 2025

Conversation

Contributor

@jhuber6 jhuber6 commented Aug 9, 2025

Summary:
This moves the waiting into the `try_lock` routine instead. This makes the
logic much simpler since it's just a single loop on a load. The effect should
be the same, and since we don't care about this being a generic interface it
shouldn't matter that it waits a bit. It is still wait-free since it's
guaranteed to make progress eventually.
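
For illustration, here is a minimal sketch of the wait-on-load pattern described above, assuming standard `<atomic>` in place of the allocator's internal `cpp::` wrappers and an illustrative `MAX_TRIES` value (the real code is in the diff below):

```cpp
// Minimal sketch of the bounded wait-on-load now done inside try_lock.
// Names are simplified and MAX_TRIES is illustrative; the real code uses
// LLVM libc's cpp:: atomic wrappers and a GPU sleep_briefly() back-off.
#include <atomic>
#include <cstdint>
#include <limits>

constexpr uint32_t MAX_TRIES = 128; // illustrative retry bound

// The all-ones bit pattern marks a failed or in-progress slab allocation.
static bool is_sentinel(void *p) {
  return reinterpret_cast<uintptr_t>(p) ==
         std::numeric_limits<uintptr_t>::max();
}

static void sleep_briefly() { /* platform-specific back-off */ }

// Spin on a relaxed load while another thread publishes the slab, giving up
// after MAX_TRIES so the caller reports failure instead of blocking forever.
void *wait_for_slab(std::atomic<void *> &slot) {
  void *expected = slot.load(std::memory_order_relaxed);
  for (uint32_t t = 0; is_sentinel(expected) && t < MAX_TRIES; ++t) {
    sleep_briefly();
    expected = slot.load(std::memory_order_relaxed);
  }
  // Still unpublished (null) or permanently failed (sentinel): bail out.
  if (!expected || is_sentinel(expected))
    return nullptr;
  return expected;
}
```

Bounding the loop is what preserves the wait-free claim above: each caller either observes the published slab or gives up after a fixed number of iterations, so progress is guaranteed either way.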

Member

llvmbot commented Aug 9, 2025

@llvm/pr-subscribers-libc

Author: Joseph Huber (jhuber6)

Changes

Summary:
This moves the waiting into the `try_lock` routine instead. This makes the
logic much simpler since it's just a single loop on a load. The effect should
be the same, and since we don't care about this being a generic interface it
shouldn't matter that it waits a bit. It is still wait-free since it's
guaranteed to make progress eventually.


Full diff: https://github.com/llvm/llvm-project/pull/152872.diff

1 File Affected:

  • (modified) libc/src/__support/GPU/allocator.cpp (+12-12)
diff --git a/libc/src/__support/GPU/allocator.cpp b/libc/src/__support/GPU/allocator.cpp
index 534a309fec7b4..f8f578ad151cd 100644
--- a/libc/src/__support/GPU/allocator.cpp
+++ b/libc/src/__support/GPU/allocator.cpp
@@ -166,7 +166,11 @@ static inline uint32_t get_leader_id(uint64_t ballot, uint32_t id) {
 
 // We use a sentinal value to indicate a failed or in-progress allocation.
 template <typename T> bool is_sentinel(const T &x) {
-  return x == cpp::numeric_limits<T>::max();
+  if constexpr (cpp::is_pointer_v<T>)
+    return reinterpret_cast<uintptr_t>(x) ==
+           cpp::numeric_limits<uintptr_t>::max();
+  else
+    return x == cpp::numeric_limits<T>::max();
 }
 
 } // namespace impl
@@ -446,7 +450,13 @@ struct GuardPtr {
       return new (raw) Slab(cpp::forward<Args>(args)...);
     }
 
-    if (!expected || impl::is_sentinel(reinterpret_cast<uintptr_t>(expected)))
+    // If there is a slab allocation in progress we retry a few times.
+    for (uint32_t t = 0; impl::is_sentinel(expected) && t < MAX_TRIES; t++) {
+      sleep_briefly();
+      expected = ptr.load(cpp::MemoryOrder::RELAXED);
+    }
+
+    if (!expected || impl::is_sentinel(expected))
       return nullptr;
 
     if (!ref.acquire(n, count))
@@ -557,16 +567,6 @@ static Slab *find_slab(uint32_t chunk_size, uint64_t &uniform,
       Slab *slab = slots[index].try_lock(lane_mask, uniform & lane_mask,
                                          reserved, chunk_size, index);
 
-      // If there is a slab allocation in progress we retry a few times.
-      for (uint32_t retries = 0;
-           !slab && !impl::is_sentinel(reserved) && retries < MAX_TRIES;
-           retries++) {
-        uint64_t lane_mask = gpu::get_lane_mask();
-        slab = slots[index].try_lock(lane_mask, uniform & lane_mask, reserved,
-                                     chunk_size, index);
-        sleep_briefly();
-      }
-
       // If we find a slab with a matching chunk size then we store the result.
       // Otherwise, we need to free the claimed lock and continue. In the case
       // of out-of-memory we receive a sentinel value and return a failure.
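
For reference, the new pointer-aware `is_sentinel` from the diff above can be extracted into a standalone sketch, here using standard-library traits in place of the internal `cpp::` wrappers; the `if constexpr` branch is what lets `try_lock` pass `expected` directly without a `reinterpret_cast` at the call site:

```cpp
// Standalone sketch of the pointer-aware sentinel check from the diff above,
// with std:: traits substituted for LLVM libc's internal cpp:: equivalents.
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

template <typename T> bool is_sentinel(const T &x) {
  if constexpr (std::is_pointer_v<T>)
    // Compare the pointer's bit pattern against the all-ones sentinel.
    return reinterpret_cast<uintptr_t>(x) ==
           std::numeric_limits<uintptr_t>::max();
  else
    return x == std::numeric_limits<T>::max();
}

int main() {
  assert(is_sentinel(std::numeric_limits<uint64_t>::max()));
  assert(is_sentinel(reinterpret_cast<void *>(UINTPTR_MAX)));
  assert(!is_sentinel(static_cast<uint64_t>(0)));
}
```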

@jhuber6 jhuber6 merged commit 0058952 into llvm:main Aug 11, 2025
19 checks passed
@jhuber6 jhuber6 deleted the Waiting branch August 11, 2025 18:11