
Conversation

@madsbk
Member

@madsbk madsbk commented Nov 25, 2025

Introduce a new class, MemoryReserveOrWait, that provides asynchronous waiting for memory reservation requests.

    /**
     * @brief Attempts to reserve memory or waits until the reservation can be satisfied.
     *
     * This coroutine submits a memory reservation request and then suspends until
     * either sufficient memory becomes available or no progress is made within the
     * configured timeout.
     *
     * If the timeout expires before the request can be fulfilled, an empty
     * `MemoryReservation` is returned.
     *
     * @param size Number of bytes to reserve.
     * @param future_release_potential Estimated number of bytes the requester may release
     * in the future, used as a heuristic when selecting which eligible request to satisfy
     * first.
     * @return A `MemoryReservation` representing the allocated memory, or an empty
     * reservation if the timeout expires.
     *
     * @throw std::runtime_error If shutdown occurs before the request can be processed.
     */
    coro::task<MemoryReservation> reserve_or_wait(
        std::size_t size, std::size_t future_release_potential
    );
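
A minimal caller-side sketch of the intended usage (illustrative only: it assumes `MemoryReservation` can be tested for emptiness via a size() accessor, and the surrounding names are hypothetical, not from this PR):

    // Hypothetical usage sketch, not code from this PR.
    coro::task<void> stage_buffer(MemoryReserveOrWait& mem, std::size_t nbytes) {
        // Zero future-release potential: this operation only grows memory use.
        MemoryReservation res = co_await mem.reserve_or_wait(nbytes, 0);
        if (res.size() == 0) {
            co_return;  // Timed out; the caller decides how to degrade.
        }
        // ... allocate and fill the buffer under the reservation ...
    }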

@madsbk madsbk self-assigned this Nov 25, 2025
@madsbk madsbk added the improvement and non-breaking labels Nov 25, 2025
@madsbk madsbk closed this Nov 25, 2025
@madsbk madsbk reopened this Nov 25, 2025
madsbk added a commit to madsbk/rapidsmpf that referenced this pull request Dec 15, 2025
We need bcca40734054d8b0a4da134fa15595b3025d69e0 for rapidsai#688
rapids-bot bot pushed a commit that referenced this pull request Dec 15, 2025
#688 needs jbaldwin/libcoro#423

Authors:
  - Mads R. B. Kristensen (https://github.com/madsbk)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)
  - Lawrence Mitchell (https://github.com/wence-)

URL: #747
@madsbk madsbk force-pushed the memory_reserve_or_wait branch 4 times, most recently from c447471 to 5e0831a on December 17, 2025 at 08:22
@madsbk madsbk force-pushed the memory_reserve_or_wait branch from fe430de to 65124fe on December 17, 2025 at 11:34
@madsbk madsbk marked this pull request as ready for review December 17, 2025 11:35
@madsbk madsbk requested review from a team as code owners December 17, 2025 11:35
@rapidsai rapidsai deleted a comment from copy-pr-bot bot Dec 17, 2025
Contributor

@wence- wence- left a comment


First pass

    // Extract the selected request and push the reservation into its queue.
    ResReq request = reservation_requests_.extract(it).value();
    lock.unlock();
    push_into_queue(request, std::move(res));
Contributor

issue (maybe?): push_into_queue takes the request by reference, but that reference could be dead by the time the queue is accessed inside the task?

Member Author

Note that request.queue is also just a reference. It is reserve_or_wait() that creates the queue and keeps it alive until the request has been fulfilled or cancelled.

Contributor

Let's change it to

    auto push_into_queue = [this](coro::queue& res_queue, MemoryReservation res)...

then the intention is clearer.
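
A sketch of how that suggestion might look (illustrative only; the template argument and task return type are assumptions, not the PR's actual code):

    // Passing the queue itself makes the lifetime contract explicit: the
    // lambda only touches the queue that reserve_or_wait() keeps alive,
    // never the (possibly dead) request object.
    auto push_into_queue = [this](coro::queue<MemoryReservation>& res_queue,
                                  MemoryReservation res) -> coro::task<void> {
        co_await res_queue.push(std::move(res));
    };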

@madsbk madsbk requested a review from wence- December 17, 2025 14:08
@madsbk
Member Author

madsbk commented Dec 17, 2025

Thanks @wence-, I have fixed the ordering so that on timeout we pick the smallest request and, if tied, the oldest request.


    // Use libcoro's queue to track completion of this reservation request.
    // The queue will have at most one item: the fulfilled memory reservation.
    coro::queue<MemoryReservation> request_queue{};
Contributor

coro::queue seems like a bigger hammer than we really need here. How about coro::event?

Member Author

Yeah, but we would need both an event and a MemoryReservation, and we would still have to handle shutdown. Isn't it better to keep it simple?

Contributor

@madsbk My thinking was we could use a coro::event with an optional<MemoryReservation>. We could set the coro::event when a reservation is ready; during shutdown, we can set the event with a nullopt.
I feel like it would not be very complicated 😇. Maybe we could do this later as an improvement?

Member Author

My gut feeling is that this is an unnecessary complication, but we can consider it as a follow-up PR.
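
For reference, a minimal sketch of the coro::event alternative discussed above (it assumes libcoro's coro::event and coro::task; the EventSlot type and helper names are hypothetical, not this PR's code):

    #include <coro/coro.hpp>
    #include <optional>
    #include <stdexcept>

    // Hypothetical sketch of the event-plus-optional design.
    struct EventSlot {
        coro::event ready;                       // set once the slot is decided
        std::optional<MemoryReservation> value;  // nullopt signals shutdown
    };

    // Producer side: fulfil a pending request, or cancel it on shutdown.
    void fulfil(EventSlot& slot, MemoryReservation res) {
        slot.value = std::move(res);
        slot.ready.set();
    }
    void cancel_on_shutdown(EventSlot& slot) {
        slot.ready.set();  // value stays nullopt, signalling shutdown
    }

    // Consumer side, inside reserve_or_wait():
    coro::task<MemoryReservation> await_slot(EventSlot& slot) {
        co_await slot.ready;
        if (!slot.value) {
            throw std::runtime_error{"shutdown before request was processed"};
        }
        co_return std::move(*slot.value);
    }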

Comment on lines +191 to +195
    if (Clock::now() - last_reservation_success > timeout_) {
        // This is the only way out of the while-loop that doesn't shutdown
        // the periodic memory check.
        break;
    }
Contributor

I'm missing something here. Shouldn't timeout_ be evaluated for each request? periodic_memory_check is a long-running task per MemoryReserveOrWait instance. So, when the timeout expires, we pick the smallest request (using future_release_potential) and try to reserve that. BUT this could actually be the last request that was added. And if the reservation fails, we inform it that it has failed, when in reality it could have been attempted again later.

What I'm trying to say is that I feel like the Request should have a timestamp telling us when it was created, and, based on the current time, any expired requests should also be released.

Member Author

The timeout is not specific to a single request. I have updated the docs:

    /**
     * @brief Attempts to reserve memory or waits until progress can be made.
     *
     * This coroutine submits a memory reservation request and then suspends until
     * either sufficient memory becomes available or no reservation request (including
     * other pending requests) makes progress within the configured timeout.
     *
     * The timeout does not apply specifically to this request. Instead, it is used as
     * a global progress guarantee: if no pending reservation request can be satisfied
     * within the timeout, `MemoryReserveOrWait` forces progress by selecting the smallest
     * pending request and attempting to reserve memory for it. The forced reservation
     * attempt may result in an empty `MemoryReservation` if the selected request still
     * cannot be satisfied.
     *
     * When multiple reservation requests are eligible, `MemoryReserveOrWait` uses
     * @p future_release_potential as a heuristic to prefer requests that are expected
     * to free memory sooner. Operations that do not free memory, for example reading
     * data from disk into memory, should use a value of zero. Operations that are
     * expected to reduce memory usage, for example a reduction such as a sum, should
     * use a value corresponding to the amount of input data that will be released
     * once the operation completes.
     *
     * @param size Number of bytes to reserve.
     * @param future_release_potential Estimated number of bytes the requester may
     * release in the future.
     * @return A `MemoryReservation` representing the allocated memory, or an empty
     * reservation if progress could not be made.
     *
     * @throws std::runtime_error If shutdown occurs before the request can be processed.
     */
    coro::task<MemoryReservation> reserve_or_wait(
        std::size_t size, std::size_t future_release_potential
    );
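
To make the heuristic concrete, a hedged caller-side sketch (it assumes `MemoryReservation` exposes a size() accessor; the function and variable names are illustrative, not from this PR):

    // Hypothetical sketch: a reduction releases its input once finished, so
    // it advertises the input size as its future release potential.
    coro::task<void> reduce_partition(MemoryReserveOrWait& mem, std::size_t input_bytes) {
        MemoryReservation res = co_await mem.reserve_or_wait(input_bytes, input_bytes);
        if (res.size() == 0) {
            // Forced progress could not satisfy this request: overbook, or
            // fall back to a low-memory code path.
            co_return;
        }
        // ... perform the reduction using the reserved memory ...
    }

An operation that only grows memory usage, such as reading data from disk, would instead pass 0 as future_release_potential.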

@madsbk madsbk requested a review from nirandaperera December 18, 2025 09:36
        // the periodic memory check.
        break;
    }
    auto const max_size = memory_available();
Contributor

Shouldn't we yield/continue if max_size is smaller than the smallest request size?

Contributor

That happens on line 204, where we find the eligible requests.


    while (true) {
        auto last_reservation_success = Clock::now();
        while (true) {
Contributor

I have a bit of a concern with this double while-loop. Consider the scenario where memory is fully, or almost fully, reserved and none of the requests can be served.
Then, IINM, this coroutine will wake up from the yield and very quickly yield back in the next iteration, and continue to do so until the timeout expires; then, on the outer loop, it keeps draining the request queue. I feel like this is more or less a busy loop.

Why didn't we consider a callback approach, where we register a callback in BufferResource that is called for every release? Then, based on the callback, we could wake up coroutines that are waiting in the request queue.
I feel like we might not need this periodic mem-check task that way, and then we would not have to worry about the busy loops IMO.
Basically, "do nothing until someone releases memory".

Member Author

@madsbk madsbk Dec 18, 2025

The main limitation of a pure callback-based approach is that BufferResource does not have full visibility into memory usage. It only sees allocations made via BufferResource::allocate(), not allocations performed by libcudf, which can represent a large share of the memory pressure and are often why a reservation cannot be satisfied.

I agree this is a valid concern. When the system is effectively out of memory, we can end up in a wake-up-and-immediately-yield-again pattern. That said, it may not be as bad as it sounds: yield gives all other coroutines a chance to run, and checking available memory is very fast. The periodic memory check is therefore a pragmatic compromise. It keeps the logic simple, avoids coupling to a specific allocator, and still works when memory is released outside BufferResource’s control.

Longer term, a unified reservation system could enable a callback-driven approach and remove the need for periodic polling. Given the current constraints, however, the periodic check is the best trade-off.
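
For illustration, a sketch of the periodic-polling shape described above (it assumes libcoro's io_scheduler and its yield_for; all helper names are hypothetical, not this PR's code):

    // Hypothetical sketch: poll with a scheduler yield so other coroutines
    // run between checks instead of a pure busy-spin.
    coro::task<void> periodic_memory_check(std::shared_ptr<coro::io_scheduler> sched) {
        while (!shutdown_requested()) {
            if (memory_available() >= smallest_pending_request_size()) {
                satisfy_eligible_requests();  // hand out reservations
            }
            // Suspend briefly; checking available memory on wake-up is cheap.
            co_await sched->yield_for(std::chrono::milliseconds{1});
        }
    }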

@madsbk madsbk requested a review from nirandaperera January 2, 2026 19:19
*
* @return The Options instance.
*/
[[nodiscard]] config::Options const& options() const noexcept;
Contributor

Now options() always returns a copy, rather than a const reference. Is there a particular reason for this?

Member Author

Simplification: config::Options is always backed by a shared pointer internally, so the overhead is minimal.

Comment on lines +34 to +37
* If no reservation request can be satisfied within @p timeout, the coroutine
* forces progress by selecting the smallest pending request and attempting to
* reserve memory for it. This attempt may result in an empty reservation if the
* request still cannot be satisfied.
Contributor

OK, so the idea is that you still need to check that your reservation gave you enough space.

Member Author

Yes, and you will have to decide whether you want to overbook, or maybe fall back to a low-memory mode.

    if (!reservation_requests.empty()) {
        std::vector<Node> nodes;
        for (Request const& request : reservation_requests) {
            nodes.push_back(request.queue.shutdown_drain(ctx_->executor()));
Contributor

Do you mean shutdown_drain, which waits for the consumer to pick it up, or shutdown, which does not?

Member Author

Good catch, we want shutdown

@wence-
Contributor

wence- commented Jan 8, 2026

I think this makes sense, with a query about shutdown.

@copy-pr-bot

copy-pr-bot bot commented Jan 8, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@madsbk
Member Author

madsbk commented Jan 8, 2026

/ok to test f1ad5db

@madsbk
Member Author

madsbk commented Jan 8, 2026

/merge

@rapids-bot rapids-bot bot merged commit 90fe8fe into rapidsai:main Jan 8, 2026
98 checks passed