Remove device_memory_resource inheritance from all resources and adaptors by bdice · Pull Request #2301 · rapidsai/rmm

bdice · 2026-03-14T13:38:48Z

Summary

Remove device_memory_resource inheritance from all memory resources (stateless, stateful, and adaptors)
Remove do_allocate / do_deallocate / do_is_equal virtual overrides from all resources
Rewrite benchmark factory functions from shared_ptr<device_memory_resource> to any_device_resource
Convert simulated_memory_resource from DMR inheritance to CCCL concepts
Change copy/move from = delete to = default on cuda_async_memory_resource, cuda_async_managed_memory_resource, sam_headroom_memory_resource, and simulated_memory_resource (required for CCCL resource_ref copyability via shared_resource base)
Remove NullUpstream tests and DEVICE_MEMORY_RESOURCE_VIEW_TEST (no longer needed without DMR)

Closes #2295
Part of #2011

copy-pr-bot · 2026-03-14T13:38:52Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@vyasr

…async_resource_ref (#2300) ## Summary Replaces `shared_ptr[device_memory_resource]` with per-subclass `unique_ptr[ConcreteType]` (owning) and `optional[device_async_resource_ref]` (non-owning reference) across all Python/Cython bindings. This is a part of #2011. There are **significant** opportunities to make this Cython code better over time but I have to get something that removes `device_memory_resource` from the Python/Cython side before I can finish migration on the C++ side (#2296). I welcome critique of this design, and ideas for how it can be improved, particularly from @vyasr @wence-. I would like to address any suggested improvements in follow-up PRs, because this changeset is necessary to unblock #2301. The changes in `cdef class DeviceMemoryResource` are perhaps the most significant changes here from a design perspective. The solution I'm going with for now is to keep the `DeviceMemoryResource` class around, as a base class for the Cython MRs, and let it handle allocate/deallocate. It owns a `optional[device_async_resource_ref]` which is used for allocation/deallocation. It's `optional` so that the class can be default-constructed (Cython requires nullary constructors), but it should never be `nullopt` except during initialization. Then, each MR class owns a `c_obj` like `unique_ptr[cuda_memory_resource]`. This is `unique_ptr` so it can be default-constructed for Cython's requirements. I chose `unique_ptr` over `optional` here to emphasize that this member is the thing that actually owns the resource. As with the `c_ref`, this should never be `nullptr` except during initialization. When an MR class is created, it initializes its `c_obj` and then constructs a `c_ref` (a member inherited from the `DeviceMemoryResource` base class). "Special" methods for an MR like getting the statistics counts go through `deref(self.c_obj)`, and "common" methods like allocate/deallocate go through `self.c_ref.value()`. ### Changes - **`.pxd` declarations**: Remove `device_memory_resource` class. Declare `device_async_resource_ref` and a `make_device_async_resource_ref()` inline C++ template that returns `optional` to work around Cython generating default-constructed temporaries for non-default-constructible types. All adaptor constructors take `device_async_resource_ref` instead of `device_memory_resource*`. - **`.pxd` class definitions**: `DeviceMemoryResource` base holds `optional[device_async_resource_ref] c_ref`; each concrete subclass holds `unique_ptr[ConcreteType] c_obj`. - **`.pyx` implementations**: All `__cinit__` methods construct via `unique_ptr` then set `c_ref` via `make_device_async_resource_ref`. Typed accessors (`pool_size`, `flush`, etc.) use `deref(self.c_obj)`. Per-device functions use `set_per_device_resource_ref`. - **`device_buffer.pyx`**: Passes `self.mr.c_ref.value()` instead of `self.mr.get_mr()`. Closes #2294

…tors Remove the device_memory_resource virtual base class inheritance from all production memory resources, adaptors, and stream_ordered_memory_resource. Resources now derive publicly from cuda::mr::shared_resource<Impl> (for stateful/adaptor types) or stand alone with direct CCCL concept methods (for stateless types). The legacy do_allocate/do_deallocate/do_is_equal virtual overrides and pointer-based per-device-resource APIs are removed. stream_ordered_memory_resource provides allocate/deallocate/allocate_sync/ deallocate_sync directly instead of through the DMR virtual dispatch. All 103 C++ tests and 1165 Python tests pass.

bdice · 2026-03-18T00:04:30Z

cpp/include/rmm/detail/cccl_adaptors.hpp

                                                        ::rmm::detail::cccl_resource_ref> and
                             not is_specialization_of_v<std::remove_cv_t<OtherResourceType>,
                                                        ::rmm::detail::cccl_async_resource_ref> and
+                             not shared_resource_cast<OtherResourceType>::value and


This isn't something we want to keep. I don't know how to work around the constraint satisfaction issues in NVIDIA/cccl#8037.

Minimal repro: https://godbolt.org/z/ddfn1sE8Y

Reviewers: I discussed this issue with @ericniebler. We are going to try NVIDIA/cccl#8121 as a workaround. For now, I want to get the current series of open PRs merged (#2301, #2324, #2325), and then try to reduce the workarounds that have been introduced.

bdice · 2026-03-18T00:19:45Z

cpp/tests/mr/aligned_mr_tests.cpp

      void* ptr{nullptr};
      if (i != 0) { std::this_thread::sleep_for(std::chrono::milliseconds{100}); }
-      EXPECT_NO_THROW(ptr = mr.allocate(stream, allocation_size));
+      EXPECT_NO_THROW(ptr = mr.allocate(stream, allocation_size, rmm::CUDA_ALLOCATION_ALIGNMENT));


An explicit alignment argument is required because of the issues described in NVIDIA/cccl#8063. There is no 2-argument allocate(stream, size) overload for shared_resource<T> like there is for other CCCL MR classes. There is only allocation(stream, size, alignment) for shared_resource<T>, and it has no default alignment value.

cpp/benchmarks/random_allocations/random_allocations.cpp

cpp/include/rmm/mr/detail/stream_ordered_memory_resource.hpp

Rewrite benchmark factory functions from shared_ptr<device_memory_resource> to any_device_resource, convert simulated_memory_resource from DMR inheritance to CCCL concepts, and change copy/move from = delete to = default on cuda_async_memory_resource, cuda_async_managed_memory_resource, sam_headroom_memory_resource, and simulated_memory_resource to satisfy CCCL resource_ref copyability requirements.

GCC 13 incorrectly resolves the injected class name to the fully-specialized type inside a class template body, even in template-template parameter position. This causes a hard error (not SFINAE) in is_specialization_of_v when checking against cccl_resource_ref and cccl_async_resource_ref in their own generic-resource constructors. Replace is_specialization_of_v<T, cccl_*_resource_ref> with dedicated is_cccl_*_resource_ref_v<T> traits defined outside the class bodies, where the forward declarations are unambiguously templates. This is correct on both GCC 13 and 14.

Inline allocation logic directly into the CCCL-compatible allocate/deallocate methods and delete the legacy do_* methods, per task 4 of the DMR inheritance removal plan.

cpp/tests/mr/failure_callback_mr_tests.cpp

cpp/tests/mr/host_mr_ref_tests.cpp

mhaseeb123 · 2026-03-23T18:42:22Z

cpp/benchmarks/random_allocations/random_allocations.cpp


 /// MR factory functions
-inline auto make_cuda() { return std::make_shared<rmm::mr::cuda_memory_resource>(); }
+using any_device_resource = cuda::mr::any_resource<cuda::mr::device_accessible>;


Is there a way to define this in a common header. We seem to have it duplicated in a few places

I'm trying to keep usage of the any_device_resource alias to a minimum, only for internal usage. I debated removing it entirely and requiring the explicit templated type everywhere. My reasoning for this is that we may want to allow other resource accessibility properties in the future (templating classes on Properties...) and I don't want to force the design into "device only" in too many places.

mhaseeb123 · 2026-03-23T18:49:55Z

cpp/include/rmm/detail/cccl_adaptors.hpp

+  std::void_t<
+    decltype(std::declval<std::remove_cv_t<T>&>().get()),
+    std::enable_if_t<std::is_base_of_v<cuda::mr::shared_resource<std::remove_reference_t<
+                                         decltype(std::declval<T&>().get())>>,
+                                       std::remove_cv_t<T>> and
+                     not is_specialization_of_v<std::remove_cv_t<T>, cuda::mr::shared_resource>>>> {


Is it possible to use requires here instead?

No, we have to support C++17 in RMM for now. I plan to migrate to C++20 sometime after finishing the CCCL MR migration. Also this code will be completely deleted in a follow-up PR so it's fine to keep as-is for now.

mhaseeb123

Few non-blocking comments

cpp/include/rmm/detail/cccl_adaptors.hpp

cpp/include/rmm/mr/detail/stream_ordered_memory_resource.hpp

vuule · 2026-03-23T19:58:04Z

cpp/benchmarks/multi_stream_allocations/multi_stream_allocations_bench.cu

 }

-using MRFactoryFunc = std::function<std::shared_ptr<rmm::mr::device_memory_resource>()>;
+using any_device_resource = cuda::mr::any_resource<cuda::mr::device_accessible>;


does this include resources that allocate memory that's also host accessible? i.e. does this template param mean "only device accessible"?

No, a cuda::mr::any_resource<cuda::mr::device_accessible> can do a narrowing conversion of a host-device accessible resource.

In other words, managed and pinned memory resources are safe to store in a cuda::mr::any_resource<cuda::mr::device_accessible>, but you lose the ability to statically know that they are host-accessible once the property is downcasted away.

vuule

Looks good, a very straight-forward PR to review (with context) :)

cpp/benchmarks/random_allocations/random_allocations.cpp

cpp/benchmarks/utilities/simulated_memory_resource.hpp

ttnghia · 2026-03-23T21:16:40Z

cpp/include/rmm/detail/cccl_adaptors.hpp

+  template <
+    typename OtherResourceType,
+    std::enable_if_t<
+      not is_specialization_of_v<std::remove_cv_t<OtherResourceType>,
+                                 cuda::mr::synchronous_resource_ref> and
+      not is_specialization_of_v<std::remove_cv_t<OtherResourceType>, cuda::mr::resource_ref> and
+      not is_cccl_resource_ref_v<std::remove_cv_t<OtherResourceType>> and
+      not is_cccl_async_resource_ref_v<std::remove_cv_t<OtherResourceType>> and
+      not shared_resource_cast<OtherResourceType>::value and
+      not std::is_base_of_v<rmm::mr::device_memory_resource,
+                            std::remove_cv_t<OtherResourceType>> and
+      cuda::mr::synchronous_resource<OtherResourceType>>* = nullptr>


This is too verbose and error prone. Can we simplify this a bit? Such as introducing is_plain_synchronous_resource_v<T> and is_plain_async_resource_v<T> helper traits, then migrate to requires clauses.

This is all deleted code, two PRs from now. It's really gross but we won't have it in the final state. See #2325.

cpp/include/rmm/mr/detail/stream_ordered_memory_resource.hpp

cpp/tests/mr/failure_callback_mr_tests.cpp

…urce

…/deallocate

github-project-automation bot added this to RMM Project Board Mar 14, 2026

bdice mentioned this pull request Mar 16, 2026

[FEA] Support memory resources from CCCL 3.2 #2011

Open

39 tasks

bdice force-pushed the remove-dmr-inheritance branch from 032b3a6 to 856395e Compare March 17, 2026 08:16

bdice mentioned this pull request Mar 17, 2026

Migrate Python/Cython bindings from device_memory_resource to device_async_resource_ref #2300

Merged

bdice force-pushed the remove-dmr-inheritance branch from 856395e to a06396a Compare March 17, 2026 23:46

bdice moved this to In Progress in RMM Project Board Mar 18, 2026

bdice self-assigned this Mar 18, 2026

bdice added breaking Breaking change improvement Improvement / enhancement to an existing function labels Mar 18, 2026

bdice marked this pull request as ready for review March 18, 2026 00:01

bdice requested review from a team as code owners March 18, 2026 00:01

bdice requested review from ttnghia and vyasr and removed request for a team March 18, 2026 00:01

bdice commented Mar 18, 2026

View reviewed changes

Remove NullUpstream tests

db2eb7c

bdice commented Mar 18, 2026

View reviewed changes

Remove DEVICE_MEMORY_RESOURCE_VIEW_TEST

afd4cc6

ttnghia reviewed Mar 19, 2026

View reviewed changes

cpp/benchmarks/random_allocations/random_allocations.cpp Outdated Show resolved Hide resolved

bdice mentioned this pull request Mar 19, 2026

Ambiguous get_property overload with CCCL's dynamic_accessibility_property #2322

Closed

ttnghia reviewed Mar 19, 2026

View reviewed changes

cpp/include/rmm/mr/detail/stream_ordered_memory_resource.hpp Outdated Show resolved Hide resolved

bdice requested a review from a team as a code owner March 19, 2026 18:11

bdice force-pushed the remove-dmr-inheritance branch from 563d7d3 to 8385650 Compare March 20, 2026 20:52

bdice mentioned this pull request Mar 20, 2026

Remove bridge infrastructure and device_memory_resource #2324

Open

bdice added 2 commits March 20, 2026 16:08

Use == 0 for unsigned types

e4991dc

Use implicit conversion to any_resource in benchmark factory functions

78f58d0

bdice mentioned this pull request Mar 20, 2026

[FEA] Remove device_memory_resource inheritance from all C++ memory resources #2295

Closed

22 tasks

bdice added 2 commits March 21, 2026 05:48

Remove do_allocate/do_deallocate from stream_ordered_memory_resource

5d0ccdc

Inline allocation logic directly into the CCCL-compatible allocate/deallocate methods and delete the legacy do_* methods, per task 4 of the DMR inheritance removal plan.

bdice commented Mar 23, 2026

View reviewed changes

cpp/tests/mr/failure_callback_mr_tests.cpp Show resolved Hide resolved

bdice commented Mar 23, 2026

View reviewed changes

cpp/tests/mr/host_mr_ref_tests.cpp Outdated Show resolved Hide resolved

mhaseeb123 reviewed Mar 23, 2026

View reviewed changes

mhaseeb123 approved these changes Mar 23, 2026

View reviewed changes

cpp/include/rmm/detail/cccl_adaptors.hpp Show resolved Hide resolved

cpp/include/rmm/detail/cccl_adaptors.hpp Show resolved Hide resolved

cpp/include/rmm/mr/detail/stream_ordered_memory_resource.hpp Show resolved Hide resolved

vuule reviewed Mar 23, 2026

View reviewed changes

vuule approved these changes Mar 23, 2026

View reviewed changes

ttnghia reviewed Mar 23, 2026

View reviewed changes

bdice added 6 commits March 24, 2026 21:37

Remove UnsupportedAlignmentTest tombstone comment

b9788f7

Replace reinterpret_cast<cudaStream_t>(0) with cudaStream_t{nullptr}

9d1bcde

Use RMM_CONSTEXPR_FRIEND for get_property in simulated_memory_resource

867a418

Add static_asserts for all resource concepts to simulated_memory_reso…

86c954c

…urce

Add doxygen to allocate_sync/deallocate_sync for parity with allocate…

14ff384

…/deallocate

Fix missing else in declare_benchmark causing spurious error message

1c8aa4a

bdice merged commit d0f847f into rapidsai:staging Mar 24, 2026
84 of 85 checks passed

github-project-automation bot moved this from In Progress to Done in RMM Project Board Mar 24, 2026

Conversation

bdice commented Mar 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Uh oh!

copy-pr-bot bot commented Mar 14, 2026

Uh oh!

bdice Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bdice Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

bdice Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mhaseeb123 Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bdice Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mhaseeb123 Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

bdice Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

mhaseeb123 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

vuule Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bdice Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

vuule left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ttnghia Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

bdice Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

bdice commented Mar 14, 2026 •

edited

Loading

bdice Mar 18, 2026 •

edited

Loading

bdice Mar 18, 2026 •

edited

Loading

mhaseeb123 Mar 23, 2026 •

edited

Loading

bdice Mar 24, 2026 •

edited

Loading

vuule Mar 23, 2026 •

edited

Loading

bdice Mar 24, 2026 •

edited

Loading