Delete cccl_adaptors.hpp and use raw CCCL resource_ref types #2325

Draft
bdice wants to merge 4 commits into rapidsai:staging from bdice:delete-cccl-adaptors

Conversation

bdice (Collaborator) commented Mar 20, 2026

Summary

  • Add SFINAE constraints to the 1-arg constructors of all memory resources to work around the CCCL #8037 recursive constraint cycle (49 files)
  • Change polymorphic_allocator, thrust_allocator, and device_check_resource_adaptor to store any_resource members instead of non-owning resource_ref
  • Delete cccl_adaptors.hpp and replace RMM's resource_ref type aliases with direct CCCL types
  • Suppress nvcc host/device diagnostic in multi_stream_allocations_bench.cu

Closes #2323
Part of #2011

copy-pr-bot bot commented Mar 20, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

rapids-bot bot pushed a commit that referenced this pull request Mar 23, 2026
…#2328)

Replace the hand-rolled friend `get_property` templates in `cccl_resource_ref` and `cccl_async_resource_ref` with inheritance from `cuda::forward_property`. This delegates property forwarding to CCCL's own machinery, which correctly handles `dynamic_accessibility_property` (NVIDIA/cccl#7727) and any future properties without ambiguity.

Each wrapper now exposes `upstream_resource()` returning the inner `ResourceType`, as required by `forward_property` for stateful properties.

Tests add minimal `forward_property` adaptors using RMM resource refs as upstream, exercising the exact scenario that causes the ambiguity.
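
As a rough illustration of the forwarding pattern described above, here is a standalone C++ sketch built from toy types. It does not use CCCL: `toy_forward_property`, `toy_upstream`, `toy_wrapper`, and `toy_alignment_property` are hypothetical names invented for this example, and the real `cuda::forward_property` may differ in detail (for instance, stateless properties are omitted here).

```cpp
#include <cassert>
#include <utility>

// Hypothetical stateful property tag (not a CCCL property).
struct toy_alignment_property {};

// Hypothetical upstream resource that answers the property query itself.
struct toy_upstream {
  int alignment{256};

  friend int get_property(toy_upstream const& res, toy_alignment_property)
  {
    return res.alignment;
  }
};

// Toy CRTP base: forwards a property query on Derived to whatever
// Derived::upstream_resource() returns. The trailing return type removes the
// overload when Upstream does not support the queried property.
template <typename Derived, typename Upstream>
struct toy_forward_property {
  template <typename Property>
  friend auto get_property(Derived const& self, Property prop)
    -> decltype(get_property(std::declval<Upstream const&>(), prop))
  {
    return get_property(self.upstream_resource(), prop);
  }
};

// Wrapper that inherits the forwarding instead of hand-writing get_property.
class toy_wrapper : public toy_forward_property<toy_wrapper, toy_upstream> {
 public:
  explicit toy_wrapper(toy_upstream upstream) : upstream_{upstream} {}

  // Accessor the forwarding base relies on.
  toy_upstream const& upstream_resource() const { return upstream_; }

 private:
  toy_upstream upstream_;
};
```

With this shape, `get_property(toy_wrapper{toy_upstream{512}}, toy_alignment_property{})` is answered by the upstream, and the wrapper itself declares no property overloads that could become ambiguous.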

Note: this is a temporary solution for the `main` branch -- resolving #2323 / #2325 will remove this code on the `staging` branch while I continue working on CCCL MR migrations.

Closes #2322.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Rong Ou (https://github.com/rongou)

URL: #2328

bdice moved this to In Progress in RMM Project Board Mar 23, 2026
bdice added the breaking (Breaking change) and improvement (Improvement / enhancement to an existing function) labels Mar 23, 2026
bdice self-assigned this Mar 23, 2026
bdice force-pushed the delete-cccl-adaptors branch from 8a3dfa6 to 0f59bf7 March 24, 2026 23:42
bdice added 4 commits March 25, 2026 16:50
…onstraint cycle

Replace device_async_resource_ref constructor parameters with
cuda::mr::any_resource<device_accessible> across all adaptor impl
classes. Add template constructors (constrained with
!is_same_v<decay_t<T>, AdaptorType>) to public adaptor headers for
single-arg-capable constructors, breaking the recursive
is_constructible cycle that CCCL #8037 causes. Multi-arg constructors
that cannot be confused with copy/move use plain any_resource params
with out-of-line definitions.

Update Python/Cython bindings with any_device_resource type alias and
_to_any_resource() wrapper at all call sites to work around Cython's
inability to call C++ template constructors directly.
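
The constrained-constructor pattern from this commit message can be sketched in plain C++ as follows. This is a minimal illustration under stated assumptions, not RMM's code: `toy_upstream`, `toy_any_resource`, and `toy_adaptor` are hypothetical stand-ins, and the sketch shows only the general overload-resolution effect of the `!is_same<decay_t<T>, AdaptorType>` constraint, not the exact CCCL #8037 cycle.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical upstream resource.
struct toy_upstream {
  int id{0};
};

// Hypothetical stand-in for an owning, type-erased resource such as
// cuda::mr::any_resource<device_accessible>. This is NOT the CCCL type.
struct toy_any_resource {
  toy_any_resource(toy_upstream const& upstream) : id{upstream.id} {}
  int id;
};

class toy_adaptor {
 public:
  // Single-argument template constructor. The constraint removes it from
  // overload resolution when T decays to toy_adaptor, so copy and move
  // construction never try to convert an adaptor into toy_any_resource.
  // (std::is_same<...>::value is the C++14 spelling of std::is_same_v.)
  template <typename T,
            std::enable_if_t<!std::is_same<std::decay_t<T>, toy_adaptor>::value, int> = 0>
  explicit toy_adaptor(T&& upstream) : upstream_(std::forward<T>(upstream))
  {
  }

  // User-written copy constructor so the example can observe which
  // constructor was selected.
  toy_adaptor(toy_adaptor const& other) : upstream_(other.upstream_), copied_{true} {}

  int upstream_id() const { return upstream_.id; }
  bool was_copied() const { return copied_; }

 private:
  toy_any_resource upstream_;
  bool copied_{false};
};

static_assert(std::is_constructible<toy_adaptor, toy_upstream&>::value,
              "the template constructor accepts an upstream resource");
static_assert(std::is_copy_constructible<toy_adaptor>::value,
              "copying still goes through the copy constructor");
```

Without the constraint, copying from a non-const `toy_adaptor` lvalue would select the template (an exact match on `toy_adaptor&`) and then fail to build a `toy_any_resource` from an adaptor; with it, the copy constructor is chosen.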

…urce_adaptor to store any_resource members

Replace device_async_resource_ref members with
cuda::mr::any_resource<device_accessible> in polymorphic_allocator,
thrust_allocator, and device_check_resource_adaptor. This eliminates
the CCCL #8037 recursive constraint cycle for these classes.

polymorphic_allocator uses a template constructor with SFINAE
(is_polymorphic_allocator_v) because it is a class template with a
1-arg constructor, matching the pattern used by the adaptor classes.
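
For a class template, a plain `is_same` check against one type is not enough, so a trait that matches every specialization is the natural constraint. The sketch below shows one way such a trait could look. It is an assumption-laden illustration: `toy_polymorphic_allocator`, `is_toy_polymorphic_allocator`, `toy_any_resource`, and `toy_upstream` are hypothetical names, and RMM's actual `is_polymorphic_allocator_v` may be defined differently.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical upstream resource.
struct toy_upstream {
  int id{0};
};

// Hypothetical stand-in for an owning, type-erased resource (not CCCL's).
struct toy_any_resource {
  toy_any_resource(toy_upstream const& upstream) : id{upstream.id} {}
  int id;
};

template <typename T>
class toy_polymorphic_allocator;

// Trait that is true for every specialization of the allocator template.
template <typename T>
struct is_toy_polymorphic_allocator : std::false_type {};

template <typename T>
struct is_toy_polymorphic_allocator<toy_polymorphic_allocator<T>> : std::true_type {};

template <typename T>
class toy_polymorphic_allocator {
 public:
  using value_type = T;

  // 1-arg resource constructor, disabled for any allocator specialization so
  // it cannot compete with copy or rebinding construction.
  template <typename Resource,
            std::enable_if_t<!is_toy_polymorphic_allocator<std::decay_t<Resource>>::value,
                             int> = 0>
  toy_polymorphic_allocator(Resource&& resource) : resource_(std::forward<Resource>(resource))
  {
  }

  // Rebinding constructor: allocator<U> -> allocator<T> shares the resource.
  template <typename U>
  toy_polymorphic_allocator(toy_polymorphic_allocator<U> const& other)
    : resource_(other.resource())
  {
  }

  toy_any_resource const& resource() const { return resource_; }

 private:
  toy_any_resource resource_;  // stored by value, mirroring an owning member
};
```

Constructing `toy_polymorphic_allocator<double>` from a `toy_polymorphic_allocator<int>` then resolves to the rebinding constructor rather than the resource constructor.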

Replace RMM's wrapper types (cccl_resource_ref, cccl_async_resource_ref)
with direct aliases to CCCL's resource_ref and synchronous_resource_ref.
This eliminates the 469-line adaptor layer that was originally needed to
work around shared_resource type-erasure issues.

The wrapper was no longer needed once the CCCL #8037 recursive constraint
cycle was broken via template SFINAE constructors (previous commit).

Additional changes required for compilation without the wrapper:
- per_device_resource: static_cast<any_device_resource>(ref) replaced
  with any_device_resource{ref} (wrapper had operator any_resource)
- cuda_async_memory_resource, cuda_async_managed_memory_resource,
  sam_headroom_memory_resource: copy/move changed from = delete to
  = default (CCCL resource_ref requires copyability; shared_resource
  base already provides correct reference-counted semantics)
- device_check_resource_adaptor (test): template SFINAE constructor
  to break the same CCCL #8037 cycle
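
To see why defaulted copy/move can be safe for a resource whose state is reference-counted, consider this standalone sketch. It uses `std::shared_ptr` as a stand-in for the shared state; `toy_shared_resource` and `toy_pool_state` are hypothetical names, and nothing here reflects how CCCL's `shared_resource` is actually implemented.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical state that every copy of the resource should share.
struct toy_pool_state {
  int allocations{0};
};

class toy_shared_resource {
 public:
  toy_shared_resource() : state_{std::make_shared<toy_pool_state>()} {}

  // Defaulted copy/move only copy or move the shared_ptr, so all copies
  // refer to the same state and the state is released with the last copy.
  toy_shared_resource(toy_shared_resource const&)            = default;
  toy_shared_resource(toy_shared_resource&&)                 = default;
  toy_shared_resource& operator=(toy_shared_resource const&) = default;
  toy_shared_resource& operator=(toy_shared_resource&&)      = default;

  void record_allocation() { ++state_->allocations; }
  int allocations() const { return state_->allocations; }
  long owners() const { return state_.use_count(); }

  // Two handles compare equal when they share the same state.
  friend bool operator==(toy_shared_resource const& lhs, toy_shared_resource const& rhs)
  {
    return lhs.state_ == rhs.state_;
  }

 private:
  std::shared_ptr<toy_pool_state> state_;
};

static_assert(std::is_copy_constructible<toy_shared_resource>::value,
              "a copyable handle can be stored where copyability is required");
```

An allocation recorded through a copy is visible through the original, which is the property that makes `= default` a reasonable choice in this toy model.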

…hmark

The any_resource internals use __host__ __device__ functions that call
shared_resource copy/move constructors which are __host__ only. This is
safe because the benchmark factory functions are only called from host code.

bdice force-pushed the delete-cccl-adaptors branch from 0f59bf7 to 423eeac March 25, 2026 16:50

bdice (Collaborator, Author) commented Mar 25, 2026

I'm going to break this up into smaller PRs. I'm not 100% convinced all the changes are still necessary.

Labels

breaking (Breaking change), improvement (Improvement / enhancement to an existing function)

Projects

Status: In Progress
