Backport memory resources PRs #2983

Merged
rapids-bot[bot] merged 2 commits into rapidsai:release/26.04 from
achirkin:backport-memory-resources-prs
Mar 18, 2026

@achirkin

Backport PRs that were mistakenly merged into main:

Use `cuda::mr::any_synchronous_resource` for host, pinned, and managed resource types and give the user explicit control for host, pinned, and managed resources.

#### New
  - `raft::resource::managed_memory_resource` and `raft::resource::pinned_memory_resource` are passed to managed and pinned mdarrays during construction via corresponding container policies. This allows the user to replace/modify these resources, for example, to add logging or memory pooling.
  - `raft::mr::get_default_host_resource` and `raft::mr::set_default_host_resource` let the user alter the default host resource in the same way. Unlike the other two, the default host resource is not stored in the `raft::resources` handle, for two reasons:
    1. To mirror the rmm default device resource getter/setter
    2. To avoid breaking the `raft::make_host_mdarray` overloads that do not take `raft::resources` as an argument (many instances across raft and cuvs).
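
The getter/setter pattern described above can be sketched in plain C++ as follows. This is only an illustration of a process-wide default that callers can swap: `host_resource_base`, `malloc_resource`, `counting_resource`, and both free functions are hypothetical stand-ins written for this sketch, not the raft or `cuda::mr` types, and returning the previous resource from the setter is a choice made here rather than something the PR states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>

// Hypothetical stand-in for a host memory resource; NOT the raft or cuda::mr interface.
struct host_resource_base {
  virtual ~host_resource_base() = default;
  virtual void* allocate(std::size_t bytes) = 0;
  virtual void deallocate(void* ptr, std::size_t bytes) noexcept = 0;
};

// Plain malloc/free resource used as the initial default.
struct malloc_resource final : host_resource_base {
  void* allocate(std::size_t bytes) override { return std::malloc(bytes); }
  void deallocate(void* ptr, std::size_t) noexcept override { std::free(ptr); }
};

// Example user wrapper: counts allocations and forwards to an upstream resource.
struct counting_resource final : host_resource_base {
  explicit counting_resource(host_resource_base* upstream) : upstream_(upstream) {}
  void* allocate(std::size_t bytes) override {
    ++allocations;
    return upstream_->allocate(bytes);
  }
  void deallocate(void* ptr, std::size_t bytes) noexcept override {
    upstream_->deallocate(ptr, bytes);
  }
  std::size_t allocations = 0;

 private:
  host_resource_base* upstream_;
};

namespace detail {
inline std::mutex& default_mutex() {
  static std::mutex m;
  return m;
}
// The slot lives outside any handle, so code without a handle can still reach it.
inline host_resource_base*& default_slot() {
  static malloc_resource fallback;
  static host_resource_base* current = &fallback;
  return current;
}
}  // namespace detail

// Returns the current process-wide default (hypothetical name).
inline host_resource_base* get_default_host_resource() {
  std::lock_guard<std::mutex> lock(detail::default_mutex());
  return detail::default_slot();
}

// Installs a new default and returns the previous one so it can be restored.
inline host_resource_base* set_default_host_resource(host_resource_base* resource) {
  assert(resource != nullptr);
  std::lock_guard<std::mutex> lock(detail::default_mutex());
  host_resource_base* previous = detail::default_slot();
  detail::default_slot() = resource;
  return previous;
}
```

In this sketch the caller owns the installed resource and has to keep it alive until the previous default is restored.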

#### Changed

 - Use `raft::mr::host_resource_ref` and `raft::mr::host_device_resource_ref` for the non-owning semantics (defined as `cuda::mr::synchronous_resource_ref` with appropriate access attributes)
 - Use `raft::host_resource` and `raft::host_device_resource` for owning semantics (defined as `cuda::mr::any_synchronous_resource` with appropriate access attributes)
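
To illustrate the difference between the non-owning and owning semantics mentioned above, here is a minimal type-erasure sketch in standard C++. It does not use `cuda::mr`: `resource_ref`, `any_resource`, and `counting_malloc_resource` are hypothetical names for this example, and the real types additionally carry access attributes that are omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical concrete resource that tracks the number of live allocations.
struct counting_malloc_resource {
  std::size_t live = 0;
  void* allocate(std::size_t bytes) {
    ++live;
    return std::malloc(bytes);
  }
  void deallocate(void* ptr, std::size_t) noexcept {
    assert(live > 0);
    --live;
    std::free(ptr);
  }
};

// Non-owning: stores a pointer to the resource plus two type-erased entry points.
// The referenced resource must outlive the reference.
class resource_ref {
 public:
  template <typename R, typename = std::enable_if_t<!std::is_same_v<R, resource_ref>>>
  resource_ref(R& resource)
    : object_(&resource),
      allocate_([](void* obj, std::size_t n) -> void* {
        return static_cast<R*>(obj)->allocate(n);
      }),
      deallocate_([](void* obj, void* p, std::size_t n) {
        static_cast<R*>(obj)->deallocate(p, n);
      }) {}

  void* allocate(std::size_t n) const { return allocate_(object_, n); }
  void deallocate(void* p, std::size_t n) const { deallocate_(object_, p, n); }

 private:
  void* object_;
  void* (*allocate_)(void*, std::size_t);
  void (*deallocate_)(void*, void*, std::size_t);
};

// Owning: moves the resource into heap storage and keeps it alive.
class any_resource {
  struct concept_t {
    virtual ~concept_t() = default;
    virtual void* allocate(std::size_t n) = 0;
    virtual void deallocate(void* p, std::size_t n) = 0;
  };
  template <typename R>
  struct model_t final : concept_t {
    explicit model_t(R r) : resource(std::move(r)) {}
    void* allocate(std::size_t n) override { return resource.allocate(n); }
    void deallocate(void* p, std::size_t n) override { resource.deallocate(p, n); }
    R resource;
  };

 public:
  template <typename R>
  explicit any_resource(R resource)
    : impl_(std::make_unique<model_t<R>>(std::move(resource))) {}

  void* allocate(std::size_t n) { return impl_->allocate(n); }
  void deallocate(void* p, std::size_t n) { impl_->deallocate(p, n); }

 private:
  std::unique_ptr<concept_t> impl_;
};
```

A reference observes the state of the object it points to, while the owning wrapper holds its own independent copy; that is the distinction the two families of types above encode.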

With these changes, raft fully switches to `cuda::mr` types for host and host-device resources, while still using `rmm` types for device async resources. Changing the latter would break a lot of cuVS code and is not needed, because `rmm` will eventually fully converge to `cuda::mr` anyway.

#### Breaking changes
  - Rename container policies
  - Reuse a single `host_container` for the three types of resources
  - Switch from `std::pmr::memory_resource` to `cuda::mr::any_synchronous_resource`

The effect of these changes should be limited, because the policies are hidden behind the mdarray templates and synonyms, and `std::pmr::memory_resource` was introduced only recently and has not been used much.

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Tamas Bela Feher (https://github.com/tfeher)

URL: rapidsai#2968

Detailed tracking of (almost) all allocations on device and host.

```C++
  // optionally pass an existing resource handle
  raft::resources res;

  // The tracking handle is a child of resource handle; it wraps all memory resources with statistics adaptors
  raft::memory_tracking_resources tracked(res, "allocations.csv", std::chrono::milliseconds(1));

  // All allocations are logged to a .csv as long as `tracked` is alive
  cuvs::neighbors::cagra::build(tracked, ...);
```
This produces a CSV file of sampled allocation counters; each row carries a timestamp and the NVTX range that was active at that moment:
```csv
timestamp_us,nvtx_depth,nvtx_range,host_current,host_total,pinned_current,pinned_total,managed_current,managed_total,device_current,device_total,workspace_current,workspace_total,large_workspace_current,large_workspace_total
198809,1,"hnsw::build<ACE>",20008,20008,0,0,0,0,148304,148304,0,0,0,0
199961,1,"hnsw::build<ACE>",20008,20008,0,0,0,0,15588304,15588304,0,0,0,0
201350,1,"hnsw::build<ACE>",0,20008,0,0,0,0,0,40385488,0,0,0,0
222216,3,"cagra::build_knn_graph<IVF-PQ>(5000000, 1536, 72)",1440000000,1440020008,0,0,0,0,0,40385488,0,0,0,0
273892,4,"ivf_pq::build(5000000, 1536)",1440020008,1440040016,0,0,0,0,40385488,80770976,0,0,0,0
304183,4,"ivf_pq::build(5000000, 1536)",1440020008,1440040016,0,0,0,0,40385488,80770976,0,0,4388567040,4388567040
309064,4,"ivf_pq::build(5000000, 1536)",1440020008,1440040016,0,0,0,0,53860384,94245872,0,0,4388567040,4388567040
334655,4,"ivf_pq::build(5000000, 1536)",1440020008,1440040016,0,0,0,0,67339295,107724783,0,0,4388567040,4388567040
385037,4,"ivf_pq::build(5000000, 1536)",1440020008,1440040016,0,0,0,0,74076743,114462231,0,0,4388567040,4388567040
386129,4,"ivf_pq::build(5000000, 1536)",1440020008,1440040016,0,0,0,0,80814199,121199687,0,0,4388567040,4388567040
402750,4,"ivf_pq::build(5000000, 1536)",1440020008,1440040016,0,0,0,0,46099768,126913967,0,0,4388567040,4388567040
...
```
This can later be visualized (the visualization script is not included in the PR):
<img width="2100" height="1350" alt="allocations" src="https://github.com/user-attachments/assets/3f0ab942-b49b-4e09-a0ea-9181725ae05e" />
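
Since the visualization script is not part of the PR, here is a small sketch of how such a file could be post-processed, assuming the column layout shown above. `split_csv_line` and `peak_of_column` are hypothetical helpers written for this example. The only subtlety is that the `nvtx_range` field is quoted and may itself contain commas, so a plain split on `,` would misalign the columns. Escaped quotes inside a field are not handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Splits one CSV line into fields; commas inside double quotes do not split.
// The surrounding quotes are dropped from the returned field.
std::vector<std::string> split_csv_line(const std::string& line) {
  std::vector<std::string> fields;
  std::string current;
  bool in_quotes = false;
  for (char c : line) {
    if (c == '"') {
      in_quotes = !in_quotes;
    } else if (c == ',' && !in_quotes) {
      fields.push_back(current);
      current.clear();
    } else {
      current.push_back(c);
    }
  }
  fields.push_back(current);
  return fields;
}

// Returns the maximum value of a numeric column, or 0 if the column is missing.
// Rows whose field count differs from the header (for example a "..." line) are skipped.
unsigned long long peak_of_column(std::istream& in, const std::string& column) {
  std::string line;
  if (!std::getline(in, line)) { return 0; }
  const std::vector<std::string> header = split_csv_line(line);
  const auto it = std::find(header.begin(), header.end(), column);
  if (it == header.end()) { return 0; }
  const std::size_t index = static_cast<std::size_t>(it - header.begin());

  unsigned long long peak = 0;
  while (std::getline(in, line)) {
    const std::vector<std::string> fields = split_csv_line(line);
    if (fields.size() != header.size()) { continue; }
    assert(index < fields.size());
    peak = std::max(peak, std::stoull(fields[index]));
  }
  return peak;
}
```

For example, calling `peak_of_column` on a file stream with `"device_current"` would give the peak number of live device bytes among the sampled rows.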

#### Implementation overview

##### NVTX

Added thread-local tracking of the NVTX range stack; the calling thread shares a handle to its stack with the sampling thread, so that the sampler can correlate the NVTX range state with allocations.
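
A minimal sketch of this idea in standard C++, without the NVTX library itself: each thread lazily creates its own range stack behind a `std::shared_ptr`, and a copy of that pointer is the handle another thread can use to read the current depth and innermost range. All names here (`range_stack`, `range_snapshot`, `current_thread_ranges`, `scoped_range`) are hypothetical, and the mutex-based synchronization is an assumption of this example, not necessarily what the PR does.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// What a sampler sees at one point in time.
struct range_snapshot {
  std::size_t depth;
  std::string innermost;
};

// Stack of active range names for one thread; safe to read from another thread.
class range_stack {
 public:
  void push(std::string name) {
    std::lock_guard<std::mutex> lock(mutex_);
    names_.push_back(std::move(name));
  }
  void pop() {
    std::lock_guard<std::mutex> lock(mutex_);
    assert(!names_.empty());
    names_.pop_back();
  }
  range_snapshot snapshot() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return {names_.size(), names_.empty() ? std::string{} : names_.back()};
  }

 private:
  mutable std::mutex mutex_;
  std::vector<std::string> names_;
};

// Returns a shared handle to the calling thread's stack. The handle can be
// passed to a sampling thread and stays valid even if this thread exits.
inline std::shared_ptr<range_stack> current_thread_ranges() {
  thread_local std::shared_ptr<range_stack> stack = std::make_shared<range_stack>();
  return stack;
}

// RAII range: pushes on construction, pops on destruction.
class scoped_range {
 public:
  explicit scoped_range(std::string name) : stack_(current_thread_ranges()) {
    stack_->push(std::move(name));
  }
  ~scoped_range() { stack_->pop(); }
  scoped_range(const scoped_range&) = delete;
  scoped_range& operator=(const scoped_range&) = delete;

 private:
  std::shared_ptr<range_stack> stack_;
};
```

The depth and innermost name returned by `snapshot()` correspond to the kind of information stored in the `nvtx_depth` and `nvtx_range` columns of the CSV.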

##### Memory resource adaptors

- statistics adaptor: atomically counts allocations/deallocations for any `cuda::mr`-compatible resource
- notifying adaptor: sets a shared "notifier" state on each event
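
The two adaptors can be sketched as thin wrappers around an upstream resource. This example uses a hypothetical duck-typed interface (`allocate(bytes)` / `deallocate(ptr, bytes)`) rather than the actual `cuda::mr` concepts, and the names `resource_stats`, `statistics_adaptor`, and `notifying_adaptor` are illustrative. The `current`/`total` split follows a reading of the CSV above, where the "current" columns drop on deallocation and the "total" columns only grow.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <utility>

// Hypothetical upstream resource for the example.
struct malloc_resource {
  void* allocate(std::size_t bytes) { return std::malloc(bytes); }
  void deallocate(void* ptr, std::size_t) noexcept { std::free(ptr); }
};

// Counters shared between the adaptor and whoever samples them.
struct resource_stats {
  std::atomic<std::size_t> current_bytes{0};  // live bytes
  std::atomic<std::size_t> total_bytes{0};    // cumulative bytes ever allocated
};

// Counts bytes atomically and forwards every call to the upstream resource.
template <typename Upstream>
class statistics_adaptor {
 public:
  statistics_adaptor(Upstream upstream, std::shared_ptr<resource_stats> stats)
    : upstream_(std::move(upstream)), stats_(std::move(stats)) {
    assert(stats_ != nullptr);
  }
  void* allocate(std::size_t bytes) {
    void* ptr = upstream_.allocate(bytes);
    stats_->current_bytes.fetch_add(bytes, std::memory_order_relaxed);
    stats_->total_bytes.fetch_add(bytes, std::memory_order_relaxed);
    return ptr;
  }
  void deallocate(void* ptr, std::size_t bytes) noexcept {
    upstream_.deallocate(ptr, bytes);
    stats_->current_bytes.fetch_sub(bytes, std::memory_order_relaxed);
  }

 private:
  Upstream upstream_;
  std::shared_ptr<resource_stats> stats_;
};

// Raises a shared flag on every allocation or deallocation event.
template <typename Upstream>
class notifying_adaptor {
 public:
  notifying_adaptor(Upstream upstream, std::shared_ptr<std::atomic<bool>> notifier)
    : upstream_(std::move(upstream)), notifier_(std::move(notifier)) {
    assert(notifier_ != nullptr);
  }
  void* allocate(std::size_t bytes) {
    void* ptr = upstream_.allocate(bytes);
    notifier_->store(true, std::memory_order_release);
    return ptr;
  }
  void deallocate(void* ptr, std::size_t bytes) noexcept {
    upstream_.deallocate(ptr, bytes);
    notifier_->store(true, std::memory_order_release);
  }

 private:
  Upstream upstream_;
  std::shared_ptr<std::atomic<bool>> notifier_;
};
```

Stacking them as `notifying_adaptor<statistics_adaptor<malloc_resource>>` updates the counters first and raises the flag afterwards, so a sampler that sees the flag also sees the updated counters.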

##### Resource monitor

A resource monitor registers a collection of resource statistics objects, a single NVTX range handle, and a single notifier state. It spawns a new thread to sample the resource statistics at a given rate (but only when the notifier is triggered). This thread writes to a CSV output stream.
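
A rough sketch of such a monitor in standard C++ follows. It is only an illustration under simplifying assumptions: `resource_monitor` and `resource_stats` are hypothetical names, the NVTX columns are left out, and all resources are expected to be registered before `start()` is called. The key point is that `sample_once()` writes a row only if the notifier flag was raised since the previous sample, so idle periods produce no output.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <utility>
#include <vector>

struct resource_stats {
  std::atomic<std::size_t> current_bytes{0};
  std::atomic<std::size_t> total_bytes{0};
};

class resource_monitor {
 public:
  using clock = std::chrono::steady_clock;

  resource_monitor(std::ostream& out,
                   std::chrono::milliseconds interval,
                   std::shared_ptr<std::atomic<bool>> notifier)
    : out_(out), interval_(interval), notifier_(std::move(notifier)), start_(clock::now()) {
    assert(notifier_ != nullptr);
  }
  resource_monitor(const resource_monitor&) = delete;
  resource_monitor& operator=(const resource_monitor&) = delete;
  ~resource_monitor() { stop(); }

  // Register every resource before calling start(); not thread-safe afterwards.
  void register_resource(std::string name, std::shared_ptr<resource_stats> stats) {
    sources_.emplace_back(std::move(name), std::move(stats));
  }

  void write_header() {
    out_ << "timestamp_us";
    for (const auto& source : sources_) {
      out_ << ',' << source.first << "_current," << source.first << "_total";
    }
    out_ << '\n';
  }

  // Writes one row if an event happened since the last sample; returns whether it did.
  bool sample_once() {
    if (!notifier_->exchange(false, std::memory_order_acq_rel)) { return false; }
    const auto elapsed =
      std::chrono::duration_cast<std::chrono::microseconds>(clock::now() - start_);
    out_ << elapsed.count();
    for (const auto& source : sources_) {
      out_ << ',' << source.second->current_bytes.load() << ','
           << source.second->total_bytes.load();
    }
    out_ << '\n';
    return true;
  }

  // Spawns the sampling thread, which polls at the configured interval.
  void start() {
    if (worker_.joinable()) { return; }
    write_header();
    running_.store(true);
    worker_ = std::thread([this] {
      while (running_.load()) {
        sample_once();
        std::this_thread::sleep_for(interval_);
      }
      sample_once();  // pick up an event that arrived just before stop()
    });
  }

  void stop() {
    running_.store(false);
    if (worker_.joinable()) { worker_.join(); }
  }

 private:
  std::ostream& out_;
  std::chrono::milliseconds interval_;
  std::shared_ptr<std::atomic<bool>> notifier_;
  clock::time_point start_;
  std::vector<std::pair<std::string, std::shared_ptr<resource_stats>>> sources_;
  std::atomic<bool> running_{false};
  std::thread worker_;
};
```

Only the sampling thread writes to the stream in this design, so the output itself needs no locking; the allocating threads touch nothing but the atomic counters and the flag.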

##### Memory tracking resources

`raft::memory_tracking_resources` is a child of `raft::resources` and can therefore be used as a drop-in replacement. It replaces all known memory resources for the duration of its lifetime and manages the output file or stream if necessary.


Depends on (and includes all changes of) rapidsai#2968

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)

URL: rapidsai#2973

@achirkin self-assigned this on Mar 18, 2026
@achirkin requested review from a team as code owners on March 18, 2026 12:11
@achirkin added the enhancement label on Mar 18, 2026
@achirkin added the feature request label and removed the enhancement and 5 - Ready to Merge labels on Mar 18, 2026
@achirkin

/merge

@rapids-bot merged commit 7b03b3f into rapidsai:release/26.04 on Mar 18, 2026
157 of 159 checks passed
Labels: feature request (New feature or request), non-breaking (Non-breaking change)
