UCT/CUDA/CUDA_IPC: Unmap memhandle mapped at rkey unpack & rkey ptr.#11288

Open
rakhmets wants to merge 1 commit into openucx:master from rakhmets:topic/uct-cuda-ipc-destroy-rkey

Contributor

@rakhmets rakhmets commented Mar 23, 2026

What?

Unmap the CUDA IPC memory handle that was mapped when unpacking the remote memory key and/or when getting a local pointer to remote memory.

Why?

The CUDA IPC component performs a peer reachability check when unpacking the remote memory key. On the first reachability check, it opens an inter-process memory handle exported from another process. The component may also open such a handle when getting a local pointer to remote memory. If the handle is not closed, the memory on the remote side cannot be released.
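
To make the lifetime problem concrete, here is a minimal sketch in plain C. It is a toy model, not UCX or CUDA code: `demo_export_t`, `demo_open_handle`, `demo_close_handle` and `demo_try_release` are hypothetical names, and the rule that release fails while a handle is open is taken from the description above rather than from the CUDA runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one exported device allocation.
 * This is not a UCX or CUDA type. */
typedef struct {
    bool   allocated;  /* exporter still owns the memory */
    size_t open_count; /* importer handles that are currently open */
} demo_export_t;

/* Importer side: open the exported handle. Stands in for the mapping done
 * at the first reachability check or when getting a local pointer. */
static void demo_open_handle(demo_export_t *exp)
{
    exp->open_count++;
}

/* Importer side: close a handle that was opened earlier. */
static void demo_close_handle(demo_export_t *exp)
{
    assert(exp->open_count > 0);
    exp->open_count--;
}

/* Exporter side: try to release the memory. Modeled as failing while any
 * importer handle is still open. Returns true if the memory was released. */
static bool demo_try_release(demo_export_t *exp)
{
    if (exp->open_count > 0) {
        return false;
    }

    exp->allocated = false;
    return true;
}
```

In this model, an open without a matching close keeps `demo_try_release` returning false indefinitely, which is the situation this change is meant to avoid.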

@rakhmets rakhmets force-pushed the topic/uct-cuda-ipc-destroy-rkey branch 2 times, most recently from 93e5320 to 373daae Compare March 23, 2026 15:32
extended_rkey = &unpacked_rkey->super;
rkey = &extended_rkey->super;
rkey_release_args = (uct_cuda_ipc_rkey_release_args_h)args;
status = uct_cuda_ipc_unmap_memhandle(rkey->pid, extended_rkey->pid_ns,
Contributor

So does this mean the rkey will not be cached? It seems this could hurt RNDV protocol performance.

Contributor Author

@rakhmets rakhmets Mar 24, 2026

Every uct_cuda_ipc_map_memhandle call should be paired with a uct_cuda_ipc_unmap_memhandle call.
The one in uct_cuda_ipc_rkey_unpack has no matching unmap. This PR fixes that.

I have just updated the PR so that UCX_CUDA_IPC_CACHE is taken into account in this path.
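
The pairing rule can be sketched as follows. This is a simplified model, not the code in this PR: every `demo_*` name is hypothetical, and the way the cache flag changes the unmap behavior is an assumption made for illustration (an idle mapping stays open when caching is on, and is closed on the last unmap when caching is off).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mapped remote handle; not the UCX structure. */
typedef struct {
    size_t refcount; /* map calls that have not been unmapped yet */
    bool   is_open;  /* whether the underlying IPC handle is open */
} demo_mapping_t;

/* Hypothetical stand-in for an unpacked rkey that remembers its mapping. */
typedef struct {
    demo_mapping_t *mapping;
} demo_rkey_t;

/* Take one reference on the mapping and make sure the handle is open. */
static void demo_map(demo_mapping_t *m)
{
    m->is_open = true;
    m->refcount++;
}

/* Drop one reference. Assumption: with the cache enabled an idle mapping
 * stays open for reuse; with it disabled the last unmap closes the handle. */
static void demo_unmap(demo_mapping_t *m, bool cache_enabled)
{
    assert(m->refcount > 0);
    m->refcount--;
    if ((m->refcount == 0) && !cache_enabled) {
        m->is_open = false;
    }
}

/* Unpack maps the handle and records it in the rkey. */
static void demo_rkey_unpack(demo_rkey_t *rkey, demo_mapping_t *m)
{
    demo_map(m);
    rkey->mapping = m;
}

/* Release performs the unmap that pairs with the map done at unpack. */
static void demo_rkey_release(demo_rkey_t *rkey, bool cache_enabled)
{
    demo_unmap(rkey->mapping, cache_enabled);
    rkey->mapping = NULL;
}
```

With this shape, each unpack is balanced by exactly one release, so the reference count returns to zero whether or not caching is enabled. Only the decision to keep the handle open depends on the flag.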

@rakhmets rakhmets requested a review from yosefe March 24, 2026 15:40
@rakhmets rakhmets marked this pull request as draft March 24, 2026 17:30
Contributor Author

rakhmets commented Mar 24, 2026

Converted to draft to prevent merging before #11292 to avoid merge conflicts in #11292.

@rakhmets rakhmets force-pushed the topic/uct-cuda-ipc-destroy-rkey branch 7 times, most recently from 90d3956 to 3dffc60 Compare March 31, 2026 11:29
@rakhmets rakhmets marked this pull request as ready for review March 31, 2026 11:35
@rakhmets rakhmets changed the title from "UCT/CUDA/CUDA_IPC: Unmap memhandle mapped at rkey unpack." to "UCT/CUDA/CUDA_IPC: Unmap memhandle mapped at rkey unpack & rkey ptr." Mar 31, 2026
Contributor Author

Converted to draft to prevent merging before #11292 to avoid merge conflicts in #11292.

Rebased and updated to cover the ucp_rkey_ptr flow.

@rakhmets rakhmets force-pushed the topic/uct-cuda-ipc-destroy-rkey branch from 3dffc60 to dcb2923 Compare March 31, 2026 14:17
@rakhmets rakhmets force-pushed the topic/uct-cuda-ipc-destroy-rkey branch from dcb2923 to 01eca31 Compare March 31, 2026 14:20