Export mscclpp GpuBuffer to dlpack format #492

Merged: Binyang2014 merged 12 commits into main from binyli/export_dlpack on Apr 3, 2025
@Binyang2014 Binyang2014 commented Apr 1, 2025

To use NVLS, mscclpp requires the buffer to be allocated by mscclpp::GpuBuffer. Because CuPy does not support bfloat16 yet, this PR exports the raw buffer in DLPack format.
Users can use this feature to create a buffer with any data type supported by PyTorch:

buffer = RawGpuBuffer(1024 * 2)  # size in bytes: 1024 elements x 2 bytes per bfloat16 element
dl_pack = buffer.to_dlpack(str(torch.bfloat16))
tensor = torch.utils.dlpack.from_dlpack(dl_pack)
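
The `1024 * 2` above is the byte size of 1024 bfloat16 elements. As a minimal sketch of that arithmetic, the helper below computes the byte count for a given element count and dtype name. The names `ITEM_SIZE` and `buffer_nbytes` are hypothetical and not part of mscclpp; the dtype strings follow what `str(torch.bfloat16)` returns in PyTorch.

```python
# Hypothetical helper, not part of mscclpp: bytes per element for a few
# PyTorch dtype names, keyed by the string that str(torch.<dtype>) returns.
ITEM_SIZE = {
    "torch.int8": 1,
    "torch.float16": 2,
    "torch.bfloat16": 2,
    "torch.float32": 4,
    "torch.float64": 8,
}


def buffer_nbytes(num_elements, dtype_name):
    """Return the number of bytes needed to hold num_elements of dtype_name."""
    if num_elements < 0:
        raise ValueError("num_elements must be non-negative")
    try:
        return num_elements * ITEM_SIZE[dtype_name]
    except KeyError:
        raise ValueError(f"unknown dtype name: {dtype_name}") from None


# 1024 bfloat16 elements need 1024 * 2 bytes, matching the snippet above.
nbytes = buffer_nbytes(1024, "torch.bfloat16")
```

The result could then be passed as the size argument when allocating the raw buffer, assuming the constructor takes a size in bytes as the snippet above suggests.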

@Binyang2014 Binyang2014 marked this pull request as ready for review April 2, 2025 16:30

@chhwang chhwang left a comment

Looks good to me, but we would need to support multi-dimensional allocation directly. If users reshape a 1D buffer to multiple dimensions, it may incur a local copy, which is not intended here.
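
The distinction raised here, between reinterpreting a 1D buffer as multi-dimensional and copying it, can be illustrated with a small sketch. It uses only the Python standard library (`memoryview.cast`) as a stand-in; it does not use mscclpp or PyTorch, and whether a given reshape copies in those libraries depends on the tensor layout and the call used.

```python
# A 12-byte flat buffer standing in for a 1D raw allocation (assumption:
# plain host memory, used here only to show view-vs-copy semantics).
raw = bytearray(12)
flat = memoryview(raw)

# Reinterpret the same bytes as a 3x4 grid. This is a view, not a copy.
grid = flat.cast("B", shape=[3, 4])

# Writing through the original buffer is visible through the 2D view:
# flat index 5 corresponds to row 1, column 1 of a 3x4 layout.
raw[5] = 7
view_sees_write = grid[1, 1] == 7

# An explicit copy is detached from the original buffer.
copied = bytes(flat)
raw[5] = 9
copy_unchanged = copied[5] == 7
view_updated = grid[1, 1] == 9
```

In PyTorch, `Tensor.view` on a contiguous tensor likewise returns a view over the same storage, so a contiguous 1D tensor can usually be given a multi-dimensional shape without copying.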

@Binyang2014 Binyang2014 marked this pull request as draft April 2, 2025 18:25
@Binyang2014 Binyang2014 marked this pull request as ready for review April 3, 2025 16:15

@chhwang chhwang left a comment

LGTM

@Binyang2014 Binyang2014 merged commit adc9ee5 into main Apr 3, 2025
14 checks passed
@Binyang2014 Binyang2014 deleted the binyli/export_dlpack branch April 3, 2025 19:59
@Binyang2014 Binyang2014 mentioned this pull request Apr 3, 2025
6 tasks
2 participants