[neural.slang] Enable __getStructuredBufferPtr for CUDA #10167

@kaizhangNV

Problem Description

This feature request is motivated by the need for a more flexible way to call

__atomic_reduce_add(__ref T dst, T value)

This intrinsic requires the buffer's element type to be the same as the value type. So when the buffer's element type is just T but the value type is vector<T, 2>, we basically have no good option for calling it. This happens in neural.slang, where the optimized code path calls the half2 version of __atomic_reduce_add but we can't change the buffer to StructuredBuffer<half2>. In this case, we need a way to cast the pointer of the buffer.
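To make the requested pattern concrete, here is a minimal CUDA-level sketch of what the pointer cast would enable on that target. It is not the neural.slang code or the code Slang emits: the kernel name `accumulatePairs`, the helper `runAccumulateDemo`, and the buffer sizes are hypothetical. It assumes a GPU of compute capability 6.0 or newer (for example, compile with `nvcc -arch=sm_60`), an even element count, and a 4-byte-aligned buffer, which `cudaMalloc` provides.

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

// Hypothetical kernel: `buffer` is logically a buffer of half, but each
// thread accumulates into two adjacent elements with one half2 atomic.
__global__ void accumulatePairs(__half* buffer, int numPairs)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;

    // The cast this issue asks for: view the half buffer as half2.
    // Assumes the buffer is 4-byte aligned and holds an even element count.
    __half2* pairs = reinterpret_cast<__half2*>(buffer);

    __half2 value = __floats2half2_rn(1.0f, 2.0f);

    // half2 overload of atomicAdd (compute capability 6.0+). Each half is
    // updated atomically; the 32-bit pair is not guaranteed to be updated
    // as one atomic access.
    atomicAdd(&pairs[tid % numPairs], value);
}

// Hypothetical host helper: runs the kernel and returns the buffer as floats.
// Error checking is omitted to keep the sketch short.
std::vector<float> runAccumulateDemo()
{
    const int numElements = 8;              // must be even
    const int numPairs = numElements / 2;

    __half* dBuffer = nullptr;
    cudaMalloc(&dBuffer, numElements * sizeof(__half));
    cudaMemset(dBuffer, 0, numElements * sizeof(__half));  // all-zero bits == 0.0h

    // 256 threads spread over 4 pairs, so 64 atomic adds land on each pair.
    accumulatePairs<<<1, 256>>>(dBuffer, numPairs);
    cudaDeviceSynchronize();

    __half hBuffer[numElements];
    cudaMemcpy(hBuffer, dBuffer, sizeof(hBuffer), cudaMemcpyDeviceToHost);
    cudaFree(dBuffer);

    std::vector<float> result(numElements);
    for (int i = 0; i < numElements; ++i)
        result[i] = __half2float(hBuffer[i]);
    return result;
}

int main()
{
    std::vector<float> result = runAccumulateDemo();
    for (size_t i = 0; i < result.size(); ++i)
        printf("buffer[%zu] = %f\n", i, result[i]);
    return 0;
}
```

The buffer keeps its `half` element type everywhere else. Only the atomic update sees it as `half2`, which is the flexibility the request is after.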
