This patch adds a new variant of the TMA Bulk Copy
intrinsics, introduced in sm100+. This variant
takes an additional byte_mask operand that selects
which bytes are copied.
* Instruction selection is now done entirely through
tablegen, so this patch removes the corresponding
SelectCpAsyncBulkS2G() function.
* This patch also removes the NoCapture attribute from
the base intrinsics, which do not take a byte_mask.
* lit tests are verified with a cuda-12.8 ptxas
executable.
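
For readers unfamiliar with the byte_mask operand, here is a rough host-side
C++ sketch of how the selection can be pictured. It is not the intrinsic or
its lowering. It assumes, based on the PTX description of the masked bulk
copy, that the mask is 16 bits wide and that bit i controls byte i of every
16-byte chunk of the source. The function name masked_bulk_copy is
hypothetical and used only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side emulation for illustration only (hypothetical helper, not part
// of the patch). Assumption: bit i of the 16-bit byte_mask gates byte i of
// each 16-byte chunk; bytes whose bit is clear are left untouched in dst.
void masked_bulk_copy(std::uint8_t *dst, const std::uint8_t *src,
                      std::size_t size, std::uint16_t byte_mask) {
  for (std::size_t i = 0; i < size; ++i) {
    if ((byte_mask >> (i % 16)) & 1u)
      dst[i] = src[i];
  }
}
```

Under that assumption, a mask of 0x000F copies only the first four bytes of
each 16-byte chunk, and a mask of 0xFFFF behaves like the unmasked copy.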
Signed-off-by: Durgadoss R <[email protected]>