Commit 0c7edf9

[NVIDIA] Use correct commit type for TMA (#5738)
Follow-up to #5733, which somehow passed CI and was auto-merged even with this bug. TMA should use `bulk.commit.group` rather than `commit.group`.
1 parent 3734709 commit 0c7edf9

2 files changed: 2 additions, 2 deletions

test/Conversion/tma_to_llvm.mlir

Lines changed: 1 addition & 1 deletion
@@ -168,7 +168,7 @@ tt.func @tma_scatter(%arg0: !tt.ptr<i8>, %arg1: tensor<32xi32, #ttg.slice<{dim =
 // CHECK-SAME: (i1 [[PRED]], ptr addrspace(1) %0, i32 %2, i32 {{%[0-9]+}}, i32 {{%[0-9]+}}, i32 {{%[0-9]+}}, i32 {{%[0-9]+}}, ptr addrspace(3) [[PTR]])
 ttng.async_tma_scatter %arg0[%arg1, %arg2] %arg3 : !tt.ptr<i8>, tensor<32xi32, #ttg.slice<{dim = 0, parent = #blocked}>>, i32, !ttg.memdesc<32x128xbf16, #shared1, #smem, mutable>
 
-// CHECK: call void @llvm.nvvm.cp.async.commit.group()
+// CHECK: nvvm.cp.async.bulk.commit.group()
 
 // CHECK-NEXT: ret void
 tt.return

third_party/nvidia/lib/TritonNVIDIAGPUToLLVM/LoadStoreOpToLLVM.cpp

Lines changed: 1 addition & 1 deletion
@@ -1605,7 +1605,7 @@ LogicalResult AsyncTMAScatterOpConversion::matchAndRewrite(
 
 // TODO: Separate the syncronizations operations into separate TTGIR ops to
 // be able to schedule them at the high level.
-rewriter.create<NVVM::CpAsyncCommitGroupOp>(loc);
+rewriter.create<NVVM::CpAsyncBulkCommitGroupOp>(loc);
 
 rewriter.eraseOp(op);
 return success();
