Commit 477c0b6

[mlir][affine][gpu] Replace DivSIOp to CeilDivSIOp when lowering to GPU launch (#73328)
When converting affine.for to a GPU launch operation, we have to calculate the block and thread dimensions for the launch. The dimension size is (upper_bound - lower_bound) / step_size. When the difference is not divisible by step_size, DivSIOp rounds the quotient toward zero. However, the block and thread dimensions describe right-open ranges, i.e., [0, block_dim) and [0, thread_dim), so for a positive range and step the truncated quotient is one too small and the last loop iteration is never launched. This patch replaces DivSIOp with CeilDivSIOp, which rounds up, to get the correct block and thread dimension values.
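The arithmetic behind the message can be checked with a small standalone sketch. It is not part of the patch: the helper names `truncDiv`, `ceilDivSigned`, `countIterations`, and `checkWorkedExample` are hypothetical, and plain `int64_t` arithmetic stands in for the MLIR ops. It assumes a positive step, as in the loops this conversion handles.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: signed division rounding toward zero. This is what
// C++ operator/ does and how the commit message describes arith::DivSIOp.
constexpr int64_t truncDiv(int64_t a, int64_t b) { return a / b; }

// Hypothetical helper: signed division rounding toward positive infinity,
// modeling the rounding the patch relies on from arith::CeilDivSIOp.
constexpr int64_t ceilDivSigned(int64_t a, int64_t b) {
  const int64_t q = a / b;  // truncated quotient
  const int64_t r = a % b;  // remainder, same sign as a (or zero)
  // Round up only when the division is inexact and the exact quotient is
  // positive, i.e. the remainder and the divisor have the same sign.
  return (r != 0 && ((r > 0) == (b > 0))) ? q + 1 : q;
}

// Reference count: iterations of `for (i = lb; i < ub; i += step)`.
// Assumes step > 0.
inline int64_t countIterations(int64_t lb, int64_t ub, int64_t step) {
  int64_t n = 0;
  for (int64_t i = lb; i < ub; i += step)
    ++n;
  return n;
}

// Worked example: lb = 0, ub = 10, step = 3 visits i = 0, 3, 6, 9.
inline void checkWorkedExample() {
  assert(countIterations(0, 10, 3) == 4);
  assert(truncDiv(10 - 0, 3) == 3);       // one launch index too few
  assert(ceilDivSigned(10 - 0, 3) == 4);  // matches the iteration count
  assert(ceilDivSigned(9, 3) == 3);       // exact division is unchanged
}
```

With a truncated dimension of 3, the launch range [0, 3) maps to i = 0, 3, 6 and the iteration at i = 9 is skipped; the rounded-up dimension of 4 covers it.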
1 parent 27c0bc9 commit 477c0b6

2 files changed: 4 additions, 3 deletions
mlir/lib/Conversion/SCFToGPU/SCFToGPU.cpp

Lines changed: 2 additions & 1 deletion
@@ -195,7 +195,8 @@ AffineLoopToGpuConverter::collectBounds(AffineForOp forOp, unsigned numLoops) {
                                                 upperBound, lowerBound);
     Value step = getOrCreateStep(currentLoop, builder);
     if (getConstantIntValue(step) != static_cast<int64_t>(1))
-      range = builder.create<arith::DivSIOp>(currentLoop.getLoc(), range, step);
+      range =
+          builder.create<arith::CeilDivSIOp>(currentLoop.getLoc(), range, step);
     dims.push_back(range);
 
     lbs.push_back(lowerBound);

mlir/test/Conversion/SCFToGPU/step_positive.mlir

Lines changed: 2 additions & 2 deletions
@@ -3,8 +3,8 @@
 // CHECK-LABEL: @step_var
 func.func @step_var(%A : memref<?x?xf32>, %B : memref<?x?xf32>) {
   // Check that we divide by step.
-  // CHECK: %[[range_i:.*]] = arith.divsi {{.*}}, %{{.*}}
-  // CHECK: %[[range_j:.*]] = arith.divsi {{.*}}, %{{.*}}
+  // CHECK: %[[range_i:.*]] = arith.ceildivsi {{.*}}, %{{.*}}
+  // CHECK: %[[range_j:.*]] = arith.ceildivsi {{.*}}, %{{.*}}
 
   // CHECK: gpu.launch
   // CHECK-SAME: blocks(%{{[^)]*}}, %{{[^)]*}}, %{{[^)]*}}) in (%{{[^)]*}} = %[[range_i]], %{{[^)]*}} = %{{[^)]*}}, %{{[^)]*}} = %{{[^)]*}})