-
Couldn't load subscription status.
- Fork 15k
Description
I noticed that the unrolling tools of SCF / constantTripCount() produce invalid results in case of unsigned scf.for operations with minimal bitwidth to represent the bounds. So far, I have identified three different problems.
All examples are run with mlir-opt -test-loop-unrolling='unroll-factor=2'
constantTripCount()returns 0 instead of 1
Initial loop:
func.func @unroll_twos_complement_min_bit_width_bound() -> (i32, i32) {
%0 = arith.constant 7 : i32
%lb = arith.constant 0 : i2
%ub = arith.constant 2 : i2
%step = arith.constant 1 : i2
%result:2 = scf.for unsigned %i0 = %lb to %ub step %step iter_args(%arg0 = %0, %arg1 = %0) -> (i32, i32) : i2{
%add = arith.addi %arg0, %arg1 : i32
%mul = arith.muli %arg0, %arg1 : i32
scf.yield %add, %mul : i32, i32
}
return %result#0, %result#1 : i32, i32
}Intermediate result while unrolling:
%0:2 = scf.for unsigned %arg0 = %c0_i2 to %c-2_i2 step %c-2_i2_0 iter_args(%arg1 = %c7_i32, %arg2 = %c7_i32) -> (i32, i32) : i2 {...}-> constantTripCount() returns 0:
llvm-project/mlir/lib/Dialect/Utils/StaticValueUtils.cpp
Lines 318 to 319 in c336625
| if (stepCst.isNegative()) | |
| return APInt(bitwidth, 0); |
Expected result:
func.func @unroll_twos_complement_min_bit_width_bound() -> (i32, i32) {
%c7_i32 = arith.constant 7 : i32
%c0_i2 = arith.constant 0 : i2
%c-2_i2 = arith.constant -2 : i2
%c1_i2 = arith.constant 1 : i2
%c-2_i2_0 = arith.constant -2 : i2
%0 = arith.addi %c7_i32, %c7_i32 : i32
%1 = arith.muli %c7_i32, %c7_i32 : i32
%2 = arith.addi %0, %1 : i32
%3 = arith.muli %0, %1 : i32
return %2, %3 : i32, i32
}Actual result:
func.func @unroll_twos_complement_min_bit_width_bound() -> (i32, i32) {
%c7_i32 = arith.constant 7 : i32
%c0_i2 = arith.constant 0 : i2
%c-2_i2 = arith.constant -2 : i2
%c1_i2 = arith.constant 1 : i2
%c-2_i2_0 = arith.constant -2 : i2
return %c7_i32, %c7_i32 : i32, i32
}- Trip count interpreted as a negative number. Potential fix:
tripCount->ZExtValue()instead oftripCount->SExtValue()
Initial loop:
func.func @unroll_pow2_trip_count() -> (i32, i32) {
%0 = arith.constant 7 : i32
%lb = arith.constant 0 : i3
%ub = arith.constant 4 : i3
%step = arith.constant 1 : i3
%result:2 = scf.for unsigned %i0 = %lb to %ub step %step iter_args(%arg0 = %0, %arg1 = %0) -> (i32, i32) : i3{
%add = arith.addi %arg0, %arg1 : i32
%mul = arith.muli %arg0, %arg1 : i32
scf.yield %add, %mul : i32, i32
}
return %result#0, %result#1 : i32, i32
}Intermediate result while unrolling:
%0:2 = scf.for unsigned %arg0 = %c0_i3 to %c-4_i3 step %c2_i3 iter_args(%arg1 = %c7_i32, %arg2 = %c7_i32) -> (i32, i32) : i3 {...}-> tripCount->getSExtValue() = -2:
llvm-project/mlir/lib/Dialect/SCF/IR/SCF.cpp
Lines 438 to 439 in 33503d0
| if (!tripCount.has_value() || tripCount->getSExtValue() > 1) | |
| return failure(); |
Expected result:
func.func @unroll_pow2_trip_count() -> (i32, i32) {
%c7_i32 = arith.constant 7 : i32
%c0_i3 = arith.constant 0 : i3
%c-4_i3 = arith.constant -4 : i3
%c1_i3 = arith.constant 1 : i3
%c2_i3 = arith.constant 2 : i3
%0:2 = scf.for unsigned %arg0 = %c0_i3 to %c-4_i3 step %c2_i3 iter_args(%arg1 = %c7_i32, %arg2 = %c7_i32) -> (i32, i32) : i3 {
%1 = arith.addi %arg1, %arg2 : i32
%2 = arith.muli %arg1, %arg2 : i32
%3 = arith.addi %1, %2 : i32
%4 = arith.muli %1, %2 : i32
scf.yield %3, %4 : i32, i32
}
return %0#0, %0#1 : i32, i32
}Actual result:
func.func @unroll_pow2_trip_count() -> (i32, i32) {
%c7_i32 = arith.constant 7 : i32
%c0_i3 = arith.constant 0 : i3
%c-4_i3 = arith.constant -4 : i3
%c1_i3 = arith.constant 1 : i3
%c2_i3 = arith.constant 2 : i3
%0 = arith.addi %c7_i32, %c7_i32 : i32
%1 = arith.muli %c7_i32, %c7_i32 : i32
%2 = arith.addi %0, %1 : i32
%3 = arith.muli %0, %1 : i32
return %2, %3 : i32, i32
}- Invalid transformation of loop with a tripcount of 1
Initial loop:
func.func @unroll_trip_count_1_min_bit_width() -> (i32, i32) {
%0 = arith.constant 7 : i32
%lb = arith.constant 0 : i1
%ub = arith.constant 1 : i1
%step = arith.constant 1 : i1
%result:2 = scf.for unsigned %i0 = %lb to %ub step %step iter_args(%arg0 = %0, %arg1 = %0) -> (i32, i32) : i1{
%add = arith.addi %arg0, %arg1 : i32
%mul = arith.muli %arg0, %arg1 : i32
scf.yield %add, %mul : i32, i32
}
return %result#0, %result#1 : i32, i32
}Intermediate result while unrolling:
%0:2 = scf.for unsigned %arg0 = %false to %true step %false_1 iter_args(%arg1 = %c7_i32, %arg2 = %c7_i32) -> (i32, i32) : i1 {...}I this case, the intermediate result seems to be wrong, I would still expect a step size of 1.
-> constantTripCount() returns 0:
llvm-project/mlir/lib/Dialect/Utils/StaticValueUtils.cpp
Lines 316 to 317 in c336625
| if (stepCst.isZero()) | |
| return stepCst; |
Expected result:
func.func @unroll_trip_count_1_min_bit_width() -> (i32, i32) {
%c7_i32 = arith.constant 7 : i32
%false = arith.constant false
%true = arith.constant true
%true_0 = arith.constant true
%add = arith.addi %c7_i32, %c7_i32 : i32
%mul = arith.muli %c7_i32, %c7_i32 : i32
return %add, %mul : i32, i32
}Actual result:
func.func @unroll_trip_count_1_min_bit_width() -> (i32, i32) {
%c7_i32 = arith.constant 7 : i32
%false = arith.constant false
%true = arith.constant true
%true_0 = arith.constant true
%false_1 = arith.constant false
return %c7_i32, %c7_i32 : i32, i32
}There are probably more bugs of a similar nature.
For the ones I have found, please find the LIT tests with my expected outputs attached.