Commit 76f0064

[AMD] Fix fuse-nested-loops lit test (#5678)
The order of evaluation of function arguments is unspecified according to the C++ standard. This caused a lit test failure on my local machine and on all AMD machines I tried. Create the `cmpi sge` and `cmpi slt` operations separately so that their order is deterministic. This fixes the `fuse-nested-loops:multiple_loops` lit test.

Signed-off-by: Ilya Veselov <[email protected]>
1 parent cea35da commit 76f0064

1 file changed (+5, -4 lines)

lib/Dialect/TritonGPU/Transforms/FuseNestedLoops.cpp

Lines changed: 5 additions & 4 deletions
@@ -671,10 +671,11 @@ static void fuseOneLevel(LoopNestNode *parent, mlir::DominanceInfo &domInfo) {
     b.setInsertionPointAfter(prologueIf);
     Value innerEndT = b.create<arith::AddIOp>(
         loc, innerStartT, castIntIfNecessary(b, loc, lenInners[k], intTy));
-    Value bodyCond = b.create<arith::AndIOp>(
-        loc,
-        b.create<arith::CmpIOp>(loc, arith::CmpIPredicate::sge, T, innerStartT),
-        b.create<arith::CmpIOp>(loc, arith::CmpIPredicate::slt, T, innerEndT));
+    Value ge =
+        b.create<arith::CmpIOp>(loc, arith::CmpIPredicate::sge, T, innerStartT);
+    Value lt =
+        b.create<arith::CmpIOp>(loc, arith::CmpIPredicate::slt, T, innerEndT);
+    Value bodyCond = b.create<arith::AndIOp>(loc, ge, lt);
 
     // The outputs will be the outputs of the inner loop body and the next jk.
     SmallVector<Type> bodyOutTypes{jk.getType()};
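
The ordering issue addressed by this change can be reproduced outside MLIR. The following is a minimal sketch, not code from the repository: `Recorder`, `create`, `createAnd`, `nested`, and `sequenced` are hypothetical names standing in for an IR builder that appends operations in creation order. When two `create` calls appear as arguments of the same call, the compiler may evaluate either one first, so the recorded order can differ between toolchains. Binding each result to a named local sequences the calls.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an IR builder: it records the name of each
// operation in the order in which it is created.
struct Recorder {
  std::vector<std::string> ops;

  // Record an operation and return its index as a stand-in for a Value.
  int create(const std::string &name) {
    ops.push_back(name);
    return static_cast<int>(ops.size()) - 1;
  }

  // Record an "and" operation that consumes two earlier results.
  int createAnd(int lhs, int rhs) {
    (void)lhs;
    (void)rhs;
    return create("and");
  }
};

// Nested form: the two create() calls are arguments of one call, so their
// relative order is unspecified. "sge" or "slt" may be recorded first.
std::vector<std::string> nested() {
  Recorder r;
  r.createAnd(r.create("sge"), r.create("slt"));
  return r.ops;
}

// Sequenced form: each create() call is its own full statement, so "sge"
// is always recorded before "slt" on every conforming compiler.
std::vector<std::string> sequenced() {
  Recorder r;
  int ge = r.create("sge");
  int lt = r.create("slt");
  r.createAnd(ge, lt);
  return r.ops;
}
```

In both forms the `and` entry is recorded last, because arguments are evaluated before the call body runs. Only the relative order of the two comparisons varies, which is enough to break a test that matches the emitted operations in a fixed textual order.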
