Commit 44a64f0

Mogball authored and loislo committed
[TritonGPU] LICM outer loop before flattening (triton-lang#6010)
Ops in the prologue/epilogue can't be hoisted by LICM after the loop nest is flattened, so run LICM on the outer loop before flattening. We still don't want to run LICM on the inner loop because it can significantly increase live ranges.
1 parent 4272074 commit 44a64f0

1 file changed: +2 -0 lines changed

lib/Dialect/TritonGPU/Transforms/FuseNestedLoops.cpp

Lines changed: 2 additions & 0 deletions
@@ -2,6 +2,7 @@
 #include "mlir/Dialect/UB/IR/UBOps.h"
 #include "mlir/IR/Dominance.h"
 #include "mlir/IR/ImplicitLocOpBuilder.h"
+#include "mlir/Transforms/LoopInvariantCodeMotionUtils.h"
 #include "mlir/Transforms/RegionUtils.h"
 #include "triton/Dialect/Triton/IR/Dialect.h"
 #include "triton/Dialect/TritonGPU/Transforms/Passes.h"
@@ -1053,6 +1054,7 @@ static LogicalResult preprocessLoopNest(const LoopNest &nest,
   scf::ForOp &outerLoop = nest.root->loop;
   scf::ForOp &innerLoop = nest.root->children.front()->loop;
 
+  moveLoopInvariantCode(outerLoop);
   optimizeEpilogueDependencies(outerLoop, innerLoop, domInfo);
   return speculateInnerLoopLength(outerLoop, innerLoop, domInfo);
 }
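
The rationale in the commit message can be illustrated with a small standalone C++ model. This is only a sketch, not the pass itself: it assumes that "prologue" and "epilogue" mean the ops in the outer loop body before and after the inner loop, and that flattening turns them into code guarded by the inner induction variable. The names `prologueOp`, `nested`, `flattened`, `hoistedThenFlattened` and `g_calls` are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for a pure, outer-loop-invariant prologue op.
// The counter exists only so the number of evaluations can be observed.
static int g_calls = 0;
static std::int64_t prologueOp(std::int64_t x) {
  ++g_calls;
  return x * 3 + 1;
}

// Nested form: the invariant op sits in the outer loop's prologue and is
// unconditional there, so LICM on the outer loop could hoist it.
std::int64_t nested(std::int64_t x, int M, int N) {
  std::int64_t acc = 0;
  for (int i = 0; i < M; ++i) {
    std::int64_t a = prologueOp(x); // prologue
    for (int j = 0; j < N; ++j)
      acc += a * j;                 // inner loop body
    acc += i;                       // epilogue
  }
  return acc;
}

// Flattened form without prior hoisting (assumes N >= 1). The prologue and
// epilogue are now guarded by conditions on j, which is what keeps a
// generic LICM from moving them out of the single flat loop.
std::int64_t flattened(std::int64_t x, int M, int N) {
  std::int64_t acc = 0, a = 0;
  int i = 0, j = 0;
  for (int k = 0; k < M * N; ++k) {
    if (j == 0)
      a = prologueOp(x);            // guarded prologue
    acc += a * j;
    if (j == N - 1) {               // guarded epilogue
      acc += i;
      ++i;
      j = 0;
    } else {
      ++j;
    }
  }
  return acc;
}

// Hoisting out of the outer loop first, then flattening (assumes N >= 1):
// the invariant op is evaluated once, above the whole nest.
std::int64_t hoistedThenFlattened(std::int64_t x, int M, int N) {
  std::int64_t acc = 0;
  const std::int64_t a = prologueOp(x); // hoisted before flattening
  int i = 0, j = 0;
  for (int k = 0; k < M * N; ++k) {
    acc += a * j;
    if (j == N - 1) {
      acc += i;
      ++i;
      j = 0;
    } else {
      ++j;
    }
  }
  return acc;
}
```

With `M = 4` and `N = 5`, all three functions return the same value, but `nested` and `flattened` evaluate `prologueOp` four times while `hoistedThenFlattened` evaluates it once. The inner loop is left alone in this model, mirroring the commit's choice not to hoist out of the inner loop.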
