Skip to content

Conversation

@NexMing
Copy link
Contributor

@NexMing NexMing commented Oct 27, 2025

When a scf.if directly precedes an scf.condition in the before region of an scf.while and both share the same condition, move the if into the after region of the loop. This helps simplify the control flow to enable uplifting scf.while to scf.for.

@llvmbot
Copy link
Member

llvmbot commented Oct 27, 2025

@llvm/pr-subscribers-mlir

Author: Ming Yan (NexMing)

Changes

When an scf.if directly precedes an scf.condition in the before region of an scf.while and both share the same condition, move the if into the after region of the loop. This preserves semantics while simplifying the condition region and canonicalizing control flow for further optimizations.


Full diff: https://github.com/llvm/llvm-project/pull/165216.diff

2 Files Affected:

  • (modified) mlir/lib/Dialect/SCF/IR/SCF.cpp (+133-1)
  • (modified) mlir/test/Dialect/SCF/canonicalize.mlir (+37)
diff --git a/mlir/lib/Dialect/SCF/IR/SCF.cpp b/mlir/lib/Dialect/SCF/IR/SCF.cpp
index 9bd13f3236cfc..ddf8134d98776 100644
--- a/mlir/lib/Dialect/SCF/IR/SCF.cpp
+++ b/mlir/lib/Dialect/SCF/IR/SCF.cpp
@@ -3546,6 +3546,137 @@ LogicalResult scf::WhileOp::verify() {
 }
 
 namespace {
+/// Move an scf.if op that is directly before the scf.condition op in the while
+/// before region, and whose condition matches the condition of the
+/// scf.condition op, down into the while after region.
+///
+/// scf.while (..) : (...) -> ... {
+///  %additional_used_values = ...
+///  %cond = ...
+///  ...
+///  %res = scf.if %cond -> (...) {
+///    use(%additional_used_values)
+///    ... // then block
+///    scf.yield %then_value
+///  } else {
+///    scf.yield %else_value
+///  }
+///  scf.condition(%cond) %res, ...
+/// } do {
+/// ^bb0(%res_arg, ...):
+///    use(%res_arg)
+///    ...
+///
+/// becomes
+/// scf.while (..) : (...) -> ... {
+///  %additional_used_values = ...
+///  %cond = ...
+///  ...
+///  scf.condition(%cond) %else_value, ..., %additional_used_values
+/// } do {
+/// ^bb0(%res_arg ..., %additional_args): :
+///    use(%additional_args)
+///    ... // if then block
+///    use(%then_value)
+///    ...
+struct WhileMoveIfDown : public OpRewritePattern<WhileOp> {
+  using OpRewritePattern<WhileOp>::OpRewritePattern;
+
+  LogicalResult matchAndRewrite(WhileOp op,
+                                PatternRewriter &rewriter) const override {
+    auto conditionOp =
+        cast<scf::ConditionOp>(op.getBeforeBody()->getTerminator());
+    auto ifOp = dyn_cast_or_null<scf::IfOp>(conditionOp->getPrevNode());
+
+    // Check that the ifOp is directly before the conditionOp and that it
+    // matches the condition of the conditionOp. Also ensure that the ifOp has
+    // no else block with content, as that would complicate the transformation.
+    // TODO: support else blocks with content.
+    if (!ifOp || ifOp.getCondition() != conditionOp.getCondition() ||
+        (ifOp.elseBlock() && !ifOp.elseBlock()->without_terminator().empty()))
+      return failure();
+
+    assert(ifOp->use_empty() || (llvm::range_size(ifOp->getUsers()) == 1 &&
+                                 *ifOp->user_begin() == conditionOp) &&
+                                    "ifOp has unexpected uses");
+
+    Location loc = op.getLoc();
+
+    // Replace uses of ifOp results in the conditionOp with the yielded values
+    // from the ifOp branches.
+    for (auto [idx, arg] : llvm::enumerate(conditionOp.getArgs())) {
+      auto it = llvm::find(ifOp->getResults(), arg);
+      if (it != ifOp->getResults().end()) {
+        size_t ifOpIdx = it.getIndex();
+        Value thenValue = ifOp.thenYield()->getOperand(ifOpIdx);
+        Value elseValue = ifOp.elseYield()->getOperand(ifOpIdx);
+
+        rewriter.replaceAllUsesWith(it.getBase(), elseValue);
+        rewriter.replaceAllUsesWith(op.getAfterArguments()[idx], thenValue);
+      }
+    }
+
+    SmallVector<Value> additionalUsedValues;
+    auto isValueUsedInsideIf = [&](Value val) {
+      return llvm::any_of(val.getUsers(), [&](Operation *user) {
+        return ifOp.getThenRegion().isAncestor(user->getParentRegion());
+      });
+    };
+
+    // Collect additional used values from before region.
+    for (Operation *it = ifOp->getPrevNode(); it != nullptr;
+         it = it->getPrevNode())
+      llvm::copy_if(it->getResults(), std::back_inserter(additionalUsedValues),
+                    isValueUsedInsideIf);
+
+    llvm::copy_if(op.getBeforeArguments(),
+                  std::back_inserter(additionalUsedValues),
+                  isValueUsedInsideIf);
+
+    // Create new whileOp with additional used values as results.
+    auto additionalValueTypes = llvm::map_to_vector(
+        additionalUsedValues, [](Value val) { return val.getType(); });
+    size_t additionalValueSize = additionalUsedValues.size();
+    SmallVector<Type> newResultTypes(op.getResultTypes());
+    newResultTypes.append(additionalValueTypes);
+
+    auto newWhileOp =
+        scf::WhileOp::create(rewriter, loc, newResultTypes, op.getInits());
+
+    newWhileOp.getBefore().takeBody(op.getBefore());
+    newWhileOp.getAfter().takeBody(op.getAfter());
+    newWhileOp.getAfter().addArguments(
+        additionalValueTypes, SmallVector<Location>(additionalValueSize, loc));
+
+    SmallVector<Value> conditionArgs = conditionOp.getArgs();
+    llvm::append_range(conditionArgs, additionalUsedValues);
+
+    // Update conditionOp inside new whileOp before region.
+    rewriter.setInsertionPoint(conditionOp);
+    rewriter.replaceOpWithNewOp<scf::ConditionOp>(
+        conditionOp, conditionOp.getCondition(), conditionArgs);
+
+    // Replace uses of additional used values inside the ifOp then region with
+    // the whileOp after region arguments.
+    rewriter.replaceUsesWithIf(
+        additionalUsedValues,
+        newWhileOp.getAfterArguments().take_back(additionalValueSize),
+        [&](OpOperand &use) {
+          return ifOp.getThenRegion().isAncestor(
+              use.getOwner()->getParentRegion());
+        });
+
+    // Inline ifOp then region into new whileOp after region.
+    rewriter.eraseOp(ifOp.thenYield());
+    rewriter.inlineBlockBefore(ifOp.thenBlock(), newWhileOp.getAfterBody(),
+                               newWhileOp.getAfterBody()->begin());
+    rewriter.eraseOp(ifOp);
+    rewriter.replaceOp(op,
+                       newWhileOp->getResults().drop_back(additionalValueSize));
+    return success();
+  }
+};
+
 /// Replace uses of the condition within the do block with true, since otherwise
 /// the block would not be evaluated.
 ///
@@ -4258,7 +4389,8 @@ void WhileOp::getCanonicalizationPatterns(RewritePatternSet &results,
   results.add<RemoveLoopInvariantArgsFromBeforeBlock,
               RemoveLoopInvariantValueYielded, WhileConditionTruth,
               WhileCmpCond, WhileUnusedResult, WhileRemoveDuplicatedResults,
-              WhileRemoveUnusedArgs, WhileOpAlignBeforeArgs>(context);
+              WhileRemoveUnusedArgs, WhileOpAlignBeforeArgs, WhileMoveIfDown>(
+      context);
 }
 
 //===----------------------------------------------------------------------===//
diff --git a/mlir/test/Dialect/SCF/canonicalize.mlir b/mlir/test/Dialect/SCF/canonicalize.mlir
index 2bec63672e783..cfae3b34305de 100644
--- a/mlir/test/Dialect/SCF/canonicalize.mlir
+++ b/mlir/test/Dialect/SCF/canonicalize.mlir
@@ -974,6 +974,43 @@ func.func @replace_if_with_cond3(%arg0 : i1, %arg2: i64) -> (i32, i64) {
 
 // -----
 
+// CHECK-LABEL: @while_move_if_down
+func.func @while_move_if_down() -> i32 {
+  %0 = scf.while () : () -> (i32) {
+    %additional_used_value = "test.get_some_value1" () : () -> (i32)
+    %else_value = "test.get_some_value2" () : () -> (i32)
+    %condition = "test.condition"() : () -> i1
+    %res = scf.if %condition -> (i32) {
+      "test.use1" (%additional_used_value) : (i32) -> ()
+      %then_value = "test.get_some_value3" () : () -> (i32)
+      scf.yield %then_value : i32
+    } else {
+      scf.yield %else_value : i32
+    }
+    scf.condition(%condition) %res : i32
+  } do {
+  ^bb0(%res_arg: i32):
+    "test.use2" (%res_arg) : (i32) -> ()
+    scf.yield
+  }
+  return %0 : i32
+}
+// CHECK-NEXT:      %[[WHILE_0:.*]]:2 = scf.while : () -> (i32, i32) {
+// CHECK-NEXT:        %[[VAL_0:.*]] = "test.get_some_value1"() : () -> i32
+// CHECK-NEXT:        %[[VAL_1:.*]] = "test.get_some_value2"() : () -> i32
+// CHECK-NEXT:        %[[VAL_2:.*]] = "test.condition"() : () -> i1
+// CHECK-NEXT:        scf.condition(%[[VAL_2]]) %[[VAL_1]], %[[VAL_0]] : i32, i32
+// CHECK-NEXT:      } do {
+// CHECK-NEXT:      ^bb0(%[[VAL_3:.*]]: i32, %[[VAL_4:.*]]: i32):
+// CHECK-NEXT:        "test.use1"(%[[VAL_4]]) : (i32) -> ()
+// CHECK-NEXT:        %[[VAL_5:.*]] = "test.get_some_value3"() : () -> i32
+// CHECK-NEXT:        "test.use2"(%[[VAL_5]]) : (i32) -> ()
+// CHECK-NEXT:        scf.yield
+// CHECK-NEXT:      }
+// CHECK-NEXT:      return %[[VAL_6:.*]]#0 : i32
+
+// -----
+
 // CHECK-LABEL: @while_cond_true
 func.func @while_cond_true() -> i1 {
   %0 = scf.while () : () -> i1 {

@llvmbot
Copy link
Member

llvmbot commented Oct 27, 2025

@llvm/pr-subscribers-mlir-scf

Author: Ming Yan (NexMing)

Changes

When an scf.if directly precedes an scf.condition in the before region of an scf.while and both share the same condition, move the if into the after region of the loop. This preserves semantics while simplifying the condition region and canonicalizing control flow for further optimizations.


Full diff: https://github.com/llvm/llvm-project/pull/165216.diff

2 Files Affected:

  • (modified) mlir/lib/Dialect/SCF/IR/SCF.cpp (+133-1)
  • (modified) mlir/test/Dialect/SCF/canonicalize.mlir (+37)
diff --git a/mlir/lib/Dialect/SCF/IR/SCF.cpp b/mlir/lib/Dialect/SCF/IR/SCF.cpp
index 9bd13f3236cfc..ddf8134d98776 100644
--- a/mlir/lib/Dialect/SCF/IR/SCF.cpp
+++ b/mlir/lib/Dialect/SCF/IR/SCF.cpp
@@ -3546,6 +3546,137 @@ LogicalResult scf::WhileOp::verify() {
 }
 
 namespace {
+/// Move an scf.if op that is directly before the scf.condition op in the while
+/// before region, and whose condition matches the condition of the
+/// scf.condition op, down into the while after region.
+///
+/// scf.while (..) : (...) -> ... {
+///  %additional_used_values = ...
+///  %cond = ...
+///  ...
+///  %res = scf.if %cond -> (...) {
+///    use(%additional_used_values)
+///    ... // then block
+///    scf.yield %then_value
+///  } else {
+///    scf.yield %else_value
+///  }
+///  scf.condition(%cond) %res, ...
+/// } do {
+/// ^bb0(%res_arg, ...):
+///    use(%res_arg)
+///    ...
+///
+/// becomes
+/// scf.while (..) : (...) -> ... {
+///  %additional_used_values = ...
+///  %cond = ...
+///  ...
+///  scf.condition(%cond) %else_value, ..., %additional_used_values
+/// } do {
+/// ^bb0(%res_arg ..., %additional_args): :
+///    use(%additional_args)
+///    ... // if then block
+///    use(%then_value)
+///    ...
+struct WhileMoveIfDown : public OpRewritePattern<WhileOp> {
+  using OpRewritePattern<WhileOp>::OpRewritePattern;
+
+  LogicalResult matchAndRewrite(WhileOp op,
+                                PatternRewriter &rewriter) const override {
+    auto conditionOp =
+        cast<scf::ConditionOp>(op.getBeforeBody()->getTerminator());
+    auto ifOp = dyn_cast_or_null<scf::IfOp>(conditionOp->getPrevNode());
+
+    // Check that the ifOp is directly before the conditionOp and that it
+    // matches the condition of the conditionOp. Also ensure that the ifOp has
+    // no else block with content, as that would complicate the transformation.
+    // TODO: support else blocks with content.
+    if (!ifOp || ifOp.getCondition() != conditionOp.getCondition() ||
+        (ifOp.elseBlock() && !ifOp.elseBlock()->without_terminator().empty()))
+      return failure();
+
+    assert(ifOp->use_empty() || (llvm::range_size(ifOp->getUsers()) == 1 &&
+                                 *ifOp->user_begin() == conditionOp) &&
+                                    "ifOp has unexpected uses");
+
+    Location loc = op.getLoc();
+
+    // Replace uses of ifOp results in the conditionOp with the yielded values
+    // from the ifOp branches.
+    for (auto [idx, arg] : llvm::enumerate(conditionOp.getArgs())) {
+      auto it = llvm::find(ifOp->getResults(), arg);
+      if (it != ifOp->getResults().end()) {
+        size_t ifOpIdx = it.getIndex();
+        Value thenValue = ifOp.thenYield()->getOperand(ifOpIdx);
+        Value elseValue = ifOp.elseYield()->getOperand(ifOpIdx);
+
+        rewriter.replaceAllUsesWith(it.getBase(), elseValue);
+        rewriter.replaceAllUsesWith(op.getAfterArguments()[idx], thenValue);
+      }
+    }
+
+    SmallVector<Value> additionalUsedValues;
+    auto isValueUsedInsideIf = [&](Value val) {
+      return llvm::any_of(val.getUsers(), [&](Operation *user) {
+        return ifOp.getThenRegion().isAncestor(user->getParentRegion());
+      });
+    };
+
+    // Collect additional used values from before region.
+    for (Operation *it = ifOp->getPrevNode(); it != nullptr;
+         it = it->getPrevNode())
+      llvm::copy_if(it->getResults(), std::back_inserter(additionalUsedValues),
+                    isValueUsedInsideIf);
+
+    llvm::copy_if(op.getBeforeArguments(),
+                  std::back_inserter(additionalUsedValues),
+                  isValueUsedInsideIf);
+
+    // Create new whileOp with additional used values as results.
+    auto additionalValueTypes = llvm::map_to_vector(
+        additionalUsedValues, [](Value val) { return val.getType(); });
+    size_t additionalValueSize = additionalUsedValues.size();
+    SmallVector<Type> newResultTypes(op.getResultTypes());
+    newResultTypes.append(additionalValueTypes);
+
+    auto newWhileOp =
+        scf::WhileOp::create(rewriter, loc, newResultTypes, op.getInits());
+
+    newWhileOp.getBefore().takeBody(op.getBefore());
+    newWhileOp.getAfter().takeBody(op.getAfter());
+    newWhileOp.getAfter().addArguments(
+        additionalValueTypes, SmallVector<Location>(additionalValueSize, loc));
+
+    SmallVector<Value> conditionArgs = conditionOp.getArgs();
+    llvm::append_range(conditionArgs, additionalUsedValues);
+
+    // Update conditionOp inside new whileOp before region.
+    rewriter.setInsertionPoint(conditionOp);
+    rewriter.replaceOpWithNewOp<scf::ConditionOp>(
+        conditionOp, conditionOp.getCondition(), conditionArgs);
+
+    // Replace uses of additional used values inside the ifOp then region with
+    // the whileOp after region arguments.
+    rewriter.replaceUsesWithIf(
+        additionalUsedValues,
+        newWhileOp.getAfterArguments().take_back(additionalValueSize),
+        [&](OpOperand &use) {
+          return ifOp.getThenRegion().isAncestor(
+              use.getOwner()->getParentRegion());
+        });
+
+    // Inline ifOp then region into new whileOp after region.
+    rewriter.eraseOp(ifOp.thenYield());
+    rewriter.inlineBlockBefore(ifOp.thenBlock(), newWhileOp.getAfterBody(),
+                               newWhileOp.getAfterBody()->begin());
+    rewriter.eraseOp(ifOp);
+    rewriter.replaceOp(op,
+                       newWhileOp->getResults().drop_back(additionalValueSize));
+    return success();
+  }
+};
+
 /// Replace uses of the condition within the do block with true, since otherwise
 /// the block would not be evaluated.
 ///
@@ -4258,7 +4389,8 @@ void WhileOp::getCanonicalizationPatterns(RewritePatternSet &results,
   results.add<RemoveLoopInvariantArgsFromBeforeBlock,
               RemoveLoopInvariantValueYielded, WhileConditionTruth,
               WhileCmpCond, WhileUnusedResult, WhileRemoveDuplicatedResults,
-              WhileRemoveUnusedArgs, WhileOpAlignBeforeArgs>(context);
+              WhileRemoveUnusedArgs, WhileOpAlignBeforeArgs, WhileMoveIfDown>(
+      context);
 }
 
 //===----------------------------------------------------------------------===//
diff --git a/mlir/test/Dialect/SCF/canonicalize.mlir b/mlir/test/Dialect/SCF/canonicalize.mlir
index 2bec63672e783..cfae3b34305de 100644
--- a/mlir/test/Dialect/SCF/canonicalize.mlir
+++ b/mlir/test/Dialect/SCF/canonicalize.mlir
@@ -974,6 +974,43 @@ func.func @replace_if_with_cond3(%arg0 : i1, %arg2: i64) -> (i32, i64) {
 
 // -----
 
+// CHECK-LABEL: @while_move_if_down
+func.func @while_move_if_down() -> i32 {
+  %0 = scf.while () : () -> (i32) {
+    %additional_used_value = "test.get_some_value1" () : () -> (i32)
+    %else_value = "test.get_some_value2" () : () -> (i32)
+    %condition = "test.condition"() : () -> i1
+    %res = scf.if %condition -> (i32) {
+      "test.use1" (%additional_used_value) : (i32) -> ()
+      %then_value = "test.get_some_value3" () : () -> (i32)
+      scf.yield %then_value : i32
+    } else {
+      scf.yield %else_value : i32
+    }
+    scf.condition(%condition) %res : i32
+  } do {
+  ^bb0(%res_arg: i32):
+    "test.use2" (%res_arg) : (i32) -> ()
+    scf.yield
+  }
+  return %0 : i32
+}
+// CHECK-NEXT:      %[[WHILE_0:.*]]:2 = scf.while : () -> (i32, i32) {
+// CHECK-NEXT:        %[[VAL_0:.*]] = "test.get_some_value1"() : () -> i32
+// CHECK-NEXT:        %[[VAL_1:.*]] = "test.get_some_value2"() : () -> i32
+// CHECK-NEXT:        %[[VAL_2:.*]] = "test.condition"() : () -> i1
+// CHECK-NEXT:        scf.condition(%[[VAL_2]]) %[[VAL_1]], %[[VAL_0]] : i32, i32
+// CHECK-NEXT:      } do {
+// CHECK-NEXT:      ^bb0(%[[VAL_3:.*]]: i32, %[[VAL_4:.*]]: i32):
+// CHECK-NEXT:        "test.use1"(%[[VAL_4]]) : (i32) -> ()
+// CHECK-NEXT:        %[[VAL_5:.*]] = "test.get_some_value3"() : () -> i32
+// CHECK-NEXT:        "test.use2"(%[[VAL_5]]) : (i32) -> ()
+// CHECK-NEXT:        scf.yield
+// CHECK-NEXT:      }
+// CHECK-NEXT:      return %[[VAL_6:.*]]#0 : i32
+
+// -----
+
 // CHECK-LABEL: @while_cond_true
 func.func @while_cond_true() -> i1 {
   %0 = scf.while () : () -> i1 {

@NexMing NexMing force-pushed the dev/while-sink-if-down branch from 4d3ff6d to ecffe33 Compare October 27, 2025 09:49
Copy link
Contributor

@MaheshRavishankar MaheshRavishankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this is a canonicalization. This seems like a cost-model based thing. Someone else might want to hoist the if condition

Copy link
Member

@matthias-springer matthias-springer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This preserves semantics while simplifying the condition region and canonicalizing control flow for further optimizations.

What further optimizations did you have in mind?

@NexMing
Copy link
Contributor Author

NexMing commented Oct 28, 2025

What further optimizations did you have in mind?

The motivation is that I want to lift cf to scf.for.
For example, I want to convert the following control flow into an scf.for.

  func.func @test(%arg0: memref<100xf32>) {
    %c100 = arith.constant 100 : index
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %cst = arith.constant 0.000000e+00 : f32
    cf.br ^bb1(%c0 : index)
  ^bb1(%0: index):  // 2 preds: ^bb0, ^bb2
    %1 = arith.cmpi slt, %0, %c100 : index
    cf.cond_br %1, ^bb2, ^bb3
  ^bb2:  // pred: ^bb1
    memref.store %cst, %arg0[%0] : memref<100xf32>
    %2 = arith.addi %0, %c1 : index
    cf.br ^bb1(%2 : index)
  ^bb3:  // pred: ^bb1
    return
  }

When I use the lift-cf-to-scf pass, I get:

  func.func @test(%arg0: memref<100xf32>) {
    %0 = ub.poison : index
    %c100 = arith.constant 100 : index
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %cst = arith.constant 0.000000e+00 : f32
    %1 = scf.while (%arg1 = %c0) : (index) -> index {
      %2 = arith.cmpi slt, %arg1, %c100 : index
      %3 = scf.if %2 -> (index) {
        memref.store %cst, %arg0[%arg1] : memref<100xf32>
        %4 = arith.addi %arg1, %c1 : index
        scf.yield %4 : index
      } else {
        scf.yield %0 : index
      }
      scf.condition(%2) %3 : index
    } do {
    ^bb0(%arg1: index):
      scf.yield %arg1 : index
    }
    return
  }

When I continued using test-scf-uplift-while-to-for, the attempt to lift scf.while to scf.for failed.
After this optimization, test-scf-uplift-while-to-for can successfully lift scf.while to scf.for.

I believe this optimization always simplifies the control flow, so I think it belongs to canonicalization.
I don’t quite understand in what situations there would be a need to hoist the if condition.
If needed, I will consider moving it into test-scf-uplift-while-to-for.

@NexMing
Copy link
Contributor Author

NexMing commented Nov 3, 2025

ping

@MaheshRavishankar
Copy link
Contributor

I believe this optimization always simplifies the control flow, so I think it belongs to canonicalization.
I don’t quite understand in what situations there would be a need to hoist the if condition.
If needed, I will consider moving it into test-scf-uplift-while-to-for.

It would be good to not have this as a canonicalization to start with. This seems like a pretty heavy weight transformation to add to canonicalizations.

@github-actions
Copy link

github-actions bot commented Nov 4, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@NexMing NexMing force-pushed the dev/while-sink-if-down branch from 83f42de to 05b6a0f Compare November 4, 2025 03:48
@NexMing NexMing force-pushed the dev/while-sink-if-down branch from 05b6a0f to 19a042e Compare November 4, 2025 03:49
@NexMing
Copy link
Contributor Author

NexMing commented Nov 4, 2025

Please help review it, thank you. @MaheshRavishankar

@MaheshRavishankar MaheshRavishankar dismissed their stale review November 4, 2025 17:55

Removing blockers since this isnt a canonicalization anymore.

@joker-eph
Copy link
Collaborator

This seems like a pretty heavy weight transformation to add to canonicalizations.

Which part? The transformation itself? As long as the matching phase has a "fail fast" path it seems fine to me actually.
The most important thing is to do as little work as possible in the pattern when there is nothing to do.

@MaheshRavishankar
Copy link
Contributor

This seems like a pretty heavy weight transformation to add to canonicalizations.

Which part? The transformation itself? As long as the matching phase has a "fail fast" path it seems fine to me actually. The most important thing is to do as little work as possible in the pattern when there is nothing to do.

An if-hoisting across a while does not fit in canonicalizations.

@joker-eph
Copy link
Collaborator

joker-eph commented Nov 4, 2025

An if-hoisting across a while does not fit in canonicalizations.

  1. You're stating it as a fact, but that requires some elaboration: why is it a bad canonicalization?
  2. Can you please elaborate how your stated opinion here related to it being "heavy weight"? I'm not sure I follow and you're not answering your questions.

@MaheshRavishankar
Copy link
Contributor

An if-hoisting across a while does not fit in canonicalizations.

  1. You're stating it as a fact, but that requires some elaboration: why is it a bad canonicalization?
  2. Can you please elaborate how your stated opinion here related to it being "heavy weight"? I'm not sure I follow and you're not answering your questions.

I think we need to find a way out of this circular discussion. The onus is to prove that this is canonical. Just stating "I can't think of why you wouldn't do it" isn't justification enough. You can't turn that around and make it an onus on reviewers as to prove why this shouldnt be a canonicalization. Of all possible representations, you need to prove why a canonical representation is always superior.

But in any case, the pattern is doing a whole bunch of code motion and moving instructions around. That is an extremely heavy weight transformation. This is not a targeted "get to a better representation of an operation" transformation.

@joker-eph
Copy link
Collaborator

The onus is to prove that this is canonical.

Sorry but at some point the onus is on you to come up with principled arguments, otherwise I can't really take your objections seriously. Canonicalization can't be defined as "whatever Mahesh think is tasteful".

Of all possible representations, you need to prove why a canonical representation is always superior.

This is not how it works, it has never been the bar, and it won't be. I believe I explained this multiple times, the basic example of pushing constant to the right hand side defeats already any "proving" bar you're trying to put out.

@MaheshRavishankar
Copy link
Contributor

The onus is to prove that this is canonical.

Sorry but at some point the onus is on you to come up with principled arguments, otherwise I can't really take your objections seriously. Canonicalization can't be defined as "whatever Mahesh think is tasteful".

So please tell me what are the principles arguments for what should be a canonicalization. I think if you can enunciate that well, then we can discuss it better. It is not about my taste. This pattern is doing major code movement. Please explain to me the framework for canonicalization that justifies doing code movement this way. I have provided an actual concrete reason why this shouldnt be a canonicalization in my opinion (not a matter of taste). Now can you provide a counter argument as to why this should be a canonicalization.

Anyway, I think this is not on the author of this change. The change as is fine and well structured in my book.

Copy link
Contributor

@MaheshRavishankar MaheshRavishankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see this was moved back into canonicalization. So I was mistaken on the state of the change.

Really this is bad practice! I urge you to not add it as a canonicalization, but I dont have the heart to block this change, even though in my book this is following bad practices.

@joker-eph
Copy link
Collaborator

So please tell me what are the principles arguments for what should be a canonicalization.

I believe I did, multiple times, but I can re-iterate. It turns out we actually have a documentation here: https://mlir.llvm.org/docs/Canonicalization/ ; in particular it links to an pretty good an article on the topic.
The section General Design lays down the principles to consider for canonicalization. Happy to iterate further on this if we need to refine.

I have provided an actual concrete reason why this shouldnt be a canonicalization in my opinion (not a matter of taste).

You indeed elaborate a little bit in your previous message, thanks for this. That said it feels a bit fuzzy to me still, let me quote the only concrete sentence:

But in any case, the pattern is doing a whole bunch of code motion and moving instructions around. That is an extremely heavy weight transformation. This is not a targeted "get to a better representation of an operation" transformation.

You're here repeating the use of "heavy weight transformation" which is ambiguous to me. I understand it as "expensive" in terms of compile-time, which is why I asked before:

Which part? The transformation itself? As long as the matching phase has a "fail fast" path it seems fine to me actually.
The most important thing is to do as little work as possible in the pattern when there is nothing to do.

But you didn't specifically followed-up on this.

So now that you're prefixing it with "doing a whole bunch of code motion and moving instructions around", I suspect you may intended to address something about the scope of the transformation to talk about "heavyweight"?
I'm not sure about the relevance of it for canonicalization though?

Now can you provide a counter argument as to why this should be a canonicalization.

Sure:

  • It does not seem costly to apply.
  • I don't see any useful information being lost here
  • We found an equivalence of the program toward one that eliminates the scf.if entirely, and reduce the scf.while condition block to one that is more easily understandable by subsequent analyses. The test case actually demonstrates this: the while-to-for conversion manages to match this form post-transformation whereas it does not otherwise. Turning a scf.while into a scf.for is also more canonical to me as it encodes structurally more information: the scf.for defines a fixed iteration space that must recovered by analysis on the scf.while, allowing subsequent transformation to get access to this information in a straightforward way on the scf.for.

@MaheshRavishankar
Copy link
Contributor

MaheshRavishankar commented Nov 7, 2025

I have tried to describe the issue I have with canonicalization too, most notably here https://discourse.llvm.org/t/rfc-update-to-general-design-section-of-operation-canonicalizations-in-mlir/79355.

But let me try to be more concrete, and maybe we need to move this to that old (or new) discourse post, but here is what canonicalizations should do afaics, a lot of which is driven by the fact that canonicalizations are applied everywhere in a given compiler stack and provide almost no control to users, which is the anti-pattern.

  1. Canonicalizations should be aimed at moving particular operation into a more "canonical" representation to allow subsequent analysis easier to write. That I think everyone understands
  2. The canonicalization itself today in MLIR have become a kitchen sink cause canonicalization patterns are added ad-hoc without any reference to how different canonicalization patterns interact. What is really needed is a class of fixed point iterations. Each fixed point iteration is trying to move the program further down the lattice. Once we know what the fixed point is doing, then we can easily determine what is the representation of that operation with respect to the lattice that is being reduced. A universal canonical form does not seem like a tractable goal. Today when a canonicalization pattern is added, it is not done with any analysis of how the pattern interacts with other canonicalization patterns, or what is the objective function that the canonicalizer is trying to optimize. It comes down to "I cant think of why someone would do anything else". That is fundamentally restricted by the understanding of the author of all possible uses of particular operation and is in no way rigorous enough to justify that pattern kicking in everywhere in a compilation stack (without any control). I think it is completely justified ask for people adding something to a canonicalization, to justify why in all possible scenarios something is canonical, but that is not done today. Instead people pushing back against it are asked to justify why something shouldnt be a canonicalization. That seems backwards to how basic proofs work. I cant just claim some mathematical fact. It needs to be proved. The proof cant be "no one gave me a counter-example, so I must be right".
  3. Another piece of evidence of canonicalizations being used as kitchen sink is the huge compilations times canonicalizations take. Every canonicalization pass is just carrying a huge number patterns without any justification of why those patterns need to be considered as canonicalizations.

That btw, is exactly what I have said before here https://discourse.llvm.org/t/rfc-update-to-general-design-section-of-operation-canonicalizations-in-mlir/79355/31?u=maheshravishankar . I dont know how else to state it. As it stands, the proclivity of everyone just adding to canonicalization without justification or analysis is a bad practice being promoted which is ultimately going to be a major issue for anyone building a serious compiler stack using MLIR.

Specific to this example, to me the movement of operations from the while region to the do region without any side-effect analysis seems like a red flag. The movement of operations itself is a compilation time overhead cause it will periodically trigger recomputation of the operation order within multiple blocks which is expensive depending on the program. Such factors should have already been accounted for when adding something to a canonicalization, but that is hardly ever done.

@NexMing
Copy link
Contributor Author

NexMing commented Nov 7, 2025

Oh, I see this was moved back into canonicalization. So I was mistaken on the state of the change.

Really this is bad practice! I urge you to not add it as a canonicalization, but I dont have the heart to block this change, even though in my book this is following bad practices.

I’ve already moved it out of canonicalization; now it’s in UpliftWhileToForPatterns.

I’m not opposed to moving it out of canonicalization, but I believe the reason for temporarily moving it out of canonicalization is that this transformation is complex and expensive, which introduces instability and uncertainty., rather than it being unsuitable for canonicalization. Once this optimization becomes sufficiently stable, could we move it back into the canonicalization stage?

Or is my understanding of canonicalization too broad? Perhaps this transformation belongs in something like a “simplify CFG” pass instead.

I would prefer to decide this based on heuristics rather than strict definitions and proofs, because it’s difficult to guarantee that something is beneficial in every case. There will inevitably be trade-offs.

@joker-eph
Copy link
Collaborator

joker-eph commented Nov 7, 2025

I have tried to describe the issue I have with canonicalization too,

Sure, I just plainly disagree with your approach to it.

I think it is completely justified ask for people adding something to a canonicalization, to justify why in all possible scenarios something is canonical, but that is not done today. Instead people pushing back against it are asked to justify why something shouldnt be a canonicalization. That seems backwards to how basic proofs work. I cant just claim some mathematical fact. It needs to be proved. The proof cant be "no one gave me a counter-example, so I must be right".

You seem to put what seems to me to be an impossible bar to reach for any practical development in LLVM/MLIR historically speaking: I don't see how anyone can "prove" this. The system isn't defined in terms of mathematical lattice that we can reason about, a lot of it is ad-hoc engineering: this is obviously imperfect but this is what we have to work with.

Another piece of evidence of canonicalizations being used as kitchen sink is the huge compilations times canonicalizations take.

I don't quite see what conclusion you're trying to make from this observation, I can imagine many root causes which have nothing to do with what you're trying to convey here.

I dont know how else to state it. As it stands, the proclivity of everyone just adding to canonicalization without justification or analysis is a bad practice being promoted which is ultimately going to be a major issue for anyone building a serious compiler stack using MLIR.

If this is what you believe, fine: just stop using canonicalization! Nobody forces you to bring these into your compiler.

Specific to this example, to me the movement of operations from the while region to the do region without any side-effect analysis seems like a red flag.

As far as I understand, this transformation is doing:

if (%cond) {
  foo()
}
cond_br %cond ^continue, ^end

^continue:
    ...
^end:
   ...

->

cond_br %cond ^continue, ^end

^continue:
    foo()
    ...
^end:
   ...

I don't quite get why we'd need any side-effect analysis here, this is a "simple" simplification of redundant control-flow (I even wonder if the SimplifyCondBranchFromCondBranchOnSameCondition on the cf.cond_br isn't doing exactly this already).

The movement of operations itself is a compilation time overhead cause it will periodically trigger recomputation of the operation order within multiple blocks which is expensive depending on the program. Such factors should have already been accounted for when adding something to a canonicalization, but that is hardly ever done.

That's totally not the kind of things I would consider. First it's highly dependent on the particular pass pipeline and the shape of the IR, but also that would preclude a lot of obvious canonicalization, starting with folding if (true) { <sequence> } to <sequence>, but also all the ones in cf dialect (which were amongst some of the first implemented).

@joker-eph
Copy link
Collaborator

I’m not opposed to moving it out of canonicalization, but I believe the reason for temporarily moving it out of canonicalization is that this transformation is complex and expensive, which introduces instability and uncertainty., rather than it being unsuitable for canonicalization. Once this optimization becomes sufficiently stable, could we move it back into the canonicalization stage?

If there is "instability and uncertainty", that's a problem to add this in the first place.
Otherwise I'd like to understand the path to get this into canonicalization? What does it mean for this to "become sufficiently stable"?

@Groverkss
Copy link
Member

Groverkss commented Nov 7, 2025

As far as I understand, this transformation is doing:

if (%cond) {
foo()
}
cond_br %cond ^continue, ^end

^continue:
...
^end:
...
->

cond_br %cond ^continue, ^end

^continue:
foo()
...
^end:
...

I think this may be over simplifying the transformation a bit. Maybe I'm wrong (so feel free to correct my example if i'm wrong), but let me try to give an example. Let's take the case:

while {

%something = some_loop_invariant_compute_but_i_never_hoisted_it_out()
if (%cond) {
  foo(%something)
}
cond_br %cond ^continue, ^end

} while {

^continue:
    ...
^end:
   ...
}

If you decide to do the same transformation, you need to somehow also forward %something. You need to either:

  • Check you don't need to forward anything from the conditional region, which would mean check all uses inside the if block
    OR
  • Clone the entire backward chain

Both of these seem heavy weight. Note that you cannot really safely do this transformation without doing atleast one of these. Analyze a whole region or cloning an entire backward chain (or finding that backward chain) seems a bit too much for a canonicalization maybe.

@NexMing
Copy link
Contributor Author

NexMing commented Nov 7, 2025

If there is "instability and uncertainty", that's a problem to add this in the first place. Otherwise I'd like to understand the path to get this into canonicalization? What does it mean for this to "become sufficiently stable"?

What I’m currently worried about is an extreme case where the number of operands to scf.condition could grow significantly, and likewise the number of result values of scf.while could increase substantially. This might make the scf.while look more complex, and I’m not sure whether this would violate the principles of canonicalization.

@NexMing NexMing requested a review from Groverkss November 8, 2025 01:11
@jpienaar
Copy link
Member

jpienaar commented Nov 9, 2025

What I’m currently worried about is an extreme case where the number of operands to scf.condition could grow significantly, and likewise the number of result values of scf.while could increase substantially. This might make the scf.while look more complex, and I’m not sure whether this would violate the principles of canonicalization.

That is valid concern. I think you are saying appearance to also mean ability to analyze. It could result in more complicated liveness analysis than pre-transform (and to Kumar's point, trade off between recomputing or increasing high water mark).

As a general pattern though that is something that could be analyzed in different flows. It seems desirable/useful pattern but not always on/non canonical, so currently proposed as pattern seems sensible. We don't have a big corpus upstream to measure cost, so I don't think we can put that bar here (of course if folks can find cases where bad and file issues great). It seems useful pattern that can be further refined here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants