Skip to content

Conversation

@Camsyn
Copy link
Contributor

@Camsyn Camsyn commented Oct 29, 2025

Previously, SimplifyCFG only simplified unconditional branches when they met a pattern (swicth -> icmp -> br -> phi) as follows:

   switch i8 %A, label %DEFAULT [ i8 1, label %end    i8 2, label %end ]
DEFAULT:
   %tmp = icmp eq i8 %A, 92
   br label %end
end:
   ... = phi i1 [ true, %entry ], [ %tmp, %DEFAULT ], [ true, %entry ]

This PR supports a new and more generic pattern (switch -> icmp -> select -> br -> phi ) to simplify unconditional branches as follows:

; BEFORE
case1:
  switch i32 %x, label %DEFAULT [ 
    i32 0, label %end    
    i32 1, label %case2 
  ]
case2:
  br label %end
DEFAULT:
  %tmp = icmp eq i32 %x, 2
  %val = select i1 %tmp, i32 V3, i32 V4
  br label %end
end:
  ... = phi i32 [ V1, %case1 ], [ V2, %case2 ], [ %val, %DEFAULT ]

We prefer to split the edge to 'end' so that there are TWO entries of V3/V4 to the PHI, merging the icmp & select into the switch, as follows:

; AFTER
case1:
  switch i32 %x, label %DEFAULT [
    i32 0, label %end
    i32 1, label %case2
    i32 2, label %case3
  ]
case2:
  br label %end
case3:
  br label %end
DEFAULT:
  br label %end
end:
  ... = phi i32 [ V1, %case1 ], [ V2, %case2 ], [ V3, %case3 ], [ V4, %DEFAULT]

Alive2 Proof: https://alive2.llvm.org/ce/z/jYHM4f
Promising Optimization Impact: dtcxzyw/llvm-opt-benchmark#3006

@github-actions
Copy link

github-actions bot commented Oct 29, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@Camsyn Camsyn requested review from dtcxzyw, lattner and nikic October 30, 2025 14:30
@lattner lattner removed their request for review October 31, 2025 00:47
@lattner
Copy link
Collaborator

lattner commented Oct 31, 2025

This optimization makes sense to me, but I'm not longer a competent reviewer for this area, thanks!

@Camsyn
Copy link
Contributor Author

Camsyn commented Oct 31, 2025

The new function tryToSimplifyUncondBranchWithICmpSelectInIt is very similar to tryToSimplifyUncondBranchWithICmpInIt , deserving a refactorization.

But I have no idea of the best practice to refactor:

  1. Merge the two functions into ONE?
  2. Keep the two functions, but extract the common part?

@Camsyn Camsyn marked this pull request as ready for review October 31, 2025 13:36
@llvmbot
Copy link
Member

llvmbot commented Oct 31, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Kunqiu Chen (Camsyn)

Changes

Previously, SimplifyCFG only simplified unconditional branches when they met a pattern (swicth -> icmp -> br -> phi) as follows:

   switch i8 %A, label %DEFAULT [ i8 1, label %end    i8 2, label %end ]
DEFAULT:
   %tmp = icmp eq i8 %A, 92
   br label %end
end:
   ... = phi i1 [ true, %entry ], [ %tmp, %DEFAULT ], [ true, %entry ]

This PR supports a new pattern (``switch -> icmp -> `select` -> `br` -> `phi` ) to simplify unconditional branches as follows:

case1:
  switch i32 %x, label %DEFAULT [ 
    i32 0, label %end    
    i32 1, label %case2 
  ]
case2:
  br label %end
DEFAULT:
  %tmp = icmp eq i32 %x, 2
  %val = select i1 %tmp, i32 V3, i32 V4
  br label %end
end:
  %res = phi i32 [ V1, %case1 ], [ V2, %case2 ], [ %val, %DEFAULT ]
  ret i32 %res

We prefer to split the edge to 'end' so that there are TWO entries of V3/V4 to the PHI, merging the icmp & select into the switch, as follows:

case1:
  switch i32 %x, label %DEFAULT [
    i32 0, label %end
    i32 1, label %case2
    i32 2, label %case3
  ]
case2:
  br label %end
case3:
  br label %end
DEFAULT:
  br label %end
end:
  %res = phi i32 [ V1, %case1 ], [ V2, %case2 ], [ V3, %case3 ], [ V4, %DEFAULT]
  ret i32 %res

Alive2 Proof: https://alive2.llvm.org/ce/z/jYHM4f
Promising Optimization Impact: dtcxzyw/llvm-opt-benchmark#3006


Full diff: https://github.com/llvm/llvm-project/pull/165580.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/SimplifyCFG.cpp (+176-2)
  • (modified) llvm/test/Transforms/SimplifyCFG/ARM/switch-to-lookup-table.ll (+2-2)
  • (modified) llvm/test/Transforms/SimplifyCFG/switch-transformations-no-lut.ll (+2-3)
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index 4fac5d36ddb3f..927d17cfc6d2c 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -301,7 +301,9 @@ class SimplifyCFGOpt {
 
   bool tryToSimplifyUncondBranchWithICmpInIt(ICmpInst *ICI,
                                              IRBuilder<> &Builder);
-
+  bool tryToSimplifyUncondBranchWithICmpSelectInIt(ICmpInst *ICI,
+                                                   SelectInst *Select,
+                                                   IRBuilder<> &Builder);
   bool hoistCommonCodeFromSuccessors(Instruction *TI, bool AllInstsEqOnly);
   bool hoistSuccIdenticalTerminatorToSwitchOrIf(
       Instruction *TI, Instruction *I1,
@@ -5116,6 +5118,173 @@ bool SimplifyCFGOpt::tryToSimplifyUncondBranchWithICmpInIt(
   return true;
 }
 
+/// Similar to tryToSimplifyUncondBranchWithICmpInIt, but handle a more generic
+/// case. This is called when we find an icmp instruction (a seteq/setne with a
+/// constant) and its following select instruction as the only TWO instruction
+/// in a block that ends with an uncond branch.  We are looking for a very
+/// specific pattern that occurs when "
+///    if (A == 1) return C1;
+///    if (A == 2) return C2;
+///    if (A < 3) return C3;
+///    return C4;
+/// " gets simplified.  In this case, we merge the first two "branches of icmp"
+/// into a switch, but then the default value goes to an uncond block with a lt
+/// icmp and select in it, as InstCombine can not simplify "A < 3" as "A == 2".
+/// After SimplifyCFG and other subsequent optimizations (e.g., SCCP), we might
+/// get something like:
+///
+/// case1:
+///   switch i8 %A, label %DEFAULT [ i8 0, label %end    i8 1, label %case2 ]
+/// case2:
+///   br label %end
+/// DEFAULT:
+///   %tmp = icmp eq i8 %A, 2
+///   %val = select i1 %tmp, i8 C3, i8 C4
+///   br label %end
+/// end:
+///   _ = phi i8 [ C1, %case1 ], [ C2, %case2 ], [ %val, %DEFAULT ]
+///
+/// We prefer to split the edge to 'end' so that there are TWO entries of V3/V4
+/// to the PHI, merging the icmp & select into the switch, as follows:
+///
+/// case1:
+///   switch i8 %A, label %DEFAULT [
+///     i8 0, label %end
+///     i8 1, label %case2
+///     i8 2, label %case3
+///   ]
+/// case2:
+///   br label %end
+/// case3:
+///   br label %end
+/// DEFAULT:
+///   br label %end
+/// end:
+///   _ = phi i8 [ C1, %case1 ], [ C2, %case2 ], [ C3, %case2 ], [ C4, %DEFAULT]
+bool SimplifyCFGOpt::tryToSimplifyUncondBranchWithICmpSelectInIt(
+    ICmpInst *ICI, SelectInst *Select, IRBuilder<> &Builder) {
+  BasicBlock *BB = ICI->getParent();
+
+  // If the block has any PHIs in it or the icmp/select has multiple uses, it is
+  // too complex.
+  if (isa<PHINode>(BB->begin()) || !ICI->hasOneUse() || !Select->hasOneUse())
+    return false;
+
+  // The pattern we're looking for is where our only predecessor is a switch on
+  // 'V' and this block is the default case for the switch.  In this case we can
+  // fold the compared value into the switch to simplify things.
+  BasicBlock *Pred = BB->getSinglePredecessor();
+  if (!Pred || !isa<SwitchInst>(Pred->getTerminator()))
+    return false;
+
+  Value *IcmpCond;
+  ConstantInt *NewCaseVal;
+  CmpPredicate Predicate;
+
+  // Match icmp X, C
+  if (!match(ICI,
+             m_ICmp(Predicate, m_Value(IcmpCond), m_ConstantInt(NewCaseVal))))
+    return false;
+
+  Value *SelectCond, *SelectTrueVal, *SelectFalseVal;
+  // Match select Cond, TrueVal, FalseVal
+  if (!match(Select, m_Select(m_Value(SelectCond), m_Value(SelectTrueVal),
+                              m_Value(SelectFalseVal))))
+    return false;
+
+  // Check if the select condition is the same as the icmp condition.
+  if (SelectCond != ICI)
+    return false;
+
+  SwitchInst *SI = cast<SwitchInst>(Pred->getTerminator());
+  if (SI->getCondition() != IcmpCond)
+    return false;
+
+  // If BB is reachable on a non-default case, then we simply know the value of
+  // V in this block.  Substitute it and constant fold the icmp instruction
+  // away.
+  if (SI->getDefaultDest() != BB) {
+    ConstantInt *VVal = SI->findCaseDest(BB);
+    assert(VVal && "Should have a unique destination value");
+    ICI->setOperand(0, VVal);
+
+    if (Value *V = simplifyInstruction(ICI, {DL, ICI})) {
+      ICI->replaceAllUsesWith(V);
+      ICI->eraseFromParent();
+    }
+    // BB is now empty, so it is likely to simplify away.
+    return requestResimplify();
+  }
+
+  // Ok, the block is reachable from the default dest.  If the constant we're
+  // comparing exists in one of the other edges, then we can constant fold ICI
+  // and zap it.
+  if (SI->findCaseValue(NewCaseVal) != SI->case_default()) {
+    Value *V;
+    if (Predicate == ICmpInst::ICMP_EQ)
+      V = ConstantInt::getFalse(BB->getContext());
+    else
+      V = ConstantInt::getTrue(BB->getContext());
+
+    ICI->replaceAllUsesWith(V);
+    ICI->eraseFromParent();
+    // BB is now empty, so it is likely to simplify away.
+    return requestResimplify();
+  }
+
+  // The use of the select has to be in the 'end' block, by the only PHI node in
+  // the block.
+  BasicBlock *SuccBlock = BB->getTerminator()->getSuccessor(0);
+  PHINode *PHIUse = dyn_cast<PHINode>(Select->user_back());
+  if (PHIUse == nullptr || PHIUse != &SuccBlock->front() ||
+      isa<PHINode>(++BasicBlock::iterator(PHIUse)))
+    return false;
+
+  // If the icmp is a SETEQ, then the default dest gets SelectFalseVal, the new
+  // edge gets SelectTrueVal in the PHI.
+  Value *DefaultCst = SelectFalseVal;
+  Value *NewCst = SelectTrueVal;
+
+  if (ICI->getPredicate() == ICmpInst::ICMP_NE)
+    std::swap(DefaultCst, NewCst);
+
+  // Replace Select (which is used by the PHI for the default value) with
+  // SelectFalseVal or SelectTrueVal depending on if ICI is EQ or NE.
+  Select->replaceAllUsesWith(DefaultCst);
+  Select->eraseFromParent();
+  ICI->eraseFromParent();
+
+  SmallVector<DominatorTree::UpdateType, 2> Updates;
+
+  // Okay, the switch goes to this block on a default value.  Add an edge from
+  // the switch to the merge point on the compared value.
+  BasicBlock *NewBB =
+      BasicBlock::Create(BB->getContext(), "switch.edge", BB->getParent(), BB);
+  {
+    SwitchInstProfUpdateWrapper SIW(*SI);
+    auto W0 = SIW.getSuccessorWeight(0);
+    SwitchInstProfUpdateWrapper::CaseWeightOpt NewW;
+    if (W0) {
+      NewW = ((uint64_t(*W0) + 1) >> 1);
+      SIW.setSuccessorWeight(0, *NewW);
+    }
+    SIW.addCase(NewCaseVal, NewBB, NewW);
+    if (DTU)
+      Updates.push_back({DominatorTree::Insert, Pred, NewBB});
+  }
+
+  // NewBB branches to the phi block, add the uncond branch and the phi entry.
+  Builder.SetInsertPoint(NewBB);
+  Builder.SetCurrentDebugLocation(SI->getDebugLoc());
+  Builder.CreateBr(SuccBlock);
+  PHIUse->addIncoming(NewCst, NewBB);
+  if (DTU) {
+    Updates.push_back({DominatorTree::Insert, NewBB, SuccBlock});
+    DTU->applyUpdates(Updates);
+  }
+  return true;
+}
+
 /// The specified branch is a conditional branch.
 /// Check to see if it is branching on an or/and chain of icmp instructions, and
 /// fold it into a switch instruction if so.
@@ -8167,13 +8336,18 @@ bool SimplifyCFGOpt::simplifyUncondBranch(BranchInst *BI,
 
   // If the only instruction in the block is a seteq/setne comparison against a
   // constant, try to simplify the block.
-  if (ICmpInst *ICI = dyn_cast<ICmpInst>(I))
+  if (ICmpInst *ICI = dyn_cast<ICmpInst>(I)) {
     if (ICI->isEquality() && isa<ConstantInt>(ICI->getOperand(1))) {
       ++I;
       if (I->isTerminator() &&
           tryToSimplifyUncondBranchWithICmpInIt(ICI, Builder))
         return true;
+      if (isa<SelectInst>(I) && I->getNextNode()->isTerminator() &&
+          tryToSimplifyUncondBranchWithICmpSelectInIt(ICI, cast<SelectInst>(I),
+                                                      Builder))
+        return true;
     }
+  }
 
   // See if we can merge an empty landing pad block with another which is
   // equivalent.
diff --git a/llvm/test/Transforms/SimplifyCFG/ARM/switch-to-lookup-table.ll b/llvm/test/Transforms/SimplifyCFG/ARM/switch-to-lookup-table.ll
index 6def8f4eeb089..a51b816846cdc 100644
--- a/llvm/test/Transforms/SimplifyCFG/ARM/switch-to-lookup-table.ll
+++ b/llvm/test/Transforms/SimplifyCFG/ARM/switch-to-lookup-table.ll
@@ -15,8 +15,8 @@
 ; DISABLE-NOT: @{{.*}} = private unnamed_addr constant [3 x ptr] [ptr @c1, ptr @c2, ptr @c3]
 ; ENABLE:      @{{.*}} = private unnamed_addr constant [3 x ptr] [ptr @g1, ptr @g2, ptr @g3]
 ; DISABLE-NOT: @{{.*}} = private unnamed_addr constant [3 x ptr] [ptr @g1, ptr @g2, ptr @g3]
-; ENABLE:      @{{.*}} = private unnamed_addr constant [3 x ptr] [ptr @f1, ptr @f2, ptr @f3]
-; DISABLE-NOT: @{{.*}} = private unnamed_addr constant [3 x ptr] [ptr @f1, ptr @f2, ptr @f3]
+; ENABLE:      @{{.*}} = private unnamed_addr constant [4 x ptr] [ptr @f1, ptr @f2, ptr @f3, ptr @f4]
+; DISABLE-NOT: @{{.*}} = private unnamed_addr constant [4 x ptr] [ptr @f1, ptr @f2, ptr @f3, ptr @f4]
 
 target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
 target triple = "armv7a--none-eabi"
diff --git a/llvm/test/Transforms/SimplifyCFG/switch-transformations-no-lut.ll b/llvm/test/Transforms/SimplifyCFG/switch-transformations-no-lut.ll
index 25267dcc6dbcb..48be76c19e48f 100644
--- a/llvm/test/Transforms/SimplifyCFG/switch-transformations-no-lut.ll
+++ b/llvm/test/Transforms/SimplifyCFG/switch-transformations-no-lut.ll
@@ -410,13 +410,12 @@ define i1 @single_value_with_mask(i32 %x) {
 ; OPTNOLUT-NEXT:      i32 21, label %[[END]]
 ; OPTNOLUT-NEXT:      i32 48, label %[[END]]
 ; OPTNOLUT-NEXT:      i32 16, label %[[END]]
+; OPTNOLUT-NEXT:      i32 80, label %[[END]]
 ; OPTNOLUT-NEXT:    ]
 ; OPTNOLUT:       [[DEFAULT]]:
-; OPTNOLUT-NEXT:    [[CMP:%.*]] = icmp eq i32 [[X]], 80
-; OPTNOLUT-NEXT:    [[SEL:%.*]] = select i1 [[CMP]], i1 false, i1 true
 ; OPTNOLUT-NEXT:    br label %[[END]]
 ; OPTNOLUT:       [[END]]:
-; OPTNOLUT-NEXT:    [[RES:%.*]] = phi i1 [ false, %[[ENTRY]] ], [ false, %[[ENTRY]] ], [ false, %[[ENTRY]] ], [ false, %[[ENTRY]] ], [ [[SEL]], %[[DEFAULT]] ]
+; OPTNOLUT-NEXT:    [[RES:%.*]] = phi i1 [ false, %[[ENTRY]] ], [ false, %[[ENTRY]] ], [ false, %[[ENTRY]] ], [ false, %[[ENTRY]] ], [ true, %[[DEFAULT]] ], [ false, %[[ENTRY]] ]
 ; OPTNOLUT-NEXT:    ret i1 [[RES]]
 ;
 ; TTINOLUT-LABEL: define i1 @single_value_with_mask(

Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd only keep tryToSimplifyUncondBranchWithICmpSelectInIt and treat icmp as select (icmp), i1 true, i1 false. Then we can handle both forms in one function.

bool SimplifyCFGOpt::tryToSimplifyUncondBranchWithICmpSelectInIt(
    ICmpInst *ICI, Value *TV, Value *FV, IRBuilder<> &Builder)

Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.


/// Similar to tryToSimplifyUncondBranchWithICmpInIt, but handle a more generic
/// case. This is called when we find an icmp instruction (a seteq/setne with a
/// constant) and its following select instruction as the only TWO instruction
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// constant) and its following select instruction as the only TWO instruction
/// constant) and its following select instruction as the only TWO instructions

@Camsyn Camsyn merged commit a0e222f into llvm:main Nov 8, 2025
10 checks passed
@Camsyn Camsyn deleted the feat-simplify-uncond-br-with-icmp-select branch November 8, 2025 12:52
vinay-deshmukh pushed a commit to vinay-deshmukh/llvm-project that referenced this pull request Nov 8, 2025
Previously, SimplifyCFG only simplified unconditional branches when they
met a pattern (`swicth` -> `icmp` -> `br` -> `phi`) as follows:
```LLVM
   switch i8 %A, label %DEFAULT [ i8 1, label %end    i8 2, label %end ]
DEFAULT:
   %tmp = icmp eq i8 %A, 92
   br label %end
end:
   ... = phi i1 [ true, %entry ], [ %tmp, %DEFAULT ], [ true, %entry ]
```

This PR supports a new and more generic pattern (`switch` -> `icmp` ->
`select` -> `br` -> `phi` ) to simplify unconditional branches as
follows:
```LLVM
; BEFORE
case1:
  switch i32 %x, label %DEFAULT [ 
    i32 0, label %end    
    i32 1, label %case2 
  ]
case2:
  br label %end
DEFAULT:
  %tmp = icmp eq i32 %x, 2
  %val = select i1 %tmp, i32 V3, i32 V4
  br label %end
end:
  ... = phi i32 [ V1, %case1 ], [ V2, %case2 ], [ %val, %DEFAULT ]
```

We prefer to split the edge to 'end' so that there are TWO entries of
V3/V4 to the PHI, merging the icmp & select into the switch, as follows:
```LLVM
; AFTER
case1:
  switch i32 %x, label %DEFAULT [
    i32 0, label %end
    i32 1, label %case2
    i32 2, label %case3
  ]
case2:
  br label %end
case3:
  br label %end
DEFAULT:
  br label %end
end:
  ... = phi i32 [ V1, %case1 ], [ V2, %case2 ], [ V3, %case3 ], [ V4, %DEFAULT]
```

Alive2 Proof: https://alive2.llvm.org/ce/z/jYHM4f
Promising Optimization Impact:
dtcxzyw/llvm-opt-benchmark#3006
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants