@AlexMaclean
Member

Extend folding for X Pred C2 ? X BOp C1 : C2 BOp C1 to min/max(X, C2) BOp C1 to allow min and max as BOp. This ensures a constant clamping pattern is folded into a pair of min/max instructions. Here is a simplified example of a case where this folding is not occurring currently.

int clampToU8(int v) {
    if (v < 0) return 0;
    if (v > 255) return 255;
    return v;
}

https://godbolt.org/z/78jhKPWbv

Generic proof: https://alive2.llvm.org/ce/z/cdpLYy
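
For readers who want to see the target shape at the source level, here is a minimal sketch (not part of the patch) of the same clamp written as a max followed by a min, which is the pair of operations the fold is meant to produce. The names `clampToU8MinMax` and `checkClampEquivalence` are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>

// Branchy form, as in the description above.
int clampToU8(int v) {
  if (v < 0) return 0;
  if (v > 255) return 255;
  return v;
}

// Hypothetical helper: the same clamp expressed as max(v, 0) then min(..., 255).
int clampToU8MinMax(int v) { return std::min(std::max(v, 0), 255); }

// Spot-check that both forms agree on a range straddling both bounds.
void checkClampEquivalence() {
  for (int v = -1000; v <= 1000; ++v)
    assert(clampToU8(v) == clampToU8MinMax(v));
}
```

This only checks the source-level equivalence on a sample range; the alive2 link above is the actual proof for the IR transform.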

@AlexMaclean AlexMaclean self-assigned this Jun 5, 2025
@AlexMaclean AlexMaclean requested a review from nikic as a code owner June 5, 2025 00:31
@llvmbot llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms labels Jun 5, 2025
@llvmbot
Member

llvmbot commented Jun 5, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Alex MacLean (AlexMaclean)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/142878.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp (+33-14)
  • (modified) llvm/test/Transforms/InstCombine/canonicalize-const-to-bop.ll (+51)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
index d7d0431a5b8d0..8307a9842fb95 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
@@ -1822,7 +1822,6 @@ static Instruction *foldSelectICmpEq(SelectInst &SI, ICmpInst *ICI,
 static Value *foldSelectWithConstOpToBinOp(ICmpInst *Cmp, Value *TrueVal,
                                            Value *FalseVal,
                                            IRBuilderBase &Builder) {
-  BinaryOperator *BOp;
   Constant *C1, *C2, *C3;
   Value *X;
   CmpPredicate Predicate;
@@ -1838,30 +1837,48 @@ static Value *foldSelectWithConstOpToBinOp(ICmpInst *Cmp, Value *TrueVal,
     Predicate = ICmpInst::getInversePredicate(Predicate);
   }
 
-  if (!match(TrueVal, m_BinOp(BOp)) || !match(FalseVal, m_Constant(C3)))
+  if (!match(FalseVal, m_Constant(C3)) || !TrueVal->hasOneUse())
     return nullptr;
 
-  unsigned Opcode = BOp->getOpcode();
+  bool IsIntrinsic;
+  unsigned Opcode;
+  if (BinaryOperator *BOp = dyn_cast<BinaryOperator>(TrueVal)) {
+    Opcode = BOp->getOpcode();
+    IsIntrinsic = false;
 
-  // This fold causes some regressions and is primarily intended for
-  // add and sub. So we early exit for div and rem to minimize the
-  // regressions.
-  if (Instruction::isIntDivRem(Opcode))
-    return nullptr;
+    // This fold causes some regressions and is primarily intended for
+    // add and sub. So we early exit for div and rem to minimize the
+    // regressions.
+    if (Instruction::isIntDivRem(Opcode))
+      return nullptr;
 
-  if (!match(BOp, m_OneUse(m_BinOp(m_Specific(X), m_Constant(C2)))))
+    if (!match(BOp, m_BinOp(m_Specific(X), m_Constant(C2))))
+      return nullptr;
+
+  } else if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(TrueVal)) {
+    if (!match(II, m_MaxOrMin(m_Specific(X), m_Constant(C2))))
+      return nullptr;
+    Opcode = II->getIntrinsicID();
+    IsIntrinsic = true;
+  } else {
     return nullptr;
+  }
 
   Value *RHS;
   SelectPatternFlavor SPF;
-  const DataLayout &DL = BOp->getDataLayout();
+  const DataLayout &DL = Cmp->getDataLayout();
   auto Flipped = getFlippedStrictnessPredicateAndConstant(Predicate, C1);
 
-  if (C3 == ConstantFoldBinaryOpOperands(Opcode, C1, C2, DL)) {
+  auto FoldBinaryOpOrIntrinsic = [&](Constant *LHS, Constant *RHS) {
+    return IsIntrinsic ? ConstantFoldBinaryIntrinsic(Opcode, LHS, RHS,
+                                                     LHS->getType(), nullptr)
+                       : ConstantFoldBinaryOpOperands(Opcode, LHS, RHS, DL);
+  };
+
+  if (C3 == FoldBinaryOpOrIntrinsic(C1, C2)) {
     SPF = getSelectPattern(Predicate).Flavor;
     RHS = C1;
-  } else if (Flipped && C3 == ConstantFoldBinaryOpOperands(
-                                  Opcode, Flipped->second, C2, DL)) {
+  } else if (Flipped && C3 == FoldBinaryOpOrIntrinsic(Flipped->second, C2)) {
     SPF = getSelectPattern(Flipped->first).Flavor;
     RHS = Flipped->second;
   } else {
@@ -1870,7 +1887,9 @@ static Value *foldSelectWithConstOpToBinOp(ICmpInst *Cmp, Value *TrueVal,
 
   Intrinsic::ID IntrinsicID = getMinMaxIntrinsic(SPF);
   Value *Intrinsic = Builder.CreateBinaryIntrinsic(IntrinsicID, X, RHS);
-  return Builder.CreateBinOp(BOp->getOpcode(), Intrinsic, C2);
+  return IsIntrinsic ? Builder.CreateBinaryIntrinsic(Opcode, Intrinsic, C2)
+                     : Builder.CreateBinOp(Instruction::BinaryOps(Opcode),
+                                           Intrinsic, C2);
 }
 
 /// Visit a SelectInst that has an ICmpInst as its first operand.
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-const-to-bop.ll b/llvm/test/Transforms/InstCombine/canonicalize-const-to-bop.ll
index 68049ca230191..c08ec1bb7de0d 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-const-to-bop.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-const-to-bop.ll
@@ -399,3 +399,54 @@ define i8 @sub_const_on_lhs_negative(i8 %x) {
   %s = select i1 %cmp, i8 %sub, i8 50
   ret i8 %s
 }
+
+define i8 @smin_ugt(i8 %x) {
+; CHECK-LABEL: define i8 @smin_ugt(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[S:%.*]] = call i8 @llvm.umin.i8(i8 [[X]], i8 50)
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %smin = call i8 @llvm.smin.i8(i8 %x, i8 50)
+  %cmp = icmp ugt i8 %x, 100
+  %s = select i1 %cmp, i8 50, i8 %smin
+  ret i8 %s
+}
+
+define i8 @smax_ugt(i8 %x) {
+; CHECK-LABEL: define i8 @smax_ugt(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = call i8 @llvm.umin.i8(i8 [[X]], i8 100)
+; CHECK-NEXT:    [[S:%.*]] = call i8 @llvm.smax.i8(i8 [[TMP1]], i8 50)
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %smax = call i8 @llvm.smax.i8(i8 %x, i8 50)
+  %cmp = icmp ugt i8 %x, 100
+  %s = select i1 %cmp, i8 100, i8 %smax
+  ret i8 %s
+}
+
+define i8 @umin_slt(i8 %x) {
+; CHECK-LABEL: define i8 @umin_slt(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = call i8 @llvm.smax.i8(i8 [[X]], i8 0)
+; CHECK-NEXT:    [[S:%.*]] = call i8 @llvm.umin.i8(i8 [[TMP1]], i8 100)
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %cmp = icmp slt i8 %x, 0
+  %umin = tail call i8 @llvm.umin.i8(i8 %x, i8 100)
+  %s = select i1 %cmp, i8 0, i8 %umin
+  ret i8 %s
+}
+
+define i8 @umax_sgt(i8 %x) {
+; CHECK-LABEL: define i8 @umax_sgt(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = call i8 @llvm.smin.i8(i8 [[X]], i8 100)
+; CHECK-NEXT:    [[S:%.*]] = call i8 @llvm.umax.i8(i8 [[TMP1]], i8 50)
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %cmp = icmp sgt i8 %x, 100
+  %umax = tail call i8 @llvm.umax.i8(i8 %x, i8 50)
+  %s = select i1 %cmp, i8 100, i8 %umax
+  ret i8 %s
+}
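
As a sanity check on the four new tests, the sketch below brute-forces every i8 input and compares each original `select` against the expression in its CHECK lines. It is a standalone model, not LLVM code: the helpers `umin8`, `umax8`, `smin8`, `smax8`, and `newTestFoldsHold` are hypothetical names, and the signed helpers assume the usual two's-complement reinterpretation of an 8-bit pattern.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Model the i8 min/max intrinsics on raw 8-bit patterns (hypothetical helpers).
static uint8_t umin8(uint8_t a, uint8_t b) { return std::min(a, b); }
static uint8_t umax8(uint8_t a, uint8_t b) { return std::max(a, b); }
static uint8_t smin8(uint8_t a, uint8_t b) {
  return static_cast<int8_t>(a) < static_cast<int8_t>(b) ? a : b;
}
static uint8_t smax8(uint8_t a, uint8_t b) {
  return static_cast<int8_t>(a) > static_cast<int8_t>(b) ? a : b;
}

// Returns true if, for every i8 value, each select from the new tests
// produces the same result as the folded form in its CHECK lines.
bool newTestFoldsHold() {
  for (int i = 0; i < 256; ++i) {
    const uint8_t x = static_cast<uint8_t>(i);
    const int8_t sx = static_cast<int8_t>(x); // signed view of the same bits

    // @smin_ugt: select (x ugt 100), 50, smin(x, 50)  ->  umin(x, 50)
    if ((x > 100 ? 50 : smin8(x, 50)) != umin8(x, 50)) return false;
    // @smax_ugt: select (x ugt 100), 100, smax(x, 50)  ->  smax(umin(x, 100), 50)
    if ((x > 100 ? 100 : smax8(x, 50)) != smax8(umin8(x, 100), 50)) return false;
    // @umin_slt: select (x slt 0), 0, umin(x, 100)  ->  umin(smax(x, 0), 100)
    if ((sx < 0 ? 0 : umin8(x, 100)) != umin8(smax8(x, 0), 100)) return false;
    // @umax_sgt: select (x sgt 100), 100, umax(x, 50)  ->  umax(smin(x, 100), 50)
    if ((sx > 100 ? 100 : umax8(x, 50)) != umax8(smin8(x, 100), 50)) return false;
  }
  return true;
}
```

An exhaustive loop is feasible here only because the tests use i8; for wider types the generic alive2 proof is the right tool.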

Member

@dtcxzyw dtcxzyw left a comment

LGTM.

@AlexMaclean AlexMaclean merged commit 107601e into llvm:main Jun 6, 2025
14 checks passed
@AlexMaclean
Member Author

@nikic / @dtcxzyw I'm considering trying to improve this function further and I'm interested in your thoughts.

The wrap flags may be preserved on the BinOp if wrapping doesn't occur when computing C2 BOp C1 (https://alive2.llvm.org/ce/z/n_3aNJ). Unfortunately, there doesn't seem to be a simple way to know whether wrapping occurred when constant folding the binop. The best I can think of would be to first constant-fold a sext/zext of C1 and C2 and see if evaluating BOp on a larger type is equivalent to evaluating on the original type and then extending. Does this seem like an okay approach? Or do you know of a better way?
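
To make the widening idea concrete, here is a rough standalone sketch for an 8-bit `add` only. It is not LLVM code and does not use the constant folder: the function names `addNoSignedWrap8` and `addNoUnsignedWrap8` are hypothetical, and the narrowing casts assume the usual modular (two's-complement) conversion.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if a + b does not wrap as a signed 8-bit add,
// i.e. the nsw flag could be kept.
bool addNoSignedWrap8(int8_t a, int8_t b) {
  // Evaluate on a wider type after sign-extending both operands.
  const int16_t wide =
      static_cast<int16_t>(static_cast<int16_t>(a) + static_cast<int16_t>(b));
  // Evaluate with 8-bit wrapping semantics.
  const int8_t narrow = static_cast<int8_t>(
      static_cast<uint8_t>(static_cast<uint8_t>(a) + static_cast<uint8_t>(b)));
  // No wrap occurred iff sign-extending the narrow result gives the wide one.
  return wide == static_cast<int16_t>(narrow);
}

// Hypothetical helper: the unsigned (nuw) counterpart using zero-extension.
bool addNoUnsignedWrap8(uint8_t a, uint8_t b) {
  const uint16_t wide =
      static_cast<uint16_t>(static_cast<uint16_t>(a) + static_cast<uint16_t>(b));
  const uint8_t narrow = static_cast<uint8_t>(a + b);
  return wide == static_cast<uint16_t>(narrow);
}
```

For example, `addNoSignedWrap8(100, 27)` holds because 127 fits in i8, while `addNoSignedWrap8(100, 28)` does not because 128 wraps to -128.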

In addition, I've observed some cases where multiple binops with constants get folded into a comparison, preventing this optimization from occurring. Would it be alright to iteratively fold binops with constants until we reach the other value in the select, or is this sort of thing too complex / potentially expensive for InstCombine?

@dtcxzyw
Member

dtcxzyw commented Jun 9, 2025

> Unfortunately, there doesn't seem to be a simple way to know whether wrapping occurred when constant folding the binop.

If you only care about scalar constants/splat vectors, just use APInt::xxxx_ov.

> In addition, I've observed some cases where multiple binops with constants get folded into a comparison, preventing this optimization from occurring. Would it be alright to iteratively fold binops with constants until we reach the other value in the select, or is this sort of thing too complex / potentially expensive for InstCombine?

Can you provide some examples?

@nikic
Contributor

nikic commented Jun 9, 2025

> @nikic / @dtcxzyw I'm considering trying to improve this function further and I'm interested in your thoughts.
>
> The wrap flags may be preserved on the BinOp if wrapping doesn't occur when computing C2 BOp C1 (https://alive2.llvm.org/ce/z/n_3aNJ). Unfortunately, there doesn't seem to be a simple way to know whether wrapping occurred when constant folding the binop. The best I can think of would be to first constant-fold a sext/zext of C1 and C2 and see if evaluating BOp on a larger type is equivalent to evaluating on the original type and then extending. Does this seem like an okay approach? Or do you know of a better way?

There is a willNotOverflow() helper in InstCombine you could use. Do the flags matter in practice for further optimization?

@AlexMaclean
Member Author

> There is a willNotOverflow() helper in InstCombine you could use. Do the flags matter in practice for further optimization?

Thanks! I'm not sure about IR optimizations, but this would improve our ability to reorder the BinOp and Min/Max instructions during ISel and could enable some better folding. I've created #143471 so you can see what the change would look like.

tomtor pushed a commit to tomtor/llvm-project that referenced this pull request Jun 14, 2025