[InstCombine] Allow min/max in constant BOp min/max folding #142878
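@llvm/pr-subscribers-llvm-transforms

Author: Alex MacLean (AlexMaclean)

Changes

Extend folding for `X Pred C2 ? X BOp C1 : C2 BOp C1` to `min/max(X, C2) BOp C1` to allow min and max as `BOp`. This ensures a constant clamping pattern is folded into a pair of min/max instructions. Here is a simplified example of a case where this folding is not occurring currently:

int clampToU8(int v) {
  if (v < 0) return 0;
  if (v > 255) return 255;
  return v;
}

https://godbolt.org/z/78jhKPWbv

Generic proof: https://alive2.llvm.org/ce/z/cdpLYy

Full diff: https://github.com/llvm/llvm-project/pull/142878.diff

2 Files Affected: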
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
index d7d0431a5b8d0..8307a9842fb95 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
@@ -1822,7 +1822,6 @@ static Instruction *foldSelectICmpEq(SelectInst &SI, ICmpInst *ICI,
static Value *foldSelectWithConstOpToBinOp(ICmpInst *Cmp, Value *TrueVal,
Value *FalseVal,
IRBuilderBase &Builder) {
- BinaryOperator *BOp;
Constant *C1, *C2, *C3;
Value *X;
CmpPredicate Predicate;
@@ -1838,30 +1837,48 @@ static Value *foldSelectWithConstOpToBinOp(ICmpInst *Cmp, Value *TrueVal,
Predicate = ICmpInst::getInversePredicate(Predicate);
}
- if (!match(TrueVal, m_BinOp(BOp)) || !match(FalseVal, m_Constant(C3)))
+ if (!match(FalseVal, m_Constant(C3)) || !TrueVal->hasOneUse())
return nullptr;
- unsigned Opcode = BOp->getOpcode();
+ bool IsIntrinsic;
+ unsigned Opcode;
+ if (BinaryOperator *BOp = dyn_cast<BinaryOperator>(TrueVal)) {
+ Opcode = BOp->getOpcode();
+ IsIntrinsic = false;
- // This fold causes some regressions and is primarily intended for
- // add and sub. So we early exit for div and rem to minimize the
- // regressions.
- if (Instruction::isIntDivRem(Opcode))
- return nullptr;
+ // This fold causes some regressions and is primarily intended for
+ // add and sub. So we early exit for div and rem to minimize the
+ // regressions.
+ if (Instruction::isIntDivRem(Opcode))
+ return nullptr;
- if (!match(BOp, m_OneUse(m_BinOp(m_Specific(X), m_Constant(C2)))))
+ if (!match(BOp, m_BinOp(m_Specific(X), m_Constant(C2))))
+ return nullptr;
+
+ } else if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(TrueVal)) {
+ if (!match(II, m_MaxOrMin(m_Specific(X), m_Constant(C2))))
+ return nullptr;
+ Opcode = II->getIntrinsicID();
+ IsIntrinsic = true;
+ } else {
return nullptr;
+ }
Value *RHS;
SelectPatternFlavor SPF;
- const DataLayout &DL = BOp->getDataLayout();
+ const DataLayout &DL = Cmp->getDataLayout();
auto Flipped = getFlippedStrictnessPredicateAndConstant(Predicate, C1);
- if (C3 == ConstantFoldBinaryOpOperands(Opcode, C1, C2, DL)) {
+ auto FoldBinaryOpOrIntrinsic = [&](Constant *LHS, Constant *RHS) {
+ return IsIntrinsic ? ConstantFoldBinaryIntrinsic(Opcode, LHS, RHS,
+ LHS->getType(), nullptr)
+ : ConstantFoldBinaryOpOperands(Opcode, LHS, RHS, DL);
+ };
+
+ if (C3 == FoldBinaryOpOrIntrinsic(C1, C2)) {
SPF = getSelectPattern(Predicate).Flavor;
RHS = C1;
- } else if (Flipped && C3 == ConstantFoldBinaryOpOperands(
- Opcode, Flipped->second, C2, DL)) {
+ } else if (Flipped && C3 == FoldBinaryOpOrIntrinsic(Flipped->second, C2)) {
SPF = getSelectPattern(Flipped->first).Flavor;
RHS = Flipped->second;
} else {
@@ -1870,7 +1887,9 @@ static Value *foldSelectWithConstOpToBinOp(ICmpInst *Cmp, Value *TrueVal,
Intrinsic::ID IntrinsicID = getMinMaxIntrinsic(SPF);
Value *Intrinsic = Builder.CreateBinaryIntrinsic(IntrinsicID, X, RHS);
- return Builder.CreateBinOp(BOp->getOpcode(), Intrinsic, C2);
+ return IsIntrinsic ? Builder.CreateBinaryIntrinsic(Opcode, Intrinsic, C2)
+ : Builder.CreateBinOp(Instruction::BinaryOps(Opcode),
+ Intrinsic, C2);
}
/// Visit a SelectInst that has an ICmpInst as its first operand.
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-const-to-bop.ll b/llvm/test/Transforms/InstCombine/canonicalize-const-to-bop.ll
index 68049ca230191..c08ec1bb7de0d 100644
--- a/llvm/test/Transforms/InstCombine/canonicalize-const-to-bop.ll
+++ b/llvm/test/Transforms/InstCombine/canonicalize-const-to-bop.ll
@@ -399,3 +399,54 @@ define i8 @sub_const_on_lhs_negative(i8 %x) {
%s = select i1 %cmp, i8 %sub, i8 50
ret i8 %s
}
+
+define i8 @smin_ugt(i8 %x) {
+; CHECK-LABEL: define i8 @smin_ugt(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT: [[S:%.*]] = call i8 @llvm.umin.i8(i8 [[X]], i8 50)
+; CHECK-NEXT: ret i8 [[S]]
+;
+ %smin = call i8 @llvm.smin.i8(i8 %x, i8 50)
+ %cmp = icmp ugt i8 %x, 100
+ %s = select i1 %cmp, i8 50, i8 %smin
+ ret i8 %s
+}
+
+define i8 @smax_ugt(i8 %x) {
+; CHECK-LABEL: define i8 @smax_ugt(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT: [[TMP1:%.*]] = call i8 @llvm.umin.i8(i8 [[X]], i8 100)
+; CHECK-NEXT: [[S:%.*]] = call i8 @llvm.smax.i8(i8 [[TMP1]], i8 50)
+; CHECK-NEXT: ret i8 [[S]]
+;
+ %smax = call i8 @llvm.smax.i8(i8 %x, i8 50)
+ %cmp = icmp ugt i8 %x, 100
+ %s = select i1 %cmp, i8 100, i8 %smax
+ ret i8 %s
+}
+
+define i8 @umin_slt(i8 %x) {
+; CHECK-LABEL: define i8 @umin_slt(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT: [[TMP1:%.*]] = call i8 @llvm.smax.i8(i8 [[X]], i8 0)
+; CHECK-NEXT: [[S:%.*]] = call i8 @llvm.umin.i8(i8 [[TMP1]], i8 100)
+; CHECK-NEXT: ret i8 [[S]]
+;
+ %cmp = icmp slt i8 %x, 0
+ %umin = tail call i8 @llvm.umin.i8(i8 %x, i8 100)
+ %s = select i1 %cmp, i8 0, i8 %umin
+ ret i8 %s
+}
+
+define i8 @umax_sgt(i8 %x) {
+; CHECK-LABEL: define i8 @umax_sgt(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT: [[TMP1:%.*]] = call i8 @llvm.smin.i8(i8 [[X]], i8 100)
+; CHECK-NEXT: [[S:%.*]] = call i8 @llvm.umax.i8(i8 [[TMP1]], i8 50)
+; CHECK-NEXT: ret i8 [[S]]
+;
+ %cmp = icmp sgt i8 %x, 100
+ %umax = tail call i8 @llvm.umax.i8(i8 %x, i8 50)
+ %s = select i1 %cmp, i8 100, i8 %umax
+ ret i8 %s
+}
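The four new tests each pair a `select` over a min/max intrinsic with the folded form shown in the CHECK lines. As a rough sanity check outside LLVM, the sketch below models `i8` values as bit patterns in 0..255 and compares both forms for every input. The helper names (`asSigned`, `smin`, `smax`, `umin`, `umax`, `foldedFormsMatchOnAllI8`) are hypothetical stand-ins for the IR intrinsics and predicates, not LLVM APIs, and the alive2 proof linked in the description remains the authoritative argument.

```cpp
#include <cassert>

// An i8 value is modeled as its bit pattern in 0..255 (assumption of this sketch).
using Bits = unsigned;

// Signed interpretation of an 8-bit pattern (two's complement).
int asSigned(Bits v) { return v < 128 ? static_cast<int>(v) : static_cast<int>(v) - 256; }

// Stand-ins for llvm.smin/smax/umin/umax on i8.
Bits smin(Bits a, Bits b) { return asSigned(a) < asSigned(b) ? a : b; }
Bits smax(Bits a, Bits b) { return asSigned(a) > asSigned(b) ? a : b; }
Bits umin(Bits a, Bits b) { return a < b ? a : b; }
Bits umax(Bits a, Bits b) { return a > b ? a : b; }

// Compares each test's input select with its expected folded output.
bool foldedFormsMatchOnAllI8() {
  for (Bits x = 0; x < 256; ++x) {
    // smin_ugt: (x >u 100) ? 50 : smin(x, 50)  ==>  umin(x, 50)
    if ((x > 100 ? 50u : smin(x, 50)) != umin(x, 50)) return false;
    // smax_ugt: (x >u 100) ? 100 : smax(x, 50)  ==>  smax(umin(x, 100), 50)
    if ((x > 100 ? 100u : smax(x, 50)) != smax(umin(x, 100), 50)) return false;
    // umin_slt: (x <s 0) ? 0 : umin(x, 100)  ==>  umin(smax(x, 0), 100)
    if ((asSigned(x) < 0 ? 0u : umin(x, 100)) != umin(smax(x, 0), 100)) return false;
    // umax_sgt: (x >s 100) ? 100 : umax(x, 50)  ==>  umax(smin(x, 100), 50)
    if ((asSigned(x) > 100 ? 100u : umax(x, 50)) != umax(smin(x, 100), 50)) return false;
  }
  return true;
}
```

Calling `foldedFormsMatchOnAllI8()` from a test or `main` covers all 256 inputs, which is feasible only because the tests use `i8`.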
dtcxzyw
left a comment
LGTM.
@nikic / @dtcxzyw I'm considering trying to improve this function further and I'm interested in your thoughts. The wrap flags may be preserved on the BinOp if wrapping doesn't occur when computing C2 BOp C1 (https://alive2.llvm.org/ce/z/n_3aNJ). Unfortunately, there doesn't seem to be a simple way to know whether wrapping occurred when constant folding the binop. The best I can think of would be to first constant-fold a sext/zext of C1 and C2 and see if evaluating BOp on a larger type is equivalent to evaluating on the original type and then extending. Does this seem like an okay approach, or do you know of a better way?

In addition, I've observed some cases where multiple binops with constants get folded into a comparison, preventing this optimization from occurring. Would it be alright to iteratively fold binops with constants until we reach the other value in the select, or is this sort of thing too complex / potentially expensive for InstCombine?
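The widening idea in the comment above can be illustrated on plain integers. This is only a sketch under the assumption that the operation is an `i8` add and that C++ `int`/`unsigned` play the role of the wider type. The function names are hypothetical, and no LLVM constant-folding API is used. Checking that the wide result is representable in 8 bits is equivalent to comparing it with the extended narrow result.

```cpp
#include <cassert>
#include <cstdint>

// Signed case: sign-extend both operands, add in the wider type, and test
// whether the result still fits in i8. If it does not, the narrow add wrapped.
bool addWrapsSigned8(int8_t a, int8_t b) {
  int wide = static_cast<int>(a) + static_cast<int>(b);
  return wide < INT8_MIN || wide > INT8_MAX;
}

// Unsigned case: zero-extend both operands and apply the same test.
bool addWrapsUnsigned8(uint8_t a, uint8_t b) {
  unsigned wide = static_cast<unsigned>(a) + static_cast<unsigned>(b);
  return wide > UINT8_MAX;
}
```

For example, `addWrapsSigned8(100, 100)` reports a wrap because 200 does not fit in a signed 8-bit value, while `addWrapsUnsigned8(100, 100)` does not.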
If you only care about scalar constants/splat vectors, just use
Can you provide some examples?
There is a willNotOverflow() helper in InstCombine you could use. Do the flags matter in practice for further optimization?
Thanks! I'm not sure about IR optimizations, but this would improve our ability to reorder the BinOp and Min/Max instructions during ISel and potentially enable better folding. I've created #143471 so you can see what the change would look like.
Extend folding for `X Pred C2 ? X BOp C1 : C2 BOp C1` to `min/max(X, C2) BOp C1` to allow min and max as `BOp`. This ensures a constant clamping pattern is folded into a pair of min/max instructions. Here is a simplified example of a case where this folding is not occurring currently:

int clampToU8(int v) {
  if (v < 0) return 0;
  if (v > 255) return 255;
  return v;
}

https://godbolt.org/z/78jhKPWbv

Generic proof: https://alive2.llvm.org/ce/z/cdpLYy
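To make the pattern in the description concrete, here is a small sketch in plain C++ rather than LLVM IR. It shows the existing fold with `BOp` being an add, and the clamp from the description rewritten as a min/max pair. The constants 10 and 5, and the names `selectAdd`, `minThenAdd`, `clampToU8MinMax`, and `formsAgree`, are illustrative assumptions, not taken from the patch.

```cpp
#include <algorithm>
#include <cassert>

// X Pred C2 ? X BOp C1 : C2 BOp C1, with Pred = "<", C2 = 10, BOp = "+", C1 = 5.
int selectAdd(int x) { return x < 10 ? x + 5 : 15; }

// The folded form: min(X, C2) BOp C1.
int minThenAdd(int x) { return std::min(x, 10) + 5; }

// The clamp from the description, written with branches.
int clampToU8(int v) {
  if (v < 0) return 0;
  if (v > 255) return 255;
  return v;
}

// The same clamp as a pair of max/min operations.
int clampToU8MinMax(int v) { return std::min(std::max(v, 0), 255); }

// Compares both pairs over a small range where x + 5 cannot overflow.
bool formsAgree(int lo, int hi) {
  for (int x = lo; x <= hi; ++x) {
    if (selectAdd(x) != minThenAdd(x)) return false;
    if (clampToU8(x) != clampToU8MinMax(x)) return false;
  }
  return true;
}
```

A call such as `formsAgree(-1000, 1000)` exercises both sides of each comparison. It is a spot check on a limited range, not a substitute for the alive2 proof.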