Skip to content

Conversation

@nikic
Copy link
Contributor

@nikic nikic commented Jul 7, 2025

If C1 is 1 and we're working with a power of two divisor, this will end up replacing the and for the remainder with a multiply and a longer dependency chain.

Fixes #147176.

@nikic nikic requested a review from dtcxzyw July 7, 2025 15:18
@llvmbot llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms labels Jul 7, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 7, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Changes

If C1 is 1, this will end up replacing the remainder with a multiply and a longer dependency chain. This is clearly unprofitable in the case where the remainder is an and, but I think the profitability is also questionable for urem (because usually udiv and urem are produced by the same instruction), so I've disabled both cases.

Fixes #147176.


Full diff: https://github.com/llvm/llvm-project/pull/147319.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp (+3-1)
  • (modified) llvm/test/Transforms/InstCombine/add4.ll (+86)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
index 1ba548b6ff062..1ad0a9c488e80 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
@@ -1193,8 +1193,10 @@ Value *InstCombinerImpl::SimplifyAddWithRemainder(BinaryOperator &I) {
   }
   Value *DivOpV;
   APInt DivOpC;
+  // The transform is valid for C1==1, but not profitable.
   if (MatchRem(Rem, X, C0, IsSigned) &&
-      MatchDiv(Div, DivOpV, DivOpC, IsSigned) && X == DivOpV && C0 == DivOpC) {
+      MatchDiv(Div, DivOpV, DivOpC, IsSigned) && X == DivOpV && C0 == DivOpC &&
+      !C1.isOne()) {
     APInt NewC = C1 - C2 * C0;
     if (!NewC.isZero() && !Rem->hasOneUse())
       return nullptr;
diff --git a/llvm/test/Transforms/InstCombine/add4.ll b/llvm/test/Transforms/InstCombine/add4.ll
index 0e97deb4d98ad..f766ccf651aa2 100644
--- a/llvm/test/Transforms/InstCombine/add4.ll
+++ b/llvm/test/Transforms/InstCombine/add4.ll
@@ -289,3 +289,89 @@ entry:
   %add = add i32 %shl, %rem
   ret i32 %add
 }
+
+define i32 @fold_add_udiv_urem_no_mul(i32 noundef %val) {
+; CHECK-LABEL: @fold_add_udiv_urem_no_mul(
+; CHECK-NEXT:    [[DIV:%.*]] = udiv i32 [[VAL:%.*]], 10
+; CHECK-NEXT:    [[REM:%.*]] = urem i32 [[VAL]], 10
+; CHECK-NEXT:    [[ADD:%.*]] = add nuw nsw i32 [[DIV]], [[REM]]
+; CHECK-NEXT:    ret i32 [[ADD]]
+;
+  %div = udiv i32 %val, 10
+  %rem = urem i32 %val, 10
+  %add = add i32 %div, %rem
+  ret i32 %add
+}
+
+define i32 @fold_add_udiv_urem_rem_mul(i32 noundef %val) {
+; CHECK-LABEL: @fold_add_udiv_urem_rem_mul(
+; CHECK-NEXT:    [[DIV:%.*]] = udiv i32 [[VAL:%.*]], 10
+; CHECK-NEXT:    [[REM:%.*]] = urem i32 [[VAL]], 10
+; CHECK-NEXT:    [[MUL:%.*]] = mul nuw nsw i32 [[REM]], 3
+; CHECK-NEXT:    [[ADD:%.*]] = add nuw nsw i32 [[DIV]], [[MUL]]
+; CHECK-NEXT:    ret i32 [[ADD]]
+;
+  %div = udiv i32 %val, 10
+  %rem = urem i32 %val, 10
+  %mul = mul i32 %rem, 3
+  %add = add i32 %div, %mul
+  ret i32 %add
+}
+
+define i32 @fold_add_udiv_urem_pow2_no_mul(i32 noundef %arg) {
+; CHECK-LABEL: @fold_add_udiv_urem_pow2_no_mul(
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr i32 [[ARG:%.*]], 4
+; CHECK-NEXT:    [[AND:%.*]] = and i32 [[ARG]], 15
+; CHECK-NEXT:    [[ADD:%.*]] = add nuw nsw i32 [[LSHR]], [[AND]]
+; CHECK-NEXT:    ret i32 [[ADD]]
+;
+  %lshr = lshr i32 %arg, 4
+  %and = and i32 %arg, 15
+  %add = add i32 %lshr, %and
+  ret i32 %add
+}
+
+define i32 @fold_add_udiv_urem_pow2_div_mul(i32 noundef %arg) {
+; CHECK-LABEL: @fold_add_udiv_urem_pow2_div_mul(
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr i32 [[ARG:%.*]], 4
+; CHECK-NEXT:    [[TMP1:%.*]] = mul i32 [[LSHR]], -13
+; CHECK-NEXT:    [[ADD:%.*]] = add i32 [[TMP1]], [[ARG]]
+; CHECK-NEXT:    ret i32 [[ADD]]
+;
+  %lshr = lshr i32 %arg, 4
+  %mul = mul i32 %lshr, 3
+  %and = and i32 %arg, 15
+  %add = add i32 %mul, %and
+  ret i32 %add
+}
+
+define i32 @fold_add_udiv_urem_pow2_rem_mul(i32 noundef %arg) {
+; CHECK-LABEL: @fold_add_udiv_urem_pow2_rem_mul(
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr i32 [[ARG:%.*]], 4
+; CHECK-NEXT:    [[AND:%.*]] = and i32 [[ARG]], 15
+; CHECK-NEXT:    [[MUL:%.*]] = mul nuw nsw i32 [[AND]], 3
+; CHECK-NEXT:    [[ADD:%.*]] = add nuw nsw i32 [[LSHR]], [[MUL]]
+; CHECK-NEXT:    ret i32 [[ADD]]
+;
+  %lshr = lshr i32 %arg, 4
+  %and = and i32 %arg, 15
+  %mul = mul i32 %and, 3
+  %add = add i32 %lshr, %mul
+  ret i32 %add
+}
+
+define i32 @fold_add_udiv_urem_pow2_both_mul(i32 noundef %arg) {
+; CHECK-LABEL: @fold_add_udiv_urem_pow2_both_mul(
+; CHECK-NEXT:    [[LSHR:%.*]] = lshr i32 [[ARG:%.*]], 4
+; CHECK-NEXT:    [[TMP1:%.*]] = mul i32 [[ARG]], 3
+; CHECK-NEXT:    [[TMP2:%.*]] = mul i32 [[LSHR]], -41
+; CHECK-NEXT:    [[ADD:%.*]] = add i32 [[TMP2]], [[TMP1]]
+; CHECK-NEXT:    ret i32 [[ADD]]
+;
+  %lshr = lshr i32 %arg, 4
+  %mul1 = mul i32 %lshr, 7
+  %and = and i32 %arg, 15
+  %mul2 = mul i32 %and, 3
+  %add = add i32 %mul1, %mul2
+  ret i32 %add
+}

@nikic
Copy link
Contributor Author

nikic commented Jul 7, 2025

If C1 is 1, this will end up replacing the remainder with a multiply and a longer dependency chain. This is clearly unprofitable in the case where the remainder is an and, but I think the profitability is also questionable for urem (because usually udiv and urem are produced by the same instruction), so I've disabled both cases.

I've added a special case where the divisor is 2, as that gets beneficial followup folds.

On further consideration, I think it probably generally makes sense to do the transform for the cases using actual div/rem. For constant divisors we're not going to use actual divrem instructions anyway, and after playing around with codegen for different cases a bit, I think the mul ends up being marginally better in most cases.

@nikic
Copy link
Contributor Author

nikic commented Jul 8, 2025

Updated to narrow the exclusion to just the unsigned, power-of-two (which is not 2) case, so this should now only exclude the specific case of converting and to mul.

Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@nikic nikic merged commit 889854b into llvm:main Jul 8, 2025
9 checks passed
@nikic nikic deleted the add-with-remainder branch July 8, 2025 12:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[InstCombine] Unprofitable transformation

3 participants