@dtcxzyw dtcxzyw commented Oct 4, 2024

llvmbot commented Oct 4, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

Alive2: https://alive2.llvm.org/ce/z/Z7QgjH


Full diff: https://github.com/llvm/llvm-project/pull/111151.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp (+9)
  • (modified) llvm/test/Transforms/InstCombine/lshr.ll (+62)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
index ee6b60f7f70d68..b330a8ad9f8a6c 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
@@ -770,6 +770,15 @@ Value *InstCombinerImpl::SimplifyDemandedUseBits(Instruction *I,
             return InsertNewInstWith(Shl, I->getIterator());
           }
         }
+
+        const APInt *Factor;
+        if (match(I->getOperand(0),
+                  m_OneUse(m_Mul(m_Value(X), m_APInt(Factor)))) &&
+            Factor->countr_zero() >= ShiftAmt) {
+          BinaryOperator *Mul = BinaryOperator::CreateMul(
+              X, ConstantInt::get(X->getType(), Factor->lshr(ShiftAmt)));
+          return InsertNewInstWith(Mul, I->getIterator());
+        }
       }
 
       // Unsigned shift right.
diff --git a/llvm/test/Transforms/InstCombine/lshr.ll b/llvm/test/Transforms/InstCombine/lshr.ll
index 4360714c78caa6..ccc2b61b2989af 100644
--- a/llvm/test/Transforms/InstCombine/lshr.ll
+++ b/llvm/test/Transforms/InstCombine/lshr.ll
@@ -1523,3 +1523,65 @@ define <2 x i8> @bool_add_lshr_vec_wrong_shift_amt(<2 x i1> %a, <2 x i1> %b) {
   %lshr = lshr <2 x i8> %add, <i8 1, i8 2>
   ret <2 x i8> %lshr
 }
+
+define i32 @lowbits_of_lshr_mul(i64 %x) {
+; CHECK-LABEL: @lowbits_of_lshr_mul(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[TMP0:%.*]] = trunc i64 [[X:%.*]] to i32
+; CHECK-NEXT:    [[CONV:%.*]] = mul i32 [[TMP0]], 15
+; CHECK-NEXT:    ret i32 [[CONV]]
+;
+entry:
+  %mul = mul i64 %x, 64424509440
+  %shift = lshr i64 %mul, 32
+  %conv = trunc i64 %shift to i32
+  ret i32 %conv
+}
+
+define i32 @lowbits_of_lshr_mul_mask(i32 %x) {
+; CHECK-LABEL: @lowbits_of_lshr_mul_mask(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[TMP0:%.*]] = mul i32 [[X:%.*]], 1600
+; CHECK-NEXT:    [[CONV:%.*]] = and i32 [[TMP0]], 32704
+; CHECK-NEXT:    ret i32 [[CONV]]
+;
+entry:
+  %mul = mul i32 %x, 104857600
+  %shift = lshr i32 %mul, 16
+  %conv = and i32 %shift, 32767
+  ret i32 %conv
+}
+
+; Negative tests
+
+define i32 @lowbits_of_lshr_mul_mask_multiuse(i32 %x) {
+; CHECK-LABEL: @lowbits_of_lshr_mul_mask_multiuse(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[MUL:%.*]] = mul i32 [[X:%.*]], 104857600
+; CHECK-NEXT:    call void @use(i32 [[MUL]])
+; CHECK-NEXT:    [[SHIFT:%.*]] = lshr exact i32 [[MUL]], 16
+; CHECK-NEXT:    [[CONV:%.*]] = and i32 [[SHIFT]], 32704
+; CHECK-NEXT:    ret i32 [[CONV]]
+;
+entry:
+  %mul = mul i32 %x, 104857600
+  call void @use(i32 %mul)
+  %shift = lshr i32 %mul, 16
+  %conv = and i32 %shift, 32767
+  ret i32 %conv
+}
+
+define i32 @lowbits_of_lshr_mul_mask_indivisible(i32 %x) {
+; CHECK-LABEL: @lowbits_of_lshr_mul_mask_indivisible(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[MUL:%.*]] = mul i32 [[X:%.*]], 25600
+; CHECK-NEXT:    [[SHIFT:%.*]] = lshr i32 [[MUL]], 16
+; CHECK-NEXT:    [[CONV:%.*]] = and i32 [[SHIFT]], 32767
+; CHECK-NEXT:    ret i32 [[CONV]]
+;
+entry:
+  %mul = mul i32 %x, 25600
+  %shift = lshr i32 %mul, 16
+  %conv = and i32 %shift, 32767
+  ret i32 %conv
+}
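
The arithmetic behind the new fold can be checked outside LLVM. The following is a minimal standalone sketch (C++14 or later), not the InstCombine code itself. It assumes, as the `trunc` and `and` in the two positive tests guarantee, that the high `ShiftAmt` bits of the shift result are never demanded. The helper names (`trailing_zeros`, `fold_is_legal`, `check_samples`, and the `*_original`/`*_folded` pairs) are hypothetical and exist only for illustration; the constants are taken from the tests above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of trailing zero bits (64 for zero).
constexpr unsigned trailing_zeros(uint64_t v) {
  if (v == 0) return 64;
  unsigned n = 0;
  while ((v & 1u) == 0) { v >>= 1; ++n; }
  return n;
}

// Mirrors the guard in the patch: Factor->countr_zero() >= ShiftAmt.
constexpr bool fold_is_legal(uint64_t factor, unsigned shift_amt) {
  return trailing_zeros(factor) >= shift_amt;
}

// @lowbits_of_lshr_mul: trunc(lshr(mul i64 %x, 15 << 32), 32)
constexpr uint32_t lshr_mul_original(uint64_t x) {
  return static_cast<uint32_t>((x * 64424509440ULL) >> 32);
}
// Expected output of the test: mul(trunc %x, 15)
constexpr uint32_t lshr_mul_folded(uint64_t x) {
  return static_cast<uint32_t>(x) * 15u;
}

// @lowbits_of_lshr_mul_mask: and(lshr(mul i32 %x, 1600 << 16), 16), 32767)
constexpr uint32_t mask_original(uint32_t x) {
  return ((x * 104857600u) >> 16) & 32767u;
}
// Expected output of the test: and(mul i32 %x, 1600), 32704)
constexpr uint32_t mask_folded(uint32_t x) {
  return (x * 1600u) & 32704u;
}

// Compare both forms on a few inputs, including ones where the multiply wraps.
inline void check_samples() {
  const uint64_t samples[] = {0, 1, 3, 100000, 0xFFFFFFFFULL,
                              0x123456789ABCDEF0ULL, ~0ULL};
  for (uint64_t x : samples) {
    assert(lshr_mul_original(x) == lshr_mul_folded(x));
    const uint32_t x32 = static_cast<uint32_t>(x);
    assert(mask_original(x32) == mask_folded(x32));
  }
}
```

With these definitions, `fold_is_legal(104857600, 16)` holds because 104857600 = 25 * 2^22, while `fold_is_legal(25600, 16)` does not because 25600 = 25 * 2^10, which matches the `indivisible` negative test. The mask narrowing from 32767 to 32704 follows from 1600 = 25 * 2^6: the low six bits of `x * 1600` are always zero.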

@goldsteinn
Contributor

Any value in also handling div?

if (match(I->getOperand(0),
          m_OneUse(m_Mul(m_Value(X), m_APInt(Factor)))) &&
    Factor->countr_zero() >= ShiftAmt) {
  BinaryOperator *Mul = BinaryOperator::CreateMul(
Contributor

You can preserve nuw and nsw iff you have nuw:
https://alive2.llvm.org/ce/z/P9NudG

Member Author

(X *nuw (Y << K)) u>> K has been handled in other places: https://godbolt.org/z/43W4vE8bK

Contributor

Can we put the two impls in the same place?

Contributor

I don't think the two patterns would combine well. In one case we can optimize because there is no overflow. In the other case we can optimize because the bits affected by overflow are not demanded.
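
The distinction between the two cases can be illustrated with plain 32-bit unsigned arithmetic. This is only a sketch under assumptions: the shift amount and factor are borrowed from the tests in this PR, and `shifted`, `folded`, and `check_regimes` are hypothetical names, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kShift = 16;                // K, the lshr amount
constexpr uint32_t kFactor = 1600u << kShift;  // 104857600, as in the tests

// (x * (F << K)) u>> K, computed with wrapping 32-bit arithmetic.
constexpr uint32_t shifted(uint32_t x) { return (x * kFactor) >> kShift; }
// x * F, the form produced by either fold.
constexpr uint32_t folded(uint32_t x) { return x * (kFactor >> kShift); }

inline void check_regimes() {
  // No-overflow case (the nuw pattern): 3 * 104857600 fits in 32 bits,
  // so the two forms agree on every bit.
  assert(shifted(3u) == folded(3u));

  // Overflow case: 100000 * 104857600 wraps, so the full results differ...
  assert(shifted(100000u) != folded(100000u));
  // ...but the low 32 - K = 16 bits still agree, which is all that matters
  // when the high K bits are not demanded.
  assert((shifted(100000u) & 0xFFFFu) == (folded(100000u) & 0xFFFFu));
}
```

In the first case the fold is justified by a flag on the multiply; in the second it is justified by the user of the shift, which is why the second fold sits in the demanded-bits code.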

Contributor

@nikic nikic left a comment

LGTM

@dtcxzyw dtcxzyw merged commit 295d6b1 into llvm:main Dec 3, 2024
@dtcxzyw dtcxzyw deleted the perf/demanded-bits-mul-lshr branch December 3, 2024 04:04