Skip to content

Conversation

@kper
Copy link
Contributor

@kper kper commented Sep 5, 2025

Closes #156898

I have added two cases. The first one matches when the constant is exactly power of 2. The second case was to address the general case mentioned in the linked issue. I, however, did not really solve the general case.
We can only emit a icmp ult if all the bits are one and that's only the case when the constant + 1 is a power of 2. Otherwise, we need to create icmp eq for every bit that is one.

Here are a few examples which won't be working with the two cases:

@kper kper requested a review from nikic as a code owner September 5, 2025 07:13
@llvmbot llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms labels Sep 5, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 5, 2025

@llvm/pr-subscribers-llvm-transforms

Author: None (kper)

Changes

Closes #156898

I have added two cases. The first one matches when the constant is exactly power of 2. The second case was to address the general case mentioned in the linked issue. I, however, did not really solve the general case.
We can only emit a icmp ult if all the bits are one and that's only the case when the constant + 1 is a power of 2. Otherwise, we need to create icmp eq for every bit that is one.

Here are a few examples which won't be working with the two cases:

I wonder whether I should still implement the general case since it increments the number of instructions?

cc @nikic @andjo403


Full diff: https://github.com/llvm/llvm-project/pull/157030.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp (+23)
  • (modified) llvm/test/Transforms/InstCombine/trunc-lshr.ll (+21)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
index fdef49e310f81..a3e9969503f02 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
@@ -11,11 +11,13 @@
 //===----------------------------------------------------------------------===//
 
 #include "InstCombineInternal.h"
+#include "llvm/ADT/APInt.h"
 #include "llvm/ADT/SetVector.h"
 #include "llvm/Analysis/ConstantFolding.h"
 #include "llvm/IR/DataLayout.h"
 #include "llvm/IR/DebugInfo.h"
 #include "llvm/IR/PatternMatch.h"
+#include "llvm/IR/Value.h"
 #include "llvm/Support/KnownBits.h"
 #include "llvm/Transforms/InstCombine/InstCombiner.h"
 #include <optional>
@@ -969,6 +971,27 @@ Instruction *InstCombinerImpl::visitTrunc(TruncInst &Trunc) {
     Changed = true;
   }
 
+  const APInt *C1;
+  Value *V1;
+  // trunc (lshr i8 C1, V1) to i1 -> icmp eq V1, sqrt(C1) iff C1 is power of 2
+  if (DestWidth == 1 &&
+      match(Src, m_OneUse(m_Shr(m_Power2(C1), m_Value(V1))))) {
+    const APInt Sqrt = C1->sqrt();
+    Value *Right = ConstantInt::get(V1->getType(), Sqrt);
+    Value *Icmp = Builder.CreateICmpEQ(V1, Right);
+    return replaceInstUsesWith(Trunc, Icmp);
+  }
+
+  // trunc (lshr i8 C1, V1) to i1 -> icmp ult V1, sqrt(C1 + 1) iff (C1 + 1) is
+  // power of 2
+  if (DestWidth == 1 && match(Src, m_OneUse(m_Shr(m_APInt(C1), m_Value(V1)))) &&
+      (*C1 + 1).isPowerOf2()) {
+    const APInt Sqrt = (*C1 + 1).sqrt();
+    Value *Right = ConstantInt::get(V1->getType(), Sqrt);
+    Value *Icmp = Builder.CreateICmpULT(V1, Right);
+    return replaceInstUsesWith(Trunc, Icmp);
+  }
+
   return Changed ? &Trunc : nullptr;
 }
 
diff --git a/llvm/test/Transforms/InstCombine/trunc-lshr.ll b/llvm/test/Transforms/InstCombine/trunc-lshr.ll
index 4364b09cfa709..84daba3d13b9a 100644
--- a/llvm/test/Transforms/InstCombine/trunc-lshr.ll
+++ b/llvm/test/Transforms/InstCombine/trunc-lshr.ll
@@ -93,3 +93,24 @@ define i1 @test5(i32 %i, ptr %p) {
   ret i1 %op
 }
 
+define i1 @test6(i8 %x) {
+; CHECK-LABEL: define i1 @test6(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[TRUNC:%.*]] = icmp eq i8 [[X]], 2
+; CHECK-NEXT:    ret i1 [[TRUNC]]
+;
+  %lshr = lshr i8 4, %x
+  %trunc = trunc i8 %lshr to i1
+  ret i1 %trunc
+}
+
+define i1 @test7(i8 %x) {
+; CHECK-LABEL: define i1 @test7(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[TRUNC:%.*]] = icmp ult i8 [[X]], 4
+; CHECK-NEXT:    ret i1 [[TRUNC]]
+;
+  %lshr = lshr i8 15, %x
+  %trunc = trunc i8 %lshr to i1
+  ret i1 %trunc
+}

@dtcxzyw dtcxzyw requested a review from andjo403 September 7, 2025 14:34
@dtcxzyw dtcxzyw changed the title [InstCombine] Added optimisation for trunc (Pow2 >> x) to 1 [InstCombine] Added optimisation for trunc (Pow2 >> x) to i1 Sep 7, 2025
@kper kper force-pushed the instcombine/156898 branch from 3aadd9b to 766e5f7 Compare September 7, 2025 20:00
@andjo403
Copy link
Contributor

andjo403 commented Sep 7, 2025

the folds also seems to hold for ashr see https://alive2.llvm.org/ce/z/Dm5HEp

@kper kper force-pushed the instcombine/156898 branch from 766e5f7 to c51f8fb Compare September 8, 2025 17:01
@github-actions
Copy link

github-actions bot commented Sep 8, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@kper kper force-pushed the instcombine/156898 branch 2 times, most recently from 9a22aa8 to 96b964e Compare September 8, 2025 17:47
Copy link
Contributor

@andjo403 andjo403 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me but wait for an other review

@dtcxzyw
Copy link
Member

dtcxzyw commented Sep 9, 2025

@zyw-bot mfuzz

@kper kper force-pushed the instcombine/156898 branch from 7c9465f to 1f7e7d4 Compare September 10, 2025 15:09
Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks.

As a follow-up, are you interested in folding trunc (shr C=0b11111...0000, %x) to i1 -> icmp ugt %x, cttz(C) - 1 as well? Not sure if it is profitable in real-world programs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Value *Right = ConstantInt::get(V1->getType(), (*C1 + 1).countr_zero());
Value *Right = ConstantInt::get(V1->getType(), C1->countr_one());

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

Transformation is done iff PowerOf2(C) || PowerOf2(C + 1)
@dtcxzyw
Copy link
Member

dtcxzyw commented Sep 10, 2025

@kper Please avoid force-pushing your branch if unnecessary.

@kper kper force-pushed the instcombine/156898 branch from 1f7e7d4 to 50fa45b Compare September 10, 2025 15:22
@kper
Copy link
Contributor Author

kper commented Sep 10, 2025

@kper Please avoid force-pushing your branch if unnecessary.

ah ok, sry
I just wanted to have one commit

@kper
Copy link
Contributor Author

kper commented Sep 10, 2025

LGTM. Thanks.

As a follow-up, are you interested in folding trunc (shr C=0b11111...0000, %x) to i1 -> icmp ugt %x, cttz(C) - 1 as well? Not sure if it is profitable in real-world programs.

Yeah sure :)
I will make a new PR tomorrow and let's see whether it improves something
Can't hurt to try it, thanks for the suggestion

@dtcxzyw dtcxzyw enabled auto-merge (squash) September 10, 2025 15:26
@dtcxzyw dtcxzyw merged commit 265b032 into llvm:main Sep 10, 2025
9 checks passed
nikic pushed a commit that referenced this pull request Sep 12, 2025
…157998)

Follow up of #157030

```
trunc ( lshr i8 C1, V1) to i1 -> icmp ugt V1, cttz(C1) - 1 iff (C1) is negative power of 2
trunc ( ashr i8 C1, V1) to i1 -> icmp ugt V1, cttz(C1) - 1 iff (C1) is negative power of 2
```

General proof:
lshr: https://alive2.llvm.org/ce/z/vVfaJc
ashr: https://alive2.llvm.org/ce/z/8aAcgD
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Sep 12, 2025
…> x) to i1 (#157998)

Follow up of llvm/llvm-project#157030

```
trunc ( lshr i8 C1, V1) to i1 -> icmp ugt V1, cttz(C1) - 1 iff (C1) is negative power of 2
trunc ( ashr i8 C1, V1) to i1 -> icmp ugt V1, cttz(C1) - 1 iff (C1) is negative power of 2
```

General proof:
lshr: https://alive2.llvm.org/ce/z/vVfaJc
ashr: https://alive2.llvm.org/ce/z/8aAcgD
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[InstCombine] Missed optimization for trunc (Pow2 >> X) to i1

4 participants