Conversation

@nikic
Contributor

@nikic nikic commented Nov 11, 2024

InstSimplify currently folds alloc1 == alloc2 to false, even if one of them is a zero-size allocation. A zero-size allocation may have the same address as another allocation.

This also disables the fold for the case where we're comparing a zero-size alloc with the middle of another allocation. It's possible that this case is legal to fold depending on our precise zero-size allocation semantics, but LangRef currently doesn't specify this either way, so we shouldn't make assumptions here.
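The guarded check can be sketched outside of LLVM as follows. This is a minimal model, not the actual `computePointerICmp` code: the function name `foldDistinctAllocEq` is hypothetical, plain integers stand in for `APInt`, and it only models the `==` predicate for two pointers already known to be based on different allocations.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the guarded check (assumption: both pointers are
// based on *different* allocations with known sizes and constant offsets).
// Returns false when `lhs == rhs` may be folded to false, and std::nullopt
// when the comparison has to be left alone.
std::optional<bool> foldDistinctAllocEq(int64_t lhsOffset, uint64_t lhsSize,
                                        int64_t rhsOffset, uint64_t rhsSize) {
  // New in this change: a zero-size allocation may share its address with
  // another allocation, so no conclusion is drawn when either size is 0.
  if (lhsSize == 0 || rhsSize == 0)
    return std::nullopt;

  // If the pointers were equal, the base of one allocation would sit `dist`
  // bytes into the other one.
  int64_t dist = lhsOffset - rhsOffset;
  uint64_t magnitude = dist >= 0 ? static_cast<uint64_t>(dist)
                                 : uint64_t{0} - static_cast<uint64_t>(dist);
  bool baseInsideOther = dist >= 0 ? magnitude < lhsSize : magnitude < rhsSize;

  // Two non-empty allocations cannot overlap, so equality is impossible.
  if (baseInsideOther)
    return false;
  return std::nullopt;
}
```

With these assumptions, `foldDistinctAllocEq(4, 8, 0, 8)` folds to `false`, while `foldDistinctAllocEq(0, 8, 0, 0)` (a zero-size right-hand side) and `foldDistinctAllocEq(8, 8, 0, 8)` (a one-past-the-end pointer) are left unfolded.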

@nikic nikic requested review from dtcxzyw and goldsteinn November 11, 2024 15:53
@llvmbot llvmbot added the llvm:analysis (Includes value tracking, cost tables and constant folding) and llvm:transforms labels Nov 11, 2024
@llvmbot
Member

llvmbot commented Nov 11, 2024

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Full diff: https://github.com/llvm/llvm-project/pull/115728.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/InstructionSimplify.cpp (+2-2)
  • (modified) llvm/test/Transforms/InstSimplify/cmp-alloca-offsets.ll (+10-4)
diff --git a/llvm/lib/Analysis/InstructionSimplify.cpp b/llvm/lib/Analysis/InstructionSimplify.cpp
index daa468ac095c36..ccb7fd39ba969d 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -2774,8 +2774,8 @@ static Constant *computePointerICmp(CmpInst::Predicate Pred, Value *LHS,
         return nullptr;
       }(LHS);
       Opts.NullIsUnknownSize = F ? NullPointerIsDefined(F) : true;
-      if (getObjectSize(LHS, LHSSize, DL, TLI, Opts) &&
-          getObjectSize(RHS, RHSSize, DL, TLI, Opts)) {
+      if (getObjectSize(LHS, LHSSize, DL, TLI, Opts) && LHSSize != 0 &&
+          getObjectSize(RHS, RHSSize, DL, TLI, Opts) && RHSSize != 0) {
         APInt Dist = LHSOffset - RHSOffset;
         if (Dist.isNonNegative() ? Dist.ult(LHSSize) : (-Dist).ult(RHSSize))
           return ConstantInt::get(getCompareTy(LHS),
diff --git a/llvm/test/Transforms/InstSimplify/cmp-alloca-offsets.ll b/llvm/test/Transforms/InstSimplify/cmp-alloca-offsets.ll
index d076035b269e46..d2c4b944d21c8f 100644
--- a/llvm/test/Transforms/InstSimplify/cmp-alloca-offsets.ll
+++ b/llvm/test/Transforms/InstSimplify/cmp-alloca-offsets.ll
@@ -234,8 +234,9 @@ define i1 @zst_alloca_start() {
 ; CHECK-LABEL: @zst_alloca_start(
 ; CHECK-NEXT:    [[A:%.*]] = alloca i64, align 8
 ; CHECK-NEXT:    [[A2:%.*]] = alloca {}, align 8
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq ptr [[A]], [[A2]]
 ; CHECK-NEXT:    call void @escape(ptr [[A]], ptr [[A2]])
-; CHECK-NEXT:    ret i1 false
+; CHECK-NEXT:    ret i1 [[CMP]]
 ;
   %a = alloca i64
   %a2 = alloca {}
@@ -249,8 +250,10 @@ define i1 @zst_alloca_middle() {
 ; CHECK-LABEL: @zst_alloca_middle(
 ; CHECK-NEXT:    [[A:%.*]] = alloca i64, align 8
 ; CHECK-NEXT:    [[A2:%.*]] = alloca {}, align 8
+; CHECK-NEXT:    [[GEP:%.*]] = getelementptr i8, ptr [[A]], i64 4
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq ptr [[GEP]], [[A2]]
 ; CHECK-NEXT:    call void @escape(ptr [[A]], ptr [[A2]])
-; CHECK-NEXT:    ret i1 false
+; CHECK-NEXT:    ret i1 [[CMP]]
 ;
   %a = alloca i64
   %a2 = alloca {}
@@ -282,8 +285,9 @@ define i1 @zst_alloca_end() {
 define i1 @zst_global_start() {
 ; CHECK-LABEL: @zst_global_start(
 ; CHECK-NEXT:    [[A:%.*]] = alloca i64, align 8
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq ptr [[A]], @gz
 ; CHECK-NEXT:    call void @escape(ptr [[A]], ptr @gz)
-; CHECK-NEXT:    ret i1 false
+; CHECK-NEXT:    ret i1 [[CMP]]
 ;
   %a = alloca i64
   %gep = getelementptr i8, ptr %a, i64 0
@@ -295,8 +299,10 @@ define i1 @zst_global_start() {
 define i1 @zst_global_middle() {
 ; CHECK-LABEL: @zst_global_middle(
 ; CHECK-NEXT:    [[A:%.*]] = alloca i64, align 8
+; CHECK-NEXT:    [[GEP:%.*]] = getelementptr i8, ptr [[A]], i64 4
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq ptr [[GEP]], @gz
 ; CHECK-NEXT:    call void @escape(ptr [[A]], ptr @gz)
-; CHECK-NEXT:    ret i1 false
+; CHECK-NEXT:    ret i1 [[CMP]]
 ;
   %a = alloca i64
   %gep = getelementptr i8, ptr %a, i64 4

Member

@dtcxzyw dtcxzyw left a comment

LGTM

+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq ptr [[GEP]], [[A2]]
 ; CHECK-NEXT:    call void @escape(ptr [[A]], ptr [[A2]])
-; CHECK-NEXT:    ret i1 false
+; CHECK-NEXT:    ret i1 [[CMP]]
Member

Can you add an InstCombine test to demonstrate that this case can be folded using KnownBits (??000 != ??100)?

Contributor Author

Oops, this was actually a mistake in the test. The zero-size allocation was supposed to have align 1. It's weird that LLVM infers align 8 for it by default. I adjusted the test.

InstCombine actually fails to handle this (see https://llvm.godbolt.org/z/WMPKdnrMd for variant without allocas) due to a weakness in foldICmpUsingKnownBits. I'll work on a fix.

Contributor Author

Here's the fix: #115874
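The known-bits argument from this thread (`??000 != ??100`) can be illustrated with a small standalone model. This is only a sketch: the `Known` struct and the helper names are hypothetical and deliberately much simpler than `llvm::KnownBits`, and the model assumes a power-of-two alignment and a constant offset smaller than that alignment.

```cpp
#include <cstdint>

// Hypothetical miniature of known-bits information (not llvm::KnownBits).
struct Known {
  uint64_t zeros; // bits known to be 0
  uint64_t ones;  // bits known to be 1
};

// A pointer with alignment `align` (a power of two) has its low bits clear.
Known fromAlignment(uint64_t align) { return {align - 1, 0}; }

// An aligned pointer plus a constant `offset < align`: the low bits are
// exactly the bits of the offset.
Known alignedPlusOffset(uint64_t align, uint64_t offset) {
  uint64_t mask = align - 1;
  return {mask & ~offset, mask & offset};
}

// Two values must differ if some bit is known 0 in one and known 1 in the
// other.
bool provablyDifferent(Known a, Known b) {
  return ((a.zeros & b.ones) | (a.ones & b.zeros)) != 0;
}
```

Under this model, `provablyDifferent(alignedPlusOffset(8, 4), fromAlignment(8))` is true, which is the `align 8` situation the original test accidentally created. With `fromAlignment(1)` on the zero-size side nothing is known about the low bits, so the comparison cannot be decided this way.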

+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq ptr [[A]], @gz
 ; CHECK-NEXT:    call void @escape(ptr [[A]], ptr @gz)
-; CHECK-NEXT:    ret i1 false
+; CHECK-NEXT:    ret i1 [[CMP]]
Member

This case (alloca != global) should be folded.

Contributor Author

Currently LLVM doesn't make assumptions about how different allocation types are positioned relative to each other. We'd have to add LangRef wording for this, requiring that alloca and global allocations cannot be adjacent.
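The placement question can be made concrete with a toy address model. The sketch below assumes only that two distinct allocations may not share a byte. That rule is an assumption chosen for illustration, not wording from LangRef, and the `Range` type, `sharesByte` helper and concrete addresses are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical model: an allocation occupies the half-open byte range
// [base, base + size).
struct Range {
  uint64_t base;
  uint64_t size;
};

// True if the two ranges have at least one byte in common.
bool sharesByte(Range a, Range b) {
  uint64_t start = std::max(a.base, b.base);
  uint64_t end = std::min(a.base + a.size, b.base + b.size);
  return start < end;
}
```

In this model an 8-byte alloca at `{100, 8}` and a zero-size global at `{100, 0}` share no byte, so nothing forbids them from having the same address. The same holds for a 4-byte global placed directly after the alloca at `{108, 4}`, whose address equals the alloca's one-past-the-end pointer. Ruling either case out would need an extra rule beyond non-overlap.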

@nikic nikic force-pushed the instsimplify-zst-alloc branch from aec2f3b to 9b8e91f Compare November 12, 2024 09:34
@llvmbot llvmbot added the llvm:instcombine (Covers the InstCombine, InstSimplify and AggressiveInstCombine passes) label Nov 12, 2024
InstSimplify currently folds alloc1 == alloc2 to false, even if
one of them is a zero-size allocation. A zero-size allocation may
have the same address as another allocation.

This also disables the fold for the case where we're comparing a
zero-size alloc with the middle of another allocation. It's
possible that this case is legal to fold depending on our precise
zero-size allocation semantics, but LangRef currently doesn't
specify this either way, so we shouldn't make assumptions here.
@nikic nikic force-pushed the instsimplify-zst-alloc branch from 9b8e91f to f0f0f95 Compare November 14, 2024 09:36
@nikic nikic merged commit dd9f1a5 into llvm:main Nov 14, 2024
8 checks passed
@nikic nikic deleted the instsimplify-zst-alloc branch November 14, 2024 10:55
Labels

llvm:analysis Includes value tracking, cost tables and constant folding llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms

3 participants