[SeparateConstOffsetFromGEP] Preserve inbounds flag based on ValueTracking and NUW #130617

ritter-x2a · 2025-03-10T14:34:18Z

If we know that the initial GEP was inbounds, and we change it to a
sequence of GEPs from the same base pointer where every offset is
non-negative, then the new GEPs are inbounds.

We can also preserve inbounds if the inbounds GEP and the involved additions are NUW.

For SWDEV-516125.

ritter-x2a · 2025-03-10T14:34:43Z

[SDAG] Introduce inbounds flag for pointer arithmetic #131862 : 2 dependent PRs (#131863 , #131994 )
[SeparateConstOffsetFromGEP] Preserve inbounds flag based on ValueTracking and NUW #130617 👈 (View in Graphite)
[SeparateConstOffsetFromGEP] Don't set unsound inbounds flag #130616
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

llvmbot · 2025-03-10T14:45:31Z

@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-llvm-transforms

Author: Fabian Ritter (ritter-x2a)

Changes

If we know that the initial GEP was inbounds, and we change it to a
sequence of GEPs from the same base pointer where every offset is
non-negative, then the new GEPs are inbounds.

For SWDEV-516125.

Full diff: https://github.com/llvm/llvm-project/pull/130617.diff

4 Files Affected:

(modified) llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp (+13-5)
(modified) llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll (+23)
(modified) llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep-and-gvn.ll (+8-8)
(modified) llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep.ll (+4-4)

diff --git a/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp b/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
index 138a71ce79cef..070afdf0752f4 100644
--- a/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
+++ b/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
@@ -1052,6 +1052,8 @@ bool SeparateConstOffsetFromGEP::splitGEP(GetElementPtrInst *GEP) {
     }
   }
 
+  bool MayRecoverInbounds = AccumulativeByteOffset >= 0 && GEP->isInBounds();
+
   // Remove the constant offset in each sequential index. The resultant GEP
   // computes the variadic base.
   // Notice that we don't remove struct field indices here. If LowerGEP is
@@ -1079,6 +1081,8 @@ bool SeparateConstOffsetFromGEP::splitGEP(GetElementPtrInst *GEP) {
         // and the old index if they are not used.
         RecursivelyDeleteTriviallyDeadInstructions(UserChainTail);
         RecursivelyDeleteTriviallyDeadInstructions(OldIdx);
+        MayRecoverInbounds =
+            MayRecoverInbounds && computeKnownBits(NewIdx, *DL).isNonNegative();
       }
     }
   }
@@ -1100,11 +1104,15 @@ bool SeparateConstOffsetFromGEP::splitGEP(GetElementPtrInst *GEP) {
   // address with silently-wrapping two's complement arithmetic".
   // Therefore, the final code will be a semantically equivalent.
   //
-  // TODO(jingyue): do some range analysis to keep as many inbounds as
-  // possible. GEPs with inbounds are more friendly to alias analysis.
-  // TODO(gep_nowrap): Preserve nuw at least.
-  auto NewGEPFlags = GEPNoWrapFlags::none();
-  GEP->setNoWrapFlags(GEPNoWrapFlags::none());
+  // If the initial GEP was inbounds and all variable indices and the
+  // accumulated offsets are non-negative, they can be added in any order and
+  // the intermediate results are in bounds. So, we can preserve the inbounds
+  // flag for both GEPs. GEPs with inbounds are more friendly to alias analysis.
+  //
+  // TODO(gep_nowrap): Preserve nuw?
+  auto NewGEPFlags =
+      MayRecoverInbounds ? GEPNoWrapFlags::inBounds() : GEPNoWrapFlags::none();
+  GEP->setNoWrapFlags(NewGEPFlags);
 
   // Lowers a GEP to either GEPs with a single index or arithmetic operations.
   if (LowerGEP) {
diff --git a/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll b/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll
index 877de38776839..91b5bc874c154 100644
--- a/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll
+++ b/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll
@@ -24,3 +24,26 @@ entry:
   store float %3, ptr %arrayidx.dst, align 4
   ret void
 }
+
+; All offsets must be positive, so inbounds can be preserved.
+define void @must_be_inbounds(ptr %dst, ptr %src, i32 %i) {
+; CHECK-LABEL: @must_be_inbounds(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[I_PROM:%.*]] = zext i32 [[I:%.*]] to i64
+; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr inbounds float, ptr [[SRC:%.*]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX_SRC2:%.*]] = getelementptr inbounds i8, ptr [[TMP0]], i64 4
+; CHECK-NEXT:    [[TMP1:%.*]] = load float, ptr [[ARRAYIDX_SRC2]], align 4
+; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds float, ptr [[DST:%.*]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX_DST4:%.*]] = getelementptr inbounds i8, ptr [[TMP2]], i64 4
+; CHECK-NEXT:    store float [[TMP1]], ptr [[ARRAYIDX_DST4]], align 4
+; CHECK-NEXT:    ret void
+;
+entry:
+  %i.prom = zext i32 %i to i64
+  %idx = add nsw i64 %i.prom, 1
+  %arrayidx.src = getelementptr inbounds float, ptr %src, i64 %idx
+  %3 = load float, ptr %arrayidx.src, align 4
+  %arrayidx.dst = getelementptr inbounds float, ptr %dst, i64 %idx
+  store float %3, ptr %arrayidx.dst, align 4
+  ret void
+}
diff --git a/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep-and-gvn.ll b/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep-and-gvn.ll
index 9a73feb2c4b5c..4474585bf9b06 100644
--- a/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep-and-gvn.ll
+++ b/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep-and-gvn.ll
@@ -157,19 +157,19 @@ define void @sum_of_array3(i32 %x, i32 %y, ptr nocapture %output) {
 ; IR-NEXT:  .preheader:
 ; IR-NEXT:    [[TMP0:%.*]] = zext i32 [[Y]] to i64
 ; IR-NEXT:    [[TMP1:%.*]] = zext i32 [[X]] to i64
-; IR-NEXT:    [[TMP2:%.*]] = getelementptr [32 x [32 x float]], ptr addrspace(3) @array, i64 0, i64 [[TMP1]], i64 [[TMP0]]
+; IR-NEXT:    [[TMP2:%.*]] = getelementptr inbounds [32 x [32 x float]], ptr addrspace(3) @array, i64 0, i64 [[TMP1]], i64 [[TMP0]]
 ; IR-NEXT:    [[TMP3:%.*]] = addrspacecast ptr addrspace(3) [[TMP2]] to ptr
 ; IR-NEXT:    [[TMP4:%.*]] = load float, ptr [[TMP3]], align 4
 ; IR-NEXT:    [[TMP5:%.*]] = fadd float [[TMP4]], 0.000000e+00
-; IR-NEXT:    [[TMP6:%.*]] = getelementptr i8, ptr addrspace(3) [[TMP2]], i64 4
+; IR-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i8, ptr addrspace(3) [[TMP2]], i64 4
 ; IR-NEXT:    [[TMP7:%.*]] = addrspacecast ptr addrspace(3) [[TMP6]] to ptr
 ; IR-NEXT:    [[TMP8:%.*]] = load float, ptr [[TMP7]], align 4
 ; IR-NEXT:    [[TMP9:%.*]] = fadd float [[TMP5]], [[TMP8]]
-; IR-NEXT:    [[TMP10:%.*]] = getelementptr i8, ptr addrspace(3) [[TMP2]], i64 128
+; IR-NEXT:    [[TMP10:%.*]] = getelementptr inbounds i8, ptr addrspace(3) [[TMP2]], i64 128
 ; IR-NEXT:    [[TMP11:%.*]] = addrspacecast ptr addrspace(3) [[TMP10]] to ptr
 ; IR-NEXT:    [[TMP12:%.*]] = load float, ptr [[TMP11]], align 4
 ; IR-NEXT:    [[TMP13:%.*]] = fadd float [[TMP9]], [[TMP12]]
-; IR-NEXT:    [[TMP14:%.*]] = getelementptr i8, ptr addrspace(3) [[TMP2]], i64 132
+; IR-NEXT:    [[TMP14:%.*]] = getelementptr inbounds i8, ptr addrspace(3) [[TMP2]], i64 132
 ; IR-NEXT:    [[TMP15:%.*]] = addrspacecast ptr addrspace(3) [[TMP14]] to ptr
 ; IR-NEXT:    [[TMP16:%.*]] = load float, ptr [[TMP15]], align 4
 ; IR-NEXT:    [[TMP17:%.*]] = fadd float [[TMP13]], [[TMP16]]
@@ -224,19 +224,19 @@ define void @sum_of_array4(i32 %x, i32 %y, ptr nocapture %output) {
 ; IR-NEXT:  .preheader:
 ; IR-NEXT:    [[TMP0:%.*]] = zext i32 [[Y]] to i64
 ; IR-NEXT:    [[TMP1:%.*]] = zext i32 [[X]] to i64
-; IR-NEXT:    [[TMP2:%.*]] = getelementptr [32 x [32 x float]], ptr addrspace(3) @array, i64 0, i64 [[TMP1]], i64 [[TMP0]]
+; IR-NEXT:    [[TMP2:%.*]] = getelementptr inbounds [32 x [32 x float]], ptr addrspace(3) @array, i64 0, i64 [[TMP1]], i64 [[TMP0]]
 ; IR-NEXT:    [[TMP3:%.*]] = addrspacecast ptr addrspace(3) [[TMP2]] to ptr
 ; IR-NEXT:    [[TMP4:%.*]] = load float, ptr [[TMP3]], align 4
 ; IR-NEXT:    [[TMP5:%.*]] = fadd float [[TMP4]], 0.000000e+00
-; IR-NEXT:    [[TMP6:%.*]] = getelementptr i8, ptr addrspace(3) [[TMP2]], i64 4
+; IR-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i8, ptr addrspace(3) [[TMP2]], i64 4
 ; IR-NEXT:    [[TMP7:%.*]] = addrspacecast ptr addrspace(3) [[TMP6]] to ptr
 ; IR-NEXT:    [[TMP8:%.*]] = load float, ptr [[TMP7]], align 4
 ; IR-NEXT:    [[TMP9:%.*]] = fadd float [[TMP5]], [[TMP8]]
-; IR-NEXT:    [[TMP10:%.*]] = getelementptr i8, ptr addrspace(3) [[TMP2]], i64 128
+; IR-NEXT:    [[TMP10:%.*]] = getelementptr inbounds i8, ptr addrspace(3) [[TMP2]], i64 128
 ; IR-NEXT:    [[TMP11:%.*]] = addrspacecast ptr addrspace(3) [[TMP10]] to ptr
 ; IR-NEXT:    [[TMP12:%.*]] = load float, ptr [[TMP11]], align 4
 ; IR-NEXT:    [[TMP13:%.*]] = fadd float [[TMP9]], [[TMP12]]
-; IR-NEXT:    [[TMP14:%.*]] = getelementptr i8, ptr addrspace(3) [[TMP2]], i64 132
+; IR-NEXT:    [[TMP14:%.*]] = getelementptr inbounds i8, ptr addrspace(3) [[TMP2]], i64 132
 ; IR-NEXT:    [[TMP15:%.*]] = addrspacecast ptr addrspace(3) [[TMP14]] to ptr
 ; IR-NEXT:    [[TMP16:%.*]] = load float, ptr [[TMP15]], align 4
 ; IR-NEXT:    [[TMP17:%.*]] = fadd float [[TMP13]], [[TMP16]]
diff --git a/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep.ll b/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep.ll
index 77b3434f4f159..da04a6e979425 100644
--- a/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep.ll
+++ b/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep.ll
@@ -372,8 +372,8 @@ define ptr @trunk_explicit(ptr %ptr, i64 %idx) {
 ; CHECK-LABEL: define ptr @trunk_explicit(
 ; CHECK-SAME: ptr [[PTR:%.*]], i64 [[IDX:%.*]]) {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr [[STRUCT0:%.*]], ptr [[PTR]], i64 0, i32 3, i64 [[IDX]], i32 1
-; CHECK-NEXT:    [[PTR21:%.*]] = getelementptr i8, ptr [[TMP0]], i64 3216
+; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr inbounds [[STRUCT0:%.*]], ptr [[PTR]], i64 0, i32 3, i64 [[IDX]], i32 1
+; CHECK-NEXT:    [[PTR21:%.*]] = getelementptr inbounds i8, ptr [[TMP0]], i64 3216
 ; CHECK-NEXT:    ret ptr [[PTR21]]
 ;
 entry:
@@ -389,8 +389,8 @@ define ptr @trunk_long_idx(ptr %ptr, i64 %idx) {
 ; CHECK-LABEL: define ptr @trunk_long_idx(
 ; CHECK-SAME: ptr [[PTR:%.*]], i64 [[IDX:%.*]]) {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr [[STRUCT0:%.*]], ptr [[PTR]], i64 0, i32 3, i64 [[IDX]], i32 1
-; CHECK-NEXT:    [[PTR21:%.*]] = getelementptr i8, ptr [[TMP0]], i64 3216
+; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr inbounds [[STRUCT0:%.*]], ptr [[PTR]], i64 0, i32 3, i64 [[IDX]], i32 1
+; CHECK-NEXT:    [[PTR21:%.*]] = getelementptr inbounds i8, ptr [[TMP0]], i64 3216
 ; CHECK-NEXT:    ret ptr [[PTR21]]
 ;
 entry:

llvmbot · 2025-03-10T14:45:31Z

@llvm/pr-subscribers-backend-nvptx

Author: Fabian Ritter (ritter-x2a)

Changes

If we know that the initial GEP was inbounds, and we change it to a
sequence of GEPs from the same base pointer where every offset is
non-negative, then the new GEPs are inbounds.

For SWDEV-516125.

Full diff: https://github.com/llvm/llvm-project/pull/130617.diff

4 Files Affected:

(modified) llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp (+13-5)
(modified) llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll (+23)
(modified) llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep-and-gvn.ll (+8-8)
(modified) llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep.ll (+4-4)

diff --git a/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp b/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
index 138a71ce79cef..070afdf0752f4 100644
--- a/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
+++ b/llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
@@ -1052,6 +1052,8 @@ bool SeparateConstOffsetFromGEP::splitGEP(GetElementPtrInst *GEP) {
     }
   }
 
+  bool MayRecoverInbounds = AccumulativeByteOffset >= 0 && GEP->isInBounds();
+
   // Remove the constant offset in each sequential index. The resultant GEP
   // computes the variadic base.
   // Notice that we don't remove struct field indices here. If LowerGEP is
@@ -1079,6 +1081,8 @@ bool SeparateConstOffsetFromGEP::splitGEP(GetElementPtrInst *GEP) {
         // and the old index if they are not used.
         RecursivelyDeleteTriviallyDeadInstructions(UserChainTail);
         RecursivelyDeleteTriviallyDeadInstructions(OldIdx);
+        MayRecoverInbounds =
+            MayRecoverInbounds && computeKnownBits(NewIdx, *DL).isNonNegative();
       }
     }
   }
@@ -1100,11 +1104,15 @@ bool SeparateConstOffsetFromGEP::splitGEP(GetElementPtrInst *GEP) {
   // address with silently-wrapping two's complement arithmetic".
   // Therefore, the final code will be a semantically equivalent.
   //
-  // TODO(jingyue): do some range analysis to keep as many inbounds as
-  // possible. GEPs with inbounds are more friendly to alias analysis.
-  // TODO(gep_nowrap): Preserve nuw at least.
-  auto NewGEPFlags = GEPNoWrapFlags::none();
-  GEP->setNoWrapFlags(GEPNoWrapFlags::none());
+  // If the initial GEP was inbounds and all variable indices and the
+  // accumulated offsets are non-negative, they can be added in any order and
+  // the intermediate results are in bounds. So, we can preserve the inbounds
+  // flag for both GEPs. GEPs with inbounds are more friendly to alias analysis.
+  //
+  // TODO(gep_nowrap): Preserve nuw?
+  auto NewGEPFlags =
+      MayRecoverInbounds ? GEPNoWrapFlags::inBounds() : GEPNoWrapFlags::none();
+  GEP->setNoWrapFlags(NewGEPFlags);
 
   // Lowers a GEP to either GEPs with a single index or arithmetic operations.
   if (LowerGEP) {
diff --git a/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll b/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll
index 877de38776839..91b5bc874c154 100644
--- a/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll
+++ b/llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll
@@ -24,3 +24,26 @@ entry:
   store float %3, ptr %arrayidx.dst, align 4
   ret void
 }
+
+; All offsets must be positive, so inbounds can be preserved.
+define void @must_be_inbounds(ptr %dst, ptr %src, i32 %i) {
+; CHECK-LABEL: @must_be_inbounds(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[I_PROM:%.*]] = zext i32 [[I:%.*]] to i64
+; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr inbounds float, ptr [[SRC:%.*]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX_SRC2:%.*]] = getelementptr inbounds i8, ptr [[TMP0]], i64 4
+; CHECK-NEXT:    [[TMP1:%.*]] = load float, ptr [[ARRAYIDX_SRC2]], align 4
+; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds float, ptr [[DST:%.*]], i64 [[I_PROM]]
+; CHECK-NEXT:    [[ARRAYIDX_DST4:%.*]] = getelementptr inbounds i8, ptr [[TMP2]], i64 4
+; CHECK-NEXT:    store float [[TMP1]], ptr [[ARRAYIDX_DST4]], align 4
+; CHECK-NEXT:    ret void
+;
+entry:
+  %i.prom = zext i32 %i to i64
+  %idx = add nsw i64 %i.prom, 1
+  %arrayidx.src = getelementptr inbounds float, ptr %src, i64 %idx
+  %3 = load float, ptr %arrayidx.src, align 4
+  %arrayidx.dst = getelementptr inbounds float, ptr %dst, i64 %idx
+  store float %3, ptr %arrayidx.dst, align 4
+  ret void
+}
diff --git a/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep-and-gvn.ll b/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep-and-gvn.ll
index 9a73feb2c4b5c..4474585bf9b06 100644
--- a/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep-and-gvn.ll
+++ b/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep-and-gvn.ll
@@ -157,19 +157,19 @@ define void @sum_of_array3(i32 %x, i32 %y, ptr nocapture %output) {
 ; IR-NEXT:  .preheader:
 ; IR-NEXT:    [[TMP0:%.*]] = zext i32 [[Y]] to i64
 ; IR-NEXT:    [[TMP1:%.*]] = zext i32 [[X]] to i64
-; IR-NEXT:    [[TMP2:%.*]] = getelementptr [32 x [32 x float]], ptr addrspace(3) @array, i64 0, i64 [[TMP1]], i64 [[TMP0]]
+; IR-NEXT:    [[TMP2:%.*]] = getelementptr inbounds [32 x [32 x float]], ptr addrspace(3) @array, i64 0, i64 [[TMP1]], i64 [[TMP0]]
 ; IR-NEXT:    [[TMP3:%.*]] = addrspacecast ptr addrspace(3) [[TMP2]] to ptr
 ; IR-NEXT:    [[TMP4:%.*]] = load float, ptr [[TMP3]], align 4
 ; IR-NEXT:    [[TMP5:%.*]] = fadd float [[TMP4]], 0.000000e+00
-; IR-NEXT:    [[TMP6:%.*]] = getelementptr i8, ptr addrspace(3) [[TMP2]], i64 4
+; IR-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i8, ptr addrspace(3) [[TMP2]], i64 4
 ; IR-NEXT:    [[TMP7:%.*]] = addrspacecast ptr addrspace(3) [[TMP6]] to ptr
 ; IR-NEXT:    [[TMP8:%.*]] = load float, ptr [[TMP7]], align 4
 ; IR-NEXT:    [[TMP9:%.*]] = fadd float [[TMP5]], [[TMP8]]
-; IR-NEXT:    [[TMP10:%.*]] = getelementptr i8, ptr addrspace(3) [[TMP2]], i64 128
+; IR-NEXT:    [[TMP10:%.*]] = getelementptr inbounds i8, ptr addrspace(3) [[TMP2]], i64 128
 ; IR-NEXT:    [[TMP11:%.*]] = addrspacecast ptr addrspace(3) [[TMP10]] to ptr
 ; IR-NEXT:    [[TMP12:%.*]] = load float, ptr [[TMP11]], align 4
 ; IR-NEXT:    [[TMP13:%.*]] = fadd float [[TMP9]], [[TMP12]]
-; IR-NEXT:    [[TMP14:%.*]] = getelementptr i8, ptr addrspace(3) [[TMP2]], i64 132
+; IR-NEXT:    [[TMP14:%.*]] = getelementptr inbounds i8, ptr addrspace(3) [[TMP2]], i64 132
 ; IR-NEXT:    [[TMP15:%.*]] = addrspacecast ptr addrspace(3) [[TMP14]] to ptr
 ; IR-NEXT:    [[TMP16:%.*]] = load float, ptr [[TMP15]], align 4
 ; IR-NEXT:    [[TMP17:%.*]] = fadd float [[TMP13]], [[TMP16]]
@@ -224,19 +224,19 @@ define void @sum_of_array4(i32 %x, i32 %y, ptr nocapture %output) {
 ; IR-NEXT:  .preheader:
 ; IR-NEXT:    [[TMP0:%.*]] = zext i32 [[Y]] to i64
 ; IR-NEXT:    [[TMP1:%.*]] = zext i32 [[X]] to i64
-; IR-NEXT:    [[TMP2:%.*]] = getelementptr [32 x [32 x float]], ptr addrspace(3) @array, i64 0, i64 [[TMP1]], i64 [[TMP0]]
+; IR-NEXT:    [[TMP2:%.*]] = getelementptr inbounds [32 x [32 x float]], ptr addrspace(3) @array, i64 0, i64 [[TMP1]], i64 [[TMP0]]
 ; IR-NEXT:    [[TMP3:%.*]] = addrspacecast ptr addrspace(3) [[TMP2]] to ptr
 ; IR-NEXT:    [[TMP4:%.*]] = load float, ptr [[TMP3]], align 4
 ; IR-NEXT:    [[TMP5:%.*]] = fadd float [[TMP4]], 0.000000e+00
-; IR-NEXT:    [[TMP6:%.*]] = getelementptr i8, ptr addrspace(3) [[TMP2]], i64 4
+; IR-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i8, ptr addrspace(3) [[TMP2]], i64 4
 ; IR-NEXT:    [[TMP7:%.*]] = addrspacecast ptr addrspace(3) [[TMP6]] to ptr
 ; IR-NEXT:    [[TMP8:%.*]] = load float, ptr [[TMP7]], align 4
 ; IR-NEXT:    [[TMP9:%.*]] = fadd float [[TMP5]], [[TMP8]]
-; IR-NEXT:    [[TMP10:%.*]] = getelementptr i8, ptr addrspace(3) [[TMP2]], i64 128
+; IR-NEXT:    [[TMP10:%.*]] = getelementptr inbounds i8, ptr addrspace(3) [[TMP2]], i64 128
 ; IR-NEXT:    [[TMP11:%.*]] = addrspacecast ptr addrspace(3) [[TMP10]] to ptr
 ; IR-NEXT:    [[TMP12:%.*]] = load float, ptr [[TMP11]], align 4
 ; IR-NEXT:    [[TMP13:%.*]] = fadd float [[TMP9]], [[TMP12]]
-; IR-NEXT:    [[TMP14:%.*]] = getelementptr i8, ptr addrspace(3) [[TMP2]], i64 132
+; IR-NEXT:    [[TMP14:%.*]] = getelementptr inbounds i8, ptr addrspace(3) [[TMP2]], i64 132
 ; IR-NEXT:    [[TMP15:%.*]] = addrspacecast ptr addrspace(3) [[TMP14]] to ptr
 ; IR-NEXT:    [[TMP16:%.*]] = load float, ptr [[TMP15]], align 4
 ; IR-NEXT:    [[TMP17:%.*]] = fadd float [[TMP13]], [[TMP16]]
diff --git a/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep.ll b/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep.ll
index 77b3434f4f159..da04a6e979425 100644
--- a/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep.ll
+++ b/llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep.ll
@@ -372,8 +372,8 @@ define ptr @trunk_explicit(ptr %ptr, i64 %idx) {
 ; CHECK-LABEL: define ptr @trunk_explicit(
 ; CHECK-SAME: ptr [[PTR:%.*]], i64 [[IDX:%.*]]) {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr [[STRUCT0:%.*]], ptr [[PTR]], i64 0, i32 3, i64 [[IDX]], i32 1
-; CHECK-NEXT:    [[PTR21:%.*]] = getelementptr i8, ptr [[TMP0]], i64 3216
+; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr inbounds [[STRUCT0:%.*]], ptr [[PTR]], i64 0, i32 3, i64 [[IDX]], i32 1
+; CHECK-NEXT:    [[PTR21:%.*]] = getelementptr inbounds i8, ptr [[TMP0]], i64 3216
 ; CHECK-NEXT:    ret ptr [[PTR21]]
 ;
 entry:
@@ -389,8 +389,8 @@ define ptr @trunk_long_idx(ptr %ptr, i64 %idx) {
 ; CHECK-LABEL: define ptr @trunk_long_idx(
 ; CHECK-SAME: ptr [[PTR:%.*]], i64 [[IDX:%.*]]) {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr [[STRUCT0:%.*]], ptr [[PTR]], i64 0, i32 3, i64 [[IDX]], i32 1
-; CHECK-NEXT:    [[PTR21:%.*]] = getelementptr i8, ptr [[TMP0]], i64 3216
+; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr inbounds [[STRUCT0:%.*]], ptr [[PTR]], i64 0, i32 3, i64 [[IDX]], i32 1
+; CHECK-NEXT:    [[PTR21:%.*]] = getelementptr inbounds i8, ptr [[TMP0]], i64 3216
 ; CHECK-NEXT:    ret ptr [[PTR21]]
 ;
 entry:

ritter-x2a · 2025-03-10T14:46:42Z

Started a llvm-compile-time-tracker run to check for compile time impact.

llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll

llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp

ritter-x2a · 2025-03-10T16:16:29Z

Started a llvm-compile-time-tracker run to check for compile time impact.

all within +/-0.02% (but I'm not sure if SeparateConstOffsetFromGEP even runs as part of these benchmarks).

arsenm · 2025-03-10T16:17:22Z

Started a llvm-compile-time-tracker run to check for compile time impact.

all within +/-0.02% (but I'm not sure if SeparateConstOffsetFromGEP even runs as part of these benchmarks).

It doesn't run amdgpu anything so no

ritter-x2a · 2025-04-10T07:26:54Z

Ping. The PR now also makes use and preserves NUW flags.

llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp

…cking If we know that the initial GEP was inbounds, and we change it to a sequence of GEPs from the same base pointer where every offset is non-negative, then the new GEPs are inbounds. For SWDEV-516125.

ritter-x2a · 2025-04-23T10:36:46Z

Merge activity

Apr 23, 6:36 AM EDT: A user started a stack merge that includes this pull request via Graphite.
Apr 23, 6:38 AM EDT: A user merged this pull request with Graphite.

…cking and NUW (llvm#130617) If we know that the initial GEP was inbounds, and we change it to a sequence of GEPs from the same base pointer where every offset is non-negative, then the new GEPs are inbounds. We can also preserve inbounds if the inbounds GEP and the involved additions are NUW. For SWDEV-516125.

nikic · 2025-09-15T18:56:10Z

llvm/test/Transforms/SeparateConstOffsetFromGEP/NVPTX/split-gep.ll

-; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr [[STRUCT0:%.*]], ptr [[PTR]], i64 0, i32 3, i64 [[IDX]], i32 1
-; CHECK-NEXT:    [[PTR21:%.*]] = getelementptr i8, ptr [[TMP0]], i64 3216
+; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr inbounds [[STRUCT0:%.*]], ptr [[PTR]], i64 0, i32 3, i64 [[IDX]], i32 1
+; CHECK-NEXT:    [[PTR21:%.*]] = getelementptr inbounds i8, ptr [[TMP0]], i64 3216


This isn't right. The first index of the original GEP got reassociated into a separate GEP at the end here, which generally cannot preserve inbounds. (Imagine %idx is negative.)

Indeed, good catch, thanks! I think the reason for this is that AllOffsetsNonNegative is only updated here for offsets that are extracted, but non-extracted offsets (like %idx) should also be considered.
@nikic do you happen to know if it's safe to assume that the offsets from a GEP's struct index components are non-negative, or do they also need to be checked explicitly? I haven't found anything about that in a quick search in the LangRef.

I think it's okay to assume that. If you get a negative number there that means you have a struct larger than half the address space...

I've opened #159515 with a fix.

…gative when preserving inbounds If we know that the initial GEP was inbounds, and we change it to a sequence of GEPs from the same base pointer where every offset is non-negative, then the new GEPs are inbounds. So far, the implementation only checked if the extracted offsets are non-negative. In cases where non-extracted offsets can be negative, this would cause the inbounds flag to be wrongly preserved. Fixes an issue in #130617 found by nikic.

…gative when preserving inbounds (#159515) If we know that the initial GEP was inbounds, and we change it to a sequence of GEPs from the same base pointer where every offset is non-negative, then the new GEPs are inbounds. So far, the implementation only checked if the extracted offsets are non-negative. In cases where non-extracted offsets can be negative, this would cause the inbounds flag to be wrongly preserved. Fixes an issue in #130617 found by nikic.

ritter-x2a mentioned this pull request Mar 10, 2025

[SeparateConstOffsetFromGEP] Don't set unsound inbounds flag #130616

Merged

ritter-x2a requested review from arichardson, arsenm, nikic and shiltian March 10, 2025 14:44

ritter-x2a marked this pull request as ready for review March 10, 2025 14:44

llvmbot added backend:AMDGPU backend:NVPTX llvm:transforms labels Mar 10, 2025

arsenm reviewed Mar 10, 2025

View reviewed changes

llvm/test/Transforms/SeparateConstOffsetFromGEP/AMDGPU/preserve-inbounds.ll Outdated Show resolved Hide resolved

arsenm reviewed Mar 10, 2025

View reviewed changes

llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp Outdated Show resolved Hide resolved

ritter-x2a force-pushed the users/ritter-x2a/03-10-_separateconstoffsetfromgep_preserve_inbounds_flag_based_on_valuetracking branch 4 times, most recently from a9bb606 to 3149661 Compare March 14, 2025 08:48

ritter-x2a force-pushed the users/ritter-x2a/03-10-_separateconstoffsetfromgep_don_t_set_unsound_inbounds_flag branch from 54f5ec1 to 07a9924 Compare March 17, 2025 08:49

ritter-x2a force-pushed the users/ritter-x2a/03-10-_separateconstoffsetfromgep_preserve_inbounds_flag_based_on_valuetracking branch 2 times, most recently from dc04fdc to cd10720 Compare March 18, 2025 08:13

ritter-x2a force-pushed the users/ritter-x2a/03-10-_separateconstoffsetfromgep_don_t_set_unsound_inbounds_flag branch from 07a9924 to 92a516e Compare March 18, 2025 08:13

Base automatically changed from users/ritter-x2a/03-10-_separateconstoffsetfromgep_don_t_set_unsound_inbounds_flag to main March 18, 2025 11:30

ritter-x2a force-pushed the users/ritter-x2a/03-10-_separateconstoffsetfromgep_preserve_inbounds_flag_based_on_valuetracking branch 2 times, most recently from 4ab790d to 5d1a65b Compare March 18, 2025 13:10

This was referenced Mar 18, 2025

[SDAG] Introduce inbounds flag for pointer arithmetic #131862

Closed

[AMDGPU][SDAG] Only fold flat offsets if they are inbounds #131863

Closed

ritter-x2a mentioned this pull request Mar 21, 2025

[AMDGPU][SDAG] Only fold flat offsets if they are inbounds #132353

Closed

ritter-x2a changed the title ~~[SeparateConstOffsetFromGEP] Preserve inbounds flag based on ValueTracking~~ [SeparateConstOffsetFromGEP] Preserve inbounds flag based on ValueTracking and NUW Mar 31, 2025

ritter-x2a requested a review from krzysz00 March 31, 2025 08:57

ritter-x2a force-pushed the users/ritter-x2a/03-10-_separateconstoffsetfromgep_preserve_inbounds_flag_based_on_valuetracking branch from 40889cc to bf8391d Compare March 31, 2025 11:37

arsenm reviewed Apr 10, 2025

View reviewed changes

llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp Outdated Show resolved Hide resolved

arsenm reviewed Apr 10, 2025

View reviewed changes

llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp Outdated Show resolved Hide resolved

llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp Outdated Show resolved Hide resolved

arsenm approved these changes Apr 13, 2025

View reviewed changes

llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp Outdated Show resolved Hide resolved

ritter-x2a added 8 commits April 14, 2025 04:12

[SeparateConstOffsetFromGEP] Preserve inbounds flag based on ValueTra…

e04a04d

…cking If we know that the initial GEP was inbounds, and we change it to a sequence of GEPs from the same base pointer where every offset is non-negative, then the new GEPs are inbounds. For SWDEV-516125.

Simplify test case and add more

b3caba7

Use isKnownNonNegative, add available information to the SimplifyQuery

69fe986

Preserve more GEP flags based on NUW

99df663

Change some _negative tests to actually use negative numbers

56dbc9a

Simplify the checkRebuildingPreservesNUW computation

871264c

Remove isa assert, fix comment typo

dba2b31

"User *U" -> "const User *U"

7148a36

ritter-x2a force-pushed the users/ritter-x2a/03-10-_separateconstoffsetfromgep_preserve_inbounds_flag_based_on_valuetracking branch from 1070abb to 7148a36 Compare April 14, 2025 09:39

arsenm approved these changes Apr 23, 2025

View reviewed changes

ritter-x2a merged commit 720a911 into main Apr 23, 2025
10 of 11 checks passed

ritter-x2a deleted the users/ritter-x2a/03-10-_separateconstoffsetfromgep_preserve_inbounds_flag_based_on_valuetracking branch April 23, 2025 10:38

nikic reviewed Sep 15, 2025

View reviewed changes

ritter-x2a mentioned this pull request Sep 18, 2025

[SeparateConstOffsetFromGEP] Check if non-extracted indices may be negative when preserving inbounds #159515

Merged

[SeparateConstOffsetFromGEP] Preserve inbounds flag based on ValueTracking and NUW #130617

[SeparateConstOffsetFromGEP] Preserve inbounds flag based on ValueTracking and NUW #130617

Uh oh!

Conversation

ritter-x2a commented Mar 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ritter-x2a commented Mar 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Mar 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Mar 10, 2025

Uh oh!

ritter-x2a commented Mar 10, 2025

Uh oh!

Uh oh!

Uh oh!

ritter-x2a commented Mar 10, 2025

Uh oh!

arsenm commented Mar 10, 2025

Uh oh!

ritter-x2a commented Apr 10, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ritter-x2a commented Apr 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merge activity

Uh oh!

Uh oh!

nikic Sep 15, 2025

Choose a reason for hiding this comment

Uh oh!

ritter-x2a Sep 17, 2025

Choose a reason for hiding this comment

Uh oh!

nikic Sep 17, 2025

Choose a reason for hiding this comment

Uh oh!

ritter-x2a Sep 18, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

ritter-x2a commented Mar 10, 2025 •

edited

Loading

ritter-x2a commented Mar 10, 2025 •

edited

Loading

llvmbot commented Mar 10, 2025 •

edited

Loading

ritter-x2a commented Apr 23, 2025 •

edited

Loading