[SROA] Rewrite invariant group intrinsics after splitting alloca #107557

antoniofrighetto · 2024-09-06T10:13:51Z

A miscompilation issue has been addressed with improved handling.

Fixes: #105537.

llvmbot · 2024-09-06T10:14:31Z

@llvm/pr-subscribers-llvm-transforms

Author: Antonio Frighetto (antoniofrighetto)

Changes

A miscompilation issue has been addressed with improved checking.

Fixes: #105537.

Full diff: https://github.com/llvm/llvm-project/pull/107557.diff

3 Files Affected:

(modified) llvm/lib/Transforms/Scalar/SROA.cpp (+4-3)
(modified) llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp (+2-1)
(modified) llvm/test/Transforms/SROA/invariant-group.ll (+16)

diff --git a/llvm/lib/Transforms/Scalar/SROA.cpp b/llvm/lib/Transforms/Scalar/SROA.cpp
index d0186da1bc5e22..169b34181399a5 100644
--- a/llvm/lib/Transforms/Scalar/SROA.cpp
+++ b/llvm/lib/Transforms/Scalar/SROA.cpp
@@ -3522,6 +3522,10 @@ class AllocaSliceRewriter : public InstVisitor<AllocaSliceRewriter, bool> {
            "Unexpected intrinsic!");
     LLVM_DEBUG(dbgs() << "    original: " << II << "\n");
 
+    // Do not record invariant group intrinsics for deletion.
+    if (II.isLaunderOrStripInvariantGroup())
+      return true;
+
     // Record this instruction for deletion.
     Pass.DeadInsts.push_back(&II);
 
@@ -3532,9 +3536,6 @@ class AllocaSliceRewriter : public InstVisitor<AllocaSliceRewriter, bool> {
       return true;
     }
 
-    if (II.isLaunderOrStripInvariantGroup())
-      return true;
-
     assert(II.getArgOperand(1) == OldPtr);
     // Lifetime intrinsics are only promotable if they cover the whole alloca.
     // Therefore, we drop lifetime intrinsics which don't cover the whole
diff --git a/llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp b/llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp
index 1b7912fdf5e304..95a4178b276177 100644
--- a/llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp
+++ b/llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp
@@ -81,7 +81,8 @@ bool llvm::isAllocaPromotable(const AllocaInst *AI) {
         return false;
     } else if (const IntrinsicInst *II = dyn_cast<IntrinsicInst>(U)) {
       if (!II->isLifetimeStartOrEnd() && !II->isDroppable() &&
-          II->getIntrinsicID() != Intrinsic::fake_use)
+          II->getIntrinsicID() != Intrinsic::fake_use &&
+          !II->isLaunderOrStripInvariantGroup())
         return false;
     } else if (const BitCastInst *BCI = dyn_cast<BitCastInst>(U)) {
       if (!onlyUsedByLifetimeMarkersOrDroppableInsts(BCI))
diff --git a/llvm/test/Transforms/SROA/invariant-group.ll b/llvm/test/Transforms/SROA/invariant-group.ll
index 1be6f6e2fc32bb..08f2d08e58cee7 100644
--- a/llvm/test/Transforms/SROA/invariant-group.ll
+++ b/llvm/test/Transforms/SROA/invariant-group.ll
@@ -138,10 +138,13 @@ end:
   ret void
 }
 
+; Conservatively do not split the alloca. 
 define void @partial_promotion_of_alloca() {
 ; CHECK-LABEL: @partial_promotion_of_alloca(
 ; CHECK-NEXT:    [[STRUCT_PTR_SROA_2:%.*]] = alloca i32, align 4
+; CHECK-NEXT:    [[STRUCT_PTR:%.*]] = alloca [[T:%.*]], align 4
 ; CHECK-NEXT:    store volatile i32 0, ptr [[STRUCT_PTR_SROA_2]], align 4
+; CHECK-NEXT:    [[BARR:%.*]] = call ptr @llvm.launder.invariant.group.p0(ptr [[STRUCT_PTR]])
 ; CHECK-NEXT:    [[STRUCT_PTR_SROA_2_0_STRUCT_PTR_SROA_2_4_LOAD_VAL:%.*]] = load volatile i32, ptr [[STRUCT_PTR_SROA_2]], align 4
 ; CHECK-NEXT:    ret void
 ;
@@ -155,6 +158,19 @@ define void @partial_promotion_of_alloca() {
   ret void
 }
 
+define void @memcpy_after_laundering_alloca(ptr %ptr) {
+; CHECK-LABEL: @memcpy_after_laundering_alloca(
+; CHECK-NEXT:    [[ALLOCA:%.*]] = alloca { i64, i64 }, align 8
+; CHECK-NEXT:    [[LAUNDER:%.*]] = call ptr @llvm.launder.invariant.group.p0(ptr [[ALLOCA]])
+; CHECK-NEXT:    call void @llvm.memcpy.p0.p0.i64(ptr [[LAUNDER]], ptr [[PTR:%.*]], i64 16, i1 false)
+; CHECK-NEXT:    ret void
+;
+  %alloca = alloca { i64, i64 }, align 8
+  %launder = call ptr @llvm.launder.invariant.group.p0(ptr %alloca)
+  call void @llvm.memcpy.p0.p0.i64(ptr %launder, ptr %ptr, i64 16, i1 false)
+  ret void
+}
+
 declare void @use(ptr)
 
 !0 = !{}

antoniofrighetto · 2024-09-06T12:27:02Z

llvm/test/Transforms/SROA/invariant-group.ll

 define void @partial_promotion_of_alloca() {
 ; CHECK-LABEL: @partial_promotion_of_alloca(
 ; CHECK-NEXT:    [[STRUCT_PTR_SROA_2:%.*]] = alloca i32, align 4
+; CHECK-NEXT:    [[STRUCT_PTR:%.*]] = alloca [[T:%.*]], align 4


Uhm, I'm not entirely sure if it's fine in this specific case to have a field promoted to a scalar, and have the whole alloca still there.

nikic · 2024-09-06T13:03:33Z

llvm/lib/Transforms/Scalar/SROA.cpp


+    // Do not record invariant group intrinsics for deletion.
+    if (II.isLaunderOrStripInvariantGroup())
+      return true;


Hm, shouldn't we be creating a new intrinsic for the split pointer instead?

We should, I accidentally thought that reusing it would suffice.

nikic · 2024-09-06T13:05:09Z

llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp

      if (!II->isLifetimeStartOrEnd() && !II->isDroppable() &&
-          II->getIntrinsicID() != Intrinsic::fake_use)
+          II->getIntrinsicID() != Intrinsic::fake_use &&
+          !II->isLaunderOrStripInvariantGroup())


I don't understand how this is supposed to work. I don't think we have logic to handle these in mem2reg? Also wouldn't we have to scan their uses here?

I might be wrong, but we don't seem to have logic to handle the others here, do we actually need any specific logic in mem2reg? Why should we scan their uses? The change seems to be needed as the alloca is marked as promotable, as otherwise we would assert in PromoteMemoryToRegister. I don't think we want to mark the alloca as non-promotable when visiting the intrinsic and prevent mem2reg from having the chance to simplify the alloca further.

Okay, I think this gets handled here:

llvm-project/llvm/lib/Transforms/Utils/PromoteMemoryToRegister.cpp

Lines 495 to 507 in 05284a0

// The only users of this bitcast/GEP instruction are lifetime intrinsics.

// Follow the use/def chain to erase them now instead of leaving it for

// dead code elimination later.

for (Use &UU : llvm::make_early_inc_range(I->uses())) {

Instruction *Inst = cast<Instruction>(UU.getUser());

// Drop the use of I in droppable instructions.

if (Inst->isDroppable()) {

Inst->dropDroppableUse(UU);

continue;

}

Inst->eraseFromParent();

}

I think the handling is ultimately correct, but the comment there could use an update.

Comment updated, thanks.

nikic · 2024-10-01T13:24:22Z

llvm/lib/Transforms/Scalar/SROA.cpp

+        New = IRB.CreateStripInvariantGroup(AdjustedPtr);
+
+      New->takeName(&II);
+      II.replaceAllUsesWith(New);


Is this RAUW correct? Can't the alloca be split into two parts, both of which need their own intrinsic?

Isn't this already being done as part of visiting each slice in:

llvm-project/llvm/lib/Transforms/Scalar/SROA.cpp

Lines 4921 to 4924 in e03f427

for (Slice &S : P) {

Promotable &= Rewriter.visit(&S);

++NumUses;

}

Or am I mistaking it? It looks like the intrinsic has been rewritten for each part of the alloca (this is before mem2reg for the test case):

define void @partial_promotion_of_alloca() { %struct_ptr.sroa.0 = alloca i32, align 4 %struct_ptr.sroa.2 = alloca i32, align 4 store i32 0, ptr %struct_ptr.sroa.0, align 4 store volatile i32 0, ptr %struct_ptr.sroa.2, align 4 %barr = call ptr @llvm.launder.invariant.group.p0(ptr %struct_ptr.sroa.0) %1 = call ptr @llvm.launder.invariant.group.p0(ptr %struct_ptr.sroa.2) %struct_ptr.sroa.2.0.struct_ptr.sroa.2.4.load_val = load volatile i32, ptr %struct_ptr.sroa.2, align 4 ret void }

The case I have in mind is something like memcpy_after_laundering_alloca but where we end up with two memcpys rather than one. I expect you'll get incorrect pointers in that case.

Would you mind showing an example that could lead to get incorrect pointers? I had in mind something as follows, where we memcpy each field separately, but I suspect this is not what you were referring to:

define void @memcpy_after_laundering_allocas_field(ptr %ptr) { %alloca = alloca { i64, i64 }, align 8 %field1_ptr = getelementptr inbounds { i64, i64 }, ptr %alloca, i32 0, i32 0 %field2_ptr = getelementptr inbounds { i64, i64 }, ptr %alloca, i32 0, i32 1 %launder_field1 = call ptr @llvm.launder.invariant.group.p0(ptr %field1_ptr) %launder_field2 = call ptr @llvm.launder.invariant.group.p0(ptr %field2_ptr) call void @llvm.memcpy.p0.p0.i64(ptr %launder_field1, ptr %ptr, i64 8, i1 false) call void @llvm.memcpy.p0.p0.i64(ptr %launder_field2, ptr %ptr, i64 8, i1 false) ret void }

Oh, you mean the case where we launder only a field of the alloca (not sure if this makes sense though?)

define void @memcpy_after_laundering_alloca_2(ptr %ptr) { %alloca = alloca { i64, i64 }, align 8 %field2_ptr = getelementptr inbounds { i64, i64 }, ptr %alloca, i32 0, i32 1 %launder = call ptr @llvm.launder.invariant.group.p0(ptr %field2_ptr) call void @llvm.memcpy.p0.p0.i64(ptr %launder, ptr %ptr, i64 8, i1 false) ret void }

define void @test(ptr %ptr) { %alloca = alloca { [16 x i8], i64, [16 x i8] }, align 8 %launder = call ptr @llvm.launder.invariant.group.p0(ptr %alloca) %gep = getelementptr i8, ptr %launder, i64 16 store i64 0, ptr %gep call void @llvm.memcpy.p0.p0.i64(ptr %launder, ptr %ptr, i64 40, i1 false) ret void }

With your patch this produces:

define void @test(ptr %ptr) { %alloca.sroa.0 = alloca [16 x i8], align 8 %alloca.sroa.3 = alloca [16 x i8], align 8 %1 = call ptr @llvm.launder.invariant.group.p0(ptr %alloca.sroa.3) call void @llvm.memcpy.p0.p0.i64(ptr align 8 %alloca.sroa.0, ptr align 1 %ptr, i64 16, i1 false) %alloca.sroa.2.0.ptr.sroa_idx = getelementptr inbounds i8, ptr %ptr, i64 16 %alloca.sroa.2.0.copyload = load i64, ptr %alloca.sroa.2.0.ptr.sroa_idx, align 1 %alloca.sroa.3.0.ptr.sroa_idx = getelementptr inbounds i8, ptr %ptr, i64 24 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %alloca.sroa.3, ptr align 1 %alloca.sroa.3.0.ptr.sroa_idx, i64 16, i1 false) ret void }

Note that the launder.invariant intrinsic here is unused.

Updated this by considering whether we need to rewrite memory intrinsics as well when rewriting the invariant group one. Tried to keep everything self-contained in visitIntrinsicInst, but code was getting duplicated fast, and the memcpys were later being visited anyways. Thus, as the alloca may get split at each AllocaSliceRewriter iteration, if some of the users of the invariant group are memory intrinsics, we track the launder so as to use this one as ptr destination, instead of the new alloca.

A miscompilation issue has been addressed with improved handling. Fixes: llvm#105537.

antoniofrighetto · 2024-11-11T07:37:47Z

Kind ping.

antoniofrighetto · 2024-11-21T09:40:49Z

Ping :)

antoniofrighetto · 2024-12-09T09:03:28Z

Does this possibly look on track? Not sure if a different approach can be taken instead of having the two wrappers.

antoniofrighetto · 2025-03-13T14:00:02Z

Kind ping for possible alternative directions, would be nice to have this fixed.

efriedma-quic · 2025-03-28T01:15:46Z

This might be a stupid question... but why are we trying to preserve the calls to llvm.launder.invariant.group? SROA strips all the relevant invariant.group metadata, and llvm.launder.invariant.group is only defined to have an effect if the returned pointer is used by load/store operations with invariant.group metadata. So the calls are just no-ops; why do we want to preserve them?

…151574

antoniofrighetto requested a review from nikic September 6, 2024 10:13

llvmbot added the llvm:transforms label Sep 6, 2024

antoniofrighetto commented Sep 6, 2024

View reviewed changes

nikic reviewed Sep 6, 2024

View reviewed changes

antoniofrighetto force-pushed the feature/sroa-launder-invariant-misc branch from 3a10f2b to 05284a0 Compare September 24, 2024 16:03

antoniofrighetto changed the title ~~[SROA] Conservatively do not split the alloca if ptr is laundered~~ [SROA] Rewrite invariant group intrinsics after splitting alloca Sep 24, 2024

antoniofrighetto requested a review from nikic September 30, 2024 07:02

nikic reviewed Oct 1, 2024

View reviewed changes

antoniofrighetto requested a review from nikic October 7, 2024 07:54

[SROA] Rewrite invariant group intrinsics after splitting alloca

1927a19

A miscompilation issue has been addressed with improved handling. Fixes: llvm#105537.

antoniofrighetto force-pushed the feature/sroa-launder-invariant-misc branch from 2a70bbf to 1927a19 Compare October 29, 2024 18:35

antoniofrighetto requested a review from efriedma-quic December 19, 2024 20:49

antoniofrighetto requested a review from davemgreen March 13, 2025 14:00

nikic mentioned this pull request Aug 1, 2025

[llvm][sroa] Disable support for invariant.group #151743

Merged

tommymcm added a commit to tommymcm/llvm-project that referenced this pull request Aug 1, 2025

[llvm][sroa] Added invariant.group tests from llvm#107557 and llvm#…

02c381f

…151574

	// The only users of this bitcast/GEP instruction are lifetime intrinsics.
	// Follow the use/def chain to erase them now instead of leaving it for
	// dead code elimination later.
	for (Use &UU : llvm::make_early_inc_range(I->uses())) {
	Instruction *Inst = cast<Instruction>(UU.getUser());

	// Drop the use of I in droppable instructions.
	if (Inst->isDroppable()) {
	Inst->dropDroppableUse(UU);
	continue;
	}
	Inst->eraseFromParent();
	}

	for (Slice &S : P) {
	Promotable &= Rewriter.visit(&S);
	++NumUses;
	}

[SROA] Rewrite invariant group intrinsics after splitting alloca #107557

Are you sure you want to change the base?

[SROA] Rewrite invariant group intrinsics after splitting alloca #107557

Uh oh!

Conversation

antoniofrighetto commented Sep 6, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Sep 6, 2024

Uh oh!

antoniofrighetto Sep 6, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

antoniofrighetto Oct 1, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nikic Oct 7, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

antoniofrighetto commented Nov 11, 2024

Uh oh!

antoniofrighetto commented Nov 21, 2024

Uh oh!

antoniofrighetto commented Dec 9, 2024

Uh oh!

antoniofrighetto commented Mar 13, 2025

Uh oh!

efriedma-quic commented Mar 28, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

antoniofrighetto commented Sep 6, 2024 •

edited

Loading

antoniofrighetto Sep 6, 2024 •

edited

Loading

antoniofrighetto Oct 1, 2024 •

edited

Loading

nikic Oct 7, 2024 •

edited

Loading