[InstCombine] Propagate invariant.load metadata across unpacked loads #152186

VedantParanjape · 2025-08-05T18:11:49Z

For loads of aggregate type, InstCombine unpacks the load into per-element loads, but it does not preserve the invariant.load metadata. This patch fixes that: it looks for the metadata on the parent load and attaches it to each unpacked load.

%struct.double2 = type { double, double }
%struct.double1 = type { double }

define %struct.double2 @func1(ptr %a) {
  %1 = load %struct.double2, ptr %a, align 16, !invariant.load !1
  ret %struct.double2 %1
}

!1 = !{}

Reproducer: https://godbolt.org/z/hcY8MMvYh
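
As a rough illustration of the behavior described above, here is a minimal, self-contained sketch. It is a toy model, not LLVM code: `ToyLoad` and `unpackLoad` are hypothetical names invented for this example, and metadata is modeled as a plain string map rather than `MDNode`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a load instruction: a type name plus named metadata.
// This is NOT llvm::LoadInst; it only models what the patch propagates.
struct ToyLoad {
  std::string type;
  std::map<std::string, std::string> metadata;
};

// Split an aggregate load into one load per element type, copying
// "invariant.load" from the parent load when it is present.
std::vector<ToyLoad> unpackLoad(const ToyLoad &parent,
                                const std::vector<std::string> &elementTypes) {
  assert(!elementTypes.empty() && "aggregate must have at least one element");
  std::vector<ToyLoad> unpacked;
  for (const std::string &elementType : elementTypes) {
    ToyLoad narrow{elementType, {}};
    auto it = parent.metadata.find("invariant.load");
    if (it != parent.metadata.end())
      narrow.metadata.emplace(it->first, it->second);
    unpacked.push_back(narrow);
  }
  return unpacked;
}
```

In this model, unpacking a `%struct.double2` load tagged with `invariant.load` yields two `double` loads that both carry the tag, while an untagged parent yields untagged element loads.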

llvmbot · 2025-08-05T18:12:24Z

@llvm/pr-subscribers-llvm-transforms

Author: Vedant Paranjape (VedantParanjape)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/152186.diff

2 Files Affected:

(modified) llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp (+10)
(added) llvm/test/Transforms/InstCombine/invariant-metadata-propagation.ll (+46)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp b/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
index 0be1034b046b6..68252b06e3a9a 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
@@ -718,6 +718,14 @@ static Instruction *combineLoadToOperationType(InstCombinerImpl &IC,
   return nullptr;
 }
 
+// Check if the aggregate load has a invariant.load metadata
+// If aggregate load has invariant.load metadata, add it to the
+// unpacked loads as well.
+static void copyInvariantLoadMetadata(LoadInst &LI, LoadInst *NewLoad) {
+  if (MDNode *MD = LI.getMetadata("invariant.load"))
+    NewLoad->setMetadata("invariant.load", MD);
+}
+
 static Instruction *unpackLoadToAggregate(InstCombinerImpl &IC, LoadInst &LI) {
   // FIXME: We could probably with some care handle both volatile and atomic
   // stores here but it isn't clear that this is important.
@@ -737,6 +745,7 @@ static Instruction *unpackLoadToAggregate(InstCombinerImpl &IC, LoadInst &LI) {
       LoadInst *NewLoad = IC.combineLoadToNewType(LI, ST->getTypeAtIndex(0U),
                                                   ".unpack");
       NewLoad->setAAMetadata(LI.getAAMetadata());
+      copyInvariantLoadMetadata(LI, NewLoad);
       return IC.replaceInstUsesWith(LI, IC.Builder.CreateInsertValue(
         PoisonValue::get(T), NewLoad, 0, Name));
     }
@@ -764,6 +773,7 @@ static Instruction *unpackLoadToAggregate(InstCombinerImpl &IC, LoadInst &LI) {
           Name + ".unpack");
       // Propagate AA metadata. It'll still be valid on the narrowed load.
       L->setAAMetadata(LI.getAAMetadata());
+      copyInvariantLoadMetadata(LI, L);
       V = IC.Builder.CreateInsertValue(V, L, i);
     }
 
diff --git a/llvm/test/Transforms/InstCombine/invariant-metadata-propagation.ll b/llvm/test/Transforms/InstCombine/invariant-metadata-propagation.ll
new file mode 100644
index 0000000000000..acc5e7ca8d2b4
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/invariant-metadata-propagation.ll
@@ -0,0 +1,46 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S < %s -passes=instcombine | FileCheck %s
+
+%struct.double2 = type { double, double }
+%struct.double1 = type { double }
+
+define %struct.double2 @func1(ptr addrspace(1) %a) {
+; CHECK-LABEL: define %struct.double2 @func1(
+; CHECK-SAME: ptr addrspace(1) [[A:%.*]]) {
+; CHECK-NEXT:    [[DOTUNPACK:%.*]] = load double, ptr addrspace(1) [[A]], align 16, !invariant.load [[META0:![0-9]+]]
+; CHECK-NEXT:    [[TMP1:%.*]] = insertvalue [[STRUCT_DOUBLE2:%.*]] poison, double [[DOTUNPACK]], 0
+; CHECK-NEXT:    [[DOTELT1:%.*]] = getelementptr inbounds nuw i8, ptr addrspace(1) [[A]], i64 8
+; CHECK-NEXT:    [[DOTUNPACK2:%.*]] = load double, ptr addrspace(1) [[DOTELT1]], align 8, !invariant.load [[META0]]
+; CHECK-NEXT:    [[TMP2:%.*]] = insertvalue [[STRUCT_DOUBLE2]] [[TMP1]], double [[DOTUNPACK2]], 1
+; CHECK-NEXT:    ret [[STRUCT_DOUBLE2]] [[TMP2]]
+;
+  %1 = load %struct.double2, ptr addrspace(1) %a, align 16, !invariant.load !1
+  ret %struct.double2 %1
+}
+
+define %struct.double2 @func2(ptr %a) {
+; CHECK-LABEL: define %struct.double2 @func2(
+; CHECK-SAME: ptr [[A:%.*]]) {
+; CHECK-NEXT:    [[DOTUNPACK:%.*]] = load double, ptr [[A]], align 16, !invariant.load [[META0]]
+; CHECK-NEXT:    [[TMP1:%.*]] = insertvalue [[STRUCT_DOUBLE2:%.*]] poison, double [[DOTUNPACK]], 0
+; CHECK-NEXT:    [[DOTELT1:%.*]] = getelementptr inbounds nuw i8, ptr [[A]], i64 8
+; CHECK-NEXT:    [[DOTUNPACK2:%.*]] = load double, ptr [[DOTELT1]], align 8, !invariant.load [[META0]]
+; CHECK-NEXT:    [[TMP2:%.*]] = insertvalue [[STRUCT_DOUBLE2]] [[TMP1]], double [[DOTUNPACK2]], 1
+; CHECK-NEXT:    ret [[STRUCT_DOUBLE2]] [[TMP2]]
+;
+  %1 = load %struct.double2, ptr %a, align 16, !invariant.load !1
+  ret %struct.double2 %1
+}
+
+define %struct.double1 @func3(ptr %a) {
+; CHECK-LABEL: define %struct.double1 @func3(
+; CHECK-SAME: ptr [[A:%.*]]) {
+; CHECK-NEXT:    [[DOTUNPACK:%.*]] = load double, ptr [[A]], align 16, !invariant.load [[META0]]
+; CHECK-NEXT:    [[TMP1:%.*]] = insertvalue [[STRUCT_DOUBLE1:%.*]] poison, double [[DOTUNPACK]], 0
+; CHECK-NEXT:    ret [[STRUCT_DOUBLE1]] [[TMP1]]
+;
+  %1 = load %struct.double1, ptr %a, align 16, !invariant.load !1
+  ret %struct.double1 %1
+}
+
+!1 = !{}

arsenm · 2025-08-05T23:51:40Z

llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp

+// If aggregate load has invariant.load metadata, add it to the
+// unpacked loads as well.
+static void copyInvariantLoadMetadata(LoadInst &LI, LoadInst *NewLoad) {
+  if (MDNode *MD = LI.getMetadata("invariant.load"))


Should access by MD_ enum instead of by name. Also isn't this what copyMetadataForLoad is for?

I don't think that function will correctly propagate metadata for unpacking loads. For example, it copies MD_tbaa as is. But the tbaa nodes would change for scalar loads vs aggregate loads.

@arsenm now using enums instead of by name.

The point is still that most of this metadata should propagate; this isn't unique to invariant.load.

We can use copyMetadataForLoad.

copyMetadataForLoad sounds like it is only designed for load type casting. Also, some of its code that checks NewType->isPointerTy() implicitly assumes that the OldType is also a pointer, which sounds incorrect for splitting loads.

// Note, essentially every kind of metadata should be preserved here! This
// routine is supposed to clone a load instruction changing *only its type*.
// The only metadata it makes sense to drop is metadata which is invalidated
// when the pointer type changes. This should essentially never be the case
// in LLVM, but we explicitly switch over only known metadata to be
// conservatively correct. If you are adding metadata to LLVM which pertains
// to loads, you almost certainly want to add it here.

Shall we add a copyMetadataForSplitLoad?

copyMetadataForLoad sounds like it is only designed for load type casting.

You are right. If we have to repeat it multiple times in the future, it is definitely better to have a helper function.
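
To make the helper idea from this thread concrete, here is a minimal sketch of what an allow-list based split-load helper could look like. Everything in it is hypothetical: `copyMetadataForSplitLoadSketch` is not an LLVM function, metadata is modeled as a string map, and the allow-list contents are an assumption. The thread only establishes that invariant.load should be propagated and that tbaa cannot simply be copied as-is.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

using MetadataMap = std::map<std::string, std::string>;

// Hypothetical helper in the spirit of the proposed copyMetadataForSplitLoad.
// Assumption: only metadata kinds on an explicit allow-list are copied to a
// narrowed load; every other kind is dropped conservatively.
MetadataMap copyMetadataForSplitLoadSketch(const MetadataMap &parent,
                                           const std::set<std::string> &allowed) {
  MetadataMap result;
  for (const auto &entry : parent) {
    if (allowed.count(entry.first) != 0)
      result.emplace(entry.first, entry.second);
  }
  return result;
}
```

With an allow-list containing only `invariant.load`, a parent carrying both `invariant.load` and `tbaa` produces a narrowed load that keeps the former and drops the latter. A real helper would need a per-kind decision for each metadata kind that is valid on loads.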

github-actions · 2025-08-06T23:10:01Z

✅ With the latest revision this PR passed the C/C++ code formatter.

VedantParanjape · 2025-08-13T17:02:58Z

gentle ping! This is part of my internship project and I will finish in a week. Could the reviewers please take a look at it, and let me know how to proceed with this patch? Thanks.

dtcxzyw · 2025-08-13T17:26:43Z

llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp

+      AAMDNodes adjustedAANodes = LI.getAAMetadata().adjustForAccess(
+          SL->getElementOffset(i),
+          SL->getElementOffset(i).getKnownMinValue());


Suggested change

-      AAMDNodes adjustedAANodes = LI.getAAMetadata().adjustForAccess(
-          SL->getElementOffset(i),
-          SL->getElementOffset(i).getKnownMinValue());
+      TypeSize offset = SL->getElementOffset(i);
+      AAMDNodes adjustedAANodes = LI.getAAMetadata();
+      if (offset.isFixed()) {
+        adjustedAANodes = adjustedAANodes.adjustForAccess(
+          offset.getFixedValue(),
+          ST->getElementType(i), DL);
+      }
+      else {
+        adjustedAANodes.TBAA = nullptr;
+        adjustedAANodes.TBAAStruct = nullptr;
+      }

For structs with scalable vector elements, the TBAA information is not applicable. You can either zero out or assert here.

Please file a separate patch for this change. I don't want to block this patch.

Agreed, this is a separate issue unrelated to this PR.

Sure, I will upload this as a separate patch.
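
The fixed-versus-scalable distinction in the suggestion above can be sketched in isolation. This is a toy model under stated assumptions: `ToyOffset`, `ToyAAInfo`, and `adjustForElement` are hypothetical names, not LLVM types, and the "shift" is just a recorded byte count rather than a real TBAA adjustment.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Toy stand-in for an element offset that may be scalable, i.e. a multiple
// of a factor unknown at compile time. Not llvm::TypeSize.
struct ToyOffset {
  std::uint64_t minValue;
  bool scalable;
};

// Toy alias-analysis info: an optional TBAA tag plus the byte offset it has
// been adjusted by so far.
struct ToyAAInfo {
  bool hasTbaa;
  std::string tbaa;
  std::uint64_t shiftedBy;
};

// Keep and shift the TBAA tag when the offset is known at compile time;
// drop it when the offset is scalable, since no fixed shift describes it.
ToyAAInfo adjustForElement(ToyAAInfo info, ToyOffset offset) {
  if (!offset.scalable) {
    info.shiftedBy += offset.minValue;
    return info;
  }
  info.hasTbaa = false;
  info.tbaa.clear();
  return info;
}
```

The design choice mirrored here is the conservative one from the review: when the offset cannot be expressed as a fixed byte count, the type-based alias information is discarded rather than carried over unchanged.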

dtcxzyw · 2025-08-13T17:31:05Z

llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp

      LoadInst *NewLoad = IC.combineLoadToNewType(LI, ST->getTypeAtIndex(0U),
                                                  ".unpack");
      NewLoad->setAAMetadata(LI.getAAMetadata());
+      copyInvariantLoadMetadata(LI, NewLoad);


Suggested change

-      copyInvariantLoadMetadata(LI, NewLoad);
+      NewLI->copyMetadata(LI, LLVMContext::MD_invariant_load);

dtcxzyw · 2025-08-13T17:31:16Z

llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp

      // Propagate AA metadata. It'll still be valid on the narrowed load.
-      L->setAAMetadata(LI.getAAMetadata());
+      L->setAAMetadata(adjustedAANodes);
+      copyInvariantLoadMetadata(LI, L);


Suggested change

-      copyInvariantLoadMetadata(LI, L);
+      NewLI->copyMetadata(LI, LLVMContext::MD_invariant_load);

@dtcxzyw I have fixed these issues. Could you please give an LGTM? I will create another PR for adjusting the AA metadata.

dtcxzyw · 2025-08-13T18:16:24Z

Just out of curiosity, is the original IR generated from the CUDA intrinsic __ldg(const double2 *p)? If this is the case, is it profitable to unpack the aggregate load?

dtcxzyw · 2025-08-13T18:29:13Z

Just out of curiosity, is the original IR generated from the CUDA intrinsic __ldg(const double2 *p)? If this is the case, is it profitable to unpack the aggregate load?

Interesting, the result with your patch looks the best :)
https://godbolt.org/z/fTanbbqq1

fiigii · 2025-08-13T19:23:42Z

is the original IR generated from the CUDA intrinsic __ldg(const double2 *p)?

Yes, other high-level compilers also can use invariant.load to express ld.global.nc with NVPTX.

If this is the case, is it profitable to unpack the aggregate load?

Yes, load store vectorization relies on this IC function to vectorize overaligned aggregate loads, which is one of the most important optimizations for CUDA
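
The "overaligned" part can be illustrated with the alignment arithmetic visible in this PR's test expectations, where the unpacked loads of a 16-byte-aligned `%struct.double2` get `align 16` at offset 0 and `align 8` at offset 8. The sketch below is a standalone assumption about that arithmetic (the largest power of two dividing both the base alignment and the offset); `alignmentAtOffset` is a hypothetical name, not the function InstCombine calls.

```cpp
#include <cassert>
#include <cstdint>

// Alignment known for an access at `offset` bytes from a pointer aligned to
// `baseAlign`: the largest power of two dividing both values.
// Assumes baseAlign is a nonzero power of two.
std::uint64_t alignmentAtOffset(std::uint64_t baseAlign, std::uint64_t offset) {
  assert(baseAlign != 0 && (baseAlign & (baseAlign - 1)) == 0);
  if (offset == 0)
    return baseAlign;
  // Isolate the lowest set bit of the offset.
  std::uint64_t lowestBit = offset & (~offset + 1);
  return lowestBit < baseAlign ? lowestBit : baseAlign;
}
```

For the struct in the test file, this gives 16 for the first element and 8 for the second, matching the `align` values in the CHECK lines.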

VedantParanjape · 2025-08-13T23:32:31Z

This change seems to crash the following testcase. Seems totally unrelated to my change, and it is a llc testcase.

-- Testing: 60849 tests, 64 workers --
  Testing:  0.. 10.. 20.. 30..
  FAIL: LLVM :: CodeGen/Thumb2/mve-vcvt-fixed-to-float.ll (23899 of 60849)
  ******************** TEST 'LLVM :: CodeGen/Thumb2/mve-vcvt-fixed-to-float.ll' FAILED ********************
  Exit Code: 1

dtcxzyw · 2025-08-14T02:23:30Z

This change seems to crash the following testcase. Seems totally unrelated to my change, and it is a llc testcase.

-- Testing: 60849 tests, 64 workers --
  Testing:  0.. 10.. 20.. 30..
  FAIL: LLVM :: CodeGen/Thumb2/mve-vcvt-fixed-to-float.ll (23899 of 60849)
  ******************** TEST 'LLVM :: CodeGen/Thumb2/mve-vcvt-fixed-to-float.ll' FAILED ********************
  Exit Code: 1

Yeah it was also observed in another PR #152851 (comment)

dtcxzyw

LGTM.

VedantParanjape · 2025-08-14T02:28:33Z

LGTM.

can I merge bypassing build requirements? or wait for the build issue to get resolved.

nikic · 2025-08-14T12:56:53Z

LGTM.

can I merge bypassing build requirements? or wait for the build issue to get resolved.

Generally, feel free to merge as-is if there are only pre-existing test failures (like in this case).

VedantParanjape · 2025-08-14T22:30:33Z

@dtcxzyw I am not able to write a good testcase for adjustOffset PR. I tried something like this: https://godbolt.org/z/x7eYdE8EY but the code change doesn't do anything. I think the adjustOffset API doesn't support new struct path tbaa!

dtcxzyw · 2025-08-17T17:21:48Z

@dtcxzyw I am not able to write a good testcase for adjustOffset PR. I tried something like this: https://godbolt.org/z/x7eYdE8EY but the code change doesn't do anything. I think the adjustOffset API doesn't support new struct path tbaa!

llvm-project/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp

Lines 746 to 763 in 71925a9

    
MDNode *AAMDNodes::shiftTBAA(MDNode *MD, size_t Offset) {
  // Fast path if there's no offset
  if (Offset == 0)
    return MD;
  // Fast path if there's no path tbaa node (and thus scalar)
  if (!isStructPathTBAA(MD))
    return MD;
  // The correct behavior here is to add the offset into the TBAA
  // struct node offset. The base type, however may not have defined
  // a type at this additional offset, resulting in errors. Since
  // this method is only used within a given load/store access
  // the offset provided is only used to subdivide the previous load
  // maintaining the validity of the previous TBAA.
  //
  // This, however, should be revisited in the future.
  return MD;
}

It is not implemented :(

VedantParanjape requested a review from nikic as a code owner August 5, 2025 18:11

llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms labels Aug 5, 2025

Merge branch 'main' into invariant-change

af85146

VedantParanjape requested review from arsenm and fhahn August 5, 2025 18:18

VedantParanjape self-assigned this Aug 5, 2025

arsenm reviewed Aug 5, 2025

use enums in mdnode

71d9921

dtcxzyw reviewed Aug 6, 2025

VedantParanjape removed the request for review from fhahn August 6, 2025 22:07

adjust AAMetadata

9899e69

dtcxzyw reviewed Aug 13, 2025

VedantParanjape added 2 commits August 13, 2025 22:35

Revert "adjust AAMetadata"

9955d70

This reverts commit 9899e69.

addressed review

2d4e276

dtcxzyw approved these changes Aug 14, 2025

VedantParanjape removed the request for review from nikic August 14, 2025 02:27

Merge branch 'main' into invariant-change

efe732a

VedantParanjape enabled auto-merge (squash) August 14, 2025 02:50

VedantParanjape disabled auto-merge August 14, 2025 02:50

VedantParanjape merged commit 44df982 into llvm:main Aug 14, 2025
7 of 9 checks passed

6 participants

github-actions bot commented Aug 6, 2025 •

edited

Loading

VedantParanjape Aug 13, 2025 •

edited

Loading

VedantParanjape commented Aug 13, 2025 •

edited

Loading