-
Notifications
You must be signed in to change notification settings - Fork 14.7k
[InstCombine] Propagate invariant.load metadata across unpacked loads #152186
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
For loads that operate on aggregate type, instcombine unpacks the loads. It does not preserve the invariant.load metadata. This patch fixes that, it looks for the metadata in the parent load and attaches the metadata to the unpacked loads.
@llvm/pr-subscribers-llvm-transforms Author: Vedant Paranjape (VedantParanjape) ChangesFor loads that operate on aggregate type, instcombine unpacks the loads. It does not preserve the invariant.load metadata. This patch fixes that, it looks for the metadata in the parent load and attaches the metadata to the unpacked loads.
Reproducer: https://godbolt.org/z/hcY8MMvYh Full diff: https://github.com/llvm/llvm-project/pull/152186.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp b/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
index 0be1034b046b6..68252b06e3a9a 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
@@ -718,6 +718,14 @@ static Instruction *combineLoadToOperationType(InstCombinerImpl &IC,
return nullptr;
}
+// Check if the aggregate load has a invariant.load metadata
+// If aggregate load has invariant.load metadata, add it to the
+// unpacked loads as well.
+static void copyInvariantLoadMetadata(LoadInst &LI, LoadInst *NewLoad) {
+ if (MDNode *MD = LI.getMetadata("invariant.load"))
+ NewLoad->setMetadata("invariant.load", MD);
+}
+
static Instruction *unpackLoadToAggregate(InstCombinerImpl &IC, LoadInst &LI) {
// FIXME: We could probably with some care handle both volatile and atomic
// stores here but it isn't clear that this is important.
@@ -737,6 +745,7 @@ static Instruction *unpackLoadToAggregate(InstCombinerImpl &IC, LoadInst &LI) {
LoadInst *NewLoad = IC.combineLoadToNewType(LI, ST->getTypeAtIndex(0U),
".unpack");
NewLoad->setAAMetadata(LI.getAAMetadata());
+ copyInvariantLoadMetadata(LI, NewLoad);
return IC.replaceInstUsesWith(LI, IC.Builder.CreateInsertValue(
PoisonValue::get(T), NewLoad, 0, Name));
}
@@ -764,6 +773,7 @@ static Instruction *unpackLoadToAggregate(InstCombinerImpl &IC, LoadInst &LI) {
Name + ".unpack");
// Propagate AA metadata. It'll still be valid on the narrowed load.
L->setAAMetadata(LI.getAAMetadata());
+ copyInvariantLoadMetadata(LI, L);
V = IC.Builder.CreateInsertValue(V, L, i);
}
diff --git a/llvm/test/Transforms/InstCombine/invariant-metadata-propagation.ll b/llvm/test/Transforms/InstCombine/invariant-metadata-propagation.ll
new file mode 100644
index 0000000000000..acc5e7ca8d2b4
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/invariant-metadata-propagation.ll
@@ -0,0 +1,46 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S < %s -passes=instcombine | FileCheck %s
+
+%struct.double2 = type { double, double }
+%struct.double1 = type { double }
+
+define %struct.double2 @func1(ptr addrspace(1) %a) {
+; CHECK-LABEL: define %struct.double2 @func1(
+; CHECK-SAME: ptr addrspace(1) [[A:%.*]]) {
+; CHECK-NEXT: [[DOTUNPACK:%.*]] = load double, ptr addrspace(1) [[A]], align 16, !invariant.load [[META0:![0-9]+]]
+; CHECK-NEXT: [[TMP1:%.*]] = insertvalue [[STRUCT_DOUBLE2:%.*]] poison, double [[DOTUNPACK]], 0
+; CHECK-NEXT: [[DOTELT1:%.*]] = getelementptr inbounds nuw i8, ptr addrspace(1) [[A]], i64 8
+; CHECK-NEXT: [[DOTUNPACK2:%.*]] = load double, ptr addrspace(1) [[DOTELT1]], align 8, !invariant.load [[META0]]
+; CHECK-NEXT: [[TMP2:%.*]] = insertvalue [[STRUCT_DOUBLE2]] [[TMP1]], double [[DOTUNPACK2]], 1
+; CHECK-NEXT: ret [[STRUCT_DOUBLE2]] [[TMP2]]
+;
+ %1 = load %struct.double2, ptr addrspace(1) %a, align 16, !invariant.load !1
+ ret %struct.double2 %1
+}
+
+define %struct.double2 @func2(ptr %a) {
+; CHECK-LABEL: define %struct.double2 @func2(
+; CHECK-SAME: ptr [[A:%.*]]) {
+; CHECK-NEXT: [[DOTUNPACK:%.*]] = load double, ptr [[A]], align 16, !invariant.load [[META0]]
+; CHECK-NEXT: [[TMP1:%.*]] = insertvalue [[STRUCT_DOUBLE2:%.*]] poison, double [[DOTUNPACK]], 0
+; CHECK-NEXT: [[DOTELT1:%.*]] = getelementptr inbounds nuw i8, ptr [[A]], i64 8
+; CHECK-NEXT: [[DOTUNPACK2:%.*]] = load double, ptr [[DOTELT1]], align 8, !invariant.load [[META0]]
+; CHECK-NEXT: [[TMP2:%.*]] = insertvalue [[STRUCT_DOUBLE2]] [[TMP1]], double [[DOTUNPACK2]], 1
+; CHECK-NEXT: ret [[STRUCT_DOUBLE2]] [[TMP2]]
+;
+ %1 = load %struct.double2, ptr %a, align 16, !invariant.load !1
+ ret %struct.double2 %1
+}
+
+define %struct.double1 @func3(ptr %a) {
+; CHECK-LABEL: define %struct.double1 @func3(
+; CHECK-SAME: ptr [[A:%.*]]) {
+; CHECK-NEXT: [[DOTUNPACK:%.*]] = load double, ptr [[A]], align 16, !invariant.load [[META0]]
+; CHECK-NEXT: [[TMP1:%.*]] = insertvalue [[STRUCT_DOUBLE1:%.*]] poison, double [[DOTUNPACK]], 0
+; CHECK-NEXT: ret [[STRUCT_DOUBLE1]] [[TMP1]]
+;
+ %1 = load %struct.double1, ptr %a, align 16, !invariant.load !1
+ ret %struct.double1 %1
+}
+
+!1 = !{}
|
// If aggregate load has invariant.load metadata, add it to the | ||
// unpacked loads as well. | ||
static void copyInvariantLoadMetadata(LoadInst &LI, LoadInst *NewLoad) { | ||
if (MDNode *MD = LI.getMetadata("invariant.load")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should access by MD_ enum instead of by name. Also isn't this what copyMetadataForLoad is for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that function will correctly propagate metadata for unpacking loads. For example, it copies MD_tbaa as is. But the tbaa nodes would change for scalar loads vs aggregate loads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arsenm now using enums instead of by name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point is still that most of these metadata should propagate, this isn't unique to invariant.load
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use copyMetadataForLoad
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copyMetadataForLoad
sounds like only designed for load type casting. And some code that checks NewType->isPointerTy()
implicitly assumes the OldType is also a pointer, which sounds incorrect for spliting loads.
// Note, essentially every kind of metadata should be preserved here! This
// routine is supposed to clone a load instruction changing *only its type*.
// The only metadata it makes sense to drop is metadata which is invalidated
// when the pointer type changes. This should essentially never be the case
// in LLVM, but we explicitly switch over only known metadata to be
// conservatively correct. If you are adding metadata to LLVM which pertains
// to loads, you almost certainly want to add it here.
Shall we add a copyMetadataForSplitLoad
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copyMetadataForLoad sounds like only designed for load type casting.
You are right. If we have to repeat it multiple times in the future, it is definitely better to have a helper function.
✅ With the latest revision this PR passed the C/C++ code formatter. |
gentle ping! This is part of my internship project and I will finish in a week. Could the reviewers please take a look at it, and let me know how to proceed with this patch? Thanks. |
AAMDNodes adjustedAANodes = LI.getAAMetadata().adjustForAccess( | ||
SL->getElementOffset(i), | ||
SL->getElementOffset(i).getKnownMinValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AAMDNodes adjustedAANodes = LI.getAAMetadata().adjustForAccess( | |
SL->getElementOffset(i), | |
SL->getElementOffset(i).getKnownMinValue()); | |
TypeSize offset = SL->getElementOffset(i); | |
AAMDNodes adjustedAANodes = LI.getAAMetadata(); | |
if (offset.isFixed()) { | |
adjustedAANodes = adjustedAANodes.adjustForAccess( | |
offset.getFixedValue(), | |
ST->getElementType(i), DL); | |
} | |
else { | |
adjustedAANodes.TBAA = nullptr; | |
adjustedAANodes.TBAAStruct = nullptr; | |
} |
For structs with scalable vector elements, the TBAA information is not applicable. You can either zero out or assert here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please file a separate patch for this change. I don't want to block this patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, this is a separate issue unrelated to this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I will upload this as a separate patch.
@@ -737,6 +745,7 @@ static Instruction *unpackLoadToAggregate(InstCombinerImpl &IC, LoadInst &LI) { | |||
LoadInst *NewLoad = IC.combineLoadToNewType(LI, ST->getTypeAtIndex(0U), | |||
".unpack"); | |||
NewLoad->setAAMetadata(LI.getAAMetadata()); | |||
copyInvariantLoadMetadata(LI, NewLoad); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copyInvariantLoadMetadata(LI, NewLoad); | |
NewLI->copyMetadata(LI, LLVMContext::MD_invariant_load); |
// Propagate AA metadata. It'll still be valid on the narrowed load. | ||
L->setAAMetadata(LI.getAAMetadata()); | ||
L->setAAMetadata(adjustedAANodes); | ||
copyInvariantLoadMetadata(LI, L); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copyInvariantLoadMetadata(LI, L); | |
NewLI->copyMetadata(LI, LLVMContext::MD_invariant_load); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dtcxzyw I have fixed these issues, could you please give a lgtm? I will create another PR for adjusting AA metadata.
Just out of curiosity, is the original IR generated from the CUDA intrinsic |
Interesting, the result with your patch looks the best :) |
Yes, other high-level compilers also can use
Yes, load store vectorization relies on this IC function to vectorize overaligned aggregate loads, which is one of the most important optimizations for CUDA |
This reverts commit 9899e69.
This change seems to crash the following testcase. Seems totally unrelated to my change, and it is a llc testcase.
|
Yeah it was also observed in another PR #152851 (comment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
can I merge bypassing build requirements? or wait for the build issue to get resolved. |
Generally, feel free to merge as-is if there are only pre-existing test failures (like in this case). |
For loads that operate on aggregate type, instcombine unpacks the loads. It does not preserve the invariant.load metadata. This patch fixes that, it looks for the metadata in the parent load and attaches the metadata to the unpacked loads.
Reproducer: https://godbolt.org/z/hcY8MMvYh