[LoopDist] Add metadata for checking post process state of distribute… #153902
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-llvm-transforms
Author: Michael Berg (mcberg2021)
Changes
…d loops
Add a count of the number of partitions LoopDist made when distributing a loop in metadata, then check for loops which are already distributed to prevent reprocessing.
Patch is 31.62 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/153902.diff
8 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/LoopDistribute.cpp b/llvm/lib/Transforms/Scalar/LoopDistribute.cpp
index 27d3004d81947..d54fb482da262 100644
--- a/llvm/lib/Transforms/Scalar/LoopDistribute.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopDistribute.cpp
@@ -113,6 +113,8 @@ static cl::opt<bool> EnableLoopDistribute(
cl::desc("Enable the new, experimental LoopDistribution Pass"),
cl::init(false));
+static const char *DistributedCountMetaData = "llvm.loop.distribute.count";
+
STATISTIC(NumLoopsDistributed, "Number of loops distributed");
namespace {
@@ -826,6 +828,8 @@ class LoopDistributeForLoop {
{LLVMLoopDistributeFollowupAll, LLVMLoopDistributeFollowupFallback},
"llvm.loop.distribute.", true);
LVer.getNonVersionedLoop()->setLoopID(UnversionedLoopID);
+ addStringMetadataToLoop(LVer.getNonVersionedLoop(),
+ DistributedCountMetaData, Partitions.getSize());
}
// Create identical copies of the original loop for each partition and hook
@@ -986,6 +990,12 @@ static bool runImpl(Function &F, LoopInfo *LI, DominatorTree *DT,
for (Loop *L : Worklist) {
LoopDistributeForLoop LDL(L, &F, LI, DT, SE, LAIs, ORE);
+ // Do not reprocess loops we already distributed
+ if (auto Distributed = getOptionalIntLoopAttribute(L, DistributedCountMetaData)) {
+ LLVM_DEBUG(dbgs() << "LDist: Distributed loop guarded for reprocessing\n");
+ continue;
+ }
+
// If distribution was forced for the specific loop to be
// enabled/disabled, follow that. Otherwise use the global flag.
if (LDL.isForced().value_or(EnableLoopDistribute))
diff --git a/llvm/test/Transforms/LoopDistribute/cross-partition-access.ll b/llvm/test/Transforms/LoopDistribute/cross-partition-access.ll
index 6e1106c3277a7..112cd910a0efa 100644
--- a/llvm/test/Transforms/LoopDistribute/cross-partition-access.ll
+++ b/llvm/test/Transforms/LoopDistribute/cross-partition-access.ll
@@ -64,30 +64,30 @@ define dso_local void @_Z13distribution3PiS_S_S_i(ptr nocapture noundef %a, ptr
; CHECK: [[FOR_BODY_LDIST1]]:
; CHECK-NEXT: [[IDXPROM_LDIST1:%.*]] = phi i64 [ 0, %[[FOR_BODY_PH_LDIST1]] ], [ [[I6_LDIST1:%.*]], %[[FOR_BODY_LDIST1]] ]
; CHECK-NEXT: [[ARRAYIDX_LDIST1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[IDXPROM_LDIST1]]
-; CHECK-NEXT: [[I2_LDIST1:%.*]] = load i32, ptr [[ARRAYIDX_LDIST1]], align 4, !tbaa [[TBAA0]], !alias.scope [[META6:![0-9]+]]
+; CHECK-NEXT: [[I2_LDIST1:%.*]] = load i32, ptr [[ARRAYIDX_LDIST1]], align 4, !tbaa [[TBAA0]], !alias.scope [[META7:![0-9]+]]
; CHECK-NEXT: [[ADD4_LDIST1:%.*]] = add nsw i32 [[I2_LDIST1]], 1
; CHECK-NEXT: [[ARRAYIDX8_LDIST1:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[IDXPROM_LDIST1]]
-; CHECK-NEXT: store i32 [[ADD4_LDIST1]], ptr [[ARRAYIDX8_LDIST1]], align 4, !tbaa [[TBAA0]], !alias.scope [[META9:![0-9]+]], !noalias [[META11:![0-9]+]]
+; CHECK-NEXT: store i32 [[ADD4_LDIST1]], ptr [[ARRAYIDX8_LDIST1]], align 4, !tbaa [[TBAA0]], !alias.scope [[META10:![0-9]+]], !noalias [[META12:![0-9]+]]
; CHECK-NEXT: [[I3_LDIST1:%.*]] = getelementptr i32, ptr [[C]], i64 [[IDXPROM_LDIST1]]
; CHECK-NEXT: [[ARRAYIDX17_LDIST1:%.*]] = getelementptr i8, ptr [[I3_LDIST1]], i64 -4
-; CHECK-NEXT: [[I4_LDIST1:%.*]] = load i32, ptr [[ARRAYIDX17_LDIST1]], align 4, !tbaa [[TBAA0]], !alias.scope [[META14:![0-9]+]], !noalias [[META15:![0-9]+]]
+; CHECK-NEXT: [[I4_LDIST1:%.*]] = load i32, ptr [[ARRAYIDX17_LDIST1]], align 4, !tbaa [[TBAA0]], !alias.scope [[META15:![0-9]+]], !noalias [[META16:![0-9]+]]
; CHECK-NEXT: [[SUB18_LDIST1:%.*]] = sub nsw i32 [[ADD4_LDIST1]], [[I4_LDIST1]]
-; CHECK-NEXT: store i32 [[SUB18_LDIST1]], ptr [[I3_LDIST1]], align 4, !tbaa [[TBAA0]], !alias.scope [[META14]], !noalias [[META15]]
+; CHECK-NEXT: store i32 [[SUB18_LDIST1]], ptr [[I3_LDIST1]], align 4, !tbaa [[TBAA0]], !alias.scope [[META15]], !noalias [[META16]]
; CHECK-NEXT: [[I6_LDIST1]] = add i64 [[IDXPROM_LDIST1]], 1
; CHECK-NEXT: [[CMP1_NOT_LDIST1:%.*]] = icmp eq i64 [[I6_LDIST1]], [[LEN]]
-; CHECK-NEXT: br i1 [[CMP1_NOT_LDIST1]], label %[[FOR_BODY_PH:.*]], label %[[FOR_BODY_LDIST1]], !llvm.loop [[LOOP16:![0-9]+]]
+; CHECK-NEXT: br i1 [[CMP1_NOT_LDIST1]], label %[[FOR_BODY_PH:.*]], label %[[FOR_BODY_LDIST1]], !llvm.loop [[LOOP17:![0-9]+]]
; CHECK: [[FOR_BODY_PH]]:
; CHECK-NEXT: br label %[[FOR_BODY:.*]]
; CHECK: [[FOR_BODY]]:
; CHECK-NEXT: [[IDXPROM:%.*]] = phi i64 [ 0, %[[FOR_BODY_PH]] ], [ [[I6:%.*]], %[[FOR_BODY]] ]
; CHECK-NEXT: [[ARRAYIDX8:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[IDXPROM]]
-; CHECK-NEXT: [[I5:%.*]] = load i32, ptr [[ARRAYIDX8]], align 4, !tbaa [[TBAA0]], !alias.scope [[META9]], !noalias [[META11]]
+; CHECK-NEXT: [[I5:%.*]] = load i32, ptr [[ARRAYIDX8]], align 4, !tbaa [[TBAA0]], !alias.scope [[META10]], !noalias [[META12]]
; CHECK-NEXT: [[ADD27:%.*]] = add nsw i32 [[I5]], 2
; CHECK-NEXT: [[ARRAYIDX31:%.*]] = getelementptr inbounds i32, ptr [[D]], i64 [[IDXPROM]]
-; CHECK-NEXT: store i32 [[ADD27]], ptr [[ARRAYIDX31]], align 4, !tbaa [[TBAA0]], !alias.scope [[META15]], !noalias [[META6]]
+; CHECK-NEXT: store i32 [[ADD27]], ptr [[ARRAYIDX31]], align 4, !tbaa [[TBAA0]], !alias.scope [[META16]], !noalias [[META7]]
; CHECK-NEXT: [[I6]] = add i64 [[IDXPROM]], 1
; CHECK-NEXT: [[CMP1_NOT:%.*]] = icmp eq i64 [[I6]], [[LEN]]
-; CHECK-NEXT: br i1 [[CMP1_NOT]], label %[[END_LOOPEXIT_LOOPEXIT20:.*]], label %[[FOR_BODY]], !llvm.loop [[LOOP16]]
+; CHECK-NEXT: br i1 [[CMP1_NOT]], label %[[END_LOOPEXIT_LOOPEXIT20:.*]], label %[[FOR_BODY]], !llvm.loop [[LOOP17]]
; CHECK: [[END_LOOPEXIT_LOOPEXIT]]:
; CHECK-NEXT: br label %[[END_LOOPEXIT:.*]]
; CHECK: [[END_LOOPEXIT_LOOPEXIT20]]:
@@ -143,17 +143,18 @@ end: ; preds = %end.loopexit, %entr
; CHECK: [[META1]] = !{!"int", [[META2:![0-9]+]], i64 0}
; CHECK: [[META2]] = !{!"omnipotent char", [[META3:![0-9]+]], i64 0}
; CHECK: [[META3]] = !{!"Simple C++ TBAA"}
-; CHECK: [[LOOP4]] = distinct !{[[LOOP4]], [[META5:![0-9]+]]}
+; CHECK: [[LOOP4]] = distinct !{[[LOOP4]], [[META5:![0-9]+]], [[META6:![0-9]+]]}
; CHECK: [[META5]] = !{!"llvm.loop.mustprogress"}
-; CHECK: [[META6]] = !{[[META7:![0-9]+]]}
-; CHECK: [[META7]] = distinct !{[[META7]], [[META8:![0-9]+]]}
-; CHECK: [[META8]] = distinct !{[[META8]], !"LVerDomain"}
-; CHECK: [[META9]] = !{[[META10:![0-9]+]]}
-; CHECK: [[META10]] = distinct !{[[META10]], [[META8]]}
-; CHECK: [[META11]] = !{[[META12:![0-9]+]], [[META13:![0-9]+]], [[META7]]}
-; CHECK: [[META12]] = distinct !{[[META12]], [[META8]]}
-; CHECK: [[META13]] = distinct !{[[META13]], [[META8]]}
-; CHECK: [[META14]] = !{[[META12]]}
+; CHECK: [[META6]] = !{!"llvm.loop.distribute.count", i32 2}
+; CHECK: [[META7]] = !{[[META8:![0-9]+]]}
+; CHECK: [[META8]] = distinct !{[[META8]], [[META9:![0-9]+]]}
+; CHECK: [[META9]] = distinct !{[[META9]], !"LVerDomain"}
+; CHECK: [[META10]] = !{[[META11:![0-9]+]]}
+; CHECK: [[META11]] = distinct !{[[META11]], [[META9]]}
+; CHECK: [[META12]] = !{[[META13:![0-9]+]], [[META14:![0-9]+]], [[META8]]}
+; CHECK: [[META13]] = distinct !{[[META13]], [[META9]]}
+; CHECK: [[META14]] = distinct !{[[META14]], [[META9]]}
; CHECK: [[META15]] = !{[[META13]]}
-; CHECK: [[LOOP16]] = distinct !{[[LOOP16]], [[META5]]}
+; CHECK: [[META16]] = !{[[META14]]}
+; CHECK: [[LOOP17]] = distinct !{[[LOOP17]], [[META5]]}
;.
diff --git a/llvm/test/Transforms/LoopDistribute/followup.ll b/llvm/test/Transforms/LoopDistribute/followup.ll
index 55307bdf24991..cbfc070a92614 100644
--- a/llvm/test/Transforms/LoopDistribute/followup.ll
+++ b/llvm/test/Transforms/LoopDistribute/followup.ll
@@ -58,29 +58,29 @@ define void @f(ptr %a, ptr %b, ptr %c, ptr %d, ptr %e) {
; CHECK: [[FOR_BODY_LDIST1]]:
; CHECK-NEXT: [[IND_LDIST1:%.*]] = phi i64 [ 0, %[[FOR_BODY_PH_LDIST1]] ], [ [[ADD_LDIST1:%.*]], %[[FOR_BODY_LDIST1]] ]
; CHECK-NEXT: [[ARRAYIDXA_LDIST1:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[IND_LDIST1]]
-; CHECK-NEXT: [[LOADA_LDIST1:%.*]] = load i32, ptr [[ARRAYIDXA_LDIST1]], align 4, !alias.scope [[META3:![0-9]+]], !noalias [[META6:![0-9]+]]
+; CHECK-NEXT: [[LOADA_LDIST1:%.*]] = load i32, ptr [[ARRAYIDXA_LDIST1]], align 4, !alias.scope [[META4:![0-9]+]], !noalias [[META7:![0-9]+]]
; CHECK-NEXT: [[ARRAYIDXB_LDIST1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[IND_LDIST1]]
-; CHECK-NEXT: [[LOADB_LDIST1:%.*]] = load i32, ptr [[ARRAYIDXB_LDIST1]], align 4, !alias.scope [[META10:![0-9]+]]
+; CHECK-NEXT: [[LOADB_LDIST1:%.*]] = load i32, ptr [[ARRAYIDXB_LDIST1]], align 4, !alias.scope [[META11:![0-9]+]]
; CHECK-NEXT: [[MULA_LDIST1:%.*]] = mul i32 [[LOADB_LDIST1]], [[LOADA_LDIST1]]
; CHECK-NEXT: [[ADD_LDIST1]] = add nuw nsw i64 [[IND_LDIST1]], 1
; CHECK-NEXT: [[ARRAYIDXA_PLUS_4_LDIST1:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[ADD_LDIST1]]
-; CHECK-NEXT: store i32 [[MULA_LDIST1]], ptr [[ARRAYIDXA_PLUS_4_LDIST1]], align 4, !alias.scope [[META3]], !noalias [[META6]]
+; CHECK-NEXT: store i32 [[MULA_LDIST1]], ptr [[ARRAYIDXA_PLUS_4_LDIST1]], align 4, !alias.scope [[META4]], !noalias [[META7]]
; CHECK-NEXT: [[EXITCOND_LDIST1:%.*]] = icmp eq i64 [[ADD_LDIST1]], 20
-; CHECK-NEXT: br i1 [[EXITCOND_LDIST1]], label %[[FOR_BODY_PH:.*]], label %[[FOR_BODY_LDIST1]], !llvm.loop [[LOOP12:![0-9]+]]
+; CHECK-NEXT: br i1 [[EXITCOND_LDIST1]], label %[[FOR_BODY_PH:.*]], label %[[FOR_BODY_LDIST1]], !llvm.loop [[LOOP13:![0-9]+]]
; CHECK: [[FOR_BODY_PH]]:
; CHECK-NEXT: br label %[[FOR_BODY:.*]]
; CHECK: [[FOR_BODY]]:
; CHECK-NEXT: [[IND:%.*]] = phi i64 [ 0, %[[FOR_BODY_PH]] ], [ [[ADD:%.*]], %[[FOR_BODY]] ]
; CHECK-NEXT: [[ADD]] = add nuw nsw i64 [[IND]], 1
; CHECK-NEXT: [[ARRAYIDXD:%.*]] = getelementptr inbounds i32, ptr [[D]], i64 [[IND]]
-; CHECK-NEXT: [[LOADD:%.*]] = load i32, ptr [[ARRAYIDXD]], align 4, !alias.scope [[META14:![0-9]+]]
+; CHECK-NEXT: [[LOADD:%.*]] = load i32, ptr [[ARRAYIDXD]], align 4, !alias.scope [[META15:![0-9]+]]
; CHECK-NEXT: [[ARRAYIDXE:%.*]] = getelementptr inbounds i32, ptr [[E]], i64 [[IND]]
-; CHECK-NEXT: [[LOADE:%.*]] = load i32, ptr [[ARRAYIDXE]], align 4, !alias.scope [[META15:![0-9]+]]
+; CHECK-NEXT: [[LOADE:%.*]] = load i32, ptr [[ARRAYIDXE]], align 4, !alias.scope [[META16:![0-9]+]]
; CHECK-NEXT: [[MULC:%.*]] = mul i32 [[LOADD]], [[LOADE]]
; CHECK-NEXT: [[ARRAYIDXC:%.*]] = getelementptr inbounds i32, ptr [[C]], i64 [[IND]]
-; CHECK-NEXT: store i32 [[MULC]], ptr [[ARRAYIDXC]], align 4, !alias.scope [[META16:![0-9]+]], !noalias [[META10]]
+; CHECK-NEXT: store i32 [[MULC]], ptr [[ARRAYIDXC]], align 4, !alias.scope [[META17:![0-9]+]], !noalias [[META11]]
; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[ADD]], 20
-; CHECK-NEXT: br i1 [[EXITCOND]], label %[[FOR_END_LOOPEXIT16:.*]], label %[[FOR_BODY]], !llvm.loop [[LOOP17:![0-9]+]]
+; CHECK-NEXT: br i1 [[EXITCOND]], label %[[FOR_END_LOOPEXIT16:.*]], label %[[FOR_BODY]], !llvm.loop [[LOOP18:![0-9]+]]
; CHECK: [[FOR_END_LOOPEXIT]]:
; CHECK-NEXT: br label %[[FOR_END:.*]]
; CHECK: [[FOR_END_LOOPEXIT16]]:
@@ -131,23 +131,24 @@ for.end:
!4 = !{!"llvm.loop.distribute.followup_sequential", !{!"FollowupSequential", i32 8}}
!5 = !{!"llvm.loop.distribute.followup_fallback", !{!"FollowupFallback"}}
;.
-; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
+; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]], [[META3:![0-9]+]]}
; CHECK: [[META1]] = !{!"FollowupAll"}
; CHECK: [[META2]] = !{!"FollowupFallback"}
-; CHECK: [[META3]] = !{[[META4:![0-9]+]]}
-; CHECK: [[META4]] = distinct !{[[META4]], [[META5:![0-9]+]]}
-; CHECK: [[META5]] = distinct !{[[META5]], !"LVerDomain"}
-; CHECK: [[META6]] = !{[[META7:![0-9]+]], [[META8:![0-9]+]], [[META9:![0-9]+]]}
-; CHECK: [[META7]] = distinct !{[[META7]], [[META5]]}
-; CHECK: [[META8]] = distinct !{[[META8]], [[META5]]}
-; CHECK: [[META9]] = distinct !{[[META9]], [[META5]]}
-; CHECK: [[META10]] = !{[[META11:![0-9]+]]}
-; CHECK: [[META11]] = distinct !{[[META11]], [[META5]]}
-; CHECK: [[LOOP12]] = distinct !{[[LOOP12]], [[META1]], [[META13:![0-9]+]]}
-; CHECK: [[META13]] = !{!"FollowupSequential", i32 8}
-; CHECK: [[META14]] = !{[[META8]]}
+; CHECK: [[META3]] = !{!"llvm.loop.distribute.count", i32 2}
+; CHECK: [[META4]] = !{[[META5:![0-9]+]]}
+; CHECK: [[META5]] = distinct !{[[META5]], [[META6:![0-9]+]]}
+; CHECK: [[META6]] = distinct !{[[META6]], !"LVerDomain"}
+; CHECK: [[META7]] = !{[[META8:![0-9]+]], [[META9:![0-9]+]], [[META10:![0-9]+]]}
+; CHECK: [[META8]] = distinct !{[[META8]], [[META6]]}
+; CHECK: [[META9]] = distinct !{[[META9]], [[META6]]}
+; CHECK: [[META10]] = distinct !{[[META10]], [[META6]]}
+; CHECK: [[META11]] = !{[[META12:![0-9]+]]}
+; CHECK: [[META12]] = distinct !{[[META12]], [[META6]]}
+; CHECK: [[LOOP13]] = distinct !{[[LOOP13]], [[META1]], [[META14:![0-9]+]]}
+; CHECK: [[META14]] = !{!"FollowupSequential", i32 8}
; CHECK: [[META15]] = !{[[META9]]}
-; CHECK: [[META16]] = !{[[META7]]}
-; CHECK: [[LOOP17]] = distinct !{[[LOOP17]], [[META1]], [[META18:![0-9]+]]}
-; CHECK: [[META18]] = !{!"FollowupCoincident", i1 false}
+; CHECK: [[META16]] = !{[[META10]]}
+; CHECK: [[META17]] = !{[[META8]]}
+; CHECK: [[LOOP18]] = distinct !{[[LOOP18]], [[META1]], [[META19:![0-9]+]]}
+; CHECK: [[META19]] = !{!"FollowupCoincident", i1 false}
;.
diff --git a/llvm/test/Transforms/LoopDistribute/laa-invalidation.ll b/llvm/test/Transforms/LoopDistribute/laa-invalidation.ll
index ee42860cd250e..62c5627ac2d38 100644
--- a/llvm/test/Transforms/LoopDistribute/laa-invalidation.ll
+++ b/llvm/test/Transforms/LoopDistribute/laa-invalidation.ll
@@ -27,13 +27,13 @@ define void @test_pr50940(ptr %A, ptr %B) {
; CHECK-NEXT: store i16 1, ptr [[B]], align 1
; CHECK-NEXT: [[IV_NEXT_LVER_ORIG]] = add nuw nsw i16 [[IV_LVER_ORIG]], 1
; CHECK-NEXT: [[C_1_LVER_ORIG:%.*]] = icmp ult i16 [[IV_LVER_ORIG]], 38
-; CHECK-NEXT: br i1 [[C_1_LVER_ORIG]], label [[INNER_LVER_ORIG]], label [[EXIT_LOOPEXIT:%.*]]
+; CHECK-NEXT: br i1 [[C_1_LVER_ORIG]], label [[INNER_LVER_ORIG]], label [[EXIT_LOOPEXIT:%.*]], !llvm.loop [[LOOP0:![0-9]+]]
; CHECK: inner.ph3.ldist1:
; CHECK-NEXT: br label [[INNER_LDIST1:%.*]]
; CHECK: inner.ldist1:
; CHECK-NEXT: [[IV_LDIST1:%.*]] = phi i16 [ 0, [[INNER_PH3_LDIST1]] ], [ [[IV_NEXT_LDIST1:%.*]], [[INNER_LDIST1]] ]
-; CHECK-NEXT: [[L_LDIST1:%.*]] = load <2 x i16>, ptr [[UGLYGEP]], align 1, !alias.scope !0, !noalias !3
-; CHECK-NEXT: store i16 0, ptr [[GEP_A_3]], align 1, !alias.scope !0, !noalias !3
+; CHECK-NEXT: [[L_LDIST1:%.*]] = load <2 x i16>, ptr [[UGLYGEP]], align 1, !alias.scope [[META2:![0-9]+]], !noalias [[META5:![0-9]+]]
+; CHECK-NEXT: store i16 0, ptr [[GEP_A_3]], align 1, !alias.scope [[META2]], !noalias [[META5]]
; CHECK-NEXT: [[IV_NEXT_LDIST1]] = add nuw nsw i16 [[IV_LDIST1]], 1
; CHECK-NEXT: [[C_1_LDIST1:%.*]] = icmp ult i16 [[IV_LDIST1]], 38
; CHECK-NEXT: br i1 [[C_1_LDIST1]], label [[INNER_LDIST1]], label [[INNER_PH3:%.*]]
@@ -41,7 +41,7 @@ define void @test_pr50940(ptr %A, ptr %B) {
; CHECK-NEXT: br label [[INNER:%.*]]
; CHECK: inner:
; CHECK-NEXT: [[IV:%.*]] = phi i16 [ 0, [[INNER_PH3]] ], [ [[IV_NEXT:%.*]], [[INNER]] ]
-; CHECK-NEXT: store i16 1, ptr [[B]], align 1, !alias.scope !3
+; CHECK-NEXT: store i16 1, ptr [[B]], align 1, !alias.scope [[META5]]
; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i16 [[IV]], 1
; CHECK-NEXT: [[C_1:%.*]] = icmp ult i16 [[IV]], 38
; CHECK-NEXT: br i1 [[C_1]], label [[INNER]], label [[EXIT_LOOPEXIT4:%.*]]
diff --git a/llvm/test/Transforms/LoopDistribute/no-reprocess.ll b/llvm/test/Transforms/LoopDistribute/no-reprocess.ll
new file mode 100644
index 0000000000000..40fa08548366f
--- /dev/null
+++ b/llvm/test/Transforms/LoopDistribute/no-reprocess.ll
@@ -0,0 +1,130 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -aa-pipeline=basic-aa -passes=loop-distribute -enable-loop-distribute -verify-loop-info -verify-dom-info -debug-only=loop-distribute --disable-output -S < %s 2>&1 | FileCheck %s
+; REQUIRES: asserts
+
+; Test if a loop already distributed will not reprocess because of metadata
+; information marking it as processed.
+
+; CHECK: LDist: Distributed loop guarded for reprocessing
+; CHECK: LDist: Skipping;
+
+define dso_local void @_Z13distribution3PiS_S_S_i(ptr noundef captures(none) %a, ptr noundef readonly captures(none) %b, ptr noundef captures(none) %c, ptr noundef writeonly captures(none) %d, i64 noundef signext %len) {
+entry:
+ %cmp = icmp sgt i64 %len, 0
+ br i1 %cmp, label %end, label %for.body.lver.check
+
+for.body.lver.check: ; preds = %entry
+ %0 = shl i64 %len, 2
+ %scevgep = getelementptr i8, ptr %a, i64 %0
+ %scevgep1 = getelementptr i8, ptr %c, i64 -4
+ %scevgep2 = getelementptr i8, ptr %c, i64 %0
+ %scevgep3 = getelementptr i8, ptr %d, i64 %0
+ %scevgep4 = getelementptr i8, ptr %b, i64 %0
+ %bound0 = icmp ult ptr %a, %scevgep2
+ %bound1 = icmp ult ptr %scevgep1, %scevgep
+ %found.conflict = and i1 %bound0, %bound1
+ %bound05 = icmp ult ptr %a, %scevgep3
+ %bound16 = icmp ult ptr %d, %scevgep
+ %found.conflict7 = and i1 %bound05, %bound16
+ %conflict.rdx = or i1 %found.conflict, %found.conflict7
+ %bound08 = icmp ult ptr %a, %scevgep4
+ %bound19 = icmp ult ptr %b, %scevgep
+ %found.conflict10 = and i1 %bound08, %bound19
+ %conflict.rdx11 = or i1 %conflict.rdx, %found.conflict10
+ %bound012 = icmp ult ptr %scevgep1, %scevgep3
+ %bound113 = icmp ult ptr %d, %scevgep2
+ %found.conflict14 = and i1 %bound012, %bound113
+ %conflict.rdx15 = or i1 %conflict.rdx11, %found.conflict14
+ %bound016 = icmp ult ptr %d, %scevgep4
+ %bound117 = icmp ult ptr %b, %scevgep3
+ %found.conflict18 = and i1 %bound016, %bound117
+ %conflict.rdx19 = or i1 %conflict.rdx15, %found.conflict18
+ br i1 %conflict.rdx19, label %for.body.ph.lver.orig, label %for.body.ph.ldist1
+
+for.body.ph.lver.orig: ; preds = %for.body.lver.check
+ br label %for.body.lver.orig
+
+for.body.lver.orig: ; preds = %for.body.lver.orig, %for.body.ph.lver.orig
+ %indvars.iv.lver.orig = phi i64 [ 0, %for.body.ph.lver.orig ], [ %indvars.iv.next.lver.orig, %for.body.lver.orig ]
+ %arrayidx.lver.orig = getelementptr inbounds i32, ptr %b, i64 %indvars.iv.lver.orig
+ %i2.lver.orig = load i32, ptr %arrayidx.lver.orig, align 4, !tbaa !0
+ %add4.lver.orig = add nsw i32 %i2.lver.orig, 1
+ %arrayidx8.lver.orig = getelementptr inbounds i32, ptr %a, i64 %indvars.iv.lver.orig
+ store i32 %add4.lver.orig, ptr %arrayidx8.lver.orig, align 4, !tbaa !0
+ %i3.lver.orig = getelementptr i32, ptr %c, i64 %indvars.iv.lver.orig
+ %arrayidx17.lver.orig = getelementptr i8, ptr %i3.lver.orig, i64 -4
+ %i4.lver.orig = load i32, ptr %arrayidx17.lver.orig, align 4, !tbaa !0
+ %sub18.lver.orig = sub nsw i32 %add4.lver.orig, %i4.lver.orig
+ store i32 %sub18.lver.orig, ptr %i3.lver.orig, align 4, !tbaa !0
+ %i5.lver.orig = load i32, ptr %arrayidx8.lver.orig, align 4, !tbaa !0
+ %add27.lver.orig = add nsw i32 %i5.lver.orig, 2
+ %arrayidx31.lver.orig = getelementptr inbounds i32, ptr %d, i64 %indvars.iv.lver.orig
+ store i32 %add27.lver.orig, ptr %arrayidx31.lver.orig, align 4, !tbaa !0
+ %indvars.iv.next.lver.orig = add i64 %indvars.iv.lver.orig, 1
+ %cmp1.not.lver.orig = icmp eq i64 %indvars.iv.next.lver.orig, %len
+ br i1 %cmp1.not.lver.orig, label %end.loopexit.loopexit, label %for.body.lver.orig, !llvm.loop !4
+
+for.body.ph.ldist1: ; preds = %for.body.lver.check
+ br label %for.body.ldist1
+
+for.body.ldist1: ; preds = %for.body.ldist1, %for.body.ph.ldist1
+ %indvars.iv.ldist1 = phi i64 [ 0, %for.body.ph.ldist1 ], [ %indvars.iv.next.ldist1, %for.body.ldist1 ]
+ %arrayidx.ldist1 = getelementptr inbounds i32, ptr %b, i64 %indvars.iv.ldist1
+ %i2.ldist1 = load i32, ptr %arrayidx.ldist1, align 4, !tbaa !0, !alias.scope !7
+ %add4.ldist1 = add nsw i32 %i2.ldist1, 1
+ %arrayidx8.ldist1 = getelementptr inbounds i32, ptr %a, i64 %indvars.iv.ldist1
+ store i32 %add4.ldist1, ptr %arrayidx8.ldist1, align 4, !tbaa !0, !alias.scope !10, !noalias !12
+ %...
[truncated]
✅ With the latest revision this PR passed the C/C++ code formatter.
4bf1302 to d3463cf
Why a count instead of just a flag that marks the loop as already vectorized? .count looks like an instruction to the compiler to distribute into that number of parts (compare llvm.loop.unroll.count).
LoopVectorizer's marker is llvm.loop.isvectorized.
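To make the trade-off raised in this comment concrete, here is a minimal standalone C++ sketch. It does not use LLVM's API: `LoopMD`, `getOptionalInt`, `shouldSkipWithCount`, and `shouldSkipWithFlag` are hypothetical names for illustration, and the flag name `llvm.loop.isdistributed` is the one that appears later in this conversation. It suggests that a guard of the shape used in the patch only tests for presence, so the stored count is never interpreted:

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a loop's metadata: attribute name -> integer value.
using LoopMD = std::map<std::string, int>;

// Look up an integer attribute; returns nullopt when the attribute is absent.
std::optional<int> getOptionalInt(const LoopMD &MD, const std::string &Name) {
  auto It = MD.find(Name);
  if (It == MD.end())
    return std::nullopt;
  return It->second;
}

// Count-based guard: skips whenever the attribute is present, whatever its value.
bool shouldSkipWithCount(const LoopMD &MD) {
  return getOptionalInt(MD, "llvm.loop.distribute.count").has_value();
}

// Flag-based guard: same decision, with no value left to interpret.
bool shouldSkipWithFlag(const LoopMD &MD) {
  return MD.count("llvm.loop.isdistributed") != 0;
}
```

In this model a count of 2 and a count of 7 lead to the same decision, which is the sense in which the two encodings are equivalent for the guard alone.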
Ok, I can do that. It's functionally equivalent for our purposes.
51a389e to fbd591a
> Ok, I can do that. It's functionally equivalent for our purposes.
If there is a use for the number, we can add it, but a rationale is missing. The rationale would also determine how to interpret the number. Otherwise use the simplest data structure that does the job.
Please also add documentation to the language reference. Yes, llvm.loop.isvectorized is missing already.
fbd591a to 86f742a
llvm/docs/TransformMetadata.rst (outdated)
It is recommended to add ``llvm.loop.isdistributed`` to mark loops
that have been transformed by LoopDistribute so that they are not
reprocessed under LTO, where they may be given a second opportunity.
The paragraph you copy&pasted is a recommendation for what frontends should do. llvm.loop.isdistributed is added by LLVM itself. It doesn't make sense to recommend that frontends do something they do not have control over.
I had mostly the Language Reference in mind; it is supposed to have an exhaustive list of all recognized metadata.
I've added the language reference entry; a transform metadata entry may still be needed.
…d loops Add a boolean marker to a loop's metadata noting when loop distribution was successfully applied, then check for loops which are already distributed to prevent reprocessing.
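The marker-then-guard behaviour described in this commit message can be sketched in standalone C++. This is a simplified model under stated assumptions, not the code in the patch: `ModelLoop`, `runDistribute`, and the `Distributable` field are hypothetical, and the model marks the loop itself, whereas the patch attaches its metadata to the non-versioned loop. Only the attribute name `llvm.loop.isdistributed` comes from the conversation.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a loop: a name plus the metadata attribute names attached to it.
struct ModelLoop {
  std::string Name;
  std::set<std::string> Attrs;
  bool Distributable; // assumption: stands in for the pass's legality checks
};

// Marker name taken from the conversation; everything else here is illustrative.
static const char *const IsDistributedMD = "llvm.loop.isdistributed";

// One run of the modelled pass; returns how many loops were distributed.
int runDistribute(std::vector<ModelLoop> &Worklist) {
  int NumDistributed = 0;
  for (ModelLoop &L : Worklist) {
    // Do not reprocess loops that an earlier run already distributed.
    if (L.Attrs.count(IsDistributedMD))
      continue;
    // Loops that cannot be distributed are left unmarked.
    if (!L.Distributable)
      continue;
    // Mark the loop only after distribution was applied.
    L.Attrs.insert(IsDistributedMD);
    ++NumDistributed;
  }
  return NumDistributed;
}
```

Calling `runDistribute` twice on the same worklist models a second opportunity such as LTO: the second call distributes nothing because every loop transformed by the first call already carries the marker.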
86f742a to ab071e5
LGTM, thank you
…d loops
Add a count of the number of partitions LoopDist made when distributing a loop in metadata, then check for loops which are already distributed to prevent reprocessing.
We see this happen on some spec apps; LD is on by default at SiFive.