mcberg2021
Contributor

@mcberg2021 mcberg2021 commented Aug 15, 2025

…d loops

Add a count of the number of partitions LoopDist made when distributing a loop to the loop's metadata, then check for loops that are already distributed to prevent reprocessing.

We see this happen on some SPEC apps; LoopDistribute is on by default at SiFive.
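
The guard described above can be sketched outside of LLVM as a small standalone C++ model. This is only an illustration under stated assumptions, not the patch itself: `ModelLoop`, `getOptionalIntAttr`, `distribute`, and `runOnWorklist` are hypothetical names, and loop metadata is modeled as a plain string-to-integer map rather than LLVM's metadata nodes. Only the key `llvm.loop.distribute.count` and the debug message are taken from the patch.

```cpp
#include <cassert>
#include <iostream>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a loop and its loop-metadata attributes.
struct ModelLoop {
  std::string Name;
  std::map<std::string, int> Metadata;
};

// Metadata key taken from the patch under review.
static const char *DistributedCountMetaData = "llvm.loop.distribute.count";

// Look up an integer attribute; empty if the loop does not carry it.
std::optional<int> getOptionalIntAttr(const ModelLoop &L,
                                      const std::string &Key) {
  auto It = L.Metadata.find(Key);
  if (It == L.Metadata.end())
    return std::nullopt;
  return It->second;
}

// Pretend to distribute a loop and record the partition count on it.
void distribute(ModelLoop &L, int NumPartitions) {
  assert(NumPartitions > 1 && "distribution yields at least two partitions");
  L.Metadata[DistributedCountMetaData] = NumPartitions;
}

// Walk the worklist, skipping loops that already carry the marker.
// Returns how many loops were distributed during this run.
int runOnWorklist(std::vector<ModelLoop> &Worklist) {
  int NumDistributed = 0;
  for (ModelLoop &L : Worklist) {
    if (getOptionalIntAttr(L, DistributedCountMetaData)) {
      std::cout << "LDist: Distributed loop guarded for reprocessing\n";
      continue;
    }
    distribute(L, 2); // assumed partition count, for illustration only
    ++NumDistributed;
  }
  return NumDistributed;
}
```

Calling `runOnWorklist` twice on the same worklist distributes every loop in the first run and none in the second, which is the reprocessing the patch aims to prevent.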

@llvmbot
Member

llvmbot commented Aug 15, 2025

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-llvm-transforms

Author: Michael Berg (mcberg2021)

Changes


Patch is 31.62 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/153902.diff

8 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/LoopDistribute.cpp (+10)
  • (modified) llvm/test/Transforms/LoopDistribute/cross-partition-access.ll (+20-19)
  • (modified) llvm/test/Transforms/LoopDistribute/followup.ll (+25-24)
  • (modified) llvm/test/Transforms/LoopDistribute/laa-invalidation.ll (+4-4)
  • (added) llvm/test/Transforms/LoopDistribute/no-reprocess.ll (+130)
  • (modified) llvm/test/Transforms/LoopDistribute/scev-inserted-runtime-check.ll (+6-6)
  • (modified) llvm/test/Transforms/LoopVersioning/noalias-version-twice.ll (+22-18)
  • (modified) llvm/test/tools/UpdateTestChecks/update_analyze_test_checks/Inputs/loop-distribute.ll.expected (+3-3)
diff --git a/llvm/lib/Transforms/Scalar/LoopDistribute.cpp b/llvm/lib/Transforms/Scalar/LoopDistribute.cpp
index 27d3004d81947..d54fb482da262 100644
--- a/llvm/lib/Transforms/Scalar/LoopDistribute.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopDistribute.cpp
@@ -113,6 +113,8 @@ static cl::opt<bool> EnableLoopDistribute(
     cl::desc("Enable the new, experimental LoopDistribution Pass"),
     cl::init(false));
 
+static const char *DistributedCountMetaData = "llvm.loop.distribute.count";
+
 STATISTIC(NumLoopsDistributed, "Number of loops distributed");
 
 namespace {
@@ -826,6 +828,8 @@ class LoopDistributeForLoop {
           {LLVMLoopDistributeFollowupAll, LLVMLoopDistributeFollowupFallback},
           "llvm.loop.distribute.", true);
       LVer.getNonVersionedLoop()->setLoopID(UnversionedLoopID);
+      addStringMetadataToLoop(LVer.getNonVersionedLoop(),
+                              DistributedCountMetaData, Partitions.getSize());
     }
 
     // Create identical copies of the original loop for each partition and hook
@@ -986,6 +990,12 @@ static bool runImpl(Function &F, LoopInfo *LI, DominatorTree *DT,
   for (Loop *L : Worklist) {
     LoopDistributeForLoop LDL(L, &F, LI, DT, SE, LAIs, ORE);
 
+    // Do not reprocess loops we already distributed
+    if (auto Distributed = getOptionalIntLoopAttribute(L, DistributedCountMetaData)) {
+      LLVM_DEBUG(dbgs() << "LDist: Distributed loop guarded for reprocessing\n");
+      continue;
+    }
+
     // If distribution was forced for the specific loop to be
     // enabled/disabled, follow that.  Otherwise use the global flag.
     if (LDL.isForced().value_or(EnableLoopDistribute))
diff --git a/llvm/test/Transforms/LoopDistribute/cross-partition-access.ll b/llvm/test/Transforms/LoopDistribute/cross-partition-access.ll
index 6e1106c3277a7..112cd910a0efa 100644
--- a/llvm/test/Transforms/LoopDistribute/cross-partition-access.ll
+++ b/llvm/test/Transforms/LoopDistribute/cross-partition-access.ll
@@ -64,30 +64,30 @@ define dso_local void @_Z13distribution3PiS_S_S_i(ptr nocapture noundef %a, ptr
 ; CHECK:       [[FOR_BODY_LDIST1]]:
 ; CHECK-NEXT:    [[IDXPROM_LDIST1:%.*]] = phi i64 [ 0, %[[FOR_BODY_PH_LDIST1]] ], [ [[I6_LDIST1:%.*]], %[[FOR_BODY_LDIST1]] ]
 ; CHECK-NEXT:    [[ARRAYIDX_LDIST1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[IDXPROM_LDIST1]]
-; CHECK-NEXT:    [[I2_LDIST1:%.*]] = load i32, ptr [[ARRAYIDX_LDIST1]], align 4, !tbaa [[TBAA0]], !alias.scope [[META6:![0-9]+]]
+; CHECK-NEXT:    [[I2_LDIST1:%.*]] = load i32, ptr [[ARRAYIDX_LDIST1]], align 4, !tbaa [[TBAA0]], !alias.scope [[META7:![0-9]+]]
 ; CHECK-NEXT:    [[ADD4_LDIST1:%.*]] = add nsw i32 [[I2_LDIST1]], 1
 ; CHECK-NEXT:    [[ARRAYIDX8_LDIST1:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[IDXPROM_LDIST1]]
-; CHECK-NEXT:    store i32 [[ADD4_LDIST1]], ptr [[ARRAYIDX8_LDIST1]], align 4, !tbaa [[TBAA0]], !alias.scope [[META9:![0-9]+]], !noalias [[META11:![0-9]+]]
+; CHECK-NEXT:    store i32 [[ADD4_LDIST1]], ptr [[ARRAYIDX8_LDIST1]], align 4, !tbaa [[TBAA0]], !alias.scope [[META10:![0-9]+]], !noalias [[META12:![0-9]+]]
 ; CHECK-NEXT:    [[I3_LDIST1:%.*]] = getelementptr i32, ptr [[C]], i64 [[IDXPROM_LDIST1]]
 ; CHECK-NEXT:    [[ARRAYIDX17_LDIST1:%.*]] = getelementptr i8, ptr [[I3_LDIST1]], i64 -4
-; CHECK-NEXT:    [[I4_LDIST1:%.*]] = load i32, ptr [[ARRAYIDX17_LDIST1]], align 4, !tbaa [[TBAA0]], !alias.scope [[META14:![0-9]+]], !noalias [[META15:![0-9]+]]
+; CHECK-NEXT:    [[I4_LDIST1:%.*]] = load i32, ptr [[ARRAYIDX17_LDIST1]], align 4, !tbaa [[TBAA0]], !alias.scope [[META15:![0-9]+]], !noalias [[META16:![0-9]+]]
 ; CHECK-NEXT:    [[SUB18_LDIST1:%.*]] = sub nsw i32 [[ADD4_LDIST1]], [[I4_LDIST1]]
-; CHECK-NEXT:    store i32 [[SUB18_LDIST1]], ptr [[I3_LDIST1]], align 4, !tbaa [[TBAA0]], !alias.scope [[META14]], !noalias [[META15]]
+; CHECK-NEXT:    store i32 [[SUB18_LDIST1]], ptr [[I3_LDIST1]], align 4, !tbaa [[TBAA0]], !alias.scope [[META15]], !noalias [[META16]]
 ; CHECK-NEXT:    [[I6_LDIST1]] = add i64 [[IDXPROM_LDIST1]], 1
 ; CHECK-NEXT:    [[CMP1_NOT_LDIST1:%.*]] = icmp eq i64 [[I6_LDIST1]], [[LEN]]
-; CHECK-NEXT:    br i1 [[CMP1_NOT_LDIST1]], label %[[FOR_BODY_PH:.*]], label %[[FOR_BODY_LDIST1]], !llvm.loop [[LOOP16:![0-9]+]]
+; CHECK-NEXT:    br i1 [[CMP1_NOT_LDIST1]], label %[[FOR_BODY_PH:.*]], label %[[FOR_BODY_LDIST1]], !llvm.loop [[LOOP17:![0-9]+]]
 ; CHECK:       [[FOR_BODY_PH]]:
 ; CHECK-NEXT:    br label %[[FOR_BODY:.*]]
 ; CHECK:       [[FOR_BODY]]:
 ; CHECK-NEXT:    [[IDXPROM:%.*]] = phi i64 [ 0, %[[FOR_BODY_PH]] ], [ [[I6:%.*]], %[[FOR_BODY]] ]
 ; CHECK-NEXT:    [[ARRAYIDX8:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[IDXPROM]]
-; CHECK-NEXT:    [[I5:%.*]] = load i32, ptr [[ARRAYIDX8]], align 4, !tbaa [[TBAA0]], !alias.scope [[META9]], !noalias [[META11]]
+; CHECK-NEXT:    [[I5:%.*]] = load i32, ptr [[ARRAYIDX8]], align 4, !tbaa [[TBAA0]], !alias.scope [[META10]], !noalias [[META12]]
 ; CHECK-NEXT:    [[ADD27:%.*]] = add nsw i32 [[I5]], 2
 ; CHECK-NEXT:    [[ARRAYIDX31:%.*]] = getelementptr inbounds i32, ptr [[D]], i64 [[IDXPROM]]
-; CHECK-NEXT:    store i32 [[ADD27]], ptr [[ARRAYIDX31]], align 4, !tbaa [[TBAA0]], !alias.scope [[META15]], !noalias [[META6]]
+; CHECK-NEXT:    store i32 [[ADD27]], ptr [[ARRAYIDX31]], align 4, !tbaa [[TBAA0]], !alias.scope [[META16]], !noalias [[META7]]
 ; CHECK-NEXT:    [[I6]] = add i64 [[IDXPROM]], 1
 ; CHECK-NEXT:    [[CMP1_NOT:%.*]] = icmp eq i64 [[I6]], [[LEN]]
-; CHECK-NEXT:    br i1 [[CMP1_NOT]], label %[[END_LOOPEXIT_LOOPEXIT20:.*]], label %[[FOR_BODY]], !llvm.loop [[LOOP16]]
+; CHECK-NEXT:    br i1 [[CMP1_NOT]], label %[[END_LOOPEXIT_LOOPEXIT20:.*]], label %[[FOR_BODY]], !llvm.loop [[LOOP17]]
 ; CHECK:       [[END_LOOPEXIT_LOOPEXIT]]:
 ; CHECK-NEXT:    br label %[[END_LOOPEXIT:.*]]
 ; CHECK:       [[END_LOOPEXIT_LOOPEXIT20]]:
@@ -143,17 +143,18 @@ end:                                              ; preds = %end.loopexit, %entr
 ; CHECK: [[META1]] = !{!"int", [[META2:![0-9]+]], i64 0}
 ; CHECK: [[META2]] = !{!"omnipotent char", [[META3:![0-9]+]], i64 0}
 ; CHECK: [[META3]] = !{!"Simple C++ TBAA"}
-; CHECK: [[LOOP4]] = distinct !{[[LOOP4]], [[META5:![0-9]+]]}
+; CHECK: [[LOOP4]] = distinct !{[[LOOP4]], [[META5:![0-9]+]], [[META6:![0-9]+]]}
 ; CHECK: [[META5]] = !{!"llvm.loop.mustprogress"}
-; CHECK: [[META6]] = !{[[META7:![0-9]+]]}
-; CHECK: [[META7]] = distinct !{[[META7]], [[META8:![0-9]+]]}
-; CHECK: [[META8]] = distinct !{[[META8]], !"LVerDomain"}
-; CHECK: [[META9]] = !{[[META10:![0-9]+]]}
-; CHECK: [[META10]] = distinct !{[[META10]], [[META8]]}
-; CHECK: [[META11]] = !{[[META12:![0-9]+]], [[META13:![0-9]+]], [[META7]]}
-; CHECK: [[META12]] = distinct !{[[META12]], [[META8]]}
-; CHECK: [[META13]] = distinct !{[[META13]], [[META8]]}
-; CHECK: [[META14]] = !{[[META12]]}
+; CHECK: [[META6]] = !{!"llvm.loop.distribute.count", i32 2}
+; CHECK: [[META7]] = !{[[META8:![0-9]+]]}
+; CHECK: [[META8]] = distinct !{[[META8]], [[META9:![0-9]+]]}
+; CHECK: [[META9]] = distinct !{[[META9]], !"LVerDomain"}
+; CHECK: [[META10]] = !{[[META11:![0-9]+]]}
+; CHECK: [[META11]] = distinct !{[[META11]], [[META9]]}
+; CHECK: [[META12]] = !{[[META13:![0-9]+]], [[META14:![0-9]+]], [[META8]]}
+; CHECK: [[META13]] = distinct !{[[META13]], [[META9]]}
+; CHECK: [[META14]] = distinct !{[[META14]], [[META9]]}
 ; CHECK: [[META15]] = !{[[META13]]}
-; CHECK: [[LOOP16]] = distinct !{[[LOOP16]], [[META5]]}
+; CHECK: [[META16]] = !{[[META14]]}
+; CHECK: [[LOOP17]] = distinct !{[[LOOP17]], [[META5]]}
 ;.
diff --git a/llvm/test/Transforms/LoopDistribute/followup.ll b/llvm/test/Transforms/LoopDistribute/followup.ll
index 55307bdf24991..cbfc070a92614 100644
--- a/llvm/test/Transforms/LoopDistribute/followup.ll
+++ b/llvm/test/Transforms/LoopDistribute/followup.ll
@@ -58,29 +58,29 @@ define void @f(ptr %a, ptr %b, ptr %c, ptr %d, ptr %e) {
 ; CHECK:       [[FOR_BODY_LDIST1]]:
 ; CHECK-NEXT:    [[IND_LDIST1:%.*]] = phi i64 [ 0, %[[FOR_BODY_PH_LDIST1]] ], [ [[ADD_LDIST1:%.*]], %[[FOR_BODY_LDIST1]] ]
 ; CHECK-NEXT:    [[ARRAYIDXA_LDIST1:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[IND_LDIST1]]
-; CHECK-NEXT:    [[LOADA_LDIST1:%.*]] = load i32, ptr [[ARRAYIDXA_LDIST1]], align 4, !alias.scope [[META3:![0-9]+]], !noalias [[META6:![0-9]+]]
+; CHECK-NEXT:    [[LOADA_LDIST1:%.*]] = load i32, ptr [[ARRAYIDXA_LDIST1]], align 4, !alias.scope [[META4:![0-9]+]], !noalias [[META7:![0-9]+]]
 ; CHECK-NEXT:    [[ARRAYIDXB_LDIST1:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[IND_LDIST1]]
-; CHECK-NEXT:    [[LOADB_LDIST1:%.*]] = load i32, ptr [[ARRAYIDXB_LDIST1]], align 4, !alias.scope [[META10:![0-9]+]]
+; CHECK-NEXT:    [[LOADB_LDIST1:%.*]] = load i32, ptr [[ARRAYIDXB_LDIST1]], align 4, !alias.scope [[META11:![0-9]+]]
 ; CHECK-NEXT:    [[MULA_LDIST1:%.*]] = mul i32 [[LOADB_LDIST1]], [[LOADA_LDIST1]]
 ; CHECK-NEXT:    [[ADD_LDIST1]] = add nuw nsw i64 [[IND_LDIST1]], 1
 ; CHECK-NEXT:    [[ARRAYIDXA_PLUS_4_LDIST1:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[ADD_LDIST1]]
-; CHECK-NEXT:    store i32 [[MULA_LDIST1]], ptr [[ARRAYIDXA_PLUS_4_LDIST1]], align 4, !alias.scope [[META3]], !noalias [[META6]]
+; CHECK-NEXT:    store i32 [[MULA_LDIST1]], ptr [[ARRAYIDXA_PLUS_4_LDIST1]], align 4, !alias.scope [[META4]], !noalias [[META7]]
 ; CHECK-NEXT:    [[EXITCOND_LDIST1:%.*]] = icmp eq i64 [[ADD_LDIST1]], 20
-; CHECK-NEXT:    br i1 [[EXITCOND_LDIST1]], label %[[FOR_BODY_PH:.*]], label %[[FOR_BODY_LDIST1]], !llvm.loop [[LOOP12:![0-9]+]]
+; CHECK-NEXT:    br i1 [[EXITCOND_LDIST1]], label %[[FOR_BODY_PH:.*]], label %[[FOR_BODY_LDIST1]], !llvm.loop [[LOOP13:![0-9]+]]
 ; CHECK:       [[FOR_BODY_PH]]:
 ; CHECK-NEXT:    br label %[[FOR_BODY:.*]]
 ; CHECK:       [[FOR_BODY]]:
 ; CHECK-NEXT:    [[IND:%.*]] = phi i64 [ 0, %[[FOR_BODY_PH]] ], [ [[ADD:%.*]], %[[FOR_BODY]] ]
 ; CHECK-NEXT:    [[ADD]] = add nuw nsw i64 [[IND]], 1
 ; CHECK-NEXT:    [[ARRAYIDXD:%.*]] = getelementptr inbounds i32, ptr [[D]], i64 [[IND]]
-; CHECK-NEXT:    [[LOADD:%.*]] = load i32, ptr [[ARRAYIDXD]], align 4, !alias.scope [[META14:![0-9]+]]
+; CHECK-NEXT:    [[LOADD:%.*]] = load i32, ptr [[ARRAYIDXD]], align 4, !alias.scope [[META15:![0-9]+]]
 ; CHECK-NEXT:    [[ARRAYIDXE:%.*]] = getelementptr inbounds i32, ptr [[E]], i64 [[IND]]
-; CHECK-NEXT:    [[LOADE:%.*]] = load i32, ptr [[ARRAYIDXE]], align 4, !alias.scope [[META15:![0-9]+]]
+; CHECK-NEXT:    [[LOADE:%.*]] = load i32, ptr [[ARRAYIDXE]], align 4, !alias.scope [[META16:![0-9]+]]
 ; CHECK-NEXT:    [[MULC:%.*]] = mul i32 [[LOADD]], [[LOADE]]
 ; CHECK-NEXT:    [[ARRAYIDXC:%.*]] = getelementptr inbounds i32, ptr [[C]], i64 [[IND]]
-; CHECK-NEXT:    store i32 [[MULC]], ptr [[ARRAYIDXC]], align 4, !alias.scope [[META16:![0-9]+]], !noalias [[META10]]
+; CHECK-NEXT:    store i32 [[MULC]], ptr [[ARRAYIDXC]], align 4, !alias.scope [[META17:![0-9]+]], !noalias [[META11]]
 ; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp eq i64 [[ADD]], 20
-; CHECK-NEXT:    br i1 [[EXITCOND]], label %[[FOR_END_LOOPEXIT16:.*]], label %[[FOR_BODY]], !llvm.loop [[LOOP17:![0-9]+]]
+; CHECK-NEXT:    br i1 [[EXITCOND]], label %[[FOR_END_LOOPEXIT16:.*]], label %[[FOR_BODY]], !llvm.loop [[LOOP18:![0-9]+]]
 ; CHECK:       [[FOR_END_LOOPEXIT]]:
 ; CHECK-NEXT:    br label %[[FOR_END:.*]]
 ; CHECK:       [[FOR_END_LOOPEXIT16]]:
@@ -131,23 +131,24 @@ for.end:
 !4 = !{!"llvm.loop.distribute.followup_sequential", !{!"FollowupSequential", i32 8}}
 !5 = !{!"llvm.loop.distribute.followup_fallback", !{!"FollowupFallback"}}
 ;.
-; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
+; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]], [[META3:![0-9]+]]}
 ; CHECK: [[META1]] = !{!"FollowupAll"}
 ; CHECK: [[META2]] = !{!"FollowupFallback"}
-; CHECK: [[META3]] = !{[[META4:![0-9]+]]}
-; CHECK: [[META4]] = distinct !{[[META4]], [[META5:![0-9]+]]}
-; CHECK: [[META5]] = distinct !{[[META5]], !"LVerDomain"}
-; CHECK: [[META6]] = !{[[META7:![0-9]+]], [[META8:![0-9]+]], [[META9:![0-9]+]]}
-; CHECK: [[META7]] = distinct !{[[META7]], [[META5]]}
-; CHECK: [[META8]] = distinct !{[[META8]], [[META5]]}
-; CHECK: [[META9]] = distinct !{[[META9]], [[META5]]}
-; CHECK: [[META10]] = !{[[META11:![0-9]+]]}
-; CHECK: [[META11]] = distinct !{[[META11]], [[META5]]}
-; CHECK: [[LOOP12]] = distinct !{[[LOOP12]], [[META1]], [[META13:![0-9]+]]}
-; CHECK: [[META13]] = !{!"FollowupSequential", i32 8}
-; CHECK: [[META14]] = !{[[META8]]}
+; CHECK: [[META3]] = !{!"llvm.loop.distribute.count", i32 2}
+; CHECK: [[META4]] = !{[[META5:![0-9]+]]}
+; CHECK: [[META5]] = distinct !{[[META5]], [[META6:![0-9]+]]}
+; CHECK: [[META6]] = distinct !{[[META6]], !"LVerDomain"}
+; CHECK: [[META7]] = !{[[META8:![0-9]+]], [[META9:![0-9]+]], [[META10:![0-9]+]]}
+; CHECK: [[META8]] = distinct !{[[META8]], [[META6]]}
+; CHECK: [[META9]] = distinct !{[[META9]], [[META6]]}
+; CHECK: [[META10]] = distinct !{[[META10]], [[META6]]}
+; CHECK: [[META11]] = !{[[META12:![0-9]+]]}
+; CHECK: [[META12]] = distinct !{[[META12]], [[META6]]}
+; CHECK: [[LOOP13]] = distinct !{[[LOOP13]], [[META1]], [[META14:![0-9]+]]}
+; CHECK: [[META14]] = !{!"FollowupSequential", i32 8}
 ; CHECK: [[META15]] = !{[[META9]]}
-; CHECK: [[META16]] = !{[[META7]]}
-; CHECK: [[LOOP17]] = distinct !{[[LOOP17]], [[META1]], [[META18:![0-9]+]]}
-; CHECK: [[META18]] = !{!"FollowupCoincident", i1 false}
+; CHECK: [[META16]] = !{[[META10]]}
+; CHECK: [[META17]] = !{[[META8]]}
+; CHECK: [[LOOP18]] = distinct !{[[LOOP18]], [[META1]], [[META19:![0-9]+]]}
+; CHECK: [[META19]] = !{!"FollowupCoincident", i1 false}
 ;.
diff --git a/llvm/test/Transforms/LoopDistribute/laa-invalidation.ll b/llvm/test/Transforms/LoopDistribute/laa-invalidation.ll
index ee42860cd250e..62c5627ac2d38 100644
--- a/llvm/test/Transforms/LoopDistribute/laa-invalidation.ll
+++ b/llvm/test/Transforms/LoopDistribute/laa-invalidation.ll
@@ -27,13 +27,13 @@ define void @test_pr50940(ptr %A, ptr %B) {
 ; CHECK-NEXT:    store i16 1, ptr [[B]], align 1
 ; CHECK-NEXT:    [[IV_NEXT_LVER_ORIG]] = add nuw nsw i16 [[IV_LVER_ORIG]], 1
 ; CHECK-NEXT:    [[C_1_LVER_ORIG:%.*]] = icmp ult i16 [[IV_LVER_ORIG]], 38
-; CHECK-NEXT:    br i1 [[C_1_LVER_ORIG]], label [[INNER_LVER_ORIG]], label [[EXIT_LOOPEXIT:%.*]]
+; CHECK-NEXT:    br i1 [[C_1_LVER_ORIG]], label [[INNER_LVER_ORIG]], label [[EXIT_LOOPEXIT:%.*]], !llvm.loop [[LOOP0:![0-9]+]]
 ; CHECK:       inner.ph3.ldist1:
 ; CHECK-NEXT:    br label [[INNER_LDIST1:%.*]]
 ; CHECK:       inner.ldist1:
 ; CHECK-NEXT:    [[IV_LDIST1:%.*]] = phi i16 [ 0, [[INNER_PH3_LDIST1]] ], [ [[IV_NEXT_LDIST1:%.*]], [[INNER_LDIST1]] ]
-; CHECK-NEXT:    [[L_LDIST1:%.*]] = load <2 x i16>, ptr [[UGLYGEP]], align 1, !alias.scope !0, !noalias !3
-; CHECK-NEXT:    store i16 0, ptr [[GEP_A_3]], align 1, !alias.scope !0, !noalias !3
+; CHECK-NEXT:    [[L_LDIST1:%.*]] = load <2 x i16>, ptr [[UGLYGEP]], align 1, !alias.scope [[META2:![0-9]+]], !noalias [[META5:![0-9]+]]
+; CHECK-NEXT:    store i16 0, ptr [[GEP_A_3]], align 1, !alias.scope [[META2]], !noalias [[META5]]
 ; CHECK-NEXT:    [[IV_NEXT_LDIST1]] = add nuw nsw i16 [[IV_LDIST1]], 1
 ; CHECK-NEXT:    [[C_1_LDIST1:%.*]] = icmp ult i16 [[IV_LDIST1]], 38
 ; CHECK-NEXT:    br i1 [[C_1_LDIST1]], label [[INNER_LDIST1]], label [[INNER_PH3:%.*]]
@@ -41,7 +41,7 @@ define void @test_pr50940(ptr %A, ptr %B) {
 ; CHECK-NEXT:    br label [[INNER:%.*]]
 ; CHECK:       inner:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i16 [ 0, [[INNER_PH3]] ], [ [[IV_NEXT:%.*]], [[INNER]] ]
-; CHECK-NEXT:    store i16 1, ptr [[B]], align 1, !alias.scope !3
+; CHECK-NEXT:    store i16 1, ptr [[B]], align 1, !alias.scope [[META5]]
 ; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i16 [[IV]], 1
 ; CHECK-NEXT:    [[C_1:%.*]] = icmp ult i16 [[IV]], 38
 ; CHECK-NEXT:    br i1 [[C_1]], label [[INNER]], label [[EXIT_LOOPEXIT4:%.*]]
diff --git a/llvm/test/Transforms/LoopDistribute/no-reprocess.ll b/llvm/test/Transforms/LoopDistribute/no-reprocess.ll
new file mode 100644
index 0000000000000..40fa08548366f
--- /dev/null
+++ b/llvm/test/Transforms/LoopDistribute/no-reprocess.ll
@@ -0,0 +1,130 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -aa-pipeline=basic-aa -passes=loop-distribute -enable-loop-distribute -verify-loop-info -verify-dom-info -debug-only=loop-distribute --disable-output -S < %s 2>&1 | FileCheck %s
+; REQUIRES: asserts
+
+; Test if a loop already distributed will not reprocess because of metadata
+; information marking it as processed.
+
+; CHECK: LDist: Distributed loop guarded for reprocessing
+; CHECK: LDist: Skipping;
+
+define dso_local void @_Z13distribution3PiS_S_S_i(ptr noundef captures(none) %a, ptr noundef readonly captures(none) %b, ptr noundef captures(none) %c, ptr noundef writeonly captures(none) %d, i64 noundef signext %len) {
+entry:
+  %cmp = icmp sgt i64 %len, 0
+  br i1 %cmp, label %end, label %for.body.lver.check
+
+for.body.lver.check:                              ; preds = %entry
+  %0 = shl i64 %len, 2
+  %scevgep = getelementptr i8, ptr %a, i64 %0
+  %scevgep1 = getelementptr i8, ptr %c, i64 -4
+  %scevgep2 = getelementptr i8, ptr %c, i64 %0
+  %scevgep3 = getelementptr i8, ptr %d, i64 %0
+  %scevgep4 = getelementptr i8, ptr %b, i64 %0
+  %bound0 = icmp ult ptr %a, %scevgep2
+  %bound1 = icmp ult ptr %scevgep1, %scevgep
+  %found.conflict = and i1 %bound0, %bound1
+  %bound05 = icmp ult ptr %a, %scevgep3
+  %bound16 = icmp ult ptr %d, %scevgep
+  %found.conflict7 = and i1 %bound05, %bound16
+  %conflict.rdx = or i1 %found.conflict, %found.conflict7
+  %bound08 = icmp ult ptr %a, %scevgep4
+  %bound19 = icmp ult ptr %b, %scevgep
+  %found.conflict10 = and i1 %bound08, %bound19
+  %conflict.rdx11 = or i1 %conflict.rdx, %found.conflict10
+  %bound012 = icmp ult ptr %scevgep1, %scevgep3
+  %bound113 = icmp ult ptr %d, %scevgep2
+  %found.conflict14 = and i1 %bound012, %bound113
+  %conflict.rdx15 = or i1 %conflict.rdx11, %found.conflict14
+  %bound016 = icmp ult ptr %d, %scevgep4
+  %bound117 = icmp ult ptr %b, %scevgep3
+  %found.conflict18 = and i1 %bound016, %bound117
+  %conflict.rdx19 = or i1 %conflict.rdx15, %found.conflict18
+  br i1 %conflict.rdx19, label %for.body.ph.lver.orig, label %for.body.ph.ldist1
+
+for.body.ph.lver.orig:                            ; preds = %for.body.lver.check
+  br label %for.body.lver.orig
+
+for.body.lver.orig:                               ; preds = %for.body.lver.orig, %for.body.ph.lver.orig
+  %indvars.iv.lver.orig = phi i64 [ 0, %for.body.ph.lver.orig ], [ %indvars.iv.next.lver.orig, %for.body.lver.orig ]
+  %arrayidx.lver.orig = getelementptr inbounds i32, ptr %b, i64 %indvars.iv.lver.orig
+  %i2.lver.orig = load i32, ptr %arrayidx.lver.orig, align 4, !tbaa !0
+  %add4.lver.orig = add nsw i32 %i2.lver.orig, 1
+  %arrayidx8.lver.orig = getelementptr inbounds i32, ptr %a, i64 %indvars.iv.lver.orig
+  store i32 %add4.lver.orig, ptr %arrayidx8.lver.orig, align 4, !tbaa !0
+  %i3.lver.orig = getelementptr i32, ptr %c, i64 %indvars.iv.lver.orig
+  %arrayidx17.lver.orig = getelementptr i8, ptr %i3.lver.orig, i64 -4
+  %i4.lver.orig = load i32, ptr %arrayidx17.lver.orig, align 4, !tbaa !0
+  %sub18.lver.orig = sub nsw i32 %add4.lver.orig, %i4.lver.orig
+  store i32 %sub18.lver.orig, ptr %i3.lver.orig, align 4, !tbaa !0
+  %i5.lver.orig = load i32, ptr %arrayidx8.lver.orig, align 4, !tbaa !0
+  %add27.lver.orig = add nsw i32 %i5.lver.orig, 2
+  %arrayidx31.lver.orig = getelementptr inbounds i32, ptr %d, i64 %indvars.iv.lver.orig
+  store i32 %add27.lver.orig, ptr %arrayidx31.lver.orig, align 4, !tbaa !0
+  %indvars.iv.next.lver.orig = add i64 %indvars.iv.lver.orig, 1
+  %cmp1.not.lver.orig = icmp eq i64 %indvars.iv.next.lver.orig, %len
+  br i1 %cmp1.not.lver.orig, label %end.loopexit.loopexit, label %for.body.lver.orig, !llvm.loop !4
+
+for.body.ph.ldist1:                               ; preds = %for.body.lver.check
+  br label %for.body.ldist1
+
+for.body.ldist1:                                  ; preds = %for.body.ldist1, %for.body.ph.ldist1
+  %indvars.iv.ldist1 = phi i64 [ 0, %for.body.ph.ldist1 ], [ %indvars.iv.next.ldist1, %for.body.ldist1 ]
+  %arrayidx.ldist1 = getelementptr inbounds i32, ptr %b, i64 %indvars.iv.ldist1
+  %i2.ldist1 = load i32, ptr %arrayidx.ldist1, align 4, !tbaa !0, !alias.scope !7
+  %add4.ldist1 = add nsw i32 %i2.ldist1, 1
+  %arrayidx8.ldist1 = getelementptr inbounds i32, ptr %a, i64 %indvars.iv.ldist1
+  store i32 %add4.ldist1, ptr %arrayidx8.ldist1, align 4, !tbaa !0, !alias.scope !10, !noalias !12
+  %...
[truncated]

github-actions bot commented Aug 15, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@mcberg2021 mcberg2021 force-pushed the users/mcberg2021_LoopDistribute2 branch from 4bf1302 to d3463cf Compare August 15, 2025 23:40
Member

@Meinersbur Meinersbur left a comment

Why a count instead of just a flag that marks the loop as already vectorized? .count looks like an instruction to the compiler to distribute into that number of parts (compare llvm.loop.unroll.count).

LoopVectorizer's marker is llvm.loop.isvectorized.

@mcberg2021
Contributor Author

> Why a count instead of just a flag that marks the loop as already vectorized? .count looks like an instruction to the compiler to distribute into that number of parts (compare llvm.loop.unroll.count).
>
> LoopVectorizer's marker is llvm.loop.isvectorized.

Ok, I can do that. It's functionally equivalent for our purposes.

@mcberg2021 mcberg2021 force-pushed the users/mcberg2021_LoopDistribute2 branch 3 times, most recently from 51a389e to fbd591a Compare August 20, 2025 06:29
@mcberg2021 mcberg2021 requested a review from Meinersbur August 20, 2025 07:53
Member

@Meinersbur Meinersbur left a comment

> Ok, I can do that. It's functionally equivalent for our purposes.

If there is a use for the number, we can add it, but a rationale is missing. The rationale would also determine how to interpret the number. Otherwise use the simplest data structure that does the job.

Please also add documentation to the language reference. Yes, llvm.loop.isvectorized is missing already.

@mcberg2021 mcberg2021 force-pushed the users/mcberg2021_LoopDistribute2 branch from fbd591a to 86f742a Compare August 20, 2025 20:55
Comment on lines 349 to 351
It is recommended to add ``llvm.loop.isdistributed`` to mark loops
that have been transformed by LoopDistribute so that they are not
reprocessed under LTO, where they may be given a second opportunity.
Member

The paragraph you copied and pasted is a recommendation for what frontends should do. llvm.loop.isdistributed is added by LLVM itself. It doesn't make sense to recommend that frontends do something they have no control over.

I had mostly the Language Reference in mind; it is supposed to have an exhaustive list of all recognized metadata.

Contributor Author

I've added the entry to the Language Reference; a transform metadata entry may still be needed.

…d loops

Add a boolean marker to a loop's metadata noting when loop distribution
was successfully applied, then check for loops which are already
distributed to prevent reprocessing.
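
The boolean-marker variant can be modeled in the same standalone style. This is a hedged sketch rather than the merged implementation: `FlaggedLoop` and `distributeOnce` are hypothetical names, metadata is modeled as a set of strings, and doubling `NumPartitions` is only a stand-in for the effect of a distribution. The marker name `llvm.loop.isdistributed` is the one quoted in the review discussion.

```cpp
#include <set>
#include <string>

// Hypothetical model: a loop with presence-only metadata markers and a
// counter standing in for the loops produced by distribution.
struct FlaggedLoop {
  std::set<std::string> Flags;
  int NumPartitions = 1;
};

// Marker name as quoted in the review discussion.
static const char *IsDistributedFlag = "llvm.loop.isdistributed";

// Returns true if this call distributed the loop, false if it was skipped.
bool distributeOnce(FlaggedLoop &L) {
  if (L.Flags.count(IsDistributedFlag))
    return false;                    // already processed, e.g. before LTO
  L.NumPartitions *= 2;              // pretend the loop is split in two
  L.Flags.insert(IsDistributedFlag); // mark so a later run leaves it alone
  return true;
}
```

A marker that is either present or absent carries no number to interpret, which matches the reviewer's request for the simplest data structure that does the job.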
@mcberg2021 mcberg2021 force-pushed the users/mcberg2021_LoopDistribute2 branch from 86f742a to ab071e5 Compare August 21, 2025 18:47
Member

@Meinersbur Meinersbur left a comment

LGTM, thank you

@mcberg2021 mcberg2021 merged commit efa99ec into main Aug 22, 2025
10 checks passed
@mcberg2021 mcberg2021 deleted the users/mcberg2021_LoopDistribute2 branch August 22, 2025 18:05