Skip to content

Conversation

@fhahn
Copy link
Contributor

@fhahn fhahn commented Jul 31, 2025

Materialze Build(Struct)Vectors explicitly for VPRecplicateRecipes, to serve their users requiring a vector, instead of doing so when unrolling by VF.

Now we only need to implicitly build vectors in VPTransformState::get for VPInstructions. Once they are also unrolled by VF we can remove the code-path alltogether.

@llvmbot
Copy link
Member

llvmbot commented Jul 31, 2025

@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

Materialze Build(Struct)Vectors explicitly for VPRecplicateRecipes, to serve their users requiring a vector, instead of doing so when unrolling by VF.

Now we only need to implicitly build vectors in VPTransformState::get for VPInstructions. Once they are also unrolled by VF we can remove the code-path alltogether.


Full diff: https://github.com/llvm/llvm-project/pull/151487.diff

6 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+1)
  • (modified) llvm/lib/Transforms/Vectorize/VPlan.cpp (+2)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+2)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp (+45)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.h (+4)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp (+29-16)
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 850c4a11edc67..9c547f759617c 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -7284,6 +7284,7 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
   // cost model is complete for better cost estimates.
   VPlanTransforms::runPass(VPlanTransforms::unrollByUF, BestVPlan, BestUF,
                            OrigLoop->getHeader()->getContext());
+  VPlanTransforms::runPass(VPlanTransforms::materializeBuildVectors, BestVPlan);
   VPlanTransforms::runPass(VPlanTransforms::replicateByVF, BestVPlan, BestVF);
   VPlanTransforms::runPass(VPlanTransforms::materializeBroadcasts, BestVPlan);
   bool HasBranchWeights =
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.cpp b/llvm/lib/Transforms/Vectorize/VPlan.cpp
index 25b9616880bf4..96ccf5bf50a25 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlan.cpp
@@ -372,6 +372,8 @@ Value *VPTransformState::get(const VPValue *Def, bool NeedsScalar) {
     set(Def, VectorValue);
   } else {
     assert(!VF.isScalable() && "VF is assumed to be non scalable.");
+    assert(isa<VPInstruction>(Def) && "Explicit BuildVector recipes must "
+                                      "handle packing for non-VPInstructions.");
     // Initialize packing with insertelements to start from poison.
     VectorValue = PoisonValue::get(toVectorizedTy(LastInst->getType(), VF));
     for (unsigned Lane = 0; Lane < VF.getFixedValue(); ++Lane)
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 68e7c20a070f4..7689ed6c2bd35 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -461,6 +461,8 @@ unsigned VPInstruction::getNumOperandsForOpcode(unsigned Opcode) {
   case Instruction::Load:
   case VPInstruction::AnyOf:
   case VPInstruction::BranchOnCond:
+  case VPInstruction::BuildStructVector:
+  case VPInstruction::BuildVector:
   case VPInstruction::CalculateTripCountMinusVF:
   case VPInstruction::CanonicalIVIncrementForPart:
   case VPInstruction::ExplicitVectorLength:
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index ad8235d891c5d..4c0f7f40408c5 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -3192,6 +3192,51 @@ void VPlanTransforms::materializeVectorTripCount(
     Plan.getVectorTripCount().setUnderlyingValue(NewC->getValue());
 }
 
+void VPlanTransforms::materializeBuildVectors(VPlan &Plan) {
+  if (Plan.hasScalarVFOnly())
+    return;
+
+  VPTypeAnalysis TypeInfo(Plan);
+  VPRegionBlock *LoopRegion = Plan.getVectorLoopRegion();
+  auto VPBBsOutsideLoopRegion = VPBlockUtils::blocksOnly<VPBasicBlock>(
+      vp_depth_first_shallow(Plan.getEntry()));
+  auto VPBBsInsideLoopRegion = VPBlockUtils::blocksOnly<VPBasicBlock>(
+      vp_depth_first_shallow(LoopRegion->getEntry()));
+  for (VPBasicBlock *VPBB :
+       concat<VPBasicBlock *>(VPBBsOutsideLoopRegion, VPBBsInsideLoopRegion)) {
+    for (VPRecipeBase &R : make_early_inc_range(*VPBB)) {
+      auto *RepR = dyn_cast<VPReplicateRecipe>(&R);
+      if (!RepR || RepR->isSingleScalar())
+        continue;
+      VPInstruction *BuildVector = nullptr;
+      for (VPUser *U : to_vector(RepR->users())) {
+        VPRegionBlock *ParentRegion =
+            cast<VPRecipeBase>(U)->getParent()->getParent();
+        if (U->usesScalars(RepR) && ParentRegion == LoopRegion)
+          continue;
+
+        if (!BuildVector) {
+          Type *ScalarTy = TypeInfo.inferScalarType(RepR);
+          unsigned Opc = ScalarTy->isStructTy()
+                             ? VPInstruction::BuildStructVector
+                             : VPInstruction::BuildVector;
+          BuildVector = new VPInstruction(Opc, {RepR});
+          BuildVector->insertAfter(RepR);
+        }
+
+        // Only update a single operand per users, as the same user is added
+        // multiple times, once per use.
+        // TODO: Introduce de-duplicating iterator over users.
+        for (unsigned Idx = 0; Idx != U->getNumOperands(); ++Idx)
+          if (U->getOperand(Idx) == RepR) {
+            U->setOperand(Idx, BuildVector);
+            break;
+          }
+      }
+    }
+  }
+}
+
 /// Returns true if \p V is VPWidenLoadRecipe or VPInterleaveRecipe that can be
 /// converted to a narrower recipe. \p V is used by a wide recipe that feeds a
 /// store interleave group at index \p Idx, \p WideMember0 is the recipe feeding
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.h b/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
index 880159f760922..1a19e15bbaa25 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
@@ -256,6 +256,10 @@ struct VPlanTransforms {
                                          unsigned BestUF,
                                          PredicatedScalarEvolution &PSE);
 
+  /// Add explicit Build[Struct]Vector recipes that combine scalar values
+  /// produced by VPReplicateRecipes to a single vector.
+  static void materializeBuildVectors(VPlan &Plan);
+
   /// Try to convert a plan with interleave groups with VF elements to a plan
   /// with the interleave groups replaced by wide loads and stores processing VF
   /// elements, if all transformed interleave groups access the full vector
diff --git a/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp b/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
index 871e37ef3966a..7f23fb5b7d11a 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
@@ -463,9 +463,10 @@ void VPlanTransforms::unrollByUF(VPlan &Plan, unsigned UF, LLVMContext &Ctx) {
 }
 
 /// Create a single-scalar clone of \p RepR for lane \p Lane.
-static VPReplicateRecipe *cloneForLane(VPlan &Plan, VPBuilder &Builder,
-                                       Type *IdxTy, VPReplicateRecipe *RepR,
-                                       VPLane Lane) {
+static VPReplicateRecipe *
+cloneForLane(VPlan &Plan, VPBuilder &Builder, Type *IdxTy,
+             VPReplicateRecipe *RepR, VPLane Lane,
+             DenseMap<VPValue *, SmallVector<VPValue *>> &Value2Lanes) {
   // Collect the operands at Lane, creating extracts as needed.
   SmallVector<VPValue *> NewOps;
   for (VPValue *Op : RepR->operands()) {
@@ -478,6 +479,11 @@ static VPReplicateRecipe *cloneForLane(VPlan &Plan, VPBuilder &Builder,
           Builder.createNaryOp(VPInstruction::ExtractLastElement, {Op}));
       continue;
     }
+    if (Value2Lanes.contains(Op)) {
+      NewOps.push_back(Value2Lanes[Op][Lane.getKnownLane()]);
+      continue;
+    }
+
     // Look through buildvector to avoid unnecessary extracts.
     if (match(Op, m_BuildVector())) {
       NewOps.push_back(
@@ -510,6 +516,8 @@ void VPlanTransforms::replicateByVF(VPlan &Plan, ElementCount VF) {
       vp_depth_first_shallow(Plan.getVectorLoopRegion()->getEntry()));
   auto VPBBsToUnroll =
       concat<VPBasicBlock *>(VPBBsOutsideLoopRegion, VPBBsInsideLoopRegion);
+  DenseMap<VPValue *, SmallVector<VPValue *>> Value2Lanes;
+  SmallVector<VPRecipeBase *> ToRemove;
   for (VPBasicBlock *VPBB : VPBBsToUnroll) {
     for (VPRecipeBase &R : make_early_inc_range(*VPBB)) {
       auto *RepR = dyn_cast<VPReplicateRecipe>(&R);
@@ -521,12 +529,12 @@ void VPlanTransforms::replicateByVF(VPlan &Plan, ElementCount VF) {
         if (isa<StoreInst>(RepR->getUnderlyingInstr()) &&
             vputils::isSingleScalar(RepR->getOperand(1))) {
           // Stores to invariant addresses need to store the last lane only.
-          cloneForLane(Plan, Builder, IdxTy, RepR,
-                       VPLane::getLastLaneForVF(VF));
+          cloneForLane(Plan, Builder, IdxTy, RepR, VPLane::getLastLaneForVF(VF),
+                       Value2Lanes);
         } else {
           // Create single-scalar version of RepR for all lanes.
           for (unsigned I = 0; I != VF.getKnownMinValue(); ++I)
-            cloneForLane(Plan, Builder, IdxTy, RepR, VPLane(I));
+            cloneForLane(Plan, Builder, IdxTy, RepR, VPLane(I), Value2Lanes);
         }
         RepR->eraseFromParent();
         continue;
@@ -534,23 +542,28 @@ void VPlanTransforms::replicateByVF(VPlan &Plan, ElementCount VF) {
       /// Create single-scalar version of RepR for all lanes.
       SmallVector<VPValue *> LaneDefs;
       for (unsigned I = 0; I != VF.getKnownMinValue(); ++I)
-        LaneDefs.push_back(cloneForLane(Plan, Builder, IdxTy, RepR, VPLane(I)));
+        LaneDefs.push_back(
+            cloneForLane(Plan, Builder, IdxTy, RepR, VPLane(I), Value2Lanes));
 
+      Value2Lanes[RepR] = LaneDefs;
       /// Users that only demand the first lane can use the definition for lane
       /// 0.
       RepR->replaceUsesWithIf(LaneDefs[0], [RepR](VPUser &U, unsigned) {
         return U.onlyFirstLaneUsed(RepR);
       });
 
-      // If needed, create a Build(Struct)Vector recipe to insert the scalar
-      // lane values into a vector.
-      Type *ResTy = RepR->getUnderlyingInstr()->getType();
-      VPValue *VecRes = Builder.createNaryOp(
-          ResTy->isStructTy() ? VPInstruction::BuildStructVector
-                              : VPInstruction::BuildVector,
-          LaneDefs);
-      RepR->replaceAllUsesWith(VecRes);
-      RepR->eraseFromParent();
+      for (VPUser *U : to_vector(RepR->users())) {
+        auto *VPI = dyn_cast<VPInstruction>(U);
+        if (!VPI || (VPI->getOpcode() != VPInstruction::BuildVector &&
+                     VPI->getOpcode() != VPInstruction::BuildStructVector))
+          continue;
+        VPI->setOperand(0, LaneDefs[0]);
+        for (VPValue *Def : drop_begin(LaneDefs))
+          VPI->addOperand(Def);
+      }
+      ToRemove.push_back(RepR);
     }
   }
+  for (auto *R : reverse(ToRemove))
+    R->eraseFromParent();
 }

fhahn added 3 commits August 1, 2025 20:55
Materialze Build(Struct)Vectors explicitly for VPRecplicateRecipes, to
serve their users requiring a vector, instead of doing so when unrolling
by VF.

Now we only need to implicitly build vectors in VPTransformState::get
for VPInstructions. Once they are also unrolled by VF we can remove the
code-path alltogether.
@fhahn fhahn force-pushed the vplan-materialize-buildvector-for-vpreplicate-recipes branch from b4d634d to d293818 Compare August 3, 2025 10:40
Copy link
Contributor

@artagnon artagnon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanations. This patch LGTM, thanks!


// Only update a single operand per users, as the same user is added
// multiple times, once per use.
// TODO: Introduce de-duplicating iterator over users.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not the right place for this TODO?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, code is gone in the latest version, thanks

@ayalz
Copy link
Collaborator

ayalz commented Aug 8, 2025

Materialze Build(Struct)Vectors explicitly for VPRecplicateRecipes

"for VPReplicateRecipes" needed - other cases of packing scalars to vectors that need handling include VPInstructions.
Any relation to VPPredInstPHIRecipe, as in VPReplicateRecipe::shouldPack()?

Copy link
Collaborator

@ayalz ayalz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Post approval review.

Comment on lines 3216 to 3220
for (unsigned Idx = 0; Idx != U->getNumOperands(); ++Idx)
if (U->getOperand(Idx) == RepR) {
U->setOperand(Idx, BuildVector);
break;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth some replaceFirstUseWith() API?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code is gone now, thanks


if (!BuildVector) {
Type *ScalarTy = TypeInfo.inferScalarType(RepR);
unsigned Opc = ScalarTy->isStructTy()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
unsigned Opc = ScalarTy->isStructTy()
unsigned Opcode = ScalarTy->isStructTy()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done thanks

for (VPUser *U : to_vector(RepR->users())) {
VPRegionBlock *ParentRegion =
cast<VPRecipeBase>(U)->getParent()->getParent();
if (U->usesScalars(RepR) && ParentRegion == LoopRegion)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth commenting that users (of vectors) within replicate regions are still handled by VPReplicateRegion::execute(), via shouldPack().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added above ,thanks

PredicatedScalarEvolution &PSE);

/// Add explicit Build[Struct]Vector recipes that combine scalar values
/// produced by VPReplicateRecipes to a single vector.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// produced by VPReplicateRecipes to a single vector.
/// into single vectors.

"produced by VPReplicateRecipes" is temporary - should also apply to VPInstructions?
Distinct from materializeBroadcasts() in combining "multiple scalar values into a single vector", rather than splatting a single scalar value into a vector.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, clarified thanks

VPlanTransforms::removeDeadRecipes(Plan);
}

/// Create a single-scalar clone of \p RepR for lane \p Lane.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document \p Values2Lanes
Better called Def2LaneDefs, or RepDef2SingleLaneDefs? I.e., mapping a single old replicating Def to a vector of new single-scalar Defs replacing it according to lanes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated naming and added comment, thanks

Builder.createNaryOp(VPInstruction::ExtractLastElement, {Op}));
continue;
}
if (Value2Lanes.contains(Op)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth a comment explaining this case, e.g., if the operand is a replicating recipe already unrolled per VF, use the associated single-scalar Def. Previously a replicating recipe feeding another would have a buildVector between them, which was looked-through below, now Value2Lanes is used to hook them up directly, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep exactly, added thanks

vp_depth_first_shallow(Plan.getEntry()));
auto VPBBsInsideLoopRegion = VPBlockUtils::blocksOnly<VPBasicBlock>(
vp_depth_first_shallow(LoopRegion->getEntry()));
// Materialize Build(Struct)Vector for all replicating VPReplicateRecipes,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"replicating" stand for "non uniform", leaving the uniform case to materializeBroadcasts()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep

LaneDefs);
RepR->replaceAllUsesWith(VecRes);
RepR->eraseFromParent();
for (VPUser *U : to_vector(RepR->users())) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth explaining what's about to happen - update each build vector user that currently has RepR as its only operand, to have all LaneDefs as its operands.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, thanks

VPI->getOpcode() != VPInstruction::BuildStructVector))
continue;
assert(VPI->getNumOperands() == 1 &&
"Build(Struct)Vector must have a single operand");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Build(Struct)Vector must have a single operand");
"Build(Struct)Vector must have a single operand before replicating by VF");

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added thanks

continue;
assert(VPI->getNumOperands() == 1 &&
"Build(Struct)Vector must have a single operand");
VPI->setOperand(0, LaneDefs[0]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove VPI as a user of RepR, as done by replaceUsesWithIf(LaneDefs[0])?

Also remove RepR as user of Op when cloneForLane() hooks up to Op's LaneDefs, so that dce will be able to collect all recipes left useless? Or check if RepR becomes useless after each such removal, and collect it if so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above will only update the users that only use the first lane. Probably best done together with adding the other operands here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we do RepR->removeUser(U) here, complementing the removals of users above, so that ToRemove recipes eventually become use-less? May non-VPInstruction users remain? Non-buildVector users should be removed when ToRemove is erased below, in reverse.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, we still need to update the operand to use LaneDefs[0] instead of RepR here, which will implicitly remove VPI as user of RepR.

@fhahn
Copy link
Contributor Author

fhahn commented Aug 8, 2025

Materialze Build(Struct)Vectors explicitly for VPRecplicateRecipes

"for VPReplicateRecipes" needed - other cases of packing scalars to vectors that need handling include VPInstructions. Any relation to VPPredInstPHIRecipe, as in VPReplicateRecipe::shouldPack()?

For now this is just used for recipes that are unrolled by VF, which at the moment is only VPReplicateRecipe outside replicate regions.

As follow-ups, we should also use the explicit packing for VPInstructions (and unroll them) and for for shouldPack. For the latter may require a bit more thought though: currently the packing will happen directly after computing the scalar value per lane. With the current packing recipes, we would have the scalar values coming out of the replicate regions, and pack them after the fact.

Perhaps for that we would need single vector inserts.

concat<VPBasicBlock *>(VPBBsOutsideLoopRegion, VPBBsInsideLoopRegion)) {
for (VPRecipeBase &R : make_early_inc_range(*VPBB)) {
auto *RepR = dyn_cast<VPReplicateRecipe>(&R);
auto UsesVectorOrInsideReplicateRegions = [RepR, LoopRegion](VPUser *U) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto UsesVectorOrInsideReplicateRegions = [RepR, LoopRegion](VPUser *U) {
auto UsesVectorOrInsideReplicateRegion = [RepR, LoopRegion](VPUser *U) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks

continue;
assert(VPI->getNumOperands() == 1 &&
"Build(Struct)Vector must have a single operand");
VPI->setOperand(0, LaneDefs[0]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we do RepR->removeUser(U) here, complementing the removals of users above, so that ToRemove recipes eventually become use-less? May non-VPInstruction users remain? Non-buildVector users should be removed when ToRemove is erased below, in reverse.

vp_depth_first_shallow(Plan.getVectorLoopRegion()->getEntry()));
auto VPBBsToUnroll =
concat<VPBasicBlock *>(VPBBsOutsideLoopRegion, VPBBsInsideLoopRegion);
DenseMap<VPValue *, SmallVector<VPValue *>> Def2LaneDefs;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DenseMap<VPValue *, SmallVector<VPValue *>> Def2LaneDefs;
// A mapping of current VPValue definitions to collections of new VPValues defined per lane. Serves to hook-up potential users of current VPValue definition that are replicated-per-VF later.
DenseMap<VPValue *, SmallVector<VPValue *>> Def2LaneDefs;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, thanks

auto VPBBsToUnroll =
concat<VPBasicBlock *>(VPBBsOutsideLoopRegion, VPBBsInsideLoopRegion);
DenseMap<VPValue *, SmallVector<VPValue *>> Def2LaneDefs;
SmallVector<VPRecipeBase *> ToRemove;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SmallVector<VPRecipeBase *> ToRemove;
// The removal of current recipes being replaced by new ones needs to be delayed after Def2LaneDefs is no longer in use.
SmallVector<VPRecipeBase *> ToRemove;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added thanks

cloneForLane(Plan, Builder, IdxTy, RepR, VPLane(I));
cloneForLane(Plan, Builder, IdxTy, RepR, VPLane(I), Def2LaneDefs);
}
RepR->eraseFromParent();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the erasure of use-less RepR here removes it from being a user of its operands.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup

Def2LaneDefs[RepR] = LaneDefs;
/// Users that only demand the first lane can use the definition for lane
/// 0.
RepR->replaceUsesWithIf(LaneDefs[0], [RepR](VPUser &U, unsigned) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the RUWI of removes users of RepR that use only its first lane.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup

Comment on lines 572 to 573
for (VPValue *Def : drop_begin(LaneDefs))
VPI->addOperand(Def);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (VPValue *Def : drop_begin(LaneDefs))
VPI->addOperand(Def);
for (VPValue *LaneDef : drop_begin(LaneDefs))
VPI->addOperand(LaneDef);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added thanks

Comment on lines +578 to +579
for (auto *R : reverse(ToRemove))
R->eraseFromParent();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (auto *R : reverse(ToRemove))
R->eraseFromParent();
for (auto *R : reverse(ToRemove)) {
assert(R->users().empty() && "Recipe to removed expected to be free of users.");
R->eraseFromParent();
}

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

::eraseFromParent will already assert that there are no remaining users.

@fhahn fhahn merged commit 7e99893 into llvm:main Aug 18, 2025
9 checks passed
@fhahn fhahn deleted the vplan-materialize-buildvector-for-vpreplicate-recipes branch August 18, 2025 19:50
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Aug 18, 2025
…cipes. (NFCI) (#151487)

Materialze Build(Struct)Vectors explicitly for VPRecplicateRecipes, to
serve their users requiring a vector, instead of doing so when unrolling
by VF.

Now we only need to implicitly build vectors in VPTransformState::get
for VPInstructions. Once they are also unrolled by VF we can remove the
code-path alltogether.

PR: llvm/llvm-project#151487
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants