[VPlan] Build initial VPlan 0 using HCFGBuilder for inner loops. (NFC) #124432
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -8309,7 +8309,7 @@ VPRecipeBuilder::tryToWidenMemory(Instruction *I, ArrayRef<VPValue *> Operands, | |||||||||||||||||||||||||||||
| : GEPNoWrapFlags::none(), | ||||||||||||||||||||||||||||||
| I->getDebugLoc()); | ||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||
| Builder.getInsertBlock()->appendRecipe(VectorPtr); | ||||||||||||||||||||||||||||||
| VectorPtr->insertBefore(&*Builder.getInsertPoint()); | ||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||
| Ptr = VectorPtr; | ||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||
| if (LoadInst *Load = dyn_cast<LoadInst>(I)) | ||||||||||||||||||||||||||||||
|
|
@@ -9221,6 +9221,7 @@ static void addExitUsersForFirstOrderRecurrences( | |||||||||||||||||||||||||||||
| VPlanPtr | ||||||||||||||||||||||||||||||
| LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) { | ||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||
| using namespace llvm::VPlanPatternMatch; | ||||||||||||||||||||||||||||||
| SmallPtrSet<const InterleaveGroup<Instruction> *, 1> InterleaveGroups; | ||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||
| // --------------------------------------------------------------------------- | ||||||||||||||||||||||||||||||
|
|
@@ -9244,6 +9245,10 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) { | |||||||||||||||||||||||||||||
| PSE, RequiresScalarEpilogueCheck, | ||||||||||||||||||||||||||||||
| CM.foldTailByMasking(), OrigLoop); | ||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||
| // Build hierarchical CFG. | ||||||||||||||||||||||||||||||
| VPlanHCFGBuilder HCFGBuilder(OrigLoop, LI, *Plan); | ||||||||||||||||||||||||||||||
| HCFGBuilder.buildHierarchicalCFG(); | ||||||||||||||||||||||||||||||
|
Comment on lines +9325 to +9327
Collaborator
Perhaps
Contributor
Author
Sounds good, maybe best to merge as a follow-up.
||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||
| // Don't use getDecisionAndClampRange here, because we don't know the UF | ||||||||||||||||||||||||||||||
| // so this function is better to be conservative, rather than to split | ||||||||||||||||||||||||||||||
| // it up into different VPlans. | ||||||||||||||||||||||||||||||
|
|
@@ -9312,23 +9317,45 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) { | |||||||||||||||||||||||||||||
| RecipeBuilder.collectScaledReductions(Range); | ||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||
| auto *MiddleVPBB = Plan->getMiddleBlock(); | ||||||||||||||||||||||||||||||
| ReversePostOrderTraversal<VPBlockShallowTraversalWrapper<VPBlockBase *>> RPOT( | ||||||||||||||||||||||||||||||
|
Collaborator
Move the above comment over here:
Suggested change
Contributor
Author
Done, thanks.
||||||||||||||||||||||||||||||
| Plan->getVectorLoopRegion()->getEntry()); | ||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||
| VPBasicBlock::iterator MBIP = MiddleVPBB->getFirstNonPhi(); | ||||||||||||||||||||||||||||||
| for (BasicBlock *BB : make_range(DFS.beginRPO(), DFS.endRPO())) { | ||||||||||||||||||||||||||||||
|
Collaborator
Also remove the above, which becomes dead.
Contributor
Author
Done, thanks.
||||||||||||||||||||||||||||||
| // Relevant instructions from basic block BB will be grouped into VPRecipe | ||||||||||||||||||||||||||||||
| // ingredients and fill a new VPBasicBlock. | ||||||||||||||||||||||||||||||
| if (VPBB != HeaderVPBB) | ||||||||||||||||||||||||||||||
| VPBB->setName(BB->getName()); | ||||||||||||||||||||||||||||||
| Builder.setInsertPoint(VPBB); | ||||||||||||||||||||||||||||||
| VPBlockBase *PrevVPBB = nullptr; | ||||||||||||||||||||||||||||||
| for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(RPOT)) { | ||||||||||||||||||||||||||||||
| // Skip VPBBs not corresponding to any input IR basic blocks. | ||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||
| // Skip VPBBs not corresponding to any input IR basic blocks. | |
| // Handle VPBBs down to the latch. |
Latch also conceptually has a corresponding IRBB, with underlying Instructions?
Updated, thanks. Initially the latch in the IR is not the latch/exiting block in the VPlan; the exiting block is created as part of the skeleton.
Could we set the insert point of Builder once, here, as done now, possibly to first non-Phi of VPBB?
Unfortunately that doesn't play nicely with removing the recipes.
Should the header mask also be created only if NeedsMasks? (Seems independent)
Currently createHeaderMask itself checks if tail folding is enabled, and only then creates the mask
Time to move edge masks and block masks to be cached according to VPBB's rather than IRBB's?
Rather than hacking VPlanHCFGBuilder and PlainCFGBuilder to record and retrieve IRBB for VPBB.
I think it is somewhat of a chicken-and-egg problem. Without this patch, we cannot do that, because we only ever create the flattened CFG in VPlan. I might be missing another option, but I think once this patch lands as a first step we can move the mask creation to be based on the VPBlocks. WDYT?
I gave this a try, but it doesn't work out with the flattening of the CFG here, because we might have already removed edges before querying a mask on that edge.
We could first compute all masks in a separate loop, but there are also quite a few places that need updating, including BasicBlock -> VPBasicBlock, various places that get the mask for the parent of an instruction, and looking for BranchOnCond instead of BranchInst, so that would unfortunately also increase the diff quite a bit. It would probably be better to tackle that separately, also to make testing easier.
OK, another suggestion: getIRBBForVPB() is needed only here - for non-header (non-latch) (non-empty?) VPBB, whose recipes are all expected to have underlying Instructions, which know their parent BasicBlock? Can the latter be retrieved as below rather than hacking HCFGBuilder to record and expose this VPBB-to-IRBB mapping?
Unfortunately we don't create recipes for branches, so there may be empty blocks for which we would need to create masks (because their successors may require them at the moment).
OK, let's go with HCFGBuilder.getIRBBForVPB(), for now ...
Following our roadmap, HCFGBuilder would be the VPlan-to-VPlan transformation producing "buildLoop" from "wrapInput", and tryToBuildVPlanWithRecipes would follow as a subsequent VPlan-to-VPlan transformation(s). Obtaining this VPBB-to-IRBB mapping from HCFGBuilder should be removed in one of two ways (or more):
- If the buildLoop VPlan is to maintain its correspondence with IR basic blocks then a new type of VPBB can be introduced - one providing getIRBasicBlock() similar to VPIRBasicBlock but having an execute() similar to VPBasicBlock - that generates new basic blocks, or is deemed abstract/unreachable - to be materialized into a concrete VPBB prior to codegen.
- If, OTOH, getIRBBForVPB() is meant only as a temporary solution to support createBlockInMask() until the latter is upgraded to work directly on VPBB's, then a FIXME should be added.
(Recipes are created for internal conditional branches but not for unconditional branches (including early exits) - which imply a single successor. An empty block also implies a single predecessor due to lack of phi recipes. Figuring out the BasicBlock from the single predecessor and/or successor of an empty VPBB seems cumbersome; getIRBBForVPB() could record the mapping for empty VPBB's only; it is probably simplest to record all VPBB's in the loop, for now.)
Added a FIXME to the declaration, thanks
| RecipeBuilder.createBlockInMask(HCFGBuilder.getIRBBForVPB(VPBB)); | |
| auto *SingleDef = cast<VPSingleDefRecipe>(VPBB->begin()); | |
| Instruction *Instr = SingleDef->getUnderlyingInstr(); | |
| RecipeBuilder.createBlockInMask(Instr->getParent()); |
?
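For readers following this thread, here is a minimal standalone sketch of the side-table idea behind `getIRBBForVPB()`: record an IR block for every plan block while the CFG is built, so that even an empty block (one whose only IR instruction was an unconditional branch) can still be mapped back. The types `IRBlock`, `PlanBlock` and `BlockMapBuilder` are hypothetical stand-ins and not LLVM classes; the actual patch threads a `DenseMap<VPBlockBase *, BasicBlock *>` through `buildPlainCFG`.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for llvm::BasicBlock.
struct IRBlock {
  std::string Name;
};

// Hypothetical stand-in for VPBasicBlock. Recipes may be empty when the
// IR block only contained an unconditional branch.
struct PlanBlock {
  std::vector<std::string> Recipes;
};

// Hypothetical stand-in for the builder that owns the mapping.
class BlockMapBuilder {
  std::unordered_map<const PlanBlock *, const IRBlock *> Plan2IR;

public:
  // Record every block, including empty ones, at construction time.
  void record(const PlanBlock &PB, const IRBlock &IRB) { Plan2IR[&PB] = &IRB; }

  // Temporary query in the spirit of the FIXME discussed above; returns
  // nullptr for blocks that were never recorded.
  const IRBlock *getIRBlockFor(const PlanBlock &PB) const {
    auto It = Plan2IR.find(&PB);
    return It == Plan2IR.end() ? nullptr : It->second;
  }
};
```

Recording all blocks up front avoids having to infer the IR block of an empty plan block from its single predecessor or successor, which is the cumbersome alternative mentioned in the thread.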
This TODO corresponds to traversing BB's insns w/o debug, which no longer appears here. Belongs in PlainCFGBuilder::createVPInstructionsForVPBB()?
Done thanks!
If dyn_cast is used, we also need to check below whether SingleDef is null.
Only SingleDef's are expected, so better replace dyn_cast with cast?
Yep, should be cast, updated thanks!
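A small standalone sketch of the distinction raised here, assuming plain C++ RTTI in place of LLVM's own `isa`/`cast`/`dyn_cast` machinery. The helper names `dynCastSingleDef` and `castSingleDef` and the recipe types are hypothetical; they only model the contract: the `dyn_cast`-like helper may return null and obliges the caller to check, while the `cast`-like helper treats a mismatch as a programming error.

```cpp
#include <cassert>

// Hypothetical recipe hierarchy, not LLVM classes.
struct Recipe {
  virtual ~Recipe() = default;
};
struct SingleDefRecipe : Recipe {
  int Value = 7;
};
struct OtherRecipe : Recipe {};

// dyn_cast-like: returns nullptr on mismatch, so callers must null-check.
inline SingleDefRecipe *dynCastSingleDef(Recipe *R) {
  return dynamic_cast<SingleDefRecipe *>(R);
}

// cast-like: a mismatch is a bug, so it is asserted rather than handled.
inline SingleDefRecipe *castSingleDef(Recipe *R) {
  auto *SD = dynamic_cast<SingleDefRecipe *>(R);
  assert(SD && "expected a single-def recipe");
  return SD;
}
```

When only one kind of recipe is expected at a call site, the `cast`-like form documents that expectation and removes the need for a null check on every use.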
Should this be an assert - expecting only VPWidenPHIRecipes or VPInstructions with underlying values?
In any case, perhaps DeMorganizing would be more readable:
!(isa<VPWidenPHIRecipe>(SingleDef) ||
(isa<VPInstruction>(SingleDef) && SingleDef->getUnderlyingValue()))
Updated to skip only the recipes that explicitly don't need transformation + a comment and assert.
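The De Morgan rewrite suggested above can be checked exhaustively. The sketch below assumes the pre-rewrite condition had the expanded shape `!A && (!B || !C)`; that shape is a guess, since the original line is not quoted in the thread. `A`, `B` and `C` stand for "is a VPWidenPHIRecipe", "is a VPInstruction" and "has an underlying value"; the function names are hypothetical.

```cpp
#include <cassert>

// Assumed expanded form of the skip condition (not quoted in the thread).
inline bool skipExpanded(bool A, bool B, bool C) { return !A && (!B || !C); }

// The form proposed in the review: !(A || (B && C)).
inline bool skipDeMorgan(bool A, bool B, bool C) { return !(A || (B && C)); }

// Compare both forms over all eight input combinations.
inline bool formsAgree() {
  for (int Bits = 0; Bits < 8; ++Bits) {
    bool A = Bits & 1, B = Bits & 2, C = Bits & 4;
    if (skipExpanded(A, B, C) != skipDeMorgan(A, B, C))
      return false;
  }
  return true;
}
```

Both forms describe the same set of recipes; the second one states the positive condition (what is handled) and negates it once, which is the readability argument made in the comment.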
Can match via some m_Switch?
Unfortunately at the moment the matchers require a fixed number of operands; need to think about how to match without a fixed number of ops.
Then perhaps more consistent / simpler to do:
| if (match(&R, m_BranchOnCond(m_VPValue())) || | |
| (isa<VPInstruction>(&R) && | |
| cast<VPInstruction>(&R)->getOpcode() == Instruction::Switch)) { | |
| if (isa<VPInstruction>(&R) && | |
| (cast<VPInstruction>(&R)->getOpcode() == VPInstruction::BranchOnCond || | |
| cast<VPInstruction>(&R)->getOpcode() == Instruction::Switch)) { |
(once VPWidenPHIRecipe is turned into a VPInstruction this would be simpler)
Done thanks
nit: define Instr earlier to be reused above.
getUnderlyingInstr will assert if there's no underlying value, left here for now
Worth commenting that the other operand of the header phi - the one across the back-edge, will be added later?
Done, thanks
Perhaps simpler to do:
| if (!Legal->isInvariantStoreOfReduction(SI)) { | |
| R.eraseFromParent(); | |
| continue; | |
| } | |
| auto *Recipe = new VPReplicateRecipe( | |
| SI, RecipeBuilder.mapToVPValues(Instr->operands()), | |
| true /* IsUniform */); | |
| Recipe->insertBefore(*MiddleVPBB, MBIP); | |
| if (Legal->isInvariantStoreOfReduction(SI)) { | |
| auto *Recipe = new VPReplicateRecipe( | |
| SI, RecipeBuilder.mapToVPValues(Instr->operands()), | |
| true /* IsUniform */); | |
| Recipe->insertBefore(*MiddleVPBB, MBIP); | |
| } |
Updated, thanks
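The two shapes quoted in this suggestion (early `continue` versus a guarded block) can be compared on a toy model. The sketch below is standalone and every name in it (`StoreRecipe`, `Outcome`, `sinkWithEarlyContinue`, `sinkWithGuard`) is hypothetical. It also assumes that the loop-body recipe is erased in both the invariant and the non-invariant case, which the quoted snippet does not show.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a store recipe in the loop body.
struct StoreRecipe {
  std::string Name;
  bool IsFinalInvariantStore; // models Legal->isInvariantStoreOfReduction(SI)
};

struct Outcome {
  std::vector<std::string> MiddleBlock; // stores re-created in the middle block
  int ErasedFromLoop = 0;               // recipes removed from the loop body
};

// Early-exit shape: handle the "just drop it" case first and continue.
inline Outcome sinkWithEarlyContinue(const std::vector<StoreRecipe> &Stores) {
  Outcome Out;
  for (const StoreRecipe &S : Stores) {
    if (!S.IsFinalInvariantStore) {
      ++Out.ErasedFromLoop;
      continue;
    }
    Out.MiddleBlock.push_back(S.Name);
    ++Out.ErasedFromLoop; // assumption: the loop recipe is erased here as well
  }
  return Out;
}

// Guarded shape: create the middle-block store only when needed, then erase.
inline Outcome sinkWithGuard(const std::vector<StoreRecipe> &Stores) {
  Outcome Out;
  for (const StoreRecipe &S : Stores) {
    if (S.IsFinalInvariantStore)
      Out.MiddleBlock.push_back(S.Name);
    ++Out.ErasedFromLoop;
  }
  return Out;
}
```

Under that assumption the guarded shape has a single erase path, which is one plausible reading of why it was called simpler.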
This assert is associated with the preceding explanation. Both become obsolete?
Updated to only move truncated inductions, for which it still is needed
Use Builder to place Recipe in its insert point?
Done thanks
| assert(Recipe->getNumDefinedValues() == 0); | |
| assert(Recipe->getNumDefinedValues() == 0 && "Unexpected multidef recipe"); |
Added thanks
What's special about the exiting block, want to retain edges that leave the loop to exit blocks?
Nothing special, it was just needed to connect when breaking on reaching the exiting block. Removed the logic here, thanks
Is this doing anything, given that we've already visited Pred earlier in RPOT and disconnected it from all its successors? Can assert the absence of any predecessor?
Not needed in the latest version, removed thanks
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -587,9 +587,11 @@ static bool hasConditionalTerminator(const VPBasicBlock *VPBB) { | |||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| const VPRecipeBase *R = &VPBB->back(); | ||||||||||||||||||||||||||||
| bool IsCondBranch = isa<VPBranchOnMaskRecipe>(R) || | ||||||||||||||||||||||||||||
| match(R, m_BranchOnCond(m_VPValue())) || | ||||||||||||||||||||||||||||
| match(R, m_BranchOnCount(m_VPValue(), m_VPValue())); | ||||||||||||||||||||||||||||
| bool IsCondBranch = | ||||||||||||||||||||||||||||
| isa<VPBranchOnMaskRecipe>(R) || match(R, m_BranchOnCond(m_VPValue())) || | ||||||||||||||||||||||||||||
| match(R, m_BranchOnCount(m_VPValue(), m_VPValue())) || | ||||||||||||||||||||||||||||
| (isa<VPInstruction>(R) && | ||||||||||||||||||||||||||||
| cast<VPInstruction>(R)->getOpcode() == Instruction::Switch); | ||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||
| bool IsCondBranch = | |
| isa<VPBranchOnMaskRecipe>(R) || match(R, m_BranchOnCond(m_VPValue())) || | |
| match(R, m_BranchOnCount(m_VPValue(), m_VPValue())) || | |
| (isa<VPInstruction>(R) && | |
| cast<VPInstruction>(R)->getOpcode() == Instruction::Switch); | |
| bool IsCondBranch = isa<VPBranchOnMaskRecipe>(R) || | |
| match(R, m_BranchOnCond(m_VPValue())) || | |
| match(R, m_BranchOnCount(m_VPValue(), m_VPValue())); | |
| bool IsSwitch = isa<VPInstruction>(R) && | |
| cast<VPInstruction>(R)->getOpcode() == Instruction::Switch; |
plus asserting below that, if VPBB has at least 2 successors, then
assert((IsCondBranch || IsSwitch) && "block with multiple successors not terminated by "
"conditional branch nor switch recipe");
Switches are allowed to have a single successor?
Done thanks!
Yes I think switches can have a single successor (just the default destination)
| bool IsCondBranch = | |
| isa<VPBranchOnMaskRecipe>(R) || match(R, m_BranchOnCond(m_VPValue())) || | |
| match(R, m_BranchOnCount(m_VPValue(), m_VPValue())) || | |
| (isa<VPInstruction>(R) && | |
| cast<VPInstruction>(R)->getOpcode() == Instruction::Switch); | |
| (void)IsCondBranch; | |
| bool IsCondBranch = isa<VPBranchOnMaskRecipe>(R) || | |
| match(R, m_BranchOnCond(m_VPValue())) || | |
| match(R, m_BranchOnCount(m_VPValue(), m_VPValue())); | |
| bool IsSwitch = isa<VPInstruction>(R) && | |
| cast<VPInstruction>(R)->getOpcode() == Instruction::Switch; | |
| (void)IsCondBranch; | |
| (void)IsSwitch; |
plus asserting below that, if VPBB has at least 2 successors, then
assert((IsCondBranch || IsSwitch) && "block with multiple successors not terminated by "
"conditional branch nor switch recipe");
Switches can have a single successor?
done thanks
nit: can check >2 case separately expecting IsSwitch only.
Done thanks
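The successor-count checks discussed in this thread can be summarised in a small standalone predicate. This is only a sketch of the rules as stated in the comments (two or more successors need a conditional branch or a switch, more than two need a switch, and a switch may have a single successor); the enum `Terminator` and the function `terminatorMatchesSuccessors` are hypothetical, and blocks with fewer than two successors are deliberately left unconstrained here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical classification of a block's last recipe.
enum class Terminator { None, CondBranch, Switch };

// Returns true if the terminator kind is consistent with the number of
// successors, following the rules spelled out in the review comments.
inline bool terminatorMatchesSuccessors(Terminator T, std::size_t NumSuccs) {
  bool IsCondBranch = T == Terminator::CondBranch;
  bool IsSwitch = T == Terminator::Switch;
  if (NumSuccs > 2)
    return IsSwitch; // only a switch can fan out to more than two blocks
  if (NumSuccs == 2)
    return IsCondBranch || IsSwitch;
  // Fewer than two successors: not modelled in this sketch. A switch with
  // only a default destination falls in this bucket and is accepted.
  return true;
}
```

Keeping `IsCondBranch` and `IsSwitch` as separate flags, as suggested, is what allows the stricter check for the more-than-two-successors case.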
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -75,7 +75,7 @@ class PlainCFGBuilder { | |||||
| : TheLoop(Lp), LI(LI), Plan(P) {} | ||||||
|
|
||||||
| /// Build plain CFG for TheLoop and connects it to Plan's entry. | ||||||
| void buildPlainCFG(); | ||||||
| void buildPlainCFG(DenseMap<VPBlockBase *, BasicBlock *> &VPB2IRBB); | ||||||
| }; | ||||||
| } // anonymous namespace | ||||||
|
|
||||||
|
|
@@ -238,9 +238,9 @@ bool PlainCFGBuilder::isExternalDef(Value *Val) { | |||||
| return false; | ||||||
|
|
||||||
| // Check whether Instruction definition is in the loop exit. | ||||||
|
||||||
| // Check whether Instruction definition is in the loop exit. | |
| // Check whether Instruction definition is in a loop exit. |
?
Is this notion of "External Def" still valid?
Updated. I think so as
No longer expecting a loop with single exit?
Teaching PlainCFGBuilder about multiple exits - does this imply that "native" can now vectorize such outerloops, and worth testing?
OTOH, PlainCFGBuilder::buildPlainCFG() does handle multiple/non-unique exits(?).
It is needed for the inner loop multi-exit support. The native path still won't vectorize early exits, due to checks in legality I think. I can add a test case, thanks
Teaching PlainCFGBuilder about switch statements - does this imply that "native" can now vectorize outerloops with switches, and worth testing?
The native path still won't vectorize switches, due to checks in legality I think. I can add a test case, thanks
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -46,7 +46,7 @@ define void @vector_reverse_i64(ptr nocapture noundef writeonly %A, ptr nocaptur | |
| ; CHECK-NEXT: LV: Found an estimated cost of 0 for VF vscale x 4 For instruction: br i1 %cmp, label %for.body, label %for.cond.cleanup.loopexit, !llvm.loop !0 | ||
| ; CHECK-NEXT: LV: Using user VF vscale x 4. | ||
| ; CHECK-NEXT: LV: Loop does not require scalar epilogue | ||
| ; CHECK-NEXT: LV: Scalarizing: %i.0 = add nsw i32 %i.0.in8, -1 | ||
| ; CHECK: LV: Scalarizing: %i.0 = add nsw i32 %i.0.in8, -1 | ||
|
Collaborator
Why is this line no longer NEXT? Same below.
Contributor
Author
We now also print the VPlan after HCFG construction; should we add a test to check that or drop the extra output?
Collaborator
ok, whatever you prefer.
||
| ; CHECK-NEXT: LV: Scalarizing: %idxprom = zext i32 %i.0 to i64 | ||
| ; CHECK-NEXT: LV: Scalarizing: %arrayidx = getelementptr inbounds i32, ptr %B, i64 %idxprom | ||
| ; CHECK-NEXT: LV: Scalarizing: %arrayidx3 = getelementptr inbounds i32, ptr %A, i64 %idxprom | ||
|
|
@@ -295,7 +295,7 @@ define void @vector_reverse_f32(ptr nocapture noundef writeonly %A, ptr nocaptur | |
| ; CHECK-NEXT: LV: Found an estimated cost of 0 for VF vscale x 4 For instruction: br i1 %cmp, label %for.body, label %for.cond.cleanup.loopexit, !llvm.loop !0 | ||
| ; CHECK-NEXT: LV: Using user VF vscale x 4. | ||
| ; CHECK-NEXT: LV: Loop does not require scalar epilogue | ||
| ; CHECK-NEXT: LV: Scalarizing: %i.0 = add nsw i32 %i.0.in8, -1 | ||
| ; CHECK: LV: Scalarizing: %i.0 = add nsw i32 %i.0.in8, -1 | ||
| ; CHECK-NEXT: LV: Scalarizing: %idxprom = zext i32 %i.0 to i64 | ||
| ; CHECK-NEXT: LV: Scalarizing: %arrayidx = getelementptr inbounds float, ptr %B, i64 %idxprom | ||
| ; CHECK-NEXT: LV: Scalarizing: %arrayidx3 = getelementptr inbounds float, ptr %A, i64 %idxprom | ||
|
|
||
This seems awkward, would be better to have VPBuilder support
as inspired by IRBuilderBase?