[VPlan] Simplify Plan's entry in removeBranchOnConst. #154510
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2347,12 +2347,15 @@ Value *EpilogueVectorizerMainLoop::createIterationCountCheck( | |
| } | ||
|
|
||
| /// Replace \p VPBB with a VPIRBasicBlock wrapping \p IRBB. All recipes from \p | ||
| /// VPBB are moved to the end of the newly created VPIRBasicBlock. VPBB must | ||
| /// have a single predecessor, which is rewired to the new VPIRBasicBlock. All | ||
| /// successors of VPBB, if any, are rewired to the new VPIRBasicBlock. | ||
| /// VPBB are moved to the end of the newly created VPIRBasicBlock. All | ||
| /// predecessors and successors of VPBB, if any, are rewired to the new | ||
| /// VPIRBasicBlock. If \p VPBB may be unreachable, \p Plan must be passed. | ||
| static VPIRBasicBlock *replaceVPBBWithIRVPBB(VPBasicBlock *VPBB, | ||
| BasicBlock *IRBB) { | ||
| VPIRBasicBlock *IRVPBB = VPBB->getPlan()->createVPIRBasicBlock(IRBB); | ||
| BasicBlock *IRBB, | ||
| VPlan *Plan = nullptr) { | ||
| if (!Plan) | ||
| Plan = VPBB->getPlan(); | ||
| VPIRBasicBlock *IRVPBB = Plan->createVPIRBasicBlock(IRBB); | ||
| auto IP = IRVPBB->begin(); | ||
| for (auto &R : make_early_inc_range(VPBB->phis())) | ||
| R.moveBefore(*IRVPBB, IP); | ||
|
|
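The new optional `Plan` argument can be read as a fallback for blocks whose owning plan cannot be looked up from the block itself. A minimal sketch of that pattern follows, on a toy model rather than the real VPlan classes. `Block`, `Plan`, `findPlan` and `replaceWithIRBlock` are hypothetical names, and the assumption that the lookup walks predecessors up to the entry is made only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Plan;

// Hypothetical toy block, not LLVM's VPBasicBlock.
struct Block {
  Block *Pred = nullptr;  // single predecessor; null for the entry or a disconnected block
  Plan *Owner = nullptr;  // assumption: only the entry block records its plan
  bool WrapsIR = false;   // marks the replacement block that wraps an IR block
};

struct Plan {
  std::vector<std::unique_ptr<Block>> Blocks;

  // The plan owns every block it creates.
  Block *createBlock(bool WrapsIR = false) {
    Blocks.push_back(std::make_unique<Block>());
    Blocks.back()->WrapsIR = WrapsIR;
    return Blocks.back().get();
  }
};

// Walk up the predecessor chain; yields null when the chain does not end at
// the entry block, i.e. when the block is unreachable.
Plan *findPlan(Block *B) {
  while (B->Pred)
    B = B->Pred;
  return B->Owner;
}

// Replace Old with a new IR-wrapping block. Callers that know Old may be
// unreachable pass the plan explicitly; otherwise it is looked up.
Block *replaceWithIRBlock(Block *Old, Plan *P = nullptr) {
  if (!P)
    P = findPlan(Old);
  assert(P && "unreachable block: the plan must be passed explicitly");
  Block *New = P->createBlock(/*WrapsIR=*/true);
  New->Pred = Old->Pred; // rewire the predecessor, if any
  Old->Pred = nullptr;
  return New;
}
```

In this sketch a reachable block can be replaced without naming the plan, while a disconnected block only works when the caller supplies it, which mirrors the updated doc comment.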
@@ -7184,6 +7187,19 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan( | |
| VPlanTransforms::optimizeForVFAndUF(BestVPlan, BestVF, BestUF, PSE); | ||
| VPlanTransforms::simplifyRecipes(BestVPlan); | ||
| VPlanTransforms::removeBranchOnConst(BestVPlan); | ||
| if (BestVPlan.getEntry()->getSingleSuccessor() == | ||
| BestVPlan.getScalarPreheader()) { | ||
|
Comment on lines +7190 to +7191
Collaborator
removeBranchOnConst() could conceivably bypass the vector loop; this actually happens in a few tests. Worth emitting a missed-vectorization remark.
Contributor Author
Yep, added an analysis remark. I am not sure if missed-vectorization would be accurate, because this is for cases where we would create a dead vector loop and should not even try to vectorize.
Collaborator
OK, it appears the loop isn't vectorized because the trip-count guard is known to always jump to the scalar loop, i.e., VFxUF is known to exceed TC, so conceptually a smaller VFxUF could work. But the tests include unvectorizable non-loop cases where TC<=1, which would be better cleaned up before calling LV, and certainly before reaching LVP::executePlan().
Contributor Author
Agreed, we already have a TODO where we create the known-true condition.
Collaborator
We have a TODO here too; wondering if the message should specify that the vector loop is dead or never executes due to insufficient trip count.
Contributor Author
Updated the message to mention insufficient trip count, thanks.
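The check discussed in this thread can be illustrated on a toy CFG: once a branch on a known condition is folded, an entry block whose single successor is the scalar preheader means the vector loop can never run. This is only a sketch under stated assumptions. `Block`, `foldBranchOnConst`, `getSingleSuccessor` and `vectorLoopIsDead` are hypothetical stand-ins, not the VPlan API, and the convention that the true edge leads to the scalar preheader is taken from the `br i1 false, label %[[SCALAR_PH]], label %[[VECTOR_PH]]` lines in the test diff.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical toy block, not LLVM's VPBasicBlock.
struct Block {
  std::string Name;
  std::vector<Block *> Succs;    // Succs[0] is taken on true, Succs[1] on false
  std::optional<bool> ConstCond; // set when the branch condition is a known constant
};

// Turn a two-way branch on a known condition into an unconditional branch.
void foldBranchOnConst(Block &B) {
  if (B.Succs.size() != 2 || !B.ConstCond)
    return;
  Block *Taken = *B.ConstCond ? B.Succs[0] : B.Succs[1];
  B.Succs = {Taken};
  B.ConstCond.reset();
}

// Return the only successor, or null if there are zero or several.
Block *getSingleSuccessor(const Block &B) {
  return B.Succs.size() == 1 ? B.Succs[0] : nullptr;
}

// After folding, an entry that can only reach the scalar preheader means the
// vector loop would be dead, so the caller can bail out and emit a remark.
bool vectorLoopIsDead(Block &Entry, Block &ScalarPH) {
  foldBranchOnConst(Entry);
  return getSingleSuccessor(Entry) == &ScalarPH;
}
```

With an entry branching on a known-true guard, `vectorLoopIsDead` returns true; with a known-false or unknown guard it returns false and the vector path stays reachable.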
||
| // TODO: The vector loop would be dead, should not even try to vectorize. | ||
| ORE->emit([&]() { | ||
| return OptimizationRemarkAnalysis(DEBUG_TYPE, "VectorizationDead", | ||
| OrigLoop->getStartLoc(), | ||
| OrigLoop->getHeader()) | ||
| << "Created vector loop never executes due to insufficient trip " | ||
| "count."; | ||
| }); | ||
| return DenseMap<const SCEV *, Value *>(); | ||
| } | ||
|
|
||
| VPlanTransforms::narrowInterleaveGroups( | ||
| BestVPlan, BestVF, | ||
| TTI.getRegisterBitWidth(TargetTransformInfo::RGK_FixedWidthVector)); | ||
|
|
@@ -7226,7 +7242,7 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan( | |
| // middle block. The vector loop is created during VPlan execution. | ||
| State.CFG.PrevBB = ILV.createVectorizedLoopSkeleton(); | ||
| replaceVPBBWithIRVPBB(BestVPlan.getScalarPreheader(), | ||
| State.CFG.PrevBB->getSingleSuccessor()); | ||
| State.CFG.PrevBB->getSingleSuccessor(), &BestVPlan); | ||
| VPlanTransforms::removeDeadRecipes(BestVPlan); | ||
|
|
||
| assert(verifyVPlanIsValid(BestVPlan, true /*VerifyLate*/) && | ||
|
|
@@ -7257,6 +7273,13 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan( | |
| // | ||
| //===------------------------------------------------===// | ||
|
|
||
| // Retrieve loop information before executing the plan, which may remove the | ||
| // original loop, if it becomes unreachable. | ||
| MDNode *LID = OrigLoop->getLoopID(); | ||
| unsigned OrigLoopInvocationWeight = 0; | ||
| std::optional<unsigned> OrigAverageTripCount = | ||
| getLoopEstimatedTripCount(OrigLoop, &OrigLoopInvocationWeight); | ||
|
|
||
| BestVPlan.execute(&State); | ||
|
|
||
| // 2.6. Maintain Loop Hints | ||
|
|
@@ -7270,7 +7293,8 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan( | |
| updateLoopMetadataAndProfileInfo( | ||
| HeaderVPBB ? LI->getLoopFor(State.CFG.VPBB2IRBB.lookup(HeaderVPBB)) | ||
| : nullptr, | ||
| HeaderVPBB, VectorizingEpilogue, | ||
| HeaderVPBB, BestVPlan, VectorizingEpilogue, LID, OrigAverageTripCount, | ||
| OrigLoopInvocationWeight, | ||
| estimateElementCount(BestVF * BestUF, CM.getVScaleForTuning()), | ||
| DisableRuntimeUnroll); | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -6,8 +6,8 @@ target triple = "arm64-apple-macosx11.0.0" | |
| define void @fshl_operand_first_order_recurrence(ptr %dst, ptr noalias %src) { | ||
| ; CHECK-LABEL: define void @fshl_operand_first_order_recurrence( | ||
| ; CHECK-SAME: ptr [[DST:%.*]], ptr noalias [[SRC:%.*]]) { | ||
| ; CHECK-NEXT: [[ENTRY:.*]]: | ||
| ; CHECK-NEXT: br i1 false, label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]] | ||
| ; CHECK-NEXT: [[ENTRY:.*:]] | ||
| ; CHECK-NEXT: br label %[[VECTOR_PH:.*]] | ||
| ; CHECK: [[VECTOR_PH]]: | ||
| ; CHECK-NEXT: br label %[[VECTOR_BODY:.*]] | ||
| ; CHECK: [[VECTOR_BODY]]: | ||
|
|
@@ -30,14 +30,12 @@ define void @fshl_operand_first_order_recurrence(ptr %dst, ptr noalias %src) { | |
| ; CHECK-NEXT: br i1 [[TMP14]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]] | ||
| ; CHECK: [[MIDDLE_BLOCK]]: | ||
| ; CHECK-NEXT: [[VECTOR_RECUR_EXTRACT:%.*]] = extractelement <2 x i64> [[WIDE_LOAD1]], i32 1 | ||
| ; CHECK-NEXT: br label %[[SCALAR_PH]] | ||
| ; CHECK-NEXT: br label %[[SCALAR_PH:.*]] | ||
| ; CHECK: [[SCALAR_PH]]: | ||
| ; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 100, %[[MIDDLE_BLOCK]] ], [ 0, %[[ENTRY]] ] | ||
| ; CHECK-NEXT: [[SCALAR_RECUR_INIT:%.*]] = phi i64 [ [[VECTOR_RECUR_EXTRACT]], %[[MIDDLE_BLOCK]] ], [ 0, %[[ENTRY]] ] | ||
| ; CHECK-NEXT: br label %[[LOOP:.*]] | ||
| ; CHECK: [[LOOP]]: | ||
| ; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], %[[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], %[[LOOP]] ] | ||
| ; CHECK-NEXT: [[RECUR:%.*]] = phi i64 [ [[SCALAR_RECUR_INIT]], %[[SCALAR_PH]] ], [ [[L:%.*]], %[[LOOP]] ] | ||
| ; CHECK-NEXT: [[IV:%.*]] = phi i64 [ 100, %[[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], %[[LOOP]] ] | ||
| ; CHECK-NEXT: [[RECUR:%.*]] = phi i64 [ [[VECTOR_RECUR_EXTRACT]], %[[SCALAR_PH]] ], [ [[L:%.*]], %[[LOOP]] ] | ||
| ; CHECK-NEXT: [[GEP_SRC:%.*]] = getelementptr inbounds i64, ptr [[SRC]], i64 [[IV]] | ||
| ; CHECK-NEXT: [[L]] = load i64, ptr [[GEP_SRC]], align 8 | ||
| ; CHECK-NEXT: [[OR:%.*]] = tail call i64 @llvm.fshl.i64(i64 1, i64 [[RECUR]], i64 1) | ||
|
|
@@ -73,7 +71,7 @@ define void @powi_call(ptr %P) { | |
| ; CHECK-LABEL: define void @powi_call( | ||
| ; CHECK-SAME: ptr [[P:%.*]]) { | ||
| ; CHECK-NEXT: [[ENTRY:.*:]] | ||
| ; CHECK-NEXT: br i1 false, label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]] | ||
| ; CHECK-NEXT: br label %[[VECTOR_PH:.*]] | ||
| ; CHECK: [[VECTOR_PH]]: | ||
| ; CHECK-NEXT: br label %[[VECTOR_BODY:.*]] | ||
| ; CHECK: [[VECTOR_BODY]]: | ||
|
|
@@ -83,7 +81,7 @@ define void @powi_call(ptr %P) { | |
| ; CHECK-NEXT: br label %[[MIDDLE_BLOCK:.*]] | ||
| ; CHECK: [[MIDDLE_BLOCK]]: | ||
| ; CHECK-NEXT: br label %[[EXIT:.*]] | ||
| ; CHECK: [[SCALAR_PH]]: | ||
| ; CHECK: [[SCALAR_PH:.*]]: | ||
| ; CHECK-NEXT: br label %[[LOOP:.*]] | ||
| ; CHECK: [[LOOP]]: | ||
| ; CHECK-NEXT: [[IV:%.*]] = phi i64 [ 0, %[[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], %[[LOOP]] ] | ||
|
|
@@ -93,7 +91,7 @@ define void @powi_call(ptr %P) { | |
| ; CHECK-NEXT: store double [[POWI]], ptr [[GEP]], align 8 | ||
| ; CHECK-NEXT: [[IV_NEXT]] = add i64 [[IV]], 1 | ||
| ; CHECK-NEXT: [[EC:%.*]] = icmp eq i64 [[IV]], 1 | ||
| ; CHECK-NEXT: br i1 [[EC]], label %[[EXIT]], label %[[LOOP]], !llvm.loop [[LOOP4:![0-9]+]] | ||
|
Collaborator
Metadata dropped, scalar loop unreachable.
Contributor Author
Yep.
||
| ; CHECK-NEXT: br i1 [[EC]], label %[[EXIT]], label %[[LOOP]] | ||
| ; CHECK: [[EXIT]]: | ||
| ; CHECK-NEXT: ret void | ||
| ; | ||
|
|
@@ -224,5 +222,4 @@ declare i64 @llvm.fshl.i64(i64, i64, i64) | |
| ; CHECK: [[META1]] = !{!"llvm.loop.isvectorized", i32 1} | ||
| ; CHECK: [[META2]] = !{!"llvm.loop.unroll.runtime.disable"} | ||
| ; CHECK: [[LOOP3]] = distinct !{[[LOOP3]], [[META2]], [[META1]]} | ||
| ; CHECK: [[LOOP4]] = distinct !{[[LOOP4]], [[META2]], [[META1]]} | ||
| ;. | ||
Closed ) above thanks