Skip to content
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
56 changes: 38 additions & 18 deletions llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -386,33 +386,53 @@ std::unique_ptr<VPlan> VPlanTransforms::buildPlainCFG(
/// Checks if \p HeaderVPB is a loop header block in the plain CFG; that is, it
/// has exactly 2 predecessors (preheader and latch), where the block
/// dominates the latch and the preheader dominates the block. If it is a
/// header block return true, making sure the preheader appears first and
/// the latch second. Otherwise return false.
static bool canonicalHeader(VPBlockBase *HeaderVPB,
const VPDominatorTree &VPDT) {
/// header block return true and canonicalize the predecessors of the header
/// (making sure the preheader appears first and the latch second) and the
/// successors of the latch (making sure the loop exit comes first). Otherwise
/// return false.
static bool canonicalHeaderAndLatch(VPBlockBase *HeaderVPB,
const VPDominatorTree &VPDT) {
ArrayRef<VPBlockBase *> Preds = HeaderVPB->getPredecessors();
if (Preds.size() != 2)
return false;

auto *PreheaderVPBB = Preds[0];
auto *LatchVPBB = Preds[1];
if (VPDT.dominates(PreheaderVPBB, HeaderVPB) &&
VPDT.dominates(HeaderVPB, LatchVPBB))
return true;
if (!VPDT.dominates(PreheaderVPBB, HeaderVPB) ||
!VPDT.dominates(HeaderVPB, LatchVPBB)) {
std::swap(PreheaderVPBB, LatchVPBB);

std::swap(PreheaderVPBB, LatchVPBB);
if (!VPDT.dominates(PreheaderVPBB, HeaderVPB) ||
!VPDT.dominates(HeaderVPB, LatchVPBB)) {
return false;
} else {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
!VPDT.dominates(HeaderVPB, LatchVPBB)) {
return false;
} else {
!VPDT.dominates(HeaderVPB, LatchVPBB))
return false;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done thanks

// Canonicalize predecessors of header so that preheader is first and
// latch second.
HeaderVPB->swapPredecessors();
for (VPRecipeBase &R : cast<VPBasicBlock>(HeaderVPB)->phis())
R.swapOperands();
}
}

if (VPDT.dominates(PreheaderVPBB, HeaderVPB) &&
VPDT.dominates(HeaderVPB, LatchVPBB)) {
// Canonicalize predecessors of header so that preheader is first and latch
// second.
HeaderVPB->swapPredecessors();
for (VPRecipeBase &R : cast<VPBasicBlock>(HeaderVPB)->phis())
R.swapOperands();
return true;
// The two successors of conditional branch match the condition, with the
// first successor corresponding to true and the second to false. We
// canonicalize the successors of the latch when introducing the region, such
// that the latch exits the region when its condition is true; invert the
// original condition if the original CFG branches to the header on true.
if (!LatchVPBB->getSingleSuccessor() &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can LatchVPBB have a single successor, should we assert that it has two, following the check that header has two predecessors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, it may have 1 successor (the header, when vectorizing inner loops, where exit edges are not yet connected; that should be coming soon thought) or 2 (for inner loops when vectorizing outer loops)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, and in that case there's no terminating conditional branch yet to reverse? May be worth leaving a note.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done,t hanks

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: consider reversing (the above conditional branch which is also terminating ;-) to early-return.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done thanks

LatchVPBB->getSuccessors()[0] == HeaderVPB) {
assert(LatchVPBB->getNumSuccessors() == 2 && "Must have 2 successors");
auto *Term = cast<VPBasicBlock>(LatchVPBB)->getTerminator();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assert that Term is a conditional branch, so that we're certain who's first operand we're resetting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done thanks

assert(cast<VPInstruction>(Term)->getOpcode() ==
VPInstruction::BranchOnCond &&
"terminator must be a BranchOnCond");
auto *Not = new VPInstruction(VPInstruction::Not, {Term->getOperand(0)});
Not->insertBefore(Term);
Term->setOperand(0, Not);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also swap the successors to keep successors and branch condition in sync, even though both successors are soon to be removed. As in canonicalizing the two predecessors of header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done thanks,

LatchVPBB->swapSuccessors();
}

return false;
return true;
}

/// Create a new VPRegionBlock for the loop starting at \p HeaderVPB.
Expand Down Expand Up @@ -447,7 +467,7 @@ void VPlanTransforms::createLoopRegions(VPlan &Plan, Type *InductionTy,
VPDominatorTree VPDT;
VPDT.recalculate(Plan);
for (VPBlockBase *HeaderVPB : vp_depth_first_shallow(Plan.getEntry()))
if (canonicalHeader(HeaderVPB, VPDT))
if (canonicalHeaderAndLatch(HeaderVPB, VPDT))
createLoopRegion(Plan, HeaderVPB);

VPRegionBlock *TopRegion = Plan.getVectorLoopRegion();
Expand Down
3 changes: 3 additions & 0 deletions llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,9 @@ bool VPlanTransforms::tryToConvertVPInstructionsToVPRecipes(
make_early_inc_range(make_range(VPBB->begin(), EndIter))) {

VPValue *VPV = Ingredient.getVPSingleValue();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent: using Ingredient as the name of a recipe? Which (now) may even be ingredient-less...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can rename separately, thanks

if (!VPV->getUnderlyingValue())
continue;

Instruction *Inst = cast<Instruction>(VPV->getUnderlyingValue());

VPRecipeBase *NewRecipe = nullptr;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,6 @@
@A = common global [1024 x i64] zeroinitializer, align 16
@B = common global [1024 x i64] zeroinitializer, align 16

; FIXME: The exit condition of the inner loop is incorrect when vectorizing.
define void @inner_latch_header_first_successor(i64 %N, i32 %c, i64 %M) {
; CHECK-LABEL: define void @inner_latch_header_first_successor(
; CHECK-SAME: i64 [[N:%.*]], i32 [[C:%.*]], i64 [[M:%.*]]) {
Expand Down Expand Up @@ -35,8 +34,9 @@ define void @inner_latch_header_first_successor(i64 %N, i32 %c, i64 %M) {
; CHECK-NEXT: [[TMP3]] = add nsw <4 x i64> [[TMP2]], [[VEC_PHI4]]
; CHECK-NEXT: [[TMP4]] = add nuw nsw <4 x i64> [[VEC_PHI]], splat (i64 1)
; CHECK-NEXT: [[TMP5:%.*]] = icmp ne <4 x i64> [[TMP4]], [[BROADCAST_SPLAT2]]
; CHECK-NEXT: [[TMP6:%.*]] = extractelement <4 x i1> [[TMP5]], i32 0
; CHECK-NEXT: br i1 [[TMP6]], label %[[VECTOR_LATCH]], label %[[INNER3]]
; CHECK-NEXT: [[TMP6:%.*]] = xor <4 x i1> [[TMP5]], splat (i1 true)
; CHECK-NEXT: [[TMP9:%.*]] = extractelement <4 x i1> [[TMP6]], i32 0
; CHECK-NEXT: br i1 [[TMP9]], label %[[VECTOR_LATCH]], label %[[INNER3]]
; CHECK: [[VECTOR_LATCH]]:
; CHECK-NEXT: [[VEC_PHI6:%.*]] = phi <4 x i64> [ [[TMP3]], %[[INNER3]] ]
; CHECK-NEXT: call void @llvm.masked.scatter.v4i64.v4p0(<4 x i64> [[VEC_PHI6]], <4 x ptr> [[TMP0]], i32 4, <4 x i1> splat (i1 true))
Expand Down
Loading