[LV]Initial support for safe distance in predicated DataWithEVL vectorization mode. #102897
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1445,9 +1445,8 @@ class LoopVectorizationCostModel { | |
|
|
||
| /// Selects and saves TailFoldingStyle for 2 options - if IV update may | ||
| /// overflow or not. | ||
| /// \param IsScalableVF true if scalable vector factors enabled. | ||
| /// \param UserIC User specific interleave count. | ||
| void setTailFoldingStyles(bool IsScalableVF, unsigned UserIC) { | ||
| void setTailFoldingStyles(unsigned UserIC) { | ||
| assert(!ChosenTailFoldingStyle && "Tail folding must not be selected yet."); | ||
| if (!Legal->canFoldTailByMasking()) { | ||
| ChosenTailFoldingStyle = | ||
|
|
@@ -1470,12 +1469,9 @@ class LoopVectorizationCostModel { | |
| // Override forced styles if needed. | ||
| // FIXME: use actual opcode/data type for analysis here. | ||
| // FIXME: Investigate opportunity for fixed vector factor. | ||
| bool EVLIsLegal = | ||
| IsScalableVF && UserIC <= 1 && | ||
|
||
| TTI.hasActiveVectorLength(0, nullptr, Align()) && | ||
| !EnableVPlanNativePath && | ||
| // FIXME: implement support for max safe dependency distance. | ||
| Legal->isSafeForAnyVectorWidth(); | ||
| bool EVLIsLegal = UserIC <= 1 && | ||
| TTI.hasActiveVectorLength(0, nullptr, Align()) && | ||
| !EnableVPlanNativePath; | ||
|
Comment on lines +1434 to +1436
Collaborator
(Unrelated to this patch.) Should this EVL fallback, triggered by UserIC > 1 or the VPlan-native path, also apply to the getPreferredTailFoldingStyle() decisions above, in addition to the (de)forced styles below? |
||
| if (!EVLIsLegal) { | ||
| // If for some reason EVL mode is unsupported, fallback to | ||
| // DataWithoutLaneMask to try to vectorize the loop with folded tail | ||
|
|
@@ -1493,13 +1489,29 @@ class LoopVectorizationCostModel { | |
| } | ||
| } | ||
|
|
||
| /// Disables previously chosen tail folding policy, sets it to None. Expects, | ||
| /// that the tail policy was selected. | ||
| void disableTailFolding() { | ||
| assert(ChosenTailFoldingStyle && "Tail folding must be selected."); | ||
| ChosenTailFoldingStyle = | ||
| std::make_pair(TailFoldingStyle::None, TailFoldingStyle::None); | ||
| } | ||
|
|
||
|
||
| /// Returns true if all loop blocks should be masked to fold tail loop. | ||
| bool foldTailByMasking() const { | ||
| // TODO: check if it is possible to check for None style independent of | ||
| // IVUpdateMayOverflow flag in getTailFoldingStyle. | ||
| return getTailFoldingStyle() != TailFoldingStyle::None; | ||
| } | ||
|
|
||
| /// Return maximum safe number of elements to be processed, which do not | ||
|
||
| /// prevent store-load forwarding. | ||
|
||
| /// TODO: need to consider adjusting cost model to use this value as a | ||
| /// vectorization factor for EVL-based vectorization. | ||
| std::optional<unsigned> getMaxEVLSafeElements() const { | ||
| return MaxEVLSafeElements; | ||
| } | ||
|
|
||
| /// Returns true if the instructions in this block requires predication | ||
| /// for any reason, e.g. because tail folding now requires a predicate | ||
| /// or because the block in the original loop was predicated. | ||
|
|
@@ -1651,6 +1663,10 @@ class LoopVectorizationCostModel { | |
| /// true if scalable vectorization is supported and enabled. | ||
| std::optional<bool> IsScalableVectorizationAllowed; | ||
|
|
||
| /// Maximum safe number of elements to be processed, which do not | ||
|
||
| /// prevent store-load forwarding. | ||
| std::optional<unsigned> MaxEVLSafeElements; | ||
|
|
||
| /// A map holding scalar costs for different vectorization factors. The | ||
| /// presence of a cost for an instruction in the mapping indicates that the | ||
| /// instruction will be scalarized when vectorizing with the associated | ||
|
|
@@ -3903,9 +3919,14 @@ FixedScalableVFPair LoopVectorizationCostModel::computeFeasibleMaxVF( | |
| // dependence distance). | ||
|
Collaborator
Could you add some further explanation of the scalable and EVL max safe dependence distances, complementing the comment above? |
||
| unsigned MaxSafeElements = | ||
| llvm::bit_floor(Legal->getMaxSafeVectorWidthInBits() / WidestType); | ||
| unsigned MaxScalableSafeElements = MaxSafeElements; | ||
| if (foldTailWithEVL() && !Legal->isSafeForAnyVectorWidth()) { | ||
| MaxScalableSafeElements = std::numeric_limits<unsigned>::max(); | ||
|
||
| MaxEVLSafeElements = MaxSafeElements; | ||
| } | ||
|
|
||
| auto MaxSafeFixedVF = ElementCount::getFixed(MaxSafeElements); | ||
| auto MaxSafeScalableVF = getMaxLegalScalableVF(MaxSafeElements); | ||
| auto MaxSafeScalableVF = getMaxLegalScalableVF(MaxScalableSafeElements); | ||
|
||
|
|
||
| LLVM_DEBUG(dbgs() << "LV: The max safe fixed VF is: " << MaxSafeFixedVF | ||
| << ".\n"); | ||
|
Collaborator
(Should the max EVL safe VF also be dumped here?) |
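For reference, the element limits computed in this hunk can be sketched in isolation. This is a minimal standalone sketch, not the patch itself: `bitFloor`, `SafeLimits`, and `computeSafeLimits` are hypothetical names, and the two boolean parameters stand in for `foldTailWithEVL()` and `Legal->isSafeForAnyVectorWidth()`.

```cpp
#include <cassert>
#include <limits>
#include <optional>

// Hypothetical stand-in for llvm::bit_floor: the largest power of two that is
// less than or equal to V, or 0 when V is 0.
unsigned bitFloor(unsigned V) {
  if (V == 0)
    return 0;
  unsigned R = 1;
  while (R <= V / 2)
    R *= 2;
  return R;
}

// Hypothetical summary of the three limits that appear in the hunk.
struct SafeLimits {
  unsigned MaxSafeElements = 0;
  unsigned MaxScalableSafeElements = 0;
  std::optional<unsigned> MaxEVLSafeElements;
};

// Mirrors the logic of the hunk: the scalable limit is lifted only when the
// tail is folded with EVL and the loop is not safe for any vector width; in
// that case the element limit is recorded separately for EVL.
SafeLimits computeSafeLimits(unsigned MaxSafeWidthInBits,
                             unsigned WidestTypeBits, bool FoldTailWithEVL,
                             bool SafeForAnyVectorWidth) {
  assert(WidestTypeBits != 0 && "element width must be non-zero");
  SafeLimits R{};
  R.MaxSafeElements = bitFloor(MaxSafeWidthInBits / WidestTypeBits);
  R.MaxScalableSafeElements = R.MaxSafeElements;
  if (FoldTailWithEVL && !SafeForAnyVectorWidth) {
    R.MaxScalableSafeElements = std::numeric_limits<unsigned>::max();
    R.MaxEVLSafeElements = R.MaxSafeElements;
  }
  return R;
}
```

With a 96-bit safe width and 32-bit elements, the quotient 3 is rounded down to 2 elements, which is the kind of non-power-of-2 distance discussed in this thread.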
||
|
|
@@ -4075,7 +4096,13 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) { | |
| InterleaveInfo.invalidateGroupsRequiringScalarEpilogue(); | ||
| } | ||
|
|
||
| FixedScalableVFPair MaxFactors = computeFeasibleMaxVF(MaxTC, UserVF, true); | ||
|
Collaborator
Could this speculative computation of the feasible max VFs, which assumes the tail is folded, be complemented by recomputing the feasible max VFs later, once we know there is no tail to fold?
Member
Author
You mean that, instead of moving tail folding before the MaxVF computation, we compute MaxVF twice if required? I think this is possible. |
||
| // If we don't know the precise trip count, or if the trip count that we | ||
| // found modulo the vectorization factor is not zero, try to fold the tail | ||
| // by masking. | ||
| // FIXME: look for a smaller MaxVF that does divide TC rather than masking. | ||
|
||
| setTailFoldingStyles(UserIC); | ||
| FixedScalableVFPair MaxFactors = | ||
| computeFeasibleMaxVF(MaxTC, UserVF, foldTailByMasking()); | ||
|
||
|
|
||
| // Avoid tail folding if the trip count is known to be a multiple of any VF | ||
| // we choose. | ||
|
|
@@ -4106,15 +4133,11 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) { | |
| if (Rem->isZero()) { | ||
| // Accept MaxFixedVF if we do not have a tail. | ||
| LLVM_DEBUG(dbgs() << "LV: No tail will remain for any chosen VF.\n"); | ||
| disableTailFolding(); | ||
|
||
| return MaxFactors; | ||
| } | ||
| } | ||
|
|
||
| // If we don't know the precise trip count, or if the trip count that we | ||
| // found modulo the vectorization factor is not zero, try to fold the tail | ||
| // by masking. | ||
| // FIXME: look for a smaller MaxVF that does divide TC rather than masking. | ||
| setTailFoldingStyles(MaxFactors.ScalableVF.isScalable(), UserIC); | ||
| if (foldTailByMasking()) { | ||
|
Collaborator
Following the above-mentioned thought, should the following correction for over-speculated ?
Member
Author
As I said, some special processing will be required for the EVL case with a non-power-of-2 safe distance. |
||
| if (getTailFoldingStyle() == TailFoldingStyle::DataWithEVL) { | ||
| LLVM_DEBUG( | ||
|
|
@@ -8496,8 +8519,8 @@ void LoopVectorizationPlanner::buildVPlansWithVPRecipes(ElementCount MinVF, | |
| VPlanTransforms::optimize(*Plan, *PSE.getSE()); | ||
| // TODO: try to put it close to addActiveLaneMask(). | ||
| // Discard the plan if it is not EVL-compatible | ||
| if (CM.foldTailWithEVL() && | ||
| !VPlanTransforms::tryAddExplicitVectorLength(*Plan)) | ||
| if (CM.foldTailWithEVL() && !VPlanTransforms::tryAddExplicitVectorLength( | ||
| *Plan, CM.getMaxEVLSafeElements())) | ||
| break; | ||
| assert(verifyVPlanIsValid(*Plan) && "VPlan is invalid"); | ||
| VPlans.push_back(std::move(Plan)); | ||
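The body of `tryAddExplicitVectorLength` is not shown in this diff, so how it consumes `CM.getMaxEVLSafeElements()` is an assumption here. One plausible reading is that the per-iteration explicit vector length is clamped to the safe element count. The scalar simulation below sketches why such a clamp would keep a loop with a dependence distance `D` correct; `runScalar` and `runWithEVL` are hypothetical helpers, not LLVM APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Scalar reference loop: A[I + D] = A[I] + 1 for I in [0, N).
std::vector<int> runScalar(std::vector<int> A, std::size_t N, std::size_t D) {
  assert(A.size() >= N + D && "array too small for the accessed range");
  for (std::size_t I = 0; I < N; ++I)
    A[I + D] = A[I] + 1;
  return A;
}

// Simulated EVL-style loop: each step handles EVL elements, where EVL is the
// remaining trip count limited by VF and, if present, by MaxEVLSafeElements.
// The clamp is an assumption about the transform, not taken from the diff.
std::vector<int> runWithEVL(std::vector<int> A, std::size_t N, std::size_t D,
                            std::size_t VF,
                            std::optional<unsigned> MaxEVLSafeElements) {
  assert(A.size() >= N + D && "array too small for the accessed range");
  for (std::size_t I = 0; I < N;) {
    std::size_t EVL = std::min(N - I, VF);
    if (MaxEVLSafeElements)
      EVL = std::min<std::size_t>(EVL, *MaxEVLSafeElements);
    assert(EVL > 0 && "each step must make progress");
    // Vector semantics: all loads of a step complete before any store.
    std::vector<int> Loaded(A.begin() + I, A.begin() + I + EVL);
    for (std::size_t K = 0; K < EVL; ++K)
      A[I + D + K] = Loaded[K] + 1;
    I += EVL;
  }
  return A;
}
```

With `D = 3` and `VF = 8`, clamping to 2 elements per step reproduces the scalar result, while the unclamped loop reads elements that a scalar execution would already have overwritten.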
|
|
||
(Can select and save 4 options instead of 2 ... [may IV update overflow or not] x [is VF fixed or scalable] ... but it is better to simplify this than to complicate it further.)
Still pending?
I do not understand how this is related.