-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[LV] Count cost of middle block if TC <= VF. #168949
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
243c8d7
d5aa2ae
5a9560d
423b872
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -9283,6 +9283,14 @@ static bool isOutsideLoopWorkProfitable(GeneratedRTChecks &Checks, | |
| // one exists. | ||
| TotalCost += calculateEarlyExitCost(CostCtx, Plan, VF.Width); | ||
|
|
||
| // If the expected trip count is less than the VF, the vector loop will only | ||
| // execute a single iteration. Then the middle block is executed the same | ||
| // number of times as the vector region. | ||
| // TODO: Extend logic to always account for the cost of the middle block. | ||
| auto ExpectedTC = getSmallBestKnownTC(PSE, L); | ||
| if (ExpectedTC && ElementCount::isKnownLE(*ExpectedTC, VF.Width)) | ||
| TotalCost += Plan.getMiddleBlock()->cost(VF.Width, CostCtx); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you update the comment above the function Also the comment further down: needs updating too I think.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Both should be updated, thanks! |
||
|
|
||
| // When interleaving only scalar and vector cost will be equal, which in turn | ||
| // would lead to a divide by 0. Fall back to hard threshold. | ||
| if (VF.Width.isScalar()) { | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code now seems inconsistent with the sum of
calculateEarlyExitCostandChecks.getCost(), which is always adding on the cost of work in the early exits regardless of trip count. Also, the code below already seems to have proper logic to handle this more complelely, i.e.I don't know why we shouldn't just always add on the cost to be consistent with the existing code that expects it? It feels really odd to treat the middle block differently to the runtime check block and early exit block code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, this is intentional in the initial patch (from the description:
Note that isOutsideLoopWorkProfitable already scales the cost of blocks outside the vector region, but the patch restricts accounting for the middle block to cases where VF <= ExpectedTC, to initially catch some worst cases and avoid regressions.)The TODO is to enable it unconditionally, but that potentially has much higher impact, which is why I'd prefer to do this as a separate commit, to make it slightly easier to narrow down regressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK fair enough. If there is going to be follow-on work to do this unconditionally I'm happy with that!