-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[LoopUnroll] Fix block frequencies for epilogue #159163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: users/jdenny-ornl/fix-blockfreq-unroll-no-runtime
Are you sure you want to change the base?
[LoopUnroll] Fix block frequencies for epilogue #159163
Conversation
As another step in issue #135812, this patch fixes block frequencies for partial loop unrolling with an epilogue remainder loop. It does not fully handle the case when the epilogue loop itself is unrolled. That will be handled in the next patch. For the guard and latch of each of the unrolled loop and epilogue loop, this patch sets branch weights derived directly from the original loop latch branch weights. The total frequency of the original loop body, summed across all its occurrences in the unrolled loop and epilogue loop, is the same as in the original loop. This patch also sets `llvm.loop.estimated_trip_count` for the epilogue loop instead of relying on the epilogue's latch branch weights to imply it. This patch removes the XFAIL directives that PR #157754 added to the test suite.
@llvm/pr-subscribers-llvm-support @llvm/pr-subscribers-llvm-transforms Author: Joel E. Denny (jdenny-ornl) ChangesAs another step in issue #135812, this patch fixes block frequencies for partial loop unrolling with an epilogue remainder loop. It does not fully handle the case when the epilogue loop itself is unrolled. That will be handled in the next patch. For the guard and latch of each of the unrolled loop and epilogue loop, this patch sets branch weights derived directly from the original loop latch branch weights. The total frequency of the original loop body, summed across all its occurrences in the unrolled loop and epilogue loop, is the same as in the original loop. This patch also sets This patch removes the XFAIL directives that PR #157754 added to the test suite. Patch is 38.64 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/159163.diff 10 Files Affected:
diff --git a/llvm/include/llvm/Transforms/Utils/LoopUtils.h b/llvm/include/llvm/Transforms/Utils/LoopUtils.h
index c5dbb2bdd1dd8..71754b8f62a16 100644
--- a/llvm/include/llvm/Transforms/Utils/LoopUtils.h
+++ b/llvm/include/llvm/Transforms/Utils/LoopUtils.h
@@ -365,6 +365,38 @@ LLVM_ABI bool setLoopEstimatedTripCount(
Loop *L, unsigned EstimatedTripCount,
std::optional<unsigned> EstimatedLoopInvocationWeight = std::nullopt);
+/// Based on branch weight metadata, return either:
+/// - \c std::nullopt if the implementation is unable to handle the loop form
+/// of \p L (e.g., \p L must have a latch block that controls the loop exit).
+/// - Else, the estimated probability that, at the end of any iteration, the
+/// latch of \p L will start another iteration. The result \c P is such that
+/// `0 <= P <= 1`, and `1 - P` is the probability of exiting the loop.
+std::optional<double> getLoopProbability(Loop *L);
+
+/// Set branch weight metadata for the latch of \p L to indicate that, at the
+/// end of any iteration, its estimated probability of starting another
+/// iteration is \p P. Return false if the implementation is unable to handle
+/// the loop form of \p L (e.g., \p L must have a latch block that controls the
+/// loop exit). Otherwise, return true.
+bool setLoopProbability(Loop *L, double P);
+
+/// Based on branch weight metadata, return either:
+/// - \c std::nullopt if the implementation cannot extract the probability
+/// (e.g., \p B must have exactly two target labels, so it must be a
+/// conditional branch).
+/// - The probability \c P that control flows from \p B to its first target
+/// label such that `1 - P` is the probability of control flowing to its
+/// second target label, or vice-versa if \p ForFirstTarget is false.
+std::optional<double> getBranchProbability(BranchInst *B, bool ForFirstTarget);
+
+/// Set branch weight metadata for \p B to indicate that \p P and `1 - P` are
+/// the probabilities of control flowing to its first and second target labels,
+/// respectively, or vice-versa if \p ForFirstTarget is false. Return false if
+/// the implementation cannot set the probability (e.g., \p B must have exactly
+/// two target labels, so it must be a conditional branch). Otherwise, return
+/// true.
+bool setBranchProbability(BranchInst *B, double P, bool ForFirstTarget);
+
/// Check inner loop (L) backedge count is known to be invariant on all
/// iterations of its outer loop. If the loop has no parent, this is trivially
/// true.
diff --git a/llvm/include/llvm/Transforms/Utils/UnrollLoop.h b/llvm/include/llvm/Transforms/Utils/UnrollLoop.h
index 871c13d972470..571a0af6fd0db 100644
--- a/llvm/include/llvm/Transforms/Utils/UnrollLoop.h
+++ b/llvm/include/llvm/Transforms/Utils/UnrollLoop.h
@@ -97,7 +97,9 @@ LLVM_ABI bool UnrollRuntimeLoopRemainder(
LoopInfo *LI, ScalarEvolution *SE, DominatorTree *DT, AssumptionCache *AC,
const TargetTransformInfo *TTI, bool PreserveLCSSA,
unsigned SCEVExpansionBudget, bool RuntimeUnrollMultiExit,
- Loop **ResultLoop = nullptr);
+ Loop **ResultLoop = nullptr,
+ std::optional<unsigned> OriginalTripCount = std::nullopt,
+ std::optional<double> OriginalLoopProb = std::nullopt);
LLVM_ABI LoopUnrollResult UnrollAndJamLoop(
Loop *L, unsigned Count, unsigned TripCount, unsigned TripMultiple,
diff --git a/llvm/lib/Transforms/Utils/LoopUnroll.cpp b/llvm/lib/Transforms/Utils/LoopUnroll.cpp
index 0192ce6a960ff..b40d81100a2bd 100644
--- a/llvm/lib/Transforms/Utils/LoopUnroll.cpp
+++ b/llvm/lib/Transforms/Utils/LoopUnroll.cpp
@@ -65,6 +65,7 @@
#include "llvm/Transforms/Utils/UnrollLoop.h"
#include "llvm/Transforms/Utils/ValueMapper.h"
#include <assert.h>
+#include <cmath>
#include <numeric>
#include <type_traits>
#include <vector>
@@ -501,6 +502,7 @@ llvm::UnrollLoop(Loop *L, UnrollLoopOptions ULO, LoopInfo *LI,
const bool MaxOrZero = SE->isBackedgeTakenCountMaxOrZero(L);
std::optional<unsigned> OriginalTripCount =
llvm::getLoopEstimatedTripCount(L);
+ std::optional<double> OriginalLoopProb = llvm::getLoopProbability(L);
// Effectively "DCE" unrolled iterations that are beyond the max tripcount
// and will never be executed.
@@ -591,11 +593,11 @@ llvm::UnrollLoop(Loop *L, UnrollLoopOptions ULO, LoopInfo *LI,
: isEpilogProfitable(L);
if (ULO.Runtime &&
- !UnrollRuntimeLoopRemainder(L, ULO.Count, ULO.AllowExpensiveTripCount,
- EpilogProfitability, ULO.UnrollRemainder,
- ULO.ForgetAllSCEV, LI, SE, DT, AC, TTI,
- PreserveLCSSA, ULO.SCEVExpansionBudget,
- ULO.RuntimeUnrollMultiExit, RemainderLoop)) {
+ !UnrollRuntimeLoopRemainder(
+ L, ULO.Count, ULO.AllowExpensiveTripCount, EpilogProfitability,
+ ULO.UnrollRemainder, ULO.ForgetAllSCEV, LI, SE, DT, AC, TTI,
+ PreserveLCSSA, ULO.SCEVExpansionBudget, ULO.RuntimeUnrollMultiExit,
+ RemainderLoop, OriginalTripCount, OriginalLoopProb)) {
if (ULO.Force)
ULO.Runtime = false;
else {
@@ -1130,13 +1132,13 @@ llvm::UnrollLoop(Loop *L, UnrollLoopOptions ULO, LoopInfo *LI,
LI->erase(L);
// We shouldn't try to use `L` anymore.
L = nullptr;
- } else if (OriginalTripCount) {
+ } else {
// Update metadata for the loop's branch weights and estimated trip count:
// - If ULO.Runtime, UnrollRuntimeLoopRemainder sets the guard branch
// weights, latch branch weights, and estimated trip count of the
// remainder loop it creates. It also sets the branch weights for the
// unrolled loop guard it creates. The branch weights for the unrolled
- // loop latch are adjusted below. FIXME: Actually handle ULO.Runtime.
+ // loop latch are adjusted below. FIXME: Handle prologue loops.
// - Otherwise, if unrolled loop iteration latches become unconditional,
// branch weights are adjusted above. FIXME: Actually handle such
// unconditional latches.
@@ -1159,10 +1161,17 @@ llvm::UnrollLoop(Loop *L, UnrollLoopOptions ULO, LoopInfo *LI,
// the unrolled loop as a whole without considering the branch weights for
// each unrolled iteration's latch within it, we store the new trip count as
// separate metadata.
- unsigned NewTripCount = *OriginalTripCount / ULO.Count;
- if (!ULO.Runtime && *OriginalTripCount % ULO.Count)
- NewTripCount += 1;
- setLoopEstimatedTripCount(L, NewTripCount);
+ if (OriginalLoopProb && ULO.Runtime && EpilogProfitability) {
+ // Where p is always the probability of executing at least 1 more
+ // iteration, the probability for at least n more iterations is p^n.
+ setLoopProbability(L, pow(*OriginalLoopProb, ULO.Count));
+ }
+ if (OriginalTripCount) {
+ unsigned NewTripCount = *OriginalTripCount / ULO.Count;
+ if (!ULO.Runtime && *OriginalTripCount % ULO.Count)
+ NewTripCount += 1;
+ setLoopEstimatedTripCount(L, NewTripCount);
+ }
}
// LoopInfo should not be valid, confirm that.
diff --git a/llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp b/llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp
index 6312831cf0ee0..9ba93fcade2bd 100644
--- a/llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp
+++ b/llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp
@@ -40,6 +40,7 @@
#include "llvm/Transforms/Utils/LoopUtils.h"
#include "llvm/Transforms/Utils/ScalarEvolutionExpander.h"
#include "llvm/Transforms/Utils/UnrollLoop.h"
+#include <cmath>
using namespace llvm;
@@ -195,6 +196,20 @@ static void ConnectProlog(Loop *L, Value *BECount, unsigned Count,
}
}
+/// Assume, due to our position in the remainder loop or its guard, anywhere
+/// from 0 to \p N more iterations can possibly execute. Among such cases in
+/// the original loop (with loop probability \p OriginalLoopProb), what is the
+/// probability of executing at least one more iteration?
+static double probOfNextInRemainder(double OriginalLoopProb, unsigned N) {
+ // Each of these variables holds the original loop's probability that the
+ // number of iterations it will execute is some m in the specified range.
+ double ProbOne = OriginalLoopProb; // 1 <= m
+ double ProbTooMany = pow(ProbOne, N + 1); // N + 1 <= m
+ double ProbNotTooMany = 1 - ProbTooMany; // 0 <= m <= N
+ double ProbOneNotTooMany = ProbOne - ProbTooMany; // 1 <= m <= N
+ return ProbOneNotTooMany / ProbNotTooMany;
+}
+
/// Connect the unrolling epilog code to the original loop.
/// The unrolling epilog code contains code to execute the
/// 'extra' iterations if the run-time trip count modulo the
@@ -221,7 +236,8 @@ static void ConnectEpilog(Loop *L, Value *ModVal, BasicBlock *NewExit,
BasicBlock *EpilogPreHeader, BasicBlock *NewPreHeader,
ValueToValueMapTy &VMap, DominatorTree *DT,
LoopInfo *LI, bool PreserveLCSSA, ScalarEvolution &SE,
- unsigned Count, AssumptionCache &AC) {
+ unsigned Count, AssumptionCache &AC,
+ std::optional<double> OriginalLoopProb) {
BasicBlock *Latch = L->getLoopLatch();
assert(Latch && "Loop must have a latch");
BasicBlock *EpilogLatch = cast<BasicBlock>(VMap[Latch]);
@@ -332,12 +348,18 @@ static void ConnectEpilog(Loop *L, Value *ModVal, BasicBlock *NewExit,
PreserveLCSSA);
// Add the branch to the exit block (around the epilog loop)
MDNode *BranchWeights = nullptr;
- if (hasBranchWeightMD(*Latch->getTerminator())) {
+ if (!OriginalLoopProb && hasBranchWeightMD(*Latch->getTerminator())) {
// Assume equal distribution in interval [0, Count).
MDBuilder MDB(B.getContext());
BranchWeights = MDB.createBranchWeights(1, Count - 1);
}
- B.CreateCondBr(BrLoopExit, EpilogPreHeader, Exit, BranchWeights);
+ BranchInst *RemainderLoopGuard =
+ B.CreateCondBr(BrLoopExit, EpilogPreHeader, Exit, BranchWeights);
+ if (OriginalLoopProb) {
+ setBranchProbability(RemainderLoopGuard,
+ probOfNextInRemainder(*OriginalLoopProb, Count - 1),
+ /*ForFirstTarget=*/true);
+ }
InsertPt->eraseFromParent();
if (DT) {
auto *NewDom = DT->findNearestCommonDominator(Exit, NewExit);
@@ -357,14 +379,15 @@ static void ConnectEpilog(Loop *L, Value *ModVal, BasicBlock *NewExit,
/// The cloned blocks should be inserted between InsertTop and InsertBot.
/// InsertTop should be new preheader, InsertBot new loop exit.
/// Returns the new cloned loop that is created.
-static Loop *
-CloneLoopBlocks(Loop *L, Value *NewIter, const bool UseEpilogRemainder,
- const bool UnrollRemainder,
- BasicBlock *InsertTop,
- BasicBlock *InsertBot, BasicBlock *Preheader,
+static Loop *CloneLoopBlocks(Loop *L, Value *NewIter,
+ const bool UseEpilogRemainder,
+ const bool UnrollRemainder, BasicBlock *InsertTop,
+ BasicBlock *InsertBot, BasicBlock *Preheader,
std::vector<BasicBlock *> &NewBlocks,
LoopBlocksDFS &LoopBlocks, ValueToValueMapTy &VMap,
- DominatorTree *DT, LoopInfo *LI, unsigned Count) {
+ DominatorTree *DT, LoopInfo *LI, unsigned Count,
+ std::optional<unsigned> OriginalTripCount,
+ std::optional<double> OriginalLoopProb) {
StringRef suffix = UseEpilogRemainder ? "epil" : "prol";
BasicBlock *Header = L->getHeader();
BasicBlock *Latch = L->getLoopLatch();
@@ -419,7 +442,8 @@ CloneLoopBlocks(Loop *L, Value *NewIter, const bool UseEpilogRemainder,
Builder.CreateAdd(NewIdx, One, NewIdx->getName() + ".next");
Value *IdxCmp = Builder.CreateICmpNE(IdxNext, NewIter, NewIdx->getName() + ".cmp");
MDNode *BranchWeights = nullptr;
- if (hasBranchWeightMD(*LatchBR)) {
+ if (!(OriginalLoopProb && UseEpilogRemainder) &&
+ hasBranchWeightMD(*LatchBR)) {
uint32_t ExitWeight;
uint32_t BackEdgeWeight;
if (Count >= 3) {
@@ -437,7 +461,25 @@ CloneLoopBlocks(Loop *L, Value *NewIter, const bool UseEpilogRemainder,
MDBuilder MDB(Builder.getContext());
BranchWeights = MDB.createBranchWeights(BackEdgeWeight, ExitWeight);
}
- Builder.CreateCondBr(IdxCmp, FirstLoopBB, InsertBot, BranchWeights);
+ BranchInst *RemainderLoopLatch =
+ Builder.CreateCondBr(IdxCmp, FirstLoopBB, InsertBot, BranchWeights);
+ if (OriginalLoopProb && UseEpilogRemainder) {
+ // Compute the total frequency of the original loop body from the
+ // remainder iterations. Once we've reached them, the first of them
+ // always executes, so it's frequency and probability are 1.
+ double FreqRemIters = 1;
+ if (Count > 2) {
+ double ProbReaching = 1;
+ for (unsigned N = Count - 2; N >= 1; --N) {
+ ProbReaching *= probOfNextInRemainder(*OriginalLoopProb, N);
+ FreqRemIters += ProbReaching;
+ }
+ }
+ // Solve for the loop probability that would produce that frequency.
+ // Sum(i=0..inf)(Prob^i) = 1/(1-Prob) = FreqRemIters.
+ double Prob = 1 - 1 / FreqRemIters;
+ setBranchProbability(RemainderLoopLatch, Prob, /*ForFirstTarget=*/true);
+ }
NewIdx->addIncoming(Zero, InsertTop);
NewIdx->addIncoming(IdxNext, NewBB);
LatchBR->eraseFromParent();
@@ -469,6 +511,8 @@ CloneLoopBlocks(Loop *L, Value *NewIter, const bool UseEpilogRemainder,
std::optional<MDNode *> NewLoopID = makeFollowupLoopID(
LoopID, {LLVMLoopUnrollFollowupAll, LLVMLoopUnrollFollowupRemainder});
+ if (OriginalTripCount && UseEpilogRemainder)
+ setLoopEstimatedTripCount(NewLoop, *OriginalTripCount % Count);
if (NewLoopID) {
NewLoop->setLoopID(*NewLoopID);
@@ -603,7 +647,8 @@ bool llvm::UnrollRuntimeLoopRemainder(
LoopInfo *LI, ScalarEvolution *SE, DominatorTree *DT, AssumptionCache *AC,
const TargetTransformInfo *TTI, bool PreserveLCSSA,
unsigned SCEVExpansionBudget, bool RuntimeUnrollMultiExit,
- Loop **ResultLoop) {
+ Loop **ResultLoop, std::optional<unsigned> OriginalTripCount,
+ std::optional<double> OriginalLoopProb) {
LLVM_DEBUG(dbgs() << "Trying runtime unrolling on Loop: \n");
LLVM_DEBUG(L->dump());
LLVM_DEBUG(UseEpilogRemainder ? dbgs() << "Using epilog remainder.\n"
@@ -823,12 +868,23 @@ bool llvm::UnrollRuntimeLoopRemainder(
BasicBlock *UnrollingLoop = UseEpilogRemainder ? NewPreHeader : PrologExit;
// Branch to either remainder (extra iterations) loop or unrolling loop.
MDNode *BranchWeights = nullptr;
- if (hasBranchWeightMD(*Latch->getTerminator())) {
+ if (!(OriginalLoopProb && UseEpilogRemainder) &&
+ hasBranchWeightMD(*Latch->getTerminator())) {
// Assume loop is nearly always entered.
MDBuilder MDB(B.getContext());
BranchWeights = MDB.createBranchWeights(EpilogHeaderWeights);
}
- B.CreateCondBr(BranchVal, RemainderLoop, UnrollingLoop, BranchWeights);
+ BranchInst *UnrollingLoopGuard =
+ B.CreateCondBr(BranchVal, RemainderLoop, UnrollingLoop, BranchWeights);
+ if (OriginalLoopProb && UseEpilogRemainder) {
+ // The original loop's first iteration always happens. Compute the
+ // probability of the original loop executing Count-1 iterations after that
+ // to complete the first iteration of the unrolled loop.
+ double ProbOne = *OriginalLoopProb;
+ double ProbRest = pow(ProbOne, Count - 1);
+ setBranchProbability(UnrollingLoopGuard, ProbRest,
+ /*ForFirstTarget=*/false);
+ }
PreHeaderBR->eraseFromParent();
if (DT) {
if (UseEpilogRemainder)
@@ -855,9 +911,10 @@ bool llvm::UnrollRuntimeLoopRemainder(
// iterations. This function adds the appropriate CFG connections.
BasicBlock *InsertBot = UseEpilogRemainder ? LatchExit : PrologExit;
BasicBlock *InsertTop = UseEpilogRemainder ? EpilogPreHeader : PrologPreHeader;
- Loop *remainderLoop = CloneLoopBlocks(
- L, ModVal, UseEpilogRemainder, UnrollRemainder, InsertTop, InsertBot,
- NewPreHeader, NewBlocks, LoopBlocks, VMap, DT, LI, Count);
+ Loop *remainderLoop =
+ CloneLoopBlocks(L, ModVal, UseEpilogRemainder, UnrollRemainder, InsertTop,
+ InsertBot, NewPreHeader, NewBlocks, LoopBlocks, VMap, DT,
+ LI, Count, OriginalTripCount, OriginalLoopProb);
// Insert the cloned blocks into the function.
F->splice(InsertBot->getIterator(), F, NewBlocks[0]->getIterator(), F->end());
@@ -956,7 +1013,8 @@ bool llvm::UnrollRuntimeLoopRemainder(
// Connect the epilog code to the original loop and update the
// PHI functions.
ConnectEpilog(L, ModVal, NewExit, LatchExit, PreHeader, EpilogPreHeader,
- NewPreHeader, VMap, DT, LI, PreserveLCSSA, *SE, Count, *AC);
+ NewPreHeader, VMap, DT, LI, PreserveLCSSA, *SE, Count, *AC,
+ OriginalLoopProb);
// Update counter in loop for unrolling.
// Use an incrementing IV. Pre-incr/post-incr is backedge/trip count.
diff --git a/llvm/lib/Transforms/Utils/LoopUtils.cpp b/llvm/lib/Transforms/Utils/LoopUtils.cpp
index 7b1a7ce6995f8..4c05774f7f48c 100644
--- a/llvm/lib/Transforms/Utils/LoopUtils.cpp
+++ b/llvm/lib/Transforms/Utils/LoopUtils.cpp
@@ -46,6 +46,7 @@
#include "llvm/Transforms/Utils/BasicBlockUtils.h"
#include "llvm/Transforms/Utils/Local.h"
#include "llvm/Transforms/Utils/ScalarEvolutionExpander.h"
+#include <cmath>
using namespace llvm;
using namespace llvm::PatternMatch;
@@ -972,6 +973,53 @@ bool llvm::setLoopEstimatedTripCount(
return true;
}
+std::optional<double> llvm::getLoopProbability(Loop *L) {
+ BranchInst *LatchBranch = getExpectedExitLoopLatchBranch(L);
+ if (!LatchBranch)
+ return std::nullopt;
+ bool FirstTargetIsLoop = LatchBranch->getSuccessor(0) == L->getHeader();
+ return getBranchProbability(LatchBranch, FirstTargetIsLoop);
+}
+
+bool llvm::setLoopProbability(Loop *L, double P) {
+ BranchInst *LatchBranch = getExpectedExitLoopLatchBranch(L);
+ if (!LatchBranch)
+ return false;
+ bool FirstTargetIsLoop = LatchBranch->getSuccessor(0) == L->getHeader();
+ return setBranchProbability(LatchBranch, P, FirstTargetIsLoop);
+}
+
+std::optional<double> llvm::getBranchProbability(BranchInst *B,
+ bool ForFirstTarget) {
+ if (B->getNumSuccessors() != 2)
+ return std::nullopt;
+ uint64_t Weight0, Weight1;
+ if (!extractBranchWeights(*B, Weight0, Weight1))
+ return std::nullopt;
+ if (!ForFirstTarget)
+ std::swap(Weight0, Weight1);
+ return double(Weight0) / (double(Weight0) + double(Weight1));
+}
+
+bool llvm::setBranchProbability(BranchInst *B, double P, bool ForFirstTarget) {
+ if (B->getNumSuccessors() != 2)
+ return false;
+
+ // Sum should be some large number so that the weights accurately encode P,
+ // but it should not be so large that some branch weights will print as
+ // negative in LLVM IR as that makes LLVM tests harder to maintain.
+ const uint64_t Sum = 1000000000;
+ uint64_t Weight0 = round(P * Sum);
+ uint64_t Weight1 = round((1 - P) * Sum);
+ if (!ForFirstTarget)
+ std::swap(Weight0, Weight1);
+
+ MDBuilder MDB(B->getContext());
+ B->setMetadata(LLVMContext::MD_prof,
+ MDB.createBranchWeights(Weight0, Weight1));
+ return true;
+}
+
bool llvm::hasIterationCountInvariantInParent(Loop *InnerLoop,
ScalarEvolution &SE) {
Loop *OuterL = InnerLoop->getParentLoop();
diff --git a/llvm/test/Transforms/LoopUnroll/branch-weights-freq/unroll-epilog.ll b/llvm/test/Transforms/LoopUnroll/branch-weights-freq/unroll-epilog.ll
new file mode 100644
index 0000000000000..c7c91c6f6b815
--- /dev/null
+++ b/llvm/test/Transforms/LoopUnroll/branch-weights-freq/unroll-epilog.ll
@@ -0,0 +1,160 @@
+; Test branch weight metadata, estimated trip count metadata, and block
+; frequencies after loop unrolling with an epilogue.
+
+; ------------------------------------------------------------------------...
[truncated]
|
/// - The probability \c P that control flows from \p B to its first target | ||
/// label such that `1 - P` is the probability of control flowing to its | ||
/// second target label, or vice-versa if \p ForFirstTarget is false. | ||
std::optional<double> getBranchProbability(BranchInst *B, bool ForFirstTarget); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this in terms of double and not BranchProbability?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose BranchProbability works. Thanks. Done.
Make them more consistent, and don't explain `0 <= P <= 1` as that's already required by BranchProbability.
… users/jdenny-ornl/fix-blockfreq-unroll-epilogue
As another step in issue #135812, this patch fixes block frequencies for partial loop unrolling with an epilogue remainder loop. It does not fully handle the case when the epilogue loop itself is unrolled. That will be handled in the next patch.
For the guard and latch of each of the unrolled loop and epilogue loop, this patch sets branch weights derived directly from the original loop latch branch weights. The total frequency of the original loop body, summed across all its occurrences in the unrolled loop and epilogue loop, is the same as in the original loop. This patch also sets
llvm.loop.estimated_trip_count
for the epilogue loop instead of relying on the epilogue's latch branch weights to imply it.This patch fixes branch weights in tests that PR #157754 adversely affected.