Commit 859b84d

Merge branch 'pgo-estimated-trip-count' into fix-peel-branch-weights
2 parents e250cfc + 13d1fbb commit 859b84d

15 files changed, +639 -114 lines changed

llvm/docs/LangRef.rst

Lines changed: 52 additions & 20 deletions
@@ -7936,31 +7936,63 @@ loop distribution pass. See
79367936
'``llvm.loop.estimated_trip_count``' Metadata
79377937
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
79387938

7939-
This metadata records the loop's estimated trip count. The first
7940-
operand is the string ``llvm.loop.estimated_trip_count`` and the
7941-
second operand is an integer specifying the count. For example:
7939+
This metadata records an estimated trip count for the loop. The first operand
7940+
is the string ``llvm.loop.estimated_trip_count``. The second operand is an
7941+
integer specifying the count, which might be omitted for the reasons described
7942+
below. For example:
79427943

79437944
.. code-block:: llvm
79447945

79457946
!0 = !{!"llvm.loop.estimated_trip_count", i32 8}
7947+
!1 = !{!"llvm.loop.estimated_trip_count"}
79467948

7947-
A loop's estimated trip count is an estimate of the average number of
7948-
loop iterations (specifically, the number of times the loop's header
7949-
executes) each time execution reaches the loop. It is usually only an
7950-
estimate based on, for example, profile data. The actual number of
7951-
iterations might vary widely.
7952-
7953-
The estimated trip count serves as a parameter for various loop
7954-
transformations and typically helps estimate transformation cost. For
7955-
example, it can help determine how many iterations to peel or how
7956-
aggressively to unroll.
7957-
7958-
If this metadata is not present, such passes compute the estimated
7959-
trip count from any ``branch_weights`` metadata attached to the latch
7960-
block's branch instruction. Thus, this metadata frees loop
7961-
transformations to compute latch branch weights solely for the purpose
7962-
of maintaining accurate block frequencies instead of requiring the
7963-
branch weights to always serve both roles.
7949+
Purpose
7950+
"""""""
7951+
7952+
A loop's estimated trip count is an estimate of the average number of loop
7953+
iterations (specifically, the number of times the loop's header executes) each
7954+
time execution reaches the loop. It is usually only an estimate based on, for
7955+
example, profile data. The actual number of iterations might vary widely.
7956+
7957+
The estimated trip count serves as a parameter for various loop transformations
7958+
and typically helps estimate transformation cost. For example, it can help
7959+
determine how many iterations to peel or how aggressively to unroll.
7960+
7961+
Initialization and Maintenance
7962+
""""""""""""""""""""""""""""""
7963+
7964+
The ``pgo-estimate-trip-counts`` pass typically runs immediately after profile
7965+
ingestion to add this metadata to all loops. It estimates each loop's trip
7966+
count from the loop's ``branch_weights`` metadata. This way of initially
7967+
estimating trip counts appears to be useful for the passes that consume them.
7968+
7969+
As passes transform existing loops and create new loops, they must be free to
7970+
update and create ``branch_weights`` metadata to maintain accurate block
7971+
frequencies. Trip counts estimated from this new ``branch_weights`` metadata
7972+
are not necessarily useful to the passes that consume them. In general, when
7973+
passes transform and create loops, they should separately estimate new trip
7974+
counts from previously estimated trip counts, and they should record them by
7975+
creating or updating this metadata. For this or any other work involving
7976+
estimated trip counts, passes should always call
7977+
``llvm::getLoopEstimatedTripCount`` and ``llvm::setLoopEstimatedTripCount``.
7978+
7979+
Missing Metadata and Values
7980+
"""""""""""""""""""""""""""
7981+
7982+
If the current implementation of ``pgo-estimate-trip-counts`` cannot estimate a
7983+
trip count from the loop's ``branch_weights`` metadata due to the loop's form or
7984+
due to missing profile data, it creates this metadata for the loop but omits the
7985+
value. This situation is currently common (e.g., the LLVM IR loop that Clang
7986+
emits for a simple C ``for`` loop). A later pass (e.g., ``loop-rotate``) might
7987+
modify the loop's form in a way that enables estimating its trip count even if
7988+
those modifications provably never impact the actual number of loop iterations.
7989+
That later pass should then add an appropriate value to the metadata.
7990+
7991+
However, not all such passes currently do so. Thus, if this metadata has no
7992+
value, ``llvm::getLoopEstimatedTripCount`` will disregard it and estimate the
7993+
trip count from the loop's ``branch_weights`` metadata. It does the same when
7994+
the metadata is missing altogether, perhaps because ``pgo-estimate-trip-counts``
7995+
was not specified in a minimal pass list to a tool like ``opt``.
79647996

79657997
'``llvm.licm.disable``' Metadata
79667998
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
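
Reviewer sketch for the ``llvm.loop.estimated_trip_count`` text in the LangRef.rst hunk above: the lookup order it describes can be modeled in a few lines of standalone C++. This is only an illustration. ``TripCountMetadata`` and ``resolveEstimatedTripCount`` are hypothetical names that do not appear in LLVM, and the branch-weight estimate is passed in as a plain parameter rather than computed.

```cpp
#include <cassert>
#include <optional>

// Hypothetical model of the metadata's state on one loop; not LLVM API.
struct TripCountMetadata {
  bool Present;                  // false: the metadata is missing altogether
  std::optional<unsigned> Value; // nullopt: present, but the value is omitted
};

// Prefer a recorded value; otherwise fall back to the estimate derived from
// branch weights, which may itself be unavailable (nullopt).
std::optional<unsigned>
resolveEstimatedTripCount(const TripCountMetadata &MD,
                          std::optional<unsigned> FromBranchWeights) {
  // Missing metadata cannot carry a value.
  assert(MD.Present || !MD.Value);
  if (MD.Present && MD.Value)
    return MD.Value;
  // Metadata without a value and missing metadata are treated the same way.
  return FromBranchWeights;
}
```

In this model, metadata with the value 8 wins over any branch-weight estimate, while value-less metadata and absent metadata both yield whatever the branch weights provide.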

llvm/include/llvm/Analysis/LoopInfo.h

Lines changed: 7 additions & 3 deletions
@@ -637,9 +637,13 @@ LLVM_ABI std::optional<bool> getOptionalBoolLoopAttribute(const Loop *TheLoop,
637637
/// Returns true if Name is applied to TheLoop and enabled.
638638
LLVM_ABI bool getBooleanLoopAttribute(const Loop *TheLoop, StringRef Name);
639639

640-
/// Find named metadata for a loop with an integer value.
641-
LLVM_ABI std::optional<int> getOptionalIntLoopAttribute(const Loop *TheLoop,
642-
StringRef Name);
640+
/// Find named metadata for a loop with an integer value. Return
641+
/// \c std::nullopt if the metadata has no value or is missing altogether. If
642+
/// \p Missing, set \c *Missing to indicate whether the metadata is missing
643+
/// altogether.
644+
LLVM_ABI std::optional<int>
645+
getOptionalIntLoopAttribute(const Loop *TheLoop, StringRef Name,
646+
bool *Missing = nullptr);
643647

644648
/// Find named metadata for a loop with an integer value. Return \p Default if
645649
/// not set.
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
1+
//===- PGOEstimateTripCounts.h ----------------------------------*- C++ -*-===//
2+
//
3+
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4+
// See https://llvm.org/LICENSE.txt for license information.
5+
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6+
//
7+
//===----------------------------------------------------------------------===//
8+
9+
#ifndef LLVM_TRANSFORMS_INSTRUMENTATION_PGOESTIMATETRIPCOUNTS_H
10+
#define LLVM_TRANSFORMS_INSTRUMENTATION_PGOESTIMATETRIPCOUNTS_H
11+
12+
#include "llvm/IR/PassManager.h"
13+
14+
namespace llvm {
15+
16+
struct PGOEstimateTripCountsPass
17+
: public PassInfoMixin<PGOEstimateTripCountsPass> {
18+
PGOEstimateTripCountsPass() {}
19+
PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
20+
};
21+
22+
} // namespace llvm
23+
24+
#endif // LLVM_TRANSFORMS_INSTRUMENTATION_PGOESTIMATETRIPCOUNTS_H

llvm/include/llvm/Transforms/Utils/LoopUtils.h

Lines changed: 63 additions & 27 deletions
@@ -316,37 +316,73 @@ LLVM_ABI TransformationMode hasDistributeTransformation(const Loop *L);
316316
LLVM_ABI TransformationMode hasLICMVersioningTransformation(const Loop *L);
317317
/// @}
318318

319-
/// Set input string into loop metadata by keeping other values intact.
320-
/// If the string is already in loop metadata update value if it is
321-
/// different.
322-
LLVM_ABI void addStringMetadataToLoop(Loop *TheLoop, const char *MDString,
323-
unsigned V = 0);
324-
325-
/// Returns a loop's estimated trip count based on
326-
/// llvm.loop.estimated_trip_count metadata or, if none, branch weight metadata.
327-
/// In addition if \p EstimatedLoopInvocationWeight is not null it is
328-
/// initialized with weight of loop's latch leading to the exit.
329-
/// Returns a valid positive trip count, saturated at UINT_MAX, or std::nullopt
330-
/// when a meaningful estimate cannot be made.
319+
/// Set the string \p MDString into the loop metadata of \p TheLoop while
320+
/// keeping other loop metadata intact. Set \p *V as its value, or set it
321+
/// without a value if \p V is \c std::nullopt to indicate the value is unknown.
322+
/// If \p MDString is already in the loop metadata, update it if its value (or
323+
/// lack of value) is different. Return true if metadata was changed.
324+
LLVM_ABI bool addStringMetadataToLoop(Loop *TheLoop, const char *MDString,
325+
std::optional<unsigned> V = 0);
326+
327+
/// Return either:
328+
/// - The value of \c llvm.loop.estimated_trip_count from the loop metadata of
329+
/// \p L, if that metadata is present and has a value.
330+
/// - Else, a new estimate of the trip count from the latch branch weights of
331+
/// \p L, if the estimation's implementation is able to handle the loop form
332+
/// of \p L (e.g., \p L must have a latch block that controls the loop exit).
333+
/// - Else, \c std::nullopt.
334+
///
335+
/// An estimated trip count is always a valid positive trip count, saturated at
336+
/// \c UINT_MAX.
337+
///
338+
/// Via \c LLVM_DEBUG, emit diagnostics that include "WARNING" when the metadata
339+
/// is in an unexpected state as that indicates some transformation has
340+
/// corrupted it. If \p DbgForInit, expect the metadata to be missing.
341+
/// Otherwise, expect the metadata to be present, and expect it to have no value
342+
/// only if the trip count is currently inestimable from the latch branch
343+
/// weights.
344+
///
345+
/// In addition, if \p EstimatedLoopInvocationWeight, then either:
346+
/// - Set \p *EstimatedLoopInvocationWeight to the weight of the latch's branch
347+
/// to the loop exit.
348+
/// - Do not set it and return \c std::nullopt if the current implementation
349+
/// cannot compute that weight (e.g., if \p L does not have a latch block that
350+
/// controls the loop exit) or the weight is zero (because zero cannot be
351+
/// used to compute new branch weights that reflect the estimated trip count).
352+
///
353+
/// TODO: Eventually, once all passes have migrated away from setting branch
354+
/// weights to indicate estimated trip counts, this function will drop the
355+
/// \p EstimatedLoopInvocationWeight parameter.
331356
LLVM_ABI std::optional<unsigned>
332357
getLoopEstimatedTripCount(Loop *L,
333-
unsigned *EstimatedLoopInvocationWeight = nullptr);
334-
335-
/// Set a loop's llvm.loop.estimated_trip_count metadata and, if \p
336-
/// EstimatedLoopInvocationWeight, branch weight metadata to reflect that loop
337-
/// has \p EstimatedTripCount iterations and \p EstimatedLoopInvocationWeight
338-
/// exit weight through latch. Returns true if metadata is successfully updated,
339-
/// false otherwise. Note that loop must have a latch block which controls loop
340-
/// exit in order to succeed.
358+
unsigned *EstimatedLoopInvocationWeight = nullptr,
359+
bool DbgForInit = false);
360+
361+
/// Set \c llvm.loop.estimated_trip_count with the value \c *EstimatedTripCount
362+
/// in the loop metadata of \p L, or set it without a value if
363+
/// \c !EstimatedTripCount to indicate that \c getLoopEstimatedTripCount cannot
364+
/// estimate the trip count from latch branch weights. If
365+
/// \c !EstimatedTripCount but \c getLoopEstimatedTripCount can estimate the
366+
/// trip counts, future calls to \c getLoopEstimatedTripCount will diagnose the
367+
/// metadata as corrupt.
368+
///
369+
/// In addition, if \p EstimatedLoopInvocationWeight, set the branch weight
370+
/// metadata of \p L to reflect that \p L has an estimated
371+
/// \c *EstimatedTripCount iterations and has \c *EstimatedLoopInvocationWeight
372+
/// exit weight through the loop's latch.
373+
///
374+
/// Return false if \c llvm.loop.estimated_trip_count was already set according
375+
/// to \p EstimatedTripCount and so was not updated. Return false if
376+
/// \p EstimatedLoopInvocationWeight and if branch weight metadata could not be
377+
/// successfully updated (e.g., if \p L does not have a latch block that
378+
/// controls the loop exit). Otherwise, return true.
341379
///
342-
/// The use case for not setting branch weight metadata is when the original
343-
/// branch weight metadata is correct for computing block frequencies but the
344-
/// trip count has changed due to a loop transformation. The branch weight
345-
/// metadata cannot be adjusted to reflect the new trip count, so we store the
346-
/// new trip count separately.
380+
/// TODO: Eventually, once all passes have migrated away from setting branch
381+
/// weights to indicate estimated trip counts, this function will drop the
382+
/// \p EstimatedLoopInvocationWeight parameter.
347383
LLVM_ABI bool setLoopEstimatedTripCount(
348-
Loop *L, unsigned EstimatedTripCount,
349-
std::optional<unsigned> EstimatedLoopInvocationWeight);
384+
Loop *L, std::optional<unsigned> EstimatedTripCount,
385+
std::optional<unsigned> EstimatedLoopInvocationWeight = std::nullopt);
350386

351387
/// Check inner loop (L) backedge count is known to be invariant on all
352388
/// iterations of its outer loop. If the loop has no parent, this is trivially
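
Reviewer sketch for the ``getLoopEstimatedTripCount`` comment in the LoopUtils.h hunk above, which promises a positive estimate saturated at ``UINT_MAX`` and no result when the exit weight is zero. The diff does not show the formula, so the code below assumes one plausible choice: the ratio of the latch's backedge weight to its exit weight, rounded to nearest, plus one. ``estimateTripCountFromWeights`` is a hypothetical standalone function, not the LLVM implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Assumed formula: round(BackedgeWeight / ExitWeight) + 1, saturated.
std::optional<unsigned>
estimateTripCountFromWeights(std::uint64_t BackedgeWeight,
                             std::uint64_t ExitWeight) {
  // A zero exit weight gives no usable ratio.
  if (ExitWeight == 0)
    return std::nullopt;
  // Divide rounding to nearest, written so that no step can overflow.
  std::uint64_t Quot = BackedgeWeight / ExitWeight;
  std::uint64_t Rem = BackedgeWeight % ExitWeight;
  std::uint64_t ExitCount = Quot + (Rem >= ExitWeight - Rem ? 1 : 0);
  // Saturate instead of wrapping when the result does not fit in unsigned.
  constexpr unsigned Max = std::numeric_limits<unsigned>::max();
  if (ExitCount >= Max)
    return Max;
  unsigned TripCount = static_cast<unsigned>(ExitCount) + 1;
  assert(TripCount > 0 && "estimated trip counts are always positive");
  return TripCount;
}
```

Under this assumption, a backedge weight of 7 with an exit weight of 1 gives an estimate of 8, and a loop whose backedge is never taken still gets an estimate of 1 because the header executes once per entry.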

llvm/lib/Analysis/LoopInfo.cpp

Lines changed: 7 additions & 3 deletions
@@ -1112,9 +1112,13 @@ bool llvm::getBooleanLoopAttribute(const Loop *TheLoop, StringRef Name) {
11121112
}
11131113

11141114
std::optional<int> llvm::getOptionalIntLoopAttribute(const Loop *TheLoop,
1115-
StringRef Name) {
1116-
const MDOperand *AttrMD =
1117-
findStringMetadataForLoop(TheLoop, Name).value_or(nullptr);
1115+
StringRef Name,
1116+
bool *Missing) {
1117+
std::optional<const MDOperand *> AttrMDOpt =
1118+
findStringMetadataForLoop(TheLoop, Name);
1119+
if (Missing)
1120+
*Missing = !AttrMDOpt;
1121+
const MDOperand *AttrMD = AttrMDOpt.value_or(nullptr);
11181122
if (!AttrMD)
11191123
return std::nullopt;
11201124

llvm/lib/Passes/PassBuilder.cpp

Lines changed: 1 addition & 0 deletions
@@ -248,6 +248,7 @@
248248
#include "llvm/Transforms/Instrumentation/NumericalStabilitySanitizer.h"
249249
#include "llvm/Transforms/Instrumentation/PGOCtxProfFlattening.h"
250250
#include "llvm/Transforms/Instrumentation/PGOCtxProfLowering.h"
251+
#include "llvm/Transforms/Instrumentation/PGOEstimateTripCounts.h"
251252
#include "llvm/Transforms/Instrumentation/PGOForceFunctionAttrs.h"
252253
#include "llvm/Transforms/Instrumentation/PGOInstrumentation.h"
253254
#include "llvm/Transforms/Instrumentation/RealtimeSanitizer.h"

llvm/lib/Passes/PassBuilderPipelines.cpp

Lines changed: 8 additions & 2 deletions
@@ -80,6 +80,7 @@
8080
#include "llvm/Transforms/Instrumentation/MemProfUse.h"
8181
#include "llvm/Transforms/Instrumentation/PGOCtxProfFlattening.h"
8282
#include "llvm/Transforms/Instrumentation/PGOCtxProfLowering.h"
83+
#include "llvm/Transforms/Instrumentation/PGOEstimateTripCounts.h"
8384
#include "llvm/Transforms/Instrumentation/PGOForceFunctionAttrs.h"
8485
#include "llvm/Transforms/Instrumentation/PGOInstrumentation.h"
8586
#include "llvm/Transforms/Scalar/ADCE.h"
@@ -1268,8 +1269,13 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
12681269
MPM.addPass(MemProfUsePass(PGOOpt->MemoryProfile, PGOOpt->FS));
12691270

12701271
if (PGOOpt && (PGOOpt->Action == PGOOptions::IRUse ||
1271-
PGOOpt->Action == PGOOptions::SampleUse))
1272+
PGOOpt->Action == PGOOptions::SampleUse)) {
12721273
MPM.addPass(PGOForceFunctionAttrsPass(PGOOpt->ColdOptType));
1274+
// TODO: Is this the right place for this pass? Should we enable it in any
1275+
// other case, such as when __builtin_expect_with_probability or
1276+
// __builtin_expect appears in the source code but profiles are not read?
1277+
MPM.addPass(PGOEstimateTripCountsPass());
1278+
}
12731279

12741280
MPM.addPass(AlwaysInlinerPass(/*InsertLifetimeIntrinsics=*/true));
12751281

@@ -2355,4 +2361,4 @@ AAManager PassBuilder::buildDefaultAAPipeline() {
23552361
bool PassBuilder::isInstrumentedPGOUse() const {
23562362
return (PGOOpt && PGOOpt->Action == PGOOptions::IRUse) ||
23572363
!UseCtxProfile.empty();
2358-
}
2364+
}

llvm/lib/Passes/PassRegistry.def

Lines changed: 1 addition & 0 deletions
@@ -124,6 +124,7 @@ MODULE_PASS("openmp-opt", OpenMPOptPass())
124124
MODULE_PASS("openmp-opt-postlink",
125125
OpenMPOptPass(ThinOrFullLTOPhase::FullLTOPostLink))
126126
MODULE_PASS("partial-inliner", PartialInlinerPass())
127+
MODULE_PASS("pgo-estimate-trip-counts", PGOEstimateTripCountsPass())
127128
MODULE_PASS("pgo-icall-prom", PGOIndirectCallPromotion())
128129
MODULE_PASS("pgo-instr-gen", PGOInstrumentationGen())
129130
MODULE_PASS("pgo-instr-use", PGOInstrumentationUse())

llvm/lib/Transforms/Instrumentation/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ add_llvm_component_library(LLVMInstrumentation
1616
LowerAllowCheckPass.cpp
1717
PGOCtxProfFlattening.cpp
1818
PGOCtxProfLowering.cpp
19+
PGOEstimateTripCounts.cpp
1920
PGOForceFunctionAttrs.cpp
2021
PGOInstrumentation.cpp
2122
PGOMemOPSizeOpt.cpp
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
1+
//===----------------------------------------------------------------------===//
2+
//
3+
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4+
// See https://llvm.org/LICENSE.txt for license information.
5+
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6+
//
7+
//===----------------------------------------------------------------------===//
8+
9+
#include "llvm/Transforms/Instrumentation/PGOEstimateTripCounts.h"
10+
#include "llvm/Analysis/LoopInfo.h"
11+
#include "llvm/IR/Module.h"
12+
#include "llvm/Transforms/Utils/LoopUtils.h"
13+
14+
using namespace llvm;
15+
16+
#define DEBUG_TYPE "pgo-estimate-trip-counts"
17+
18+
static bool runOnLoop(Loop *L) {
19+
bool MadeChange = false;
20+
std::optional<unsigned> TC = getLoopEstimatedTripCount(
21+
L, /*EstimatedLoopInvocationWeight=*/nullptr, /*DbgForInit=*/true);
22+
MadeChange |= setLoopEstimatedTripCount(L, TC);
23+
for (Loop *SL : *L)
24+
MadeChange |= runOnLoop(SL);
25+
return MadeChange;
26+
}
27+
28+
PreservedAnalyses PGOEstimateTripCountsPass::run(Module &M,
29+
ModuleAnalysisManager &AM) {
30+
FunctionAnalysisManager &FAM =
31+
AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
32+
bool MadeChange = false;
33+
LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": start\n");
34+
for (Function &F : M) {
35+
if (F.isDeclaration())
36+
continue;
37+
LoopInfo *LI = &FAM.getResult<LoopAnalysis>(F);
38+
if (!LI)
39+
continue;
40+
for (Loop *L : *LI)
41+
MadeChange |= runOnLoop(L);
42+
}
43+
LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": end\n");
44+
return MadeChange ? PreservedAnalyses::none() : PreservedAnalyses::all();
45+
}
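
Reviewer sketch of the traversal in ``runOnLoop`` above: each loop is annotated first, then the function recurses into its subloops and reports whether anything changed. The toy model below reproduces that shape without LLVM. ``ToyLoop`` and ``annotateLoopNest`` are hypothetical names, and the branch-weight estimate is stored as a plain field instead of being computed from metadata.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for llvm::Loop; none of these names are LLVM API.
struct ToyLoop {
  // Estimate derivable from branch weights, or nullopt if inestimable.
  std::optional<unsigned> WeightEstimate;
  // Whether the estimated-trip-count metadata has been attached.
  bool HasMetadata = false;
  // Value recorded in the metadata; nullopt models "present without a value".
  std::optional<unsigned> MetadataValue;
  // Immediate subloops.
  std::vector<ToyLoop> SubLoops;
};

// Attach metadata to L and every nested loop; return true if anything changed.
bool annotateLoopNest(ToyLoop &L) {
  // Estimated trip counts are positive whenever they exist.
  assert(!L.WeightEstimate || *L.WeightEstimate > 0);
  // A change occurs if metadata was absent or its recorded value differs.
  bool MadeChange = !L.HasMetadata || L.MetadataValue != L.WeightEstimate;
  L.HasMetadata = true;
  L.MetadataValue = L.WeightEstimate;
  for (ToyLoop &SL : L.SubLoops)
    MadeChange |= annotateLoopNest(SL);
  return MadeChange;
}
```

Running the function twice on the same nest reports a change only the first time, which mirrors how the pass reports ``PreservedAnalyses::all()`` when no metadata was updated.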
