Commit a919358

[memprof] Dump call site matching information
MemProfiler.cpp annotates the IR with the memory profile so that we can later duplicate context. Dumping the call site matching information here allows us to analyze how well we manage to annotate the IR.

Specifically, this patch dumps:

- the full stack ID (to identify the profile call stack)
- the index within the profile call stack where we start matching
- the size of InlinedCallStack

This way, we get to see which part of the profile call stack we are matching, not just one frame somewhere in the profile call stack.

Now, obtaining the full stack ID requires a little bit of refactoring. This patch modifies the value type of LocHashToCallSites so that it contains the full stack as well as the starting index of a match.

Essentially, this patch partially reverts:

  commit 7c294eb
  Author: Kazu Hirata <[email protected]>
  Date: Sat Dec 14 00:03:27 2024 -0800
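The refactoring described above can be illustrated with a minimal, standalone sketch. It is not the LLVM code: `Stack`, `hashFrames`, `SuffixHash`, and `recordAllSuffixes` are hypothetical stand-ins, a frame is reduced to a single integer, and the toy hash is not `computeFullStackId`. The sketch only shows the idea of keeping the full stack plus a start index in each set entry while hashing just the suffix that is matched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in: each frame is just an integer ID here.
using Stack = std::vector<uint64_t>;

// Toy FNV-1a-style hash over S[From..]; an assumption for illustration,
// not LLVM's computeFullStackId().
uint64_t hashFrames(const Stack &S, size_t From = 0) {
  assert(From <= S.size() && "start index out of range");
  uint64_t H = 1469598103934665603ULL;
  for (size_t I = From; I < S.size(); ++I) {
    H ^= S[I];
    H *= 1099511628211ULL;
  }
  return H;
}

// Key = (full profile call stack, index where matching starts).
using Key = std::pair<Stack, unsigned>;

// Hash only the suffix starting at the stored index, mirroring the
// drop_front(Idx) call in the patched CallStackHash.
struct SuffixHash {
  size_t operator()(const Key &K) const {
    return static_cast<size_t>(hashFrames(K.first, K.second));
  }
};

using CallSiteSet = std::unordered_set<Key, SuffixHash>;

// Record one entry per start index. Every entry still carries the full
// stack, so a full-stack ID can be computed later for any of them.
CallSiteSet recordAllSuffixes(const Stack &CS) {
  CallSiteSet Set;
  for (unsigned Idx = 0; Idx < CS.size(); ++Idx)
    Set.emplace(CS, Idx);
  return Set;
}
```

Unlike the sketch, which copies the stack into every key, the patch stores an `ArrayRef<Frame>` view, keys the outer map by a per-frame stack ID, and stops recording once the current function is found.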
1 parent 7eb193b commit a919358

1 file changed: +15 -6 lines changed
llvm/lib/Transforms/Instrumentation/MemProfiler.cpp

Lines changed: 15 additions & 6 deletions
@@ -1034,13 +1034,15 @@ readMemprof(Module &M, Function &F, IndexedInstrProfReader *MemProfReader,
   std::map<uint64_t, std::set<const AllocationInfo *>> LocHashToAllocInfo;
   // A hash function for std::unordered_set<ArrayRef<Frame>> to work.
   struct CallStackHash {
-    size_t operator()(ArrayRef<Frame> CS) const {
-      return computeFullStackId(CS);
+    size_t operator()(const std::pair<ArrayRef<Frame>, unsigned> &CS) const {
+      auto &[CallStack, Idx] = CS;
+      return computeFullStackId(ArrayRef<Frame>(CallStack).drop_front(Idx));
     }
   };
   // For the callsites we need to record slices of the frame array (see comments
   // below where the map entries are added).
-  std::map<uint64_t, std::unordered_set<ArrayRef<Frame>, CallStackHash>>
+  std::map<uint64_t, std::unordered_set<std::pair<ArrayRef<Frame>, unsigned>,
+                                        CallStackHash>>
       LocHashToCallSites;
   for (auto &AI : MemProfRec->AllocSites) {
     NumOfMemProfAllocContextProfiles++;
@@ -1058,7 +1060,7 @@ readMemprof(Module &M, Function &F, IndexedInstrProfReader *MemProfReader,
     unsigned Idx = 0;
     for (auto &StackFrame : CS) {
       uint64_t StackId = computeStackId(StackFrame);
-      LocHashToCallSites[StackId].insert(ArrayRef<Frame>(CS).drop_front(Idx++));
+      LocHashToCallSites[StackId].emplace(CS, Idx++);
       ProfileHasColumns |= StackFrame.Column;
       // Once we find this function, we can stop recording.
       if (StackFrame.Function == FuncGUID)
@@ -1201,15 +1203,22 @@ readMemprof(Module &M, Function &F, IndexedInstrProfReader *MemProfReader,
       // instruction's leaf location in the callsites map and not the allocation
       // map.
       assert(CallSitesIter != LocHashToCallSites.end());
-      for (auto CallStackIdx : CallSitesIter->second) {
+      for (auto &[ProfileCallStack, Idx] : CallSitesIter->second) {
         // If we found and thus matched all frames on the call, create and
         // attach call stack metadata.
-        if (stackFrameIncludesInlinedCallStack(CallStackIdx,
+        if (stackFrameIncludesInlinedCallStack(ProfileCallStack.drop_front(Idx),
                                                InlinedCallStack)) {
           NumOfMemProfMatchedCallSites++;
           addCallsiteMetadata(I, InlinedCallStack, Ctx);
           // Only need to find one with a matching call stack and add a single
           // callsite metadata.
+
+          // Dump call site matching information upon request.
+          if (ClPrintMemProfMatchInfo) {
+            uint64_t FullStackId = computeFullStackId(ProfileCallStack);
+            errs() << "MemProf callsite " << FullStackId << " " << Idx << " "
+                   << InlinedCallStack.size() << "\n";
+          }
           break;
         }
       }
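For analyzing the dumped lines offline, something like the following standalone parser could be used. It is only a sketch: `CallSiteMatch` and `parseMatchLine` are hypothetical names that are not part of LLVM, and the only assumption about the input is the line format visible in the diff above (`MemProf callsite <full stack ID> <start index> <InlinedCallStack size>`, written to `errs()`).

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical record for one dumped line; field names are illustrative.
struct CallSiteMatch {
  uint64_t FullStackId; // ID of the whole profile call stack
  unsigned StartIdx;    // index in the profile call stack where matching starts
  size_t InlinedSize;   // number of frames in InlinedCallStack
};

// Parse a line of the form "MemProf callsite <id> <idx> <size>".
// Returns std::nullopt if the line does not have that shape.
std::optional<CallSiteMatch> parseMatchLine(const std::string &Line) {
  std::istringstream In(Line);
  std::string Tag1, Tag2;
  CallSiteMatch M{};
  if (!(In >> Tag1 >> Tag2 >> M.FullStackId >> M.StartIdx >> M.InlinedSize))
    return std::nullopt;
  if (Tag1 != "MemProf" || Tag2 != "callsite")
    return std::nullopt;
  return M;
}
```

With the start index and the inlined stack size in hand, a reader can tell that frames `StartIdx` through `StartIdx + InlinedSize - 1` of the identified profile call stack were the ones matched at that call site.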