Commit 2d8716d

[MemProf] Merge all callee guids for indirect call VP metadata
When matching memprof profiles, for indirect calls we use the callee guids recorded on callsites in the profile to synthesize indirect call VP metadata when none exists. However, we only do this for the first matching CallSiteEntry from the profile. In some cases there can be multiple matches, for example when the current function was eventually inlined into multiple callers. Profile generation propagates the CallSiteEntry from those callers into the inlined callee's profile, because the callee may not yet have been inlined in the new compile. To capture all of these potential indirect call targets, merge callee guids across all matching CallSiteEntries.
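
The merge described above can be sketched outside of LLVM as follows. This is a minimal illustration, not the code from the patch: `CallSiteEntrySketch`, `matchesInlinedStack`, and `mergeCalleeGuids` are hypothetical names, the prefix comparison is a simplified stand-in for the real frame matching, and a `std::unordered_set` plus a `std::vector` stands in for an insertion-ordered set.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a profile call site entry (not the LLVM type).
struct CallSiteEntrySketch {
  std::vector<uint64_t> Frames;      // frame ids, innermost frame first
  std::vector<uint64_t> CalleeGuids; // indirect call targets from the profile
};

// Simplified matching (an assumption): the instruction's inlined call stack
// must be a prefix of the profiled frames.
bool matchesInlinedStack(const std::vector<uint64_t> &Frames,
                         const std::vector<uint64_t> &Inlined) {
  if (Inlined.size() > Frames.size())
    return false;
  return std::equal(Inlined.begin(), Inlined.end(), Frames.begin());
}

// Union the callee guids of every matching entry, dropping duplicates and
// keeping first-seen order.
std::vector<uint64_t>
mergeCalleeGuids(const std::vector<CallSiteEntrySketch> &Entries,
                 const std::vector<uint64_t> &Inlined) {
  std::vector<uint64_t> Merged;
  std::unordered_set<uint64_t> Seen;
  for (const auto &Entry : Entries) {
    if (!matchesInlinedStack(Entry.Frames, Inlined))
      continue;
    for (uint64_t Guid : Entry.CalleeGuids)
      if (Seen.insert(Guid).second)
        Merged.push_back(Guid);
  }
  return Merged;
}
```

With entries whose frames are `{1}`, `{1, 2}`, and `{9}`, an inlined stack of `{1}` matches the first two, so their guids are merged and the third entry is ignored. Before this change, only the first matching entry would have contributed targets.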
1 parent 113b2d7 commit 2d8716d

2 files changed, +46 -19 lines changed

llvm/lib/Transforms/Instrumentation/MemProfUse.cpp

Lines changed: 33 additions & 18 deletions
@@ -528,35 +528,50 @@ static void handleCallSite(
     Module &M, std::set<std::vector<uint64_t>> &MatchedCallSites,
     OptimizationRemarkEmitter &ORE) {
   auto &Ctx = M.getContext();
+  // Set of Callee GUIDs to attach to indirect calls. We accumulate all of them
+  // to support cases where the instruction's inlined frames match multiple call
+  // site entries, which can happen if the profile was collected from a binary
+  // where this instruction was eventually inlined into multiple callers.
+  SetVector<GlobalValue::GUID> CalleeGuids;
+  bool CallsiteMDAdded = false;
   for (const auto &CallSiteEntry : CallSiteEntries) {
     // If we found and thus matched all frames on the call, create and
     // attach call stack metadata.
     if (stackFrameIncludesInlinedCallStack(CallSiteEntry.Frames,
                                            InlinedCallStack)) {
       NumOfMemProfMatchedCallSites++;
-      addCallsiteMetadata(I, InlinedCallStack, Ctx);
-
-      // Try to attach indirect call metadata if possible.
-      if (!CalledFunction)
-        addVPMetadata(M, I, CallSiteEntry.CalleeGuids);
-
       // Only need to find one with a matching call stack and add a single
       // callsite metadata.
-
-      // Accumulate call site matching information upon request.
-      if (ClPrintMemProfMatchInfo) {
-        std::vector<uint64_t> CallStack;
-        append_range(CallStack, InlinedCallStack);
-        MatchedCallSites.insert(std::move(CallStack));
+      if (!CallsiteMDAdded) {
+        addCallsiteMetadata(I, InlinedCallStack, Ctx);
+
+        // Accumulate call site matching information upon request.
+        if (ClPrintMemProfMatchInfo) {
+          std::vector<uint64_t> CallStack;
+          append_range(CallStack, InlinedCallStack);
+          MatchedCallSites.insert(std::move(CallStack));
+        }
+        ORE.emit(OptimizationRemark(DEBUG_TYPE, "MemProfUse", &I)
+                 << ore::NV("CallSite", &I) << " in function "
+                 << ore::NV("Caller", I.getFunction())
+                 << " matched callsite with frame count "
+                 << ore::NV("Frames", InlinedCallStack.size()));
+
+        // If this is a direct call, we're done.
+        if (CalledFunction)
+          break;
+        CallsiteMDAdded = true;
       }
-      ORE.emit(OptimizationRemark(DEBUG_TYPE, "MemProfUse", &I)
-               << ore::NV("CallSite", &I) << " in function "
-               << ore::NV("Caller", I.getFunction())
-               << " matched callsite with frame count "
-               << ore::NV("Frames", InlinedCallStack.size()));
-      break;
+
+      assert(!CalledFunction && "Didn't expect direct call");
+
+      // Collect Callee GUIDs from all matching CallSiteEntries.
+      CalleeGuids.insert(CallSiteEntry.CalleeGuids.begin(),
+                         CallSiteEntry.CalleeGuids.end());
     }
   }
+  // Try to attach indirect call metadata if possible.
+  addVPMetadata(M, I, CalleeGuids.getArrayRef());
 }

 static void readMemprof(Module &M, Function &F,

llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call.test

Lines changed: 13 additions & 1 deletion
@@ -18,6 +18,18 @@ HeapProfileRecords:
       - Frames:
           - { Function: _Z3barv, LineOffset: 3, Column: 5, IsInlineFrame: false }
         CalleeGuids: [0x123456789abcdef0, 0x23456789abcdef01]
+      # The next 2 sets of frames simulate the case where this function was
+      # eventually inlined into multiple callers. We would have propagated the
+      # resulting frames and callee guids here for matching with the not yet
+      # inlined bar. We should aggregate all callee guids into the metadata.
+      - Frames:
+          - { Function: _Z3barv, LineOffset: 3, Column: 5, IsInlineFrame: true }
+          - { Function: _Z3foov, LineOffset: 1, Column: 6, IsInlineFrame: false }
+        CalleeGuids: [0x1234, 0x2345]
+      - Frames:
+          - { Function: _Z3barv, LineOffset: 3, Column: 5, IsInlineFrame: true }
+          - { Function: _Z3foov, LineOffset: 10, Column: 7, IsInlineFrame: false }
+        CalleeGuids: [0x3456, 0x4567]
 ...

 ;--- basic.ll
@@ -31,7 +43,7 @@ entry:
   ret void
 }

-; CHECK-ENABLE: !6 = !{!"VP", i32 0, i64 2, i64 1311768467463790320, i64 1, i64 2541551405711093505, i64 1}
+; CHECK-ENABLE: !6 = !{!"VP", i32 0, i64 6, i64 13398, i64 1, i64 17767, i64 1, i64 4660, i64 1, i64 9029, i64 1, i64 1311768467463790320, i64 1, i64 2541551405711093505, i64 1}

 !llvm.module.flags = !{!2, !3}
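
To read the updated CHECK line, it helps to see how the hex guids in the YAML map to the decimal operands of the `"VP"` node. The sketch below uses a hypothetical helper, `buildVPOperands`, that is not part of LLVM. It assumes the operand layout visible in the CHECK lines: value kind 0 (indirect call target), then a total count, then one (guid, count) pair per target. It also assumes that each synthesized target gets a count of 1, which matches both the old and the new expected output.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: flatten callee guids into the integer operands that
// follow the "VP" string in the metadata node.
std::vector<uint64_t> buildVPOperands(const std::vector<uint64_t> &Guids) {
  std::vector<uint64_t> Ops;
  Ops.push_back(0);            // value kind 0: indirect call target
  Ops.push_back(Guids.size()); // total count, assuming a count of 1 per target
  for (uint64_t Guid : Guids) {
    Ops.push_back(Guid); // target guid
    Ops.push_back(1);    // assumed per-target count
  }
  return Ops;
}
```

Passing the six guids in the order they appear in the CHECK line (`0x3456`, `0x4567`, `0x1234`, `0x2345`, `0x123456789abcdef0`, `0x23456789abcdef01`) gives a total count of 6 and the decimal values 13398, 17767, 4660, 9029, 1311768467463790320, and 2541551405711093505.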
