-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[memprof] Improve call site matching #129770
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Suppose we have a call instruction satisfying: - AllocInfoIter != LocHashToAllocInfo.end() - CallSitesIter != LocHashToCallSites.end() - !isAllocationWithHotColdVariant(CI->getCalledFunction(), TLI) In this case this patch, we would take: if (AllocInfoIter != LocHashToAllocInfo.end() but end up discarding the opportunity because of the call to isAllocationWithHotColdVariant. This can happen in C++ code like: new Something[100]; which is lowered to two calls -- new and the constructor. This patch fixes the problem by falling back to the call site annotation if we have !isAllocationWithHotColdVariant.
|
@llvm/pr-subscribers-llvm-transforms Author: Kazu Hirata (kazutakahirata) ChangesSuppose we have a call instruction satisfying:
In this case this patch, we would take: if (AllocInfoIter != LocHashToAllocInfo.end() but end up discarding the opportunity because of the call to This can happen in C++ code like: new Something[100]; which is lowered to two calls -- new and the constructor. This patch fixes the problem by falling back to the call site Full diff: https://github.com/llvm/llvm-project/pull/129770.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Instrumentation/MemProfiler.cpp b/llvm/lib/Transforms/Instrumentation/MemProfiler.cpp
index a0a5834731c2e..71a0b29f82a7e 100644
--- a/llvm/lib/Transforms/Instrumentation/MemProfiler.cpp
+++ b/llvm/lib/Transforms/Instrumentation/MemProfiler.cpp
@@ -1122,10 +1122,9 @@ readMemprof(Module &M, Function &F, IndexedInstrProfReader *MemProfReader,
// instruction's leaf location in that map, and if the rest of the
// instruction's locations match the prefix Frame locations on an
// allocation context with the same leaf.
- if (AllocInfoIter != LocHashToAllocInfo.end()) {
- // Only consider allocations which support hinting.
- if (!isAllocationWithHotColdVariant(CI->getCalledFunction(), TLI))
- continue;
+ if (AllocInfoIter != LocHashToAllocInfo.end() &&
+ // Only consider allocations which support hinting.
+ isAllocationWithHotColdVariant(CI->getCalledFunction(), TLI)) {
// We may match this instruction's location list to multiple MIB
// contexts. Add them to a Trie specialized for trimming the contexts to
// the minimal needed to disambiguate contexts with unique behavior.
@@ -1189,10 +1188,12 @@ readMemprof(Module &M, Function &F, IndexedInstrProfReader *MemProfReader,
continue;
}
+ if (CallSitesIter == LocHashToCallSites.end())
+ continue;
+
// Otherwise, add callsite metadata. If we reach here then we found the
// instruction's leaf location in the callsites map and not the allocation
// map.
- assert(CallSitesIter != LocHashToCallSites.end());
for (auto CallStackIdx : CallSitesIter->second) {
// If we found and thus matched all frames on the call, create and
// attach call stack metadata.
diff --git a/llvm/test/Transforms/PGOProfile/memprof-call-site-at-alloc-site.ll b/llvm/test/Transforms/PGOProfile/memprof-call-site-at-alloc-site.ll
new file mode 100644
index 0000000000000..0a51128a7fcef
--- /dev/null
+++ b/llvm/test/Transforms/PGOProfile/memprof-call-site-at-alloc-site.ll
@@ -0,0 +1,52 @@
+; Verifies that a call site gets annotated even when a call site matches an
+; allocation call stack but does not call one of the memory allocation
+; functions.
+
+; REQUIRES: x86_64-linux
+; RUN: split-file %s %t
+; RUN: llvm-profdata merge %t/memprof-call-site-at-alloc-site.yaml -o %t/memprof-call-site-at-alloc-site.memprofdata
+; RUN: opt < %t/memprof-call-site-at-alloc-site.ll -passes='memprof-use<profile-filename=%t/memprof-call-site-at-alloc-site.memprofdata>' -memprof-print-match-info -S 2>&1 | FileCheck %s
+
+;--- memprof-call-site-at-alloc-site.yaml
+---
+HeapProfileRecords:
+ - GUID: _Z3foov
+ AllocSites:
+ - Callstack:
+ - { Function: _Z3foov, LineOffset: 6, Column: 12, IsInlineFrame: false }
+ MemInfoBlock:
+ AllocCount: 1
+ TotalSize: 898709270
+ TotalLifetime: 1000000
+ TotalLifetimeAccessDensity: 1
+ CallSites:
+ - - { Function: _Z3foov, LineOffset: 6, Column: 12, IsInlineFrame: false }
+...
+
+;--- memprof-call-site-at-alloc-site.ll
+; CHECK: MemProf callsite match for inline call stack 774294594974568741
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-grtev4-linux-gnu"
+
+declare void @_ZN9SomethingC1Ev()
+
+define void @_Z3foov() {
+ %1 = call ptr @_Znam(), !dbg !3
+ call void @_ZN9SomethingC1Ev(), !dbg !3
+ ret void
+}
+
+declare ptr @_Znam()
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1)
+!1 = !DIFile(filename: "something.cc", directory: "/")
+!2 = !{i32 2, !"Debug Info Version", i32 3}
+!3 = !DILocation(line: 106, column: 12, scope: !4)
+!4 = distinct !DISubprogram(name: "Init", linkageName: "_Z3foov", scope: !5, file: !1, line: 100, type: !7, scopeLine: 100, spFlags: DISPFlagDefinition, unit: !0)
+!5 = distinct !DICompositeType(tag: DW_TAG_class_type, name: "Something", file: !1)
+!6 = !{}
+!7 = distinct !DISubroutineType(types: !6)
|
|
@llvm/pr-subscribers-pgo Author: Kazu Hirata (kazutakahirata) ChangesSuppose we have a call instruction satisfying:
In this case this patch, we would take: if (AllocInfoIter != LocHashToAllocInfo.end() but end up discarding the opportunity because of the call to This can happen in C++ code like: new Something[100]; which is lowered to two calls -- new and the constructor. This patch fixes the problem by falling back to the call site Full diff: https://github.com/llvm/llvm-project/pull/129770.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Instrumentation/MemProfiler.cpp b/llvm/lib/Transforms/Instrumentation/MemProfiler.cpp
index a0a5834731c2e..71a0b29f82a7e 100644
--- a/llvm/lib/Transforms/Instrumentation/MemProfiler.cpp
+++ b/llvm/lib/Transforms/Instrumentation/MemProfiler.cpp
@@ -1122,10 +1122,9 @@ readMemprof(Module &M, Function &F, IndexedInstrProfReader *MemProfReader,
// instruction's leaf location in that map, and if the rest of the
// instruction's locations match the prefix Frame locations on an
// allocation context with the same leaf.
- if (AllocInfoIter != LocHashToAllocInfo.end()) {
- // Only consider allocations which support hinting.
- if (!isAllocationWithHotColdVariant(CI->getCalledFunction(), TLI))
- continue;
+ if (AllocInfoIter != LocHashToAllocInfo.end() &&
+ // Only consider allocations which support hinting.
+ isAllocationWithHotColdVariant(CI->getCalledFunction(), TLI)) {
// We may match this instruction's location list to multiple MIB
// contexts. Add them to a Trie specialized for trimming the contexts to
// the minimal needed to disambiguate contexts with unique behavior.
@@ -1189,10 +1188,12 @@ readMemprof(Module &M, Function &F, IndexedInstrProfReader *MemProfReader,
continue;
}
+ if (CallSitesIter == LocHashToCallSites.end())
+ continue;
+
// Otherwise, add callsite metadata. If we reach here then we found the
// instruction's leaf location in the callsites map and not the allocation
// map.
- assert(CallSitesIter != LocHashToCallSites.end());
for (auto CallStackIdx : CallSitesIter->second) {
// If we found and thus matched all frames on the call, create and
// attach call stack metadata.
diff --git a/llvm/test/Transforms/PGOProfile/memprof-call-site-at-alloc-site.ll b/llvm/test/Transforms/PGOProfile/memprof-call-site-at-alloc-site.ll
new file mode 100644
index 0000000000000..0a51128a7fcef
--- /dev/null
+++ b/llvm/test/Transforms/PGOProfile/memprof-call-site-at-alloc-site.ll
@@ -0,0 +1,52 @@
+; Verifies that a call site gets annotated even when a call site matches an
+; allocation call stack but does not call one of the memory allocation
+; functions.
+
+; REQUIRES: x86_64-linux
+; RUN: split-file %s %t
+; RUN: llvm-profdata merge %t/memprof-call-site-at-alloc-site.yaml -o %t/memprof-call-site-at-alloc-site.memprofdata
+; RUN: opt < %t/memprof-call-site-at-alloc-site.ll -passes='memprof-use<profile-filename=%t/memprof-call-site-at-alloc-site.memprofdata>' -memprof-print-match-info -S 2>&1 | FileCheck %s
+
+;--- memprof-call-site-at-alloc-site.yaml
+---
+HeapProfileRecords:
+ - GUID: _Z3foov
+ AllocSites:
+ - Callstack:
+ - { Function: _Z3foov, LineOffset: 6, Column: 12, IsInlineFrame: false }
+ MemInfoBlock:
+ AllocCount: 1
+ TotalSize: 898709270
+ TotalLifetime: 1000000
+ TotalLifetimeAccessDensity: 1
+ CallSites:
+ - - { Function: _Z3foov, LineOffset: 6, Column: 12, IsInlineFrame: false }
+...
+
+;--- memprof-call-site-at-alloc-site.ll
+; CHECK: MemProf callsite match for inline call stack 774294594974568741
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-grtev4-linux-gnu"
+
+declare void @_ZN9SomethingC1Ev()
+
+define void @_Z3foov() {
+ %1 = call ptr @_Znam(), !dbg !3
+ call void @_ZN9SomethingC1Ev(), !dbg !3
+ ret void
+}
+
+declare ptr @_Znam()
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1)
+!1 = !DIFile(filename: "something.cc", directory: "/")
+!2 = !{i32 2, !"Debug Info Version", i32 3}
+!3 = !DILocation(line: 106, column: 12, scope: !4)
+!4 = distinct !DISubprogram(name: "Init", linkageName: "_Z3foov", scope: !5, file: !1, line: 100, type: !7, scopeLine: 100, spFlags: DISPFlagDefinition, unit: !0)
+!5 = distinct !DICompositeType(tag: DW_TAG_class_type, name: "Something", file: !1)
+!6 = !{}
+!7 = distinct !DISubroutineType(types: !6)
|
snehasish
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Suppose we have a call instruction satisfying:
In this case this patch, we would take:
if (AllocInfoIter != LocHashToAllocInfo.end()
but end up discarding the opportunity because of the call to
isAllocationWithHotColdVariant.
This can happen in C++ code like:
new Something[100];
which is lowered to two calls -- new and the constructor.
This patch fixes the problem by falling back to the call site
annotation if we have !isAllocationWithHotColdVariant.