Commit 6fb87b2

[llvm][AsmPrinter] Call graph section format. (#159866)
Make the .callgraph section's layout more space-efficient, and document the layout of the section.
1 parent b3f2d93 commit 6fb87b2

13 files changed

+340
-104
lines changed

llvm/docs/CallGraphSection.md

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
# .callgraph Section Layout

The `.callgraph` section is used to store call graph information for each function. The section contains a series of records, with each record corresponding to a single function.

## Per Function Record Layout

Each record in the `.callgraph` section has the following binary layout:

| Field | Type | Size (bits) | Description |
| ----- | ---- | ----------- | ----------- |
| Format Version | `uint8_t` | 8 | The version of the record format. The current version is 0. |
| Flags | `uint8_t` | 8 | A bitfield where: Bit 0 is set if the function is a potential indirect call target; Bit 1 is set if there are direct callees; Bit 2 is set if there are indirect callees. The remaining 5 bits are reserved. |
| Function Entry PC | `uintptr_t` | 32/64 | The address of the function's entry point. |
| Function Type ID | `uint64_t` | 64 | The type ID of the function. This field is non-zero if the function is a potential indirect call target and its type is known. |
| Number of Unique Direct Callees | `ULEB128` | Variable | The number of unique direct call destinations from this function. This field is only present if there is at least one direct callee. |
| Direct Callees Array | `uintptr_t[]` | Variable | An array of unique direct callee entry point addresses. This field is only present if there is at least one direct callee. |
| Number of Unique Indirect Target Type IDs | `ULEB128` | Variable | The number of unique indirect call target type IDs. This field is only present if there is at least one indirect target type ID. |
| Indirect Target Type IDs Array | `uint64_t[]` | Variable | An array of unique indirect call target type IDs. This field is only present if there is at least one indirect target type ID. |
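
To illustrate the field order in the table, here is a minimal encoder sketch. It is not LLVM's implementation: `CallGraphRecord`, `encodeRecord`, `appendLE`, and `appendULEB128` are hypothetical names introduced for this example, and the sketch assumes a little-endian target whose pointer size in bytes is passed as `PtrSize` (4 or 8).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mirror of the flag bits described in the table.
enum CallGraphFlags : uint8_t {
  IsIndirectTarget = 1u << 0,
  HasDirectCallees = 1u << 1,
  HasIndirectCallees = 1u << 2,
};

// Hypothetical in-memory form of one per-function record.
struct CallGraphRecord {
  bool PotentialIndirectTarget = false;
  uint64_t EntryPC = 0;
  uint64_t TypeID = 0; // 0 when the type is unknown
  std::vector<uint64_t> DirectCallees;
  std::vector<uint64_t> IndirectTypeIDs;
};

// Append the low Size bytes of Value in little-endian order (assumption).
void appendLE(std::vector<uint8_t> &Out, uint64_t Value, unsigned Size) {
  for (unsigned I = 0; I < Size; ++I)
    Out.push_back(static_cast<uint8_t>(Value >> (8 * I)));
}

// Append Value as unsigned LEB128: 7 payload bits per byte, with the high
// bit set on every byte except the last.
void appendULEB128(std::vector<uint8_t> &Out, uint64_t Value) {
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80;
    Out.push_back(Byte);
  } while (Value != 0);
}

std::vector<uint8_t> encodeRecord(const CallGraphRecord &R, unsigned PtrSize) {
  uint8_t Flags = 0;
  if (R.PotentialIndirectTarget)
    Flags |= IsIndirectTarget;
  if (!R.DirectCallees.empty())
    Flags |= HasDirectCallees;
  if (!R.IndirectTypeIDs.empty())
    Flags |= HasIndirectCallees;

  std::vector<uint8_t> Out;
  Out.push_back(0); // Format Version 0
  Out.push_back(Flags);
  appendLE(Out, R.EntryPC, PtrSize);
  // The type ID slot is always 8 bytes; it holds 0 when not applicable.
  appendLE(Out, R.PotentialIndirectTarget ? R.TypeID : 0, 8);
  // The count and array pairs are emitted only when non-empty.
  if (!R.DirectCallees.empty()) {
    appendULEB128(Out, R.DirectCallees.size());
    for (uint64_t PC : R.DirectCallees)
      appendLE(Out, PC, PtrSize);
  }
  if (!R.IndirectTypeIDs.empty()) {
    appendULEB128(Out, R.IndirectTypeIDs.size());
    for (uint64_t ID : R.IndirectTypeIDs)
      appendLE(Out, ID, 8);
  }
  return Out;
}
```

Because the two optional arrays are omitted entirely when empty, a reader has to consult the Flags byte to know which fields follow the Function Type ID.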

llvm/docs/CodeGenerator.rst

Lines changed: 7 additions & 0 deletions
@@ -1662,6 +1662,13 @@ and stack sizes (unsigned LEB128). The stack size values only include the space
16621662
allocated in the function prologue. Functions with dynamic stack allocations are
16631663
not included.
16641664

1665+
Emitting function call graph information
1666+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1667+
1668+
A section containing metadata on the function call graph is emitted when
1669+
``TargetOptions::EmitCallGraphSection`` is set (``--call-graph-section``). The
1670+
layout of this section is documented in detail at :doc:`CallGraphSection`.
1671+
16651672
VLIW Packetizer
16661673
---------------
16671674

llvm/docs/Reference.rst

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ LLVM and API reference documentation.
1515
BranchWeightMetadata
1616
Bugpoint
1717
CalleeTypeMetadata
18+
CallGraphSection
1819
CIBestPractices
1920
CommandGuide/index
2021
ContentAddressableStorage

llvm/include/llvm/CodeGen/AsmPrinter.h

Lines changed: 7 additions & 20 deletions
@@ -198,26 +198,13 @@ class LLVM_ABI AsmPrinter : public MachineFunctionPass {
198198
/// and targets.
199199
using CGTypeId = uint64_t;
200200

201-
/// Map type identifiers to callsite labels. Labels are generated for each
202-
/// indirect callsite in the function.
203-
SmallVector<std::pair<CGTypeId, MCSymbol *>> CallSiteLabels;
201+
/// Unique target type IDs.
202+
SmallSet<CGTypeId, 4> IndirectCalleeTypeIDs;
203+
/// Unique direct callees.
204204
SmallSet<MCSymbol *, 4> DirectCallees;
205205
};
206206

207-
/// Enumeration of function kinds, and their mapping to function kind values
208-
/// stored in callgraph section entries.
209-
enum class FunctionKind : uint64_t {
210-
/// Function cannot be target to indirect calls.
211-
NOT_INDIRECT_TARGET = 0,
212-
213-
/// Function may be target to indirect calls but its type id is unknown.
214-
INDIRECT_TARGET_UNKNOWN_TID = 1,
215-
216-
/// Function may be target to indirect calls and its type id is known.
217-
INDIRECT_TARGET_KNOWN_TID = 2,
218-
};
219-
220-
enum CallGraphSectionFormatVersion : uint64_t {
207+
enum CallGraphSectionFormatVersion : uint8_t {
221208
V_0 = 0,
222209
};
223210

@@ -386,9 +373,9 @@ class LLVM_ABI AsmPrinter : public MachineFunctionPass {
386373
/// are available. Returns empty string otherwise.
387374
StringRef getConstantSectionSuffix(const Constant *C) const;
388375

389-
/// Iff MI is an indirect call, generate and emit a label after the callsites
390-
/// which will be used to populate the .callgraph section. For direct
391-
/// callsites add the callee symbol to direct callsites list of FuncCGInfo.
376+
/// If MI is an indirect call, add its expected callee type IDs to the
377+
/// indirect callee type IDs set of FuncCGInfo. If MI is a direct call, add
378+
/// the callee symbol to the direct callees set of FuncCGInfo.
392379
void handleCallsiteForCallgraph(
393380
FunctionCallGraphInfo &FuncCGInfo,
394381
const MachineFunction::CallSiteInfoMap &CallSitesInfoMap,

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp

Lines changed: 61 additions & 50 deletions
@@ -20,6 +20,7 @@
2020
#include "WinException.h"
2121
#include "llvm/ADT/APFloat.h"
2222
#include "llvm/ADT/APInt.h"
23+
#include "llvm/ADT/BitmaskEnum.h"
2324
#include "llvm/ADT/DenseMap.h"
2425
#include "llvm/ADT/STLExtras.h"
2526
#include "llvm/ADT/SmallPtrSet.h"
@@ -205,6 +206,17 @@ class AddrLabelMapCallbackPtr final : CallbackVH {
205206
};
206207
} // namespace
207208

209+
namespace callgraph {
210+
LLVM_ENABLE_BITMASK_ENUMS_IN_NAMESPACE();
211+
enum Flags : uint8_t {
212+
None = 0,
213+
IsIndirectTarget = 1u << 0,
214+
HasDirectCallees = 1u << 1,
215+
HasIndirectCallees = 1u << 2,
216+
LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue*/ HasIndirectCallees)
217+
};
218+
} // namespace callgraph
219+
208220
class llvm::AddrLabelMap {
209221
MCContext &Context;
210222
struct AddrLabelSymEntry {
@@ -1683,63 +1695,65 @@ void AsmPrinter::emitCallGraphSection(const MachineFunction &MF,
16831695
OutStreamer->pushSection();
16841696
OutStreamer->switchSection(FuncCGSection);
16851697

1686-
// Emit format version number.
1687-
OutStreamer->emitInt64(CallGraphSectionFormatVersion::V_0);
1688-
1689-
// Emit function's self information, which is composed of:
1690-
// 1) FunctionEntryPc
1691-
// 2) FunctionKind: Whether the function is indirect target, and if so,
1692-
// whether its type id is known.
1693-
// 3) FunctionTypeId: Emit only when the function is an indirect target
1694-
// and its type id is known.
1695-
1696-
// Emit function entry pc.
16971698
const MCSymbol *FunctionSymbol = getFunctionBegin();
1698-
OutStreamer->emitSymbolValue(FunctionSymbol, TM.getProgramPointerSize());
1699-
1699+
const Function &F = MF.getFunction();
17001700
// If this function has external linkage or has its address taken and
17011701
// it is not a callback, then anything could call it.
1702-
const Function &F = MF.getFunction();
17031702
bool IsIndirectTarget =
17041703
!F.hasLocalLinkage() || F.hasAddressTaken(nullptr,
17051704
/*IgnoreCallbackUses=*/true,
17061705
/*IgnoreAssumeLikeCalls=*/true,
17071706
/*IgnoreLLVMUsed=*/false);
17081707

1709-
// FIXME: FunctionKind takes a few values but emitted as a 64-bit value.
1710-
// Can be optimized to occupy 2 bits instead.
1711-
// Emit function kind, and type id if available.
1712-
if (!IsIndirectTarget) {
1713-
OutStreamer->emitInt64(
1714-
static_cast<uint64_t>(FunctionKind::NOT_INDIRECT_TARGET));
1715-
} else {
1716-
if (const auto *TypeId = extractNumericCGTypeId(F)) {
1717-
OutStreamer->emitInt64(
1718-
static_cast<uint64_t>(FunctionKind::INDIRECT_TARGET_KNOWN_TID));
1719-
OutStreamer->emitInt64(TypeId->getZExtValue());
1720-
} else {
1721-
OutStreamer->emitInt64(
1722-
static_cast<uint64_t>(FunctionKind::INDIRECT_TARGET_UNKNOWN_TID));
1723-
}
1724-
}
1708+
const auto &DirectCallees = FuncCGInfo.DirectCallees;
1709+
const auto &IndirectCalleeTypeIDs = FuncCGInfo.IndirectCalleeTypeIDs;
1710+
1711+
using namespace callgraph;
1712+
Flags CGFlags = Flags::None;
1713+
if (IsIndirectTarget)
1714+
CGFlags |= Flags::IsIndirectTarget;
1715+
if (DirectCallees.size() > 0)
1716+
CGFlags |= Flags::HasDirectCallees;
1717+
if (IndirectCalleeTypeIDs.size() > 0)
1718+
CGFlags |= Flags::HasIndirectCallees;
1719+
1720+
// Emit function's call graph information.
1721+
// 1) CallGraphSectionFormatVersion
1722+
// 2) Flags
1723+
// a. LSB bit 0 is set to 1 if the function is a potential indirect
1724+
// target.
1725+
// b. LSB bit 1 is set to 1 if there are direct callees.
1726+
// c. LSB bit 2 is set to 1 if there are indirect callees.
1727+
// d. Rest of the 5 bits in Flags are reserved for any future use.
1728+
// 3) Function entry PC.
1729+
// 4) FunctionTypeID if the function is indirect target and its type id
1730+
// is known, otherwise it is set to 0.
1731+
// 5) Number of unique direct callees, if at least one exists.
1732+
// 6) For each unique direct callee, the callee's PC.
1733+
// 7) Number of unique indirect target type IDs, if at least one exists.
1734+
// 8) Each unique indirect target type id.
1735+
OutStreamer->emitInt8(CallGraphSectionFormatVersion::V_0);
1736+
OutStreamer->emitInt8(static_cast<uint8_t>(CGFlags));
1737+
OutStreamer->emitSymbolValue(FunctionSymbol, TM.getProgramPointerSize());
1738+
const auto *TypeId = extractNumericCGTypeId(F);
1739+
if (IsIndirectTarget && TypeId)
1740+
OutStreamer->emitInt64(TypeId->getZExtValue());
1741+
else
1742+
OutStreamer->emitInt64(0);
17251743

1726-
// Emit callsite labels, where each element is a pair of type id and
1727-
// indirect callsite pc.
1728-
const auto &CallSiteLabels = FuncCGInfo.CallSiteLabels;
1729-
OutStreamer->emitInt64(CallSiteLabels.size());
1730-
for (const auto &[TypeId, Label] : CallSiteLabels) {
1731-
OutStreamer->emitInt64(TypeId);
1732-
OutStreamer->emitSymbolValue(Label, TM.getProgramPointerSize());
1744+
if (DirectCallees.size() > 0) {
1745+
OutStreamer->emitULEB128IntValue(DirectCallees.size());
1746+
for (const auto &CalleeSymbol : DirectCallees)
1747+
OutStreamer->emitSymbolValue(CalleeSymbol, TM.getProgramPointerSize());
1748+
FuncCGInfo.DirectCallees.clear();
17331749
}
1734-
FuncCGInfo.CallSiteLabels.clear();
1735-
1736-
const auto &DirectCallees = FuncCGInfo.DirectCallees;
1737-
OutStreamer->emitInt64(DirectCallees.size());
1738-
for (const auto &CalleeSymbol : DirectCallees) {
1739-
OutStreamer->emitSymbolValue(CalleeSymbol, TM.getProgramPointerSize());
1750+
if (IndirectCalleeTypeIDs.size() > 0) {
1751+
OutStreamer->emitULEB128IntValue(IndirectCalleeTypeIDs.size());
1752+
for (const auto &CalleeTypeId : IndirectCalleeTypeIDs)
1753+
OutStreamer->emitInt64(CalleeTypeId);
1754+
FuncCGInfo.IndirectCalleeTypeIDs.clear();
17401755
}
1741-
FuncCGInfo.DirectCallees.clear();
1742-
1756+
// End of emitting call graph section contents.
17431757
OutStreamer->popSection();
17441758
}
17451759

@@ -1877,8 +1891,7 @@ void AsmPrinter::handleCallsiteForCallgraph(
18771891
FunctionCallGraphInfo &FuncCGInfo,
18781892
const MachineFunction::CallSiteInfoMap &CallSitesInfoMap,
18791893
const MachineInstr &MI) {
1880-
assert(MI.isCall() &&
1881-
"Callsite labels are meant for call instructions only.");
1894+
assert(MI.isCall() && "This method is meant for call instructions only.");
18821895
const MachineOperand &CalleeOperand = MI.getOperand(0);
18831896
if (CalleeOperand.isGlobal() || CalleeOperand.isSymbol()) {
18841897
// Handle direct calls.
@@ -1903,10 +1916,8 @@ void AsmPrinter::handleCallsiteForCallgraph(
19031916
// Handle indirect callsite info.
19041917
// Only indirect calls have type identifiers set.
19051918
for (ConstantInt *CalleeTypeId : CallSiteInfo->second.CalleeTypeIds) {
1906-
MCSymbol *S = MF->getContext().createTempSymbol();
1907-
OutStreamer->emitLabel(S);
19081919
uint64_t CalleeTypeIdVal = CalleeTypeId->getZExtValue();
1909-
FuncCGInfo.CallSiteLabels.emplace_back(CalleeTypeIdVal, S);
1920+
FuncCGInfo.IndirectCalleeTypeIDs.insert(CalleeTypeIdVal);
19101921
}
19111922
}
19121923

Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,39 @@
1+
;; Test if a potential indirect call target function which has internal linkage and
2+
;; address taken has its type ID emitted to callgraph section.
3+
;; This test also makes sure that callback functions which meet the above constraint
4+
;; are handled correctly.
5+
6+
; RUN: llc -mtriple=arm-unknown-linux --call-graph-section -o - < %s | FileCheck %s
7+
8+
declare !type !0 void @_Z6doWorkPFviE(ptr)
9+
10+
define i32 @_Z4testv() !type !1 {
11+
entry:
12+
call void @_Z6doWorkPFviE(ptr nonnull @_ZL10myCallbacki)
13+
ret i32 0
14+
}
15+
16+
; CHECK: _ZL10myCallbacki:
17+
; CHECK-NEXT: [[LABEL_FUNC:\.Lfunc_begin[0-9]+]]:
18+
define internal void @_ZL10myCallbacki(i32 %value) !type !2 {
19+
entry:
20+
%sink = alloca i32, align 4
21+
store volatile i32 %value, ptr %sink, align 4
22+
%i1 = load volatile i32, ptr %sink, align 4
23+
ret void
24+
}
25+
26+
!0 = !{i64 0, !"_ZTSFvPFviEE.generalized"}
27+
!1 = !{i64 0, !"_ZTSFivE.generalized"}
28+
!2 = !{i64 0, !"_ZTSFviE.generalized"}
29+
30+
; CHECK: .section .callgraph,"o",%progbits,.text
31+
;; Version
32+
; CHECK-NEXT: .byte 0
33+
;; Flags -- Potential indirect target so LSB is set to 1. Other bits are 0.
34+
; CHECK-NEXT: .byte 1
35+
;; Function Entry PC
36+
; CHECK-NEXT: .long [[LABEL_FUNC]]
37+
;; Function type ID -5212364466660467813
38+
; CHECK-NEXT: .long 1154849691
39+
; CHECK-NEXT: .long 3081369122
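
The two `.long` values checked above are the halves of the 64-bit type ID as printed for this 32-bit little-endian target, low word first. A small sketch of that split follows; `splitTypeId` is a hypothetical helper written for this note, not an LLVM function.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: returns {low word, high word}, which is the order in
// which the two .long directives appear in the test's expected output.
std::pair<uint32_t, uint32_t> splitTypeId(uint64_t TypeId) {
  return {static_cast<uint32_t>(TypeId), static_cast<uint32_t>(TypeId >> 32)};
}
```

Passing the type ID from the comment, `static_cast<uint64_t>(-5212364466660467813LL)`, yields 1154849691 for the low word and 3081369122 for the high word, matching the two checked directives.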
Lines changed: 63 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,63 @@
1+
;; Test if the .callgraph section contains the type ID (MD5 hash of the
2+
;; callee's type) for each indirect call site annotated with !callee_type
3+
;; metadata.
4+
;; Test if the .callgraph section contains the function's direct callees,
5+
;; each listed only once.
6+
7+
; RUN: llc -mtriple=arm-unknown-linux --call-graph-section -o - < %s | FileCheck %s
8+
9+
declare !type !0 void @direct_foo()
10+
declare !type !1 i32 @direct_bar(i8)
11+
declare !type !2 ptr @direct_baz(ptr)
12+
13+
; CHECK: ball:
14+
; CHECK-NEXT: [[LABEL_FUNC:\.Lfunc_begin[0-9]+]]:
15+
define ptr @ball() {
16+
entry:
17+
call void @direct_foo()
18+
%fp_foo_val = load ptr, ptr null, align 8
19+
call void (...) %fp_foo_val(), !callee_type !0
20+
call void @direct_foo()
21+
%fp_bar_val = load ptr, ptr null, align 8
22+
%call_fp_bar = call i32 %fp_bar_val(i8 0), !callee_type !2
23+
%call_fp_bar_direct = call i32 @direct_bar(i8 1)
24+
%fp_baz_val = load ptr, ptr null, align 8
25+
%call_fp_baz = call ptr %fp_baz_val(ptr null), !callee_type !4
26+
call void @direct_foo()
27+
%call_fp_baz_direct = call ptr @direct_baz(ptr null)
28+
call void @direct_foo()
29+
ret ptr %call_fp_baz
30+
}
31+
32+
!0 = !{!1}
33+
!1 = !{i64 0, !"_ZTSFvE.generalized"}
34+
!2 = !{!3}
35+
!3 = !{i64 0, !"_ZTSFicE.generalized"}
36+
!4 = !{!5}
37+
!5 = !{i64 0, !"_ZTSFPvS_E.generalized"}
38+
39+
; CHECK: .section .callgraph,"o",%progbits,.text
40+
;; Version
41+
; CHECK-NEXT: .byte 0
42+
;; Flags
43+
; CHECK-NEXT: .byte 7
44+
;; Function Entry PC
45+
; CHECK-NEXT: .long [[LABEL_FUNC]]
46+
;; Function type ID -- set to 0 as no type metadata attached to function.
47+
; CHECK-NEXT: .long 0
48+
; CHECK-NEXT: .long 0
49+
;; Number of unique direct callees.
50+
; CHECK-NEXT: .byte 3
51+
;; Direct callees.
52+
; CHECK-NEXT: .long direct_foo
53+
; CHECK-NEXT: .long direct_bar
54+
; CHECK-NEXT: .long direct_baz
55+
;; Number of unique indirect target type IDs.
56+
; CHECK-NEXT: .byte 3
57+
;; Indirect type IDs.
58+
; CHECK-NEXT: .long 838288420
59+
; CHECK-NEXT: .long 1053552373
60+
; CHECK-NEXT: .long 1505527380
61+
; CHECK-NEXT: .long 814631809
62+
; CHECK-NEXT: .long 342417018
63+
; CHECK-NEXT: .long 2013108216
Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
;; Tests that we store the type identifiers in .callgraph section of the object file for tailcalls.
2+
3+
; RUN: llc -mtriple=arm-unknown-linux --call-graph-section -filetype=obj -o - < %s | \
4+
; RUN: llvm-readelf -x .callgraph - | FileCheck %s
5+
6+
define i32 @check_tailcall(ptr %func, i8 %x) !type !0 {
7+
entry:
8+
%call = tail call i32 %func(i8 signext %x), !callee_type !1
9+
ret i32 %call
10+
}
11+
12+
define i32 @main(i32 %argc) !type !3 {
13+
entry:
14+
%andop = and i32 %argc, 1
15+
%cmp = icmp eq i32 %andop, 0
16+
%foo.bar = select i1 %cmp, ptr @foo, ptr @bar
17+
%call.i = tail call i32 %foo.bar(i8 signext 97), !callee_type !1
18+
ret i32 %call.i
19+
}
20+
21+
declare !type !2 i32 @foo(i8 signext)
22+
23+
declare !type !2 i32 @bar(i8 signext)
24+
25+
!0 = !{i64 0, !"_ZTSFiPvcE.generalized"}
26+
!1 = !{!2}
27+
!2 = !{i64 0, !"_ZTSFicE.generalized"}
28+
!3 = !{i64 0, !"_ZTSFiiE.generalized"}
29+
30+
; CHECK: Hex dump of section '.callgraph':
31+
; CHECK-NEXT: 0x00000000 00050000 00008e19 0b7f3326 e3000154
32+
; CHECK-NEXT: 0x00000010 86bc5981 4b8e3000 05100000 00a150b8
33+
;; Verify that the type id 0x308e4b8159bc8654 is in section.
34+
; CHECK-NEXT: 0x00000020 3e0cfe3c b2015486 bc59814b 8e30
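
The hex dump checked above can be read back by hand using the record layout. The following decoder sketch walks through it under stated assumptions: 4-byte little-endian pointers (as on this ARM target) and format version 0. `Cursor`, `DecodedRecord`, `decodeSection`, and `tailCallTestBytes` are hypothetical names, and this is not the parser of any existing LLVM tool.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct DecodedRecord {
  uint8_t Version = 0;
  uint8_t Flags = 0;
  uint64_t EntryPC = 0;
  uint64_t TypeID = 0;
  std::vector<uint64_t> DirectCallees;
  std::vector<uint64_t> IndirectTypeIDs;
};

// Bounds-checked little-endian reader; Ok turns false on truncated input.
struct Cursor {
  const std::vector<uint8_t> &Data;
  size_t Pos = 0;
  bool Ok = true;
  explicit Cursor(const std::vector<uint8_t> &D) : Data(D) {}

  uint64_t readLE(unsigned Size) {
    if (!Ok || Data.size() - Pos < Size) {
      Ok = false;
      return 0;
    }
    uint64_t V = 0;
    for (unsigned I = 0; I < Size; ++I)
      V |= static_cast<uint64_t>(Data[Pos + I]) << (8 * I);
    Pos += Size;
    return V;
  }

  uint64_t readULEB128() {
    uint64_t V = 0;
    for (unsigned Shift = 0; Shift < 64; Shift += 7) {
      uint64_t Byte = readLE(1);
      if (!Ok)
        return 0;
      V |= (Byte & 0x7f) << Shift;
      if ((Byte & 0x80) == 0)
        return V;
    }
    Ok = false; // more than 10 bytes: treat as malformed
    return 0;
  }
};

std::vector<DecodedRecord> decodeSection(const std::vector<uint8_t> &Data,
                                         unsigned PtrSize) {
  std::vector<DecodedRecord> Records;
  Cursor C(Data);
  while (C.Ok && C.Pos < Data.size()) {
    DecodedRecord R;
    R.Version = static_cast<uint8_t>(C.readLE(1));
    if (R.Version != 0)
      break; // unknown format version: stop rather than guess the layout
    R.Flags = static_cast<uint8_t>(C.readLE(1));
    R.EntryPC = C.readLE(PtrSize);
    R.TypeID = C.readLE(8);
    if (R.Flags & 0x2) { // bit 1: direct callees present
      uint64_t N = C.readULEB128();
      for (uint64_t I = 0; I < N && C.Ok; ++I)
        R.DirectCallees.push_back(C.readLE(PtrSize));
    }
    if (R.Flags & 0x4) { // bit 2: indirect callee type IDs present
      uint64_t N = C.readULEB128();
      for (uint64_t I = 0; I < N && C.Ok; ++I)
        R.IndirectTypeIDs.push_back(C.readLE(8));
    }
    if (!C.Ok)
      break; // truncated record
    Records.push_back(std::move(R));
  }
  return Records;
}

// The 46 bytes from the hex dump in the test above.
std::vector<uint8_t> tailCallTestBytes() {
  return {0x00, 0x05, 0x00, 0x00, 0x00, 0x00, 0x8e, 0x19, 0x0b, 0x7f, 0x33, 0x26,
          0xe3, 0x00, 0x01, 0x54, 0x86, 0xbc, 0x59, 0x81, 0x4b, 0x8e, 0x30, 0x00,
          0x05, 0x10, 0x00, 0x00, 0x00, 0xa1, 0x50, 0xb8, 0x3e, 0x0c, 0xfe, 0x3c,
          0xb2, 0x01, 0x54, 0x86, 0xbc, 0x59, 0x81, 0x4b, 0x8e, 0x30};
}
```

Decoding `tailCallTestBytes()` with `PtrSize` 4 gives two records, one per function. Both carry flags `0x05` (potential indirect target with indirect callees) and a single indirect callee type ID, `0x308e4b8159bc8654`, the value named in the test comment.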
