Merged
Changes from 4 commits
29 changes: 29 additions & 0 deletions llvm/docs/CallGraphSection.md
@@ -0,0 +1,29 @@
# .callgraph Section Layout

The `.callgraph` section is used to store call graph information for each function, which can be used for post-link analyses and optimizations. The section contains a series of records, with each record corresponding to a single function.

For efficiency, we make a distinction between direct and indirect call data. For direct calls, we record the unique callees, not the location of each individual call. For indirect calls, we record the location of each call site and the type ID of the callee. Post-link analysis tools that use this information to reconstruct the program call graph can accept additional information about indirect call sites from the user to improve the precision of the call graph.

## Per Function Record Layout

Each record in the `.callgraph` section has the following binary layout:

| Field | Type | Size (bits) | Description |
| ---------------------------- | ------------- | ----------- | ------------------------------------------------------------------------------------------------------- |
| Format Version | `uint32_t` | 32 | The version of the record format. The current version is 0. |
| Function Entry PC | `uintptr_t` | 32/64 | The address of the function's entry point. |
| Function Kind | `uint8_t` | 8 | An enum indicating the function's properties (e.g., if it's an indirect call target). |
| Function Type ID | `uint64_t` | 64 | The type ID of the function. This field is **only** present if `Function Kind` is `INDIRECT_TARGET_KNOWN_TID`. |
| Number of Indirect Callsites | `uint32_t` | 32 | The number of indirect call sites within the function. |
| Indirect Callsites Array | `Callsite[]` | Variable | An array of `Callsite` records, with a length of `Number of Indirect Callsites`. |
| Number of Unique Direct Callees | `uint32_t` | 32 | The number of unique direct call destinations from this function. |
| Direct Callees Array | `uintptr_t[]` | Variable | An array of unique direct callee entry point addresses, with a length of `Number of Unique Direct Callees`. |
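
As a rough illustration of the layout above, the size of one record can be computed from its variable parts. This is a sketch, not part of LLVM: `recordSizeInBytes` and its parameters are hypothetical names, and it assumes fields are stored back to back with no padding, which is what the emission code in this change suggests. Each `Callsite` entry is taken to be a 64-bit type ID followed by a pointer-sized PC, as described in the next section.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: byte size of one .callgraph record.
// Assumption: fields are packed with no alignment padding.
std::size_t recordSizeInBytes(std::size_t PointerSize, bool HasKnownTypeId,
                              std::uint32_t NumIndirectCallsites,
                              std::uint32_t NumUniqueDirectCallees) {
  assert((PointerSize == 4 || PointerSize == 8) && "unsupported pointer size");

  // One Callsite entry: 64-bit type ID plus a pointer-sized callsite PC.
  const std::size_t CallsiteSize = sizeof(std::uint64_t) + PointerSize;

  std::size_t Size = 0;
  Size += sizeof(std::uint32_t); // Format Version
  Size += PointerSize;           // Function Entry PC
  Size += sizeof(std::uint8_t);  // Function Kind
  if (HasKnownTypeId)
    Size += sizeof(std::uint64_t); // Function Type ID (only when known)
  Size += sizeof(std::uint32_t);   // Number of Indirect Callsites
  Size += NumIndirectCallsites * CallsiteSize;
  Size += sizeof(std::uint32_t);   // Number of Unique Direct Callees
  Size += NumUniqueDirectCallees * PointerSize;
  return Size;
}
```

Under these assumptions, a function on a 64-bit target with an unknown type ID, three indirect call sites, and three unique direct callees occupies 93 bytes.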

### Indirect Callsite Record Layout

Each record in the `Indirect Callsites Array` has the following layout:

| Field | Type | Size (bits) | Description |
| ----------------- | ----------- | ----------- | ----------------------------------------- |
| Type ID | `uint64_t` | 64 | The type ID of the indirect call target. |
| Callsite PC | `uintptr_t` | 32/64 | The address of the indirect call site. |
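
A post-link tool might decode one record along the following lines. This is a minimal sketch under stated assumptions: little-endian encoding, no padding between fields, and a pointer size supplied by the caller. `ByteReader`, `CallGraphRecord`, `IndirectCallsite`, and `parseRecord` are hypothetical names for illustration, not LLVM APIs.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

struct IndirectCallsite {
  std::uint64_t TypeId = 0;
  std::uint64_t CallsitePC = 0;
};

struct CallGraphRecord {
  std::uint32_t FormatVersion = 0;
  std::uint64_t FunctionEntryPC = 0;
  std::uint8_t FunctionKind = 0;
  std::optional<std::uint64_t> FunctionTypeId; // set only for kind 2
  std::vector<IndirectCallsite> IndirectCallsites;
  std::vector<std::uint64_t> DirectCallees;
};

// Hypothetical cursor that reads little-endian integers from a byte buffer.
class ByteReader {
public:
  ByteReader(const std::vector<std::uint8_t> &Data, std::size_t Offset)
      : Data(Data), Pos(Offset) {}

  // Reads Size (<= 8) bytes into Out; returns false if the buffer is too short.
  bool read(std::size_t Size, std::uint64_t &Out) {
    if (Size > 8 || Pos > Data.size() || Data.size() - Pos < Size)
      return false;
    Out = 0;
    for (std::size_t I = 0; I < Size; ++I)
      Out |= static_cast<std::uint64_t>(Data[Pos + I]) << (8 * I);
    Pos += Size;
    return true;
  }

  std::size_t position() const { return Pos; }

private:
  const std::vector<std::uint8_t> &Data;
  std::size_t Pos;
};

// Value of INDIRECT_TARGET_KNOWN_TID in the FunctionKind enum.
constexpr std::uint8_t KindIndirectTargetKnownTid = 2;

// Parses one record starting at Offset. On success, advances Offset past the
// record. Returns std::nullopt on truncation or an unsupported version.
std::optional<CallGraphRecord>
parseRecord(const std::vector<std::uint8_t> &Section, std::size_t &Offset,
            std::size_t PointerSize) {
  if (PointerSize != 4 && PointerSize != 8)
    return std::nullopt;

  ByteReader R(Section, Offset);
  CallGraphRecord Rec;
  std::uint64_t V = 0;

  if (!R.read(4, V) || V != 0) // only format version 0 is handled here
    return std::nullopt;
  Rec.FormatVersion = static_cast<std::uint32_t>(V);

  if (!R.read(PointerSize, Rec.FunctionEntryPC) || !R.read(1, V))
    return std::nullopt;
  Rec.FunctionKind = static_cast<std::uint8_t>(V);

  if (Rec.FunctionKind == KindIndirectTargetKnownTid) {
    if (!R.read(8, V))
      return std::nullopt;
    Rec.FunctionTypeId = V;
  }

  std::uint64_t Count = 0;
  if (!R.read(4, Count))
    return std::nullopt;
  for (std::uint64_t I = 0; I < Count; ++I) {
    IndirectCallsite CS;
    if (!R.read(8, CS.TypeId) || !R.read(PointerSize, CS.CallsitePC))
      return std::nullopt;
    Rec.IndirectCallsites.push_back(CS);
  }

  if (!R.read(4, Count))
    return std::nullopt;
  for (std::uint64_t I = 0; I < Count; ++I) {
    std::uint64_t Callee = 0;
    if (!R.read(PointerSize, Callee))
      return std::nullopt;
    Rec.DirectCallees.push_back(Callee);
  }

  Offset = R.position();
  return Rec;
}
```

Calling `parseRecord` in a loop until `Offset` reaches the section size would walk every record. Note that the type ID precedes the callsite PC within each `Callsite` entry, matching the table above.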
6 changes: 3 additions & 3 deletions llvm/include/llvm/CodeGen/AsmPrinter.h
@@ -200,13 +200,13 @@ class LLVM_ABI AsmPrinter : public MachineFunctionPass {

     /// Map type identifiers to callsite labels. Labels are generated for each
     /// indirect callsite in the function.
-    SmallVector<std::pair<CGTypeId, MCSymbol *>> CallSiteLabels;
+    SmallVector<std::pair<CGTypeId, MCSymbol *>> IndirectCallsites;
     SmallSet<MCSymbol *, 4> DirectCallees;
   };

   /// Enumeration of function kinds, and their mapping to function kind values
   /// stored in callgraph section entries.
-  enum class FunctionKind : uint64_t {
+  enum class FunctionKind : uint8_t {
     /// Function cannot be target to indirect calls.
     NOT_INDIRECT_TARGET = 0,

@@ -217,7 +217,7 @@ class LLVM_ABI AsmPrinter : public MachineFunctionPass {
     INDIRECT_TARGET_KNOWN_TID = 2,
   };

-  enum CallGraphSectionFormatVersion : uint64_t {
+  enum CallGraphSectionFormatVersion : uint32_t {
     V_0 = 0,
   };

90 changes: 44 additions & 46 deletions llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -1685,58 +1685,56 @@ void AsmPrinter::emitCallGraphSection(const MachineFunction &MF,
   OutStreamer->pushSection();
   OutStreamer->switchSection(FuncCGSection);

-  // Emit format version number.
-  OutStreamer->emitInt64(CallGraphSectionFormatVersion::V_0);
-
-  // Emit function's self information, which is composed of:
-  //  1) FunctionEntryPc
-  //  2) FunctionKind: Whether the function is indirect target, and if so,
-  //     whether its type id is known.
-  //  3) FunctionTypeId: Emit only when the function is an indirect target
-  //     and its type id is known.
-
-  // Emit function entry pc.
-  const MCSymbol *FunctionSymbol = getFunctionBegin();
-  OutStreamer->emitSymbolValue(FunctionSymbol, TM.getProgramPointerSize());
-
-  // If this function has external linkage or has its address taken and
-  // it is not a callback, then anything could call it.
-  const Function &F = MF.getFunction();
-  bool IsIndirectTarget =
-      !F.hasLocalLinkage() || F.hasAddressTaken(nullptr,
-                                                /*IgnoreCallbackUses=*/true,
-                                                /*IgnoreAssumeLikeCalls=*/true,
-                                                /*IgnoreLLVMUsed=*/false);
-
-  // FIXME: FunctionKind takes a few values but emitted as a 64-bit value.
-  // Can be optimized to occupy 2 bits instead.
-  // Emit function kind, and type id if available.
-  if (!IsIndirectTarget) {
-    OutStreamer->emitInt64(
-        static_cast<uint64_t>(FunctionKind::NOT_INDIRECT_TARGET));
-  } else {
+  auto EmitFunctionKindAndTypeId = [&]() {
+    const Function &F = MF.getFunction();
+    // If this function has external linkage or has its address taken and
+    // it is not a callback, then anything could call it.
+    bool IsIndirectTarget = !F.hasLocalLinkage() ||
+                            F.hasAddressTaken(nullptr,
+                                              /*IgnoreCallbackUses=*/true,
+                                              /*IgnoreAssumeLikeCalls=*/true,
+                                              /*IgnoreLLVMUsed=*/false);
+    if (!IsIndirectTarget) {
+      OutStreamer->emitInt8(
+          static_cast<uint8_t>(FunctionKind::NOT_INDIRECT_TARGET));
+      return;
+    }
     if (const auto *TypeId = extractNumericCGTypeId(F)) {
-      OutStreamer->emitInt64(
-          static_cast<uint64_t>(FunctionKind::INDIRECT_TARGET_KNOWN_TID));
+      OutStreamer->emitInt8(
+          static_cast<uint8_t>(FunctionKind::INDIRECT_TARGET_KNOWN_TID));
       OutStreamer->emitInt64(TypeId->getZExtValue());
-    } else {
-      OutStreamer->emitInt64(
-          static_cast<uint64_t>(FunctionKind::INDIRECT_TARGET_UNKNOWN_TID));
+      return;
     }
-  }
+    OutStreamer->emitInt8(
+        static_cast<uint8_t>(FunctionKind::INDIRECT_TARGET_UNKNOWN_TID));
+  };

-  // Emit callsite labels, where each element is a pair of type id and
-  // indirect callsite pc.
-  const auto &CallSiteLabels = FuncCGInfo.CallSiteLabels;
-  OutStreamer->emitInt64(CallSiteLabels.size());
-  for (const auto &[TypeId, Label] : CallSiteLabels) {
+  // Emit function's call graph information.
+  // 1) CallGraphSectionFormatVersion
+  // 2) Function entry PC.
+  // 3) FunctionKind: Whether the function is indirect target, and if so,
+  //    whether its type id is known.
+  // 4) FunctionTypeID if the function is indirect target, and its type id is
+  //    known.
+  // 5) Number of indirect callsites.
+  // 6) For each indirect callsite, its
+  //    callsite PC and callee's expected type id.
+  // 7) Number of unique direct callees.
+  // 8) For each unique direct callee, the callee's PC.
+
+  OutStreamer->emitInt32(CallGraphSectionFormatVersion::V_0);
+  const MCSymbol *FunctionSymbol = getFunctionBegin();
+  OutStreamer->emitSymbolValue(FunctionSymbol, TM.getProgramPointerSize());
+  EmitFunctionKindAndTypeId();
+  const auto &IndirectCallsites = FuncCGInfo.IndirectCallsites;
+  OutStreamer->emitInt32(IndirectCallsites.size());
+  const auto &DirectCallees = FuncCGInfo.DirectCallees;
+  for (const auto &[TypeId, Label] : IndirectCallsites) {
     OutStreamer->emitInt64(TypeId);
     OutStreamer->emitSymbolValue(Label, TM.getProgramPointerSize());
   }
-  FuncCGInfo.CallSiteLabels.clear();
-
-  const auto &DirectCallees = FuncCGInfo.DirectCallees;
-  OutStreamer->emitInt64(DirectCallees.size());
+  FuncCGInfo.IndirectCallsites.clear();
+  OutStreamer->emitInt32(DirectCallees.size());
   for (const auto &CalleeSymbol : DirectCallees) {
     OutStreamer->emitSymbolValue(CalleeSymbol, TM.getProgramPointerSize());
   }
@@ -1908,7 +1906,7 @@ void AsmPrinter::handleCallsiteForCallgraph(
     MCSymbol *S = MF->getContext().createTempSymbol();
     OutStreamer->emitLabel(S);
     uint64_t CalleeTypeIdVal = CalleeTypeId->getZExtValue();
-    FuncCGInfo.CallSiteLabels.emplace_back(CalleeTypeIdVal, S);
+    FuncCGInfo.IndirectCallsites.emplace_back(CalleeTypeIdVal, S);
   }
 }

10 changes: 5 additions & 5 deletions llvm/test/CodeGen/X86/call-graph-section-assembly.ll
@@ -34,10 +34,10 @@ entry:

 ; CHECK: .section .callgraph,"o",@progbits,.text

-; CHECK-NEXT: .quad 0
+; CHECK-NEXT: .long 0
 ; CHECK-NEXT: .quad [[LABEL_FUNC]]
-; CHECK-NEXT: .quad 1
-; CHECK-NEXT: .quad 3
+; CHECK-NEXT: .byte 1
+; CHECK-NEXT: .long 3
 !0 = !{!1}
 !1 = !{i64 0, !"_ZTSFvE.generalized"}
 ;; Test for MD5 hash of _ZTSFvE.generalized and the generated temporary callsite label.
@@ -53,8 +53,8 @@ entry:
 ;; Test for MD5 hash of _ZTSFPvS_E.generalized and the generated temporary callsite label.
 ; CHECK-NEXT: .quad 8646233951371320954
 ; CHECK-NEXT: .quad [[LABEL_TMP2]]
-;; Test for number of direct calls and {callsite_label, callee} pairs.
-; CHECK-NEXT: .quad 3
+;; Test for number of direct calls and direct callees.
+; CHECK-NEXT: .long 3
 ; CHECK-NEXT: .quad direct_foo
 ; CHECK-NEXT: .quad direct_bar
 ; CHECK-NEXT: .quad direct_baz
4 changes: 3 additions & 1 deletion llvm/test/CodeGen/X86/call-graph-section-tailcall.ll
@@ -29,6 +29,8 @@ declare !type !2 i32 @bar(i8 signext)

 !0 = !{i64 0, !"_ZTSFiPvcE.generalized"}
 !1 = !{!2}
-; CHECK-DAG: 5486bc59 814b8e30
+;; Verify that the type id 0x308e4b8159bc8654 is in section.
+; CHECK: {{.*}} 005486bc 59814b8e
+; CHECK-NEXT: 0x00000020 30000000 00000000 00000000 00000000
 !2 = !{i64 0, !"_ZTSFicE.generalized"}
 !3 = !{i64 0, !"_ZTSFiiE.generalized"}
6 changes: 3 additions & 3 deletions llvm/test/CodeGen/X86/call-graph-section.ll
@@ -25,12 +25,12 @@ entry:

 ; CHECK: Hex dump of section '.callgraph':

-; CHECK-DAG: 2444f731 f5eecb3e
+; CHECK-DAG: 2444f7 31f5eecb 3e
 !0 = !{i64 0, !"_ZTSFvE.generalized"}
 !1 = !{!0}
-; CHECK-DAG: 5486bc59 814b8e30
+; CHECK-DAG: 5486bc 59814b8e 30
 !2 = !{i64 0, !"_ZTSFicE.generalized"}
 !3 = !{!2}
-; CHECK-DAG: 7ade6814 f897fd77
+; CHECK-DAG: 7ade68 14f897fd 77
 !4 = !{!5}
 !5 = !{i64 0, !"_ZTSFPvS_E.generalized"}