18 changes: 18 additions & 0 deletions llvm/docs/CallGraphSection.md
@@ -0,0 +1,18 @@
# .callgraph Section Layout

The `.callgraph` section is used to store call graph information for each function. The section contains a series of records, with each record corresponding to a single function.

## Per Function Record Layout

Each record in the `.callgraph` section has the following binary layout:

| Field | Type | Size (bits) | Description |
| -------------------------------------- | ------------- | ----------- | ------------------------------------------------------------------------------------------------------- |
| Format Version | `uint8_t` | 8 | The version of the record format. The current version is 0. |
| Flags | `uint8_t` | 8 | A bitfield where: Bit 0 is set if the function is a potential indirect call target; Bit 1 is set if there are direct callees; Bit 2 is set if there are indirect callees. The remaining 5 bits are reserved. |
| Function Entry PC | `uintptr_t` | 32/64 | The address of the function's entry point. |
| Function Type ID | `uint64_t` | 64 | The type ID of the function. This field is non-zero if the function is a potential indirect call target and its type is known; otherwise it is set to 0. |
| Number of Unique Direct Callees | `ULEB128` | Variable | The number of unique direct call destinations from this function. This field is only present if there is at least one direct callee. |
| Direct Callees Array | `uintptr_t[]` | Variable | An array of unique direct callee entry point addresses. This field is only present if there is at least one direct callee. |
| Number of Unique Indirect Target Type IDs | `ULEB128` | Variable | The number of unique indirect call target type IDs. This field is only present if there is at least one indirect target type ID. |
| Indirect Target Type IDs Array | `uint64_t[]` | Variable | An array of unique indirect call target type IDs. This field is only present if there is at least one indirect target type ID. |
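
The layout above can be read back with a small decoder. The following is a minimal sketch, not part of LLVM: the names `CallGraphRecord`, `readLE`, `readULEB128`, and `parseRecord` are hypothetical, and it assumes a little-endian target, a pointer size of 4 or 8 bytes supplied by the caller, and that the presence of the optional fields is signalled by Flags bits 1 and 2 as described in the table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical in-memory form of one .callgraph record (illustrative names).
struct CallGraphRecord {
  uint8_t Version = 0;
  uint8_t Flags = 0;
  uint64_t EntryPC = 0;
  uint64_t TypeID = 0;
  std::vector<uint64_t> DirectCallees;
  std::vector<uint64_t> IndirectTypeIDs;
};

// Reads a little-endian unsigned integer of Size bytes (assumption: LE target).
static bool readLE(const std::vector<uint8_t> &Buf, size_t &Off, size_t Size,
                   uint64_t &Out) {
  if (Off > Buf.size() || Size > Buf.size() - Off)
    return false;
  Out = 0;
  for (size_t I = 0; I < Size; ++I)
    Out |= static_cast<uint64_t>(Buf[Off + I]) << (8 * I);
  Off += Size;
  return true;
}

// Reads an unsigned LEB128 value: 7 payload bits per byte, high bit = "more".
static bool readULEB128(const std::vector<uint8_t> &Buf, size_t &Off,
                        uint64_t &Out) {
  Out = 0;
  for (unsigned Shift = 0; Shift < 64; Shift += 7) {
    if (Off >= Buf.size())
      return false;
    uint8_t Byte = Buf[Off++];
    Out |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      return true;
  }
  return false; // Encoding is longer than a 64-bit value allows.
}

// Parses one record starting at Off and advances Off past it.
std::optional<CallGraphRecord> parseRecord(const std::vector<uint8_t> &Buf,
                                           size_t &Off, size_t PtrSize) {
  assert((PtrSize == 4 || PtrSize == 8) && "unsupported pointer size");
  CallGraphRecord R;
  uint64_t V = 0;
  if (!readLE(Buf, Off, 1, V))
    return std::nullopt;
  R.Version = static_cast<uint8_t>(V);
  if (R.Version != 0) // Only format version 0 is described above.
    return std::nullopt;
  if (!readLE(Buf, Off, 1, V))
    return std::nullopt;
  R.Flags = static_cast<uint8_t>(V);
  if (!readLE(Buf, Off, PtrSize, R.EntryPC) || !readLE(Buf, Off, 8, R.TypeID))
    return std::nullopt;

  // A ULEB128 count followed by that many fixed-size elements.
  auto ReadArray = [&](size_t ElemSize, std::vector<uint64_t> &Out) {
    uint64_t Count = 0;
    if (!readULEB128(Buf, Off, Count))
      return false;
    for (uint64_t I = 0; I < Count; ++I) {
      if (!readLE(Buf, Off, ElemSize, V))
        return false;
      Out.push_back(V);
    }
    return true;
  };
  // Bit 1: direct callees present. Bit 2: indirect target type IDs present.
  if ((R.Flags & 0x2) && !ReadArray(PtrSize, R.DirectCallees))
    return std::nullopt;
  if ((R.Flags & 0x4) && !ReadArray(8, R.IndirectTypeIDs))
    return std::nullopt;
  return R;
}
```

Calling `parseRecord` repeatedly with the same `Off` walks the section record by record. Addresses read from an unlinked object file are pre-relocation values, so a consumer would typically run this on a linked image or apply relocations first.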
7 changes: 7 additions & 0 deletions llvm/docs/CodeGenerator.rst
@@ -1662,6 +1662,13 @@ and stack sizes (unsigned LEB128). The stack size values only include the space
allocated in the function prologue. Functions with dynamic stack allocations are
not included.

Emitting function call graph information
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A section containing call graph metadata for each function is emitted when
``TargetOptions::EmitCallGraphSection`` is set (``--call-graph-section``). The
layout of this section is documented in :doc:`CallGraphSection`.

VLIW Packetizer
---------------

1 change: 1 addition & 0 deletions llvm/docs/Reference.rst
@@ -15,6 +15,7 @@ LLVM and API reference documentation.
BranchWeightMetadata
Bugpoint
CalleeTypeMetadata
CallGraphSection
CIBestPractices
CommandGuide/index
ContentAddressableStorage
27 changes: 7 additions & 20 deletions llvm/include/llvm/CodeGen/AsmPrinter.h
@@ -198,26 +198,13 @@ class LLVM_ABI AsmPrinter : public MachineFunctionPass {
/// and targets.
using CGTypeId = uint64_t;

/// Map type identifiers to callsite labels. Labels are generated for each
/// indirect callsite in the function.
SmallVector<std::pair<CGTypeId, MCSymbol *>> CallSiteLabels;
/// Unique target type IDs.
SmallSet<CGTypeId, 4> IndirectCalleeTypeIDs;
/// Unique direct callees.
SmallSet<MCSymbol *, 4> DirectCallees;
};

/// Enumeration of function kinds, and their mapping to function kind values
/// stored in callgraph section entries.
enum class FunctionKind : uint64_t {
/// Function cannot be target to indirect calls.
NOT_INDIRECT_TARGET = 0,

/// Function may be target to indirect calls but its type id is unknown.
INDIRECT_TARGET_UNKNOWN_TID = 1,

/// Function may be target to indirect calls and its type id is known.
INDIRECT_TARGET_KNOWN_TID = 2,
};

enum CallGraphSectionFormatVersion : uint64_t {
enum CallGraphSectionFormatVersion : uint8_t {
V_0 = 0,
};

@@ -386,9 +373,9 @@ class LLVM_ABI AsmPrinter : public MachineFunctionPass {
/// are available. Returns empty string otherwise.
StringRef getConstantSectionSuffix(const Constant *C) const;

/// Iff MI is an indirect call, generate and emit a label after the callsites
/// which will be used to populate the .callgraph section. For direct
/// callsites add the callee symbol to direct callsites list of FuncCGInfo.
/// If MI is an indirect call, add its expected type IDs to the indirect type
/// IDs list of FuncCGInfo. If MI is a direct call, add the callee symbol to
/// the direct callees list of FuncCGInfo.
void handleCallsiteForCallgraph(
FunctionCallGraphInfo &FuncCGInfo,
const MachineFunction::CallSiteInfoMap &CallSitesInfoMap,
111 changes: 61 additions & 50 deletions llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -20,6 +20,7 @@
#include "WinException.h"
#include "llvm/ADT/APFloat.h"
#include "llvm/ADT/APInt.h"
#include "llvm/ADT/BitmaskEnum.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallPtrSet.h"
@@ -205,6 +206,17 @@ class AddrLabelMapCallbackPtr final : CallbackVH {
};
} // namespace

namespace callgraph {
LLVM_ENABLE_BITMASK_ENUMS_IN_NAMESPACE();
enum Flags : uint8_t {
None = 0,
IsIndirectTarget = 1u << 0,
HasDirectCallees = 1u << 1,
HasIndirectCallees = 1u << 2,
LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue*/ HasIndirectCallees)
};
} // namespace callgraph

class llvm::AddrLabelMap {
MCContext &Context;
struct AddrLabelSymEntry {
@@ -1683,63 +1695,65 @@ void AsmPrinter::emitCallGraphSection(const MachineFunction &MF,
OutStreamer->pushSection();
OutStreamer->switchSection(FuncCGSection);

// Emit format version number.
OutStreamer->emitInt64(CallGraphSectionFormatVersion::V_0);

// Emit function's self information, which is composed of:
// 1) FunctionEntryPc
// 2) FunctionKind: Whether the function is indirect target, and if so,
// whether its type id is known.
// 3) FunctionTypeId: Emit only when the function is an indirect target
// and its type id is known.

// Emit function entry pc.
const MCSymbol *FunctionSymbol = getFunctionBegin();
OutStreamer->emitSymbolValue(FunctionSymbol, TM.getProgramPointerSize());

const Function &F = MF.getFunction();
// If this function has external linkage or has its address taken and
// it is not a callback, then anything could call it.
const Function &F = MF.getFunction();
bool IsIndirectTarget =
!F.hasLocalLinkage() || F.hasAddressTaken(nullptr,
/*IgnoreCallbackUses=*/true,
/*IgnoreAssumeLikeCalls=*/true,
/*IgnoreLLVMUsed=*/false);

// FIXME: FunctionKind takes a few values but emitted as a 64-bit value.
// Can be optimized to occupy 2 bits instead.
// Emit function kind, and type id if available.
if (!IsIndirectTarget) {
OutStreamer->emitInt64(
static_cast<uint64_t>(FunctionKind::NOT_INDIRECT_TARGET));
} else {
if (const auto *TypeId = extractNumericCGTypeId(F)) {
OutStreamer->emitInt64(
static_cast<uint64_t>(FunctionKind::INDIRECT_TARGET_KNOWN_TID));
OutStreamer->emitInt64(TypeId->getZExtValue());
} else {
OutStreamer->emitInt64(
static_cast<uint64_t>(FunctionKind::INDIRECT_TARGET_UNKNOWN_TID));
}
}
const auto &DirectCallees = FuncCGInfo.DirectCallees;
const auto &IndirectCalleeTypeIDs = FuncCGInfo.IndirectCalleeTypeIDs;

using namespace callgraph;
Flags CGFlags = Flags::None;
if (IsIndirectTarget)
CGFlags |= Flags::IsIndirectTarget;
if (DirectCallees.size() > 0)
CGFlags |= Flags::HasDirectCallees;
if (IndirectCalleeTypeIDs.size() > 0)
CGFlags |= Flags::HasIndirectCallees;

// Emit function's call graph information.
// 1) CallGraphSectionFormatVersion
// 2) Flags
// a. LSB bit 0 is set to 1 if the function is a potential indirect
// target.
// b. LSB bit 1 is set to 1 if there are direct callees.
// c. LSB bit 2 is set to 1 if there are indirect callees.
// d. Rest of the 5 bits in Flags are reserved for any future use.
// 3) Function entry PC.
// 4) FunctionTypeID if the function is indirect target and its type id
// is known, otherwise it is set to 0.
// 5) Number of unique direct callees, if at least one exists.
// 6) For each unique direct callee, the callee's PC.
// 7) Number of unique indirect target type IDs, if at least one exists.
// 8) Each unique indirect target type id.
OutStreamer->emitInt8(CallGraphSectionFormatVersion::V_0);
OutStreamer->emitInt8(static_cast<uint8_t>(CGFlags));
OutStreamer->emitSymbolValue(FunctionSymbol, TM.getProgramPointerSize());
const auto *TypeId = extractNumericCGTypeId(F);
if (IsIndirectTarget && TypeId)
OutStreamer->emitInt64(TypeId->getZExtValue());
else
OutStreamer->emitInt64(0);

// Emit callsite labels, where each element is a pair of type id and
// indirect callsite pc.
const auto &CallSiteLabels = FuncCGInfo.CallSiteLabels;
OutStreamer->emitInt64(CallSiteLabels.size());
for (const auto &[TypeId, Label] : CallSiteLabels) {
OutStreamer->emitInt64(TypeId);
OutStreamer->emitSymbolValue(Label, TM.getProgramPointerSize());
if (DirectCallees.size() > 0) {
OutStreamer->emitULEB128IntValue(DirectCallees.size());
for (const auto &CalleeSymbol : DirectCallees)
OutStreamer->emitSymbolValue(CalleeSymbol, TM.getProgramPointerSize());
FuncCGInfo.DirectCallees.clear();
}
FuncCGInfo.CallSiteLabels.clear();

const auto &DirectCallees = FuncCGInfo.DirectCallees;
OutStreamer->emitInt64(DirectCallees.size());
for (const auto &CalleeSymbol : DirectCallees) {
OutStreamer->emitSymbolValue(CalleeSymbol, TM.getProgramPointerSize());
if (IndirectCalleeTypeIDs.size() > 0) {
OutStreamer->emitULEB128IntValue(IndirectCalleeTypeIDs.size());
for (const auto &CalleeTypeId : IndirectCalleeTypeIDs)
OutStreamer->emitInt64(CalleeTypeId);
FuncCGInfo.IndirectCalleeTypeIDs.clear();
}
FuncCGInfo.DirectCallees.clear();

// End of emitting call graph section contents.
OutStreamer->popSection();
}

@@ -1877,8 +1891,7 @@ void AsmPrinter::handleCallsiteForCallgraph(
FunctionCallGraphInfo &FuncCGInfo,
const MachineFunction::CallSiteInfoMap &CallSitesInfoMap,
const MachineInstr &MI) {
assert(MI.isCall() &&
"Callsite labels are meant for call instructions only.");
assert(MI.isCall() && "This method is meant for call instructions only.");
const MachineOperand &CalleeOperand = MI.getOperand(0);
if (CalleeOperand.isGlobal() || CalleeOperand.isSymbol()) {
// Handle direct calls.
@@ -1903,10 +1916,8 @@ void AsmPrinter::handleCallsiteForCallgraph(
// Handle indirect callsite info.
// Only indirect calls have type identifiers set.
for (ConstantInt *CalleeTypeId : CallSiteInfo->second.CalleeTypeIds) {
MCSymbol *S = MF->getContext().createTempSymbol();
OutStreamer->emitLabel(S);
uint64_t CalleeTypeIdVal = CalleeTypeId->getZExtValue();
FuncCGInfo.CallSiteLabels.emplace_back(CalleeTypeIdVal, S);
FuncCGInfo.IndirectCalleeTypeIDs.insert(CalleeTypeIdVal);
}
}

39 changes: 39 additions & 0 deletions llvm/test/CodeGen/ARM/call-graph-section-addrtaken.ll
@@ -0,0 +1,39 @@
;; Test if a potential indirect call target function which has internal linkage and
;; address taken has its type ID emitted to callgraph section.
;; This test also makes sure that callback functions which meet the above constraint
;; are handled correctly.

; RUN: llc -mtriple=arm-unknown-linux --call-graph-section -o - < %s | FileCheck %s

declare !type !0 void @_Z6doWorkPFviE(ptr)

define i32 @_Z4testv() !type !1 {
entry:
call void @_Z6doWorkPFviE(ptr nonnull @_ZL10myCallbacki)
ret i32 0
}

; CHECK: _ZL10myCallbacki:
; CHECK-NEXT: [[LABEL_FUNC:\.Lfunc_begin[0-9]+]]:
define internal void @_ZL10myCallbacki(i32 %value) !type !2 {
entry:
%sink = alloca i32, align 4
store volatile i32 %value, ptr %sink, align 4
%i1 = load volatile i32, ptr %sink, align 4
ret void
}

!0 = !{i64 0, !"_ZTSFvPFviEE.generalized"}
!1 = !{i64 0, !"_ZTSFivE.generalized"}
!2 = !{i64 0, !"_ZTSFviE.generalized"}

; CHECK: .section .callgraph,"o",%progbits,.text
;; Version
; CHECK-NEXT: .byte 0
;; Flags -- Potential indirect target so LSB is set to 1. Other bits are 0.
; CHECK-NEXT: .byte 1
;; Function Entry PC
; CHECK-NEXT: .long [[LABEL_FUNC]]
;; Function type ID -5212364466660467813
; CHECK-NEXT: .long 1154849691
; CHECK-NEXT: .long 3081369122
63 changes: 63 additions & 0 deletions llvm/test/CodeGen/ARM/call-graph-section-assembly.ll
@@ -0,0 +1,63 @@
;; Test if the .callgraph section contains the unique type IDs (MD5 hashes of
;; the callees' types) for indirect call sites annotated with !callee_type
;; metadata.
;; Test if the .callgraph section contains unique direct callees.

; RUN: llc -mtriple=arm-unknown-linux --call-graph-section -o - < %s | FileCheck %s

declare !type !0 void @direct_foo()
declare !type !1 i32 @direct_bar(i8)
declare !type !2 ptr @direct_baz(ptr)

; CHECK: ball:
; CHECK-NEXT: [[LABEL_FUNC:\.Lfunc_begin[0-9]+]]:
define ptr @ball() {
entry:
call void @direct_foo()
%fp_foo_val = load ptr, ptr null, align 8
call void (...) %fp_foo_val(), !callee_type !0
call void @direct_foo()
%fp_bar_val = load ptr, ptr null, align 8
%call_fp_bar = call i32 %fp_bar_val(i8 0), !callee_type !2
%call_fp_bar_direct = call i32 @direct_bar(i8 1)
%fp_baz_val = load ptr, ptr null, align 8
%call_fp_baz = call ptr %fp_baz_val(ptr null), !callee_type !4
call void @direct_foo()
%call_fp_baz_direct = call ptr @direct_baz(ptr null)
call void @direct_foo()
ret ptr %call_fp_baz
}

!0 = !{!1}
!1 = !{i64 0, !"_ZTSFvE.generalized"}
!2 = !{!3}
!3 = !{i64 0, !"_ZTSFicE.generalized"}
!4 = !{!5}
!5 = !{i64 0, !"_ZTSFPvS_E.generalized"}

; CHECK: .section .callgraph,"o",%progbits,.text
;; Version
; CHECK-NEXT: .byte 0
;; Flags
; CHECK-NEXT: .byte 7
;; Function Entry PC
; CHECK-NEXT: .long [[LABEL_FUNC]]
;; Function type ID -- set to 0 as no type metadata attached to function.
; CHECK-NEXT: .long 0
; CHECK-NEXT: .long 0
;; Number of unique direct callees.
; CHECK-NEXT: .byte 3
;; Direct callees.
; CHECK-NEXT: .long direct_foo
; CHECK-NEXT: .long direct_bar
; CHECK-NEXT: .long direct_baz
;; Number of unique indirect target type IDs.
; CHECK-NEXT: .byte 3
;; Indirect type IDs.
; CHECK-NEXT: .long 838288420
; CHECK-NEXT: .long 1053552373
; CHECK-NEXT: .long 1505527380
; CHECK-NEXT: .long 814631809
; CHECK-NEXT: .long 342417018
; CHECK-NEXT: .long 2013108216
34 changes: 34 additions & 0 deletions llvm/test/CodeGen/ARM/call-graph-section-tailcall.ll
@@ -0,0 +1,34 @@
;; Tests that we store the type identifiers in .callgraph section of the object file for tailcalls.

; RUN: llc -mtriple=arm-unknown-linux --call-graph-section -filetype=obj -o - < %s | \
; RUN: llvm-readelf -x .callgraph - | FileCheck %s

define i32 @check_tailcall(ptr %func, i8 %x) !type !0 {
entry:
%call = tail call i32 %func(i8 signext %x), !callee_type !1
ret i32 %call
}

define i32 @main(i32 %argc) !type !3 {
entry:
%andop = and i32 %argc, 1
%cmp = icmp eq i32 %andop, 0
%foo.bar = select i1 %cmp, ptr @foo, ptr @bar
%call.i = tail call i32 %foo.bar(i8 signext 97), !callee_type !1
ret i32 %call.i
}

declare !type !2 i32 @foo(i8 signext)

declare !type !2 i32 @bar(i8 signext)

!0 = !{i64 0, !"_ZTSFiPvcE.generalized"}
!1 = !{!2}
!2 = !{i64 0, !"_ZTSFicE.generalized"}
!3 = !{i64 0, !"_ZTSFiiE.generalized"}

; CHECK: Hex dump of section '.callgraph':
; CHECK-NEXT: 0x00000000 00050000 00008e19 0b7f3326 e3000154
; CHECK-NEXT: 0x00000010 86bc5981 4b8e3000 05100000 00a150b8
;; Verify that the type id 0x308e4b8159bc8654 is in section.
; CHECK-NEXT: 0x00000020 3e0cfe3c b2015486 bc59814b 8e30