Commit 3c805fd

Callgraph section format changes.
1 parent 42b195e commit 3c805fd

3 files changed: 76 additions and 49 deletions

llvm/docs/CallGraphSection.md

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+# .callgraph Section Layout
+
+The `.callgraph` section stores call graph information for each function, which can be used for post-link analyses and optimizations. The section contains a series of records, with each record corresponding to a single function.
+
+For efficiency, we make a distinction between direct and indirect call data. For direct calls, we record the unique callees, not the location of each individual call. For indirect calls, we record the location of each call site and the type ID of the callee. Post-link analysis scripts that use this information to reconstruct the program call graph can potentially receive more information about indirect call sites from the user to improve the precision of the call graph.
+
+## Per Function Record Layout
+
+Each record in the `.callgraph` section has the following binary layout:
+
+| Field | Type | Size (bits) | Description |
+| ------------------------------- | ------------- | ----------- | ------------------------------------------------------------------------------------------------------- |
+| Format Version | `uint32_t` | 32 | The version of the record format. The current version is 0. |
+| Function Entry PC | `uintptr_t` | 32/64 | The address of the function's entry point. |
+| Function Kind | `uint8_t` | 8 | An enum indicating the function's properties (e.g., if it's an indirect call target). |
+| Function Type ID | `uint64_t` | 64 | The type ID of the function. This field is **only** present if `Function Kind` is `INDIRECT_TARGET_KNOWN_TID`. |
+| Number of Indirect Callsites | `uint32_t` | 32 | The number of indirect call sites within the function. |
+| Indirect Callsites Array | `Callsite[]` | Variable | An array of `Callsite` records, with a length of `Number of Indirect Callsites`. |
+| Number of Unique Direct Callees | `uint32_t` | 32 | The number of unique direct call destinations from this function. |
+| Direct Callees Array | `uintptr_t[]` | Variable | An array of unique direct callee entry point addresses, with a length of `Number of Unique Direct Callees`. |
+
+### Indirect Callsite Record Layout
+
+Each record in the `Indirect Callsites Array` has the following layout:
+
+| Field | Type | Size (bits) | Description |
+| ----------------- | ----------- | ----------- | ----------------------------------------- |
+| Type ID | `uint64_t` | 64 | The type ID of the indirect call target. |
+| Callsite PC | `uintptr_t` | 32/64 | The address of the indirect call site. |
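
The record layout documented above can be read back with a small decoder. The following is a minimal sketch, not LLVM code: the `Record` and `Callsite` structs and the `appendLE`, `readLE`, and `parseRecord` helpers are hypothetical names introduced here for illustration. It assumes a little-endian target and a linked image in which the entry PC, call site, and callee addresses are already resolved (in a relocatable object these fields would be covered by relocations). The value 2 for `INDIRECT_TARGET_KNOWN_TID` is taken from the header in this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory mirror of one indirect callsite entry.
struct Callsite {
  uint64_t TypeId;
  uint64_t CallsitePC;
};

// Hypothetical in-memory mirror of one .callgraph record (not an LLVM type).
struct Record {
  uint32_t FormatVersion = 0;
  uint64_t EntryPC = 0;
  uint8_t Kind = 0;
  bool HasTypeId = false; // true only when Kind == INDIRECT_TARGET_KNOWN_TID (2)
  uint64_t TypeId = 0;
  std::vector<Callsite> IndirectCallsites;
  std::vector<uint64_t> DirectCallees;
};

// Append the low NumBytes bytes of Value in little-endian order.
// Only used here to build sample input for the parser.
void appendLE(std::vector<uint8_t> &Out, uint64_t Value, unsigned NumBytes) {
  for (unsigned I = 0; I < NumBytes; ++I)
    Out.push_back(static_cast<uint8_t>(Value >> (8 * I)));
}

// Read NumBytes little-endian bytes at Pos and advance Pos.
// Returns false if the buffer is too short.
bool readLE(const std::vector<uint8_t> &In, size_t &Pos, unsigned NumBytes,
            uint64_t &Value) {
  if (Pos > In.size() || NumBytes > In.size() - Pos)
    return false;
  Value = 0;
  for (unsigned I = 0; I < NumBytes; ++I)
    Value |= static_cast<uint64_t>(In[Pos + I]) << (8 * I);
  Pos += NumBytes;
  return true;
}

// Parse one record starting at Pos. PtrSize is 4 or 8 (assumed known from
// the object file). Returns false on truncation or an unknown version.
bool parseRecord(const std::vector<uint8_t> &In, size_t &Pos, unsigned PtrSize,
                 Record &R) {
  uint64_t V = 0;
  if (!readLE(In, Pos, 4, V) || V != 0) // Format Version, currently 0
    return false;
  R.FormatVersion = static_cast<uint32_t>(V);
  if (!readLE(In, Pos, PtrSize, R.EntryPC)) // Function Entry PC
    return false;
  if (!readLE(In, Pos, 1, V)) // Function Kind
    return false;
  R.Kind = static_cast<uint8_t>(V);
  R.HasTypeId = (R.Kind == 2);
  if (R.HasTypeId && !readLE(In, Pos, 8, R.TypeId)) // Function Type ID
    return false;
  uint64_t Count = 0;
  if (!readLE(In, Pos, 4, Count)) // Number of Indirect Callsites
    return false;
  for (uint64_t I = 0; I < Count; ++I) {
    Callsite C{};
    if (!readLE(In, Pos, 8, C.TypeId) || !readLE(In, Pos, PtrSize, C.CallsitePC))
      return false;
    R.IndirectCallsites.push_back(C);
  }
  if (!readLE(In, Pos, 4, Count)) // Number of Unique Direct Callees
    return false;
  for (uint64_t I = 0; I < Count; ++I) {
    uint64_t Callee = 0;
    if (!readLE(In, Pos, PtrSize, Callee))
      return false;
    R.DirectCallees.push_back(Callee);
  }
  return true;
}
```

Because `Pos` is advanced past each record, calling `parseRecord` in a loop until `Pos` reaches the end of the section would walk every function's record in turn.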

llvm/include/llvm/CodeGen/AsmPrinter.h

Lines changed: 3 additions & 3 deletions
@@ -200,13 +200,13 @@ class LLVM_ABI AsmPrinter : public MachineFunctionPass {

     /// Map type identifiers to callsite labels. Labels are generated for each
     /// indirect callsite in the function.
-    SmallVector<std::pair<CGTypeId, MCSymbol *>> CallSiteLabels;
+    SmallVector<std::pair<CGTypeId, MCSymbol *>> IndirectCallsites;
     SmallSet<MCSymbol *, 4> DirectCallees;
   };

   /// Enumeration of function kinds, and their mapping to function kind values
   /// stored in callgraph section entries.
-  enum class FunctionKind : uint64_t {
+  enum class FunctionKind : uint8_t {
     /// Function cannot be target to indirect calls.
     NOT_INDIRECT_TARGET = 0,

@@ -217,7 +217,7 @@ class LLVM_ABI AsmPrinter : public MachineFunctionPass {
     INDIRECT_TARGET_KNOWN_TID = 2,
   };

-  enum CallGraphSectionFormatVersion : uint64_t {
+  enum CallGraphSectionFormatVersion : uint32_t {
     V_0 = 0,
   };
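
The narrower enum types in this header pair with the `emitInt8`/`emitInt32` calls in `AsmPrinter.cpp`, which shrinks the fixed-size part of every record. As a rough sketch of that arithmetic (the function names `fixedBytesBefore` and `fixedBytesAfter` are hypothetical, and the optional Function Type ID and the variable-length arrays are left out because their element sizes do not change):

```cpp
#include <cassert>
#include <cstdint>

// Fixed-size bytes per record before this commit: every scalar field was
// emitted with emitInt64, plus one pointer-sized entry PC.
constexpr unsigned fixedBytesBefore(unsigned PtrSize) {
  return 8          // format version
         + PtrSize  // function entry PC
         + 8        // function kind
         + 8        // number of indirect callsites
         + 8;       // number of unique direct callees
}

// Fixed-size bytes per record after this commit.
constexpr unsigned fixedBytesAfter(unsigned PtrSize) {
  return 4          // format version (uint32_t)
         + PtrSize  // function entry PC
         + 1        // function kind (uint8_t)
         + 4        // number of indirect callsites (uint32_t)
         + 4;       // number of unique direct callees (uint32_t)
}

static_assert(fixedBytesBefore(8) == 40, "64-bit target, old layout");
static_assert(fixedBytesAfter(8) == 21, "64-bit target, new layout");
```

Under these assumptions the saving is 19 bytes per function record regardless of pointer size, since the pointer-sized field is unchanged.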

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp

Lines changed: 44 additions & 46 deletions
@@ -1685,58 +1685,56 @@ void AsmPrinter::emitCallGraphSection(const MachineFunction &MF,
   OutStreamer->pushSection();
   OutStreamer->switchSection(FuncCGSection);

-  // Emit format version number.
-  OutStreamer->emitInt64(CallGraphSectionFormatVersion::V_0);
-
-  // Emit function's self information, which is composed of:
-  // 1) FunctionEntryPc
-  // 2) FunctionKind: Whether the function is indirect target, and if so,
-  //    whether its type id is known.
-  // 3) FunctionTypeId: Emit only when the function is an indirect target
-  //    and its type id is known.
-
-  // Emit function entry pc.
-  const MCSymbol *FunctionSymbol = getFunctionBegin();
-  OutStreamer->emitSymbolValue(FunctionSymbol, TM.getProgramPointerSize());
-
-  // If this function has external linkage or has its address taken and
-  // it is not a callback, then anything could call it.
-  const Function &F = MF.getFunction();
-  bool IsIndirectTarget =
-      !F.hasLocalLinkage() || F.hasAddressTaken(nullptr,
-                                                /*IgnoreCallbackUses=*/true,
-                                                /*IgnoreAssumeLikeCalls=*/true,
-                                                /*IgnoreLLVMUsed=*/false);
-
-  // FIXME: FunctionKind takes a few values but emitted as a 64-bit value.
-  // Can be optimized to occupy 2 bits instead.
-  // Emit function kind, and type id if available.
-  if (!IsIndirectTarget) {
-    OutStreamer->emitInt64(
-        static_cast<uint64_t>(FunctionKind::NOT_INDIRECT_TARGET));
-  } else {
+  auto EmitFunctionKindAndTypeId = [&]() {
+    const Function &F = MF.getFunction();
+    // If this function has external linkage or has its address taken and
+    // it is not a callback, then anything could call it.
+    bool IsIndirectTarget = !F.hasLocalLinkage() ||
+                            F.hasAddressTaken(nullptr,
+                                              /*IgnoreCallbackUses=*/true,
+                                              /*IgnoreAssumeLikeCalls=*/true,
+                                              /*IgnoreLLVMUsed=*/false);
+    if (!IsIndirectTarget) {
+      OutStreamer->emitInt8(
+          static_cast<uint8_t>(FunctionKind::NOT_INDIRECT_TARGET));
+      return;
+    }
     if (const auto *TypeId = extractNumericCGTypeId(F)) {
-      OutStreamer->emitInt64(
-          static_cast<uint64_t>(FunctionKind::INDIRECT_TARGET_KNOWN_TID));
+      OutStreamer->emitInt8(
+          static_cast<uint8_t>(FunctionKind::INDIRECT_TARGET_KNOWN_TID));
       OutStreamer->emitInt64(TypeId->getZExtValue());
-    } else {
-      OutStreamer->emitInt64(
-          static_cast<uint64_t>(FunctionKind::INDIRECT_TARGET_UNKNOWN_TID));
+      return;
     }
-  }
+    OutStreamer->emitInt8(
+        static_cast<uint8_t>(FunctionKind::INDIRECT_TARGET_UNKNOWN_TID));
+  };

-  // Emit callsite labels, where each element is a pair of type id and
-  // indirect callsite pc.
-  const auto &CallSiteLabels = FuncCGInfo.CallSiteLabels;
-  OutStreamer->emitInt64(CallSiteLabels.size());
-  for (const auto &[TypeId, Label] : CallSiteLabels) {
+  // Emit function's call graph information.
+  // 1) CallGraphSectionFormatVersion
+  // 2) Function entry PC.
+  // 3) FunctionKind: Whether the function is indirect target, and if so,
+  //    whether its type id is known.
+  // 4) FunctionTypeID if the function is indirect target, and its type id is
+  //    known.
+  // 5) Number of indirect callsites.
+  // 6) For each indirect callsite, its
+  //    callsite PC and callee's expected type id.
+  // 7) Number of unique direct callees.
+  // 8) For each unique direct callee, the callee's PC.
+
+  OutStreamer->emitInt32(CallGraphSectionFormatVersion::V_0);
+  const MCSymbol *FunctionSymbol = getFunctionBegin();
+  OutStreamer->emitSymbolValue(FunctionSymbol, TM.getProgramPointerSize());
+  EmitFunctionKindAndTypeId();
+  const auto &IndirectCallsites = FuncCGInfo.IndirectCallsites;
+  OutStreamer->emitInt32(IndirectCallsites.size());
+  const auto &DirectCallees = FuncCGInfo.DirectCallees;
+  for (const auto &[TypeId, Label] : IndirectCallsites) {
     OutStreamer->emitInt64(TypeId);
     OutStreamer->emitSymbolValue(Label, TM.getProgramPointerSize());
   }
-  FuncCGInfo.CallSiteLabels.clear();
-
-  const auto &DirectCallees = FuncCGInfo.DirectCallees;
-  OutStreamer->emitInt64(DirectCallees.size());
+  FuncCGInfo.IndirectCallsites.clear();
+  OutStreamer->emitInt32(DirectCallees.size());
   for (const auto &CalleeSymbol : DirectCallees) {
     OutStreamer->emitSymbolValue(CalleeSymbol, TM.getProgramPointerSize());
   }
@@ -1908,7 +1906,7 @@ void AsmPrinter::handleCallsiteForCallgraph(
     MCSymbol *S = MF->getContext().createTempSymbol();
     OutStreamer->emitLabel(S);
     uint64_t CalleeTypeIdVal = CalleeTypeId->getZExtValue();
-    FuncCGInfo.CallSiteLabels.emplace_back(CalleeTypeIdVal, S);
+    FuncCGInfo.IndirectCallsites.emplace_back(CalleeTypeIdVal, S);
   }
 }
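
Each indirect call site is recorded as a (type ID, label) pair, and each function record carries its own kind and optional type ID, so a post-link tool can pair the two. The sketch below shows one possible way to do that and is not part of this commit: `FunctionInfo` and `candidateTargets` are hypothetical names, and treating indirect targets with an unknown type ID as possible callees of every call site is a conservative policy chosen here as an assumption, not something the commit prescribes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical summary of the per-function fields a tool would have decoded.
struct FunctionInfo {
  uint64_t EntryPC;
  bool IsIndirectTarget; // Function Kind is not NOT_INDIRECT_TARGET
  bool HasTypeId;        // Function Kind is INDIRECT_TARGET_KNOWN_TID
  uint64_t TypeId;       // meaningful only when HasTypeId is true
};

// Return the entry PCs of functions that an indirect call site with the
// given type ID might reach. Functions with a known type ID must match;
// indirect targets with an unknown type ID are kept conservatively
// (an assumption of this sketch).
std::vector<uint64_t> candidateTargets(const std::vector<FunctionInfo> &Funcs,
                                       uint64_t CallsiteTypeId) {
  std::vector<uint64_t> Targets;
  for (const FunctionInfo &F : Funcs) {
    if (!F.IsIndirectTarget)
      continue; // can only be reached by direct calls
    if (!F.HasTypeId || F.TypeId == CallsiteTypeId)
      Targets.push_back(F.EntryPC);
  }
  return Targets;
}
```

Direct edges need no such matching, since the Direct Callees Array already lists the callee entry addresses.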
