
Commit 64c2d5f

necipfazil authored and Prabhuk committed
[CallGraphSection] Add call graph section options and documentation
This is the first patch in a series that adds support for computing, storing, and restoring call graphs with LLVM. It adds the options and the design documentation for computing and storing call graphs.

Inferring indirect call targets from a binary is challenging without source-level information. Hence, reconstructing a fine-grained call graph from the binary is infeasible for indirect/virtual calls. To address this, the designed solution is to collect the information needed to construct the call graph while the source information is present, and to store it in a non-code section of the binary. To enable it, use -fcall-graph-section for Clang, or --call-graph-section for LLVM.

Original RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D105907?id=358319
Pull Request: llvm#87572
1 parent ba91c65 commit 64c2d5f

9 files changed: +280 -1 lines changed
clang/docs/CallGraphSection.rst

Lines changed: 251 additions & 0 deletions
@@ -0,0 +1,251 @@
==================
Call Graph Section
==================

Introduction
============

With ``-fcall-graph-section``, the compiler will create a call graph section
in the object file. It will include type identifiers for indirect calls and
targets. This information can be used to map indirect calls to their receivers
with matching types. A complete and high-precision call graph can be
reconstructed by complementing this information with disassembly
(see ``llvm-objdump --call-graph-info``).

Semantics
=========

A coarse-grained, type-agnostic call graph may allow indirect calls to target
any function in the program. This approach ensures completeness since no
indirect call edge is missing. However, it is generally poor in precision
due to unneeded edges.

A call graph section provides type identifiers for indirect calls and targets.
This information can be used to restrict the receivers of an indirect call to
indirect targets with a matching type. Consequently, the precision of indirect
call edges is improved while completeness is maintained.

The ``llvm-objdump`` utility provides a ``--call-graph-info`` option to extract
full call graph information by parsing the content of the call graph section
and disassembling the program for complementary information, e.g., direct
calls.

Section layout
==============

A call graph section consists of zero or more call graph entries.
Each entry contains information on a function and its indirect calls.

An entry of a call graph section has the following layout in the binary:

+---------------------+-----------------------------------------------------------------------+
| Element             | Content                                                               |
+=====================+=======================================================================+
| FormatVersionNumber | Format version number.                                                |
+---------------------+-----------------------------------------------------------------------+
| FunctionEntryPc     | Function entry address.                                               |
+---------------------+-----------------------------------+-----------------------------------+
|                     | A flag whether the function is an | - 0: not an indirect target       |
| FunctionKind        | indirect target, and if so,       | - 1: indirect target, unknown id  |
|                     | whether its type id is known.     | - 2: indirect target, known id    |
+---------------------+-----------------------------------+-----------------------------------+
| FunctionTypeId      | Type id for the indirect target. Present only when FunctionKind is 2. |
+---------------------+-----------------------------------------------------------------------+
| CallSiteCount       | Number of type id to indirect call site mappings that follow.         |
+---------------------+-----------------------------------------------------------------------+
| CallSiteList        | List of type id and indirect call site pc pairs.                      |
+---------------------+-----------------------------------------------------------------------+

Each element in an entry (including each element of the contained lists and
pairs) occupies 64 bits.

The format version number is repeated per entry to support concatenation of
call graph sections with different format versions by the linker.

As of now, the only supported format version is described above and has version
number 0.
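
The layout above can be walked with a small reader. The following is a minimal
sketch rather than code from this patch: ``CallGraphEntry`` and
``parseCallGraphSection`` are hypothetical names, each pair is assumed to be
stored as type id followed by call site pc, and the input is assumed to be
already decoded into 64-bit words, since byte order and section lookup are not
specified here.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// One decoded entry of a version 0 call graph section (hypothetical type).
struct CallGraphEntry {
  uint64_t formatVersion = 0;
  uint64_t functionEntryPc = 0;
  uint64_t functionKind = 0;    // 0, 1, or 2 as in the table above
  uint64_t functionTypeId = 0;  // meaningful only when functionKind == 2
  std::vector<std::pair<uint64_t, uint64_t>> callSites;  // (type id, call site pc)
};

// Walks a sequence of 64-bit words and splits it into entries.
// Assumption: the caller already converted the raw section bytes to words.
std::vector<CallGraphEntry>
parseCallGraphSection(const std::vector<uint64_t> &words) {
  std::vector<CallGraphEntry> entries;
  std::size_t pos = 0;
  auto next = [&]() -> uint64_t {
    if (pos >= words.size())
      throw std::runtime_error("truncated call graph section");
    return words[pos++];
  };

  while (pos < words.size()) {
    CallGraphEntry entry;
    entry.formatVersion = next();
    if (entry.formatVersion != 0)
      throw std::runtime_error("unsupported format version");
    entry.functionEntryPc = next();
    entry.functionKind = next();
    if (entry.functionKind > 2)
      throw std::runtime_error("invalid FunctionKind");
    // FunctionTypeId is present only for indirect targets with a known id.
    if (entry.functionKind == 2)
      entry.functionTypeId = next();
    uint64_t callSiteCount = next();
    for (uint64_t i = 0; i < callSiteCount; ++i) {
      uint64_t typeId = next();
      uint64_t callSitePc = next();
      entry.callSites.emplace_back(typeId, callSitePc);
    }
    entries.push_back(std::move(entry));
  }
  return entries;
}
```

Because the version number is repeated in every entry, the loop can check it
per entry, which is what allows sections from different objects to be
concatenated and still parsed in one pass.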

Type identifiers
================

The type for an indirect call or target is the function signature.
The mapping from a type to an identifier is an ABI detail.
In the current experimental implementation, an identifier of type T is
computed as follows:

- Obtain the generalized mangled name for “typeinfo name for T”.
- Compute the MD5 hash of the name as a string.
- Reinterpret the first 8 bytes of the hash as a little-endian 64-bit integer.

To avoid mismatched pointer types, generalizations are applied.
Pointers in return and argument types are treated as equivalent as long as the
qualifiers for the type they point to match.
For example, ``char*``, ``char**``, and ``int*`` are considered equivalent
types. However, ``char*`` and ``const char*`` are considered separate types.
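
The last two steps of the computation (hash the name, then take the first
8 bytes as a little-endian integer) can be sketched as below. This is an
illustration under assumptions, not the patch's code: ``typeIdFromName`` is a
hypothetical name, the generalized mangled name is taken as given, and a
compact textbook MD5 is inlined only to keep the example self-contained (LLVM
ships its own MD5 support in ``llvm/Support/MD5.h``).

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Rotate a 32-bit word left by n bits (0 < n < 32).
static uint32_t rotl32(uint32_t x, unsigned n) {
  return (x << n) | (x >> (32 - n));
}

// Returns the first 8 bytes of MD5(name), read as a little-endian
// 64-bit integer. Hypothetical helper; plain textbook MD5.
uint64_t typeIdFromName(const std::string &name) {
  static const unsigned shifts[16] = {7, 12, 17, 22, 5, 9,  14, 20,
                                      4, 11, 16, 23, 6, 10, 15, 21};
  uint32_t a0 = 0x67452301, b0 = 0xefcdab89, c0 = 0x98badcfe, d0 = 0x10325476;

  // Padding: 0x80, zeros up to 56 mod 64, then the bit length (64-bit LE).
  std::vector<uint8_t> msg(name.begin(), name.end());
  const uint64_t bitLen = static_cast<uint64_t>(name.size()) * 8;
  msg.push_back(0x80);
  while (msg.size() % 64 != 56)
    msg.push_back(0);
  for (int i = 0; i < 8; ++i)
    msg.push_back(static_cast<uint8_t>(bitLen >> (8 * i)));

  for (std::size_t off = 0; off < msg.size(); off += 64) {
    uint32_t m[16];  // the 64-byte block as sixteen little-endian words
    for (int w = 0; w < 16; ++w) {
      m[w] = 0;
      for (int j = 0; j < 4; ++j)
        m[w] |= static_cast<uint32_t>(msg[off + 4 * w + j]) << (8 * j);
    }
    uint32_t a = a0, b = b0, c = c0, d = d0;
    for (unsigned i = 0; i < 64; ++i) {
      uint32_t f;
      unsigned g;
      if (i < 16)      { f = (b & c) | (~b & d); g = i; }
      else if (i < 32) { f = (d & b) | (~d & c); g = (5 * i + 1) % 16; }
      else if (i < 48) { f = b ^ c ^ d;          g = (3 * i + 5) % 16; }
      else             { f = c ^ (b | ~d);       g = (7 * i) % 16; }
      // Round constant: floor(2^32 * |sin(i + 1)|).
      const uint32_t k = static_cast<uint32_t>(
          std::floor(std::fabs(std::sin(i + 1.0)) * 4294967296.0));
      f = f + a + k + m[g];
      a = d;
      d = c;
      c = b;
      b = b + rotl32(f, shifts[(i / 16) * 4 + i % 4]);
    }
    a0 += a; b0 += b; c0 += c; d0 += d;
  }
  // The digest is a0, b0, c0, d0 in little-endian byte order, so its first
  // 8 bytes read as a little-endian integer are b0 (high) and a0 (low).
  return (static_cast<uint64_t>(b0) << 32) | a0;
}
```

The sketch only covers hashing; producing the generalized mangled name itself
is the compiler's job and is not modelled here.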

Missing type identifiers
========================

For functions, two cases need to be considered. First, if the compiler cannot
deduce a type id for an indirect target, it will be listed as an indirect target
without a type id. Second, if an object without a call graph section gets
linked, the final call graph section will lack information on functions from
that object. For completeness, these functions must be treated as potential
receivers of any indirect call, regardless of the call's type id.
``llvm-objdump --call-graph-info`` lists these functions as indirect targets
with ``UNKNOWN`` type id.
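
The completeness rule for targets can be expressed as a simple filter. The
sketch below is illustrative only: ``IndirectTarget`` and
``possibleReceivers`` are hypothetical names, and the call site is assumed to
have a known type id.

```cpp
#include <cstdint>
#include <vector>

// An indirect target as a consumer of the section might model it
// (hypothetical type).
struct IndirectTarget {
  uint64_t entryPc;
  bool hasTypeId;   // false corresponds to the UNKNOWN group
  uint64_t typeId;  // meaningful only when hasTypeId is true
};

// Returns the entry addresses of every target an indirect call with the
// given type id may reach: targets with the same id, plus every target
// whose id is unknown (kept so that no feasible edge is dropped).
std::vector<uint64_t>
possibleReceivers(uint64_t callTypeId,
                  const std::vector<IndirectTarget> &targets) {
  std::vector<uint64_t> receivers;
  for (const IndirectTarget &target : targets)
    if (!target.hasTypeId || target.typeId == callTypeId)
      receivers.push_back(target.entryPc);
  return receivers;
}
```

Every target without a type id widens the receiver set of every call, so the
number of such targets directly bounds how much precision the section can
recover.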

For indirect calls, the current implementation guarantees a type id for each
compiled call. However, if an object without a call graph section gets linked,
no type id will be present for its indirect calls. For completeness, these calls
must be assumed to target any indirect target, regardless of its type id. For
indirect calls, ``llvm-objdump --call-graph-info`` prints 1) a complete list of
indirect calls, 2) type id to indirect call mappings. The difference between
these lists identifies the indirect calls with missing type ids.
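
A consumer of the two lists could compute that difference as follows. This is
a minimal sketch with a hypothetical function name, assuming both lists have
already been collected into sets of call site addresses.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <set>
#include <vector>

// Returns the call site addresses that appear in the complete list of
// indirect calls but have no type id mapping.
std::vector<uint64_t>
callsMissingTypeId(const std::set<uint64_t> &allIndirectCalls,
                   const std::set<uint64_t> &callsWithTypeId) {
  std::vector<uint64_t> missing;
  // std::set iterates in sorted order, which std::set_difference requires.
  std::set_difference(allIndirectCalls.begin(), allIndirectCalls.end(),
                      callsWithTypeId.begin(), callsWithTypeId.end(),
                      std::back_inserter(missing));
  return missing;
}
```

Applied to the example output in the Example section (six indirect call sites
in total, three of them listed with a type id), this yields the three call
sites ``401012``, ``40104a``, and ``401215``, in line with the warning about
3 indirect calls without type ids.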

TODO: measure and report the ratio of missed type ids

Performance
===========

A call graph section does not affect the executable code and does not occupy
memory during process execution. Therefore, there is no performance overhead.

The scheme has not yet been optimized for binary size.

TODO: measure and report the increase in the binary size

Example
=======

For example, consider the following C++ code:

.. code-block:: cpp

    namespace {
      // Not an indirect target
      void foo() {}
    }

    // Indirect target 1
    void bar() {}

    // Indirect target 2
    int baz(char a, float *b) {
      return 0;
    }

    // Indirect target 3
    int main() {
      char a;
      float b;
      void (*fp_bar)() = bar;
      int (*fp_baz1)(char, float*) = baz;
      int (*fp_baz2)(char, float*) = baz;

      // Indirect call site 1
      fp_bar();

      // Indirect call site 2
      fp_baz1(a, &b);

      // Indirect call site 3: shares the type id with indirect call site 2
      fp_baz2(a, &b);

      // Direct call sites
      foo();
      bar();
      baz(a, &b);

      return 0;
    }

The following command compiles it with a call graph section in the binary:

.. code-block:: bash

    $ clang -fcall-graph-section example.cpp

During the construction of the call graph section, the type identifiers are
computed as follows:

+---------------+-----------------------+----------------------------+----------------------------+
| Function name | Generalized signature | Mangled name (Itanium ABI) | Numeric type id (MD5 hash) |
+===============+=======================+============================+============================+
| bar           | void ()               | _ZTSFvvE.generalized       | f85c699bb8ef20a2           |
+---------------+-----------------------+----------------------------+----------------------------+
| baz           | int (char, void*)     | _ZTSFicPvE.generalized     | e3804d2a7f2b03fe           |
+---------------+-----------------------+----------------------------+----------------------------+
| main          | int ()                | _ZTSFivE.generalized       | a9494def81a01dc            |
+---------------+-----------------------+----------------------------+----------------------------+

The call graph section will have the following content:

+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
| FormatVersion | FunctionEntryPc | FunctionKind | FunctionTypeId | CallSiteCount | CallSiteList                         |
+===============+=================+==============+================+===============+======================================+
| 0             | EntryPc(foo)    | 0            | (empty)        | 0             | (empty)                              |
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
| 0             | EntryPc(bar)    | 2            | TypeId(bar)    | 0             | (empty)                              |
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
| 0             | EntryPc(baz)    | 2            | TypeId(baz)    | 0             | (empty)                              |
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
| 0             | EntryPc(main)   | 2            | TypeId(main)   | 3             | * TypeId(bar), CallSitePc(fp_bar())  |
|               |                 |              |                |               | * TypeId(baz), CallSitePc(fp_baz1()) |
|               |                 |              |                |               | * TypeId(baz), CallSitePc(fp_baz2()) |
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+


The ``llvm-objdump`` utility can parse the call graph section and disassemble
the program to provide complete call graph information. This includes any
additional call sites from the binary:

.. code-block:: bash

    $ llvm-objdump --call-graph-info a.out

    # Comments are not part of llvm-objdump's output; they are inserted for clarification.

    a.out: file format elf64-x86-64
    # These warnings are due to the functions and the indirect calls coming from linked objects.
    llvm-objdump: warning: 'a.out': callgraph section does not have type ids for 3 indirect calls
    llvm-objdump: warning: 'a.out': callgraph section does not have information for 10 functions

    # Unknown targets are the 10 functions the warnings mention.
    INDIRECT TARGET TYPES (TYPEID [FUNC_ADDR,])
    UNKNOWN 401000 401100 401234 401050 401090 4010d0 4011d0 401020 401060 401230
    a9494def81a01dc 401150 # main()
    f85c699bb8ef20a2 401120 # bar()
    e3804d2a7f2b03fe 401130 # baz()

    # Notice that the call sites share the same type id as target functions
    INDIRECT CALL TYPES (TYPEID [CALL_SITE_ADDR,])
    f85c699bb8ef20a2 401181 # Indirect call site 1 (fp_bar())
    e3804d2a7f2b03fe 401191 4011a1 # Indirect call site 2 and 3 (fp_baz1() and fp_baz2())

    INDIRECT CALL SITES (CALLER_ADDR [CALL_SITE_ADDR,])
    401000 401012 # _init
    401150 401181 401191 4011a1 # main calls fp_bar(), fp_baz1(), fp_baz2()
    4011d0 401215 # __libc_csu_init
    401020 40104a # _start

    DIRECT CALL SITES (CALLER_ADDR [(CALL_SITE_ADDR, TARGET_ADDR),])
    4010d0 4010e2 401060 # __do_global_dtors_aux
    401150 4011a6 401110 4011ab 401120 4011ba 401130 # main calls foo(), bar(), baz()
    4011d0 4011fd 401000 # __libc_csu_init

    FUNCTIONS (FUNC_ENTRY_ADDR, SYM_NAME)
    401000 _init
    401100 frame_dummy
    401234 _fini
    401050 _dl_relocate_static_pie
    401090 register_tm_clones
    4010d0 __do_global_dtors_aux
    401110 _ZN12_GLOBAL__N_13fooEv # (anonymous namespace)::foo()
    401150 main # main
    4011d0 __libc_csu_init
    401020 _start
    401060 deregister_tm_clones
    401120 _Z3barv # bar()
    401130 _Z3bazcPf # baz(char, float*)
    401230 __libc_csu_fini

clang/include/clang/Basic/CodeGenOptions.def

Lines changed: 2 additions & 0 deletions
@@ -76,6 +76,8 @@ CODEGENOPT(EnableNoundefAttrs, 1, 0) ///< Enable emitting `noundef` attributes o
 CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new
                                    ///< pass manager.
 CODEGENOPT(DisableRedZone , 1, 0) ///< Set when -mno-red-zone is enabled.
+CODEGENOPT(CallGraphSection, 1, 0) ///< Emit a call graph section into the
+                                   ///< object file.
 CODEGENOPT(EmitCallSiteInfo, 1, 0) ///< Emit call site info only in the case of
                                    ///< '-g' + 'O>0' level.
 CODEGENOPT(IndirectTlsSegRefs, 1, 0) ///< Set when -mno-tls-direct-seg-refs

clang/include/clang/Driver/Options.td

Lines changed: 4 additions & 0 deletions
@@ -4099,6 +4099,10 @@ defm data_sections : BoolFOption<"data-sections",
   PosFlag<SetTrue, [], [ClangOption, CC1Option],
           "Place each data in its own section">,
   NegFlag<SetFalse>>;
+defm call_graph_section : BoolFOption<"call-graph-section",
+  CodeGenOpts<"CallGraphSection">, DefaultFalse,
+  PosFlag<SetTrue, [], [CC1Option], "Emit a call graph section">,
+  NegFlag<SetFalse>>;
 defm stack_size_section : BoolFOption<"stack-size-section",
   CodeGenOpts<"StackSizeSection">, DefaultFalse,
   PosFlag<SetTrue, [], [ClangOption, CC1Option],

clang/lib/CodeGen/BackendUtil.cpp

Lines changed: 1 addition & 0 deletions
@@ -416,6 +416,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
   Options.StackUsageOutput = CodeGenOpts.StackUsageOutput;
   Options.EmitAddrsig = CodeGenOpts.Addrsig;
   Options.ForceDwarfFrameSection = CodeGenOpts.ForceDwarfFrameSection;
+  Options.EmitCallGraphSection = CodeGenOpts.CallGraphSection;
   Options.EmitCallSiteInfo = CodeGenOpts.EmitCallSiteInfo;
   Options.EnableAIXExtendedAltivecABI = LangOpts.EnableAIXExtendedAltivecABI;
   Options.XRayFunctionIndex = CodeGenOpts.XRayFunctionIndex;

clang/lib/Driver/ToolChains/Clang.cpp

Lines changed: 4 additions & 0 deletions
@@ -6444,6 +6444,10 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
     CmdArgs.push_back(A->getValue());
   }
 
+  if (Args.hasFlag(options::OPT_fcall_graph_section,
+                   options::OPT_fno_call_graph_section, false))
+    CmdArgs.push_back("-fcall-graph-section");
+
   Args.addOptInFlag(CmdArgs, options::OPT_fstack_size_section,
                     options::OPT_fno_stack_size_section);
 
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+// RUN: %clang -### -S -fcall-graph-section %s 2>&1 | FileCheck --check-prefix=CALL-GRAPH-SECTION %s
+// RUN: %clang -### -S -fcall-graph-section -fno-call-graph-section %s 2>&1 | FileCheck --check-prefix=NO-CALL-GRAPH-SECTION %s
+
+// CALL-GRAPH-SECTION: "-fcall-graph-section"
+// NO-CALL-GRAPH-SECTION-NOT: "-fcall-graph-section"

llvm/include/llvm/CodeGen/CommandFlags.h

Lines changed: 2 additions & 0 deletions
@@ -130,6 +130,8 @@ bool getEnableStackSizeSection();
 
 bool getEnableAddrsig();
 
+bool getEnableCallGraphSection();
+
 bool getEmitCallSiteInfo();
 
 bool getEnableMachineFunctionSplitter();

llvm/include/llvm/Target/TargetOptions.h

Lines changed: 4 additions & 1 deletion
@@ -149,7 +149,7 @@ namespace llvm {
           EnableTLSDESC(false), EnableIPRA(false), EmitStackSizeSection(false),
           EnableMachineOutliner(false), EnableMachineFunctionSplitter(false),
           SupportsDefaultOutlining(false), EmitAddrsig(false), BBAddrMap(false),
-          EmitCallSiteInfo(false), SupportsDebugEntryValues(false),
+          EmitCallGraphSection(false), EmitCallSiteInfo(false), SupportsDebugEntryValues(false),
           EnableDebugEntryValues(false), ValueTrackingVariableLocations(false),
           ForceDwarfFrameSection(false), XRayFunctionIndex(true),
           DebugStrictDwarf(false), Hotpatch(false),
@@ -323,6 +323,9 @@ namespace llvm {
     /// to selectively generate basic block sections.
     std::shared_ptr<MemoryBuffer> BBSectionsFuncListBuf;
 
+    /// Emit section containing call graph metadata.
+    unsigned EmitCallGraphSection : 1;
+
     /// The flag enables call site info production. It is used only for debug
     /// info, and it is restricted only to optimized code. This can be used for
     /// something else, so that should be controlled in the frontend.

llvm/lib/CodeGen/CommandFlags.cpp

Lines changed: 7 additions & 0 deletions
@@ -100,6 +100,7 @@ CGOPT(EABI, EABIVersion)
 CGOPT(DebuggerKind, DebuggerTuningOpt)
 CGOPT(bool, EnableStackSizeSection)
 CGOPT(bool, EnableAddrsig)
+CGOPT(bool, EnableCallGraphSection)
 CGOPT(bool, EmitCallSiteInfo)
 CGOPT(bool, EnableMachineFunctionSplitter)
 CGOPT(bool, EnableDebugEntryValues)
@@ -450,6 +451,11 @@ codegen::RegisterCodeGenFlags::RegisterCodeGenFlags() {
       cl::init(false));
   CGBINDOPT(EnableAddrsig);
 
+  static cl::opt<bool> EnableCallGraphSection(
+      "call-graph-section", cl::desc("Emit a call graph section"),
+      cl::init(false));
+  CGBINDOPT(EnableCallGraphSection);
+
   static cl::opt<bool> EmitCallSiteInfo(
       "emit-call-site-info",
       cl::desc(
@@ -578,6 +584,7 @@ codegen::InitTargetOptionsFromCodeGenFlags(const Triple &TheTriple) {
   Options.EmitStackSizeSection = getEnableStackSizeSection();
   Options.EnableMachineFunctionSplitter = getEnableMachineFunctionSplitter();
   Options.EmitAddrsig = getEnableAddrsig();
+  Options.EmitCallGraphSection = getEnableCallGraphSection();
   Options.EmitCallSiteInfo = getEmitCallSiteInfo();
   Options.EnableDebugEntryValues = getEnableDebugEntryValues();
   Options.ForceDwarfFrameSection = getForceDwarfFrameSection();
