Commit 8f53618

[𝘀𝗽𝗿] initial version
Created using spr 1.3.6-beta.1
2 parents 581f755 + 29d4db2 commit 8f53618

46 files changed

+1357
-28
lines changed

clang/docs/CallGraphSection.rst

Lines changed: 251 additions & 0 deletions
@@ -0,0 +1,251 @@
1+
==================
2+
Call Graph Section
3+
==================
4+
5+
Introduction
6+
============
7+
8+
With ``-fcall-graph-section``, the compiler will create a call graph section
9+
in the object file. It will include type identifiers for indirect calls and
10+
targets. This information can be used to map indirect calls to their receivers
11+
with matching types. A complete and high-precision call graph can be
12+
reconstructed by complementing this information with disassembly
13+
(see ``llvm-objdump --call-graph-info``).
14+
15+
Semantics
16+
=========
17+
18+
A coarse-grained, type-agnostic call graph may allow indirect calls to target
19+
any function in the program. This approach ensures completeness since no
20+
indirect call edge is missing. However, its precision is generally poor
21+
because it contains unneeded edges.
22+
23+
A call graph section provides type identifiers for indirect calls and targets.
24+
This information can be used to restrict the receivers of an indirect call to
25+
indirect targets with a matching type. Consequently, the precision of indirect
26+
call edges is improved while completeness is maintained.
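The matching rule above can be sketched as follows. This is only an illustration, not the logic used by ``llvm-objdump``: the names ``matchIndirectEdges``, ``Addr`` and ``TypeId`` are hypothetical, and the inputs are assumed to have already been extracted from the call graph section. Targets with an unknown type id are kept as candidates for every call site so that completeness is preserved.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using Addr = std::uint64_t;   // hypothetical alias for a code address
using TypeId = std::uint64_t; // hypothetical alias for a numeric type id

// Hypothetical helper: pairs every typed indirect call site with every
// indirect target that carries the same type id. Targets whose type id is
// unknown are added as candidates for every call site.
std::vector<std::pair<Addr, Addr>>
matchIndirectEdges(const std::map<TypeId, std::vector<Addr>> &CallSitesByType,
                   const std::map<TypeId, std::vector<Addr>> &TargetsByType,
                   const std::vector<Addr> &UnknownTargets) {
  std::vector<std::pair<Addr, Addr>> Edges; // (call site, target)
  for (const auto &Entry : CallSitesByType) {
    auto It = TargetsByType.find(Entry.first);
    for (Addr Site : Entry.second) {
      // Targets with a matching type id.
      if (It != TargetsByType.end())
        for (Addr Target : It->second)
          Edges.emplace_back(Site, Target);
      // Targets without a type id may be reached from any call site.
      for (Addr Target : UnknownTargets)
        Edges.emplace_back(Site, Target);
    }
  }
  return Edges;
}
```

A type-agnostic call graph would instead pair every call site with every function, which is where the loss of precision comes from.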
27+
28+
The ``llvm-objdump`` utility provides a ``--call-graph-info`` option to extract
29+
full call graph information by parsing the content of the call graph section
30+
and disassembling the program for complementary information, e.g., direct
31+
calls.
32+
33+
Section layout
34+
==============
35+
36+
A call graph section consists of zero or more call graph entries.
37+
Each entry contains information on a function and its indirect calls.
38+
39+
An entry of a call graph section has the following layout in the binary:
40+
41+
+---------------------+-----------------------------------------------------------------------+
42+
| Element | Content |
43+
+=====================+=======================================================================+
44+
| FormatVersionNumber | Format version number. |
45+
+---------------------+-----------------------------------------------------------------------+
46+
| FunctionEntryPc | Function entry address. |
47+
+---------------------+-----------------------------------+-----------------------------------+
48+
| | A flag whether the function is an | - 0: not an indirect target |
49+
| FunctionKind | indirect target, and if so, | - 1: indirect target, unknown id |
50+
| | whether its type id is known. | - 2: indirect target, known id |
51+
+---------------------+-----------------------------------+-----------------------------------+
52+
| FunctionTypeId | Type id for the indirect target. Present only when FunctionKind is 2. |
53+
+---------------------+-----------------------------------------------------------------------+
54+
| CallSiteCount | Number of type id to indirect call site mappings that follow. |
55+
+---------------------+-----------------------------------------------------------------------+
56+
| CallSiteList | List of type id and indirect call site pc pairs. |
57+
+---------------------+-----------------------------------------------------------------------+
58+
59+
Each element in an entry (including each element of the contained lists and
60+
pairs) occupies 64 bits.
61+
62+
The format version number is repeated per entry to support concatenation of
63+
call graph sections with different format versions by the linker.
64+
65+
As of now, the only supported format version is described above and has version
66+
number 0.
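As a rough illustration of the layout above, the sketch below decodes one entry from a sequence of 64-bit words. It is not the parser used by ``llvm-objdump``. The names ``CallGraphEntry`` and ``parseEntry`` are hypothetical, the words are assumed to be already converted to host byte order, and each call site pair is assumed to store the type id first and the call site pc second.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical in-memory form of one call graph entry (format version 0).
struct CallGraphEntry {
  std::uint64_t FormatVersionNumber = 0;
  std::uint64_t FunctionEntryPc = 0;
  std::uint64_t FunctionKind = 0;
  bool HasTypeId = false;           // true only when FunctionKind is 2
  std::uint64_t FunctionTypeId = 0;
  // Assumed pair order: (type id, call site pc).
  std::vector<std::pair<std::uint64_t, std::uint64_t>> CallSites;
};

// Decodes one entry starting at Words[Pos] and advances Pos past it.
// Returns false on truncated input or an unsupported version or kind;
// Pos and Out are left unchanged in that case.
bool parseEntry(const std::vector<std::uint64_t> &Words, std::size_t &Pos,
                CallGraphEntry &Out) {
  std::size_t P = Pos;
  auto Next = [&](std::uint64_t &V) -> bool {
    if (P >= Words.size())
      return false;
    V = Words[P++];
    return true;
  };

  CallGraphEntry E;
  if (!Next(E.FormatVersionNumber) || E.FormatVersionNumber != 0)
    return false;
  if (!Next(E.FunctionEntryPc) || !Next(E.FunctionKind))
    return false;
  if (E.FunctionKind > 2)
    return false;
  if (E.FunctionKind == 2) { // FunctionTypeId is present only for kind 2
    if (!Next(E.FunctionTypeId))
      return false;
    E.HasTypeId = true;
  }
  std::uint64_t Count = 0;
  if (!Next(Count))
    return false;
  for (std::uint64_t I = 0; I < Count; ++I) {
    std::uint64_t TypeId = 0, CallSitePc = 0;
    if (!Next(TypeId) || !Next(CallSitePc))
      return false;
    E.CallSites.emplace_back(TypeId, CallSitePc);
  }
  Out = std::move(E);
  Pos = P;
  return true;
}
```

Because the version number is repeated per entry, a reader can call such a function in a loop over a concatenated section and dispatch on the version of each entry.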
67+
68+
Type identifiers
69+
================
70+
71+
The type for an indirect call or target is the function signature.
72+
The mapping from a type to an identifier is an ABI detail.
73+
In the current experimental implementation, an identifier of type T is
74+
computed as follows:
75+
76+
- Obtain the generalized mangled name for “typeinfo name for T”.
77+
- Compute the MD5 hash of the name as a string.
78+
- Reinterpret the first 8 bytes of the hash as a little-endian 64-bit integer.
79+
80+
To avoid mismatched pointer types, generalizations are applied.
81+
Pointers in return and argument types are treated as equivalent as long as the
82+
qualifiers for the type they point to match.
83+
For example, ``char*``, ``char**``, and ``int*`` are considered equivalent
84+
types. However, ``char*`` and ``const char*`` are considered separate types.
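The last step of the computation, reinterpreting the first 8 bytes of the hash as a little-endian integer, can be sketched as below. The function name ``typeIdFromDigest`` is hypothetical, and the 16-byte MD5 digest is assumed to come from an existing MD5 implementation; computing the hash itself is not shown.

```cpp
#include <array>
#include <cstdint>

// Interprets the first 8 bytes of a 16-byte MD5 digest as a little-endian
// 64-bit integer. Building the value byte by byte keeps the result
// independent of the host byte order.
std::uint64_t typeIdFromDigest(const std::array<std::uint8_t, 16> &Digest) {
  std::uint64_t Id = 0;
  for (int I = 7; I >= 0; --I)
    Id = (Id << 8) | Digest[I]; // byte 0 ends up least significant
  return Id;
}
```

For instance, a digest whose first eight bytes are ``a2 20 ef b8 9b 69 5c f8`` yields the id ``f85c699bb8ef20a2``. These bytes are chosen only to show the byte order, not taken from a real hash computation.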
85+
86+
Missing type identifiers
87+
========================
88+
89+
For functions, two cases need to be considered. First, if the compiler cannot
90+
deduce a type id for an indirect target, it will be listed as an indirect target
91+
without a type id. Second, if an object without a call graph section gets
92+
linked, the final call graph section will lack information on functions from
93+
the object. For completeness, these functions must be treated as potential
94+
receivers of any indirect call, regardless of their type id.
95+
``llvm-objdump --call-graph-info`` lists these functions as indirect targets
96+
with an ``UNKNOWN`` type id.
97+
98+
For indirect calls, the current implementation guarantees a type id for each
99+
compiled call. However, if an object without a call graph section gets linked,
100+
no type id will be present for its indirect calls. For completeness, these calls
101+
must be assumed to target any indirect target, regardless of its type id. For
102+
indirect calls, ``llvm-objdump --call-graph-info`` prints 1) a complete list of
103+
indirect calls and 2) type id to indirect call mappings. The difference between
104+
these lists identifies the indirect calls with missing type ids.
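The difference of the two lists can be computed with a plain set difference. The sketch below assumes that both lists have already been read from the ``llvm-objdump --call-graph-info`` output into vectors of addresses. The function name ``callSitesMissingTypeId`` is hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// Returns the indirect call sites that appear in the complete list but have
// no type id mapping. Both inputs are taken by value so they can be sorted
// locally, as std::set_difference requires sorted ranges.
std::vector<std::uint64_t>
callSitesMissingTypeId(std::vector<std::uint64_t> AllCallSites,
                       std::vector<std::uint64_t> TypedCallSites) {
  std::sort(AllCallSites.begin(), AllCallSites.end());
  std::sort(TypedCallSites.begin(), TypedCallSites.end());
  std::vector<std::uint64_t> Missing;
  std::set_difference(AllCallSites.begin(), AllCallSites.end(),
                      TypedCallSites.begin(), TypedCallSites.end(),
                      std::back_inserter(Missing));
  return Missing;
}
```

With the addresses from the example output in this document, the complete list holds six indirect call sites and three of them have type ids, so three call sites remain, which agrees with the warning printed by the tool.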
105+
106+
TODO: measure and report the ratio of missed type ids
107+
108+
Performance
109+
===========
110+
111+
A call graph section does not affect the executable code and does not occupy
112+
memory during process execution. Therefore, there is no performance overhead.
113+
114+
The scheme has not yet been optimized for binary size.
115+
116+
TODO: measure and report the increase in the binary size
117+
118+
Example
119+
=======
120+
121+
For example, consider the following C++ code:
122+
123+
.. code-block:: cpp
124+
125+
namespace {
126+
// Not an indirect target
127+
void foo() {}
128+
}
129+
130+
// Indirect target 1
131+
void bar() {}
132+
133+
// Indirect target 2
134+
int baz(char a, float *b) {
135+
return 0;
136+
}
137+
138+
// Indirect target 3
139+
int main() {
140+
char a;
141+
float b;
142+
void (*fp_bar)() = bar;
143+
int (*fp_baz1)(char, float*) = baz;
144+
int (*fp_baz2)(char, float*) = baz;
145+
146+
// Indirect call site 1
147+
fp_bar();
148+
149+
// Indirect call site 2
150+
fp_baz1(a, &b);
151+
152+
// Indirect call site 3: shares the type id with indirect call site 2
153+
fp_baz2(a, &b);
154+
155+
// Direct call sites
156+
foo();
157+
bar();
158+
baz(a, &b);
159+
160+
return 0;
161+
}
162+
163+
The following command compiles it with a call graph section in the binary:
164+
165+
.. code-block:: bash
166+
167+
$ clang -fcall-graph-section example.cpp
168+
169+
During the construction of the call graph section, the type identifiers are
170+
computed as follows:
171+
172+
+---------------+-----------------------+----------------------------+----------------------------+
173+
| Function name | Generalized signature | Mangled name (Itanium ABI) | Numeric type id (MD5 hash) |
174+
+===============+=======================+============================+============================+
175+
| bar | void () | _ZTSFvvE.generalized | f85c699bb8ef20a2 |
176+
+---------------+-----------------------+----------------------------+----------------------------+
177+
| baz | int (char, void*) | _ZTSFicPvE.generalized | e3804d2a7f2b03fe |
178+
+---------------+-----------------------+----------------------------+----------------------------+
179+
| main | int () | _ZTSFivE.generalized | a9494def81a01dc |
180+
+---------------+-----------------------+----------------------------+----------------------------+
181+
182+
The call graph section will have the following content:
183+
184+
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
185+
| FormatVersion | FunctionEntryPc | FunctionKind | FunctionTypeId | CallSiteCount | CallSiteList |
186+
+===============+=================+==============+================+===============+======================================+
187+
| 0 | EntryPc(foo) | 0 | (empty) | 0 | (empty) |
188+
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
189+
| 0 | EntryPc(bar) | 2 | TypeId(bar) | 0 | (empty) |
190+
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
191+
| 0 | EntryPc(baz) | 2 | TypeId(baz) | 0 | (empty) |
192+
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
193+
| 0 | EntryPc(main) | 2 | TypeId(main) | 3 | * TypeId(bar), CallSitePc(fp_bar()) |
194+
| | | | | | * TypeId(baz), CallSitePc(fp_baz1()) |
195+
| | | | | | * TypeId(baz), CallSitePc(fp_baz2()) |
196+
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
197+
198+
199+
The ``llvm-objdump`` utility can parse the call graph section and disassemble
200+
the program to provide complete call graph information. This includes any
201+
additional call sites from the binary:
202+
203+
.. code-block:: bash
204+
205+
$ llvm-objdump --call-graph-info a.out
206+
207+
# Comments are not part of llvm-objdump's output; they are inserted for clarification.
208+
209+
a.out: file format elf64-x86-64
210+
# These warnings are due to the functions and the indirect calls coming from linked objects.
211+
llvm-objdump: warning: 'a.out': callgraph section does not have type ids for 3 indirect calls
212+
llvm-objdump: warning: 'a.out': callgraph section does not have information for 10 functions
213+
214+
# Unknown targets are the 10 functions the warnings mention.
215+
INDIRECT TARGET TYPES (TYPEID [FUNC_ADDR,])
216+
UNKNOWN 401000 401100 401234 401050 401090 4010d0 4011d0 401020 401060 401230
217+
a9494def81a01dc 401150 # main()
218+
f85c699bb8ef20a2 401120 # bar()
219+
e3804d2a7f2b03fe 401130 # baz()
220+
221+
# Notice that the call sites share their type ids with the matching target functions.
222+
INDIRECT CALL TYPES (TYPEID [CALL_SITE_ADDR,])
223+
f85c699bb8ef20a2 401181 # Indirect call site 1 (fp_bar())
224+
e3804d2a7f2b03fe 401191 4011a1 # Indirect call site 2 and 3 (fp_baz1() and fp_baz2())
225+
226+
INDIRECT CALL SITES (CALLER_ADDR [CALL_SITE_ADDR,])
227+
401000 401012 # _init
228+
401150 401181 401191 4011a1 # main calls fp_bar(), fp_baz1(), fp_baz2()
229+
4011d0 401215 # __libc_csu_init
230+
401020 40104a # _start
231+
232+
DIRECT CALL SITES (CALLER_ADDR [(CALL_SITE_ADDR, TARGET_ADDR),])
233+
4010d0 4010e2 401060 # __do_global_dtors_aux
234+
401150 4011a6 401110 4011ab 401120 4011ba 401130 # main calls foo(), bar(), baz()
235+
4011d0 4011fd 401000 # __libc_csu_init
236+
237+
FUNCTIONS (FUNC_ENTRY_ADDR, SYM_NAME)
238+
401000 _init
239+
401100 frame_dummy
240+
401234 _fini
241+
401050 _dl_relocate_static_pie
242+
401090 register_tm_clones
243+
4010d0 __do_global_dtors_aux
244+
401110 _ZN12_GLOBAL__N_13fooEv # (anonymous namespace)::foo()
245+
401150 main # main
246+
4011d0 __libc_csu_init
247+
401020 _start
248+
401060 deregister_tm_clones
249+
401120 _Z3barv # bar()
250+
401130 _Z3bazcPf # baz(char, float*)
251+
401230 __libc_csu_fini

clang/include/clang/Basic/CodeGenOptions.def

Lines changed: 2 additions & 0 deletions
@@ -78,6 +78,8 @@ CODEGENOPT(EnableNoundefAttrs, 1, 0) ///< Enable emitting `noundef` attributes o
7878
CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new
7979
///< pass manager.
8080
CODEGENOPT(DisableRedZone , 1, 0) ///< Set when -mno-red-zone is enabled.
81+
CODEGENOPT(CallGraphSection, 1, 0) ///< Emit a call graph section into the
82+
///< object file.
8183
CODEGENOPT(EmitCallSiteInfo, 1, 0) ///< Emit call site info only in the case of
8284
///< '-g' + 'O>0' level.
8385
CODEGENOPT(IndirectTlsSegRefs, 1, 0) ///< Set when -mno-tls-direct-seg-refs

clang/include/clang/Driver/Options.td

Lines changed: 4 additions & 0 deletions
@@ -4291,6 +4291,10 @@ defm data_sections : BoolFOption<"data-sections",
42914291
PosFlag<SetTrue, [], [ClangOption, CC1Option],
42924292
"Place each data in its own section">,
42934293
NegFlag<SetFalse>>;
4294+
defm call_graph_section : BoolFOption<"call-graph-section",
4295+
CodeGenOpts<"CallGraphSection">, DefaultFalse,
4296+
PosFlag<SetTrue, [], [CC1Option], "Emit a call graph section">,
4297+
NegFlag<SetFalse>>;
42944298
defm stack_size_section : BoolFOption<"stack-size-section",
42954299
CodeGenOpts<"StackSizeSection">, DefaultFalse,
42964300
PosFlag<SetTrue, [], [ClangOption, CC1Option],

clang/lib/CodeGen/BackendUtil.cpp

Lines changed: 1 addition & 0 deletions
@@ -455,6 +455,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
455455
Options.StackUsageOutput = CodeGenOpts.StackUsageOutput;
456456
Options.EmitAddrsig = CodeGenOpts.Addrsig;
457457
Options.ForceDwarfFrameSection = CodeGenOpts.ForceDwarfFrameSection;
458+
Options.EmitCallGraphSection = CodeGenOpts.CallGraphSection;
458459
Options.EmitCallSiteInfo = CodeGenOpts.EmitCallSiteInfo;
459460
Options.EnableAIXExtendedAltivecABI = LangOpts.EnableAIXExtendedAltivecABI;
460461
Options.XRayFunctionIndex = CodeGenOpts.XRayFunctionIndex;

clang/lib/CodeGen/CGCall.cpp

Lines changed: 38 additions & 0 deletions
@@ -25,6 +25,7 @@
2525
#include "clang/AST/Decl.h"
2626
#include "clang/AST/DeclCXX.h"
2727
#include "clang/AST/DeclObjC.h"
28+
#include "clang/AST/Type.h"
2829
#include "clang/Basic/CodeGenOptions.h"
2930
#include "clang/Basic/TargetInfo.h"
3031
#include "clang/CodeGen/CGFunctionInfo.h"
@@ -5077,6 +5078,11 @@ static unsigned getMaxVectorWidth(const llvm::Type *Ty) {
50775078
return MaxVectorWidth;
50785079
}
50795080

5081+
static bool isCXXDeclType(const FunctionDecl *FD) {
5082+
return isa<CXXConstructorDecl>(FD) || isa<CXXMethodDecl>(FD) ||
5083+
isa<CXXDestructorDecl>(FD);
5084+
}
5085+
50805086
RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
50815087
const CGCallee &Callee,
50825088
ReturnValueSlot ReturnValue,
@@ -5765,6 +5771,38 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
57655771
AllocAlignAttrEmitter AllocAlignAttrEmitter(*this, TargetDecl, CallArgs);
57665772
Attrs = AllocAlignAttrEmitter.TryEmitAsCallSiteAttribute(Attrs);
57675773

5774+
if (CGM.getCodeGenOpts().CallGraphSection) {
5775+
// Create operand bundle only for indirect calls, not for all
5776+
if (callOrInvoke && *callOrInvoke && (*callOrInvoke)->isIndirectCall()) {
5777+
5778+
assert((TargetDecl && TargetDecl->getFunctionType() ||
5779+
Callee.getAbstractInfo().getCalleeFunctionProtoType()) &&
5780+
"cannot find callsite type");
5781+
5782+
QualType CST;
5783+
if (TargetDecl && TargetDecl->getFunctionType())
5784+
CST = QualType(TargetDecl->getFunctionType(), 0);
5785+
else if (const auto *FPT =
5786+
Callee.getAbstractInfo().getCalleeFunctionProtoType())
5787+
CST = QualType(FPT, 0);
5788+
5789+
if (!CST.isNull()) {
5790+
auto *TypeIdMD = CGM.CreateMetadataIdentifierGeneralized(CST);
5791+
auto *TypeIdMDVal =
5792+
llvm::MetadataAsValue::get(getLLVMContext(), TypeIdMD);
5793+
BundleList.emplace_back("type", TypeIdMDVal);
5794+
}
5795+
5796+
// Set type identifier metadata of indirect calls for call graph section.
5797+
if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(TargetDecl)) {
5798+
// Type id metadata is set only for C/C++ contexts.
5799+
if (isCXXDeclType(FD)) {
5800+
CGM.CreateFunctionTypeMetadataForIcall(FD->getType(), *callOrInvoke);
5801+
}
5802+
}
5803+
}
5804+
}
5805+
57685806
// Emit the actual call/invoke instruction.
57695807
llvm::CallBase *CI;
57705808
if (!InvokeDest) {

clang/lib/CodeGen/CGExpr.cpp

Lines changed: 6 additions & 0 deletions
@@ -6173,6 +6173,12 @@ RValue CodeGenFunction::EmitCall(QualType CalleeType,
61736173
if (CallOrInvoke)
61746174
*CallOrInvoke = LocalCallOrInvoke;
61756175

6176+
// Set type identifier metadata of indirect calls for call graph section.
6177+
if (CGM.getCodeGenOpts().CallGraphSection && LocalCallOrInvoke &&
6178+
LocalCallOrInvoke->isIndirectCall())
6179+
CGM.CreateFunctionTypeMetadataForIcall(QualType(FnType, 0),
6180+
LocalCallOrInvoke);
6181+
61766182
return Call;
61776183
}
61786184

clang/lib/CodeGen/CGObjCMac.cpp

Lines changed: 2 additions & 3 deletions
@@ -2214,9 +2214,8 @@ CGObjCCommonMac::EmitMessageSend(CodeGen::CodeGenFunction &CGF,
22142214

22152215
llvm::CallBase *CallSite;
22162216
CGCallee Callee = CGCallee::forDirect(BitcastFn);
2217-
RValue rvalue = CGF.EmitCall(MSI.CallInfo, Callee, Return, ActualArgs,
2218-
&CallSite);
2219-
2217+
RValue rvalue =
2218+
CGF.EmitCall(MSI.CallInfo, Callee, Return, ActualArgs, &CallSite);
22202219
// Mark the call as noreturn if the method is marked noreturn and the
22212220
// receiver cannot be null.
22222221
if (Method && Method->hasAttr<NoReturnAttr>() && !ReceiverCanBeNull) {
