Commit d2d56f4

necipfazil authored and Prabhuk committed
[𝘀𝗽𝗿] changes to main this commit is based on
Created using spr 1.3.6-beta.1 [skip ci]
1 parent 029e1d7 commit d2d56f4

34 files changed, +871 -25 lines changed

clang/docs/CallGraphSection.rst

Lines changed: 251 additions & 0 deletions
@@ -0,0 +1,251 @@
1+
==================
2+
Call Graph Section
3+
==================
4+
5+
Introduction
6+
============
7+
8+
With ``-fcall-graph-section``, the compiler will create a call graph section
9+
in the object file. It will include type identifiers for indirect calls and
10+
targets. This information can be used to map indirect calls to their receivers
11+
with matching types. A complete and high-precision call graph can be
12+
reconstructed by complementing this information with disassembly
13+
(see ``llvm-objdump --call-graph-info``).
14+
15+
Semantics
16+
=========
17+
18+
A coarse-grained, type-agnostic call graph may allow indirect calls to target
19+
any function in the program. This approach ensures completeness since no
20+
indirect call edge is missing. However, it is generally poor in precision
21+
due to having unneeded edges.
22+
23+
A call graph section provides type identifiers for indirect calls and targets.
24+
This information can be used to restrict the receivers of an indirect call to
25+
indirect targets with matching type. Consequently, the precision for indirect
26+
call edges is improved while maintaining completeness.
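
The matching rule can be sketched as follows. This is an illustration only and not part of the commit: `IndirectCall`, `IndirectTarget`, and `candidateEdges` are hypothetical names, and treating an unknown type id as matching everything is an assumption taken from the completeness requirement described under "Missing type identifiers".

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical records; an empty TypeId means the id is unknown.
struct IndirectCall {
  uint64_t CallSitePc;
  std::optional<uint64_t> TypeId;
};
struct IndirectTarget {
  uint64_t EntryPc;
  std::optional<uint64_t> TypeId;
};

// Returns (call site pc, target entry pc) pairs. An edge is kept when both
// type ids are known and equal, or when either side is unknown, so that no
// possible edge is dropped.
std::vector<std::pair<uint64_t, uint64_t>>
candidateEdges(const std::vector<IndirectCall> &Calls,
               const std::vector<IndirectTarget> &Targets) {
  std::vector<std::pair<uint64_t, uint64_t>> Edges;
  for (const IndirectCall &C : Calls)
    for (const IndirectTarget &T : Targets)
      if (!C.TypeId || !T.TypeId || *C.TypeId == *T.TypeId)
        Edges.emplace_back(C.CallSitePc, T.EntryPc);
  // Sanity check: at most one edge per (call, target) pair.
  assert(Edges.size() <= Calls.size() * Targets.size());
  return Edges;
}
```

With a type-agnostic graph every call would reach every target; here a call with a known id reaches only the targets that share it, plus any targets whose id is unknown.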
27+
28+
The ``llvm-objdump`` utility provides a ``--call-graph-info`` option to extract
29+
full call graph information by parsing the content of the call graph section
30+
and disassembling the program for complementary information, e.g., direct
31+
calls.
32+
33+
Section layout
34+
==============
35+
36+
A call graph section consists of zero or more call graph entries.
37+
Each entry contains information on a function and its indirect calls.
38+
39+
An entry of a call graph section has the following layout in the binary:
40+
41+
+---------------------+-----------------------------------------------------------------------+
42+
| Element             | Content                                                               |
43+
+=====================+=======================================================================+
44+
| FormatVersionNumber | Format version number.                                                |
45+
+---------------------+-----------------------------------------------------------------------+
46+
| FunctionEntryPc     | Function entry address.                                               |
47+
+---------------------+-----------------------------------+-----------------------------------+
48+
|                     | A flag whether the function is an | - 0: not an indirect target       |
49+
| FunctionKind        | indirect target, and if so,       | - 1: indirect target, unknown id  |
50+
|                     | whether its type id is known.     | - 2: indirect target, known id    |
51+
+---------------------+-----------------------------------+-----------------------------------+
52+
| FunctionTypeId      | Type id for the indirect target. Present only when FunctionKind is 2. |
53+
+---------------------+-----------------------------------------------------------------------+
54+
| CallSiteCount       | Number of type id to indirect call site mappings that follow.         |
55+
+---------------------+-----------------------------------------------------------------------+
56+
| CallSiteList        | List of type id and indirect call site pc pairs.                      |
57+
+---------------------+-----------------------------------------------------------------------+
58+
59+
Each element in an entry (including each element of the contained lists and
60+
pairs) occupies 64 bits.
61+
62+
The format version number is repeated per entry to support concatenation of
63+
call graph sections with different format versions by the linker.
64+
65+
As of now, the only supported format version is described above and has version
66+
number 0.
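
A reader for this layout might look like the sketch below. It is not taken from the commit: `CallGraphEntry` and `parseCallGraphSection` are hypothetical names, and the sketch assumes the section has already been loaded and byte-swapped into host-order 64-bit words. It rejects any format version other than 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical in-memory form of one entry; field names mirror the table.
struct CallGraphEntry {
  uint64_t FormatVersionNumber = 0;
  uint64_t FunctionEntryPc = 0;
  uint64_t FunctionKind = 0;              // 0, 1, or 2
  std::optional<uint64_t> FunctionTypeId; // set only when FunctionKind == 2
  std::vector<std::pair<uint64_t, uint64_t>> CallSiteList; // (type id, pc)
};

// Decodes all entries from a section given as host-order 64-bit words.
// Returns std::nullopt on truncated data or an unsupported version or kind.
std::optional<std::vector<CallGraphEntry>>
parseCallGraphSection(const std::vector<uint64_t> &Words) {
  std::vector<CallGraphEntry> Entries;
  std::size_t I = 0;
  // Reads the next word into Out; returns false if the data is exhausted.
  auto Next = [&](uint64_t &Out) {
    if (I >= Words.size())
      return false;
    Out = Words[I++];
    return true;
  };
  while (I < Words.size()) {
    CallGraphEntry E;
    if (!Next(E.FormatVersionNumber) || E.FormatVersionNumber != 0)
      return std::nullopt;
    if (!Next(E.FunctionEntryPc) || !Next(E.FunctionKind) || E.FunctionKind > 2)
      return std::nullopt;
    if (E.FunctionKind == 2) {
      uint64_t Id = 0;
      if (!Next(Id))
        return std::nullopt;
      E.FunctionTypeId = Id;
    }
    uint64_t Count = 0;
    if (!Next(Count))
      return std::nullopt;
    for (uint64_t K = 0; K < Count; ++K) {
      uint64_t TypeId = 0, CallSitePc = 0;
      if (!Next(TypeId) || !Next(CallSitePc))
        return std::nullopt;
      E.CallSiteList.emplace_back(TypeId, CallSitePc);
    }
    Entries.push_back(std::move(E));
  }
  assert(I == Words.size()); // every word was consumed by some entry
  return Entries;
}
```

Because the version number is repeated in every entry, the loop checks it per entry rather than once per section.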
67+
68+
Type identifiers
69+
================
70+
71+
The type for an indirect call or target is the function signature.
72+
The mapping from a type to an identifier is an ABI detail.
73+
In the current experimental implementation, an identifier of type T is
74+
computed as follows:
75+
76+
- Obtain the generalized mangled name for “typeinfo name for T”.
77+
- Compute the MD5 hash of the name as a string.
78+
- Reinterpret the first 8 bytes of the hash as a little-endian 64-bit integer.
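
The last step can be sketched in isolation. The function below is a hypothetical helper, not code from the commit; it assumes the 16-byte MD5 digest of the generalized mangled name has already been computed by some MD5 implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Interprets the first 8 bytes of a 16-byte MD5 digest as a little-endian
// 64-bit integer. Computing the digest itself is out of scope here.
uint64_t typeIdFromMD5Digest(const std::vector<uint8_t> &Digest) {
  assert(Digest.size() == 16 && "an MD5 digest is 16 bytes long");
  uint64_t Id = 0;
  // Byte 0 is the least significant byte, so fold from byte 7 down to 0.
  for (int I = 7; I >= 0; --I)
    Id = (Id << 8) | Digest[I];
  return Id;
}
```

Assembling the value byte by byte keeps the result independent of the host's endianness.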
79+
80+
To avoid mismatched pointer types, generalizations are applied.
81+
Pointers in return and argument types are treated as equivalent as long as the
82+
qualifiers for the type they point to match.
83+
For example, ``char*``, ``char**``, and ``int*`` are considered equivalent
84+
types. However, ``char*`` and ``const char*`` are considered separate types.
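
As a rough illustration of the rule, the toy model below collapses every pointer to `void*` while keeping the qualifiers of the pointed-to type. `ToyType` and `generalizedSpelling` are hypothetical names for this sketch; the actual generalization works on Clang's own type representation and is not reproduced here.

```cpp
#include <cassert>
#include <string>

// Toy description of a parameter or return type; not Clang's QualType.
struct ToyType {
  std::string Name;             // spelling of a non-pointer type, e.g. "char"
  bool IsPointer = false;       // true for any pointer, whatever its depth
  bool PointeeConst = false;    // qualifiers of the pointed-to type
  bool PointeeVolatile = false;
};

// Returns the spelling used for matching: non-pointers keep their name,
// pointers become "void*" prefixed by the pointee's qualifiers.
std::string generalizedSpelling(const ToyType &T) {
  if (!T.IsPointer) {
    assert(!T.Name.empty() && "non-pointer types need a name");
    return T.Name;
  }
  std::string S;
  if (T.PointeeConst)
    S += "const ";
  if (T.PointeeVolatile)
    S += "volatile ";
  return S + "void*";
}
```

Under this model ``char*``, ``char**``, and ``int*`` all map to ``void*``, while ``const char*`` maps to ``const void*`` and stays distinct.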
85+
86+
Missing type identifiers
87+
========================
88+
89+
For functions, two cases need to be considered. First, if the compiler cannot
90+
deduce a type id for an indirect target, it will be listed as an indirect target
91+
without a type id. Second, if an object without a call graph section gets
92+
linked, the final call graph section will lack information on functions from
93+
the object. For completeness, these functions must be treated as potential
94+
receivers of any indirect call, regardless of the call's type id.
95+
``llvm-objdump --call-graph-info`` lists these functions as indirect targets
96+
with ``UNKNOWN`` type id.
97+
98+
For indirect calls, the current implementation guarantees a type id for each
99+
compiled call. However, if an object without a call graph section gets linked,
100+
no type id will be present for its indirect calls. For completeness, these calls
101+
must be assumed to reach any indirect target, regardless of the target's type
102+
id. For indirect calls, ``llvm-objdump --call-graph-info`` prints 1) a complete
103+
list of indirect calls and 2) the type id to indirect call mappings. The
104+
difference between these lists identifies the indirect calls with missing type ids.
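
Computing that difference is a plain set difference over call site addresses. The helper below is a hypothetical post-processing sketch, not part of ``llvm-objdump``; it assumes both lists have already been read from the tool's output into vectors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Returns the call sites present in AllCallSites but absent from
// TypedCallSites. Both inputs are taken by value and sorted first, since
// std::set_difference requires sorted ranges.
std::vector<uint64_t>
callSitesMissingTypeIds(std::vector<uint64_t> AllCallSites,
                        std::vector<uint64_t> TypedCallSites) {
  std::sort(AllCallSites.begin(), AllCallSites.end());
  std::sort(TypedCallSites.begin(), TypedCallSites.end());
  std::vector<uint64_t> Missing;
  std::set_difference(AllCallSites.begin(), AllCallSites.end(),
                      TypedCallSites.begin(), TypedCallSites.end(),
                      std::back_inserter(Missing));
  // Sanity check: the result is a subset of the complete list.
  assert(Missing.size() <= AllCallSites.size());
  return Missing;
}
```

Applied to the addresses in the example output in this document, the complete list holds six call sites and the typed list holds three, so three call sites remain, which agrees with the warning printed there.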
105+
106+
TODO: measure and report the ratio of missed type ids
107+
108+
Performance
109+
===========
110+
111+
A call graph section does not affect the executable code and does not occupy
112+
memory during process execution. Therefore, there is no runtime performance overhead.
113+
114+
The scheme has not yet been optimized for binary size.
115+
116+
TODO: measure and report the increase in the binary size
117+
118+
Example
119+
=======
120+
121+
For example, consider the following C++ code:
122+
123+
.. code-block:: cpp
124+
125+
namespace {
126+
// Not an indirect target
127+
void foo() {}
128+
}
129+
130+
// Indirect target 1
131+
void bar() {}
132+
133+
// Indirect target 2
134+
int baz(char a, float *b) {
135+
return 0;
136+
}
137+
138+
// Indirect target 3
139+
int main() {
140+
char a;
141+
float b;
142+
void (*fp_bar)() = bar;
143+
int (*fp_baz1)(char, float*) = baz;
144+
int (*fp_baz2)(char, float*) = baz;
145+
146+
// Indirect call site 1
147+
fp_bar();
148+
149+
// Indirect call site 2
150+
fp_baz1(a, &b);
151+
152+
// Indirect call site 3: shares the type id with indirect call site 2
153+
fp_baz2(a, &b);
154+
155+
// Direct call sites
156+
foo();
157+
bar();
158+
baz(a, &b);
159+
160+
return 0;
161+
}
162+
163+
The following command compiles it and creates a call graph section in the binary:
164+
165+
.. code-block:: bash
166+
167+
$ clang -fcall-graph-section example.cpp
168+
169+
During the construction of the call graph section, the type identifiers are
170+
computed as follows:
171+
172+
+---------------+-----------------------+----------------------------+----------------------------+
173+
| Function name | Generalized signature | Mangled name (Itanium ABI) | Numeric type id (MD5 hash) |
174+
+===============+=======================+============================+============================+
175+
| bar           | void ()               | _ZTSFvvE.generalized       | f85c699bb8ef20a2           |
176+
+---------------+-----------------------+----------------------------+----------------------------+
177+
| baz           | int (char, void*)     | _ZTSFicPvE.generalized     | e3804d2a7f2b03fe           |
178+
+---------------+-----------------------+----------------------------+----------------------------+
179+
| main          | int ()                | _ZTSFivE.generalized       | a9494def81a01dc            |
180+
+---------------+-----------------------+----------------------------+----------------------------+
181+
182+
The call graph section will have the following content:
183+
184+
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
185+
| FormatVersion | FunctionEntryPc | FunctionKind | FunctionTypeId | CallSiteCount | CallSiteList |
186+
+===============+=================+==============+================+===============+======================================+
187+
| 0             | EntryPc(foo)    | 0            | (empty)        | 0             | (empty)                              |
188+
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
189+
| 0             | EntryPc(bar)    | 2            | TypeId(bar)    | 0             | (empty)                              |
190+
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
191+
| 0             | EntryPc(baz)    | 2            | TypeId(baz)    | 0             | (empty)                              |
192+
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
193+
| 0             | EntryPc(main)   | 2            | TypeId(main)   | 3             | * TypeId(bar), CallSitePc(fp_bar())  |
194+
|               |                 |              |                |               | * TypeId(baz), CallSitePc(fp_baz1()) |
195+
|               |                 |              |                |               | * TypeId(baz), CallSitePc(fp_baz2()) |
196+
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
197+
198+
199+
The ``llvm-objdump`` utility can parse the call graph section and disassemble
200+
the program to provide complete call graph information. This includes any
201+
additional call sites from the binary:
202+
203+
.. code-block:: bash
204+
205+
$ llvm-objdump --call-graph-info a.out
206+
207+
# Comments are not part of llvm-objdump's output; they are inserted for clarification.
208+
209+
a.out: file format elf64-x86-64
210+
# These warnings are due to the functions and the indirect calls coming from linked objects.
211+
llvm-objdump: warning: 'a.out': callgraph section does not have type ids for 3 indirect calls
212+
llvm-objdump: warning: 'a.out': callgraph section does not have information for 10 functions
213+
214+
# Unknown targets are the 10 functions the warnings mention.
215+
INDIRECT TARGET TYPES (TYPEID [FUNC_ADDR,])
216+
UNKNOWN 401000 401100 401234 401050 401090 4010d0 4011d0 401020 401060 401230
217+
a9494def81a01dc 401150 # main()
218+
f85c699bb8ef20a2 401120 # bar()
219+
e3804d2a7f2b03fe 401130 # baz()
220+
221+
# Notice that the call sites share the same type id as target functions
222+
INDIRECT CALL TYPES (TYPEID [CALL_SITE_ADDR,])
223+
f85c699bb8ef20a2 401181 # Indirect call site 1 (fp_bar())
224+
e3804d2a7f2b03fe 401191 4011a1 # Indirect call site 2 and 3 (fp_baz1() and fp_baz2())
225+
226+
INDIRECT CALL SITES (CALLER_ADDR [CALL_SITE_ADDR,])
227+
401000 401012 # _init
228+
401150 401181 401191 4011a1 # main calls fp_bar(), fp_baz1(), fp_baz2()
229+
4011d0 401215 # __libc_csu_init
230+
401020 40104a # _start
231+
232+
DIRECT CALL SITES (CALLER_ADDR [(CALL_SITE_ADDR, TARGET_ADDR),])
233+
4010d0 4010e2 401060 # __do_global_dtors_aux
234+
401150 4011a6 401110 4011ab 401120 4011ba 401130 # main calls foo(), bar(), baz()
235+
4011d0 4011fd 401000 # __libc_csu_init
236+
237+
FUNCTIONS (FUNC_ENTRY_ADDR, SYM_NAME)
238+
401000 _init
239+
401100 frame_dummy
240+
401234 _fini
241+
401050 _dl_relocate_static_pie
242+
401090 register_tm_clones
243+
4010d0 __do_global_dtors_aux
244+
401110 _ZN12_GLOBAL__N_13fooEv # (anonymous namespace)::foo()
245+
401150 main # main
246+
4011d0 __libc_csu_init
247+
401020 _start
248+
401060 deregister_tm_clones
249+
401120 _Z3barv # bar()
250+
401130 _Z3bazcPf # baz(char, float*)
251+
401230 __libc_csu_fini

clang/include/clang/Basic/CodeGenOptions.def

Lines changed: 2 additions & 0 deletions
@@ -76,6 +76,8 @@ CODEGENOPT(EnableNoundefAttrs, 1, 0) ///< Enable emitting `noundef` attributes o
 CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new
                                    ///< pass manager.
 CODEGENOPT(DisableRedZone , 1, 0) ///< Set when -mno-red-zone is enabled.
+CODEGENOPT(CallGraphSection, 1, 0) ///< Emit a call graph section into the
+                                   ///< object file.
 CODEGENOPT(EmitCallSiteInfo, 1, 0) ///< Emit call site info only in the case of
                                    ///< '-g' + 'O>0' level.
 CODEGENOPT(IndirectTlsSegRefs, 1, 0) ///< Set when -mno-tls-direct-seg-refs

clang/include/clang/Driver/Options.td

Lines changed: 4 additions & 0 deletions
@@ -4081,6 +4081,10 @@ defm data_sections : BoolFOption<"data-sections",
   PosFlag<SetTrue, [], [ClangOption, CC1Option],
           "Place each data in its own section">,
   NegFlag<SetFalse>>;
+defm call_graph_section : BoolFOption<"call-graph-section",
+  CodeGenOpts<"CallGraphSection">, DefaultFalse,
+  PosFlag<SetTrue, [], [CC1Option], "Emit a call graph section">,
+  NegFlag<SetFalse>>;
 defm stack_size_section : BoolFOption<"stack-size-section",
   CodeGenOpts<"StackSizeSection">, DefaultFalse,
   PosFlag<SetTrue, [], [ClangOption, CC1Option],

clang/lib/CodeGen/BackendUtil.cpp

Lines changed: 1 addition & 0 deletions
@@ -420,6 +420,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
   Options.StackUsageOutput = CodeGenOpts.StackUsageOutput;
   Options.EmitAddrsig = CodeGenOpts.Addrsig;
   Options.ForceDwarfFrameSection = CodeGenOpts.ForceDwarfFrameSection;
+  Options.EmitCallGraphSection = CodeGenOpts.CallGraphSection;
   Options.EmitCallSiteInfo = CodeGenOpts.EmitCallSiteInfo;
   Options.EnableAIXExtendedAltivecABI = LangOpts.EnableAIXExtendedAltivecABI;
   Options.XRayFunctionIndex = CodeGenOpts.XRayFunctionIndex;

clang/lib/CodeGen/CGCall.cpp

Lines changed: 22 additions & 0 deletions
@@ -5681,6 +5681,28 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
   AllocAlignAttrEmitter AllocAlignAttrEmitter(*this, TargetDecl, CallArgs);
   Attrs = AllocAlignAttrEmitter.TryEmitAsCallSiteAttribute(Attrs);
 
+  if (CGM.getCodeGenOpts().CallGraphSection) {
+    // FIXME: create operand bundle only for indirect calls, not for all
+
+    assert((TargetDecl && TargetDecl->getFunctionType() ||
+            Callee.getAbstractInfo().getCalleeFunctionProtoType()) &&
+           "cannot find callsite type");
+
+    QualType CST;
+    if (TargetDecl && TargetDecl->getFunctionType())
+      CST = QualType(TargetDecl->getFunctionType(), 0);
+    else if (const auto *FPT =
+                 Callee.getAbstractInfo().getCalleeFunctionProtoType())
+      CST = QualType(FPT, 0);
+
+    if (!CST.isNull()) {
+      auto *TypeIdMD = CGM.CreateMetadataIdentifierGeneralized(CST);
+      auto *TypeIdMDVal =
+          llvm::MetadataAsValue::get(getLLVMContext(), TypeIdMD);
+      BundleList.emplace_back("type", TypeIdMDVal);
+    }
+  }
+
   // Emit the actual call/invoke instruction.
   llvm::CallBase *CI;
   if (!InvokeDest) {

clang/lib/CodeGen/CGClass.cpp

Lines changed: 8 additions & 1 deletion
@@ -2247,7 +2247,14 @@ void CodeGenFunction::EmitCXXConstructorCall(const CXXConstructorDecl *D,
   const CGFunctionInfo &Info = CGM.getTypes().arrangeCXXConstructorCall(
       Args, D, Type, ExtraArgs.Prefix, ExtraArgs.Suffix, PassPrototypeArgs);
   CGCallee Callee = CGCallee::forDirect(CalleePtr, GlobalDecl(D, Type));
-  EmitCall(Info, Callee, ReturnValueSlot(), Args, nullptr, false, Loc);
+  llvm::CallBase *CallOrInvoke = nullptr;
+  EmitCall(Info, Callee, ReturnValueSlot(), Args, &CallOrInvoke, false, Loc);
+
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && CallOrInvoke &&
+      CallOrInvoke->isIndirectCall())
+    CGM.CreateFunctionTypeMetadataForIcall(D->getType(), CallOrInvoke);
+
 
   // Generate vtable assumptions if we're constructing a complete object
   // with a vtable. We don't do this for base subobjects for two reasons:

clang/lib/CodeGen/CGExpr.cpp

Lines changed: 5 additions & 0 deletions
@@ -5953,6 +5953,11 @@ RValue CodeGenFunction::EmitCall(QualType CalleeType, const CGCallee &OrigCallee
     }
   }
 
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && CallOrInvoke &&
+      CallOrInvoke->isIndirectCall())
+    CGM.CreateFunctionTypeMetadataForIcall(QualType(FnType, 0), CallOrInvoke);
+
   return Call;
 }
 

clang/lib/CodeGen/CGExprCXX.cpp

Lines changed: 29 additions & 5 deletions
@@ -93,9 +93,17 @@ RValue CodeGenFunction::EmitCXXMemberOrOperatorCall(
       *this, MD, This, ImplicitParam, ImplicitParamTy, CE, Args, RtlArgs);
   auto &FnInfo = CGM.getTypes().arrangeCXXMethodCall(
       Args, FPT, CallInfo.ReqArgs, CallInfo.PrefixSize);
-  return EmitCall(FnInfo, Callee, ReturnValue, Args, nullptr,
+  llvm::CallBase *CallOrInvoke = nullptr;
+  auto Call = EmitCall(FnInfo, Callee, ReturnValue, Args, &CallOrInvoke,
                   CE && CE == MustTailCall,
                   CE ? CE->getExprLoc() : SourceLocation());
+
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && CallOrInvoke &&
+      CallOrInvoke->isIndirectCall())
+    CGM.CreateFunctionTypeMetadataForIcall(MD->getType(), CallOrInvoke);
+
+  return Call;
 }
 
 RValue CodeGenFunction::EmitCXXDestructorCall(
@@ -119,9 +127,17 @@ RValue CodeGenFunction::EmitCXXDestructorCall(
   CallArgList Args;
   commonEmitCXXMemberOrOperatorCall(*this, Dtor, This, ImplicitParam,
                                     ImplicitParamTy, CE, Args, nullptr);
-  return EmitCall(CGM.getTypes().arrangeCXXStructorDeclaration(Dtor), Callee,
-                  ReturnValueSlot(), Args, nullptr, CE && CE == MustTailCall,
+  llvm::CallBase *CallOrInvoke = nullptr;
+  auto Call = EmitCall(CGM.getTypes().arrangeCXXStructorDeclaration(Dtor), Callee,
+                  ReturnValueSlot(), Args, &CallOrInvoke, CE && CE == MustTailCall,
                   CE ? CE->getExprLoc() : SourceLocation{});
+
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && CallOrInvoke &&
+      CallOrInvoke->isIndirectCall())
+    CGM.CreateFunctionTypeMetadataForIcall(DtorDecl->getType(), CallOrInvoke);
+
+  return Call;
 }
 
 RValue CodeGenFunction::EmitCXXPseudoDestructorExpr(
@@ -482,10 +498,18 @@ CodeGenFunction::EmitCXXMemberPointerCallExpr(const CXXMemberCallExpr *E,
 
   // And the rest of the call args
   EmitCallArgs(Args, FPT, E->arguments());
-  return EmitCall(CGM.getTypes().arrangeCXXMethodCall(Args, FPT, required,
+  llvm::CallBase *CallOrInvoke = nullptr;
+  auto Call = EmitCall(CGM.getTypes().arrangeCXXMethodCall(Args, FPT, required,
                                                       /*PrefixSize=*/0),
-                  Callee, ReturnValue, Args, nullptr, E == MustTailCall,
+                  Callee, ReturnValue, Args, &CallOrInvoke, E == MustTailCall,
                   E->getExprLoc());
+
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && CallOrInvoke &&
+      CallOrInvoke->isIndirectCall())
+    CGM.CreateFunctionTypeMetadataForIcall(QualType(FPT, 0), CallOrInvoke);
+
+  return Call;
 }
 
 RValue
