
Commit 50e424d

[Clang] Wire up -fsanitize=alloc-token
Wire up the `-fsanitize=alloc-token` command-line option, hooking up the AllocToken pass. The pass provides allocation tokens to compatible runtime allocators, enabling different heap organization strategies, e.g. hardening schemes based on heap partitioning.

The instrumentation rewrites standard allocation calls into variants that accept an additional `size_t token_id` argument. For example, calls to `malloc(size)` become `__alloc_token_malloc(size, token_id)`, and a C++ `new MyType` expression will call `__alloc_token__Znwm(size, token_id)`.

Currently, untyped allocation calls do not yet have `!alloc_token` metadata, and therefore receive the fallback token only. This will be fixed in subsequent changes through best-effort type inference.

One benefit of the instrumentation approach is that it can be applied transparently to large codebases, and scales in deployment like other sanitizers.

As with other sanitizers, instrumentation can be selectively controlled using `__attribute__((no_sanitize("alloc-token")))`. Sanitizer ignorelists are also supported to disable instrumentation for specific functions or source files.

See clang/docs/AllocToken.rst for more usage instructions.

Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434
Pull Request: llvm#156839
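To make the rewriting described in the commit message concrete, here is a minimal sketch of what a token-aware runtime entry point might look like. Only the `__alloc_token_malloc(size_t, size_t)` signature comes from this commit; `NUM_PARTITIONS`, `partition_allocs`, `partition_for_token`, and the modulo mapping are hypothetical choices made for illustration, and a real allocator would carve memory from per-partition heaps rather than forward to `malloc`.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical partition count; not part of the commit. */
#define NUM_PARTITIONS 4

/* Per-partition allocation counters, standing in for real per-partition heaps. */
static size_t partition_allocs[NUM_PARTITIONS];

/* Hypothetical helper: map a token ID onto a partition index (assumed policy). */
static size_t partition_for_token(size_t token_id) {
  return token_id % NUM_PARTITIONS;
}

/* Entry point that instrumented code calls in place of malloc(size).
 * This sketch only records which partition was chosen and then forwards
 * to the system allocator. */
void *__alloc_token_malloc(size_t size, size_t token_id) {
  partition_allocs[partition_for_token(token_id)]++;
  return malloc(size);
}
```

A runtime like this would itself need to be built without `-fsanitize=alloc-token` (or be excluded via `no_sanitize("alloc-token")` or an ignorelist); otherwise the inner `malloc` call would presumably be rewritten back into `__alloc_token_malloc`.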
1 parent bcb8c3d commit 50e424d

20 files changed, +563 -11 lines changed
clang/docs/AllocToken.rst

Lines changed: 173 additions & 0 deletions
@@ -0,0 +1,173 @@
+=================
+Allocation Tokens
+=================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+Clang provides support for allocation tokens to enable allocator-level heap
+organization strategies. Clang assigns mode-dependent token IDs to allocation
+calls; the runtime behavior depends entirely on the implementation of a
+compatible memory allocator.
+
+Possible allocator strategies include:
+
+* **Security Hardening**: Placing allocations into separate, isolated heap
+  partitions. For example, separating pointer-containing types from raw data
+  can mitigate exploits that rely on overflowing a primitive buffer to corrupt
+  object metadata.
+
+* **Memory Layout Optimization**: Grouping related allocations to improve data
+  locality and cache utilization.
+
+* **Custom Allocation Policies**: Applying different management strategies to
+  different partitions.
+
+Token Assignment Mode
+=====================
+
+The default mode to calculate tokens is:
+
+* ``typehash``: This mode assigns a token ID based on the hash of the allocated
+  type's name.
+
+Other token ID assignment modes are supported, but they may be subject to
+change or removal. These may (experimentally) be selected with ``-mllvm
+-alloc-token-mode=<mode>``:
+
+* ``random``: This mode assigns a statically-determined random token ID to each
+  allocation site.
+
+* ``increment``: This mode assigns a simple, incrementally increasing token ID
+  to each allocation site.
+
+Allocation Token Instrumentation
+================================
+
+To enable instrumentation of allocation functions, code can be compiled with
+the ``-fsanitize=alloc-token`` flag:
+
+.. code-block:: console
+
+    % clang++ -fsanitize=alloc-token example.cc
+
+The instrumentation transforms allocation calls to include a token ID. For
+example:
+
+.. code-block:: c
+
+    // Original:
+    ptr = malloc(size);
+
+    // Instrumented:
+    ptr = __alloc_token_malloc(size, <token id>);
+
+The following command-line options affect generated token IDs:
+
+* ``-falloc-token-max=<N>``
+    Configures the maximum number of tokens. No max by default (tokens bounded
+    by ``SIZE_MAX``).
+
+    .. code-block:: console
+
+        % clang++ -fsanitize=alloc-token -falloc-token-max=512 example.cc
+
+Runtime Interface
+-----------------
+
+A compatible runtime must be provided that implements the token-enabled
+allocation functions. The instrumentation generates calls to functions that
+take a final ``size_t token_id`` argument.
+
+.. code-block:: c
+
+    // C standard library functions
+    void *__alloc_token_malloc(size_t size, size_t token_id);
+    void *__alloc_token_calloc(size_t count, size_t size, size_t token_id);
+    void *__alloc_token_realloc(void *ptr, size_t size, size_t token_id);
+    // ...
+
+    // C++ operators (mangled names)
+    // operator new(size_t, size_t)
+    void *__alloc_token__Znwm(size_t size, size_t token_id);
+    // operator new[](size_t, size_t)
+    void *__alloc_token__Znam(size_t size, size_t token_id);
+    // ... other variants like nothrow, etc., are also instrumented.
+
+Fast ABI
+--------
+
+An alternative ABI can be enabled with ``-fsanitize-alloc-token-fast-abi``,
+which encodes the token ID hint in the allocation function name.
+
+.. code-block:: c
+
+    void *__alloc_token_0_malloc(size_t size);
+    void *__alloc_token_1_malloc(size_t size);
+    void *__alloc_token_2_malloc(size_t size);
+    ...
+    void *__alloc_token_0_Znwm(size_t size);
+    void *__alloc_token_1_Znwm(size_t size);
+    void *__alloc_token_2_Znwm(size_t size);
+    ...
+
+This ABI provides a more efficient alternative where
+``-falloc-token-max`` is small.
+
+Disabling Instrumentation
+-------------------------
+
+To exclude specific functions from instrumentation, you can use the
+``no_sanitize("alloc-token")`` attribute:
+
+.. code-block:: c
+
+    __attribute__((no_sanitize("alloc-token")))
+    void* custom_allocator(size_t size) {
+        return malloc(size);  // Uses original malloc
+    }
+
+Note: Independent of any given allocator support, the instrumentation aims to
+remain performance neutral. As such, ``no_sanitize("alloc-token")``
+functions may be inlined into instrumented functions and vice-versa. If
+correctness is affected, such functions should explicitly be marked
+``noinline``.
+
+The ``__attribute__((disable_sanitizer_instrumentation))`` is also supported to
+disable this and other sanitizer instrumentations.
+
+Suppressions File (Ignorelist)
+------------------------------
+
+AllocToken respects the ``src`` and ``fun`` entity types in the
+:doc:`SanitizerSpecialCaseList`, which can be used to omit specified source
+files or functions from instrumentation.
+
+.. code-block:: bash
+
+    [alloc-token]
+    # Exclude specific source files
+    src:third_party/allocator.c
+    # Exclude function name patterns
+    fun:*custom_malloc*
+    fun:LowLevel::*
+
+.. code-block:: console
+
+    % clang++ -fsanitize=alloc-token -fsanitize-ignorelist=my_ignorelist.txt example.cc
+
+Conditional Compilation with ``__SANITIZE_ALLOC_TOKEN__``
+-----------------------------------------------------------
+
+In some cases, one may need to execute different code depending on whether
+AllocToken instrumentation is enabled. The ``__SANITIZE_ALLOC_TOKEN__`` macro
+can be used for this purpose.
+
+.. code-block:: c
+
+    #ifdef __SANITIZE_ALLOC_TOKEN__
+    // Code specific to -fsanitize=alloc-token builds
+    #endif
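As an illustration of the Fast ABI section in AllocToken.rst above, the following sketch shows one way a runtime might provide the per-token entry points. It assumes a build with `-falloc-token-max=4` and that token IDs then fall in the range 0 to 3. The names `MAX_TOKENS`, `token_allocs`, `alloc_from_partition`, and `DEFINE_FAST_ABI_MALLOC` are hypothetical; only the `__alloc_token_<id>_malloc(size_t)` naming scheme is taken from the documentation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed to match a hypothetical -falloc-token-max=4 build. */
#define MAX_TOKENS 4

/* Per-token allocation counters, standing in for real per-token heaps. */
static size_t token_allocs[MAX_TOKENS];

/* Hypothetical shared helper: the token ID is a compile-time constant at
 * each call site, so no token argument has to be passed by callers. */
static void *alloc_from_partition(size_t token_id, size_t size) {
  token_allocs[token_id]++;
  return malloc(size);
}

/* Stamp out one entry point per token ID, e.g. __alloc_token_0_malloc. */
#define DEFINE_FAST_ABI_MALLOC(id)                   \
  void *__alloc_token_##id##_malloc(size_t size) {   \
    return alloc_from_partition(id, size);           \
  }

DEFINE_FAST_ABI_MALLOC(0)
DEFINE_FAST_ABI_MALLOC(1)
DEFINE_FAST_ABI_MALLOC(2)
DEFINE_FAST_ABI_MALLOC(3)
```

Because every token needs its own symbol, this layout only stays manageable when the token limit is small, which matches the documentation's remark about when the fast ABI is the more efficient choice.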

clang/docs/ReleaseNotes.rst

Lines changed: 6 additions & 0 deletions
@@ -255,10 +255,16 @@ Non-comprehensive list of changes in this release
 
 - Fixed a crash when the second argument to ``__builtin_assume_aligned`` was not constant (#GH161314)
 
+- Introduce support for :doc:`allocation tokens <AllocToken>` to enable
+  allocator-level heap organization strategies. A feature to instrument all
+  allocation functions with a token ID can be enabled via the
+  ``-fsanitize=alloc-token`` flag.
+
 New Compiler Flags
 ------------------
 - New option ``-fno-sanitize-debug-trap-reasons`` added to disable emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``).
 - New option ``-fsanitize-debug-trap-reasons=`` added to control emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``).
+- New options for enabling allocation token instrumentation: ``-fsanitize=alloc-token``, ``-falloc-token-max=``, ``-fsanitize-alloc-token-fast-abi``, ``-fsanitize-alloc-token-extended``.
 
 
 Lanai Support

clang/docs/UsersManual.rst

Lines changed: 12 additions & 6 deletions
@@ -2155,13 +2155,11 @@ are listed below.
 
 .. option:: -f[no-]sanitize=check1,check2,...
 
-   Turn on runtime checks for various forms of undefined or suspicious
-   behavior.
+   Turn on runtime checks or mitigations for various forms of undefined or
+   suspicious behavior. These are disabled by default.
 
-   This option controls whether Clang adds runtime checks for various
-   forms of undefined or suspicious behavior, and is disabled by
-   default. If a check fails, a diagnostic message is produced at
-   runtime explaining the problem. The main checks are:
+   The following options enable runtime checks for various forms of undefined
+   or suspicious behavior:
 
    - .. _opt_fsanitize_address:
 
@@ -2195,6 +2193,14 @@ are listed below.
    - ``-fsanitize=realtime``: :doc:`RealtimeSanitizer`,
      a real-time safety checker.
 
+   The following options enable runtime mitigations for various forms of
+   undefined or suspicious behavior:
+
+   - ``-fsanitize=alloc-token``: Enables :doc:`allocation tokens <AllocToken>`
+     for allocator-level heap organization strategies, such as for security
+     hardening. It passes type-derived token IDs to a compatible memory
+     allocator. Requires linking against a token-aware allocator.
+
    There are more fine-grained checks available: see
    the :ref:`list <ubsan-checks>` of specific kinds of
    undefined behavior that can be detected and the :ref:`list <cfi-schemes>`

clang/docs/index.rst

Lines changed: 1 addition & 0 deletions
@@ -40,6 +40,7 @@ Using Clang as a Compiler
    SanitizerCoverage
    SanitizerStats
    SanitizerSpecialCaseList
+   AllocToken
    BoundsSafety
    BoundsSafetyAdoptionGuide
    BoundsSafetyImplPlans
BoundsSafetyImplPlans

clang/include/clang/Basic/CodeGenOptions.def

Lines changed: 2 additions & 0 deletions
@@ -306,6 +306,8 @@ CODEGENOPT(SanitizeBinaryMetadataCovered, 1, 0, Benign) ///< Emit PCs for covere
 CODEGENOPT(SanitizeBinaryMetadataAtomics, 1, 0, Benign) ///< Emit PCs for atomic operations.
 CODEGENOPT(SanitizeBinaryMetadataUAR, 1, 0, Benign) ///< Emit PCs for start of functions
                                                     ///< that are subject for use-after-return checking.
+CODEGENOPT(SanitizeAllocTokenFastABI, 1, 0, Benign) ///< Use the AllocToken fast ABI.
+CODEGENOPT(SanitizeAllocTokenExtended, 1, 0, Benign) ///< Extend coverage to custom allocation functions.
 CODEGENOPT(SanitizeStats , 1, 0, Benign) ///< Collect statistics for sanitizers.
 ENUM_CODEGENOPT(SanitizeDebugTrapReasons, SanitizeDebugTrapReasonKind, 2, SanitizeDebugTrapReasonKind::Detailed, Benign) ///< Control how "trap reasons" are emitted in debug info
 CODEGENOPT(SimplifyLibCalls , 1, 1, Benign) ///< Set when -fbuiltin is enabled.

clang/include/clang/Basic/CodeGenOptions.h

Lines changed: 4 additions & 0 deletions
@@ -447,6 +447,10 @@ class CodeGenOptions : public CodeGenOptionsBase {
 
   std::optional<double> AllowRuntimeCheckSkipHotCutoff;
 
+  /// Maximum number of allocation tokens (0 = no max), nullopt if none set (use
+  /// pass default).
+  std::optional<uint64_t> AllocTokenMax;
+
   /// List of backend command-line options for -fembed-bitcode.
   std::vector<uint8_t> CmdArgs;
 

clang/include/clang/Driver/Options.td

Lines changed: 17 additions & 0 deletions
@@ -2731,8 +2731,25 @@ def fsanitize_skip_hot_cutoff_EQ
           "(0.0 [default] = skip none; 1.0 = skip all). "
           "Argument format: <sanitizer1>=<value1>,<sanitizer2>=<value2>,...">;
 
+defm sanitize_alloc_token_fast_abi : BoolOption<"f", "sanitize-alloc-token-fast-abi",
+  CodeGenOpts<"SanitizeAllocTokenFastABI">, DefaultFalse,
+  PosFlag<SetTrue, [], [ClangOption], "Use the AllocToken fast ABI">,
+  NegFlag<SetFalse, [], [ClangOption], "Use the default AllocToken ABI">>,
+  Group<f_clang_Group>;
+defm sanitize_alloc_token_extended : BoolOption<"f", "sanitize-alloc-token-extended",
+  CodeGenOpts<"SanitizeAllocTokenExtended">, DefaultFalse,
+  PosFlag<SetTrue, [], [ClangOption], "Enable">,
+  NegFlag<SetFalse, [], [ClangOption], "Disable">,
+  BothFlags<[], [ClangOption], " extended coverage to custom allocation functions">>,
+  Group<f_clang_Group>;
+
 } // end -f[no-]sanitize* flags
 
+def falloc_token_max_EQ : Joined<["-"], "falloc-token-max=">,
+  Group<f_Group>, Visibility<[ClangOption, CC1Option]>,
+  MetaVarName<"<N>">,
+  HelpText<"Limit to maximum N allocation tokens (0 = no max)">;
+
 def fallow_runtime_check_skip_hot_cutoff_EQ
     : Joined<["-"], "fallow-runtime-check-skip-hot-cutoff=">,
       Group<f_clang_Group>,

clang/include/clang/Driver/SanitizerArgs.h

Lines changed: 2 additions & 0 deletions
@@ -75,6 +75,8 @@ class SanitizerArgs {
       llvm::AsanDetectStackUseAfterReturnMode::Invalid;
 
   std::string MemtagMode;
+  bool AllocTokenFastABI = false;
+  bool AllocTokenExtended = false;
 
 public:
   /// Parses the sanitizer arguments from an argument list.

clang/lib/CodeGen/BackendUtil.cpp

Lines changed: 20 additions & 0 deletions
@@ -60,11 +60,13 @@
 #include "llvm/TargetParser/Triple.h"
 #include "llvm/Transforms/HipStdPar/HipStdPar.h"
 #include "llvm/Transforms/IPO/EmbedBitcodePass.h"
+#include "llvm/Transforms/IPO/InferFunctionAttrs.h"
 #include "llvm/Transforms/IPO/LowerTypeTests.h"
 #include "llvm/Transforms/IPO/ThinLTOBitcodeWriter.h"
 #include "llvm/Transforms/InstCombine/InstCombine.h"
 #include "llvm/Transforms/Instrumentation/AddressSanitizer.h"
 #include "llvm/Transforms/Instrumentation/AddressSanitizerOptions.h"
+#include "llvm/Transforms/Instrumentation/AllocToken.h"
 #include "llvm/Transforms/Instrumentation/BoundsChecking.h"
 #include "llvm/Transforms/Instrumentation/DataFlowSanitizer.h"
 #include "llvm/Transforms/Instrumentation/GCOVProfiler.h"
@@ -232,6 +234,14 @@ class EmitAssemblyHelper {
 };
 } // namespace
 
+static AllocTokenOptions getAllocTokenOptions(const CodeGenOptions &CGOpts) {
+  AllocTokenOptions Opts;
+  Opts.MaxTokens = CGOpts.AllocTokenMax;
+  Opts.Extended = CGOpts.SanitizeAllocTokenExtended;
+  Opts.FastABI = CGOpts.SanitizeAllocTokenFastABI;
+  return Opts;
+}
+
 static SanitizerCoverageOptions
 getSancovOptsFromCGOpts(const CodeGenOptions &CGOpts) {
   SanitizerCoverageOptions Opts;
@@ -789,6 +799,16 @@ static void addSanitizers(const Triple &TargetTriple,
       MPM.addPass(DataFlowSanitizerPass(LangOpts.NoSanitizeFiles,
                                         PB.getVirtualFileSystemPtr()));
     }
+
+    if (LangOpts.Sanitize.has(SanitizerKind::AllocToken)) {
+      if (Level == OptimizationLevel::O0) {
+        // The default pass builder only infers libcall function attrs when
+        // optimizing, so we insert it here because we need it for accurate
+        // memory allocation function detection.
+        MPM.addPass(InferFunctionAttrsPass());
+      }
+      MPM.addPass(AllocTokenPass(getAllocTokenOptions(CodeGenOpts)));
+    }
   };
   if (ClSanitizeOnOptimizerEarlyEP) {
     PB.registerOptimizerEarlyEPCallback(
