Commit b47fdc8

Author: git apple-llvm automerger
Merge commit '774ffe5cce73' from llvm.org/main into next
2 parents 94b4d8e + 774ffe5 commit b47fdc8

20 files changed

+563 -11 lines changed

clang/docs/AllocToken.rst

Lines changed: 173 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,173 @@
1+
=================
2+
Allocation Tokens
3+
=================
4+
5+
.. contents::
6+
:local:
7+
8+
Introduction
9+
============
10+
11+
Clang provides support for allocation tokens to enable allocator-level heap
12+
organization strategies. Clang assigns mode-dependent token IDs to allocation
13+
calls; the runtime behavior depends entirely on the implementation of a
14+
compatible memory allocator.
15+
16+
Possible allocator strategies include:
17+
18+
* **Security Hardening**: Placing allocations into separate, isolated heap
19+
partitions. For example, separating pointer-containing types from raw data
20+
can mitigate exploits that rely on overflowing a primitive buffer to corrupt
21+
object metadata.
22+
23+
* **Memory Layout Optimization**: Grouping related allocations to improve data
24+
locality and cache utilization.
25+
26+
* **Custom Allocation Policies**: Applying different management strategies to
27+
different partitions.
28+
29+
Token Assignment Mode
30+
=====================
31+
32+
The default mode to calculate tokens is:
33+
34+
* ``typehash``: This mode assigns a token ID based on the hash of the allocated
35+
type's name.
36+
37+
Other token ID assignment modes are supported, but they may be subject to
38+
change or removal. These may (experimentally) be selected with ``-mllvm
39+
-alloc-token-mode=<mode>``:
40+
41+
* ``random``: This mode assigns a statically-determined random token ID to each
42+
allocation site.
43+
44+
* ``increment``: This mode assigns a simple, incrementally increasing token ID
45+
to each allocation site.
46+
47+
Allocation Token Instrumentation
48+
================================
49+
50+
To enable instrumentation of allocation functions, code can be compiled with
51+
the ``-fsanitize=alloc-token`` flag:
52+
53+
.. code-block:: console
54+
55+
% clang++ -fsanitize=alloc-token example.cc
56+
57+
The instrumentation transforms allocation calls to include a token ID. For
58+
example:
59+
60+
.. code-block:: c
61+
62+
// Original:
63+
ptr = malloc(size);
64+
65+
// Instrumented:
66+
ptr = __alloc_token_malloc(size, <token id>);
67+
68+
The following command-line options affect generated token IDs:
69+
70+
* ``-falloc-token-max=<N>``
71+
Configures the maximum number of tokens. No max by default (tokens bounded
72+
by ``SIZE_MAX``).
73+
74+
.. code-block:: console
75+
76+
% clang++ -fsanitize=alloc-token -falloc-token-max=512 example.cc
77+
78+
Runtime Interface
79+
-----------------
80+
81+
A compatible runtime must be provided that implements the token-enabled
82+
allocation functions. The instrumentation generates calls to functions that
83+
take a final ``size_t token_id`` argument.
84+
85+
.. code-block:: c
86+
87+
// C standard library functions
88+
void *__alloc_token_malloc(size_t size, size_t token_id);
89+
void *__alloc_token_calloc(size_t count, size_t size, size_t token_id);
90+
void *__alloc_token_realloc(void *ptr, size_t size, size_t token_id);
91+
// ...
92+
93+
// C++ operators (mangled names)
94+
// operator new(size_t, size_t)
95+
void *__alloc_token__Znwm(size_t size, size_t token_id);
96+
// operator new[](size_t, size_t)
97+
void *__alloc_token__Znam(size_t size, size_t token_id);
98+
// ... other variants like nothrow, etc., are also instrumented.
99+
100+
Fast ABI
101+
--------
102+
103+
An alternative ABI can be enabled with ``-fsanitize-alloc-token-fast-abi``,
104+
which encodes the token ID hint in the allocation function name.
105+
106+
.. code-block:: c
107+
108+
void *__alloc_token_0_malloc(size_t size);
109+
void *__alloc_token_1_malloc(size_t size);
110+
void *__alloc_token_2_malloc(size_t size);
111+
...
112+
void *__alloc_token_0_Znwm(size_t size);
113+
void *__alloc_token_1_Znwm(size_t size);
114+
void *__alloc_token_2_Znwm(size_t size);
115+
...
116+
117+
This ABI provides a more efficient alternative where
118+
``-falloc-token-max`` is small.
119+
120+
Disabling Instrumentation
121+
-------------------------
122+
123+
To exclude specific functions from instrumentation, you can use the
124+
``no_sanitize("alloc-token")`` attribute:
125+
126+
.. code-block:: c
127+
128+
__attribute__((no_sanitize("alloc-token")))
129+
void* custom_allocator(size_t size) {
130+
return malloc(size); // Uses original malloc
131+
}
132+
133+
Note: Independent of any given allocator support, the instrumentation aims to
134+
remain performance neutral. As such, ``no_sanitize("alloc-token")``
135+
functions may be inlined into instrumented functions and vice-versa. If
136+
correctness is affected, such functions should explicitly be marked
137+
``noinline``.
138+
139+
The ``__attribute__((disable_sanitizer_instrumentation))`` is also supported to
140+
disable this and other sanitizer instrumentations.
141+
142+
Suppressions File (Ignorelist)
143+
------------------------------
144+
145+
AllocToken respects the ``src`` and ``fun`` entity types in the
146+
:doc:`SanitizerSpecialCaseList`, which can be used to omit specified source
147+
files or functions from instrumentation.
148+
149+
.. code-block:: bash
150+
151+
[alloc-token]
152+
# Exclude specific source files
153+
src:third_party/allocator.c
154+
# Exclude function name patterns
155+
fun:*custom_malloc*
156+
fun:LowLevel::*
157+
158+
.. code-block:: console
159+
160+
% clang++ -fsanitize=alloc-token -fsanitize-ignorelist=my_ignorelist.txt example.cc
161+
162+
Conditional Compilation with ``__SANITIZE_ALLOC_TOKEN__``
163+
-----------------------------------------------------------
164+
165+
In some cases, one may need to execute different code depending on whether
166+
AllocToken instrumentation is enabled. The ``__SANITIZE_ALLOC_TOKEN__`` macro
167+
can be used for this purpose.
168+
169+
.. code-block:: c
170+
171+
#ifdef __SANITIZE_ALLOC_TOKEN__
172+
// Code specific to -fsanitize=alloc-token builds
173+
#endif
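
As a rough illustration of the default ABI documented in AllocToken.rst above, a token-aware runtime might look like the following minimal sketch. Only the three `__alloc_token_*` signatures come from the documentation. `NUM_PARTITIONS`, `partition_allocs`, and `partition_for_token` are hypothetical names introduced here, and the sketch only counts allocations per partition and forwards to the system allocator; it does not maintain separate heaps.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical partition count; not defined by Clang or any real allocator. */
#define NUM_PARTITIONS 4

/* Hypothetical per-partition counters standing in for real heap partitions. */
static size_t partition_allocs[NUM_PARTITIONS];

/* Hypothetical policy: fold the token ID into a partition index. */
static size_t partition_for_token(size_t token_id) {
  return token_id % NUM_PARTITIONS;
}

/* Entry points with the signatures listed under "Runtime Interface". */
void *__alloc_token_malloc(size_t size, size_t token_id) {
  partition_allocs[partition_for_token(token_id)]++;
  return malloc(size); /* a real runtime would allocate from the partition */
}

void *__alloc_token_calloc(size_t count, size_t size, size_t token_id) {
  partition_allocs[partition_for_token(token_id)]++;
  return calloc(count, size);
}

void *__alloc_token_realloc(void *ptr, size_t size, size_t token_id) {
  partition_allocs[partition_for_token(token_id)]++;
  return realloc(ptr, size);
}
```

The modulo mapping is one possible policy among many. A real allocator would replace the counters with per-partition arenas and would also need the C++ operator variants.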

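For the fast ABI, the runtime has to export one entry point per token ID. The sketch below generates the `malloc` variants with a macro, assuming a build with `-falloc-token-max=4` and assuming token IDs then fall in the range 0 to 3 (the listing in the documentation suggests this but does not state it). `alloc_with_token`, `last_token_seen`, and `DEFINE_FAST_ABI_MALLOC` are hypothetical names, not part of Clang.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record of the most recent token, for demonstration only. */
static size_t last_token_seen;

/* Hypothetical common backend shared by all generated entry points. */
static void *alloc_with_token(size_t size, size_t token_id) {
  last_token_seen = token_id;
  return malloc(size); /* a real runtime would pick a partition here */
}

/* Expands to e.g. void *__alloc_token_0_malloc(size_t size). */
#define DEFINE_FAST_ABI_MALLOC(id)                   \
  void *__alloc_token_##id##_malloc(size_t size) {   \
    return alloc_with_token(size, id);               \
  }

/* One definition per token ID, under the assumed maximum of 4 tokens. */
DEFINE_FAST_ABI_MALLOC(0)
DEFINE_FAST_ABI_MALLOC(1)
DEFINE_FAST_ABI_MALLOC(2)
DEFINE_FAST_ABI_MALLOC(3)
```

Because every token needs its own symbol, this approach is only practical when the token maximum is small, which matches the note in the documentation.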
clang/docs/ReleaseNotes.rst

Lines changed: 6 additions & 0 deletions
@@ -257,10 +257,16 @@ Non-comprehensive list of changes in this release
257257

258258
- Fixed a crash when the second argument to ``__builtin_assume_aligned`` was not constant (#GH161314)
259259

260+
- Introduce support for :doc:`allocation tokens <AllocToken>` to enable
261+
allocator-level heap organization strategies. A feature to instrument all
262+
allocation functions with a token ID can be enabled via the
263+
``-fsanitize=alloc-token`` flag.
264+
260265
New Compiler Flags
261266
------------------
262267
- New option ``-fno-sanitize-debug-trap-reasons`` added to disable emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``).
263268
- New option ``-fsanitize-debug-trap-reasons=`` added to control emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``).
269+
- New options for enabling allocation token instrumentation: ``-fsanitize=alloc-token``, ``-falloc-token-max=``, ``-fsanitize-alloc-token-fast-abi``, ``-fsanitize-alloc-token-extended``.
264270

265271

266272
Lanai Support

clang/docs/UsersManual.rst

Lines changed: 12 additions & 6 deletions
@@ -2155,13 +2155,11 @@ are listed below.
21552155

21562156
.. option:: -f[no-]sanitize=check1,check2,...
21572157

2158-
Turn on runtime checks for various forms of undefined or suspicious
2159-
behavior.
2158+
Turn on runtime checks or mitigations for various forms of undefined or
2159+
suspicious behavior. These are disabled by default.
21602160

2161-
This option controls whether Clang adds runtime checks for various
2162-
forms of undefined or suspicious behavior, and is disabled by
2163-
default. If a check fails, a diagnostic message is produced at
2164-
runtime explaining the problem. The main checks are:
2161+
The following options enable runtime checks for various forms of undefined
2162+
or suspicious behavior:
21652163

21662164
- .. _opt_fsanitize_address:
21672165

@@ -2195,6 +2193,14 @@ are listed below.
21952193
- ``-fsanitize=realtime``: :doc:`RealtimeSanitizer`,
21962194
a real-time safety checker.
21972195

2196+
The following options enable runtime mitigations for various forms of
2197+
undefined or suspicious behavior:
2198+
2199+
- ``-fsanitize=alloc-token``: Enables :doc:`allocation tokens <AllocToken>`
2200+
for allocator-level heap organization strategies, such as for security
2201+
hardening. It passes type-derived token IDs to a compatible memory
2202+
allocator. Requires linking against a token-aware allocator.
2203+
21982204
There are more fine-grained checks available: see
21992205
the :ref:`list <ubsan-checks>` of specific kinds of
22002206
undefined behavior that can be detected and the :ref:`list <cfi-schemes>`

clang/docs/index.rst

Lines changed: 1 addition & 0 deletions
@@ -40,6 +40,7 @@ Using Clang as a Compiler
4040
SanitizerCoverage
4141
SanitizerStats
4242
SanitizerSpecialCaseList
43+
AllocToken
4344
BoundsSafety
4445
BoundsSafetyAdoptionGuide
4546
BoundsSafetyImplPlans

clang/include/clang/Basic/CodeGenOptions.def

Lines changed: 2 additions & 0 deletions
@@ -307,6 +307,8 @@ CODEGENOPT(SanitizeBinaryMetadataCovered, 1, 0, Benign) ///< Emit PCs for covere
307307
CODEGENOPT(SanitizeBinaryMetadataAtomics, 1, 0, Benign) ///< Emit PCs for atomic operations.
308308
CODEGENOPT(SanitizeBinaryMetadataUAR, 1, 0, Benign) ///< Emit PCs for start of functions
309309
///< that are subject for use-after-return checking.
310+
CODEGENOPT(SanitizeAllocTokenFastABI, 1, 0, Benign) ///< Use the AllocToken fast ABI.
311+
CODEGENOPT(SanitizeAllocTokenExtended, 1, 0, Benign) ///< Extend coverage to custom allocation functions.
310312
CODEGENOPT(SanitizeStats , 1, 0, Benign) ///< Collect statistics for sanitizers.
311313
ENUM_CODEGENOPT(SanitizeDebugTrapReasons, SanitizeDebugTrapReasonKind, 2, SanitizeDebugTrapReasonKind::Detailed, Benign) ///< Control how "trap reasons" are emitted in debug info
312314
CODEGENOPT(SimplifyLibCalls , 1, 1, Benign) ///< Set when -fbuiltin is enabled.

clang/include/clang/Basic/CodeGenOptions.h

Lines changed: 4 additions & 0 deletions
@@ -450,6 +450,10 @@ class CodeGenOptions : public CodeGenOptionsBase {
450450

451451
std::optional<double> AllowRuntimeCheckSkipHotCutoff;
452452

453+
/// Maximum number of allocation tokens (0 = no max), nullopt if none set (use
454+
/// pass default).
455+
std::optional<uint64_t> AllocTokenMax;
456+
453457
/// List of backend command-line options for -fembed-bitcode.
454458
std::vector<uint8_t> CmdArgs;
455459

clang/include/clang/Driver/Options.td

Lines changed: 17 additions & 0 deletions
@@ -2850,8 +2850,25 @@ def fsanitize_skip_hot_cutoff_EQ
28502850
"(0.0 [default] = skip none; 1.0 = skip all). "
28512851
"Argument format: <sanitizer1>=<value1>,<sanitizer2>=<value2>,...">;
28522852

2853+
defm sanitize_alloc_token_fast_abi : BoolOption<"f", "sanitize-alloc-token-fast-abi",
2854+
CodeGenOpts<"SanitizeAllocTokenFastABI">, DefaultFalse,
2855+
PosFlag<SetTrue, [], [ClangOption], "Use the AllocToken fast ABI">,
2856+
NegFlag<SetFalse, [], [ClangOption], "Use the default AllocToken ABI">>,
2857+
Group<f_clang_Group>;
2858+
defm sanitize_alloc_token_extended : BoolOption<"f", "sanitize-alloc-token-extended",
2859+
CodeGenOpts<"SanitizeAllocTokenExtended">, DefaultFalse,
2860+
PosFlag<SetTrue, [], [ClangOption], "Enable">,
2861+
NegFlag<SetFalse, [], [ClangOption], "Disable">,
2862+
BothFlags<[], [ClangOption], " extended coverage to custom allocation functions">>,
2863+
Group<f_clang_Group>;
2864+
28532865
} // end -f[no-]sanitize* flags
28542866

2867+
def falloc_token_max_EQ : Joined<["-"], "falloc-token-max=">,
2868+
Group<f_Group>, Visibility<[ClangOption, CC1Option]>,
2869+
MetaVarName<"<N>">,
2870+
HelpText<"Limit to maximum N allocation tokens (0 = no max)">;
2871+
28552872
def fallow_runtime_check_skip_hot_cutoff_EQ
28562873
: Joined<["-"], "fallow-runtime-check-skip-hot-cutoff=">,
28572874
Group<f_clang_Group>,

clang/include/clang/Driver/SanitizerArgs.h

Lines changed: 2 additions & 0 deletions
@@ -75,6 +75,8 @@ class SanitizerArgs {
7575
llvm::AsanDetectStackUseAfterReturnMode::Invalid;
7676

7777
std::string MemtagMode;
78+
bool AllocTokenFastABI = false;
79+
bool AllocTokenExtended = false;
7880

7981
public:
8082
/// Parses the sanitizer arguments from an argument list.

clang/lib/CodeGen/BackendUtil.cpp

Lines changed: 20 additions & 0 deletions
@@ -61,11 +61,13 @@
6161
#include "llvm/TargetParser/Triple.h"
6262
#include "llvm/Transforms/HipStdPar/HipStdPar.h"
6363
#include "llvm/Transforms/IPO/EmbedBitcodePass.h"
64+
#include "llvm/Transforms/IPO/InferFunctionAttrs.h"
6465
#include "llvm/Transforms/IPO/LowerTypeTests.h"
6566
#include "llvm/Transforms/IPO/ThinLTOBitcodeWriter.h"
6667
#include "llvm/Transforms/InstCombine/InstCombine.h"
6768
#include "llvm/Transforms/Instrumentation/AddressSanitizer.h"
6869
#include "llvm/Transforms/Instrumentation/AddressSanitizerOptions.h"
70+
#include "llvm/Transforms/Instrumentation/AllocToken.h"
6971
#include "llvm/Transforms/Instrumentation/BoundsChecking.h"
7072
#include "llvm/Transforms/Instrumentation/DataFlowSanitizer.h"
7173
#include "llvm/Transforms/Instrumentation/GCOVProfiler.h"
@@ -240,6 +242,14 @@ class EmitAssemblyHelper {
240242
};
241243
} // namespace
242244

245+
static AllocTokenOptions getAllocTokenOptions(const CodeGenOptions &CGOpts) {
246+
AllocTokenOptions Opts;
247+
Opts.MaxTokens = CGOpts.AllocTokenMax;
248+
Opts.Extended = CGOpts.SanitizeAllocTokenExtended;
249+
Opts.FastABI = CGOpts.SanitizeAllocTokenFastABI;
250+
return Opts;
251+
}
252+
243253
static SanitizerCoverageOptions
244254
getSancovOptsFromCGOpts(const CodeGenOptions &CGOpts) {
245255
SanitizerCoverageOptions Opts;
@@ -807,6 +817,16 @@ static void addSanitizers(const Triple &TargetTriple,
807817
MPM.addPass(DataFlowSanitizerPass(LangOpts.NoSanitizeFiles,
808818
PB.getVirtualFileSystemPtr()));
809819
}
820+
821+
if (LangOpts.Sanitize.has(SanitizerKind::AllocToken)) {
822+
if (Level == OptimizationLevel::O0) {
823+
// The default pass builder only infers libcall function attrs when
824+
// optimizing, so we insert it here because we need it for accurate
825+
// memory allocation function detection.
826+
MPM.addPass(InferFunctionAttrsPass());
827+
}
828+
MPM.addPass(AllocTokenPass(getAllocTokenOptions(CodeGenOpts)));
829+
}
810830
};
811831
if (ClSanitizeOnOptimizerEarlyEP) {
812832
PB.registerOptimizerEarlyEPCallback(
