Commit 75bfb7a

[𝘀𝗽𝗿] changes to main this commit is based on
Created using spr 1.3.8-beta.1 [skip ci]
1 parent a1bfa2f commit 75bfb7a

51 files changed: +1555 -13 lines changed

clang/docs/AllocToken.rst

Lines changed: 172 additions & 0 deletions
@@ -0,0 +1,172 @@
+=================
+Allocation Tokens
+=================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+Clang provides support for allocation tokens to enable allocator-level heap
+organization strategies. Clang assigns mode-dependent token IDs to allocation
+calls; the runtime behavior depends entirely on the implementation of a
+compatible memory allocator.
+
+Possible allocator strategies include:
+
+* **Security Hardening**: Placing allocations into separate, isolated heap
+  partitions. For example, separating pointer-containing types from raw data
+  can mitigate exploits that rely on overflowing a primitive buffer to corrupt
+  object metadata.
+
+* **Memory Layout Optimization**: Grouping related allocations to improve data
+  locality and cache utilization.
+
+* **Custom Allocation Policies**: Applying different management strategies to
+  different partitions.
+
+Token Assignment Mode
+=====================
+
+The default mode to calculate tokens is:
+
+* *TypeHash* (mode=2): This mode assigns a token ID based on the hash of
+  the allocated type's name.
+
+Other token ID assignment modes are supported, but they may be subject to
+change or removal. These may (experimentally) be selected with ``-mllvm
+-alloc-token-mode=<mode>``:
+
+* *Random* (mode=1): This mode assigns a statically-determined random token ID
+  to each allocation site.
+
+* *Increment* (mode=0): This mode assigns a simple, incrementally increasing
+  token ID to each allocation site.
+
+Allocation Token Instrumentation
+================================
+
+To enable instrumentation of allocation functions, code can be compiled with
+the ``-fsanitize=alloc-token`` flag:
+
+.. code-block:: console
+
+    % clang++ -fsanitize=alloc-token example.cc
+
+The instrumentation transforms allocation calls to include a token ID. For
+example:
+
+.. code-block:: c
+
+    // Original:
+    ptr = malloc(size);
+
+    // Instrumented:
+    ptr = __alloc_token_malloc(size, token_id);
+
+In addition, it is typically recommended to configure the following:
+
+* ``-falloc-token-max=<N>``
+    Configures the maximum number of tokens. No max by default (tokens bounded
+    by ``UINT64_MAX``).
+
+.. code-block:: console
+
+    % clang++ -fsanitize=alloc-token -falloc-token-max=512 example.cc
+
+Runtime Interface
+-----------------
+
+A compatible runtime must be provided that implements the token-enabled
+allocation functions. The instrumentation generates calls to functions that
+take a final ``uint64_t token_id`` argument.
+
+.. code-block:: c
+
+    // C standard library functions
+    void *__alloc_token_malloc(size_t size, uint64_t token_id);
+    void *__alloc_token_calloc(size_t count, size_t size, uint64_t token_id);
+    void *__alloc_token_realloc(void *ptr, size_t size, uint64_t token_id);
+    // ...
+
+    // C++ operators (mangled names)
+    // operator new(size_t, uint64_t)
+    void *__alloc_token_Znwm(size_t size, uint64_t token_id);
+    // operator new[](size_t, uint64_t)
+    void *__alloc_token_Znam(size_t size, uint64_t token_id);
+    // ... other variants like nothrow, etc., are also instrumented.
+
+Fast ABI
+--------
+
+An alternative ABI can be enabled with ``-fsanitize-alloc-token-fast-abi``,
+which encodes the token ID hint in the allocation function name.
+
+.. code-block:: c
+
+    void *__alloc_token_0_malloc(size_t size);
+    void *__alloc_token_1_malloc(size_t size);
+    void *__alloc_token_2_malloc(size_t size);
+    ...
+    void *__alloc_token_0_Znwm(size_t size);
+    void *__alloc_token_1_Znwm(size_t size);
+    void *__alloc_token_2_Znwm(size_t size);
+    ...
+
+This ABI provides a more efficient alternative where
+``-falloc-token-max`` is small.
+
+Disabling Instrumentation
+-------------------------
+
+To exclude specific functions from instrumentation, you can use the
+``no_sanitize("alloc-token")`` attribute:
+
+.. code-block:: c
+
+    __attribute__((no_sanitize("alloc-token")))
+    void* custom_allocator(size_t size) {
+        return malloc(size); // Uses original malloc
+    }
+
+Note: Independent of any given allocator support, the instrumentation aims to
+remain performance neutral. As such, ``no_sanitize("alloc-token")``
+functions may be inlined into instrumented functions and vice-versa. If
+correctness is affected, such functions should explicitly be marked
+``noinline``.
+
+The ``__attribute__((disable_sanitizer_instrumentation))`` is also supported to
+disable this and other sanitizer instrumentations.
+
+Suppressions File (Ignorelist)
+------------------------------
+
+AllocToken respects the ``src`` and ``fun`` entity types in the
+:doc:`SanitizerSpecialCaseList`, which can be used to omit specified source
+files or functions from instrumentation.
+
+.. code-block:: bash
+
+    # Exclude specific source files
+    src:third_party/allocator.c
+    # Exclude function name patterns
+    fun:*custom_malloc*
+    fun:LowLevel::*
+
+.. code-block:: console
+
+    % clang++ -fsanitize=alloc-token -fsanitize-ignorelist=my_ignorelist.txt example.cc
+
+Conditional Compilation with ``__SANITIZE_ALLOC_TOKEN__``
+-----------------------------------------------------------
+
+In some cases, one may need to execute different code depending on whether
+AllocToken instrumentation is enabled. The ``__SANITIZE_ALLOC_TOKEN__`` macro
+can be used for this purpose.
+
+.. code-block:: c
+
+    #ifdef __SANITIZE_ALLOC_TOKEN__
+    // Code specific to -fsanitize=alloc-token builds
+    #endif
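
Reviewer sketch (not part of the commit): the "Runtime Interface" section above says a compatible runtime must supply the token-enabled entry points. The following is a minimal illustration of what such a runtime could look like for ``__alloc_token_malloc`` only. The partition count ``kNumPartitions``, the counter array ``g_partition_allocs``, and the helper ``partition_for`` are hypothetical names introduced here, and folding the token with a modulo is an assumption about one possible policy, not something the commit prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical partition count. A build might pair this with
// -falloc-token-max=4 so that token IDs already fall in [0, 4).
constexpr std::uint64_t kNumPartitions = 4;

// Hypothetical bookkeeping: allocations routed to each partition so far.
static std::size_t g_partition_allocs[kNumPartitions] = {};

// Hypothetical helper: fold an arbitrary token ID into a partition index.
static std::uint64_t partition_for(std::uint64_t token_id) {
  return token_id % kNumPartitions;
}

// Entry point with the signature listed in the documentation above.
extern "C" void *__alloc_token_malloc(std::size_t size,
                                      std::uint64_t token_id) {
  const std::uint64_t partition = partition_for(token_id);
  assert(partition < kNumPartitions);
  ++g_partition_allocs[partition];
  // A real allocator would carve the block out of a per-partition heap.
  // This sketch only records the routing decision and defers to malloc.
  return std::malloc(size);
}
```

A complete runtime would also need the ``calloc``, ``realloc``, and ``operator new`` variants listed in the documentation; they are omitted here to keep the sketch short.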

clang/docs/ReleaseNotes.rst

Lines changed: 4 additions & 0 deletions
@@ -203,11 +203,15 @@ Non-comprehensive list of changes in this release
   Currently, the use of ``__builtin_dedup_pack`` is limited to template arguments and base
   specifiers, it also must be used within a template context.

+- Introduce support for allocation tokens to enable allocator-level heap
+  organization strategies. A feature to instrument all allocation functions
+  with a token ID can be enabled via the ``-fsanitize=alloc-token`` flag.

 New Compiler Flags
 ------------------
 - New option ``-fno-sanitize-debug-trap-reasons`` added to disable emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``).
 - New option ``-fsanitize-debug-trap-reasons=`` added to control emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``).
+- New options for enabling allocation token instrumentation: ``-fsanitize=alloc-token``, ``-falloc-token-max=``, ``-fsanitize-alloc-token-fast-abi``, ``-fsanitize-alloc-token-extended``.


 Lanai Support

clang/docs/UsersManual.rst

Lines changed: 2 additions & 0 deletions
@@ -2194,6 +2194,8 @@ are listed below.
   protection against stack-based memory corruption errors.
 - ``-fsanitize=realtime``: :doc:`RealtimeSanitizer`,
   a real-time safety checker.
+- ``-fsanitize=alloc-token``: :doc:`AllocToken`,
+  allocation token instrumentation (requires compatible allocator).

 There are more fine-grained checks available: see
 the :ref:`list <ubsan-checks>` of specific kinds of

clang/include/clang/Basic/CodeGenOptions.def

Lines changed: 2 additions & 0 deletions
@@ -306,6 +306,8 @@ CODEGENOPT(SanitizeBinaryMetadataCovered, 1, 0, Benign) ///< Emit PCs for covere
 CODEGENOPT(SanitizeBinaryMetadataAtomics, 1, 0, Benign) ///< Emit PCs for atomic operations.
 CODEGENOPT(SanitizeBinaryMetadataUAR, 1, 0, Benign) ///< Emit PCs for start of functions
                                                     ///< that are subject for use-after-return checking.
+CODEGENOPT(SanitizeAllocTokenFastABI, 1, 0, Benign) ///< Use the AllocToken fast ABI.
+CODEGENOPT(SanitizeAllocTokenExtended, 1, 0, Benign) ///< Extend coverage to custom allocation functions.
 CODEGENOPT(SanitizeStats , 1, 0, Benign) ///< Collect statistics for sanitizers.
 ENUM_CODEGENOPT(SanitizeDebugTrapReasons, SanitizeDebugTrapReasonKind, 2, SanitizeDebugTrapReasonKind::Detailed, Benign) ///< Control how "trap reasons" are emitted in debug info
 CODEGENOPT(SimplifyLibCalls , 1, 1, Benign) ///< Set when -fbuiltin is enabled.

clang/include/clang/Basic/CodeGenOptions.h

Lines changed: 3 additions & 0 deletions
@@ -447,6 +447,9 @@ class CodeGenOptions : public CodeGenOptionsBase {

   std::optional<double> AllowRuntimeCheckSkipHotCutoff;

+  /// Maximum number of allocation tokens (0 = no max).
+  std::optional<uint64_t> AllocTokenMax;
+
   /// List of backend command-line options for -fembed-bitcode.
   std::vector<uint8_t> CmdArgs;

clang/include/clang/Basic/Sanitizers.def

Lines changed: 3 additions & 0 deletions
@@ -195,6 +195,9 @@ SANITIZER_GROUP("bounds", Bounds, ArrayBounds | LocalBounds)
 // Scudo hardened allocator
 SANITIZER("scudo", Scudo)

+// AllocToken
+SANITIZER("alloc-token", AllocToken)
+
 // Magic group, containing all sanitizers. For example, "-fno-sanitize=all"
 // can be used to disable all the sanitizers.
 SANITIZER_GROUP("all", All, ~SanitizerMask())

clang/include/clang/Driver/Options.td

Lines changed: 17 additions & 0 deletions
@@ -2730,8 +2730,25 @@ def fsanitize_skip_hot_cutoff_EQ
     "(0.0 [default] = skip none; 1.0 = skip all). "
     "Argument format: <sanitizer1>=<value1>,<sanitizer2>=<value2>,...">;

+defm sanitize_alloc_token_fast_abi : BoolOption<"f", "sanitize-alloc-token-fast-abi",
+  CodeGenOpts<"SanitizeAllocTokenFastABI">, DefaultFalse,
+  PosFlag<SetTrue, [], [ClangOption], "Use the AllocToken fast ABI">,
+  NegFlag<SetFalse, [], [ClangOption], "Use the default AllocToken ABI">>,
+  Group<f_clang_Group>;
+defm sanitize_alloc_token_extended : BoolOption<"f", "sanitize-alloc-token-extended",
+  CodeGenOpts<"SanitizeAllocTokenExtended">, DefaultFalse,
+  PosFlag<SetTrue, [], [ClangOption], "Enable">,
+  NegFlag<SetFalse, [], [ClangOption], "Disable">,
+  BothFlags<[], [ClangOption], " extended coverage to custom allocation functions">>,
+  Group<f_clang_Group>;
+
 } // end -f[no-]sanitize* flags

+def falloc_token_max_EQ : Joined<["-"], "falloc-token-max=">,
+  Group<f_Group>, Visibility<[ClangOption, CC1Option, CLOption]>,
+  MetaVarName<"<N>">,
+  HelpText<"Limit to maximum N allocation tokens (0 = no max)">;
+
 def fallow_runtime_check_skip_hot_cutoff_EQ
     : Joined<["-"], "fallow-runtime-check-skip-hot-cutoff=">,
       Group<f_clang_Group>,
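
Reviewer sketch (not part of the commit): with ``-fsanitize-alloc-token-fast-abi``, the documentation in this commit says the token ID is encoded in the callee name (``__alloc_token_0_malloc``, ``__alloc_token_1_malloc``, ...), so a runtime needs one entry point per token. Assuming a small limit such as ``-falloc-token-max=4``, one way to provide them is to stamp them out with a macro. The names ``kMaxTokens``, ``g_last_token``, ``alloc_in_partition``, and ``DEFINE_FAST_ABI_MALLOC`` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical limit, assumed to match -falloc-token-max=4 at compile time.
constexpr std::uint64_t kMaxTokens = 4;

// Hypothetical bookkeeping: the token of the most recent allocation.
static std::uint64_t g_last_token = UINT64_MAX;

// Hypothetical shared backend; a real allocator would select a
// per-token heap here instead of deferring to malloc.
static void *alloc_in_partition(std::size_t size, std::uint64_t token_id) {
  assert(token_id < kMaxTokens);
  g_last_token = token_id;
  return std::malloc(size);
}

// Defines __alloc_token_<ID>_malloc, the fast-ABI name shape shown in the
// documentation, as a thin wrapper that passes ID to the shared backend.
#define DEFINE_FAST_ABI_MALLOC(ID)                                 \
  extern "C" void *__alloc_token_##ID##_malloc(std::size_t size) { \
    return alloc_in_partition(size, ID);                           \
  }

DEFINE_FAST_ABI_MALLOC(0)
DEFINE_FAST_ABI_MALLOC(1)
DEFINE_FAST_ABI_MALLOC(2)
DEFINE_FAST_ABI_MALLOC(3)
```

Because every token needs its own symbol, this approach only stays manageable when the token limit is small, which matches the documentation's remark that the fast ABI is the more efficient choice where ``-falloc-token-max`` is small.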

clang/include/clang/Driver/SanitizerArgs.h

Lines changed: 3 additions & 1 deletion
@@ -13,6 +13,7 @@
 #include "llvm/Option/Arg.h"
 #include "llvm/Option/ArgList.h"
 #include "llvm/Transforms/Instrumentation/AddressSanitizerOptions.h"
+#include <optional>
 #include <string>
 #include <vector>

@@ -73,8 +74,9 @@ class SanitizerArgs {
   bool HwasanUseAliases = false;
   llvm::AsanDetectStackUseAfterReturnMode AsanUseAfterReturn =
       llvm::AsanDetectStackUseAfterReturnMode::Invalid;
-
   std::string MemtagMode;
+  bool AllocTokenFastABI = false;
+  bool AllocTokenExtended = false;

 public:
   /// Parses the sanitizer arguments from an argument list.

clang/lib/CodeGen/BackendUtil.cpp

Lines changed: 20 additions & 0 deletions
@@ -59,11 +59,13 @@
 #include "llvm/TargetParser/Triple.h"
 #include "llvm/Transforms/HipStdPar/HipStdPar.h"
 #include "llvm/Transforms/IPO/EmbedBitcodePass.h"
+#include "llvm/Transforms/IPO/InferFunctionAttrs.h"
 #include "llvm/Transforms/IPO/LowerTypeTests.h"
 #include "llvm/Transforms/IPO/ThinLTOBitcodeWriter.h"
 #include "llvm/Transforms/InstCombine/InstCombine.h"
 #include "llvm/Transforms/Instrumentation/AddressSanitizer.h"
 #include "llvm/Transforms/Instrumentation/AddressSanitizerOptions.h"
+#include "llvm/Transforms/Instrumentation/AllocToken.h"
 #include "llvm/Transforms/Instrumentation/BoundsChecking.h"
 #include "llvm/Transforms/Instrumentation/DataFlowSanitizer.h"
 #include "llvm/Transforms/Instrumentation/GCOVProfiler.h"
@@ -231,6 +233,14 @@ class EmitAssemblyHelper {
 };
 } // namespace

+static AllocTokenOptions getAllocTokenOptions(const CodeGenOptions &CGOpts) {
+  AllocTokenOptions Opts;
+  Opts.MaxTokens = CGOpts.AllocTokenMax;
+  Opts.Extended = CGOpts.SanitizeAllocTokenExtended;
+  Opts.FastABI = CGOpts.SanitizeAllocTokenFastABI;
+  return Opts;
+}
+
 static SanitizerCoverageOptions
 getSancovOptsFromCGOpts(const CodeGenOptions &CGOpts) {
   SanitizerCoverageOptions Opts;
@@ -784,6 +794,16 @@ static void addSanitizers(const Triple &TargetTriple,
     if (LangOpts.Sanitize.has(SanitizerKind::DataFlow)) {
       MPM.addPass(DataFlowSanitizerPass(LangOpts.NoSanitizeFiles));
     }
+
+    if (LangOpts.Sanitize.has(SanitizerKind::AllocToken)) {
+      if (Level == OptimizationLevel::O0) {
+        // The default pass builder only infers libcall function attrs when
+        // optimizing, so we insert it here because we need it for accurate
+        // memory allocation function detection.
+        MPM.addPass(InferFunctionAttrsPass());
+      }
+      MPM.addPass(AllocTokenPass(getAllocTokenOptions(CodeGenOpts)));
+    }
   };
   if (ClSanitizeOnOptimizerEarlyEP) {
     PB.registerOptimizerEarlyEPCallback(

clang/lib/CodeGen/CGExpr.cpp

Lines changed: 16 additions & 0 deletions
@@ -1272,6 +1272,22 @@ void CodeGenFunction::EmitBoundsCheckImpl(const Expr *E, llvm::Value *Bound,
   EmitCheck(std::make_pair(Check, CheckKind), CheckHandler, StaticData, Index);
 }

+void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase *CB,
+                                         QualType AllocType) {
+  assert(SanOpts.has(SanitizerKind::AllocToken) &&
+         "Only needed with -fsanitize=alloc-token");
+
+  PrintingPolicy Policy(CGM.getContext().getLangOpts());
+  Policy.SuppressTagKeyword = true;
+  Policy.FullyQualifiedName = true;
+  std::string TypeName = AllocType.getCanonicalType().getAsString(Policy);
+  auto *TypeMDS = llvm::MDString::get(CGM.getLLVMContext(), TypeName);
+
+  // Format: !{<type-name>}
+  auto *MDN = llvm::MDNode::get(CGM.getLLVMContext(), {TypeMDS});
+  CB->setMetadata("alloc_token_hint", MDN);
+}
+
 CodeGenFunction::ComplexPairTy CodeGenFunction::
 EmitComplexPrePostIncDec(const UnaryOperator *E, LValue LV,
                          bool isInc, bool isPre) {
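
Reviewer sketch (not part of the commit): ``EmitAllocTokenHint`` above attaches the canonical, fully qualified type name (without the tag keyword) as ``alloc_token_hint`` metadata, and the documentation describes the default *TypeHash* mode as deriving the token from a hash of that name. The hash function and the way ``-falloc-token-max`` bounds the result are not shown in the portion of the diff visible here, so the following uses 64-bit FNV-1a and a plain modulo purely as stand-ins. ``fnv1a64`` and ``token_for_type`` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Stand-in hash: 64-bit FNV-1a. This is an assumption for illustration,
// not the hash used by the AllocToken pass.
inline std::uint64_t fnv1a64(std::string_view s) {
  std::uint64_t h = 14695981039346656037ULL; // FNV offset basis
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ULL; // FNV prime
  }
  return h;
}

// Maps a type name to a token ID. max_tokens == 0 is treated as "no max",
// mirroring the help text of -falloc-token-max. Reducing with a modulo is
// an assumption about how the bound could be applied.
inline std::uint64_t token_for_type(std::string_view type_name,
                                    std::uint64_t max_tokens) {
  assert(!type_name.empty());
  const std::uint64_t h = fnv1a64(type_name);
  return max_tokens == 0 ? h : h % max_tokens;
}
```

The property that matters for an allocator is that every allocation of the same type receives the same token regardless of call site or translation unit, which a hash of the type name provides and the per-site *Random* and *Increment* modes do not.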
