Merged
Changes from 44 commits
Commits
45 commits
75bfb7a
[𝘀𝗽𝗿] changes to main this commit is based on
melver Sep 4, 2025
b689546
[𝘀𝗽𝗿] initial version
melver Sep 4, 2025
68b4783
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 4, 2025
cb3d52d
fixup! Insert AllocToken into index.rst
melver Sep 4, 2025
5397b6b
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 5, 2025
33d18b2
fixup! Switch to fixed MD
melver Sep 5, 2025
14c7544
fixup! fix for incomplete types
melver Sep 8, 2025
7f70661
fixup!
melver Sep 8, 2025
22570af
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 18, 2025
1358f5a
fixup! address reviewer comments
melver Sep 18, 2025
01f8d55
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 19, 2025
3b64919
fixup! address reviewer comments round 2
melver Sep 19, 2025
69aad6d
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 22, 2025
b0e9549
fixup! use update_test_checks.py for opt tests
melver Sep 22, 2025
d5a42a1
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 23, 2025
ebab546
fixup! do not strip _
melver Sep 23, 2025
7ba5526
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 26, 2025
fb160db
fixup! address some comments
melver Sep 26, 2025
25ac802
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 26, 2025
8281324
fixup! address more comments
melver Sep 26, 2025
cb25798
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 29, 2025
2fa07d7
rebase
melver Sep 29, 2025
8641f7f
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 30, 2025
9979bca
fixup! address comments
melver Sep 30, 2025
37031e1
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 2, 2025
ca51a2b
fixup!
melver Oct 2, 2025
946afaa
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 2, 2025
0cebd94
fixup! switch clang tests back to manually written
melver Oct 2, 2025
f3e8076
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
fecfe67
rebase
melver Oct 7, 2025
6e1451c
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
fa2bb2c
rebase
melver Oct 7, 2025
10a1b88
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
6f6aa54
rebase
melver Oct 7, 2025
8502fcf
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
6ed5fe6
rebase
melver Oct 7, 2025
5e9458c
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
346e06d
rebase
melver Oct 7, 2025
fbc5f29
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
45fb47d
rebase
melver Oct 7, 2025
cfc9648
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
6225eb5
rebase
melver Oct 7, 2025
43b6898
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 8, 2025
9574188
rebase
melver Oct 8, 2025
dc6551a
rebase
melver Oct 8, 2025
178 changes: 178 additions & 0 deletions clang/docs/AllocToken.rst
@@ -0,0 +1,178 @@
=================
Allocation Tokens
=================

.. contents::
   :local:

Introduction
============

Clang provides support for allocation tokens to enable allocator-level heap
organization strategies. Clang assigns mode-dependent token IDs to allocation
calls; the runtime behavior depends entirely on the implementation of a
compatible memory allocator.

Possible allocator strategies include:

* **Security Hardening**: Placing allocations into separate, isolated heap
  partitions. For example, separating pointer-containing types from raw data
  can mitigate exploits that rely on overflowing a primitive buffer to corrupt
  object metadata.

* **Memory Layout Optimization**: Grouping related allocations to improve data
  locality and cache utilization.

* **Custom Allocation Policies**: Applying different management strategies to
  different partitions.

Token Assignment Mode
=====================

The default mode to calculate tokens is:

* ``typehashpointersplit``: This mode assigns a token ID based on the hash of
  the allocated type's name, where the top half ID-space is reserved for types
  that contain pointers and the bottom half for types that do not contain
  pointers.
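
As an illustration only, the split described above could be sketched as
follows. The hash function (FNV-1a), the helper names ``example_type_hash`` and
``example_token``, and the modulo reduction are all assumptions made for this
sketch; the hash and reduction that Clang actually uses are not specified here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the compiler's type-name hash (FNV-1a, 64-bit).
   The hash Clang really uses is not described in this document. */
static uint64_t example_type_hash(const char *type_name) {
  uint64_t h = 0xcbf29ce484222325ULL; /* FNV offset basis */
  for (const char *p = type_name; *p != '\0'; ++p) {
    h ^= (unsigned char)*p;
    h *= 0x100000001b3ULL; /* FNV prime */
  }
  return h;
}

/* Assumed mapping: fold the hash into one half of [0, max_tokens).
   Bottom half: types without pointers. Top half: types with pointers.
   Requires max_tokens to be even and at least 2. */
static uint64_t example_token(const char *type_name, bool contains_pointer,
                              uint64_t max_tokens) {
  uint64_t half = max_tokens / 2;
  uint64_t slot = example_type_hash(type_name) % half;
  return contains_pointer ? half + slot : slot;
}
```

Under these assumptions and a maximum of 512 tokens, a pointer-free type such
as ``int`` gets an ID in ``[0, 256)``, while a type containing pointers gets an
ID in ``[256, 512)``.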

Other token ID assignment modes are supported, but they may be subject to
change or removal. These may (experimentally) be selected with ``-mllvm
-alloc-token-mode=<mode>``:

* ``typehash``: This mode assigns a token ID based on the hash of the allocated
  type's name.

* ``random``: This mode assigns a statically-determined random token ID to each
  allocation site.

* ``increment``: This mode assigns a simple, incrementally increasing token ID
  to each allocation site.

Allocation Token Instrumentation
================================

To enable instrumentation of allocation functions, code can be compiled with
the ``-fsanitize=alloc-token`` flag:

.. code-block:: console

    % clang++ -fsanitize=alloc-token example.cc

The instrumentation transforms allocation calls to include a token ID. For
example:

.. code-block:: c

    // Original:
    ptr = malloc(size);

    // Instrumented:
    ptr = __alloc_token_malloc(size, <token id>);

The following command-line options affect generated token IDs:

* ``-falloc-token-max=<N>``
    Configures the maximum number of tokens. No max by default (tokens bounded
    by ``SIZE_MAX``).

    .. code-block:: console

        % clang++ -fsanitize=alloc-token -falloc-token-max=512 example.cc

Runtime Interface
-----------------

A compatible runtime must be provided that implements the token-enabled
allocation functions. The instrumentation generates calls to functions that
take a final ``size_t token_id`` argument.

.. code-block:: c

    // C standard library functions
    void *__alloc_token_malloc(size_t size, size_t token_id);
    void *__alloc_token_calloc(size_t count, size_t size, size_t token_id);
    void *__alloc_token_realloc(void *ptr, size_t size, size_t token_id);
    // ...

    // C++ operators (mangled names)
    // operator new(size_t, size_t)
    void *__alloc_token__Znwm(size_t size, size_t token_id);
    // operator new[](size_t, size_t)
    void *__alloc_token__Znam(size_t size, size_t token_id);
    // ... other variants like nothrow, etc., are also instrumented.
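
A minimal sketch of what such a runtime might look like is shown below. Only
the three ``__alloc_token_*`` signatures come from the interface above; the
partition count, the ``example_*`` names, and the policy of folding the token
ID onto a partition index are assumptions for illustration. The sketch simply
forwards to the standard allocator where a real implementation would allocate
from a per-partition heap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical partition count; a real allocator chooses its own layout. */
#define EXAMPLE_NUM_PARTITIONS 4

/* Per-partition counters, kept only to make the routing observable. */
static size_t example_partition_allocs[EXAMPLE_NUM_PARTITIONS];

/* Assumed policy: fold the token ID onto a partition index. */
static size_t example_partition_for(size_t token_id) {
  return token_id % EXAMPLE_NUM_PARTITIONS;
}

/* Note: this file is assumed to be built without -fsanitize=alloc-token,
   otherwise the malloc() calls below would themselves be instrumented. */
void *__alloc_token_malloc(size_t size, size_t token_id) {
  example_partition_allocs[example_partition_for(token_id)]++;
  /* Placeholder: a real allocator would use the partition's own heap. */
  return malloc(size);
}

void *__alloc_token_calloc(size_t count, size_t size, size_t token_id) {
  example_partition_allocs[example_partition_for(token_id)]++;
  return calloc(count, size);
}

void *__alloc_token_realloc(void *ptr, size_t size, size_t token_id) {
  example_partition_allocs[example_partition_for(token_id)]++;
  return realloc(ptr, size);
}
```

The C++ entry points such as ``__alloc_token__Znwm`` would follow the same
pattern and are omitted here for brevity.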

Fast ABI
--------

An alternative ABI can be enabled with ``-fsanitize-alloc-token-fast-abi``,
which encodes the token ID hint in the allocation function name.

.. code-block:: c

    void *__alloc_token_0_malloc(size_t size);
    void *__alloc_token_1_malloc(size_t size);
    void *__alloc_token_2_malloc(size_t size);
    ...
    void *__alloc_token_0_Znwm(size_t size);
    void *__alloc_token_1_Znwm(size_t size);
    void *__alloc_token_2_Znwm(size_t size);
    ...

This ABI can be more efficient when ``-falloc-token-max`` is small, because the
token ID is carried by the function name instead of an extra argument.
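
One way a runtime might provide these entry points is to stamp them out with a
macro, one per token ID. The sketch below assumes a build with
``-falloc-token-max=4``; the macro ``EXAMPLE_DEFINE_FAST_MALLOC`` and the
helper ``example_alloc_with_token`` are hypothetical names, not part of any
existing runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Records the most recent token, only so the dispatch can be observed. */
static size_t example_last_token;

/* Hypothetical common backend shared by all fast-ABI entry points. */
static void *example_alloc_with_token(size_t size, size_t token_id) {
  example_last_token = token_id;
  /* Placeholder: a real allocator would pick a partition from token_id. */
  return malloc(size);
}

/* Defines __alloc_token_<id>_malloc, which forwards its fixed token ID. */
#define EXAMPLE_DEFINE_FAST_MALLOC(id)               \
  void *__alloc_token_##id##_malloc(size_t size) {   \
    return example_alloc_with_token(size, id);       \
  }

/* One definition per token ID in [0, 4), matching the assumed maximum. */
EXAMPLE_DEFINE_FAST_MALLOC(0)
EXAMPLE_DEFINE_FAST_MALLOC(1)
EXAMPLE_DEFINE_FAST_MALLOC(2)
EXAMPLE_DEFINE_FAST_MALLOC(3)
```

Because every token ID needs its own symbol, this approach only stays practical
while the maximum number of tokens is small.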

Disabling Instrumentation
-------------------------

To exclude specific functions from instrumentation, you can use the
``no_sanitize("alloc-token")`` attribute:

.. code-block:: c

    __attribute__((no_sanitize("alloc-token")))
    void* custom_allocator(size_t size) {
        return malloc(size);  // Uses original malloc
    }

Note: Independent of any given allocator support, the instrumentation aims to
remain performance neutral. As such, ``no_sanitize("alloc-token")``
functions may be inlined into instrumented functions and vice-versa. If
correctness is affected, such functions should explicitly be marked
``noinline``.
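
For instance, a function whose ``malloc`` call must stay uninstrumented even
after inlining could combine both attributes. In this sketch,
``EXAMPLE_NO_ALLOC_TOKEN`` and ``example_raw_alloc`` are hypothetical names, and
the preprocessor guard is an assumption added so that compilers without the
``alloc-token`` sanitizer never see its name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Only name the sanitizer when it is enabled; otherwise keep just noinline. */
#ifdef __SANITIZE_ALLOC_TOKEN__
#define EXAMPLE_NO_ALLOC_TOKEN \
  __attribute__((noinline, no_sanitize("alloc-token")))
#else
#define EXAMPLE_NO_ALLOC_TOKEN __attribute__((noinline))
#endif

/* Hypothetical helper: noinline keeps its body out of instrumented callers,
   so the malloc() call below is not rewritten to a token-enabled variant. */
EXAMPLE_NO_ALLOC_TOKEN
void *example_raw_alloc(size_t size) {
  return malloc(size);
}
```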

The ``__attribute__((disable_sanitizer_instrumentation))`` attribute is also
supported to disable this and other sanitizer instrumentations.

Suppressions File (Ignorelist)
------------------------------

AllocToken respects the ``src`` and ``fun`` entity types in the
:doc:`SanitizerSpecialCaseList`, which can be used to omit specified source
files or functions from instrumentation.

.. code-block:: bash

    [alloc-token]
    # Exclude specific source files
    src:third_party/allocator.c
    # Exclude function name patterns
    fun:*custom_malloc*
    fun:LowLevel::*

.. code-block:: console

    % clang++ -fsanitize=alloc-token -fsanitize-ignorelist=my_ignorelist.txt example.cc

Conditional Compilation with ``__SANITIZE_ALLOC_TOKEN__``
-----------------------------------------------------------

In some cases, one may need to execute different code depending on whether
AllocToken instrumentation is enabled. The ``__SANITIZE_ALLOC_TOKEN__`` macro
can be used for this purpose.

.. code-block:: c

    #ifdef __SANITIZE_ALLOC_TOKEN__
    // Code specific to -fsanitize=alloc-token builds
    #endif
6 changes: 6 additions & 0 deletions clang/docs/ReleaseNotes.rst
@@ -257,10 +257,16 @@ Non-comprehensive list of changes in this release

- Fixed a crash when the second argument to ``__builtin_assume_aligned`` was not constant (#GH161314)

- Introduce support for :doc:`allocation tokens <AllocToken>` to enable
allocator-level heap organization strategies. A feature to instrument all
allocation functions with a token ID can be enabled via the
``-fsanitize=alloc-token`` flag.

New Compiler Flags
------------------
- New option ``-fno-sanitize-debug-trap-reasons`` added to disable emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``).
- New option ``-fsanitize-debug-trap-reasons=`` added to control emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``).
- New options for enabling allocation token instrumentation: ``-fsanitize=alloc-token``, ``-falloc-token-max=``, ``-fsanitize-alloc-token-fast-abi``, ``-fsanitize-alloc-token-extended``.


Lanai Support
18 changes: 12 additions & 6 deletions clang/docs/UsersManual.rst
@@ -2155,13 +2155,11 @@ are listed below.

.. option:: -f[no-]sanitize=check1,check2,...

-  Turn on runtime checks for various forms of undefined or suspicious
-  behavior.
+  Turn on runtime checks or mitigations for various forms of undefined or
+  suspicious behavior. These are disabled by default.

-  This option controls whether Clang adds runtime checks for various
-  forms of undefined or suspicious behavior, and is disabled by
-  default. If a check fails, a diagnostic message is produced at
-  runtime explaining the problem. The main checks are:
+  The following options enable runtime checks for various forms of undefined
+  or suspicious behavior:

- .. _opt_fsanitize_address:

@@ -2195,6 +2193,14 @@ are listed below.
- ``-fsanitize=realtime``: :doc:`RealtimeSanitizer`,
a real-time safety checker.

The following options enable runtime mitigations for various forms of
undefined or suspicious behavior:

- ``-fsanitize=alloc-token``: Enables :doc:`allocation tokens <AllocToken>`
for allocator-level heap organization strategies, such as for security
hardening. It passes type-derived token IDs to a compatible memory
allocator. Requires linking against a token-aware allocator.

There are more fine-grained checks available: see
the :ref:`list <ubsan-checks>` of specific kinds of
undefined behavior that can be detected and the :ref:`list <cfi-schemes>`
1 change: 1 addition & 0 deletions clang/docs/index.rst
@@ -40,6 +40,7 @@ Using Clang as a Compiler
SanitizerCoverage
SanitizerStats
SanitizerSpecialCaseList
AllocToken
BoundsSafety
BoundsSafetyAdoptionGuide
BoundsSafetyImplPlans
2 changes: 2 additions & 0 deletions clang/include/clang/Basic/CodeGenOptions.def
@@ -306,6 +306,8 @@ CODEGENOPT(SanitizeBinaryMetadataCovered, 1, 0, Benign) ///< Emit PCs for covere
CODEGENOPT(SanitizeBinaryMetadataAtomics, 1, 0, Benign) ///< Emit PCs for atomic operations.
CODEGENOPT(SanitizeBinaryMetadataUAR, 1, 0, Benign) ///< Emit PCs for start of functions
///< that are subject for use-after-return checking.
CODEGENOPT(SanitizeAllocTokenFastABI, 1, 0, Benign) ///< Use the AllocToken fast ABI.
CODEGENOPT(SanitizeAllocTokenExtended, 1, 0, Benign) ///< Extend coverage to custom allocation functions.
CODEGENOPT(SanitizeStats , 1, 0, Benign) ///< Collect statistics for sanitizers.
ENUM_CODEGENOPT(SanitizeDebugTrapReasons, SanitizeDebugTrapReasonKind, 2, SanitizeDebugTrapReasonKind::Detailed, Benign) ///< Control how "trap reasons" are emitted in debug info
CODEGENOPT(SimplifyLibCalls , 1, 1, Benign) ///< Set when -fbuiltin is enabled.
4 changes: 4 additions & 0 deletions clang/include/clang/Basic/CodeGenOptions.h
@@ -447,6 +447,10 @@ class CodeGenOptions : public CodeGenOptionsBase {

std::optional<double> AllowRuntimeCheckSkipHotCutoff;

/// Maximum number of allocation tokens (0 = no max), nullopt if none set (use
/// pass default).
std::optional<uint64_t> AllocTokenMax;

/// List of backend command-line options for -fembed-bitcode.
std::vector<uint8_t> CmdArgs;

17 changes: 17 additions & 0 deletions clang/include/clang/Driver/Options.td
@@ -2731,8 +2731,25 @@ def fsanitize_skip_hot_cutoff_EQ
"(0.0 [default] = skip none; 1.0 = skip all). "
"Argument format: <sanitizer1>=<value1>,<sanitizer2>=<value2>,...">;

defm sanitize_alloc_token_fast_abi : BoolOption<"f", "sanitize-alloc-token-fast-abi",
  CodeGenOpts<"SanitizeAllocTokenFastABI">, DefaultFalse,
  PosFlag<SetTrue, [], [ClangOption], "Use the AllocToken fast ABI">,
  NegFlag<SetFalse, [], [ClangOption], "Use the default AllocToken ABI">>,
  Group<f_clang_Group>;
defm sanitize_alloc_token_extended : BoolOption<"f", "sanitize-alloc-token-extended",
  CodeGenOpts<"SanitizeAllocTokenExtended">, DefaultFalse,
  PosFlag<SetTrue, [], [ClangOption], "Enable">,
  NegFlag<SetFalse, [], [ClangOption], "Disable">,
  BothFlags<[], [ClangOption], " extended coverage to custom allocation functions">>,
  Group<f_clang_Group>;

} // end -f[no-]sanitize* flags

def falloc_token_max_EQ : Joined<["-"], "falloc-token-max=">,
  Group<f_Group>, Visibility<[ClangOption, CC1Option]>,
  MetaVarName<"<N>">,
  HelpText<"Limit to maximum N allocation tokens (0 = no max)">;

def fallow_runtime_check_skip_hot_cutoff_EQ
: Joined<["-"], "fallow-runtime-check-skip-hot-cutoff=">,
Group<f_clang_Group>,
2 changes: 2 additions & 0 deletions clang/include/clang/Driver/SanitizerArgs.h
@@ -75,6 +75,8 @@ class SanitizerArgs {
llvm::AsanDetectStackUseAfterReturnMode::Invalid;

std::string MemtagMode;
bool AllocTokenFastABI = false;
bool AllocTokenExtended = false;

public:
/// Parses the sanitizer arguments from an argument list.
20 changes: 20 additions & 0 deletions clang/lib/CodeGen/BackendUtil.cpp
@@ -60,11 +60,13 @@
#include "llvm/TargetParser/Triple.h"
#include "llvm/Transforms/HipStdPar/HipStdPar.h"
#include "llvm/Transforms/IPO/EmbedBitcodePass.h"
#include "llvm/Transforms/IPO/InferFunctionAttrs.h"
#include "llvm/Transforms/IPO/LowerTypeTests.h"
#include "llvm/Transforms/IPO/ThinLTOBitcodeWriter.h"
#include "llvm/Transforms/InstCombine/InstCombine.h"
#include "llvm/Transforms/Instrumentation/AddressSanitizer.h"
#include "llvm/Transforms/Instrumentation/AddressSanitizerOptions.h"
#include "llvm/Transforms/Instrumentation/AllocToken.h"
#include "llvm/Transforms/Instrumentation/BoundsChecking.h"
#include "llvm/Transforms/Instrumentation/DataFlowSanitizer.h"
#include "llvm/Transforms/Instrumentation/GCOVProfiler.h"
@@ -232,6 +234,14 @@ class EmitAssemblyHelper {
};
} // namespace

static AllocTokenOptions getAllocTokenOptions(const CodeGenOptions &CGOpts) {
  AllocTokenOptions Opts;
  Opts.MaxTokens = CGOpts.AllocTokenMax;
  Opts.Extended = CGOpts.SanitizeAllocTokenExtended;
  Opts.FastABI = CGOpts.SanitizeAllocTokenFastABI;
  return Opts;
}

static SanitizerCoverageOptions
getSancovOptsFromCGOpts(const CodeGenOptions &CGOpts) {
SanitizerCoverageOptions Opts;
@@ -789,6 +799,16 @@ static void addSanitizers(const Triple &TargetTriple,
MPM.addPass(DataFlowSanitizerPass(LangOpts.NoSanitizeFiles,
PB.getVirtualFileSystemPtr()));
}

if (LangOpts.Sanitize.has(SanitizerKind::AllocToken)) {
  if (Level == OptimizationLevel::O0) {
    // The default pass builder only infers libcall function attrs when
    // optimizing, so we insert it here because we need it for accurate
    // memory allocation function detection.
    MPM.addPass(InferFunctionAttrsPass());
  }
  MPM.addPass(AllocTokenPass(getAllocTokenOptions(CodeGenOpts)));
}
};
if (ClSanitizeOnOptimizerEarlyEP) {
PB.registerOptimizerEarlyEPCallback(