38 commits
d97ba0f
[𝘀𝗽𝗿] initial version
melver Sep 4, 2025
ec80a6b
[𝘀𝗽𝗿] changes to main this commit is based on
melver Sep 4, 2025
b365333
fixup! Insert AllocToken into index.rst
melver Sep 4, 2025
7f1cbf9
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 5, 2025
08d0fad
fixup! Switch to fixed MD
melver Sep 5, 2025
ca7e255
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 18, 2025
3c760b5
fixup! address reviewer comments
melver Sep 18, 2025
3071ee6
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 19, 2025
31ca802
fixup! address reviewer comments round 2
melver Sep 19, 2025
6e4c2cb
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 22, 2025
276d084
fixup! use update_test_checks.py for opt tests
melver Sep 22, 2025
0172860
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 23, 2025
b56084e
fixup! do not strip _
melver Sep 23, 2025
cc62d76
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 26, 2025
5af7e7c
fixup! address some comments
melver Sep 26, 2025
110cef2
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 26, 2025
7510ebf
fixup! address more comments
melver Sep 26, 2025
4ff2136
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 29, 2025
5dd8067
rebase
melver Sep 29, 2025
9c8454d
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 30, 2025
76d0c51
fixup! address comments
melver Sep 30, 2025
fb43ef1
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 2, 2025
0c44a0a
fixup!
melver Oct 2, 2025
8dec1a6
fixup! switch Clang tests back to manually written
melver Oct 2, 2025
9a0ab30
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 6, 2025
5ba014e
fixup! factor out some CodeGen changes
melver Oct 6, 2025
de04875
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
6814ad5
rebase
melver Oct 7, 2025
30c499f
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
b26b266
rebase
melver Oct 7, 2025
660a1e6
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
090caf4
rebase
melver Oct 7, 2025
ee3ec27
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
129b88d
rebase
melver Oct 7, 2025
cb88bde
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
a975cf4
rebase
melver Oct 7, 2025
1bc7080
rebase
melver Oct 7, 2025
197fcfe
rebase
melver Oct 7, 2025
173 changes: 173 additions & 0 deletions clang/docs/AllocToken.rst
@@ -0,0 +1,173 @@
=================
Allocation Tokens
=================

.. contents::
   :local:

Introduction
============

Clang provides support for allocation tokens to enable allocator-level heap
organization strategies. Clang assigns mode-dependent token IDs to allocation
calls; the runtime behavior depends entirely on the implementation of a
compatible memory allocator.

Possible allocator strategies include:

* **Security Hardening**: Placing allocations into separate, isolated heap
  partitions. For example, separating pointer-containing types from raw data
  can mitigate exploits that rely on overflowing a primitive buffer to corrupt
  object metadata.

* **Memory Layout Optimization**: Grouping related allocations to improve data
  locality and cache utilization.

* **Custom Allocation Policies**: Applying different management strategies to
  different partitions.

Token Assignment Mode
=====================

The default mode to calculate tokens is:

* ``typehash``: This mode assigns a token ID based on the hash of the allocated
  type's name.

Other token ID assignment modes are supported, but they may be subject to
change or removal. These may (experimentally) be selected with ``-mllvm
-alloc-token-mode=<mode>``:

* ``random``: This mode assigns a statically-determined random token ID to each
  allocation site.

* ``increment``: This mode assigns a simple, incrementally increasing token ID
  to each allocation site.
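
The modes above can be pictured with a small C sketch. It is only an illustration under stated assumptions: the FNV-1a hash, the ``sketch_*`` names, and the reduction modulo a maximum token count are hypothetical stand-ins, not the hash or logic that Clang actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the compiler's hash: 64-bit FNV-1a over the
   type name. The hash Clang actually uses is not specified here. */
static uint64_t sketch_hash_type_name(const char *type_name) {
  uint64_t hash = 0xcbf29ce484222325ULL; /* FNV-1a offset basis */
  for (const char *p = type_name; *p != '\0'; ++p) {
    hash ^= (unsigned char)*p;
    hash *= 0x100000001b3ULL; /* FNV-1a prime */
  }
  return hash;
}

/* typehash-like: the same type name always maps to the same token ID.
   Reducing modulo max_tokens is an assumption of this sketch. */
static uint64_t sketch_typehash_token(const char *type_name,
                                      uint64_t max_tokens) {
  assert(max_tokens > 0);
  return sketch_hash_type_name(type_name) % max_tokens;
}

/* increment-like: each allocation site takes the next ID, wrapping at
   max_tokens. next_site is a counter owned by the caller. */
static uint64_t sketch_increment_token(uint64_t *next_site,
                                       uint64_t max_tokens) {
  assert(max_tokens > 0);
  return (*next_site)++ % max_tokens;
}
```

With a maximum of 512, ``sketch_typehash_token("struct Foo", 512)`` returns the same value below 512 on every call, which is the property an allocator would rely on to keep all allocations of one type in the same partition.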

Allocation Token Instrumentation
================================

To enable instrumentation of allocation functions, code can be compiled with
the ``-fsanitize=alloc-token`` flag:

.. code-block:: console

    % clang++ -fsanitize=alloc-token example.cc

The instrumentation transforms allocation calls to include a token ID. For
example:

.. code-block:: c

    // Original:
    ptr = malloc(size);

    // Instrumented:
    ptr = __alloc_token_malloc(size, <token id>);

The following command-line options affect generated token IDs:

* ``-falloc-token-max=<N>``
    Configures the maximum number of tokens. By default there is no maximum,
    and token IDs are bounded only by ``SIZE_MAX``.

    .. code-block:: console

        % clang++ -fsanitize=alloc-token -falloc-token-max=512 example.cc

Runtime Interface
-----------------

A compatible runtime must be provided that implements the token-enabled
allocation functions. The instrumentation generates calls to functions that
take a final ``size_t token_id`` argument.

.. code-block:: c

    // C standard library functions
    void *__alloc_token_malloc(size_t size, size_t token_id);
    void *__alloc_token_calloc(size_t count, size_t size, size_t token_id);
    void *__alloc_token_realloc(void *ptr, size_t size, size_t token_id);
    // ...

    // C++ operators (mangled names)
    // operator new(size_t, size_t)
    void *__alloc_token__Znwm(size_t size, size_t token_id);
    // operator new[](size_t, size_t)
    void *__alloc_token__Znam(size_t size, size_t token_id);
    // ... other variants like nothrow, etc., are also instrumented.
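
A minimal sketch of such a runtime might look as follows. It assumes a hypothetical fixed number of partitions (``SKETCH_NUM_PARTITIONS``) and only counts allocations per partition before forwarding to the standard allocator; a real token-aware allocator would instead serve each partition from its own heap region. All ``sketch_*`` names are illustrative and not part of any existing runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical partition count; a real allocator would choose its own. */
#define SKETCH_NUM_PARTITIONS 16

/* Per-partition counters standing in for real, isolated heap partitions. */
static size_t sketch_partition_allocs[SKETCH_NUM_PARTITIONS];

/* Maps a token ID to a partition and records the allocation. */
static void sketch_record(size_t token_id) {
  size_t partition = token_id % SKETCH_NUM_PARTITIONS;
  assert(partition < SKETCH_NUM_PARTITIONS);
  sketch_partition_allocs[partition]++;
}

/* Token-enabled entry points: the token ID is the final argument. */
void *__alloc_token_malloc(size_t size, size_t token_id) {
  sketch_record(token_id);
  return malloc(size); /* a real runtime would allocate from the partition */
}

void *__alloc_token_calloc(size_t count, size_t size, size_t token_id) {
  sketch_record(token_id);
  return calloc(count, size);
}

void *__alloc_token_realloc(void *ptr, size_t size, size_t token_id) {
  sketch_record(token_id);
  return realloc(ptr, size);
}
```

A runtime like this would presumably be built without ``-fsanitize=alloc-token`` itself, so that its own calls to ``malloc`` and friends are not rewritten into token-enabled calls.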

Fast ABI
--------

An alternative ABI can be enabled with ``-fsanitize-alloc-token-fast-abi``,
which encodes the token ID hint in the allocation function name.

.. code-block:: c

    void *__alloc_token_0_malloc(size_t size);
    void *__alloc_token_1_malloc(size_t size);
    void *__alloc_token_2_malloc(size_t size);
    // ...
    void *__alloc_token_0_Znwm(size_t size);
    void *__alloc_token_1_Znwm(size_t size);
    void *__alloc_token_2_Znwm(size_t size);
    // ...

This ABI provides a more efficient alternative when ``-falloc-token-max`` is
small.
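
On the runtime side, one way to provide these entry points is to generate one function per token ID. The sketch below assumes a build with ``-falloc-token-max=4``, so that only token IDs 0 through 3 occur; the ``SKETCH_*`` macro and ``sketch_*`` helper names are hypothetical, and the per-token counter merely stands in for a real partition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed to match -falloc-token-max=4 in this sketch. */
#define SKETCH_MAX_TOKENS 4

/* Per-token counters standing in for real heap partitions. */
static size_t sketch_token_allocs[SKETCH_MAX_TOKENS];

static void *sketch_alloc_from_partition(size_t token_id, size_t size) {
  assert(token_id < SKETCH_MAX_TOKENS);
  sketch_token_allocs[token_id]++;
  return malloc(size); /* a real runtime would use the token's partition */
}

/* Defines __alloc_token_<id>_malloc with the token ID fixed at compile
   time, so no token argument is passed at the call site. */
#define SKETCH_DEFINE_FAST_MALLOC(id)                \
  void *__alloc_token_##id##_malloc(size_t size) {   \
    return sketch_alloc_from_partition(id, size);    \
  }

SKETCH_DEFINE_FAST_MALLOC(0)
SKETCH_DEFINE_FAST_MALLOC(1)
SKETCH_DEFINE_FAST_MALLOC(2)
SKETCH_DEFINE_FAST_MALLOC(3)
```

Because the number of generated functions grows with the token maximum, this approach is only practical when that maximum is small.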

Disabling Instrumentation
-------------------------

To exclude specific functions from instrumentation, you can use the
``no_sanitize("alloc-token")`` attribute:

.. code-block:: c

    __attribute__((no_sanitize("alloc-token")))
    void* custom_allocator(size_t size) {
        return malloc(size);  // Uses original malloc
    }

Note: Independent of any given allocator support, the instrumentation aims to
remain performance neutral. As such, ``no_sanitize("alloc-token")``
functions may be inlined into instrumented functions and vice versa. If
correctness is affected, such functions should explicitly be marked
``noinline``.

The ``__attribute__((disable_sanitizer_instrumentation))`` attribute is also
supported and disables this and other sanitizer instrumentation.

Suppressions File (Ignorelist)
------------------------------

AllocToken respects the ``src`` and ``fun`` entity types in the
:doc:`SanitizerSpecialCaseList`, which can be used to omit specified source
files or functions from instrumentation.

.. code-block:: bash

    [alloc-token]
    # Exclude specific source files
    src:third_party/allocator.c
    # Exclude function name patterns
    fun:*custom_malloc*
    fun:LowLevel::*

.. code-block:: console

    % clang++ -fsanitize=alloc-token -fsanitize-ignorelist=my_ignorelist.txt example.cc

Conditional Compilation with ``__SANITIZE_ALLOC_TOKEN__``
-----------------------------------------------------------

In some cases, one may need to execute different code depending on whether
AllocToken instrumentation is enabled. The ``__SANITIZE_ALLOC_TOKEN__`` macro
can be used for this purpose.

.. code-block:: c

    #ifdef __SANITIZE_ALLOC_TOKEN__
    // Code specific to -fsanitize=alloc-token builds
    #endif
6 changes: 6 additions & 0 deletions clang/docs/ReleaseNotes.rst
@@ -257,10 +257,16 @@ Non-comprehensive list of changes in this release

- Fixed a crash when the second argument to ``__builtin_assume_aligned`` was not constant (#GH161314)

- Introduced support for :doc:`allocation tokens <AllocToken>` to enable
  allocator-level heap organization strategies. A feature to instrument all
  allocation functions with a token ID can be enabled via the
  ``-fsanitize=alloc-token`` flag.

New Compiler Flags
------------------
- New option ``-fno-sanitize-debug-trap-reasons`` added to disable emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``).
- New option ``-fsanitize-debug-trap-reasons=`` added to control emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``).
- New options for enabling allocation token instrumentation: ``-fsanitize=alloc-token``, ``-falloc-token-max=``, ``-fsanitize-alloc-token-fast-abi``, ``-fsanitize-alloc-token-extended``.


Lanai Support
18 changes: 12 additions & 6 deletions clang/docs/UsersManual.rst
@@ -2155,13 +2155,11 @@ are listed below.

.. option:: -f[no-]sanitize=check1,check2,...

-Turn on runtime checks for various forms of undefined or suspicious
-behavior.
+Turn on runtime checks or mitigations for various forms of undefined or
+suspicious behavior. These are disabled by default.

-This option controls whether Clang adds runtime checks for various
-forms of undefined or suspicious behavior, and is disabled by
-default. If a check fails, a diagnostic message is produced at
-runtime explaining the problem. The main checks are:
+The following options enable runtime checks for various forms of undefined
+or suspicious behavior:

- .. _opt_fsanitize_address:

@@ -2195,6 +2193,14 @@ are listed below.
- ``-fsanitize=realtime``: :doc:`RealtimeSanitizer`,
a real-time safety checker.

The following options enable runtime mitigations for various forms of
undefined or suspicious behavior:

- ``-fsanitize=alloc-token``: Enables :doc:`allocation tokens <AllocToken>`
for allocator-level heap organization strategies, such as for security
hardening. It passes type-derived token IDs to a compatible memory
allocator. Requires linking against a token-aware allocator.

There are more fine-grained checks available: see
the :ref:`list <ubsan-checks>` of specific kinds of
undefined behavior that can be detected and the :ref:`list <cfi-schemes>`
1 change: 1 addition & 0 deletions clang/docs/index.rst
@@ -40,6 +40,7 @@ Using Clang as a Compiler
SanitizerCoverage
SanitizerStats
SanitizerSpecialCaseList
AllocToken
BoundsSafety
BoundsSafetyAdoptionGuide
BoundsSafetyImplPlans
2 changes: 2 additions & 0 deletions clang/include/clang/Basic/CodeGenOptions.def
@@ -306,6 +306,8 @@ CODEGENOPT(SanitizeBinaryMetadataCovered, 1, 0, Benign) ///< Emit PCs for covere
CODEGENOPT(SanitizeBinaryMetadataAtomics, 1, 0, Benign) ///< Emit PCs for atomic operations.
CODEGENOPT(SanitizeBinaryMetadataUAR, 1, 0, Benign) ///< Emit PCs for start of functions
///< that are subject for use-after-return checking.
CODEGENOPT(SanitizeAllocTokenFastABI, 1, 0, Benign) ///< Use the AllocToken fast ABI.
CODEGENOPT(SanitizeAllocTokenExtended, 1, 0, Benign) ///< Extend coverage to custom allocation functions.
CODEGENOPT(SanitizeStats , 1, 0, Benign) ///< Collect statistics for sanitizers.
ENUM_CODEGENOPT(SanitizeDebugTrapReasons, SanitizeDebugTrapReasonKind, 2, SanitizeDebugTrapReasonKind::Detailed, Benign) ///< Control how "trap reasons" are emitted in debug info
CODEGENOPT(SimplifyLibCalls , 1, 1, Benign) ///< Set when -fbuiltin is enabled.
4 changes: 4 additions & 0 deletions clang/include/clang/Basic/CodeGenOptions.h
@@ -447,6 +447,10 @@ class CodeGenOptions : public CodeGenOptionsBase {

std::optional<double> AllowRuntimeCheckSkipHotCutoff;

/// Maximum number of allocation tokens (0 = no max), nullopt if none set (use
/// pass default).
std::optional<uint64_t> AllocTokenMax;

/// List of backend command-line options for -fembed-bitcode.
std::vector<uint8_t> CmdArgs;

17 changes: 17 additions & 0 deletions clang/include/clang/Driver/Options.td
@@ -2731,8 +2731,25 @@ def fsanitize_skip_hot_cutoff_EQ
"(0.0 [default] = skip none; 1.0 = skip all). "
"Argument format: <sanitizer1>=<value1>,<sanitizer2>=<value2>,...">;

defm sanitize_alloc_token_fast_abi : BoolOption<"f", "sanitize-alloc-token-fast-abi",
CodeGenOpts<"SanitizeAllocTokenFastABI">, DefaultFalse,
PosFlag<SetTrue, [], [ClangOption], "Use the AllocToken fast ABI">,
NegFlag<SetFalse, [], [ClangOption], "Use the default AllocToken ABI">>,
Group<f_clang_Group>;
defm sanitize_alloc_token_extended : BoolOption<"f", "sanitize-alloc-token-extended",
CodeGenOpts<"SanitizeAllocTokenExtended">, DefaultFalse,
PosFlag<SetTrue, [], [ClangOption], "Enable">,
NegFlag<SetFalse, [], [ClangOption], "Disable">,
BothFlags<[], [ClangOption], " extended coverage to custom allocation functions">>,
Group<f_clang_Group>;

} // end -f[no-]sanitize* flags

def falloc_token_max_EQ : Joined<["-"], "falloc-token-max=">,
Group<f_Group>, Visibility<[ClangOption, CC1Option]>,
MetaVarName<"<N>">,
HelpText<"Limit to maximum N allocation tokens (0 = no max)">;

def fallow_runtime_check_skip_hot_cutoff_EQ
: Joined<["-"], "fallow-runtime-check-skip-hot-cutoff=">,
Group<f_clang_Group>,
2 changes: 2 additions & 0 deletions clang/include/clang/Driver/SanitizerArgs.h
@@ -75,6 +75,8 @@ class SanitizerArgs {
llvm::AsanDetectStackUseAfterReturnMode::Invalid;

std::string MemtagMode;
bool AllocTokenFastABI = false;
bool AllocTokenExtended = false;

public:
/// Parses the sanitizer arguments from an argument list.
20 changes: 20 additions & 0 deletions clang/lib/CodeGen/BackendUtil.cpp
@@ -60,11 +60,13 @@
#include "llvm/TargetParser/Triple.h"
#include "llvm/Transforms/HipStdPar/HipStdPar.h"
#include "llvm/Transforms/IPO/EmbedBitcodePass.h"
#include "llvm/Transforms/IPO/InferFunctionAttrs.h"
#include "llvm/Transforms/IPO/LowerTypeTests.h"
#include "llvm/Transforms/IPO/ThinLTOBitcodeWriter.h"
#include "llvm/Transforms/InstCombine/InstCombine.h"
#include "llvm/Transforms/Instrumentation/AddressSanitizer.h"
#include "llvm/Transforms/Instrumentation/AddressSanitizerOptions.h"
#include "llvm/Transforms/Instrumentation/AllocToken.h"
#include "llvm/Transforms/Instrumentation/BoundsChecking.h"
#include "llvm/Transforms/Instrumentation/DataFlowSanitizer.h"
#include "llvm/Transforms/Instrumentation/GCOVProfiler.h"
@@ -232,6 +234,14 @@ class EmitAssemblyHelper {
};
} // namespace

static AllocTokenOptions getAllocTokenOptions(const CodeGenOptions &CGOpts) {
  AllocTokenOptions Opts;
  Opts.MaxTokens = CGOpts.AllocTokenMax;
  Opts.Extended = CGOpts.SanitizeAllocTokenExtended;
  Opts.FastABI = CGOpts.SanitizeAllocTokenFastABI;
  return Opts;
}

static SanitizerCoverageOptions
getSancovOptsFromCGOpts(const CodeGenOptions &CGOpts) {
SanitizerCoverageOptions Opts;
@@ -789,6 +799,16 @@ static void addSanitizers(const Triple &TargetTriple,
MPM.addPass(DataFlowSanitizerPass(LangOpts.NoSanitizeFiles,
PB.getVirtualFileSystemPtr()));
}

if (LangOpts.Sanitize.has(SanitizerKind::AllocToken)) {
if (Level == OptimizationLevel::O0) {
// The default pass builder only infers libcall function attrs when
// optimizing, so we insert it here because we need it for accurate
// memory allocation function detection.
MPM.addPass(InferFunctionAttrsPass());
}
MPM.addPass(AllocTokenPass(getAllocTokenOptions(CodeGenOpts)));
}
};
if (ClSanitizeOnOptimizerEarlyEP) {
PB.registerOptimizerEarlyEPCallback(