Open
Changes from 10 commits
Commits
48 commits
c796357
[𝘀𝗽𝗿] changes to main this commit is based on
melver Sep 4, 2025
a2e11fc
[𝘀𝗽𝗿] initial version
melver Sep 4, 2025
f9a8b15
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 4, 2025
1bc3905
rebase
melver Sep 4, 2025
8da5f63
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 5, 2025
2ca9c72
fixup! Switch to fixed MD
melver Sep 5, 2025
85dc54d
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 8, 2025
465097f
fixup! fix for incomplete types
melver Sep 8, 2025
6d9fc6a
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 8, 2025
f7d3204
fixup!
melver Sep 8, 2025
9ca8ddc
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 18, 2025
4284a8a
fixup! address reviewer comments
melver Sep 18, 2025
14c47f8
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 19, 2025
fa5f672
fixup! address reviewer comments round 2
melver Sep 19, 2025
0e30e56
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 22, 2025
ce655c3
fixup! use update_test_checks.py for opt tests
melver Sep 22, 2025
5bac7bb
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 23, 2025
ce53f3d
fixup! do not strip _
melver Sep 23, 2025
63e68f7
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 26, 2025
cb42dcf
fixup! address some comments
melver Sep 26, 2025
1f4e3e2
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 26, 2025
bac3951
fixup! address more comments
melver Sep 26, 2025
636c880
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 29, 2025
81d45c4
rebase
melver Sep 29, 2025
0c933b6
[𝘀𝗽𝗿] changes introduced through rebase
melver Sep 30, 2025
20b5a41
fixup! address comments
melver Sep 30, 2025
02b014d
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 2, 2025
6f8d25b
fixup!
melver Oct 2, 2025
e34c2d9
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 2, 2025
45a77f2
fixup! switch clang tests back to manually written
melver Oct 2, 2025
e2fe2ea
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
90a394b
rebase
melver Oct 7, 2025
f8a1390
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
3fee412
rebase
melver Oct 7, 2025
75b4ea6
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
adcaf3a
rebase
melver Oct 7, 2025
a0f7dcb
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
e533ccc
rebase
melver Oct 7, 2025
a5fb2d4
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
64b8cf9
rebase
melver Oct 7, 2025
e3a1d21
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
3e49b6e
rebase
melver Oct 7, 2025
64a2d7d
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 7, 2025
e613813
rebase
melver Oct 7, 2025
6ec0253
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 8, 2025
11188d2
rebase
melver Oct 8, 2025
4be1d65
[𝘀𝗽𝗿] changes introduced through rebase
melver Oct 8, 2025
3ffb904
rebase
melver Oct 8, 2025
206 changes: 206 additions & 0 deletions clang/docs/AllocToken.rst
@@ -0,0 +1,206 @@
=================
Allocation Tokens
=================

.. contents::
   :local:

Introduction
============

Clang provides support for allocation tokens to enable allocator-level heap
organization strategies. Clang assigns mode-dependent token IDs to allocation
calls; the runtime behavior depends entirely on the implementation of a
compatible memory allocator.

Possible allocator strategies include:

* **Security Hardening**: Placing allocations into separate, isolated heap
  partitions. For example, separating pointer-containing types from raw data
  can mitigate exploits that rely on overflowing a primitive buffer to corrupt
  object metadata.

* **Memory Layout Optimization**: Grouping related allocations to improve data
  locality and cache utilization.

* **Custom Allocation Policies**: Applying different management strategies to
  different partitions.

Token Assignment Mode
=====================

The default mode to calculate tokens is:

* *TypeHashPointerSplit* (mode=3): This mode assigns a token ID based on
  the hash of the allocated type's name. The top half of the ID space is
  reserved for types that contain pointers and the bottom half for types that
  do not.
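
The split described above can be pictured with the following sketch. It is not
Clang's implementation: the FNV-1a hash, the ``sketch_*`` names, and the exact
arithmetic are assumptions made for illustration. It only shows one way a type
hash could be folded into the bottom or top half of a bounded ID space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hash (64-bit FNV-1a); the hash Clang uses may differ. */
static uint64_t sketch_hash_type_name(const char *name) {
  uint64_t h = 0xcbf29ce484222325ULL;
  for (; *name; ++name) {
    h ^= (unsigned char)*name;
    h *= 0x100000001b3ULL;
  }
  return h;
}

/* Hypothetical helper: maps a type to a token in [0, max_tokens).
 * Pointer-free types land in the bottom half, pointer-containing types in
 * the top half. Assumes max_tokens >= 2. */
static uint64_t sketch_pointer_split_token(const char *type_name,
                                           bool contains_pointer,
                                           uint64_t max_tokens) {
  assert(max_tokens >= 2);
  uint64_t half = max_tokens / 2;
  uint64_t id = sketch_hash_type_name(type_name) % half;
  return contains_pointer ? id + half : id;
}
```

With ``max_tokens`` set to 512, a pointer-free type such as ``int`` would map
below 256 in this sketch, and a type containing pointers would map to 256 or
above.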

Other token ID assignment modes are supported, but they may be subject to
change or removal. These may (experimentally) be selected with ``-mllvm
-alloc-token-mode=<mode>``:

* *TypeHash* (mode=2): This mode assigns a token ID based on the hash of
  the allocated type's name.

* *Random* (mode=1): This mode assigns a statically-determined random token ID
  to each allocation site.

* *Increment* (mode=0): This mode assigns a simple, incrementally increasing
  token ID to each allocation site.

Allocation Token Instrumentation
================================

To enable instrumentation of allocation functions, code can be compiled with
the ``-fsanitize=alloc-token`` flag:

.. code-block:: console

    % clang++ -fsanitize=alloc-token example.cc

The instrumentation transforms allocation calls to include a token ID. For
example:

.. code-block:: c

    // Original:
    ptr = malloc(size);

    // Instrumented:
    ptr = __alloc_token_malloc(size, token_id);

In addition, it is typically recommended to configure the following:

* ``-falloc-token-max=<N>``
    Configures the maximum number of tokens. No max by default (tokens bounded
    by ``UINT64_MAX``).

    .. code-block:: console

        % clang++ -fsanitize=alloc-token -falloc-token-max=512 example.cc

Runtime Interface
-----------------

A compatible runtime must be provided that implements the token-enabled
allocation functions. The instrumentation generates calls to functions that
take a final ``uint64_t token_id`` argument.

.. code-block:: c

    // C standard library functions
    void *__alloc_token_malloc(size_t size, uint64_t token_id);
    void *__alloc_token_calloc(size_t count, size_t size, uint64_t token_id);
    void *__alloc_token_realloc(void *ptr, size_t size, uint64_t token_id);
    // ...

    // C++ operators (mangled names)
    // operator new(size_t, uint64_t)
    void *__alloc_token_Znwm(size_t size, uint64_t token_id);
    // operator new[](size_t, uint64_t)
    void *__alloc_token_Znam(size_t size, uint64_t token_id);
    // ... other variants like nothrow, etc., are also instrumented.
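
As a rough illustration of what a compatible runtime might look like, the
sketch below implements three of the entry points listed above by forwarding
to the system allocator and counting allocations per token. The ``sketch_*``
names and the counter array are hypothetical, and the sketch assumes a build
with ``-falloc-token-max=4`` so that token IDs stay below 4. A real allocator
would route each token to its own heap partition instead of counting.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: the program was built with -falloc-token-max=4. */
#define SKETCH_NUM_PARTITIONS 4

/* Hypothetical per-token statistics standing in for real heap partitions. */
static uint64_t sketch_alloc_count[SKETCH_NUM_PARTITIONS];

static void sketch_note_alloc(uint64_t token_id) {
  /* Under the assumed maximum, IDs are expected to stay below the limit. */
  assert(token_id < SKETCH_NUM_PARTITIONS);
  sketch_alloc_count[token_id]++;
}

void *__alloc_token_malloc(size_t size, uint64_t token_id) {
  sketch_note_alloc(token_id);
  return malloc(size);
}

void *__alloc_token_calloc(size_t count, size_t size, uint64_t token_id) {
  sketch_note_alloc(token_id);
  return calloc(count, size);
}

void *__alloc_token_realloc(void *ptr, size_t size, uint64_t token_id) {
  sketch_note_alloc(token_id);
  return realloc(ptr, size);
}
```

The runtime itself would presumably be built without ``-fsanitize=alloc-token``
so that its own calls to ``malloc`` are not rewritten.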

Fast ABI
--------

An alternative ABI can be enabled with ``-fsanitize-alloc-token-fast-abi``,
which encodes the token ID hint in the allocation function name.

.. code-block:: c

    void *__alloc_token_0_malloc(size_t size);
    void *__alloc_token_1_malloc(size_t size);
    void *__alloc_token_2_malloc(size_t size);
    ...
    void *__alloc_token_0_Znwm(size_t size);
    void *__alloc_token_1_Znwm(size_t size);
    void *__alloc_token_2_Znwm(size_t size);
    ...

This ABI provides a more efficient alternative where
``-falloc-token-max`` is small.
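
Because the fast ABI needs one entry point per token ID, a runtime could
generate them with a macro. The sketch below is one possible approach, not a
prescribed one: ``sketch_partition_malloc``, ``SKETCH_DEFINE_TOKEN_MALLOC``,
and the assumed limit of 4 tokens are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: the program was built with -falloc-token-max=4. */
#define SKETCH_MAX_TOKENS 4

/* Records the most recently used partition, for illustration only. */
static unsigned sketch_last_partition;

/* Hypothetical shared backend; a real allocator would select a heap here. */
static void *sketch_partition_malloc(unsigned partition, size_t size) {
  assert(partition < SKETCH_MAX_TOKENS);
  sketch_last_partition = partition;
  return malloc(size);
}

/* Generates one entry point per token ID, e.g. __alloc_token_0_malloc. */
#define SKETCH_DEFINE_TOKEN_MALLOC(id)             \
  void *__alloc_token_##id##_malloc(size_t size) { \
    return sketch_partition_malloc(id, size);      \
  }

SKETCH_DEFINE_TOKEN_MALLOC(0)
SKETCH_DEFINE_TOKEN_MALLOC(1)
SKETCH_DEFINE_TOKEN_MALLOC(2)
SKETCH_DEFINE_TOKEN_MALLOC(3)
```

The number of generated symbols grows with the token limit, which is
consistent with the ABI being most attractive when ``-falloc-token-max`` is
small.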

Instrumenting Non-Standard Allocation Functions
-----------------------------------------------

By default, AllocToken only instruments standard library allocation functions.
This simplifies adoption, as a compatible allocator only needs to provide
token-enabled variants for a well-defined set of standard functions.

To extend instrumentation to custom allocation functions, enable broader
coverage with ``-fsanitize-alloc-token-extended``. Such functions must be
marked with the `malloc
<https://clang.llvm.org/docs/AttributeReference.html#malloc>`_ or `alloc_size
<https://clang.llvm.org/docs/AttributeReference.html#alloc-size>`_ attribute
(or both).

For example:

.. code-block:: c

    void *custom_malloc(size_t size) __attribute__((malloc));
    void *my_malloc(size_t size) __attribute__((alloc_size(1)));

    // Original:
    ptr1 = custom_malloc(size);
    ptr2 = my_malloc(size);

    // Instrumented:
    ptr1 = __alloc_token_custom_malloc(size, token_id);
    ptr2 = __alloc_token_my_malloc(size, token_id);
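
With extended coverage, the project providing ``custom_malloc`` would also
need to provide its token-enabled counterpart. A minimal sketch, assuming the
counterpart simply records the token and forwards to the original function
(``sketch_last_token`` and ``sketch_demo_call`` are hypothetical names):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* A project-specific allocation function, marked so that
 * -fsanitize-alloc-token-extended can recognize it. */
__attribute__((malloc)) void *custom_malloc(size_t size);

void *custom_malloc(size_t size) { return malloc(size); }

/* Records the most recent token, for illustration only. */
static uint64_t sketch_last_token;

/* Token-enabled counterpart that instrumented call sites would reach. */
void *__alloc_token_custom_malloc(size_t size, uint64_t token_id) {
  sketch_last_token = token_id;
  return custom_malloc(size);
}

/* What an instrumented call site would effectively do; 42 is an arbitrary
 * token chosen for illustration. */
static void sketch_demo_call(void) {
  void *p = __alloc_token_custom_malloc(16, 42);
  assert(p != NULL);
  free(p);
}
```

The file defining ``__alloc_token_custom_malloc`` would presumably be compiled
without instrumentation, since otherwise its inner call to ``custom_malloc``
could itself be rewritten.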

Disabling Instrumentation
-------------------------

To exclude specific functions from instrumentation, you can use the
``no_sanitize("alloc-token")`` attribute:

.. code-block:: c

    __attribute__((no_sanitize("alloc-token")))
    void* custom_allocator(size_t size) {
        return malloc(size);  // Uses original malloc
    }

Note: Independent of any given allocator support, the instrumentation aims to
remain performance neutral. As such, ``no_sanitize("alloc-token")``
functions may be inlined into instrumented functions and vice versa. If
inlining would affect correctness, such functions should explicitly be marked
``noinline``.

``__attribute__((disable_sanitizer_instrumentation))`` is also supported; it
disables this and other sanitizer instrumentation.
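
Following the note on inlining above, a function that must keep calling the
uninstrumented ``malloc`` could combine both attributes. This is a sketch
only: ``sketch_raw_alloc`` and ``sketch_demo_raw_alloc`` are hypothetical
names, and compilers that do not recognize ``alloc-token`` may warn about the
``no_sanitize`` argument and ignore it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical internal helper that must keep calling plain malloc.
 * noinline keeps it from being inlined into instrumented callers. */
__attribute__((no_sanitize("alloc-token"), noinline))
void *sketch_raw_alloc(size_t size) {
  return malloc(size);
}

/* Example caller; it may be instrumented, the helper above is not. */
static void sketch_demo_raw_alloc(void) {
  void *p = sketch_raw_alloc(32);
  assert(p != NULL);
  free(p);
}
```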

Suppressions File (Ignorelist)
------------------------------

AllocToken respects the ``src`` and ``fun`` entity types in the
:doc:`SanitizerSpecialCaseList`, which can be used to omit specified source
files or functions from instrumentation.

.. code-block:: bash

    # Exclude specific source files
    src:third_party/allocator.c
    # Exclude function name patterns
    fun:*custom_malloc*
    fun:LowLevel::*

.. code-block:: console

    % clang++ -fsanitize=alloc-token -fsanitize-ignorelist=my_ignorelist.txt example.cc

Conditional Compilation with ``__SANITIZE_ALLOC_TOKEN__``
-----------------------------------------------------------

In some cases, one may need to execute different code depending on whether
AllocToken instrumentation is enabled. The ``__SANITIZE_ALLOC_TOKEN__`` macro
can be used for this purpose.

.. code-block:: c

    #ifdef __SANITIZE_ALLOC_TOKEN__
    // Code specific to -fsanitize=alloc-token builds
    #endif
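
One possible use of the macro is a small helper that reports, at run time,
whether the translation unit was built with the instrumentation. The helper
names below are hypothetical and only illustrate the pattern.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true if this translation unit was built with
 * -fsanitize=alloc-token. */
static bool sketch_alloc_token_enabled(void) {
#ifdef __SANITIZE_ALLOC_TOKEN__
  return true;
#else
  return false;
#endif
}

/* Hypothetical startup check for code that relies on token-aware
 * allocation; it aborts if the instrumentation is required but absent. */
static void sketch_require_alloc_token(bool required) {
  assert(!required || sketch_alloc_token_enabled());
}
```

Calling ``sketch_require_alloc_token(false)`` is always a no-op, while passing
``true`` only succeeds in builds that define ``__SANITIZE_ALLOC_TOKEN__``.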
4 changes: 4 additions & 0 deletions clang/docs/ReleaseNotes.rst
@@ -203,11 +203,15 @@ Non-comprehensive list of changes in this release
Currently, the use of ``__builtin_dedup_pack`` is limited to template arguments and base
specifiers, it also must be used within a template context.

- Introduce support for allocation tokens to enable allocator-level heap
organization strategies. A feature to instrument all allocation functions
with a token ID can be enabled via the ``-fsanitize=alloc-token`` flag.

New Compiler Flags
------------------
- New option ``-fno-sanitize-debug-trap-reasons`` added to disable emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``).
- New option ``-fsanitize-debug-trap-reasons=`` added to control emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``).
- New options for enabling allocation token instrumentation: ``-fsanitize=alloc-token``, ``-falloc-token-max=``, ``-fsanitize-alloc-token-fast-abi``, ``-fsanitize-alloc-token-extended``.


Lanai Support
2 changes: 2 additions & 0 deletions clang/docs/UsersManual.rst
@@ -2194,6 +2194,8 @@ are listed below.
protection against stack-based memory corruption errors.
- ``-fsanitize=realtime``: :doc:`RealtimeSanitizer`,
a real-time safety checker.
- ``-fsanitize=alloc-token``: :doc:`AllocToken`,
allocation token instrumentation (requires compatible allocator).

There are more fine-grained checks available: see
the :ref:`list <ubsan-checks>` of specific kinds of
1 change: 1 addition & 0 deletions clang/docs/index.rst
@@ -40,6 +40,7 @@ Using Clang as a Compiler
SanitizerCoverage
SanitizerStats
SanitizerSpecialCaseList
AllocToken
BoundsSafety
BoundsSafetyAdoptionGuide
BoundsSafetyImplPlans
2 changes: 2 additions & 0 deletions clang/include/clang/Basic/CodeGenOptions.def
@@ -306,6 +306,8 @@ CODEGENOPT(SanitizeBinaryMetadataCovered, 1, 0, Benign) ///< Emit PCs for covere
CODEGENOPT(SanitizeBinaryMetadataAtomics, 1, 0, Benign) ///< Emit PCs for atomic operations.
CODEGENOPT(SanitizeBinaryMetadataUAR, 1, 0, Benign) ///< Emit PCs for start of functions
///< that are subject for use-after-return checking.
CODEGENOPT(SanitizeAllocTokenFastABI, 1, 0, Benign) ///< Use the AllocToken fast ABI.
CODEGENOPT(SanitizeAllocTokenExtended, 1, 0, Benign) ///< Extend coverage to custom allocation functions.
CODEGENOPT(SanitizeStats , 1, 0, Benign) ///< Collect statistics for sanitizers.
ENUM_CODEGENOPT(SanitizeDebugTrapReasons, SanitizeDebugTrapReasonKind, 2, SanitizeDebugTrapReasonKind::Detailed, Benign) ///< Control how "trap reasons" are emitted in debug info
CODEGENOPT(SimplifyLibCalls , 1, 1, Benign) ///< Set when -fbuiltin is enabled.
3 changes: 3 additions & 0 deletions clang/include/clang/Basic/CodeGenOptions.h
@@ -447,6 +447,9 @@ class CodeGenOptions : public CodeGenOptionsBase {

std::optional<double> AllowRuntimeCheckSkipHotCutoff;

/// Maximum number of allocation tokens (0 = no max).
std::optional<uint64_t> AllocTokenMax;

/// List of backend command-line options for -fembed-bitcode.
std::vector<uint8_t> CmdArgs;

3 changes: 3 additions & 0 deletions clang/include/clang/Basic/Sanitizers.def
@@ -195,6 +195,9 @@ SANITIZER_GROUP("bounds", Bounds, ArrayBounds | LocalBounds)
// Scudo hardened allocator
SANITIZER("scudo", Scudo)

// AllocToken
SANITIZER("alloc-token", AllocToken)

// Magic group, containing all sanitizers. For example, "-fno-sanitize=all"
// can be used to disable all the sanitizers.
SANITIZER_GROUP("all", All, ~SanitizerMask())
17 changes: 17 additions & 0 deletions clang/include/clang/Driver/Options.td
@@ -2730,8 +2730,25 @@ def fsanitize_skip_hot_cutoff_EQ
"(0.0 [default] = skip none; 1.0 = skip all). "
"Argument format: <sanitizer1>=<value1>,<sanitizer2>=<value2>,...">;

defm sanitize_alloc_token_fast_abi : BoolOption<"f", "sanitize-alloc-token-fast-abi",
CodeGenOpts<"SanitizeAllocTokenFastABI">, DefaultFalse,
PosFlag<SetTrue, [], [ClangOption], "Use the AllocToken fast ABI">,
NegFlag<SetFalse, [], [ClangOption], "Use the default AllocToken ABI">>,
Group<f_clang_Group>;
defm sanitize_alloc_token_extended : BoolOption<"f", "sanitize-alloc-token-extended",
CodeGenOpts<"SanitizeAllocTokenExtended">, DefaultFalse,
PosFlag<SetTrue, [], [ClangOption], "Enable">,
NegFlag<SetFalse, [], [ClangOption], "Disable">,
BothFlags<[], [ClangOption], " extended coverage to custom allocation functions">>,
Group<f_clang_Group>;

} // end -f[no-]sanitize* flags

def falloc_token_max_EQ : Joined<["-"], "falloc-token-max=">,
Group<f_Group>, Visibility<[ClangOption, CC1Option, CLOption]>,
MetaVarName<"<N>">,
HelpText<"Limit to maximum N allocation tokens (0 = no max)">;

def fallow_runtime_check_skip_hot_cutoff_EQ
: Joined<["-"], "fallow-runtime-check-skip-hot-cutoff=">,
Group<f_clang_Group>,
4 changes: 3 additions & 1 deletion clang/include/clang/Driver/SanitizerArgs.h
@@ -13,6 +13,7 @@
#include "llvm/Option/Arg.h"
#include "llvm/Option/ArgList.h"
#include "llvm/Transforms/Instrumentation/AddressSanitizerOptions.h"
#include <optional>
#include <string>
#include <vector>

@@ -73,8 +74,9 @@ class SanitizerArgs {
bool HwasanUseAliases = false;
llvm::AsanDetectStackUseAfterReturnMode AsanUseAfterReturn =
llvm::AsanDetectStackUseAfterReturnMode::Invalid;

std::string MemtagMode;
bool AllocTokenFastABI = false;
bool AllocTokenExtended = false;

public:
/// Parses the sanitizer arguments from an argument list.
20 changes: 20 additions & 0 deletions clang/lib/CodeGen/BackendUtil.cpp
@@ -59,11 +59,13 @@
#include "llvm/TargetParser/Triple.h"
#include "llvm/Transforms/HipStdPar/HipStdPar.h"
#include "llvm/Transforms/IPO/EmbedBitcodePass.h"
#include "llvm/Transforms/IPO/InferFunctionAttrs.h"
#include "llvm/Transforms/IPO/LowerTypeTests.h"
#include "llvm/Transforms/IPO/ThinLTOBitcodeWriter.h"
#include "llvm/Transforms/InstCombine/InstCombine.h"
#include "llvm/Transforms/Instrumentation/AddressSanitizer.h"
#include "llvm/Transforms/Instrumentation/AddressSanitizerOptions.h"
#include "llvm/Transforms/Instrumentation/AllocToken.h"
#include "llvm/Transforms/Instrumentation/BoundsChecking.h"
#include "llvm/Transforms/Instrumentation/DataFlowSanitizer.h"
#include "llvm/Transforms/Instrumentation/GCOVProfiler.h"
@@ -231,6 +233,14 @@ class EmitAssemblyHelper {
};
} // namespace

static AllocTokenOptions getAllocTokenOptions(const CodeGenOptions &CGOpts) {
  AllocTokenOptions Opts;
  Opts.MaxTokens = CGOpts.AllocTokenMax;
  Opts.Extended = CGOpts.SanitizeAllocTokenExtended;
  Opts.FastABI = CGOpts.SanitizeAllocTokenFastABI;
  return Opts;
}

static SanitizerCoverageOptions
getSancovOptsFromCGOpts(const CodeGenOptions &CGOpts) {
SanitizerCoverageOptions Opts;
@@ -784,6 +794,16 @@ static void addSanitizers(const Triple &TargetTriple,
if (LangOpts.Sanitize.has(SanitizerKind::DataFlow)) {
MPM.addPass(DataFlowSanitizerPass(LangOpts.NoSanitizeFiles));
}

if (LangOpts.Sanitize.has(SanitizerKind::AllocToken)) {
if (Level == OptimizationLevel::O0) {
// The default pass builder only infers libcall function attrs when
// optimizing, so we insert it here because we need it for accurate
// memory allocation function detection.
MPM.addPass(InferFunctionAttrsPass());
}
MPM.addPass(AllocTokenPass(getAllocTokenOptions(CodeGenOpts)));
}
};
if (ClSanitizeOnOptimizerEarlyEP) {
PB.registerOptimizerEarlyEPCallback(