[Clang][CodeGen] Introduce the AllocToken SanitizerKind #162098

melver · 2025-10-06T14:55:52Z

Introduce the "alloc-token" sanitizer kind, in preparation of wiring it
up. Currently this is a no-op, and any attempt to enable it will result
in failure:

clang: error: unsupported option '-fsanitize=alloc-token' for target 'x86_64-unknown-linux-gnu'

In this step we can already wire up the sanitize_alloc_token IR
attribute where the instrumentation is enabled. Subsequent changes will
complete wiring up the AllocToken pass.

This change is part of the following series:

Created using spr 1.3.8-beta.1 [skip ci]

Created using spr 1.3.8-beta.1

llvmbot · 2025-10-06T15:05:38Z

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Marco Elver (melver)

Changes

Introduce the "alloc-token" sanitizer kind, in preparation of wiring it
up. Currently this is a no-op, and any attempt to enable it will result
in failure:

clang: error: unsupported option '-fsanitize=alloc-token' for target 'x86_64-unknown-linux-gnu'

In this step we can already wire up the sanitize_alloc_token IR
attribute where the instrumentation is enabled. Subsequent changes will
complete wiring up the AllocToken pass.

This change is part of the following series:

Full diff: https://github.com/llvm/llvm-project/pull/162098.diff

2 Files Affected:

(modified) clang/include/clang/Basic/Sanitizers.def (+3)
(modified) clang/lib/CodeGen/CodeGenFunction.cpp (+2)

diff --git a/clang/include/clang/Basic/Sanitizers.def b/clang/include/clang/Basic/Sanitizers.def
index 1d0e97cc7fb4c..da85431625026 100644
--- a/clang/include/clang/Basic/Sanitizers.def
+++ b/clang/include/clang/Basic/Sanitizers.def
@@ -195,6 +195,9 @@ SANITIZER_GROUP("bounds", Bounds, ArrayBounds | LocalBounds)
 // Scudo hardened allocator
 SANITIZER("scudo", Scudo)
 
+// AllocToken
+SANITIZER("alloc-token", AllocToken)
+
 // Magic group, containing all sanitizers. For example, "-fno-sanitize=all"
 // can be used to disable all the sanitizers.
 SANITIZER_GROUP("all", All, ~SanitizerMask())
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index b2fe9171372d8..acf8de4dee147 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -846,6 +846,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
       Fn->addFnAttr(llvm::Attribute::SanitizeNumericalStability);
     if (SanOpts.hasOneOf(SanitizerKind::Memory | SanitizerKind::KernelMemory))
       Fn->addFnAttr(llvm::Attribute::SanitizeMemory);
+    if (SanOpts.has(SanitizerKind::AllocToken))
+      Fn->addFnAttr(llvm::Attribute::SanitizeAllocToken);
   }
   if (SanOpts.has(SanitizerKind::SafeStack))
     Fn->addFnAttr(llvm::Attribute::SafeStack);

Introduce the "alloc-token" sanitizer kind, in preparation of wiring it up. Currently this is a no-op, and any attempt to enable it will result in failure: clang: error: unsupported option '-fsanitize=alloc-token' for target 'x86_64-unknown-linux-gnu' In this step we can already wire up the `sanitize_alloc_token` IR attribute where the instrumentation is enabled. Subsequent changes will complete wiring up the AllocToken pass. Pull Request: llvm#162098

Created using spr 1.3.8-beta.1 [skip ci]

Created using spr 1.3.8-beta.1

Created using spr 1.3.8-beta.1 [skip ci]

Created using spr 1.3.8-beta.1

Introduce the "alloc-token" sanitizer kind, in preparation of wiring it up. Currently this is a no-op, and any attempt to enable it will result in failure: clang: error: unsupported option '-fsanitize=alloc-token' for target 'x86_64-unknown-linux-gnu' In this step we can already wire up the `sanitize_alloc_token` IR attribute where the instrumentation is enabled. Subsequent changes will complete wiring up the AllocToken pass. Pull Request: llvm#162098

… metadata (#160131) In preparation of adding the "AllocToken" pass, add the pre-requisite `sanitize_alloc_token` function attribute and `alloc_token` metadata. --- This change is part of the following series: 1. #160131 2. #156838 3. #162098 4. #162099 5. #156839 6. #156840 7. #156841 8. #156842

Created using spr 1.3.8-beta.1 [skip ci]

Created using spr 1.3.8-beta.1

…alloc_token metadata (#160131) In preparation of adding the "AllocToken" pass, add the pre-requisite `sanitize_alloc_token` function attribute and `alloc_token` metadata. --- This change is part of the following series: 1. llvm/llvm-project#160131 2. llvm/llvm-project#156838 3. llvm/llvm-project#162098 4. llvm/llvm-project#162099 5. llvm/llvm-project#156839 6. llvm/llvm-project#156840 7. llvm/llvm-project#156841 8. llvm/llvm-project#156842

Introduce `AllocToken`, an instrumentation pass designed to provide tokens to memory allocators enabling various heap organization strategies, such as heap partitioning. Initially, the pass instruments functions marked with a new attribute `sanitize_alloc_token` by rewriting allocation calls to include a token ID, appended as a function argument with the default ABI. The design aims to provide a flexible framework for implementing different token generation schemes. It currently supports the following token modes: - TypeHash (default): token IDs based on a hash of the allocated type - Random: statically-assigned pseudo-random token IDs - Increment: incrementing token IDs per TU For the `TypeHash` mode introduce support for `!alloc_token` metadata: the metadata can be attached to allocation calls to provide richer semantic information to be consumed by the AllocToken pass. Optimization remarks can be enabled to show where no metadata was available. An alternative "fast ABI" is provided, where instead of passing the token ID as an argument (e.g., `__alloc_token_malloc(size, id)`), the token ID is directly encoded into the name of the called function (e.g., `__alloc_token_0_malloc(size)`). Where the maximum tokens is small, this offers more efficient instrumentation by avoiding the overhead of passing an additional argument at each allocation site. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 [1] --- This change is part of the following series: 1. #160131 2. #156838 3. #162098 4. #162099 5. #156839 6. #156840 7. #156841 8. #156842

…ions" (#162412) Reverts llvm/llvm-project#162099 Reason: this commit depends on #162098, which I am reverting due to build breakage (see llvm/llvm-project#162098 (comment)).

…2413) Reverts #162098 Reason: buildbot breakage (see #162098 (comment))

thurstond · 2025-10-08T02:58:25Z

Due to the time zone difference, I've gone ahead and reverted this patch and its dependent patch (#162099)

…rKind" (#162413) Reverts llvm/llvm-project#162098 Reason: buildbot breakage (see llvm/llvm-project#162098 (comment))

melver · 2025-10-08T13:29:46Z

Due to the time zone difference, I've gone ahead and reverted this patch and its dependent patch (#162099)

Sigh, this is a brittle test, or rather an unfortunate side-effect of incrementally building & testing on a CI where the test outputs are not cleared. I was able to reproduce this when I checked out 93f2e0a, then checked out this change, and retested with the zorg scripts. Then, if I run:

rm -rf zorg-test/llvm_build_asan_ubsan/tools/clang/test/Preprocessor/Output/print-header-json.c.tmp/

And rerun the tests, the tests pass:

../zorg-test/llvm_build_asan_ubsan/bin/llvm-lit -v clang/test/Preprocessor/print-header-json.c
-- Testing: 1 tests, 1 workers --
PASS: Clang :: Preprocessor/print-header-json.c (1 of 1)

Testing Time: 3.57s

Total Discovered Tests: 1
  Passed: 1 (100.00%)

We could try to fix the test to clear the cache dir or fix the test scripts. I suspect fixing the test is the better option, because everyone who does incremental build + test will have this problem.

Summary of the problem is this: After a patch (such as one adding new sanitizer kind) that changes the binary format of PCMs (because they track codegen options), reusing a stale cached PCM is no longer binary-compatible. Here, adding a new sanitizer option altered the implicit binary layout of the serialized LangOptions. The build & test system is oblivious to this. When the new compiler attempted to read the old module file, it misinterpreted the data due to the layout mismatch, resulting in a heap-buffer-overflow.

TLDR; Clang's PCM binary format doesn't encode a version and attempting to load version-incompatible PCMs from previous test invocations after an implicit change results in a heap buffer overflow and assorted failures.

[ Reland after 7815df1 ("[Clang] Fix brittle print-header-json.c test") ] Introduce the "alloc-token" sanitizer kind, in preparation of wiring it up. Currently this is a no-op, and any attempt to enable it will result in failure: clang: error: unsupported option '-fsanitize=alloc-token' for target 'x86_64-unknown-linux-gnu' In this step we can already wire up the `sanitize_alloc_token` IR attribute where the instrumentation is enabled. Subsequent changes will complete wiring up the AllocToken pass. --- This change is part of the following series: 1. #160131 2. #156838 3. #162098 4. #162099 5. #156839 6. #156840 7. #156841 8. #156842

[ Reland after 7815df1 ("[Clang] Fix brittle print-header-json.c test") ] For new expressions, the allocated type is syntactically known and we can trivially emit the !alloc_token metadata. A subsequent change will wire up the AllocToken pass and introduce appropriate tests. --- This change is part of the following series: 1. #160131 2. #156838 3. #162098 4. #162099 5. #156839 6. #156840 7. #156841 8. #156842

…162098) [ Reland after 7815df1 ("[Clang] Fix brittle print-header-json.c test") ] Introduce the "alloc-token" sanitizer kind, in preparation of wiring it up. Currently this is a no-op, and any attempt to enable it will result in failure: clang: error: unsupported option '-fsanitize=alloc-token' for target 'x86_64-unknown-linux-gnu' In this step we can already wire up the `sanitize_alloc_token` IR attribute where the instrumentation is enabled. Subsequent changes will complete wiring up the AllocToken pass. --- This change is part of the following series: 1. llvm/llvm-project#160131 2. llvm/llvm-project#156838 3. llvm/llvm-project#162098 4. llvm/llvm-project#162099 5. llvm/llvm-project#156839 6. llvm/llvm-project#156840 7. llvm/llvm-project#156841 8. llvm/llvm-project#156842

…62099) [ Reland after 7815df1 ("[Clang] Fix brittle print-header-json.c test") ] For new expressions, the allocated type is syntactically known and we can trivially emit the !alloc_token metadata. A subsequent change will wire up the AllocToken pass and introduce appropriate tests. --- This change is part of the following series: 1. llvm/llvm-project#160131 2. llvm/llvm-project#156838 3. llvm/llvm-project#162098 4. llvm/llvm-project#162099 5. llvm/llvm-project#156839 6. llvm/llvm-project#156840 7. llvm/llvm-project#156841 8. llvm/llvm-project#156842

thurstond · 2025-10-08T16:30:03Z

I see, thank you @melver!

Wire up the `-fsanitize=alloc-token` command-line option, hooking up the `AllocToken` pass -- it provides allocation tokens to compatible runtime allocators, enabling different heap organization strategies, e.g. hardening schemes based on heap partitioning. The instrumentation rewrites standard allocation calls into variants that accept an additional `size_t token_id` argument. For example, calls to `malloc(size)` become `__alloc_token_malloc(size, token_id)`, and a C++ `new MyType` expression will call `__alloc_token__Znwm(size, token_id)`. Currently untyped allocation calls do not yet have `!alloc_token` metadata, and therefore receive the fallback token only. This will be fixed in subsequent changes through best-effort type-inference. One benefit of the instrumentation approach is that it can be applied transparently to large codebases, and scales in deployment as other sanitizers. Similarly to other sanitizers, instrumentation can selectively be controlled using `__attribute__((no_sanitize("alloc-token")))`. Support for sanitizer ignorelists to disable instrumentation for specific functions or source files is implemented. See clang/docs/AllocToken.rst for more usage instructions. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 --- This change is part of the following series: 1. #160131 2. #156838 3. #162098 4. #162099 5. #156839 6. #156840 7. #156841 8. #156842

Wire up the `-fsanitize=alloc-token` command-line option, hooking up the `AllocToken` pass -- it provides allocation tokens to compatible runtime allocators, enabling different heap organization strategies, e.g. hardening schemes based on heap partitioning. The instrumentation rewrites standard allocation calls into variants that accept an additional `size_t token_id` argument. For example, calls to `malloc(size)` become `__alloc_token_malloc(size, token_id)`, and a C++ `new MyType` expression will call `__alloc_token__Znwm(size, token_id)`. Currently untyped allocation calls do not yet have `!alloc_token` metadata, and therefore receive the fallback token only. This will be fixed in subsequent changes through best-effort type-inference. One benefit of the instrumentation approach is that it can be applied transparently to large codebases, and scales in deployment as other sanitizers. Similarly to other sanitizers, instrumentation can selectively be controlled using `__attribute__((no_sanitize("alloc-token")))`. Support for sanitizer ignorelists to disable instrumentation for specific functions or source files is implemented. See clang/docs/AllocToken.rst for more usage instructions. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 --- This change is part of the following series: 1. llvm/llvm-project#160131 2. llvm/llvm-project#156838 3. llvm/llvm-project#162098 4. llvm/llvm-project#162099 5. llvm/llvm-project#156839 6. llvm/llvm-project#156840 7. llvm/llvm-project#156841 8. llvm/llvm-project#156842

Implement the TypeHashPointerSplit mode: This mode assigns a token ID based on the hash of the allocated type's name, where the top half ID-space is reserved for types that contain pointers and the bottom half for types that do not contain pointers. This mode with max tokens of 2 (`-falloc-token-max=2`) may also be valuable for heap hardening strategies that simply separate pointer types from non-pointer types. Make it the new default mode. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 --- This change is part of the following series: 1. #160131 2. #156838 3. #162098 4. #162099 5. #156839 6. #156840 7. #156841 8. #156842

…156840) Implement the TypeHashPointerSplit mode: This mode assigns a token ID based on the hash of the allocated type's name, where the top half ID-space is reserved for types that contain pointers and the bottom half for types that do not contain pointers. This mode with max tokens of 2 (`-falloc-token-max=2`) may also be valuable for heap hardening strategies that simply separate pointer types from non-pointer types. Make it the new default mode. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 --- This change is part of the following series: 1. llvm/llvm-project#160131 2. llvm/llvm-project#156838 3. llvm/llvm-project#162098 4. llvm/llvm-project#162099 5. llvm/llvm-project#156839 6. llvm/llvm-project#156840 7. llvm/llvm-project#156841 8. llvm/llvm-project#156842

#156841) For the AllocToken pass to accurately calculate token ID hints, we need to attach `!alloc_token` metadata for allocation calls. Unlike new expressions, untyped allocation calls (like `malloc`, `calloc`, `::operator new(..)`, `__builtin_operator_new`, etc.) have no syntactic type associated with them. For -fsanitize=alloc-token, type hints are sufficient, and we can attempt to infer the type based on common idioms. When encountering allocation calls (with `__attribute__((malloc))` or `__attribute__((alloc_size(..))`), attach `!alloc_token` by inferring the allocated type from (a) sizeof argument expressions such as `malloc(sizeof(MyType))`, and (b) casts such as `(MyType*)malloc(4096)`. Note that non-standard allocation functions with these attributes are not instrumented by default. Use `-fsanitize-alloc-token-extended` to instrument them as well. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 --- This change is part of the following series: 1. #160131 2. #156838 3. #162098 4. #162099 5. #156839 6. #156840 7. #156841 8. #156842

…ns and casts (#156841) For the AllocToken pass to accurately calculate token ID hints, we need to attach `!alloc_token` metadata for allocation calls. Unlike new expressions, untyped allocation calls (like `malloc`, `calloc`, `::operator new(..)`, `__builtin_operator_new`, etc.) have no syntactic type associated with them. For -fsanitize=alloc-token, type hints are sufficient, and we can attempt to infer the type based on common idioms. When encountering allocation calls (with `__attribute__((malloc))` or `__attribute__((alloc_size(..))`), attach `!alloc_token` by inferring the allocated type from (a) sizeof argument expressions such as `malloc(sizeof(MyType))`, and (b) casts such as `(MyType*)malloc(4096)`. Note that non-standard allocation functions with these attributes are not instrumented by default. Use `-fsanitize-alloc-token-extended` to instrument them as well. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 --- This change is part of the following series: 1. llvm/llvm-project#160131 2. llvm/llvm-project#156838 3. llvm/llvm-project#162098 4. llvm/llvm-project#162099 5. llvm/llvm-project#156839 6. llvm/llvm-project#156840 7. llvm/llvm-project#156841 8. llvm/llvm-project#156842

) Reverts #162099 Reason: this commit depends on #162098, which I am reverting due to build breakage (see #162098 (comment)).

…2413) Reverts #162098 Reason: buildbot breakage (see #162098 (comment))

[ Reland after 7815df1 ("[Clang] Fix brittle print-header-json.c test") ] Introduce the "alloc-token" sanitizer kind, in preparation of wiring it up. Currently this is a no-op, and any attempt to enable it will result in failure: clang: error: unsupported option '-fsanitize=alloc-token' for target 'x86_64-unknown-linux-gnu' In this step we can already wire up the `sanitize_alloc_token` IR attribute where the instrumentation is enabled. Subsequent changes will complete wiring up the AllocToken pass. --- This change is part of the following series: 1. #160131 2. #156838 3. #162098 4. #162099 5. #156839 6. #156840 7. #156841 8. #156842

[ Reland after 7815df1 ("[Clang] Fix brittle print-header-json.c test") ] For new expressions, the allocated type is syntactically known and we can trivially emit the !alloc_token metadata. A subsequent change will wire up the AllocToken pass and introduce appropriate tests. --- This change is part of the following series: 1. #160131 2. #156838 3. #162098 4. #162099 5. #156839 6. #156840 7. #156841 8. #156842

Wire up the `-fsanitize=alloc-token` command-line option, hooking up the `AllocToken` pass -- it provides allocation tokens to compatible runtime allocators, enabling different heap organization strategies, e.g. hardening schemes based on heap partitioning. The instrumentation rewrites standard allocation calls into variants that accept an additional `size_t token_id` argument. For example, calls to `malloc(size)` become `__alloc_token_malloc(size, token_id)`, and a C++ `new MyType` expression will call `__alloc_token__Znwm(size, token_id)`. Currently untyped allocation calls do not yet have `!alloc_token` metadata, and therefore receive the fallback token only. This will be fixed in subsequent changes through best-effort type-inference. One benefit of the instrumentation approach is that it can be applied transparently to large codebases, and scales in deployment as other sanitizers. Similarly to other sanitizers, instrumentation can selectively be controlled using `__attribute__((no_sanitize("alloc-token")))`. Support for sanitizer ignorelists to disable instrumentation for specific functions or source files is implemented. See clang/docs/AllocToken.rst for more usage instructions. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 --- This change is part of the following series: 1. #160131 2. #156838 3. #162098 4. #162099 5. #156839 6. #156840 7. #156841 8. #156842

Implement the TypeHashPointerSplit mode: This mode assigns a token ID based on the hash of the allocated type's name, where the top half ID-space is reserved for types that contain pointers and the bottom half for types that do not contain pointers. This mode with max tokens of 2 (`-falloc-token-max=2`) may also be valuable for heap hardening strategies that simply separate pointer types from non-pointer types. Make it the new default mode. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 --- This change is part of the following series: 1. #160131 2. #156838 3. #162098 4. #162099 5. #156839 6. #156840 7. #156841 8. #156842

#156841) For the AllocToken pass to accurately calculate token ID hints, we need to attach `!alloc_token` metadata for allocation calls. Unlike new expressions, untyped allocation calls (like `malloc`, `calloc`, `::operator new(..)`, `__builtin_operator_new`, etc.) have no syntactic type associated with them. For -fsanitize=alloc-token, type hints are sufficient, and we can attempt to infer the type based on common idioms. When encountering allocation calls (with `__attribute__((malloc))` or `__attribute__((alloc_size(..))`), attach `!alloc_token` by inferring the allocated type from (a) sizeof argument expressions such as `malloc(sizeof(MyType))`, and (b) casts such as `(MyType*)malloc(4096)`. Note that non-standard allocation functions with these attributes are not instrumented by default. Use `-fsanitize-alloc-token-extended` to instrument them as well. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 --- This change is part of the following series: 1. #160131 2. #156838 3. #162098 4. #162099 5. #156839 6. #156840 7. #156841 8. #156842

llvm#156841) For the AllocToken pass to accurately calculate token ID hints, we need to attach `!alloc_token` metadata for allocation calls. Unlike new expressions, untyped allocation calls (like `malloc`, `calloc`, `::operator new(..)`, `__builtin_operator_new`, etc.) have no syntactic type associated with them. For -fsanitize=alloc-token, type hints are sufficient, and we can attempt to infer the type based on common idioms. When encountering allocation calls (with `__attribute__((malloc))` or `__attribute__((alloc_size(..))`), attach `!alloc_token` by inferring the allocated type from (a) sizeof argument expressions such as `malloc(sizeof(MyType))`, and (b) casts such as `(MyType*)malloc(4096)`. Note that non-standard allocation functions with these attributes are not instrumented by default. Use `-fsanitize-alloc-token-extended` to instrument them as well. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 --- This change is part of the following series: 1. llvm#160131 2. llvm#156838 3. llvm#162098 4. llvm#162099 5. llvm#156839 6. llvm#156840 7. llvm#156841 8. llvm#156842

[ Reland after 7815df1 ("[Clang] Fix brittle print-header-json.c test") ] Introduce the "alloc-token" sanitizer kind, in preparation of wiring it up. Currently this is a no-op, and any attempt to enable it will result in failure: clang: error: unsupported option '-fsanitize=alloc-token' for target 'x86_64-unknown-linux-gnu' In this step we can already wire up the `sanitize_alloc_token` IR attribute where the instrumentation is enabled. Subsequent changes will complete wiring up the AllocToken pass. --- This change is part of the following series: 1. llvm#160131 2. llvm#156838 3. llvm#162098 4. llvm#162099 5. llvm#156839 6. llvm#156840 7. llvm#156841 8. llvm#156842

[ Reland after 7815df1 ("[Clang] Fix brittle print-header-json.c test") ] For new expressions, the allocated type is syntactically known and we can trivially emit the !alloc_token metadata. A subsequent change will wire up the AllocToken pass and introduce appropriate tests. --- This change is part of the following series: 1. llvm#160131 2. llvm#156838 3. llvm#162098 4. llvm#162099 5. llvm#156839 6. llvm#156840 7. llvm#156841 8. llvm#156842

Wire up the `-fsanitize=alloc-token` command-line option, hooking up the `AllocToken` pass -- it provides allocation tokens to compatible runtime allocators, enabling different heap organization strategies, e.g. hardening schemes based on heap partitioning. The instrumentation rewrites standard allocation calls into variants that accept an additional `size_t token_id` argument. For example, calls to `malloc(size)` become `__alloc_token_malloc(size, token_id)`, and a C++ `new MyType` expression will call `__alloc_token__Znwm(size, token_id)`. Currently untyped allocation calls do not yet have `!alloc_token` metadata, and therefore receive the fallback token only. This will be fixed in subsequent changes through best-effort type-inference. One benefit of the instrumentation approach is that it can be applied transparently to large codebases, and scales in deployment as other sanitizers. Similarly to other sanitizers, instrumentation can selectively be controlled using `__attribute__((no_sanitize("alloc-token")))`. Support for sanitizer ignorelists to disable instrumentation for specific functions or source files is implemented. See clang/docs/AllocToken.rst for more usage instructions. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 --- This change is part of the following series: 1. llvm#160131 2. llvm#156838 3. llvm#162098 4. llvm#162099 5. llvm#156839 6. llvm#156840 7. llvm#156841 8. llvm#156842

Implement the TypeHashPointerSplit mode: This mode assigns a token ID based on the hash of the allocated type's name, where the top half ID-space is reserved for types that contain pointers and the bottom half for types that do not contain pointers. This mode with max tokens of 2 (`-falloc-token-max=2`) may also be valuable for heap hardening strategies that simply separate pointer types from non-pointer types. Make it the new default mode. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 --- This change is part of the following series: 1. llvm#160131 2. llvm#156838 3. llvm#162098 4. llvm#162099 5. llvm#156839 6. llvm#156840 7. llvm#156841 8. llvm#156842

llvm#156841) For the AllocToken pass to accurately calculate token ID hints, we need to attach `!alloc_token` metadata for allocation calls. Unlike new expressions, untyped allocation calls (like `malloc`, `calloc`, `::operator new(..)`, `__builtin_operator_new`, etc.) have no syntactic type associated with them. For -fsanitize=alloc-token, type hints are sufficient, and we can attempt to infer the type based on common idioms. When encountering allocation calls (with `__attribute__((malloc))` or `__attribute__((alloc_size(..))`), attach `!alloc_token` by inferring the allocated type from (a) sizeof argument expressions such as `malloc(sizeof(MyType))`, and (b) casts such as `(MyType*)malloc(4096)`. Note that non-standard allocation functions with these attributes are not instrumented by default. Use `-fsanitize-alloc-token-extended` to instrument them as well. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 --- This change is part of the following series: 1. llvm#160131 2. llvm#156838 3. llvm#162098 4. llvm#162099 5. llvm#156839 6. llvm#156840 7. llvm#156841 8. llvm#156842

… metadata (#160131) In preparation of adding the "AllocToken" pass, add the pre-requisite `sanitize_alloc_token` function attribute and `alloc_token` metadata. --- This change is part of the following series: 1. llvm/llvm-project#160131 2. llvm/llvm-project#156838 3. llvm/llvm-project#162098 4. llvm/llvm-project#162099 5. llvm/llvm-project#156839 6. llvm/llvm-project#156840 7. llvm/llvm-project#156841 8. llvm/llvm-project#156842

melver added 2 commits October 6, 2025 16:55

[𝘀𝗽𝗿] changes to main this commit is based on

a77cb18

Created using spr 1.3.8-beta.1 [skip ci]

[𝘀𝗽𝗿] initial version

2fe2c42

Created using spr 1.3.8-beta.1

melver marked this pull request as ready for review October 6, 2025 15:05

llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Oct 6, 2025

melver requested review from fmayer, vitalybuka and zmodem October 6, 2025 15:05

fmayer approved these changes Oct 6, 2025

View reviewed changes

melver added 4 commits October 7, 2025 11:53

[𝘀𝗽𝗿] changes introduced through rebase

d329d5e

Created using spr 1.3.8-beta.1 [skip ci]

rebase

04a0b5c

Created using spr 1.3.8-beta.1

[𝘀𝗽𝗿] changes introduced through rebase

f6bd3a6

Created using spr 1.3.8-beta.1 [skip ci]

rebase

329bc37

Created using spr 1.3.8-beta.1

melver added 2 commits October 7, 2025 12:56

[𝘀𝗽𝗿] changes introduced through rebase

0256929

Created using spr 1.3.8-beta.1 [skip ci]

rebase

12cef77

Created using spr 1.3.8-beta.1

melver changed the base branch from users/melver/spr/main.clangcodegen-introduce-the-alloctoken-sanitizerkind to main October 7, 2025 11:30

thurstond added a commit that referenced this pull request Oct 8, 2025

Revert "[Clang][CodeGen] Introduce the AllocToken SanitizerKind" (#16…

c74fa20

…2413) Reverts #162098 Reason: buildbot breakage (see #162098 (comment))

svkeerthy pushed a commit that referenced this pull request Oct 9, 2025

Revert "[Clang][CodeGen] Emit !alloc_token for new expressions" (#162412

7fe311b

) Reverts #162099 Reason: this commit depends on #162098, which I am reverting due to build breakage (see #162098 (comment)).

svkeerthy pushed a commit that referenced this pull request Oct 9, 2025

Revert "[Clang][CodeGen] Introduce the AllocToken SanitizerKind" (#16…

adc0976

…2413) Reverts #162098 Reason: buildbot breakage (see #162098 (comment))

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Clang][CodeGen] Introduce the AllocToken SanitizerKind #162098

[Clang][CodeGen] Introduce the AllocToken SanitizerKind #162098

Uh oh!

melver commented Oct 6, 2025 •

edited

Loading

Uh oh!

llvmbot commented Oct 6, 2025 •

edited

Loading

Uh oh!

thurstond commented Oct 8, 2025

Uh oh!

melver commented Oct 8, 2025

Uh oh!

thurstond commented Oct 8, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

[Clang][CodeGen] Introduce the AllocToken SanitizerKind #162098

[Clang][CodeGen] Introduce the AllocToken SanitizerKind #162098

Uh oh!

Conversation

melver commented Oct 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Oct 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

thurstond commented Oct 8, 2025

Uh oh!

melver commented Oct 8, 2025

Uh oh!

thurstond commented Oct 8, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

melver commented Oct 6, 2025 •

edited

Loading

llvmbot commented Oct 6, 2025 •

edited

Loading