[Clang][CodeGen] Introduce the AllocToken SanitizerKind #162098

melver · 2025-10-06T14:55:52Z

Introduce the "alloc-token" sanitizer kind, in preparation for wiring it
up. Currently this is a no-op, and any attempt to enable it fails with
the following error:

clang: error: unsupported option '-fsanitize=alloc-token' for target 'x86_64-unknown-linux-gnu'

In this step we can already wire up the sanitize_alloc_token IR
attribute where the instrumentation is enabled. Subsequent changes will
complete wiring up the AllocToken pass.

This change is part of the following series:

1. #160131
2. #156838
3. #162098
4. #162099
5. #156839
6. #156840
7. #156841
8. #156842

Created using spr 1.3.8-beta.1 [skip ci]

Created using spr 1.3.8-beta.1

llvmbot · 2025-10-06T15:05:38Z

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Marco Elver (melver)

Full diff: https://github.com/llvm/llvm-project/pull/162098.diff

2 Files Affected:

(modified) clang/include/clang/Basic/Sanitizers.def (+3)
(modified) clang/lib/CodeGen/CodeGenFunction.cpp (+2)

diff --git a/clang/include/clang/Basic/Sanitizers.def b/clang/include/clang/Basic/Sanitizers.def
index 1d0e97cc7fb4c..da85431625026 100644
--- a/clang/include/clang/Basic/Sanitizers.def
+++ b/clang/include/clang/Basic/Sanitizers.def
@@ -195,6 +195,9 @@ SANITIZER_GROUP("bounds", Bounds, ArrayBounds | LocalBounds)
 // Scudo hardened allocator
 SANITIZER("scudo", Scudo)
 
+// AllocToken
+SANITIZER("alloc-token", AllocToken)
+
 // Magic group, containing all sanitizers. For example, "-fno-sanitize=all"
 // can be used to disable all the sanitizers.
 SANITIZER_GROUP("all", All, ~SanitizerMask())
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index b2fe9171372d8..acf8de4dee147 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -846,6 +846,8 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
       Fn->addFnAttr(llvm::Attribute::SanitizeNumericalStability);
     if (SanOpts.hasOneOf(SanitizerKind::Memory | SanitizerKind::KernelMemory))
       Fn->addFnAttr(llvm::Attribute::SanitizeMemory);
+    if (SanOpts.has(SanitizerKind::AllocToken))
+      Fn->addFnAttr(llvm::Attribute::SanitizeAllocToken);
   }
   if (SanOpts.has(SanitizerKind::SafeStack))
     Fn->addFnAttr(llvm::Attribute::SafeStack);
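
The one-line addition to `Sanitizers.def` is enough to create a new sanitizer kind because the file is an X-macro list: each includer defines `SANITIZER` to suit its needs and then expands the list. The sketch below is a simplified, self-contained model of that pattern and not Clang's actual code. `TOY_SANITIZERS`, `ToySanitizerOrdinal`, `toySanitizerName`, and `toyOrdinalFromIndex` are hypothetical names, and the two-entry list is illustrative only.

```cpp
#include <cassert>
#include <string_view>

// Stand-in for a tiny Sanitizers.def: each entry is (flag spelling, identifier).
#define TOY_SANITIZERS(ENTRY)                                                  \
  ENTRY("scudo", Scudo)                                                        \
  ENTRY("alloc-token", AllocToken)

// Expansion 1: one enumerator per entry, numbered in list order.
#define TOY_AS_ENUMERATOR(NAME, ID) ID,
enum class ToySanitizerOrdinal : unsigned {
  TOY_SANITIZERS(TOY_AS_ENUMERATOR) Count
};
#undef TOY_AS_ENUMERATOR

// Expansion 2: map each enumerator back to its command-line spelling.
#define TOY_AS_CASE(NAME, ID)                                                  \
  case ToySanitizerOrdinal::ID:                                                \
    return NAME;
constexpr std::string_view toySanitizerName(ToySanitizerOrdinal O) {
  switch (O) {
    TOY_SANITIZERS(TOY_AS_CASE)
  case ToySanitizerOrdinal::Count:
    break;
  }
  return "";
}
#undef TOY_AS_CASE

// Convert a raw index into an ordinal, rejecting out-of-range values.
inline ToySanitizerOrdinal toyOrdinalFromIndex(unsigned Index) {
  assert(Index < static_cast<unsigned>(ToySanitizerOrdinal::Count));
  return static_cast<ToySanitizerOrdinal>(Index);
}
```

In this toy list, adding an entry changes `Count` and may shift any enumerator that follows it, so every consumer that depends on the list's size or order is affected by a one-line change.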

… metadata (#160131)

In preparation of adding the "AllocToken" pass, add the pre-requisite `sanitize_alloc_token` function attribute and `alloc_token` metadata.

---

This change is part of the following series:

1. #160131
2. #156838
3. #162098
4. #162099
5. #156839
6. #156840
7. #156841
8. #156842

Introduce `AllocToken`, an instrumentation pass designed to provide tokens to memory allocators enabling various heap organization strategies, such as heap partitioning.

Initially, the pass instruments functions marked with a new attribute `sanitize_alloc_token` by rewriting allocation calls to include a token ID, appended as a function argument with the default ABI.

The design aims to provide a flexible framework for implementing different token generation schemes. It currently supports the following token modes:

- TypeHash (default): token IDs based on a hash of the allocated type
- Random: statically-assigned pseudo-random token IDs
- Increment: incrementing token IDs per TU

For the `TypeHash` mode introduce support for `!alloc_token` metadata: the metadata can be attached to allocation calls to provide richer semantic information to be consumed by the AllocToken pass. Optimization remarks can be enabled to show where no metadata was available.

An alternative "fast ABI" is provided, where instead of passing the token ID as an argument (e.g., `__alloc_token_malloc(size, id)`), the token ID is directly encoded into the name of the called function (e.g., `__alloc_token_0_malloc(size)`). Where the maximum number of tokens is small, this offers more efficient instrumentation by avoiding the overhead of passing an additional argument at each allocation site.

Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 [1]

---

This change is part of the following series:

1. #160131
2. #156838
3. #162098
4. #162099
5. #156839
6. #156840
7. #156841
8. #156842
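
The difference between the default ABI and the "fast ABI" described in this commit message comes down to where the token ID lives: in an extra argument or in the callee's name. The helper below is a minimal sketch of that naming rule, inferred only from the two examples quoted above (`__alloc_token_malloc(size, id)` and `__alloc_token_0_malloc(size)`). `instrumentedCalleeName` is a hypothetical function and is not part of the pass.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Build the name of the instrumented allocation entry point.
// Default ABI: the token travels as an extra argument, so the name is fixed.
// Fast ABI: the token is baked into the name, so no extra argument is needed.
// Assumption: the naming scheme is extrapolated from the examples in the text.
inline std::string instrumentedCalleeName(const std::string &Callee,
                                          std::uint64_t TokenId, bool FastABI) {
  assert(!Callee.empty() && "callee name must not be empty");
  std::string Name = "__alloc_token_";
  if (FastABI)
    Name += std::to_string(TokenId) + "_";
  return Name + Callee;
}
```

For `malloc` with token 0, the sketch yields `__alloc_token_malloc` under the default ABI and `__alloc_token_0_malloc` under the fast ABI. The fast form needs one entry point per token, which is why it only suits a small maximum token count.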

For new expressions, the allocated type is syntactically known and we can trivially emit the !alloc_token metadata.

A subsequent change will wire up the AllocToken pass and introduce appropriate tests.

---

This change is part of the following series:

1. #160131
2. #156838
3. #162098
4. #162099
5. #156839
6. #156840
7. #156841
8. #156842

thurstond · 2025-10-07T23:57:30Z

This change is causing a buildbot clang crash: https://lab.llvm.org/buildbot/#/builders/169/builds/15726

(I manually re-ran the buildbot at this change - 0cee4db - which crashed; it did not crash on the immediately preceding commit, 93f2e0a)

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/clang -fsyntax-only -I /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/test/Preprocessor/Inputs/print-header-json -isystem /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/test/Preprocessor/Inputs/print-header-json/system -fmodules -fimplicit-module-maps -fmodules-cache-path=/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/tools/clang/test/Preprocessor/Output/print-header-json.c.tmp /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/test/Preprocessor/print-header-json.c -o /dev/null
1.	/home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/test/Preprocessor/Inputs/print-header-json/system/system0.h:2:2: current parser token 'include'
 #0 0x000064d4020a8a76 ___interceptor_backtrace /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:4530:13
 #1 0x000064d4097963f8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:834:13
 #2 0x000064d40978fec9 llvm::sys::RunSignalHandlers() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/Signals.cpp:0:5
 #3 0x000064d409794494 llvm::sys::CleanupOnSignal(unsigned long) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
 #4 0x000064d4095de721 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:73:5
 #5 0x000064d4095dee07 CrashRecoverySignalHandler(int) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:391:1
 #6 0x00007c3738e458d0 (/lib/x86_64-linux-gnu/libc.so.6+0x458d0)
 #7 0x00007c3738ea49bc pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0xa49bc)
 #8 0x00007c3738e4579e raise (/lib/x86_64-linux-gnu/libc.so.6+0x4579e)
 #9 0x00007c3738e288cd abort (/lib/x86_64-linux-gnu/libc.so.6+0x288cd)
#10 0x000064d40212b75c (/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/clang+0x11fcc75c)
#11 0x000064d4021295fe __sanitizer::Die() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:52:5
#12 0x000064d40210a25b push_back /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common.h:543:7
#13 0x000064d40210a25b __asan::ScopedInErrorReport::~ScopedInErrorReport() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/asan/asan_report.cpp:193:29
#14 0x000064d40210c0ed __asan::ReportGenericError(unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long, unsigned int, bool) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/asan/asan_report.cpp:536:1
#15 0x000064d40210cfb6 __asan_report_load16 /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/asan/asan_rtl.cpp:132:1
#16 0x000064d40bcb6d1f __copy_non_overlapping_range<const unsigned long *, const unsigned long *> /home/b/sanitizer-x86_64-linux-fast/build/libcxx_install_asan_ubsan/include/c++/v1/string:2144:38
#17 0x000064d40bcb6d1f void std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::__init_with_size[abi:nn220000]<unsigned long const*, unsigned long const*>(unsigned long const*, unsigned long const*, unsigned long) /home/b/sanitizer-x86_64-linux-fast/build/libcxx_install_asan_ubsan/include/c++/v1/string:2685:18
#18 0x000064d40bb77198 clang::ASTReader::ReadString(llvm::SmallVectorImpl<unsigned long> const&, unsigned int&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:10172:7
#19 0x000064d40bb9229b clang::ASTReader::ParseLanguageOptions(llvm::SmallVector<unsigned long, 64u> const&, llvm::StringRef, bool, clang::ASTReaderListener&, bool) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:6475:12
#20 0x000064d40bb83454 clang::ASTReader::ReadOptionsBlock(llvm::BitstreamCursor&, llvm::StringRef, unsigned int, bool, clang::ASTReaderListener&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:0:11
#21 0x000064d40bb994b9 clang::ASTReader::ReadControlBlock(clang::serialization::ModuleFile&, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, clang::serialization::ModuleFile const*, unsigned int) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:3249:15
#22 0x000064d40bb9e1d3 clang::ASTReader::ReadASTCore(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, clang::serialization::ModuleFile*, llvm::SmallVectorImpl<clang::ASTReader::ImportedModule>&, long, long, clang::ASTFileSignature, unsigned int) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:5182:7
#23 0x000064d40bbb3678 clang::ASTReader::ReadAST(llvm::StringRef, clang::serialization::ModuleKind, clang::SourceLocation, unsigned int, clang::serialization::ModuleFile**) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Serialization/ASTReader.cpp:4828:11
#24 0x000064d40b69c575 clang::CompilerInstance::findOrCompileModuleAndReadAST(llvm::StringRef, clang::SourceLocation, clang::SourceLocation, bool) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1805:27
#25 0x000064d40b69fcf0 clang::CompilerInstance::loadModule(clang::SourceLocation, llvm::ArrayRef<clang::IdentifierLoc>, clang::Module::NameVisibilityKind, bool) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1956:31
#26 0x000064d4129e38fd clang::Preprocessor::HandleHeaderIncludeOrImport(clang::SourceLocation, clang::Token&, clang::Token&, clang::SourceLocation, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Lex/PPDirectives.cpp:2426:5
#27 0x000064d4129d7003 clang::Preprocessor::HandleIncludeDirective(clang::SourceLocation, clang::Token&, clang::detail::SearchDirIteratorImpl<true>, clang::FileEntry const*) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Lex/PPDirectives.cpp:2101:17
#28 0x000064d4129d8147 clang::Preprocessor::HandleDirective(clang::Token&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Lex/PPDirectives.cpp:0:14
#29 0x000064d41293d29d clang::Lexer::LexTokenInternal(clang::Token&, bool) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Lex/Lexer.cpp:4514:7
#30 0x000064d412933fec clang::Lexer::Lex(clang::Token&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Lex/Lexer.cpp:3731:3
#31 0x000064d412a69ddb clang::Preprocessor::Lex(clang::Token&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Lex/Preprocessor.cpp:896:3
#32 0x000064d40f16f731 clang::ParseAST(clang::Sema&, bool, bool) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseAST.cpp:164:5
#33 0x000064d40b7c8e73 clang::FrontendAction::Execute() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1315:10
#34 0x000064d40b6913be getPtr /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/Support/Error.h:278:42
#35 0x000064d40b6913be operator bool /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/Support/Error.h:241:16
#36 0x000064d40b6913be clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1008:23
#37 0x000064d40bacf613 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:310:25
#38 0x000064d4021572d5 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/cc1_main.cpp:300:15
#39 0x000064d40214c620 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/driver.cpp:227:12
#40 0x000064d402154760 release /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:232:9
#41 0x000064d402154760 ~IntrusiveRefCntPtr /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:196:27
#42 0x000064d402154760 operator() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/driver.cpp:369:5
#43 0x000064d402154760 int llvm::function_ref<int (llvm::SmallVectorImpl<char const*>&)>::callback_fn<clang_main(int, char**, llvm::ToolContext const&)::$_0>(long, llvm::SmallVectorImpl<char const*>&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:46:12
#44 0x000064d40b3b7df5 operator() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Driver/Job.cpp:436:30
#45 0x000064d40b3b7df5 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::__1::optional<llvm::StringRef>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, bool*) const::$_0>(long) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:46:12
#46 0x000064d4095de4d6 operator() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:0:12
#47 0x000064d4095de4d6 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:426:3
#48 0x000064d40b3b4dfd clang::driver::CC1Command::Execute(llvm::ArrayRef<std::__1::optional<llvm::StringRef>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, bool*) const /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Driver/Job.cpp:436:7
#49 0x000064d40b30eaab clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Driver/Compilation.cpp:196:15
#50 0x000064d40b30f0f7 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::__1::pair<int, clang::driver::Command const*>>&, bool) const /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Driver/Compilation.cpp:246:13
#51 0x000064d40b34613f empty /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:82:46
#52 0x000064d40b34613f clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::__1::pair<int, clang::driver::Command const*>>&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Driver/Driver.cpp:2244:23
#53 0x000064d40214b141 clang_main(int, char**, llvm::ToolContext const&) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/driver.cpp:407:21
#54 0x000064d402177126 main /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/tools/clang/tools/driver/clang-driver.cpp:17:10
#55 0x00007c3738e2a578 (/lib/x86_64-linux-gnu/libc.so.6+0x2a578)
#56 0x00007c3738e2a63b __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a63b)
#57 0x000064d40205f0e5 _start (/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/clang+0x11f000e5)
clang: error: clang frontend command failed with exit code 134 (use -v to see invocation)
clang version 22.0.0git
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin
Build config: +assertions, +asan, +ubsan
clang: note: diagnostic msg: 
********************

)" This reverts commit 0cee4db.

…ions" (#162412) Reverts llvm/llvm-project#162099 Reason: this commit depends on #162098, which I am reverting due to build breakage (see llvm/llvm-project#162098 (comment)).

thurstond · 2025-10-08T02:58:25Z

Due to the time zone difference, I've gone ahead and reverted this patch and its dependent patch (#162099)

…rKind" (#162413) Reverts llvm/llvm-project#162098 Reason: buildbot breakage (see llvm/llvm-project#162098 (comment))

melver · 2025-10-08T13:29:46Z

> Due to the time zone difference, I've gone ahead and reverted this patch and its dependent patch (#162099)

Sigh, this is a brittle test, or rather an unfortunate side-effect of incrementally building & testing on a CI where the test outputs are not cleared. I was able to reproduce this when I checked out 93f2e0a, then checked out this change, and retested with the zorg scripts. Then, if I run:

rm -rf zorg-test/llvm_build_asan_ubsan/tools/clang/test/Preprocessor/Output/print-header-json.c.tmp/

And rerun the tests, the tests pass:

../zorg-test/llvm_build_asan_ubsan/bin/llvm-lit -v clang/test/Preprocessor/print-header-json.c
-- Testing: 1 tests, 1 workers --
PASS: Clang :: Preprocessor/print-header-json.c (1 of 1)

Testing Time: 3.57s

Total Discovered Tests: 1
  Passed: 1 (100.00%)

We could try to fix the test to clear the cache dir or fix the test scripts. I suspect fixing the test is the better option, because everyone who does incremental build + test will have this problem.

Summary of the problem: after a patch that changes the binary format of PCMs (such as one adding a new sanitizer kind, because PCMs track codegen options), a stale cached PCM is no longer binary-compatible with the new compiler. Here, adding a new sanitizer option altered the implicit binary layout of the serialized LangOptions. The build & test system is oblivious to this. When the new compiler attempted to read the old module file, it misinterpreted the data due to the layout mismatch, resulting in a heap-buffer-overflow.

TL;DR: Clang's PCM binary format doesn't encode a version, and attempting to load version-incompatible PCMs from previous test invocations after an implicit change results in a heap buffer overflow and assorted failures.
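
The layout mismatch can be illustrated with a toy record format. This is a sketch under stated assumptions and not the ASTReader/ASTWriter code: the record (a run of boolean flags followed by a length-prefixed string) is a simplified stand-in for serialized language options, and `Record`, `writeOptions`, and `readName` are hypothetical names. A reader that expects one more flag than the writer emitted lands on the wrong field and treats a character as a length; the bounds check here turns that into a detectable failure instead of an over-read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

using Record = std::vector<std::uint64_t>;

// Writer: one field per flag, then a length-prefixed string.
inline Record writeOptions(const std::vector<bool> &Flags,
                           const std::string &Name) {
  Record R;
  for (bool F : Flags)
    R.push_back(F ? 1 : 0);
  R.push_back(Name.size());
  for (unsigned char C : Name)
    R.push_back(C);
  return R;
}

// Reader: skips NumFlags fields, then reads the string. It validates the
// length against what is left in the record before touching the payload.
inline std::optional<std::string> readName(const Record &R,
                                           std::size_t NumFlags) {
  if (R.size() <= NumFlags)
    return std::nullopt;
  std::size_t Idx = NumFlags;
  const std::uint64_t Len = R[Idx++];
  if (Len > R.size() - Idx)
    return std::nullopt; // layout mismatch: the "length" is not a length
  assert(Idx + Len <= R.size());
  std::string Out;
  for (std::uint64_t I = 0; I < Len; ++I)
    Out.push_back(static_cast<char>(R[Idx + I]));
  return Out;
}
```

Writing with two flags and reading with two flags round-trips the string. Reading the same record with three flags (as if a new option had been added to the reader only) makes the reader interpret the first character as the length, which the check rejects.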

[ Reland after 7815df1 ("[Clang] Fix brittle print-header-json.c test") ]

Introduce the "alloc-token" sanitizer kind, in preparation of wiring it up. Currently this is a no-op, and any attempt to enable it will result in failure:

clang: error: unsupported option '-fsanitize=alloc-token' for target 'x86_64-unknown-linux-gnu'

In this step we can already wire up the `sanitize_alloc_token` IR attribute where the instrumentation is enabled. Subsequent changes will complete wiring up the AllocToken pass.

---

This change is part of the following series:

1. #160131
2. #156838
3. #162098
4. #162099
5. #156839
6. #156840
7. #156841
8. #156842

[ Reland after 7815df1 ("[Clang] Fix brittle print-header-json.c test") ]

For new expressions, the allocated type is syntactically known and we can trivially emit the !alloc_token metadata.

A subsequent change will wire up the AllocToken pass and introduce appropriate tests.

---

This change is part of the following series:

1. #160131
2. #156838
3. #162098
4. #162099
5. #156839
6. #156840
7. #156841
8. #156842

thurstond · 2025-10-08T16:30:03Z

I see, thank you @melver!

Wire up the `-fsanitize=alloc-token` command-line option, hooking up the `AllocToken` pass -- it provides allocation tokens to compatible runtime allocators, enabling different heap organization strategies, e.g. hardening schemes based on heap partitioning.

The instrumentation rewrites standard allocation calls into variants that accept an additional `size_t token_id` argument. For example, calls to `malloc(size)` become `__alloc_token_malloc(size, token_id)`, and a C++ `new MyType` expression will call `__alloc_token__Znwm(size, token_id)`.

Currently untyped allocation calls do not yet have `!alloc_token` metadata, and therefore receive the fallback token only. This will be fixed in subsequent changes through best-effort type-inference.

One benefit of the instrumentation approach is that it can be applied transparently to large codebases, and scales in deployment like other sanitizers.

Similarly to other sanitizers, instrumentation can selectively be controlled using `__attribute__((no_sanitize("alloc-token")))`. Support for sanitizer ignorelists to disable instrumentation for specific functions or source files is implemented.

See clang/docs/AllocToken.rst for more usage instructions.

Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434

---

This change is part of the following series:

1. #160131
2. #156838
3. #162098
4. #162099
5. #156839
6. #156840
7. #156841
8. #156842
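
To make the `malloc(size)` to `__alloc_token_malloc(size, token_id)` rewrite more concrete, here is a rough sketch of what the receiving side of such a call might look like. Everything in it is an assumption for illustration: `toy_alloc_token_malloc`, `toyPartitionFor`, `toyPartitionCounts`, and the partition count of 4 are hypothetical, and a real allocator would place each partition in its own heap region instead of merely counting.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical partition count; a real allocator would choose its own.
constexpr std::size_t kToyPartitions = 4;

// Map a token ID onto one of the available partitions.
inline std::size_t toyPartitionFor(std::size_t TokenId,
                                   std::size_t NumPartitions) {
  assert(NumPartitions > 0 && "need at least one partition");
  return TokenId % NumPartitions;
}

// Per-partition allocation counters, standing in for separate heap regions.
inline std::array<std::size_t, kToyPartitions> &toyPartitionCounts() {
  static std::array<std::size_t, kToyPartitions> Counts{};
  return Counts;
}

// Same shape as the instrumented call: the original size argument plus the
// token ID appended as a trailing argument.
inline void *toy_alloc_token_malloc(std::size_t Size, std::size_t TokenId) {
  ++toyPartitionCounts()[toyPartitionFor(TokenId, kToyPartitions)];
  return std::malloc(Size); // a real allocator would allocate per partition
}
```

Memory returned by the toy entry point is released with `std::free`, since the sketch forwards to `std::malloc` underneath.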

Implement the TypeHashPointerSplit mode: This mode assigns a token ID based on the hash of the allocated type's name, where the top half ID-space is reserved for types that contain pointers and the bottom half for types that do not contain pointers.

This mode with max tokens of 2 (`-falloc-token-max=2`) may also be valuable for heap hardening strategies that simply separate pointer types from non-pointer types.

Make it the new default mode.

Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434

---

This change is part of the following series:

1. #160131
2. #156838
3. #162098
4. #162099
5. #156839
6. #156840
7. #156841
8. #156842
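
The split described for TypeHashPointerSplit can be sketched as a small function. This is a minimal illustration under assumptions, not the pass's implementation: FNV-1a is used only as a stand-in hash (the hash the pass uses is not stated here), and `toyHash` and `toyPointerSplitToken` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// FNV-1a, used here only as a stand-in for whatever hash the pass applies.
constexpr std::uint64_t toyHash(std::string_view S) {
  std::uint64_t H = 14695981039346656037ull;
  for (char C : S) {
    H ^= static_cast<unsigned char>(C);
    H *= 1099511628211ull;
  }
  return H;
}

// Bottom half of [0, MaxTokens) for pointer-free types, top half for types
// that contain pointers; the type-name hash picks the slot within a half.
inline std::uint64_t toyPointerSplitToken(std::string_view TypeName,
                                          bool ContainsPointer,
                                          std::uint64_t MaxTokens) {
  assert(MaxTokens >= 2 && "need at least one slot per half");
  const std::uint64_t Half = MaxTokens / 2;
  const std::uint64_t Slot = toyHash(TypeName) % Half;
  return ContainsPointer ? Half + Slot : Slot;
}
```

With a maximum of 2 tokens each half has a single slot, so the type name no longer matters: pointer-free types get token 0 and pointer-containing types get token 1, which is the pure pointer/non-pointer separation mentioned above.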

#156841)

For the AllocToken pass to accurately calculate token ID hints, we need to attach `!alloc_token` metadata for allocation calls. Unlike new expressions, untyped allocation calls (like `malloc`, `calloc`, `::operator new(..)`, `__builtin_operator_new`, etc.) have no syntactic type associated with them.

For -fsanitize=alloc-token, type hints are sufficient, and we can attempt to infer the type based on common idioms.

When encountering allocation calls (with `__attribute__((malloc))` or `__attribute__((alloc_size(..))`), attach `!alloc_token` by inferring the allocated type from (a) sizeof argument expressions such as `malloc(sizeof(MyType))`, and (b) casts such as `(MyType*)malloc(4096)`.

Note that non-standard allocation functions with these attributes are not instrumented by default. Use `-fsanitize-alloc-token-extended` to instrument them as well.

Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434

---

This change is part of the following series:

1. #160131
2. #156838
3. #162098
4. #162099
5. #156839
6. #156840
7. #156841
8. #156842
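
The two idioms named in this commit message, (a) a `sizeof` argument and (b) a cast on the result, look like this in source form. The snippet is only an illustration of the call-site shapes the text says type inference looks at; `MyType`'s fields and the helpers `allocViaSizeof`, `allocViaCast`, and `initNode` are hypothetical.

```cpp
#include <cassert>
#include <cstdlib>

struct MyType {
  int Id;
  MyType *Next;
};

// Idiom (a): the sizeof operand names the allocated type. In C++ the result
// also needs a cast, so both hints are present and agree.
inline MyType *allocViaSizeof() {
  return static_cast<MyType *>(std::malloc(sizeof(MyType)));
}

// Idiom (b): the size is an opaque constant, but the cast names the type.
inline MyType *allocViaCast() {
  return (MyType *)std::malloc(4096);
}

// Initialize a freshly allocated node; malloc does not run constructors.
inline void initNode(MyType *N, int Id) {
  assert(N != nullptr && "allocation failed");
  N->Id = Id;
  N->Next = nullptr;
}
```

A call such as `std::malloc(n)` stored in a `void *` offers neither hint, which matches the earlier note that untyped calls without metadata receive only the fallback token.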

melver added 2 commits October 6, 2025 16:55

[𝘀𝗽𝗿] changes to main this commit is based on

a77cb18

Created using spr 1.3.8-beta.1 [skip ci]

[𝘀𝗽𝗿] initial version

2fe2c42

Created using spr 1.3.8-beta.1

melver marked this pull request as ready for review October 6, 2025 15:05

llvmbot added the clang, clang:frontend, and clang:codegen labels Oct 6, 2025

melver requested review from fmayer, zmodem and vitalybuka October 6, 2025 15:05

fmayer approved these changes Oct 6, 2025

melver added 4 commits October 7, 2025 11:53

[𝘀𝗽𝗿] changes introduced through rebase

d329d5e

Created using spr 1.3.8-beta.1 [skip ci]

rebase

04a0b5c

Created using spr 1.3.8-beta.1

[𝘀𝗽𝗿] changes introduced through rebase

f6bd3a6

Created using spr 1.3.8-beta.1 [skip ci]

rebase

329bc37

Created using spr 1.3.8-beta.1

melver added 2 commits October 7, 2025 12:56

[𝘀𝗽𝗿] changes introduced through rebase

0256929

Created using spr 1.3.8-beta.1 [skip ci]

rebase

12cef77

Created using spr 1.3.8-beta.1

melver changed the base branch from users/melver/spr/main.clangcodegen-introduce-the-alloctoken-sanitizerkind to main October 7, 2025 11:30

rebase

2ecce80

Created using spr 1.3.8-beta.1

melver merged commit 0cee4db into main Oct 7, 2025
9 checks passed

melver deleted the users/melver/spr/clangcodegen-introduce-the-alloctoken-sanitizerkind branch October 7, 2025 18:22

thurstond mentioned this pull request Oct 8, 2025

Revert "[Clang][CodeGen] Emit !alloc_token for new expressions" #162412

Merged

thurstond added a commit that referenced this pull request Oct 8, 2025

Revert "[Clang][CodeGen] Emit !alloc_token for new expressions" (#162412)

34fda63

Reverts #162099 Reason: this commit depends on #162098, which I am reverting due to build breakage (see #162098 (comment)).

thurstond added a commit that referenced this pull request Oct 8, 2025

Revert "[Clang][CodeGen] Introduce the AllocToken SanitizerKind (#162098)"

df81b0b

This reverts commit 0cee4db.

thurstond mentioned this pull request Oct 8, 2025

Revert "[Clang][CodeGen] Introduce the AllocToken SanitizerKind" #162413

Merged

thurstond added a commit that referenced this pull request Oct 8, 2025

Revert "[Clang][CodeGen] Introduce the AllocToken SanitizerKind" (#162413)

c74fa20

Reverts #162098 Reason: buildbot breakage (see #162098 (comment))


Conversation

melver commented Oct 6, 2025 (edited)

llvmbot commented Oct 6, 2025 (edited)

thurstond commented Oct 7, 2025

thurstond commented Oct 8, 2025

melver commented Oct 8, 2025

thurstond commented Oct 8, 2025
