@melver melver commented Sep 4, 2025

For the AllocToken pass to accurately calculate token ID hints, we
should attach !alloc_token metadata to allocation calls to avoid
falling back to LLVM IR-type based hints (which depend on later "uses"
and are rather imprecise).

Unlike new expressions, untyped allocation calls (like malloc,
calloc, ::operator new(..), __builtin_operator_new, etc.) have no
syntactic type associated with them. For -fsanitize=alloc-token, type
hints are sufficient, and we can attempt to infer the type based on
common idioms.

When encountering calls to allocation functions (marked with
__attribute__((malloc)) or __attribute__((alloc_size(..)))), attach
!alloc_token by inferring the allocated type from (a) sizeof argument
expressions such as malloc(sizeof(MyType)), and (b) casts such as
(MyType*)malloc(4096).
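
As a rough illustration of the idioms listed above, the following C sketch collects the call shapes the inference is meant to recognize. The struct and function names are hypothetical and not taken from the patch; the comments describe what the PR description and the tests in the diff suggest, not guaranteed compiler behavior.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type used only for illustration. */
struct my_type {
    int id;
    struct my_type *next;
};

/* (a) Lone sizeof argument: the type is the sizeof operand. */
struct my_type *alloc_one(void) {
    return malloc(sizeof(struct my_type));
}

/* (a) sizeof inside an arithmetic expression (here a multiplication). */
struct my_type *alloc_many(size_t n) {
    /* Guard against a zero count and against multiplication overflow. */
    assert(n != 0 && n <= SIZE_MAX / sizeof(struct my_type));
    return malloc(n * sizeof(struct my_type));
}

/* (a) sizeof reached through a variable's initializer. According to the
 * patch comments this is a heuristic: later reassignments of the variable
 * (such as the multiplication below) are ignored. */
void *alloc_via_size_var(size_t n) {
    size_t size = sizeof(struct my_type);
    size *= n;
    return malloc(size);
}

/* (b) No sizeof at all: the type comes from the cast applied to the result. */
struct my_type *alloc_cast_only(void) {
    return (struct my_type *)malloc(4096);
}

/* Neither idiom is present at this call site, so there is nothing
 * syntactic to infer a type from. */
void *alloc_untyped(size_t size) {
    return malloc(size);
}
```

The argument-based idioms are listed first because, in the patch, the arguments are checked before any enclosing cast is considered.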

Note that non-standard allocation functions with these attributes are
not instrumented by default. Use -fsanitize-alloc-token-extended to
instrument them as well.
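
For context, here is a minimal sketch of what an allocator might provide on its side once -fsanitize-alloc-token-extended redirects calls to a custom allocation function. The `__alloc_token_` prefix and the trailing token argument follow the documentation example in this patch; the partition count, the counters, the `size_t` type of the token, and the modulo mapping are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical number of heap partitions. */
#define NUM_PARTITIONS 4

/* Per-partition allocation counters, standing in for real per-partition heaps. */
static size_t partition_allocs[NUM_PARTITIONS];

/* The application's custom allocator, marked so that it qualifies for
 * extended instrumentation. */
void *custom_malloc(size_t size) __attribute__((malloc));

void *custom_malloc(size_t size) {
    return malloc(size);
}

/* Token-enabled variant that instrumented call sites would call instead.
 * This translation unit is assumed to be built without
 * -fsanitize=alloc-token, so the inner call is not itself rewritten. */
void *__alloc_token_custom_malloc(size_t size, size_t token_id) {
    /* Assumed mapping: fold the token ID into a small number of partitions. */
    size_t partition = token_id % NUM_PARTITIONS;
    assert(partition < NUM_PARTITIONS);
    partition_allocs[partition]++;
    return custom_malloc(size);
}
```

A real allocator would route each partition to a separate heap region rather than count calls; the counter only makes the partition choice observable.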

Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434


This change is part of the following series:

  1. [AllocToken] Introduce sanitize_alloc_token attribute and alloc_token metadata #160131
  2. [AllocToken] Introduce AllocToken instrumentation pass #156838
  3. [Clang][CodeGen] Introduce the AllocToken SanitizerKind #162098
  4. [Clang][CodeGen] Emit !alloc_token for new expressions #162099
  5. [Clang] Wire up -fsanitize=alloc-token #156839
  6. [AllocToken, Clang] Implement TypeHashPointerSplit mode #156840
  7. [AllocToken, Clang] Infer type hints from sizeof expressions and casts #156841
  8. [AllocToken, Clang] Implement __builtin_infer_alloc_token() and llvm.alloc.token.id #156842

Created using spr 1.3.8-beta.1

@melver melver marked this pull request as ready for review September 9, 2025 13:02
@llvmbot llvmbot added the clang and clang:codegen labels Sep 9, 2025
llvmbot commented Sep 9, 2025

@llvm/pr-subscribers-clang

Author: Marco Elver (melver)

Changes

For the AllocToken pass to accurately calculate token ID hints, we
should attach !alloc_token_hint metadata to allocation calls to avoid
falling back to LLVM IR-type based hints (which depend on later "uses"
and are rather imprecise).

Unlike new expressions, untyped allocation calls (like malloc,
calloc, ::operator new(..), __builtin_operator_new, etc.) have no
syntactic type associated with them. For -fsanitize=alloc-token, type
hints are sufficient, and we can attempt to infer the type based on
common idioms.

When encountering calls to allocation functions (marked with
__attribute__((malloc)) or __attribute__((alloc_size(..)))), attach
!alloc_token_hint by inferring the allocated type from (a) sizeof
argument expressions such as malloc(sizeof(MyType)), and (b) casts such
as (MyType*)malloc(4096).

Note that non-standard allocation functions with these attributes are
not instrumented by default. Use -fsanitize-alloc-token-extended to
instrument them as well.

Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434


This change is part of the following series:

  1. [AllocToken] Introduce AllocToken instrumentation pass #156838
  2. [Clang] Wire up -fsanitize=alloc-token #156839
  3. [AllocToken, Clang] Implement TypeHashPointerSplit mode #156840
  4. [AllocToken, Clang] Infer type hints from sizeof expressions and casts #156841
  5. [AllocToken, Clang] Implement __builtin_infer_alloc_token() and llvm.alloc.token.id #156842

Patch is 25.74 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/156841.diff

9 Files Affected:

  • (modified) clang/docs/AllocToken.rst (+29)
  • (modified) clang/lib/CodeGen/CGExpr.cpp (+108-4)
  • (modified) clang/lib/CodeGen/CGExprCXX.cpp (+10-2)
  • (modified) clang/lib/CodeGen/CGExprScalar.cpp (+5)
  • (modified) clang/lib/CodeGen/CodeGenFunction.h (+7)
  • (added) clang/test/CodeGen/alloc-token-nonlibcalls.c (+18)
  • (modified) clang/test/CodeGen/alloc-token.c (+22-22)
  • (modified) clang/test/CodeGenCXX/alloc-token-pointer.cpp (+36-21)
  • (modified) clang/test/CodeGenCXX/alloc-token.cpp (+49-27)
diff --git a/clang/docs/AllocToken.rst b/clang/docs/AllocToken.rst
index fb354d6738ea3..7ad5e5f03d8a0 100644
--- a/clang/docs/AllocToken.rst
+++ b/clang/docs/AllocToken.rst
@@ -122,6 +122,35 @@ which encodes the token ID hint in the allocation function name.
 This ABI provides a more efficient alternative where
 ``-falloc-token-max`` is small.
 
+Instrumenting Non-Standard Allocation Functions
+-----------------------------------------------
+
+By default, AllocToken only instruments standard library allocation functions.
+This simplifies adoption, as a compatible allocator only needs to provide
+token-enabled variants for a well-defined set of standard functions.
+
+To extend instrumentation to custom allocation functions, enable broader
+coverage with ``-fsanitize-alloc-token-extended``. Such functions require being
+marked with the `malloc
+<https://clang.llvm.org/docs/AttributeReference.html#malloc>`_ or `alloc_size
+<https://clang.llvm.org/docs/AttributeReference.html#alloc-size>`_ attributes
+(or a combination).
+
+For example:
+
+.. code-block:: c
+
+    void *custom_malloc(size_t size) __attribute__((malloc));
+    void *my_malloc(size_t size) __attribute__((alloc_size(1)));
+
+    // Original:
+    ptr1 = custom_malloc(size);
+    ptr2 = my_malloc(size);
+
+    // Instrumented:
+    ptr1 = __alloc_token_custom_malloc(size, token_id);
+    ptr2 = __alloc_token_my_malloc(size, token_id);
+
 Disabling Instrumentation
 -------------------------
 
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index e7a0e7696e204..dc428f04e873a 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -30,6 +30,7 @@
 #include "clang/AST/Attr.h"
 #include "clang/AST/DeclObjC.h"
 #include "clang/AST/NSAPI.h"
+#include "clang/AST/ParentMapContext.h"
 #include "clang/AST/StmtVisitor.h"
 #include "clang/Basic/Builtins.h"
 #include "clang/Basic/CodeGenOptions.h"
@@ -1349,6 +1350,98 @@ void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase *CB,
   CB->setMetadata(llvm::LLVMContext::MD_alloc_token_hint, MDN);
 }
 
+/// Infer type from a simple sizeof expression.
+static QualType inferTypeFromSizeofExpr(const Expr *E) {
+  const Expr *Arg = E->IgnoreParenImpCasts();
+  if (const auto *UET = dyn_cast<UnaryExprOrTypeTraitExpr>(Arg)) {
+    if (UET->getKind() == UETT_SizeOf) {
+      if (UET->isArgumentType()) {
+        return UET->getArgumentTypeInfo()->getType();
+      } else {
+        return UET->getArgumentExpr()->getType();
+      }
+    }
+  }
+  return QualType();
+}
+
+/// Infer type from an arithmetic expression involving a sizeof.
+static QualType inferTypeFromArithSizeofExpr(const Expr *E) {
+  const Expr *Arg = E->IgnoreParenImpCasts();
+  // The argument is a lone sizeof expression.
+  QualType QT = inferTypeFromSizeofExpr(Arg);
+  if (!QT.isNull())
+    return QT;
+  if (const auto *BO = dyn_cast<BinaryOperator>(Arg)) {
+    // Argument is an arithmetic expression. Cover common arithmetic patterns
+    // involving sizeof.
+    switch (BO->getOpcode()) {
+    case BO_Add:
+    case BO_Div:
+    case BO_Mul:
+    case BO_Shl:
+    case BO_Shr:
+    case BO_Sub:
+      QT = inferTypeFromArithSizeofExpr(BO->getLHS());
+      if (!QT.isNull())
+        return QT;
+      QT = inferTypeFromArithSizeofExpr(BO->getRHS());
+      if (!QT.isNull())
+        return QT;
+      break;
+    default:
+      break;
+    }
+  }
+  return QualType();
+}
+
+/// If the expression E is a reference to a variable, infer the type from a
+/// variable's initializer if it contains a sizeof. Beware, this is a heuristic
+/// and ignores if a variable is later reassigned.
+static QualType inferTypeFromVarInitSizeofExpr(const Expr *E) {
+  const Expr *Arg = E->IgnoreParenImpCasts();
+  if (const auto *DRE = dyn_cast<DeclRefExpr>(Arg)) {
+    if (const auto *VD = dyn_cast<VarDecl>(DRE->getDecl())) {
+      if (const Expr *Init = VD->getInit()) {
+        return inferTypeFromArithSizeofExpr(Init);
+      }
+    }
+  }
+  return QualType();
+}
+
+/// Deduces the allocated type by checking if the allocation call's result
+/// is immediately used in a cast expression.
+static QualType inferTypeFromCastExpr(const CallExpr *CallE,
+                                      const CastExpr *CastE) {
+  if (!CastE)
+    return QualType();
+  QualType PtrType = CastE->getType();
+  if (PtrType->isPointerType())
+    return PtrType->getPointeeType();
+  return QualType();
+}
+
+void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase *CB,
+                                         const CallExpr *E) {
+  QualType AllocType;
+  // First check arguments.
+  for (const Expr *Arg : E->arguments()) {
+    AllocType = inferTypeFromArithSizeofExpr(Arg);
+    if (AllocType.isNull())
+      AllocType = inferTypeFromVarInitSizeofExpr(Arg);
+    if (!AllocType.isNull())
+      break;
+  }
+  // Then check later casts.
+  if (AllocType.isNull())
+    AllocType = inferTypeFromCastExpr(E, CurCast);
+  // Emit if we were able to infer the type.
+  if (!AllocType.isNull())
+    EmitAllocTokenHint(CB, AllocType);
+}
+
 CodeGenFunction::ComplexPairTy CodeGenFunction::
 EmitComplexPrePostIncDec(const UnaryOperator *E, LValue LV,
                          bool isInc, bool isPre) {
@@ -5720,6 +5813,9 @@ LValue CodeGenFunction::EmitConditionalOperatorLValue(
 /// are permitted with aggregate result, including noop aggregate casts, and
 /// cast from scalar to union.
 LValue CodeGenFunction::EmitCastLValue(const CastExpr *E) {
+  auto RestoreCurCast =
+      llvm::make_scope_exit([this, Prev = CurCast] { CurCast = Prev; });
+  CurCast = E;
   switch (E->getCastKind()) {
   case CK_ToVoid:
   case CK_BitCast:
@@ -6668,16 +6764,24 @@ RValue CodeGenFunction::EmitCall(QualType CalleeType,
   RValue Call = EmitCall(FnInfo, Callee, ReturnValue, Args, &LocalCallOrInvoke,
                          E == MustTailCall, E->getExprLoc());
 
-  // Generate function declaration DISuprogram in order to be used
-  // in debug info about call sites.
-  if (CGDebugInfo *DI = getDebugInfo()) {
-    if (auto *CalleeDecl = dyn_cast_or_null<FunctionDecl>(TargetDecl)) {
+  if (auto *CalleeDecl = dyn_cast_or_null<FunctionDecl>(TargetDecl)) {
+    // Generate function declaration DISuprogram in order to be used
+    // in debug info about call sites.
+    if (CGDebugInfo *DI = getDebugInfo()) {
       FunctionArgList Args;
       QualType ResTy = BuildFunctionArgList(CalleeDecl, Args);
       DI->EmitFuncDeclForCallSite(LocalCallOrInvoke,
                                   DI->getFunctionType(CalleeDecl, ResTy, Args),
                                   CalleeDecl);
     }
+    if (CalleeDecl->hasAttr<RestrictAttr>() ||
+        CalleeDecl->hasAttr<AllocSizeAttr>()) {
+      // Function has malloc or alloc_size attribute.
+      if (SanOpts.has(SanitizerKind::AllocToken)) {
+        // Set !alloc_token_hint metadata.
+        EmitAllocTokenHint(LocalCallOrInvoke, E);
+      }
+    }
   }
   if (CallOrInvoke)
     *CallOrInvoke = LocalCallOrInvoke;
diff --git a/clang/lib/CodeGen/CGExprCXX.cpp b/clang/lib/CodeGen/CGExprCXX.cpp
index 6bf3332b425fa..85ced63d5036b 100644
--- a/clang/lib/CodeGen/CGExprCXX.cpp
+++ b/clang/lib/CodeGen/CGExprCXX.cpp
@@ -1371,8 +1371,16 @@ RValue CodeGenFunction::EmitBuiltinNewDeleteCall(const FunctionProtoType *Type,
 
   for (auto *Decl : Ctx.getTranslationUnitDecl()->lookup(Name))
     if (auto *FD = dyn_cast<FunctionDecl>(Decl))
-      if (Ctx.hasSameType(FD->getType(), QualType(Type, 0)))
-        return EmitNewDeleteCall(*this, FD, Type, Args);
+      if (Ctx.hasSameType(FD->getType(), QualType(Type, 0))) {
+        RValue RV = EmitNewDeleteCall(*this, FD, Type, Args);
+        if (auto *CB = dyn_cast_if_present<llvm::CallBase>(RV.getScalarVal())) {
+          if (SanOpts.has(SanitizerKind::AllocToken)) {
+            // Set !alloc_token_hint metadata.
+            EmitAllocTokenHint(CB, TheCall);
+          }
+        }
+        return RV;
+      }
   llvm_unreachable("predeclared global operator new/delete is missing");
 }
 
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 2eff3a387593c..3318c8a52597e 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -33,6 +33,7 @@
 #include "clang/Basic/DiagnosticTrap.h"
 #include "clang/Basic/TargetInfo.h"
 #include "llvm/ADT/APFixedPoint.h"
+#include "llvm/ADT/ScopeExit.h"
 #include "llvm/IR/Argument.h"
 #include "llvm/IR/CFG.h"
 #include "llvm/IR/Constants.h"
@@ -2431,6 +2432,10 @@ static Value *EmitHLSLElementwiseCast(CodeGenFunction &CGF, Address RHSVal,
 // have to handle a more broad range of conversions than explicit casts, as they
 // handle things like function to ptr-to-function decay etc.
 Value *ScalarExprEmitter::VisitCastExpr(CastExpr *CE) {
+  auto RestoreCurCast =
+      llvm::make_scope_exit([this, Prev = CGF.CurCast] { CGF.CurCast = Prev; });
+  CGF.CurCast = CE;
+
   Expr *E = CE->getSubExpr();
   QualType DestTy = CE->getType();
   CastKind Kind = CE->getCastKind();
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index fd7ec36183c2d..8e89838531d35 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -346,6 +346,10 @@ class CodeGenFunction : public CodeGenTypeCache {
   QualType FnRetTy;
   llvm::Function *CurFn = nullptr;
 
+  /// If a cast expression is being visited, this holds the current cast's
+  /// expression.
+  const CastExpr *CurCast = nullptr;
+
   /// Save Parameter Decl for coroutine.
   llvm::SmallVector<const ParmVarDecl *, 4> FnArgs;
 
@@ -3350,6 +3354,9 @@ class CodeGenFunction : public CodeGenTypeCache {
 
   /// Emit additional metadata used by the AllocToken instrumentation.
   void EmitAllocTokenHint(llvm::CallBase *CB, QualType AllocType);
+  /// Emit additional metadata used by the AllocToken instrumentation,
+  /// inferring the type from an allocation call expression.
+  void EmitAllocTokenHint(llvm::CallBase *CB, const CallExpr *E);
 
   llvm::Value *GetCountedByFieldExprGEP(const Expr *Base, const FieldDecl *FD,
                                         const FieldDecl *CountDecl);
diff --git a/clang/test/CodeGen/alloc-token-nonlibcalls.c b/clang/test/CodeGen/alloc-token-nonlibcalls.c
new file mode 100644
index 0000000000000..53c85a2174c5e
--- /dev/null
+++ b/clang/test/CodeGen/alloc-token-nonlibcalls.c
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1    -fsanitize=alloc-token -fsanitize-alloc-token-extended -falloc-token-max=2147483647 -triple x86_64-linux-gnu -x c -emit-llvm %s -o - | FileCheck %s
+// RUN: %clang_cc1 -O -fsanitize=alloc-token -fsanitize-alloc-token-extended -falloc-token-max=2147483647 -triple x86_64-linux-gnu -x c -emit-llvm %s -o - | FileCheck %s
+
+typedef __typeof(sizeof(int)) size_t;
+typedef size_t gfp_t;
+
+void *custom_malloc(size_t size) __attribute__((malloc));
+void *__kmalloc(size_t size, gfp_t flags) __attribute__((alloc_size(1)));
+
+void *sink;
+
+// CHECK-LABEL: @test_nonlibcall_alloc(
+void test_nonlibcall_alloc() {
+  // CHECK: call{{.*}} ptr @__alloc_token_custom_malloc(i64 noundef 4, i64 {{[1-9][0-9]*}})
+  sink = custom_malloc(sizeof(int));
+  // CHECK: call{{.*}} ptr @__alloc_token_kmalloc(i64 noundef 4, i64 noundef 0, i64 {{[1-9][0-9]*}})
+  sink = __kmalloc(sizeof(int), 0);
+}
diff --git a/clang/test/CodeGen/alloc-token.c b/clang/test/CodeGen/alloc-token.c
index de9b3f48c995f..b75adc1d2e766 100644
--- a/clang/test/CodeGen/alloc-token.c
+++ b/clang/test/CodeGen/alloc-token.c
@@ -3,43 +3,43 @@
 
 typedef __typeof(sizeof(int)) size_t;
 
-void *aligned_alloc(size_t alignment, size_t size);
-void *malloc(size_t size);
-void *calloc(size_t num, size_t size);
-void *realloc(void *ptr, size_t size);
-void *reallocarray(void *ptr, size_t nmemb, size_t size);
-void *memalign(size_t alignment, size_t size);
-void *valloc(size_t size);
-void *pvalloc(size_t size);
+void *aligned_alloc(size_t alignment, size_t size) __attribute__((malloc));
+void *malloc(size_t size) __attribute__((malloc));
+void *calloc(size_t num, size_t size) __attribute__((malloc));
+void *realloc(void *ptr, size_t size) __attribute__((malloc));
+void *reallocarray(void *ptr, size_t nmemb, size_t size) __attribute__((malloc));
+void *memalign(size_t alignment, size_t size) __attribute__((malloc));
+void *valloc(size_t size) __attribute__((malloc));
+void *pvalloc(size_t size) __attribute__((malloc));
 int posix_memalign(void **memptr, size_t alignment, size_t size);
 
 void *sink;
 
 // CHECK-LABEL: @test_malloc_like(
 void test_malloc_like() {
-  // FIXME: Should not be token ID 0! Currently fail to infer the type.
-  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4, i64 0)
+  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
   sink = malloc(sizeof(int));
-  // CHECK: call{{.*}} ptr @__alloc_token_calloc(i64 noundef 3, i64 noundef 4, i64 0)
+  // CHECK: call{{.*}} ptr @__alloc_token_calloc(i64 noundef 3, i64 noundef 4, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
   sink = calloc(3, sizeof(int));
-  // CHECK: call{{.*}} ptr @__alloc_token_realloc(ptr noundef {{[^,]*}}, i64 noundef 8, i64 0)
+  // CHECK: call{{.*}} ptr @__alloc_token_realloc(ptr noundef {{[^,]*}}, i64 noundef 8, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
   sink = realloc(sink, sizeof(long));
-  // CHECK: call{{.*}} ptr @__alloc_token_reallocarray(ptr noundef {{[^,]*}}, i64 noundef 5, i64 noundef 8, i64 0)
+  // CHECK: call{{.*}} ptr @__alloc_token_reallocarray(ptr noundef {{[^,]*}}, i64 noundef 5, i64 noundef 8, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
   sink = reallocarray(sink, 5, sizeof(long));
+  // CHECK: call{{.*}} align 128{{.*}} ptr @__alloc_token_aligned_alloc(i64 noundef 128, i64 noundef 4, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
+  sink = aligned_alloc(128, sizeof(int));
+  // CHECK: call{{.*}} align 16{{.*}} ptr @__alloc_token_memalign(i64 noundef 16, i64 noundef 4, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
+  sink = memalign(16, sizeof(int));
+  // CHECK: call{{.*}} ptr @__alloc_token_valloc(i64 noundef 4, i64 {{[1-9][0-9]*}}), !alloc_token_hint
+  sink = valloc(sizeof(int));
+  // CHECK: call{{.*}} ptr @__alloc_token_pvalloc(i64 noundef 4, i64 {{[1-9][0-9]*}}), !alloc_token_hint
+  sink = pvalloc(sizeof(int));
+  // FIXME: Should not be token ID 0!
   // CHECK: call{{.*}} i32 @__alloc_token_posix_memalign(ptr noundef {{[^,]*}}, i64 noundef 64, i64 noundef 4, i64 0)
   posix_memalign(&sink, 64, sizeof(int));
-  // CHECK: call align 128{{.*}} ptr @__alloc_token_aligned_alloc(i64 noundef 128, i64 noundef 1024, i64 0)
-  sink = aligned_alloc(128, 1024);
-  // CHECK: call align 16{{.*}} ptr @__alloc_token_memalign(i64 noundef 16, i64 noundef 256, i64 0)
-  sink = memalign(16, 256);
-  // CHECK: call{{.*}} ptr @__alloc_token_valloc(i64 noundef 4096, i64 0)
-  sink = valloc(4096);
-  // CHECK: call{{.*}} ptr @__alloc_token_pvalloc(i64 noundef 8192, i64 0)
-  sink = pvalloc(8192);
 }
 
 // CHECK-LABEL: @no_sanitize_malloc(
 void *no_sanitize_malloc(size_t size) __attribute__((no_sanitize("alloc-token"))) {
-  // CHECK: call ptr @malloc(
+  // CHECK: call{{.*}} ptr @malloc(
   return malloc(size);
 }
diff --git a/clang/test/CodeGenCXX/alloc-token-pointer.cpp b/clang/test/CodeGenCXX/alloc-token-pointer.cpp
index 7adb75c7afebb..cceb305fc09e9 100644
--- a/clang/test/CodeGenCXX/alloc-token-pointer.cpp
+++ b/clang/test/CodeGenCXX/alloc-token-pointer.cpp
@@ -9,9 +9,11 @@
 typedef __UINTPTR_TYPE__ uintptr_t;
 
 extern "C" {
-void *malloc(size_t size);
+void *malloc(size_t size) __attribute__((malloc));
 }
 
+void *sink; // prevent optimizations from removing the calls
+
 // CHECK-LABEL: @_Z15test_malloc_intv(
 void *test_malloc_int() {
   // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4, i64 0)
@@ -22,8 +24,7 @@ void *test_malloc_int() {
 
 // CHECK-LABEL: @_Z15test_malloc_ptrv(
 int **test_malloc_ptr() {
-  // FIXME: This should not be token ID 0!
-  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 8, i64 0)
+  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 8, i64 1)
   int **a = (int **)malloc(sizeof(int*));
   *a = nullptr;
   return a;
@@ -59,50 +60,64 @@ struct ContainsPtr {
 };
 
 // CHECK-LABEL: @_Z27test_malloc_struct_with_ptrv(
-ContainsPtr *test_malloc_struct_with_ptr() {
-  // FIXME: This should not be token ID 0!
-  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 16, i64 0)
-  ContainsPtr *c = (ContainsPtr *)malloc(sizeof(ContainsPtr));
-  return c;
+void *test_malloc_struct_with_ptr() {
+  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
+  return malloc(sizeof(ContainsPtr));
 }
 
 // CHECK-LABEL: @_Z33test_malloc_struct_array_with_ptrv(
-ContainsPtr *test_malloc_struct_array_with_ptr() {
-  // FIXME: This should not be token ID 0!
-  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 160, i64 0)
-  ContainsPtr *c = (ContainsPtr *)malloc(10 * sizeof(ContainsPtr));
-  return c;
+void *test_malloc_struct_array_with_ptr() {
+  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
+  return malloc(10 * sizeof(ContainsPtr));
+}
+
+// CHECK-LABEL: @_Z31test_malloc_with_ptr_sizeof_vari(
+void *test_malloc_with_ptr_sizeof_var(int x) {
+  unsigned long size = sizeof(ContainsPtr);
+  size *= x;
+  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef %{{.*}}, i64 1){{.*}} !alloc_token_hint
+  return malloc(size);
+}
+
+// CHECK-LABEL: @_Z29test_malloc_with_ptr_castonlyv(
+ContainsPtr *test_malloc_with_ptr_castonly() {
+  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4096, i64 1){{.*}} !alloc_token_hint
+  return (ContainsPtr *)malloc(4096);
 }
 
 // CHECK-LABEL: @_Z32test_operatornew_struct_with_ptrv(
 ContainsPtr *test_operatornew_struct_with_ptr() {
-  // FIXME: This should not be token ID 0!
-  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 0)
+  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
   ContainsPtr *c = (ContainsPtr *)__builtin_operator_new(sizeof(ContainsPtr));
+  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
+  sink = ::operator new(sizeof(ContainsPtr));
   return c;
 }
 
 // CHECK-LABEL: @_Z38test_operatornew_struct_array_with_ptrv(
 ContainsPtr *test_operatornew_struct_array_with_ptr() {
-  // FIXME: This should not be token ID 0!
-  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 0)
+  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
   ContainsPtr *c = (ContainsPtr *)__builtin_operator_new(10 * sizeof(ContainsPtr));
+  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
+  sink = ::operator new(10 * sizeof(ContainsPtr));
   return c;
 }
 
 // CHECK-LABEL: @_Z33test_operatornew_struct_with_ptr2v(
 ContainsPtr *test_operatornew_struct_with_ptr2() {
-  // FIXME: This should not be token ID 0!
-  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 0)
+  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
   ContainsPtr *c = (ContainsPtr *)__builtin_operator_new(sizeof(*c));
+  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
+  sink = ::operator new(sizeof(*c));
   return c;
 }
 
 // CHECK-LABEL: @_Z39test_operatornew_struct_array_with_ptr2v(
 ContainsPtr *test_operatornew_struct_array_with_ptr2() {
-  // FIXME: This should not be token ID 0!
-  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 0)
+  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
   ContainsPtr *c = (ContainsPtr *)__builtin_operator_new(10 * sizeof(*c));
+  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
+  sink = ::operator new(10 * sizeof(*c));
   return c;
 }
 
diff --git a/clang/test/CodeGenCXX/alloc-token.cpp b/clang/test/CodeGenCXX/alloc-token.cpp
index 180c771e43ae9..66d9352510783 100644
--- a/clang/test/CodeGenCXX/alloc-token.cpp
+++ b/clang/test/CodeGenCXX/alloc-token.cpp
@@ -3,14 +3,14 @@
 
 #include "../Analysis/Inputs/system-header-simulator-cxx.h"
 extern "C" {
-void *aligned_alloc(size_t alignment, size_t size);
-void *malloc(size_t size);
-void *calloc(size_...
[truncated]
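
The lookup order used by EmitAllocTokenHint in the CGExpr.cpp hunk above (search each argument for a sizeof, descending through arithmetic operators, then fall back to the pointee type of the enclosing cast) can be modeled on a toy expression tree. This is a simplified sketch in C, not Clang code: the `struct expr` representation and all function names are hypothetical, types are reduced to strings, and the variable-initializer heuristic from the patch is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy expression kinds standing in for Clang AST nodes. */
enum kind { K_LITERAL, K_SIZEOF, K_BINOP };

struct expr {
    enum kind kind;
    const char *type_name;        /* K_SIZEOF: name of the sizeof operand type */
    char op;                      /* K_BINOP: '+', '-', '*', '/', '<' (shl), '>' (shr), or other */
    const struct expr *lhs, *rhs; /* K_BINOP operands */
};

/* Mirrors the set of opcodes the patch descends through. */
static int is_arith_op(char op) {
    return op == '+' || op == '-' || op == '*' ||
           op == '/' || op == '<' || op == '>';
}

/* Returns the type named by the first sizeof found, or NULL if none. */
const char *infer_from_arith_sizeof(const struct expr *e) {
    assert(e != NULL);
    if (e->kind == K_SIZEOF)
        return e->type_name;
    if (e->kind == K_BINOP && is_arith_op(e->op)) {
        const char *t = infer_from_arith_sizeof(e->lhs);
        if (t != NULL)
            return t;
        return infer_from_arith_sizeof(e->rhs);
    }
    return NULL;
}

/* Arguments are checked first; the cast's pointee type (NULL if the call
 * is not wrapped in a pointer cast) is used only as a fallback. */
const char *infer_alloc_type(const struct expr *const *args, size_t nargs,
                             const char *cast_pointee) {
    for (size_t i = 0; i < nargs; i++) {
        const char *t = infer_from_arith_sizeof(args[i]);
        if (t != NULL)
            return t;
    }
    return cast_pointee;
}
```

In this model, `malloc(10 * sizeof(ContainsPtr))` resolves through the multiplication, while `(ContainsPtr *)malloc(4096)` resolves only through the cast fallback, which matches the two test cases added to alloc-token-pointer.cpp.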

llvmbot commented Sep 9, 2025

@llvm/pr-subscribers-clang-codegen

Author: Marco Elver (melver)

diff --git a/clang/docs/AllocToken.rst b/clang/docs/AllocToken.rst
index fb354d6738ea3..7ad5e5f03d8a0 100644
--- a/clang/docs/AllocToken.rst
+++ b/clang/docs/AllocToken.rst
@@ -122,6 +122,35 @@ which encodes the token ID hint in the allocation function name.
 This ABI provides a more efficient alternative where
 ``-falloc-token-max`` is small.
 
+Instrumenting Non-Standard Allocation Functions
+-----------------------------------------------
+
+By default, AllocToken only instruments standard library allocation functions.
+This simplifies adoption, as a compatible allocator only needs to provide
+token-enabled variants for a well-defined set of standard functions.
+
+To extend instrumentation to custom allocation functions, enable broader
+coverage with ``-fsanitize-alloc-token-extended``. Such functions require being
+marked with the `malloc
+<https://clang.llvm.org/docs/AttributeReference.html#malloc>`_ or `alloc_size
+<https://clang.llvm.org/docs/AttributeReference.html#alloc-size>`_ attributes
+(or a combination).
+
+For example:
+
+.. code-block:: c
+
+    void *custom_malloc(size_t size) __attribute__((malloc));
+    void *my_malloc(size_t size) __attribute__((alloc_size(1)));
+
+    // Original:
+    ptr1 = custom_malloc(size);
+    ptr2 = my_malloc(size);
+
+    // Instrumented:
+    ptr1 = __alloc_token_custom_malloc(size, token_id);
+    ptr2 = __alloc_token_my_malloc(size, token_id);
+
 Disabling Instrumentation
 -------------------------
 
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index e7a0e7696e204..dc428f04e873a 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -30,6 +30,7 @@
 #include "clang/AST/Attr.h"
 #include "clang/AST/DeclObjC.h"
 #include "clang/AST/NSAPI.h"
+#include "clang/AST/ParentMapContext.h"
 #include "clang/AST/StmtVisitor.h"
 #include "clang/Basic/Builtins.h"
 #include "clang/Basic/CodeGenOptions.h"
@@ -1349,6 +1350,98 @@ void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase *CB,
   CB->setMetadata(llvm::LLVMContext::MD_alloc_token_hint, MDN);
 }
 
+/// Infer type from a simple sizeof expression.
+static QualType inferTypeFromSizeofExpr(const Expr *E) {
+  const Expr *Arg = E->IgnoreParenImpCasts();
+  if (const auto *UET = dyn_cast<UnaryExprOrTypeTraitExpr>(Arg)) {
+    if (UET->getKind() == UETT_SizeOf) {
+      if (UET->isArgumentType()) {
+        return UET->getArgumentTypeInfo()->getType();
+      } else {
+        return UET->getArgumentExpr()->getType();
+      }
+    }
+  }
+  return QualType();
+}
+
+/// Infer type from an arithmetic expression involving a sizeof.
+static QualType inferTypeFromArithSizeofExpr(const Expr *E) {
+  const Expr *Arg = E->IgnoreParenImpCasts();
+  // The argument is a lone sizeof expression.
+  QualType QT = inferTypeFromSizeofExpr(Arg);
+  if (!QT.isNull())
+    return QT;
+  if (const auto *BO = dyn_cast<BinaryOperator>(Arg)) {
+    // Argument is an arithmetic expression. Cover common arithmetic patterns
+    // involving sizeof.
+    switch (BO->getOpcode()) {
+    case BO_Add:
+    case BO_Div:
+    case BO_Mul:
+    case BO_Shl:
+    case BO_Shr:
+    case BO_Sub:
+      QT = inferTypeFromArithSizeofExpr(BO->getLHS());
+      if (!QT.isNull())
+        return QT;
+      QT = inferTypeFromArithSizeofExpr(BO->getRHS());
+      if (!QT.isNull())
+        return QT;
+      break;
+    default:
+      break;
+    }
+  }
+  return QualType();
+}
+
+/// If the expression E is a reference to a variable, infer the type from a
+/// variable's initializer if it contains a sizeof. Beware, this is a heuristic
+/// and ignores if a variable is later reassigned.
+static QualType inferTypeFromVarInitSizeofExpr(const Expr *E) {
+  const Expr *Arg = E->IgnoreParenImpCasts();
+  if (const auto *DRE = dyn_cast<DeclRefExpr>(Arg)) {
+    if (const auto *VD = dyn_cast<VarDecl>(DRE->getDecl())) {
+      if (const Expr *Init = VD->getInit()) {
+        return inferTypeFromArithSizeofExpr(Init);
+      }
+    }
+  }
+  return QualType();
+}
+
+/// Deduces the allocated type by checking if the allocation call's result
+/// is immediately used in a cast expression.
+static QualType inferTypeFromCastExpr(const CallExpr *CallE,
+                                      const CastExpr *CastE) {
+  if (!CastE)
+    return QualType();
+  QualType PtrType = CastE->getType();
+  if (PtrType->isPointerType())
+    return PtrType->getPointeeType();
+  return QualType();
+}
+
+void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase *CB,
+                                         const CallExpr *E) {
+  QualType AllocType;
+  // First check arguments.
+  for (const Expr *Arg : E->arguments()) {
+    AllocType = inferTypeFromArithSizeofExpr(Arg);
+    if (AllocType.isNull())
+      AllocType = inferTypeFromVarInitSizeofExpr(Arg);
+    if (!AllocType.isNull())
+      break;
+  }
+  // Then check later casts.
+  if (AllocType.isNull())
+    AllocType = inferTypeFromCastExpr(E, CurCast);
+  // Emit if we were able to infer the type.
+  if (!AllocType.isNull())
+    EmitAllocTokenHint(CB, AllocType);
+}
+
 CodeGenFunction::ComplexPairTy CodeGenFunction::
 EmitComplexPrePostIncDec(const UnaryOperator *E, LValue LV,
                          bool isInc, bool isPre) {
@@ -5720,6 +5813,9 @@ LValue CodeGenFunction::EmitConditionalOperatorLValue(
 /// are permitted with aggregate result, including noop aggregate casts, and
 /// cast from scalar to union.
 LValue CodeGenFunction::EmitCastLValue(const CastExpr *E) {
+  auto RestoreCurCast =
+      llvm::make_scope_exit([this, Prev = CurCast] { CurCast = Prev; });
+  CurCast = E;
   switch (E->getCastKind()) {
   case CK_ToVoid:
   case CK_BitCast:
@@ -6668,16 +6764,24 @@ RValue CodeGenFunction::EmitCall(QualType CalleeType,
   RValue Call = EmitCall(FnInfo, Callee, ReturnValue, Args, &LocalCallOrInvoke,
                          E == MustTailCall, E->getExprLoc());
 
-  // Generate function declaration DISuprogram in order to be used
-  // in debug info about call sites.
-  if (CGDebugInfo *DI = getDebugInfo()) {
-    if (auto *CalleeDecl = dyn_cast_or_null<FunctionDecl>(TargetDecl)) {
+  if (auto *CalleeDecl = dyn_cast_or_null<FunctionDecl>(TargetDecl)) {
+    // Generate function declaration DISubprogram in order to be used
+    // in debug info about call sites.
+    if (CGDebugInfo *DI = getDebugInfo()) {
       FunctionArgList Args;
       QualType ResTy = BuildFunctionArgList(CalleeDecl, Args);
       DI->EmitFuncDeclForCallSite(LocalCallOrInvoke,
                                   DI->getFunctionType(CalleeDecl, ResTy, Args),
                                   CalleeDecl);
     }
+    if (CalleeDecl->hasAttr<RestrictAttr>() ||
+        CalleeDecl->hasAttr<AllocSizeAttr>()) {
+      // Function has malloc or alloc_size attribute.
+      if (SanOpts.has(SanitizerKind::AllocToken)) {
+        // Set !alloc_token_hint metadata.
+        EmitAllocTokenHint(LocalCallOrInvoke, E);
+      }
+    }
   }
   if (CallOrInvoke)
     *CallOrInvoke = LocalCallOrInvoke;
diff --git a/clang/lib/CodeGen/CGExprCXX.cpp b/clang/lib/CodeGen/CGExprCXX.cpp
index 6bf3332b425fa..85ced63d5036b 100644
--- a/clang/lib/CodeGen/CGExprCXX.cpp
+++ b/clang/lib/CodeGen/CGExprCXX.cpp
@@ -1371,8 +1371,16 @@ RValue CodeGenFunction::EmitBuiltinNewDeleteCall(const FunctionProtoType *Type,
 
   for (auto *Decl : Ctx.getTranslationUnitDecl()->lookup(Name))
     if (auto *FD = dyn_cast<FunctionDecl>(Decl))
-      if (Ctx.hasSameType(FD->getType(), QualType(Type, 0)))
-        return EmitNewDeleteCall(*this, FD, Type, Args);
+      if (Ctx.hasSameType(FD->getType(), QualType(Type, 0))) {
+        RValue RV = EmitNewDeleteCall(*this, FD, Type, Args);
+        if (auto *CB = dyn_cast_if_present<llvm::CallBase>(RV.getScalarVal())) {
+          if (SanOpts.has(SanitizerKind::AllocToken)) {
+            // Set !alloc_token_hint metadata.
+            EmitAllocTokenHint(CB, TheCall);
+          }
+        }
+        return RV;
+      }
   llvm_unreachable("predeclared global operator new/delete is missing");
 }
 
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 2eff3a387593c..3318c8a52597e 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -33,6 +33,7 @@
 #include "clang/Basic/DiagnosticTrap.h"
 #include "clang/Basic/TargetInfo.h"
 #include "llvm/ADT/APFixedPoint.h"
+#include "llvm/ADT/ScopeExit.h"
 #include "llvm/IR/Argument.h"
 #include "llvm/IR/CFG.h"
 #include "llvm/IR/Constants.h"
@@ -2431,6 +2432,10 @@ static Value *EmitHLSLElementwiseCast(CodeGenFunction &CGF, Address RHSVal,
 // have to handle a more broad range of conversions than explicit casts, as they
 // handle things like function to ptr-to-function decay etc.
 Value *ScalarExprEmitter::VisitCastExpr(CastExpr *CE) {
+  auto RestoreCurCast =
+      llvm::make_scope_exit([this, Prev = CGF.CurCast] { CGF.CurCast = Prev; });
+  CGF.CurCast = CE;
+
   Expr *E = CE->getSubExpr();
   QualType DestTy = CE->getType();
   CastKind Kind = CE->getCastKind();
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index fd7ec36183c2d..8e89838531d35 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -346,6 +346,10 @@ class CodeGenFunction : public CodeGenTypeCache {
   QualType FnRetTy;
   llvm::Function *CurFn = nullptr;
 
+  /// If a cast expression is being visited, this holds the current cast's
+  /// expression.
+  const CastExpr *CurCast = nullptr;
+
   /// Save Parameter Decl for coroutine.
   llvm::SmallVector<const ParmVarDecl *, 4> FnArgs;
 
@@ -3350,6 +3354,9 @@ class CodeGenFunction : public CodeGenTypeCache {
 
   /// Emit additional metadata used by the AllocToken instrumentation.
   void EmitAllocTokenHint(llvm::CallBase *CB, QualType AllocType);
+  /// Emit additional metadata used by the AllocToken instrumentation,
+  /// inferring the type from an allocation call expression.
+  void EmitAllocTokenHint(llvm::CallBase *CB, const CallExpr *E);
 
   llvm::Value *GetCountedByFieldExprGEP(const Expr *Base, const FieldDecl *FD,
                                         const FieldDecl *CountDecl);
diff --git a/clang/test/CodeGen/alloc-token-nonlibcalls.c b/clang/test/CodeGen/alloc-token-nonlibcalls.c
new file mode 100644
index 0000000000000..53c85a2174c5e
--- /dev/null
+++ b/clang/test/CodeGen/alloc-token-nonlibcalls.c
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1    -fsanitize=alloc-token -fsanitize-alloc-token-extended -falloc-token-max=2147483647 -triple x86_64-linux-gnu -x c -emit-llvm %s -o - | FileCheck %s
+// RUN: %clang_cc1 -O -fsanitize=alloc-token -fsanitize-alloc-token-extended -falloc-token-max=2147483647 -triple x86_64-linux-gnu -x c -emit-llvm %s -o - | FileCheck %s
+
+typedef __typeof(sizeof(int)) size_t;
+typedef size_t gfp_t;
+
+void *custom_malloc(size_t size) __attribute__((malloc));
+void *__kmalloc(size_t size, gfp_t flags) __attribute__((alloc_size(1)));
+
+void *sink;
+
+// CHECK-LABEL: @test_nonlibcall_alloc(
+void test_nonlibcall_alloc() {
+  // CHECK: call{{.*}} ptr @__alloc_token_custom_malloc(i64 noundef 4, i64 {{[1-9][0-9]*}})
+  sink = custom_malloc(sizeof(int));
+  // CHECK: call{{.*}} ptr @__alloc_token_kmalloc(i64 noundef 4, i64 noundef 0, i64 {{[1-9][0-9]*}})
+  sink = __kmalloc(sizeof(int), 0);
+}
diff --git a/clang/test/CodeGen/alloc-token.c b/clang/test/CodeGen/alloc-token.c
index de9b3f48c995f..b75adc1d2e766 100644
--- a/clang/test/CodeGen/alloc-token.c
+++ b/clang/test/CodeGen/alloc-token.c
@@ -3,43 +3,43 @@
 
 typedef __typeof(sizeof(int)) size_t;
 
-void *aligned_alloc(size_t alignment, size_t size);
-void *malloc(size_t size);
-void *calloc(size_t num, size_t size);
-void *realloc(void *ptr, size_t size);
-void *reallocarray(void *ptr, size_t nmemb, size_t size);
-void *memalign(size_t alignment, size_t size);
-void *valloc(size_t size);
-void *pvalloc(size_t size);
+void *aligned_alloc(size_t alignment, size_t size) __attribute__((malloc));
+void *malloc(size_t size) __attribute__((malloc));
+void *calloc(size_t num, size_t size) __attribute__((malloc));
+void *realloc(void *ptr, size_t size) __attribute__((malloc));
+void *reallocarray(void *ptr, size_t nmemb, size_t size) __attribute__((malloc));
+void *memalign(size_t alignment, size_t size) __attribute__((malloc));
+void *valloc(size_t size) __attribute__((malloc));
+void *pvalloc(size_t size) __attribute__((malloc));
 int posix_memalign(void **memptr, size_t alignment, size_t size);
 
 void *sink;
 
 // CHECK-LABEL: @test_malloc_like(
 void test_malloc_like() {
-  // FIXME: Should not be token ID 0! Currently fail to infer the type.
-  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4, i64 0)
+  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
   sink = malloc(sizeof(int));
-  // CHECK: call{{.*}} ptr @__alloc_token_calloc(i64 noundef 3, i64 noundef 4, i64 0)
+  // CHECK: call{{.*}} ptr @__alloc_token_calloc(i64 noundef 3, i64 noundef 4, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
   sink = calloc(3, sizeof(int));
-  // CHECK: call{{.*}} ptr @__alloc_token_realloc(ptr noundef {{[^,]*}}, i64 noundef 8, i64 0)
+  // CHECK: call{{.*}} ptr @__alloc_token_realloc(ptr noundef {{[^,]*}}, i64 noundef 8, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
   sink = realloc(sink, sizeof(long));
-  // CHECK: call{{.*}} ptr @__alloc_token_reallocarray(ptr noundef {{[^,]*}}, i64 noundef 5, i64 noundef 8, i64 0)
+  // CHECK: call{{.*}} ptr @__alloc_token_reallocarray(ptr noundef {{[^,]*}}, i64 noundef 5, i64 noundef 8, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
   sink = reallocarray(sink, 5, sizeof(long));
+  // CHECK: call{{.*}} align 128{{.*}} ptr @__alloc_token_aligned_alloc(i64 noundef 128, i64 noundef 4, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
+  sink = aligned_alloc(128, sizeof(int));
+  // CHECK: call{{.*}} align 16{{.*}} ptr @__alloc_token_memalign(i64 noundef 16, i64 noundef 4, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
+  sink = memalign(16, sizeof(int));
+  // CHECK: call{{.*}} ptr @__alloc_token_valloc(i64 noundef 4, i64 {{[1-9][0-9]*}}), !alloc_token_hint
+  sink = valloc(sizeof(int));
+  // CHECK: call{{.*}} ptr @__alloc_token_pvalloc(i64 noundef 4, i64 {{[1-9][0-9]*}}), !alloc_token_hint
+  sink = pvalloc(sizeof(int));
+  // FIXME: Should not be token ID 0!
   // CHECK: call{{.*}} i32 @__alloc_token_posix_memalign(ptr noundef {{[^,]*}}, i64 noundef 64, i64 noundef 4, i64 0)
   posix_memalign(&sink, 64, sizeof(int));
-  // CHECK: call align 128{{.*}} ptr @__alloc_token_aligned_alloc(i64 noundef 128, i64 noundef 1024, i64 0)
-  sink = aligned_alloc(128, 1024);
-  // CHECK: call align 16{{.*}} ptr @__alloc_token_memalign(i64 noundef 16, i64 noundef 256, i64 0)
-  sink = memalign(16, 256);
-  // CHECK: call{{.*}} ptr @__alloc_token_valloc(i64 noundef 4096, i64 0)
-  sink = valloc(4096);
-  // CHECK: call{{.*}} ptr @__alloc_token_pvalloc(i64 noundef 8192, i64 0)
-  sink = pvalloc(8192);
 }
 
 // CHECK-LABEL: @no_sanitize_malloc(
 void *no_sanitize_malloc(size_t size) __attribute__((no_sanitize("alloc-token"))) {
-  // CHECK: call ptr @malloc(
+  // CHECK: call{{.*}} ptr @malloc(
   return malloc(size);
 }
diff --git a/clang/test/CodeGenCXX/alloc-token-pointer.cpp b/clang/test/CodeGenCXX/alloc-token-pointer.cpp
index 7adb75c7afebb..cceb305fc09e9 100644
--- a/clang/test/CodeGenCXX/alloc-token-pointer.cpp
+++ b/clang/test/CodeGenCXX/alloc-token-pointer.cpp
@@ -9,9 +9,11 @@
 typedef __UINTPTR_TYPE__ uintptr_t;
 
 extern "C" {
-void *malloc(size_t size);
+void *malloc(size_t size) __attribute__((malloc));
 }
 
+void *sink; // prevent optimizations from removing the calls
+
 // CHECK-LABEL: @_Z15test_malloc_intv(
 void *test_malloc_int() {
   // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4, i64 0)
@@ -22,8 +24,7 @@ void *test_malloc_int() {
 
 // CHECK-LABEL: @_Z15test_malloc_ptrv(
 int **test_malloc_ptr() {
-  // FIXME: This should not be token ID 0!
-  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 8, i64 0)
+  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 8, i64 1)
   int **a = (int **)malloc(sizeof(int*));
   *a = nullptr;
   return a;
@@ -59,50 +60,64 @@ struct ContainsPtr {
 };
 
 // CHECK-LABEL: @_Z27test_malloc_struct_with_ptrv(
-ContainsPtr *test_malloc_struct_with_ptr() {
-  // FIXME: This should not be token ID 0!
-  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 16, i64 0)
-  ContainsPtr *c = (ContainsPtr *)malloc(sizeof(ContainsPtr));
-  return c;
+void *test_malloc_struct_with_ptr() {
+  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
+  return malloc(sizeof(ContainsPtr));
 }
 
 // CHECK-LABEL: @_Z33test_malloc_struct_array_with_ptrv(
-ContainsPtr *test_malloc_struct_array_with_ptr() {
-  // FIXME: This should not be token ID 0!
-  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 160, i64 0)
-  ContainsPtr *c = (ContainsPtr *)malloc(10 * sizeof(ContainsPtr));
-  return c;
+void *test_malloc_struct_array_with_ptr() {
+  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
+  return malloc(10 * sizeof(ContainsPtr));
+}
+
+// CHECK-LABEL: @_Z31test_malloc_with_ptr_sizeof_vari(
+void *test_malloc_with_ptr_sizeof_var(int x) {
+  unsigned long size = sizeof(ContainsPtr);
+  size *= x;
+  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef %{{.*}}, i64 1){{.*}} !alloc_token_hint
+  return malloc(size);
+}
+
+// CHECK-LABEL: @_Z29test_malloc_with_ptr_castonlyv(
+ContainsPtr *test_malloc_with_ptr_castonly() {
+  // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4096, i64 1){{.*}} !alloc_token_hint
+  return (ContainsPtr *)malloc(4096);
 }
 
 // CHECK-LABEL: @_Z32test_operatornew_struct_with_ptrv(
 ContainsPtr *test_operatornew_struct_with_ptr() {
-  // FIXME: This should not be token ID 0!
-  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 0)
+  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
   ContainsPtr *c = (ContainsPtr *)__builtin_operator_new(sizeof(ContainsPtr));
+  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
+  sink = ::operator new(sizeof(ContainsPtr));
   return c;
 }
 
 // CHECK-LABEL: @_Z38test_operatornew_struct_array_with_ptrv(
 ContainsPtr *test_operatornew_struct_array_with_ptr() {
-  // FIXME: This should not be token ID 0!
-  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 0)
+  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
   ContainsPtr *c = (ContainsPtr *)__builtin_operator_new(10 * sizeof(ContainsPtr));
+  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
+  sink = ::operator new(10 * sizeof(ContainsPtr));
   return c;
 }
 
 // CHECK-LABEL: @_Z33test_operatornew_struct_with_ptr2v(
 ContainsPtr *test_operatornew_struct_with_ptr2() {
-  // FIXME: This should not be token ID 0!
-  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 0)
+  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
   ContainsPtr *c = (ContainsPtr *)__builtin_operator_new(sizeof(*c));
+  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
+  sink = ::operator new(sizeof(*c));
   return c;
 }
 
 // CHECK-LABEL: @_Z39test_operatornew_struct_array_with_ptr2v(
 ContainsPtr *test_operatornew_struct_array_with_ptr2() {
-  // FIXME: This should not be token ID 0!
-  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 0)
+  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
   ContainsPtr *c = (ContainsPtr *)__builtin_operator_new(10 * sizeof(*c));
+  // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
+  sink = ::operator new(10 * sizeof(*c));
   return c;
 }
 
diff --git a/clang/test/CodeGenCXX/alloc-token.cpp b/clang/test/CodeGenCXX/alloc-token.cpp
index 180c771e43ae9..66d9352510783 100644
--- a/clang/test/CodeGenCXX/alloc-token.cpp
+++ b/clang/test/CodeGenCXX/alloc-token.cpp
@@ -3,14 +3,14 @@
 
 #include "../Analysis/Inputs/system-header-simulator-cxx.h"
 extern "C" {
-void *aligned_alloc(size_t alignment, size_t size);
-void *malloc(size_t size);
-void *calloc(size_...
[truncated]
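
The source idioms that the inference in `EmitAllocTokenHint(CB, E)` targets can be summarized in a short sketch. It is illustrative only and mirrors the test cases in the diff above: `Node` is a hypothetical stand-in type, and whether a hint is actually attached depends on the callee carrying `__attribute__((malloc))` or `__attribute__((alloc_size(..)))` and on building with `-fsanitize=alloc-token`.

```cpp
#include <cassert>  // assert(), for callers that sanity-check the results
#include <cstddef>
#include <cstdlib>

// Hypothetical type used only for illustration.
struct Node {
  Node *next;
  long value;
};

// (a) Lone sizeof argument: the type comes from the sizeof operand.
void *alloc_sizeof() { return std::malloc(sizeof(Node)); }

// (a) sizeof inside simple arithmetic (add, sub, mul, div, shifts).
void *alloc_array(std::size_t n) { return std::malloc(n * sizeof(Node)); }

// (a) Size variable whose initializer contains a sizeof. The heuristic looks
// only at the initializer and ignores the later `size *= n`.
void *alloc_via_var(std::size_t n) {
  std::size_t size = sizeof(Node);
  size *= n;
  return std::malloc(size);
}

// (b) No sizeof at all: the type comes from the cast applied to the result.
Node *alloc_cast_only() { return (Node *)std::malloc(4096); }
```

In all four functions the expected inferred type is `Node`. A call such as `std::malloc(4096)` whose result is returned as `void *` offers neither a sizeof nor a cast, so no type can be inferred for it.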

melver added 2 commits October 7, 2025 12:56
Created using spr 1.3.8-beta.1

[skip ci]
Created using spr 1.3.8-beta.1
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 7, 2025
…alloc_token metadata (#160131)

In preparation for adding the "AllocToken" pass, add the prerequisite
`sanitize_alloc_token` function attribute and `alloc_token` metadata.

---

This change is part of the following series:
  1. llvm/llvm-project#160131
  2. llvm/llvm-project#156838
  3. llvm/llvm-project#162098
  4. llvm/llvm-project#162099
  5. llvm/llvm-project#156839
  6. llvm/llvm-project#156840
  7. llvm/llvm-project#156841
  8. llvm/llvm-project#156842
melver added a commit that referenced this pull request Oct 7, 2025
Introduce `AllocToken`, an instrumentation pass designed to provide
tokens to memory allocators enabling various heap organization
strategies, such as heap partitioning.

Initially, the pass instruments functions marked with a new attribute
`sanitize_alloc_token` by rewriting allocation calls to include a token
ID, appended as a function argument with the default ABI.

The design aims to provide a flexible framework for implementing
different token generation schemes. It currently supports the following
token modes:

- TypeHash (default): token IDs based on a hash of the allocated type
- Random: statically-assigned pseudo-random token IDs
- Increment: incrementing token IDs per TU
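
As a rough mental model of the `TypeHash` mode, a token ID can be derived by hashing a string that names the allocated type and reducing the hash to the configured number of tokens. The sketch below rests on assumptions: `fnv1a` and `typeHashToken` are hypothetical names, and FNV-1a stands in for whatever hash the pass actually uses.

```cpp
#include <cstdint>
#include <string_view>

// 64-bit FNV-1a. Assumption: chosen only because it is short and well known;
// it is not claimed to be the hash used by the AllocToken pass.
constexpr std::uint64_t fnv1a(std::string_view s) {
  std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
  for (char c : s) {
    h ^= static_cast<unsigned char>(c);
    h *= 1099511628211ull;  // FNV prime
  }
  return h;
}

// Map a type name to a token ID in [0, maxTokens).
// Precondition: maxTokens > 0.
constexpr std::uint64_t typeHashToken(std::string_view typeName,
                                      std::uint64_t maxTokens) {
  return fnv1a(typeName) % maxTokens;
}
```

The useful property is determinism: the same type name always maps to the same token, so allocations of one type land in the same partition across translation units.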

For the `TypeHash` mode, introduce support for `!alloc_token` metadata:
the metadata can be attached to allocation calls to provide richer
semantic information to be consumed by the AllocToken pass. Optimization
remarks can be enabled to show where no metadata was available.

An alternative "fast ABI" is provided, where instead of passing the
token ID as an argument (e.g., `__alloc_token_malloc(size, id)`), the
token ID is directly encoded into the name of the called function (e.g.,
`__alloc_token_0_malloc(size)`). Where the maximum number of tokens is
small, this offers more efficient instrumentation by avoiding the overhead
of passing an additional argument at each allocation site.
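
To make the fast ABI concrete, here is a minimal sketch of what a compatible runtime might export when the maximum number of tokens is 2. The entry-point naming follows the `__alloc_token_0_malloc(size)` example above, and the `_1_` variant is extrapolated from it. `partitioned_malloc` and `g_alloc_count` are hypothetical helpers, and forwarding to `std::malloc` stands in for a real partitioned heap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical bookkeeping: number of allocations seen per partition.
static std::size_t g_alloc_count[2] = {0, 0};

// Hypothetical helper: a real allocator would pick a separate heap region
// per partition; this sketch only counts and forwards to malloc.
static void *partitioned_malloc(std::size_t partition, std::size_t size) {
  assert(partition < 2);
  ++g_alloc_count[partition];
  return std::malloc(size);
}

// Fast ABI: the token ID is part of the symbol name, so no extra argument
// is passed at the call site.
extern "C" void *__alloc_token_0_malloc(std::size_t size) {
  return partitioned_malloc(0, size);
}
extern "C" void *__alloc_token_1_malloc(std::size_t size) {
  return partitioned_malloc(1, size);
}
```

The trade-off is one exported symbol per token and allocation function, which is why this form suits small token counts.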

Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 [1]

---

This change is part of the following series:
  1. #160131
  2. #156838
  3. #162098
  4. #162099
  5. #156839
  6. #156840
  7. #156841
  8. #156842
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 7, 2025
…56838)

melver added 2 commits October 7, 2025 14:56
Created using spr 1.3.8-beta.1

[skip ci]
Created using spr 1.3.8-beta.1
melver added a commit that referenced this pull request Oct 7, 2025
Introduce the "alloc-token" sanitizer kind, in preparation for wiring it
up. Currently this is a no-op, and any attempt to enable it will result
in failure:

clang: error: unsupported option '-fsanitize=alloc-token' for target
'x86_64-unknown-linux-gnu'

In this step we can already wire up the `sanitize_alloc_token` IR
attribute where the instrumentation is enabled. Subsequent changes will
complete wiring up the AllocToken pass.

---

This change is part of the following series:
  1. #160131
  2. #156838
  3. #162098
  4. #162099
  5. #156839
  6. #156840
  7. #156841
  8. #156842
melver added 2 commits October 7, 2025 20:25
Created using spr 1.3.8-beta.1

[skip ci]
Created using spr 1.3.8-beta.1
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 7, 2025
…162098)

melver added a commit that referenced this pull request Oct 7, 2025
For new expressions, the allocated type is syntactically known and we
can trivially emit the !alloc_token metadata. A subsequent change will
wire up the AllocToken pass and introduce appropriate tests.

---

This change is part of the following series:
  1. #160131
  2. #156838
  3. #162098
  4. #162099
  5. #156839
  6. #156840
  7. #156841
  8. #156842
melver added 2 commits October 7, 2025 20:58
Created using spr 1.3.8-beta.1

[skip ci]
Created using spr 1.3.8-beta.1
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 7, 2025
…62099)

melver added 2 commits October 7, 2025 21:19
Created using spr 1.3.8-beta.1

[skip ci]
Created using spr 1.3.8-beta.1
melver added a commit that referenced this pull request Oct 8, 2025
[ Reland after 7815df1 ("[Clang] Fix brittle print-header-json.c test") ]

Introduce the "alloc-token" sanitizer kind, in preparation for wiring it
up. Currently this is a no-op, and any attempt to enable it will result
in failure:

clang: error: unsupported option '-fsanitize=alloc-token' for target
'x86_64-unknown-linux-gnu'

In this step we can already wire up the `sanitize_alloc_token` IR
attribute where the instrumentation is enabled. Subsequent changes will
complete wiring up the AllocToken pass.

---

This change is part of the following series:
  1. #160131
  2. #156838
  3. #162098
  4. #162099
  5. #156839
  6. #156840
  7. #156841
  8. #156842
melver added a commit that referenced this pull request Oct 8, 2025
[ Reland after 7815df1 ("[Clang] Fix brittle print-header-json.c test") ]

For new expressions, the allocated type is syntactically known and we
can trivially emit the !alloc_token metadata. A subsequent change will
wire up the AllocToken pass and introduce appropriate tests.

---

This change is part of the following series:
  1. #160131
  2. #156838
  3. #162098
  4. #162099
  5. #156839
  6. #156840
  7. #156841
  8. #156842
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 8, 2025
…162098)

llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 8, 2025
…62099)

melver added 2 commits October 8, 2025 19:17
Created using spr 1.3.8-beta.1

[skip ci]
Created using spr 1.3.8-beta.1
melver added a commit that referenced this pull request Oct 8, 2025
Wire up the `-fsanitize=alloc-token` command-line option, hooking up
the `AllocToken` pass -- it provides allocation tokens to compatible
runtime allocators, enabling different heap organization strategies,
e.g. hardening schemes based on heap partitioning.

The instrumentation rewrites standard allocation calls into variants
that accept an additional `size_t token_id` argument. For example,
calls to `malloc(size)` become `__alloc_token_malloc(size, token_id)`,
and a C++ `new MyType` expression will call
`__alloc_token__Znwm(size, token_id)`.
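
On the allocator side, the rewritten `malloc` call could be served by an entry point like the one sketched below. Only the name `__alloc_token_malloc` and its `(size, token_id)` signature come from the description above. `kNumPartitions` and `g_partition_allocs` are hypothetical, and forwarding to `std::malloc` is a placeholder for a real partitioned heap.

```cpp
#include <cassert>  // assert(), for callers that sanity-check the results
#include <cstddef>
#include <cstdlib>

// Hypothetical number of heap partitions maintained by the runtime.
constexpr std::size_t kNumPartitions = 4;

// Hypothetical bookkeeping: allocations observed per partition.
static std::size_t g_partition_allocs[kNumPartitions] = {};

// Default ABI: the instrumentation appends the token ID as the last argument.
extern "C" void *__alloc_token_malloc(std::size_t size, std::size_t token_id) {
  // Fold the token into the available partitions, then allocate. A real
  // allocator would select a distinct heap region here.
  ++g_partition_allocs[token_id % kNumPartitions];
  return std::malloc(size);
}
```

Conceptually, an instrumented call site `p = malloc(sizeof(T))` then behaves like `p = __alloc_token_malloc(sizeof(T), token_id)`, with `token_id` chosen by the compiler.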

Untyped allocation calls do not yet have `!alloc_token` metadata and
therefore receive only the fallback token. Subsequent changes will fix
this through best-effort type inference.

One benefit of the instrumentation approach is that it can be applied
transparently to large codebases, and it scales in deployment like other
sanitizers.

As with other sanitizers, instrumentation can be disabled selectively
using `__attribute__((no_sanitize("alloc-token")))`. Sanitizer ignorelists
are also supported, to disable instrumentation for specific functions or
source files.

See clang/docs/AllocToken.rst for more usage instructions.

Link:
https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434

---

This change is part of the following series:
  1. #160131
  2. #156838
  3. #162098
  4. #162099
  5. #156839
  6. #156840
  7. #156841
  8. #156842
melver added 2 commits October 8, 2025 21:06
Created using spr 1.3.8-beta.1

[skip ci]
Created using spr 1.3.8-beta.1
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 8, 2025
melver added a commit that referenced this pull request Oct 8, 2025
Implement the TypeHashPointerSplit mode: this mode assigns a token ID
based on the hash of the allocated type's name, where the top half of the
ID space is reserved for types that contain pointers and the bottom half
for types that do not contain pointers.

This mode with max tokens of 2 (`-falloc-token-max=2`) may also
be valuable for heap hardening strategies that simply separate pointer
types from non-pointer types.
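
The split can be illustrated with a small sketch. It assumes a simple layout in which the bottom half of `[0, maxTokens)` serves pointer-free types and the top half serves pointer-containing types. `fnv1a` and `pointerSplitToken` are hypothetical names, and FNV-1a stands in for the actual hash.

```cpp
#include <cstdint>
#include <string_view>

// 64-bit FNV-1a, used here only as a stand-in hash (assumption).
constexpr std::uint64_t fnv1a(std::string_view s) {
  std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
  for (char c : s) {
    h ^= static_cast<unsigned char>(c);
    h *= 1099511628211ull;  // FNV prime
  }
  return h;
}

// Pointer-free types map into [0, half); pointer-containing types map into
// [half, maxTokens). With fewer than two tokens there is nothing to split.
constexpr std::uint64_t pointerSplitToken(std::string_view typeName,
                                          bool containsPointer,
                                          std::uint64_t maxTokens) {
  const std::uint64_t half = maxTokens / 2;
  if (half == 0)
    return 0;
  const std::uint64_t h = fnv1a(typeName);
  return containsPointer ? half + h % (maxTokens - half) : h % half;
}
```

With `maxTokens == 2` the hash no longer matters: every pointer-containing type gets token 1 and every other type gets token 0, which is the pointer/non-pointer separation described above.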

Make it the new default mode.

Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434

---

This change is part of the following series:
  1. #160131
  2. #156838
  3. #162098
  4. #162099
  5. #156839
  6. #156840
  7. #156841
  8. #156842
@melver melver changed the base branch from users/melver/spr/main.alloctoken-clang-infer-type-hints-from-sizeof-expressions-and-casts to main October 8, 2025 19:59
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 8, 2025
…156840)

Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category