-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[AllocToken, Clang] Infer type hints from sizeof expressions and casts #156841
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: users/melver/spr/main.alloctoken-clang-infer-type-hints-from-sizeof-expressions-and-casts
Are you sure you want to change the base?
Conversation
Created using spr 1.3.8-beta.1
Created using spr 1.3.8-beta.1
Created using spr 1.3.8-beta.1
@llvm/pr-subscribers-clang Author: Marco Elver (melver) ChangesFor the AllocToken pass to accurately calculate token ID hints, we Unlike new expressions, untyped allocation calls (like When encountering allocation calls (with Note that non-standard allocation functions with these attributes are Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 This change is part of the following series:
Patch is 25.74 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/156841.diff 9 Files Affected:
diff --git a/clang/docs/AllocToken.rst b/clang/docs/AllocToken.rst
index fb354d6738ea3..7ad5e5f03d8a0 100644
--- a/clang/docs/AllocToken.rst
+++ b/clang/docs/AllocToken.rst
@@ -122,6 +122,35 @@ which encodes the token ID hint in the allocation function name.
This ABI provides a more efficient alternative where
``-falloc-token-max`` is small.
+Instrumenting Non-Standard Allocation Functions
+-----------------------------------------------
+
+By default, AllocToken only instruments standard library allocation functions.
+This simplifies adoption, as a compatible allocator only needs to provide
+token-enabled variants for a well-defined set of standard functions.
+
+To extend instrumentation to custom allocation functions, enable broader
+coverage with ``-fsanitize-alloc-token-extended``. Such functions require being
+marked with the `malloc
+<https://clang.llvm.org/docs/AttributeReference.html#malloc>`_ or `alloc_size
+<https://clang.llvm.org/docs/AttributeReference.html#alloc-size>`_ attributes
+(or a combination).
+
+For example:
+
+.. code-block:: c
+
+ void *custom_malloc(size_t size) __attribute__((malloc));
+ void *my_malloc(size_t size) __attribute__((alloc_size(1)));
+
+ // Original:
+ ptr1 = custom_malloc(size);
+ ptr2 = my_malloc(size);
+
+ // Instrumented:
+ ptr1 = __alloc_token_custom_malloc(size, token_id);
+ ptr2 = __alloc_token_my_malloc(size, token_id);
+
Disabling Instrumentation
-------------------------
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index e7a0e7696e204..dc428f04e873a 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -30,6 +30,7 @@
#include "clang/AST/Attr.h"
#include "clang/AST/DeclObjC.h"
#include "clang/AST/NSAPI.h"
+#include "clang/AST/ParentMapContext.h"
#include "clang/AST/StmtVisitor.h"
#include "clang/Basic/Builtins.h"
#include "clang/Basic/CodeGenOptions.h"
@@ -1349,6 +1350,98 @@ void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase *CB,
CB->setMetadata(llvm::LLVMContext::MD_alloc_token_hint, MDN);
}
+/// Infer type from a simple sizeof expression.
+static QualType inferTypeFromSizeofExpr(const Expr *E) {
+ const Expr *Arg = E->IgnoreParenImpCasts();
+ if (const auto *UET = dyn_cast<UnaryExprOrTypeTraitExpr>(Arg)) {
+ if (UET->getKind() == UETT_SizeOf) {
+ if (UET->isArgumentType()) {
+ return UET->getArgumentTypeInfo()->getType();
+ } else {
+ return UET->getArgumentExpr()->getType();
+ }
+ }
+ }
+ return QualType();
+}
+
+/// Infer type from an arithmetic expression involving a sizeof.
+static QualType inferTypeFromArithSizeofExpr(const Expr *E) {
+ const Expr *Arg = E->IgnoreParenImpCasts();
+ // The argument is a lone sizeof expression.
+ QualType QT = inferTypeFromSizeofExpr(Arg);
+ if (!QT.isNull())
+ return QT;
+ if (const auto *BO = dyn_cast<BinaryOperator>(Arg)) {
+ // Argument is an arithmetic expression. Cover common arithmetic patterns
+ // involving sizeof.
+ switch (BO->getOpcode()) {
+ case BO_Add:
+ case BO_Div:
+ case BO_Mul:
+ case BO_Shl:
+ case BO_Shr:
+ case BO_Sub:
+ QT = inferTypeFromArithSizeofExpr(BO->getLHS());
+ if (!QT.isNull())
+ return QT;
+ QT = inferTypeFromArithSizeofExpr(BO->getRHS());
+ if (!QT.isNull())
+ return QT;
+ break;
+ default:
+ break;
+ }
+ }
+ return QualType();
+}
+
+/// If the expression E is a reference to a variable, infer the type from a
+/// variable's initializer if it contains a sizeof. Beware, this is a heuristic
+/// and ignores if a variable is later reassigned.
+static QualType inferTypeFromVarInitSizeofExpr(const Expr *E) {
+ const Expr *Arg = E->IgnoreParenImpCasts();
+ if (const auto *DRE = dyn_cast<DeclRefExpr>(Arg)) {
+ if (const auto *VD = dyn_cast<VarDecl>(DRE->getDecl())) {
+ if (const Expr *Init = VD->getInit()) {
+ return inferTypeFromArithSizeofExpr(Init);
+ }
+ }
+ }
+ return QualType();
+}
+
+/// Deduces the allocated type by checking if the allocation call's result
+/// is immediately used in a cast expression.
+static QualType inferTypeFromCastExpr(const CallExpr *CallE,
+ const CastExpr *CastE) {
+ if (!CastE)
+ return QualType();
+ QualType PtrType = CastE->getType();
+ if (PtrType->isPointerType())
+ return PtrType->getPointeeType();
+ return QualType();
+}
+
+void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase *CB,
+ const CallExpr *E) {
+ QualType AllocType;
+ // First check arguments.
+ for (const Expr *Arg : E->arguments()) {
+ AllocType = inferTypeFromArithSizeofExpr(Arg);
+ if (AllocType.isNull())
+ AllocType = inferTypeFromVarInitSizeofExpr(Arg);
+ if (!AllocType.isNull())
+ break;
+ }
+ // Then check later casts.
+ if (AllocType.isNull())
+ AllocType = inferTypeFromCastExpr(E, CurCast);
+ // Emit if we were able to infer the type.
+ if (!AllocType.isNull())
+ EmitAllocTokenHint(CB, AllocType);
+}
+
CodeGenFunction::ComplexPairTy CodeGenFunction::
EmitComplexPrePostIncDec(const UnaryOperator *E, LValue LV,
bool isInc, bool isPre) {
@@ -5720,6 +5813,9 @@ LValue CodeGenFunction::EmitConditionalOperatorLValue(
/// are permitted with aggregate result, including noop aggregate casts, and
/// cast from scalar to union.
LValue CodeGenFunction::EmitCastLValue(const CastExpr *E) {
+ auto RestoreCurCast =
+ llvm::make_scope_exit([this, Prev = CurCast] { CurCast = Prev; });
+ CurCast = E;
switch (E->getCastKind()) {
case CK_ToVoid:
case CK_BitCast:
@@ -6668,16 +6764,24 @@ RValue CodeGenFunction::EmitCall(QualType CalleeType,
RValue Call = EmitCall(FnInfo, Callee, ReturnValue, Args, &LocalCallOrInvoke,
E == MustTailCall, E->getExprLoc());
- // Generate function declaration DISuprogram in order to be used
- // in debug info about call sites.
- if (CGDebugInfo *DI = getDebugInfo()) {
- if (auto *CalleeDecl = dyn_cast_or_null<FunctionDecl>(TargetDecl)) {
+ if (auto *CalleeDecl = dyn_cast_or_null<FunctionDecl>(TargetDecl)) {
+ // Generate function declaration DISuprogram in order to be used
+ // in debug info about call sites.
+ if (CGDebugInfo *DI = getDebugInfo()) {
FunctionArgList Args;
QualType ResTy = BuildFunctionArgList(CalleeDecl, Args);
DI->EmitFuncDeclForCallSite(LocalCallOrInvoke,
DI->getFunctionType(CalleeDecl, ResTy, Args),
CalleeDecl);
}
+ if (CalleeDecl->hasAttr<RestrictAttr>() ||
+ CalleeDecl->hasAttr<AllocSizeAttr>()) {
+ // Function has malloc or alloc_size attribute.
+ if (SanOpts.has(SanitizerKind::AllocToken)) {
+ // Set !alloc_token_hint metadata.
+ EmitAllocTokenHint(LocalCallOrInvoke, E);
+ }
+ }
}
if (CallOrInvoke)
*CallOrInvoke = LocalCallOrInvoke;
diff --git a/clang/lib/CodeGen/CGExprCXX.cpp b/clang/lib/CodeGen/CGExprCXX.cpp
index 6bf3332b425fa..85ced63d5036b 100644
--- a/clang/lib/CodeGen/CGExprCXX.cpp
+++ b/clang/lib/CodeGen/CGExprCXX.cpp
@@ -1371,8 +1371,16 @@ RValue CodeGenFunction::EmitBuiltinNewDeleteCall(const FunctionProtoType *Type,
for (auto *Decl : Ctx.getTranslationUnitDecl()->lookup(Name))
if (auto *FD = dyn_cast<FunctionDecl>(Decl))
- if (Ctx.hasSameType(FD->getType(), QualType(Type, 0)))
- return EmitNewDeleteCall(*this, FD, Type, Args);
+ if (Ctx.hasSameType(FD->getType(), QualType(Type, 0))) {
+ RValue RV = EmitNewDeleteCall(*this, FD, Type, Args);
+ if (auto *CB = dyn_cast_if_present<llvm::CallBase>(RV.getScalarVal())) {
+ if (SanOpts.has(SanitizerKind::AllocToken)) {
+ // Set !alloc_token_hint metadata.
+ EmitAllocTokenHint(CB, TheCall);
+ }
+ }
+ return RV;
+ }
llvm_unreachable("predeclared global operator new/delete is missing");
}
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 2eff3a387593c..3318c8a52597e 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -33,6 +33,7 @@
#include "clang/Basic/DiagnosticTrap.h"
#include "clang/Basic/TargetInfo.h"
#include "llvm/ADT/APFixedPoint.h"
+#include "llvm/ADT/ScopeExit.h"
#include "llvm/IR/Argument.h"
#include "llvm/IR/CFG.h"
#include "llvm/IR/Constants.h"
@@ -2431,6 +2432,10 @@ static Value *EmitHLSLElementwiseCast(CodeGenFunction &CGF, Address RHSVal,
// have to handle a more broad range of conversions than explicit casts, as they
// handle things like function to ptr-to-function decay etc.
Value *ScalarExprEmitter::VisitCastExpr(CastExpr *CE) {
+ auto RestoreCurCast =
+ llvm::make_scope_exit([this, Prev = CGF.CurCast] { CGF.CurCast = Prev; });
+ CGF.CurCast = CE;
+
Expr *E = CE->getSubExpr();
QualType DestTy = CE->getType();
CastKind Kind = CE->getCastKind();
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index fd7ec36183c2d..8e89838531d35 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -346,6 +346,10 @@ class CodeGenFunction : public CodeGenTypeCache {
QualType FnRetTy;
llvm::Function *CurFn = nullptr;
+ /// If a cast expression is being visited, this holds the current cast's
+ /// expression.
+ const CastExpr *CurCast = nullptr;
+
/// Save Parameter Decl for coroutine.
llvm::SmallVector<const ParmVarDecl *, 4> FnArgs;
@@ -3350,6 +3354,9 @@ class CodeGenFunction : public CodeGenTypeCache {
/// Emit additional metadata used by the AllocToken instrumentation.
void EmitAllocTokenHint(llvm::CallBase *CB, QualType AllocType);
+ /// Emit additional metadata used by the AllocToken instrumentation,
+ /// inferring the type from an allocation call expression.
+ void EmitAllocTokenHint(llvm::CallBase *CB, const CallExpr *E);
llvm::Value *GetCountedByFieldExprGEP(const Expr *Base, const FieldDecl *FD,
const FieldDecl *CountDecl);
diff --git a/clang/test/CodeGen/alloc-token-nonlibcalls.c b/clang/test/CodeGen/alloc-token-nonlibcalls.c
new file mode 100644
index 0000000000000..53c85a2174c5e
--- /dev/null
+++ b/clang/test/CodeGen/alloc-token-nonlibcalls.c
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 -fsanitize=alloc-token -fsanitize-alloc-token-extended -falloc-token-max=2147483647 -triple x86_64-linux-gnu -x c -emit-llvm %s -o - | FileCheck %s
+// RUN: %clang_cc1 -O -fsanitize=alloc-token -fsanitize-alloc-token-extended -falloc-token-max=2147483647 -triple x86_64-linux-gnu -x c -emit-llvm %s -o - | FileCheck %s
+
+typedef __typeof(sizeof(int)) size_t;
+typedef size_t gfp_t;
+
+void *custom_malloc(size_t size) __attribute__((malloc));
+void *__kmalloc(size_t size, gfp_t flags) __attribute__((alloc_size(1)));
+
+void *sink;
+
+// CHECK-LABEL: @test_nonlibcall_alloc(
+void test_nonlibcall_alloc() {
+ // CHECK: call{{.*}} ptr @__alloc_token_custom_malloc(i64 noundef 4, i64 {{[1-9][0-9]*}})
+ sink = custom_malloc(sizeof(int));
+ // CHECK: call{{.*}} ptr @__alloc_token_kmalloc(i64 noundef 4, i64 noundef 0, i64 {{[1-9][0-9]*}})
+ sink = __kmalloc(sizeof(int), 0);
+}
diff --git a/clang/test/CodeGen/alloc-token.c b/clang/test/CodeGen/alloc-token.c
index de9b3f48c995f..b75adc1d2e766 100644
--- a/clang/test/CodeGen/alloc-token.c
+++ b/clang/test/CodeGen/alloc-token.c
@@ -3,43 +3,43 @@
typedef __typeof(sizeof(int)) size_t;
-void *aligned_alloc(size_t alignment, size_t size);
-void *malloc(size_t size);
-void *calloc(size_t num, size_t size);
-void *realloc(void *ptr, size_t size);
-void *reallocarray(void *ptr, size_t nmemb, size_t size);
-void *memalign(size_t alignment, size_t size);
-void *valloc(size_t size);
-void *pvalloc(size_t size);
+void *aligned_alloc(size_t alignment, size_t size) __attribute__((malloc));
+void *malloc(size_t size) __attribute__((malloc));
+void *calloc(size_t num, size_t size) __attribute__((malloc));
+void *realloc(void *ptr, size_t size) __attribute__((malloc));
+void *reallocarray(void *ptr, size_t nmemb, size_t size) __attribute__((malloc));
+void *memalign(size_t alignment, size_t size) __attribute__((malloc));
+void *valloc(size_t size) __attribute__((malloc));
+void *pvalloc(size_t size) __attribute__((malloc));
int posix_memalign(void **memptr, size_t alignment, size_t size);
void *sink;
// CHECK-LABEL: @test_malloc_like(
void test_malloc_like() {
- // FIXME: Should not be token ID 0! Currently fail to infer the type.
- // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4, i64 0)
+ // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
sink = malloc(sizeof(int));
- // CHECK: call{{.*}} ptr @__alloc_token_calloc(i64 noundef 3, i64 noundef 4, i64 0)
+ // CHECK: call{{.*}} ptr @__alloc_token_calloc(i64 noundef 3, i64 noundef 4, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
sink = calloc(3, sizeof(int));
- // CHECK: call{{.*}} ptr @__alloc_token_realloc(ptr noundef {{[^,]*}}, i64 noundef 8, i64 0)
+ // CHECK: call{{.*}} ptr @__alloc_token_realloc(ptr noundef {{[^,]*}}, i64 noundef 8, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
sink = realloc(sink, sizeof(long));
- // CHECK: call{{.*}} ptr @__alloc_token_reallocarray(ptr noundef {{[^,]*}}, i64 noundef 5, i64 noundef 8, i64 0)
+ // CHECK: call{{.*}} ptr @__alloc_token_reallocarray(ptr noundef {{[^,]*}}, i64 noundef 5, i64 noundef 8, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
sink = reallocarray(sink, 5, sizeof(long));
+ // CHECK: call{{.*}} align 128{{.*}} ptr @__alloc_token_aligned_alloc(i64 noundef 128, i64 noundef 4, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
+ sink = aligned_alloc(128, sizeof(int));
+ // CHECK: call{{.*}} align 16{{.*}} ptr @__alloc_token_memalign(i64 noundef 16, i64 noundef 4, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
+ sink = memalign(16, sizeof(int));
+ // CHECK: call{{.*}} ptr @__alloc_token_valloc(i64 noundef 4, i64 {{[1-9][0-9]*}}), !alloc_token_hint
+ sink = valloc(sizeof(int));
+ // CHECK: call{{.*}} ptr @__alloc_token_pvalloc(i64 noundef 4, i64 {{[1-9][0-9]*}}), !alloc_token_hint
+ sink = pvalloc(sizeof(int));
+ // FIXME: Should not be token ID 0!
// CHECK: call{{.*}} i32 @__alloc_token_posix_memalign(ptr noundef {{[^,]*}}, i64 noundef 64, i64 noundef 4, i64 0)
posix_memalign(&sink, 64, sizeof(int));
- // CHECK: call align 128{{.*}} ptr @__alloc_token_aligned_alloc(i64 noundef 128, i64 noundef 1024, i64 0)
- sink = aligned_alloc(128, 1024);
- // CHECK: call align 16{{.*}} ptr @__alloc_token_memalign(i64 noundef 16, i64 noundef 256, i64 0)
- sink = memalign(16, 256);
- // CHECK: call{{.*}} ptr @__alloc_token_valloc(i64 noundef 4096, i64 0)
- sink = valloc(4096);
- // CHECK: call{{.*}} ptr @__alloc_token_pvalloc(i64 noundef 8192, i64 0)
- sink = pvalloc(8192);
}
// CHECK-LABEL: @no_sanitize_malloc(
void *no_sanitize_malloc(size_t size) __attribute__((no_sanitize("alloc-token"))) {
- // CHECK: call ptr @malloc(
+ // CHECK: call{{.*}} ptr @malloc(
return malloc(size);
}
diff --git a/clang/test/CodeGenCXX/alloc-token-pointer.cpp b/clang/test/CodeGenCXX/alloc-token-pointer.cpp
index 7adb75c7afebb..cceb305fc09e9 100644
--- a/clang/test/CodeGenCXX/alloc-token-pointer.cpp
+++ b/clang/test/CodeGenCXX/alloc-token-pointer.cpp
@@ -9,9 +9,11 @@
typedef __UINTPTR_TYPE__ uintptr_t;
extern "C" {
-void *malloc(size_t size);
+void *malloc(size_t size) __attribute__((malloc));
}
+void *sink; // prevent optimizations from removing the calls
+
// CHECK-LABEL: @_Z15test_malloc_intv(
void *test_malloc_int() {
// CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4, i64 0)
@@ -22,8 +24,7 @@ void *test_malloc_int() {
// CHECK-LABEL: @_Z15test_malloc_ptrv(
int **test_malloc_ptr() {
- // FIXME: This should not be token ID 0!
- // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 8, i64 0)
+ // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 8, i64 1)
int **a = (int **)malloc(sizeof(int*));
*a = nullptr;
return a;
@@ -59,50 +60,64 @@ struct ContainsPtr {
};
// CHECK-LABEL: @_Z27test_malloc_struct_with_ptrv(
-ContainsPtr *test_malloc_struct_with_ptr() {
- // FIXME: This should not be token ID 0!
- // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 16, i64 0)
- ContainsPtr *c = (ContainsPtr *)malloc(sizeof(ContainsPtr));
- return c;
+void *test_malloc_struct_with_ptr() {
+ // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
+ return malloc(sizeof(ContainsPtr));
}
// CHECK-LABEL: @_Z33test_malloc_struct_array_with_ptrv(
-ContainsPtr *test_malloc_struct_array_with_ptr() {
- // FIXME: This should not be token ID 0!
- // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 160, i64 0)
- ContainsPtr *c = (ContainsPtr *)malloc(10 * sizeof(ContainsPtr));
- return c;
+void *test_malloc_struct_array_with_ptr() {
+ // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
+ return malloc(10 * sizeof(ContainsPtr));
+}
+
+// CHECK-LABEL: @_Z31test_malloc_with_ptr_sizeof_vari(
+void *test_malloc_with_ptr_sizeof_var(int x) {
+ unsigned long size = sizeof(ContainsPtr);
+ size *= x;
+ // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef %{{.*}}, i64 1){{.*}} !alloc_token_hint
+ return malloc(size);
+}
+
+// CHECK-LABEL: @_Z29test_malloc_with_ptr_castonlyv(
+ContainsPtr *test_malloc_with_ptr_castonly() {
+ // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4096, i64 1){{.*}} !alloc_token_hint
+ return (ContainsPtr *)malloc(4096);
}
// CHECK-LABEL: @_Z32test_operatornew_struct_with_ptrv(
ContainsPtr *test_operatornew_struct_with_ptr() {
- // FIXME: This should not be token ID 0!
- // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 0)
+ // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
ContainsPtr *c = (ContainsPtr *)__builtin_operator_new(sizeof(ContainsPtr));
+ // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
+ sink = ::operator new(sizeof(ContainsPtr));
return c;
}
// CHECK-LABEL: @_Z38test_operatornew_struct_array_with_ptrv(
ContainsPtr *test_operatornew_struct_array_with_ptr() {
- // FIXME: This should not be token ID 0!
- // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 0)
+ // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
ContainsPtr *c = (ContainsPtr *)__builtin_operator_new(10 * sizeof(ContainsPtr));
+ // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
+ sink = ::operator new(10 * sizeof(ContainsPtr));
return c;
}
// CHECK-LABEL: @_Z33test_operatornew_struct_with_ptr2v(
ContainsPtr *test_operatornew_struct_with_ptr2() {
- // FIXME: This should not be token ID 0!
- // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 0)
+ // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
ContainsPtr *c = (ContainsPtr *)__builtin_operator_new(sizeof(*c));
+ // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
+ sink = ::operator new(sizeof(*c));
return c;
}
// CHECK-LABEL: @_Z39test_operatornew_struct_array_with_ptr2v(
ContainsPtr *test_operatornew_struct_array_with_ptr2() {
- // FIXME: This should not be token ID 0!
- // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 0)
+ // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
ContainsPtr *c = (ContainsPtr *)__builtin_operator_new(10 * sizeof(*c));
+ // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
+ sink = ::operator new(10 * sizeof(*c));
return c;
}
diff --git a/clang/test/CodeGenCXX/alloc-token.cpp b/clang/test/CodeGenCXX/alloc-token.cpp
index 180c771e43ae9..66d9352510783 100644
--- a/clang/test/CodeGenCXX/alloc-token.cpp
+++ b/clang/test/CodeGenCXX/alloc-token.cpp
@@ -3,14 +3,14 @@
#include "../Analysis/Inputs/system-header-simulator-cxx.h"
extern "C" {
-void *aligned_alloc(size_t alignment, size_t size);
-void *malloc(size_t size);
-void *calloc(size_...
[truncated]
|
@llvm/pr-subscribers-clang-codegen Author: Marco Elver (melver) ChangesFor the AllocToken pass to accurately calculate token ID hints, we Unlike new expressions, untyped allocation calls (like When encountering allocation calls (with Note that non-standard allocation functions with these attributes are Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 This change is part of the following series:
Patch is 25.74 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/156841.diff 9 Files Affected:
diff --git a/clang/docs/AllocToken.rst b/clang/docs/AllocToken.rst
index fb354d6738ea3..7ad5e5f03d8a0 100644
--- a/clang/docs/AllocToken.rst
+++ b/clang/docs/AllocToken.rst
@@ -122,6 +122,35 @@ which encodes the token ID hint in the allocation function name.
This ABI provides a more efficient alternative where
``-falloc-token-max`` is small.
+Instrumenting Non-Standard Allocation Functions
+-----------------------------------------------
+
+By default, AllocToken only instruments standard library allocation functions.
+This simplifies adoption, as a compatible allocator only needs to provide
+token-enabled variants for a well-defined set of standard functions.
+
+To extend instrumentation to custom allocation functions, enable broader
+coverage with ``-fsanitize-alloc-token-extended``. Such functions require being
+marked with the `malloc
+<https://clang.llvm.org/docs/AttributeReference.html#malloc>`_ or `alloc_size
+<https://clang.llvm.org/docs/AttributeReference.html#alloc-size>`_ attributes
+(or a combination).
+
+For example:
+
+.. code-block:: c
+
+ void *custom_malloc(size_t size) __attribute__((malloc));
+ void *my_malloc(size_t size) __attribute__((alloc_size(1)));
+
+ // Original:
+ ptr1 = custom_malloc(size);
+ ptr2 = my_malloc(size);
+
+ // Instrumented:
+ ptr1 = __alloc_token_custom_malloc(size, token_id);
+ ptr2 = __alloc_token_my_malloc(size, token_id);
+
Disabling Instrumentation
-------------------------
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index e7a0e7696e204..dc428f04e873a 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -30,6 +30,7 @@
#include "clang/AST/Attr.h"
#include "clang/AST/DeclObjC.h"
#include "clang/AST/NSAPI.h"
+#include "clang/AST/ParentMapContext.h"
#include "clang/AST/StmtVisitor.h"
#include "clang/Basic/Builtins.h"
#include "clang/Basic/CodeGenOptions.h"
@@ -1349,6 +1350,98 @@ void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase *CB,
CB->setMetadata(llvm::LLVMContext::MD_alloc_token_hint, MDN);
}
+/// Infer type from a simple sizeof expression.
+static QualType inferTypeFromSizeofExpr(const Expr *E) {
+ const Expr *Arg = E->IgnoreParenImpCasts();
+ if (const auto *UET = dyn_cast<UnaryExprOrTypeTraitExpr>(Arg)) {
+ if (UET->getKind() == UETT_SizeOf) {
+ if (UET->isArgumentType()) {
+ return UET->getArgumentTypeInfo()->getType();
+ } else {
+ return UET->getArgumentExpr()->getType();
+ }
+ }
+ }
+ return QualType();
+}
+
+/// Infer type from an arithmetic expression involving a sizeof.
+static QualType inferTypeFromArithSizeofExpr(const Expr *E) {
+ const Expr *Arg = E->IgnoreParenImpCasts();
+ // The argument is a lone sizeof expression.
+ QualType QT = inferTypeFromSizeofExpr(Arg);
+ if (!QT.isNull())
+ return QT;
+ if (const auto *BO = dyn_cast<BinaryOperator>(Arg)) {
+ // Argument is an arithmetic expression. Cover common arithmetic patterns
+ // involving sizeof.
+ switch (BO->getOpcode()) {
+ case BO_Add:
+ case BO_Div:
+ case BO_Mul:
+ case BO_Shl:
+ case BO_Shr:
+ case BO_Sub:
+ QT = inferTypeFromArithSizeofExpr(BO->getLHS());
+ if (!QT.isNull())
+ return QT;
+ QT = inferTypeFromArithSizeofExpr(BO->getRHS());
+ if (!QT.isNull())
+ return QT;
+ break;
+ default:
+ break;
+ }
+ }
+ return QualType();
+}
+
+/// If the expression E is a reference to a variable, infer the type from a
+/// variable's initializer if it contains a sizeof. Beware, this is a heuristic
+/// and ignores if a variable is later reassigned.
+static QualType inferTypeFromVarInitSizeofExpr(const Expr *E) {
+ const Expr *Arg = E->IgnoreParenImpCasts();
+ if (const auto *DRE = dyn_cast<DeclRefExpr>(Arg)) {
+ if (const auto *VD = dyn_cast<VarDecl>(DRE->getDecl())) {
+ if (const Expr *Init = VD->getInit()) {
+ return inferTypeFromArithSizeofExpr(Init);
+ }
+ }
+ }
+ return QualType();
+}
+
+/// Deduces the allocated type by checking if the allocation call's result
+/// is immediately used in a cast expression.
+static QualType inferTypeFromCastExpr(const CallExpr *CallE,
+ const CastExpr *CastE) {
+ if (!CastE)
+ return QualType();
+ QualType PtrType = CastE->getType();
+ if (PtrType->isPointerType())
+ return PtrType->getPointeeType();
+ return QualType();
+}
+
+void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase *CB,
+ const CallExpr *E) {
+ QualType AllocType;
+ // First check arguments.
+ for (const Expr *Arg : E->arguments()) {
+ AllocType = inferTypeFromArithSizeofExpr(Arg);
+ if (AllocType.isNull())
+ AllocType = inferTypeFromVarInitSizeofExpr(Arg);
+ if (!AllocType.isNull())
+ break;
+ }
+ // Then check later casts.
+ if (AllocType.isNull())
+ AllocType = inferTypeFromCastExpr(E, CurCast);
+ // Emit if we were able to infer the type.
+ if (!AllocType.isNull())
+ EmitAllocTokenHint(CB, AllocType);
+}
+
CodeGenFunction::ComplexPairTy CodeGenFunction::
EmitComplexPrePostIncDec(const UnaryOperator *E, LValue LV,
bool isInc, bool isPre) {
@@ -5720,6 +5813,9 @@ LValue CodeGenFunction::EmitConditionalOperatorLValue(
/// are permitted with aggregate result, including noop aggregate casts, and
/// cast from scalar to union.
LValue CodeGenFunction::EmitCastLValue(const CastExpr *E) {
+ auto RestoreCurCast =
+ llvm::make_scope_exit([this, Prev = CurCast] { CurCast = Prev; });
+ CurCast = E;
switch (E->getCastKind()) {
case CK_ToVoid:
case CK_BitCast:
@@ -6668,16 +6764,24 @@ RValue CodeGenFunction::EmitCall(QualType CalleeType,
RValue Call = EmitCall(FnInfo, Callee, ReturnValue, Args, &LocalCallOrInvoke,
E == MustTailCall, E->getExprLoc());
- // Generate function declaration DISuprogram in order to be used
- // in debug info about call sites.
- if (CGDebugInfo *DI = getDebugInfo()) {
- if (auto *CalleeDecl = dyn_cast_or_null<FunctionDecl>(TargetDecl)) {
+ if (auto *CalleeDecl = dyn_cast_or_null<FunctionDecl>(TargetDecl)) {
+ // Generate function declaration DISuprogram in order to be used
+ // in debug info about call sites.
+ if (CGDebugInfo *DI = getDebugInfo()) {
FunctionArgList Args;
QualType ResTy = BuildFunctionArgList(CalleeDecl, Args);
DI->EmitFuncDeclForCallSite(LocalCallOrInvoke,
DI->getFunctionType(CalleeDecl, ResTy, Args),
CalleeDecl);
}
+ if (CalleeDecl->hasAttr<RestrictAttr>() ||
+ CalleeDecl->hasAttr<AllocSizeAttr>()) {
+ // Function has malloc or alloc_size attribute.
+ if (SanOpts.has(SanitizerKind::AllocToken)) {
+ // Set !alloc_token_hint metadata.
+ EmitAllocTokenHint(LocalCallOrInvoke, E);
+ }
+ }
}
if (CallOrInvoke)
*CallOrInvoke = LocalCallOrInvoke;
diff --git a/clang/lib/CodeGen/CGExprCXX.cpp b/clang/lib/CodeGen/CGExprCXX.cpp
index 6bf3332b425fa..85ced63d5036b 100644
--- a/clang/lib/CodeGen/CGExprCXX.cpp
+++ b/clang/lib/CodeGen/CGExprCXX.cpp
@@ -1371,8 +1371,16 @@ RValue CodeGenFunction::EmitBuiltinNewDeleteCall(const FunctionProtoType *Type,
for (auto *Decl : Ctx.getTranslationUnitDecl()->lookup(Name))
if (auto *FD = dyn_cast<FunctionDecl>(Decl))
- if (Ctx.hasSameType(FD->getType(), QualType(Type, 0)))
- return EmitNewDeleteCall(*this, FD, Type, Args);
+ if (Ctx.hasSameType(FD->getType(), QualType(Type, 0))) {
+ RValue RV = EmitNewDeleteCall(*this, FD, Type, Args);
+ if (auto *CB = dyn_cast_if_present<llvm::CallBase>(RV.getScalarVal())) {
+ if (SanOpts.has(SanitizerKind::AllocToken)) {
+ // Set !alloc_token_hint metadata.
+ EmitAllocTokenHint(CB, TheCall);
+ }
+ }
+ return RV;
+ }
llvm_unreachable("predeclared global operator new/delete is missing");
}
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 2eff3a387593c..3318c8a52597e 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -33,6 +33,7 @@
#include "clang/Basic/DiagnosticTrap.h"
#include "clang/Basic/TargetInfo.h"
#include "llvm/ADT/APFixedPoint.h"
+#include "llvm/ADT/ScopeExit.h"
#include "llvm/IR/Argument.h"
#include "llvm/IR/CFG.h"
#include "llvm/IR/Constants.h"
@@ -2431,6 +2432,10 @@ static Value *EmitHLSLElementwiseCast(CodeGenFunction &CGF, Address RHSVal,
// have to handle a more broad range of conversions than explicit casts, as they
// handle things like function to ptr-to-function decay etc.
Value *ScalarExprEmitter::VisitCastExpr(CastExpr *CE) {
+ auto RestoreCurCast =
+ llvm::make_scope_exit([this, Prev = CGF.CurCast] { CGF.CurCast = Prev; });
+ CGF.CurCast = CE;
+
Expr *E = CE->getSubExpr();
QualType DestTy = CE->getType();
CastKind Kind = CE->getCastKind();
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index fd7ec36183c2d..8e89838531d35 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -346,6 +346,10 @@ class CodeGenFunction : public CodeGenTypeCache {
QualType FnRetTy;
llvm::Function *CurFn = nullptr;
+ /// If a cast expression is being visited, this holds the current cast's
+ /// expression.
+ const CastExpr *CurCast = nullptr;
+
/// Save Parameter Decl for coroutine.
llvm::SmallVector<const ParmVarDecl *, 4> FnArgs;
@@ -3350,6 +3354,9 @@ class CodeGenFunction : public CodeGenTypeCache {
/// Emit additional metadata used by the AllocToken instrumentation.
void EmitAllocTokenHint(llvm::CallBase *CB, QualType AllocType);
+ /// Emit additional metadata used by the AllocToken instrumentation,
+ /// inferring the type from an allocation call expression.
+ void EmitAllocTokenHint(llvm::CallBase *CB, const CallExpr *E);
llvm::Value *GetCountedByFieldExprGEP(const Expr *Base, const FieldDecl *FD,
const FieldDecl *CountDecl);
diff --git a/clang/test/CodeGen/alloc-token-nonlibcalls.c b/clang/test/CodeGen/alloc-token-nonlibcalls.c
new file mode 100644
index 0000000000000..53c85a2174c5e
--- /dev/null
+++ b/clang/test/CodeGen/alloc-token-nonlibcalls.c
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 -fsanitize=alloc-token -fsanitize-alloc-token-extended -falloc-token-max=2147483647 -triple x86_64-linux-gnu -x c -emit-llvm %s -o - | FileCheck %s
+// RUN: %clang_cc1 -O -fsanitize=alloc-token -fsanitize-alloc-token-extended -falloc-token-max=2147483647 -triple x86_64-linux-gnu -x c -emit-llvm %s -o - | FileCheck %s
+
+typedef __typeof(sizeof(int)) size_t;
+typedef size_t gfp_t;
+
+void *custom_malloc(size_t size) __attribute__((malloc));
+void *__kmalloc(size_t size, gfp_t flags) __attribute__((alloc_size(1)));
+
+void *sink;
+
+// CHECK-LABEL: @test_nonlibcall_alloc(
+void test_nonlibcall_alloc() {
+ // CHECK: call{{.*}} ptr @__alloc_token_custom_malloc(i64 noundef 4, i64 {{[1-9][0-9]*}})
+ sink = custom_malloc(sizeof(int));
+ // CHECK: call{{.*}} ptr @__alloc_token_kmalloc(i64 noundef 4, i64 noundef 0, i64 {{[1-9][0-9]*}})
+ sink = __kmalloc(sizeof(int), 0);
+}
diff --git a/clang/test/CodeGen/alloc-token.c b/clang/test/CodeGen/alloc-token.c
index de9b3f48c995f..b75adc1d2e766 100644
--- a/clang/test/CodeGen/alloc-token.c
+++ b/clang/test/CodeGen/alloc-token.c
@@ -3,43 +3,43 @@
typedef __typeof(sizeof(int)) size_t;
-void *aligned_alloc(size_t alignment, size_t size);
-void *malloc(size_t size);
-void *calloc(size_t num, size_t size);
-void *realloc(void *ptr, size_t size);
-void *reallocarray(void *ptr, size_t nmemb, size_t size);
-void *memalign(size_t alignment, size_t size);
-void *valloc(size_t size);
-void *pvalloc(size_t size);
+void *aligned_alloc(size_t alignment, size_t size) __attribute__((malloc));
+void *malloc(size_t size) __attribute__((malloc));
+void *calloc(size_t num, size_t size) __attribute__((malloc));
+void *realloc(void *ptr, size_t size) __attribute__((malloc));
+void *reallocarray(void *ptr, size_t nmemb, size_t size) __attribute__((malloc));
+void *memalign(size_t alignment, size_t size) __attribute__((malloc));
+void *valloc(size_t size) __attribute__((malloc));
+void *pvalloc(size_t size) __attribute__((malloc));
int posix_memalign(void **memptr, size_t alignment, size_t size);
void *sink;
// CHECK-LABEL: @test_malloc_like(
void test_malloc_like() {
- // FIXME: Should not be token ID 0! Currently fail to infer the type.
- // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4, i64 0)
+ // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
sink = malloc(sizeof(int));
- // CHECK: call{{.*}} ptr @__alloc_token_calloc(i64 noundef 3, i64 noundef 4, i64 0)
+ // CHECK: call{{.*}} ptr @__alloc_token_calloc(i64 noundef 3, i64 noundef 4, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
sink = calloc(3, sizeof(int));
- // CHECK: call{{.*}} ptr @__alloc_token_realloc(ptr noundef {{[^,]*}}, i64 noundef 8, i64 0)
+ // CHECK: call{{.*}} ptr @__alloc_token_realloc(ptr noundef {{[^,]*}}, i64 noundef 8, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
sink = realloc(sink, sizeof(long));
- // CHECK: call{{.*}} ptr @__alloc_token_reallocarray(ptr noundef {{[^,]*}}, i64 noundef 5, i64 noundef 8, i64 0)
+ // CHECK: call{{.*}} ptr @__alloc_token_reallocarray(ptr noundef {{[^,]*}}, i64 noundef 5, i64 noundef 8, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
sink = reallocarray(sink, 5, sizeof(long));
+ // CHECK: call{{.*}} align 128{{.*}} ptr @__alloc_token_aligned_alloc(i64 noundef 128, i64 noundef 4, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
+ sink = aligned_alloc(128, sizeof(int));
+ // CHECK: call{{.*}} align 16{{.*}} ptr @__alloc_token_memalign(i64 noundef 16, i64 noundef 4, i64 {{[1-9][0-9]*}}){{.*}} !alloc_token_hint
+ sink = memalign(16, sizeof(int));
+ // CHECK: call{{.*}} ptr @__alloc_token_valloc(i64 noundef 4, i64 {{[1-9][0-9]*}}), !alloc_token_hint
+ sink = valloc(sizeof(int));
+ // CHECK: call{{.*}} ptr @__alloc_token_pvalloc(i64 noundef 4, i64 {{[1-9][0-9]*}}), !alloc_token_hint
+ sink = pvalloc(sizeof(int));
+ // FIXME: Should not be token ID 0!
// CHECK: call{{.*}} i32 @__alloc_token_posix_memalign(ptr noundef {{[^,]*}}, i64 noundef 64, i64 noundef 4, i64 0)
posix_memalign(&sink, 64, sizeof(int));
- // CHECK: call align 128{{.*}} ptr @__alloc_token_aligned_alloc(i64 noundef 128, i64 noundef 1024, i64 0)
- sink = aligned_alloc(128, 1024);
- // CHECK: call align 16{{.*}} ptr @__alloc_token_memalign(i64 noundef 16, i64 noundef 256, i64 0)
- sink = memalign(16, 256);
- // CHECK: call{{.*}} ptr @__alloc_token_valloc(i64 noundef 4096, i64 0)
- sink = valloc(4096);
- // CHECK: call{{.*}} ptr @__alloc_token_pvalloc(i64 noundef 8192, i64 0)
- sink = pvalloc(8192);
}
// CHECK-LABEL: @no_sanitize_malloc(
void *no_sanitize_malloc(size_t size) __attribute__((no_sanitize("alloc-token"))) {
- // CHECK: call ptr @malloc(
+ // CHECK: call{{.*}} ptr @malloc(
return malloc(size);
}
diff --git a/clang/test/CodeGenCXX/alloc-token-pointer.cpp b/clang/test/CodeGenCXX/alloc-token-pointer.cpp
index 7adb75c7afebb..cceb305fc09e9 100644
--- a/clang/test/CodeGenCXX/alloc-token-pointer.cpp
+++ b/clang/test/CodeGenCXX/alloc-token-pointer.cpp
@@ -9,9 +9,11 @@
typedef __UINTPTR_TYPE__ uintptr_t;
extern "C" {
-void *malloc(size_t size);
+void *malloc(size_t size) __attribute__((malloc));
}
+void *sink; // prevent optimizations from removing the calls
+
// CHECK-LABEL: @_Z15test_malloc_intv(
void *test_malloc_int() {
// CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4, i64 0)
@@ -22,8 +24,7 @@ void *test_malloc_int() {
// CHECK-LABEL: @_Z15test_malloc_ptrv(
int **test_malloc_ptr() {
- // FIXME: This should not be token ID 0!
- // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 8, i64 0)
+ // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 8, i64 1)
int **a = (int **)malloc(sizeof(int*));
*a = nullptr;
return a;
@@ -59,50 +60,64 @@ struct ContainsPtr {
};
// CHECK-LABEL: @_Z27test_malloc_struct_with_ptrv(
-ContainsPtr *test_malloc_struct_with_ptr() {
- // FIXME: This should not be token ID 0!
- // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 16, i64 0)
- ContainsPtr *c = (ContainsPtr *)malloc(sizeof(ContainsPtr));
- return c;
+void *test_malloc_struct_with_ptr() {
+ // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
+ return malloc(sizeof(ContainsPtr));
}
// CHECK-LABEL: @_Z33test_malloc_struct_array_with_ptrv(
-ContainsPtr *test_malloc_struct_array_with_ptr() {
- // FIXME: This should not be token ID 0!
- // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 160, i64 0)
- ContainsPtr *c = (ContainsPtr *)malloc(10 * sizeof(ContainsPtr));
- return c;
+void *test_malloc_struct_array_with_ptr() {
+ // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
+ return malloc(10 * sizeof(ContainsPtr));
+}
+
+// CHECK-LABEL: @_Z31test_malloc_with_ptr_sizeof_vari(
+void *test_malloc_with_ptr_sizeof_var(int x) {
+ unsigned long size = sizeof(ContainsPtr);
+ size *= x;
+ // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef %{{.*}}, i64 1){{.*}} !alloc_token_hint
+ return malloc(size);
+}
+
+// CHECK-LABEL: @_Z29test_malloc_with_ptr_castonlyv(
+ContainsPtr *test_malloc_with_ptr_castonly() {
+ // CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4096, i64 1){{.*}} !alloc_token_hint
+ return (ContainsPtr *)malloc(4096);
}
// CHECK-LABEL: @_Z32test_operatornew_struct_with_ptrv(
ContainsPtr *test_operatornew_struct_with_ptr() {
- // FIXME: This should not be token ID 0!
- // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 0)
+ // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
ContainsPtr *c = (ContainsPtr *)__builtin_operator_new(sizeof(ContainsPtr));
+ // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
+ sink = ::operator new(sizeof(ContainsPtr));
return c;
}
// CHECK-LABEL: @_Z38test_operatornew_struct_array_with_ptrv(
ContainsPtr *test_operatornew_struct_array_with_ptr() {
- // FIXME: This should not be token ID 0!
- // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 0)
+ // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
ContainsPtr *c = (ContainsPtr *)__builtin_operator_new(10 * sizeof(ContainsPtr));
+ // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
+ sink = ::operator new(10 * sizeof(ContainsPtr));
return c;
}
// CHECK-LABEL: @_Z33test_operatornew_struct_with_ptr2v(
ContainsPtr *test_operatornew_struct_with_ptr2() {
- // FIXME: This should not be token ID 0!
- // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 0)
+ // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
ContainsPtr *c = (ContainsPtr *)__builtin_operator_new(sizeof(*c));
+ // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 16, i64 1){{.*}} !alloc_token_hint
+ sink = ::operator new(sizeof(*c));
return c;
}
// CHECK-LABEL: @_Z39test_operatornew_struct_array_with_ptr2v(
ContainsPtr *test_operatornew_struct_array_with_ptr2() {
- // FIXME: This should not be token ID 0!
- // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 0)
+ // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
ContainsPtr *c = (ContainsPtr *)__builtin_operator_new(10 * sizeof(*c));
+ // CHECK: call {{.*}} ptr @__alloc_token_Znwm(i64 noundef 160, i64 1){{.*}} !alloc_token_hint
+ sink = ::operator new(10 * sizeof(*c));
return c;
}
diff --git a/clang/test/CodeGenCXX/alloc-token.cpp b/clang/test/CodeGenCXX/alloc-token.cpp
index 180c771e43ae9..66d9352510783 100644
--- a/clang/test/CodeGenCXX/alloc-token.cpp
+++ b/clang/test/CodeGenCXX/alloc-token.cpp
@@ -3,14 +3,14 @@
#include "../Analysis/Inputs/system-header-simulator-cxx.h"
extern "C" {
-void *aligned_alloc(size_t alignment, size_t size);
-void *malloc(size_t size);
-void *calloc(size_...
[truncated]
|
Created using spr 1.3.8-beta.1
Created using spr 1.3.8-beta.1
Created using spr 1.3.8-beta.1
Created using spr 1.3.8-beta.1
For the AllocToken pass to accurately calculate token ID hints, we should attach `!alloc_token` metadata for allocation calls to avoid reverting to LLVM IR-type based hints (which depends on later "uses" and is rather imprecise). Unlike new expressions, untyped allocation calls (like `malloc`, `calloc`, `::operator new(..)`, `__builtin_operator_new`, etc.) have no syntactic type associated with them. For -fsanitize=alloc-token, type hints are sufficient, and we can attempt to infer the type based on common idioms. When encountering allocation calls (with `__attribute__((malloc))` or `__attribute__((alloc_size(..))`), attach `!alloc_token` by inferring the allocated type from (a) sizeof argument expressions such as `malloc(sizeof(MyType))`, and (b) casts such as `(MyType*)malloc(4096)`. Note that non-standard allocation functions with these attributes are not instrumented by default. Use `-fsanitize-alloc-token-extended` to instrument them as well. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 Pull Request: llvm#156841
Created using spr 1.3.8-beta.1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
|
||
// CHECK-LABEL: @test_malloc( | ||
// CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4, i64 0) | ||
// CHECK: call{{.*}} ptr @__alloc_token_malloc(i64 noundef 4, i64 2689373973731826898){{.*}} !alloc_token [[META_INT:![0-9]+]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Use a regex instead of hard-coding the token here and in alloc-token-nonlibcalls.c?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're using a stable hash, so this should always be the same.
Or do you mean to be able to refer to the same token via e.g. [[TOKEN_INT]]
?
Though we still want to check the exact value, given it's meant to be stable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just figured that at this level (Clang test), we don't care so much about the exact number, just that there is a number. LLVM tests can have expectations about the exact hash.
But I don't feel particularly strongly about this.
For the AllocToken pass to accurately calculate token ID hints, we should attach `!alloc_token` metadata for allocation calls to avoid reverting to LLVM IR-type based hints (which depends on later "uses" and is rather imprecise). Unlike new expressions, untyped allocation calls (like `malloc`, `calloc`, `::operator new(..)`, `__builtin_operator_new`, etc.) have no syntactic type associated with them. For -fsanitize=alloc-token, type hints are sufficient, and we can attempt to infer the type based on common idioms. When encountering allocation calls (with `__attribute__((malloc))` or `__attribute__((alloc_size(..))`), attach `!alloc_token` by inferring the allocated type from (a) sizeof argument expressions such as `malloc(sizeof(MyType))`, and (b) casts such as `(MyType*)malloc(4096)`. Note that non-standard allocation functions with these attributes are not instrumented by default. Use `-fsanitize-alloc-token-extended` to instrument them as well. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 Pull Request: llvm#156841
For the AllocToken pass to accurately calculate token ID hints, we should attach `!alloc_token` metadata for allocation calls to avoid reverting to LLVM IR-type based hints (which depends on later "uses" and is rather imprecise). Unlike new expressions, untyped allocation calls (like `malloc`, `calloc`, `::operator new(..)`, `__builtin_operator_new`, etc.) have no syntactic type associated with them. For -fsanitize=alloc-token, type hints are sufficient, and we can attempt to infer the type based on common idioms. When encountering allocation calls (with `__attribute__((malloc))` or `__attribute__((alloc_size(..))`), attach `!alloc_token` by inferring the allocated type from (a) sizeof argument expressions such as `malloc(sizeof(MyType))`, and (b) casts such as `(MyType*)malloc(4096)`. Note that non-standard allocation functions with these attributes are not instrumented by default. Use `-fsanitize-alloc-token-extended` to instrument them as well. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 Pull Request: llvm#156841
…alloc_token metadata (#160131) In preparation of adding the "AllocToken" pass, add the pre-requisite `sanitize_alloc_token` function attribute and `alloc_token` metadata. --- This change is part of the following series: 1. llvm/llvm-project#160131 2. llvm/llvm-project#156838 3. llvm/llvm-project#162098 4. llvm/llvm-project#162099 5. llvm/llvm-project#156839 6. llvm/llvm-project#156840 7. llvm/llvm-project#156841 8. llvm/llvm-project#156842
Introduce `AllocToken`, an instrumentation pass designed to provide tokens to memory allocators enabling various heap organization strategies, such as heap partitioning. Initially, the pass instruments functions marked with a new attribute `sanitize_alloc_token` by rewriting allocation calls to include a token ID, appended as a function argument with the default ABI. The design aims to provide a flexible framework for implementing different token generation schemes. It currently supports the following token modes: - TypeHash (default): token IDs based on a hash of the allocated type - Random: statically-assigned pseudo-random token IDs - Increment: incrementing token IDs per TU For the `TypeHash` mode introduce support for `!alloc_token` metadata: the metadata can be attached to allocation calls to provide richer semantic information to be consumed by the AllocToken pass. Optimization remarks can be enabled to show where no metadata was available. An alternative "fast ABI" is provided, where instead of passing the token ID as an argument (e.g., `__alloc_token_malloc(size, id)`), the token ID is directly encoded into the name of the called function (e.g., `__alloc_token_0_malloc(size)`). Where the maximum tokens is small, this offers more efficient instrumentation by avoiding the overhead of passing an additional argument at each allocation site. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 [1] --- This change is part of the following series: 1. #160131 2. #156838 3. #162098 4. #162099 5. #156839 6. #156840 7. #156841 8. #156842
…56838) Introduce `AllocToken`, an instrumentation pass designed to provide tokens to memory allocators enabling various heap organization strategies, such as heap partitioning. Initially, the pass instruments functions marked with a new attribute `sanitize_alloc_token` by rewriting allocation calls to include a token ID, appended as a function argument with the default ABI. The design aims to provide a flexible framework for implementing different token generation schemes. It currently supports the following token modes: - TypeHash (default): token IDs based on a hash of the allocated type - Random: statically-assigned pseudo-random token IDs - Increment: incrementing token IDs per TU For the `TypeHash` mode introduce support for `!alloc_token` metadata: the metadata can be attached to allocation calls to provide richer semantic information to be consumed by the AllocToken pass. Optimization remarks can be enabled to show where no metadata was available. An alternative "fast ABI" is provided, where instead of passing the token ID as an argument (e.g., `__alloc_token_malloc(size, id)`), the token ID is directly encoded into the name of the called function (e.g., `__alloc_token_0_malloc(size)`). Where the maximum tokens is small, this offers more efficient instrumentation by avoiding the overhead of passing an additional argument at each allocation site. Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 [1] --- This change is part of the following series: 1. llvm/llvm-project#160131 2. llvm/llvm-project#156838 3. llvm/llvm-project#162098 4. llvm/llvm-project#162099 5. llvm/llvm-project#156839 6. llvm/llvm-project#156840 7. llvm/llvm-project#156841 8. llvm/llvm-project#156842
Introduce the "alloc-token" sanitizer kind, in preparation of wiring it up. Currently this is a no-op, and any attempt to enable it will result in failure: clang: error: unsupported option '-fsanitize=alloc-token' for target 'x86_64-unknown-linux-gnu' In this step we can already wire up the `sanitize_alloc_token` IR attribute where the instrumentation is enabled. Subsequent changes will complete wiring up the AllocToken pass. --- This change is part of the following series: 1. #160131 2. #156838 3. #162098 4. #162099 5. #156839 6. #156840 7. #156841 8. #156842
…162098) Introduce the "alloc-token" sanitizer kind, in preparation of wiring it up. Currently this is a no-op, and any attempt to enable it will result in failure: clang: error: unsupported option '-fsanitize=alloc-token' for target 'x86_64-unknown-linux-gnu' In this step we can already wire up the `sanitize_alloc_token` IR attribute where the instrumentation is enabled. Subsequent changes will complete wiring up the AllocToken pass. --- This change is part of the following series: 1. llvm/llvm-project#160131 2. llvm/llvm-project#156838 3. llvm/llvm-project#162098 4. llvm/llvm-project#162099 5. llvm/llvm-project#156839 6. llvm/llvm-project#156840 7. llvm/llvm-project#156841 8. llvm/llvm-project#156842
For new expressions, the allocated type is syntactically known and we can trivially emit the !alloc_token metadata. A subsequent change will wire up the AllocToken pass and introduce appropriate tests. --- This change is part of the following series: 1. #160131 2. #156838 3. #162098 4. #162099 5. #156839 6. #156840 7. #156841 8. #156842
…62099) For new expressions, the allocated type is syntactically known and we can trivially emit the !alloc_token metadata. A subsequent change will wire up the AllocToken pass and introduce appropriate tests. --- This change is part of the following series: 1. llvm/llvm-project#160131 2. llvm/llvm-project#156838 3. llvm/llvm-project#162098 4. llvm/llvm-project#162099 5. llvm/llvm-project#156839 6. llvm/llvm-project#156840 7. llvm/llvm-project#156841 8. llvm/llvm-project#156842
For the AllocToken pass to accurately calculate token ID hints, we
should attach
!alloc_token
metadata for allocation calls to avoidreverting to LLVM IR-type based hints (which depends on later "uses" and
is rather imprecise).
Unlike new expressions, untyped allocation calls (like
malloc
,calloc
,::operator new(..)
,__builtin_operator_new
, etc.) have nosyntactic type associated with them. For -fsanitize=alloc-token, type
hints are sufficient, and we can attempt to infer the type based on
common idioms.
When encountering allocation calls (with
__attribute__((malloc))
or__attribute__((alloc_size(..))
), attach!alloc_token
by inferringthe allocated type from (a) sizeof argument expressions such as
malloc(sizeof(MyType))
, and (b) casts such as(MyType*)malloc(4096)
.Note that non-standard allocation functions with these attributes are
not instrumented by default. Use
-fsanitize-alloc-token-extended
toinstrument them as well.
Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434
This change is part of the following series: