[Clang] Correct __builtin_dynamic_object_size for subobject types #83204
The second argument of __builtin_dynamic_object_size controls whether it
returns the size of the whole object or of the closest surrounding object.
For this struct:

    #include <stdio.h>

    struct s {
      int foo;
      char bar[2][40];
      int baz;
      int qux;
    };

    int main(int argc, char **argv) {
      struct s f;

    #define report(x) printf(#x ": %zu\n", x)

      argc = 1;
      report(__builtin_dynamic_object_size(f.bar[argc], 0));
      report(__builtin_dynamic_object_size(f.bar[argc], 1));
      return 0;
    }

the two calls should return:

    __builtin_dynamic_object_size(f.bar[argc], 0): 48
    __builtin_dynamic_object_size(f.bar[argc], 1): 40

With argc == 1, the pointer is at the start of f.bar[1]: TYPE 0 reports the 48
bytes from there to the end of the whole object (40 bytes left in bar, plus baz
and qux), while TYPE 1 reports only the 40 bytes of the surrounding bar[1]
sub-object. Which behavior you get is determined by the least significant bit
of the TYPE argument.

The LLVM IR isn't sufficient to determine what could be considered a
"sub-object". Instead, determine the size / offset information in the
front-end and pass it along with the intrinsic.

This expands the llvm.objectsize intrinsic with three new arguments:

- The fifth argument controls which object is measured:
  - If false, return the size of the closest surrounding object.
  - If true, return the size of the whole object from the pointer.
- The sixth argument, if non-zero and the fifth argument is 'false', is the
  size of the sub-object.
- The seventh argument, if non-zero and the fifth argument is 'false', is the
  offset of the sub-object.
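As a concrete illustration, the new clang/test/CodeGen/object-size-sub-object.c
test in this patch checks that __bdos(&p->buf1[idx]) (where buf1 is a char[5]
at byte offset 8 of struct test_struct) now lowers to a call like this (value
names simplified from the FileCheck patterns):

    %size = call i64 @llvm.objectsize.i64.p0(ptr %arrayidx, i1 false, i1 true, i1 true, i1 false, i64 5, i64 8)

The fifth argument (i1 false) asks for the sub-object, and the last two
arguments carry its size (5) and offset (8) as computed by the front-end.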
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-clang
Author: Bill Wendling (bwendling)
Patch is 172.77 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/83204.diff 45 Files Affected:
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 2d16e7cdc06053..9d8e5671d9d12d 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -26,6 +26,7 @@
#include "clang/AST/Decl.h"
#include "clang/AST/OSLog.h"
#include "clang/AST/OperationKinds.h"
+#include "clang/AST/StmtVisitor.h"
#include "clang/Basic/TargetBuiltins.h"
#include "clang/Basic/TargetInfo.h"
#include "clang/Basic/TargetOptions.h"
@@ -1052,11 +1053,145 @@ CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
return Builder.CreateSelect(Cmp, Res, ConstantInt::get(ResType, 0, IsSigned));
}
+namespace {
+
+/// SubobjectFinder - A simple visitor to find the "sub-object" pointed to by a
+/// __builtin_dynamic_object_size call. Information gathered from the
+/// sub-object is used by the back-end to determine the correct size when the
+/// 'TYPE' of the __bdos call has the least significant bit set (i.e. asking
+/// for the sub-object size).
+///
+/// The expectation is that we'll eventually hit one of three expression types:
+///
+/// 1. DeclRefExpr - This is the expression for the base of the structure.
+/// It's exactly what we want to build an access to the \p counted_by
+/// field.
+/// 2. MemberExpr - This is the field in the structure.
+/// 3. CompoundLiteralExpr - This is for people who create something
+/// heretical like (struct foo has a flexible array member):
+///
+/// (struct foo){ 1, 2 }.blah[idx];
+///
+/// All other expressions can be correctly handled with the current code.
+struct SubobjectFinder
+ : public ConstStmtVisitor<SubobjectFinder, const Expr *> {
+ SubobjectFinder() = default;
+
+ //===--------------------------------------------------------------------===//
+ // Visitor Methods
+ //===--------------------------------------------------------------------===//
+
+ const Expr *VisitStmt(const Stmt *S) { return nullptr; }
+
+ const Expr *VisitDeclRefExpr(const DeclRefExpr *E) { return E; }
+ const Expr *VisitMemberExpr(const MemberExpr *E) { return E; }
+ const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) {
+ return E;
+ }
+
+ const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) {
+ return Visit(E->getBase());
+ }
+ const Expr *VisitCastExpr(const CastExpr *E) {
+ return Visit(E->getSubExpr());
+ }
+ const Expr *VisitParenExpr(const ParenExpr *E) {
+ return Visit(E->getSubExpr());
+ }
+ const Expr *VisitUnaryAddrOf(const clang::UnaryOperator *E) {
+ return Visit(E->getSubExpr());
+ }
+ const Expr *VisitUnaryDeref(const clang::UnaryOperator *E) {
+ return Visit(E->getSubExpr());
+ }
+};
+
+} // end anonymous namespace
+
+/// getFieldInfo - Gather the size and offset of the field \p VD in \p RD.
+static std::pair<uint64_t, uint64_t> getFieldInfo(CodeGenFunction &CGF,
+ const RecordDecl *RD,
+ const ValueDecl *VD,
+ uint64_t Offset = 0) {
+ if (!RD)
+ return std::make_pair(0, 0);
+
+ ASTContext &Ctx = CGF.getContext();
+ const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+ unsigned FieldNo = 0;
+
+ for (const Decl *D : RD->decls()) {
+ if (const auto *Record = dyn_cast<RecordDecl>(D)) {
+ std::pair<uint64_t, uint64_t> Res =
+ getFieldInfo(CGF, Record->getDefinition(), VD,
+ Offset + Layout.getFieldOffset(FieldNo));
+ if (Res.first != 0)
+ return Res;
+ continue;
+ }
+
+ if (const auto *FD = dyn_cast<FieldDecl>(D); FD && FD == VD) {
+ Offset += Layout.getFieldOffset(FieldNo);
+ return std::make_pair(Ctx.getTypeSizeInChars(FD->getType()).getQuantity(),
+ Ctx.toCharUnitsFromBits(Offset).getQuantity());
+ }
+
+ if (isa<FieldDecl>(D))
+ ++FieldNo;
+ }
+
+ return std::make_pair(0, 0);
+}
+
+/// getSubobjectInfo - Find the sub-object that \p E points to. If it lives
+/// inside a struct, return the "size" and "offset" of that sub-object.
+static std::pair<uint64_t, uint64_t> getSubobjectInfo(CodeGenFunction &CGF,
+ const Expr *E) {
+ const Expr *Subobject = SubobjectFinder().Visit(E);
+ if (!Subobject)
+ return std::make_pair(0, 0);
+
+ const RecordDecl *OuterRD = nullptr;
+ const ValueDecl *VD = nullptr;
+
+ if (const auto *DRE = dyn_cast<DeclRefExpr>(Subobject)) {
+ // We're pointing to the beginning of the struct.
+ VD = DRE->getDecl();
+ QualType Ty = VD->getType();
+ if (Ty->isPointerType())
+ Ty = Ty->getPointeeType();
+ OuterRD = Ty->getAsRecordDecl();
+ } else if (const auto *ME = dyn_cast<MemberExpr>(Subobject)) {
+ VD = ME->getMemberDecl();
+ OuterRD = VD->getDeclContext()->getOuterLexicalRecordContext();
+ } else {
+ if (!isa<CompoundLiteralExpr>(Subobject))
+ llvm_unreachable("unexpected expression");
+
+ // We encounter a CompoundLiteralExpr if we have something like this:
+ //
+ // __builtin_dynamic_object_size(&(struct x){ 1, 2, 3 }, 1)
+ //
+ // In that case, we want the size of the whole struct. So we don't have to
+ // worry about finding a suboject.
+ return std::make_pair(0, 0);
+ }
+
+ if (!VD || !OuterRD)
+ // The expression is referencing an object that's not in a struct.
+ return std::make_pair(0, 0);
+
+ return getFieldInfo(CGF, OuterRD->getDefinition(), VD);
+}
+
/// Returns a Value corresponding to the size of the given expression.
/// This Value may be either of the following:
-/// - A llvm::Argument (if E is a param with the pass_object_size attribute on
-/// it)
-/// - A call to the @llvm.objectsize intrinsic
+///
+/// - An Argument if E is a param with the pass_object_size attribute on
+/// it,
+/// - An Instruction representing the calculation of the value, when a
+/// flexible array member is involved, or
+/// - A call to the @llvm.objectsize intrinsic.
///
/// EmittedE is the result of emitting `E` as a scalar expr. If it's non-null
/// and we wouldn't otherwise try to reference a pass_object_size parameter,
@@ -1084,18 +1219,31 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, unsigned Type,
}
}
+ // LLVM can't handle Type=3 appropriately, and __builtin_object_size shouldn't
+ // evaluate E for side-effects. In either case, we shouldn't lower to
+ // @llvm.objectsize.
+ if (Type == 3 || (!EmittedE && E->HasSideEffects(getContext())))
+ return getDefaultBuiltinObjectSizeResult(Type, ResType);
+
+ std::pair<Value *, Value *> SubobjectInfo =
+ std::make_pair(Builder.getInt64(0), Builder.getInt64(0));
+
if (IsDynamic) {
// Emit special code for a flexible array member with the "counted_by"
// attribute.
if (Value *V = emitFlexibleArrayMemberSize(E, Type, ResType))
return V;
- }
- // LLVM can't handle Type=3 appropriately, and __builtin_object_size shouldn't
- // evaluate E for side-effects. In either case, we shouldn't lower to
- // @llvm.objectsize.
- if (Type == 3 || (!EmittedE && E->HasSideEffects(getContext())))
- return getDefaultBuiltinObjectSizeResult(Type, ResType);
+ if ((Type & 1) != 0) {
+ // The object size is constrained to the sub-object containing the
+ // element. If it's in a structure, get the size and offset information
+ // for back-end processing.
+ std::pair<uint64_t, uint64_t> Info = getSubobjectInfo(*this, E);
+ if (Info.first != 0)
+ SubobjectInfo = std::make_pair(Builder.getInt64(Info.first),
+ Builder.getInt64(Info.second));
+ }
+ }
Value *Ptr = EmittedE ? EmittedE : EmitScalarExpr(E);
assert(Ptr->getType()->isPointerTy() &&
@@ -1109,7 +1257,12 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, unsigned Type,
// For GCC compatibility, __builtin_object_size treat NULL as unknown size.
Value *NullIsUnknown = Builder.getTrue();
Value *Dynamic = Builder.getInt1(IsDynamic);
- return Builder.CreateCall(F, {Ptr, Min, NullIsUnknown, Dynamic});
+ // If the least significant bit is clear, objects are whole variables. If
+ // it's set, a closest surrounding subobject is considered the object a
+ // pointer points to.
+ Value *WholeObj = Builder.getInt1((Type & 1) == 0);
+ return Builder.CreateCall(F, {Ptr, Min, NullIsUnknown, Dynamic, WholeObj,
+ SubobjectInfo.first, SubobjectInfo.second});
}
namespace {
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 59a7fe8925001c..122de18640b456 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -742,13 +742,18 @@ void CodeGenFunction::EmitTypeCheck(TypeCheckKind TCK, SourceLocation Loc,
// FIXME: If Address Sanitizer is enabled, insert dynamic instrumentation
// to check this.
// FIXME: Get object address space
- llvm::Type *Tys[2] = { IntPtrTy, Int8PtrTy };
- llvm::Function *F = CGM.getIntrinsic(llvm::Intrinsic::objectsize, Tys);
+ llvm::Function *F =
+ CGM.getIntrinsic(llvm::Intrinsic::objectsize, {IntPtrTy, Int8PtrTy});
llvm::Value *Min = Builder.getFalse();
llvm::Value *NullIsUnknown = Builder.getFalse();
llvm::Value *Dynamic = Builder.getFalse();
+ llvm::Value *WholeObj = Builder.getTrue();
+ llvm::Value *SubobjectSize = Builder.getInt64(0);
+ llvm::Value *SubobjectOffset = Builder.getInt64(0);
llvm::Value *LargeEnough = Builder.CreateICmpUGE(
- Builder.CreateCall(F, {Ptr, Min, NullIsUnknown, Dynamic}), Size);
+ Builder.CreateCall(F, {Ptr, Min, NullIsUnknown, Dynamic, WholeObj,
+ SubobjectSize, SubobjectOffset}),
+ Size);
Checks.push_back(std::make_pair(LargeEnough, SanitizerKind::ObjectSize));
}
}
diff --git a/clang/test/CodeGen/catch-undef-behavior.c b/clang/test/CodeGen/catch-undef-behavior.c
index af37ef9e8565b1..1305bb2953b954 100644
--- a/clang/test/CodeGen/catch-undef-behavior.c
+++ b/clang/test/CodeGen/catch-undef-behavior.c
@@ -35,7 +35,7 @@
void foo(void) {
union { int i; } u;
- // CHECK-COMMON: %[[SIZE:.*]] = call i64 @llvm.objectsize.i64.p0(ptr %[[PTR:.*]], i1 false, i1 false, i1 false)
+ // CHECK-COMMON: %[[SIZE:.*]] = call i64 @llvm.objectsize.i64.p0(ptr %[[PTR:.*]], i1 false, i1 false, i1 false, i1 true, i64 0, i64 0)
// CHECK-COMMON-NEXT: %[[OK:.*]] = icmp uge i64 %[[SIZE]], 4
// CHECK-UBSAN: br i1 %[[OK]], {{.*}} !prof ![[WEIGHT_MD:.*]], !nosanitize
diff --git a/clang/test/CodeGen/object-size-sub-object.c b/clang/test/CodeGen/object-size-sub-object.c
new file mode 100644
index 00000000000000..9d3f515e5d9bd8
--- /dev/null
+++ b/clang/test/CodeGen/object-size-sub-object.c
@@ -0,0 +1,311 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm %s -o - | FileCheck %s
+
+typedef __SIZE_TYPE__ size_t;
+
+#define __bdos(a) __builtin_dynamic_object_size(a, 1)
+
+struct U {
+ double d;
+ int i;
+};
+
+struct test_struct {
+ struct test_struct *vptr;
+ char buf1[5];
+ struct i {
+ char a;
+ int b[2][13];
+ int c, d;
+ } z;
+ struct U *u_ptr;
+ unsigned _a : 1;
+ unsigned _b : 2;
+ struct {
+ struct {
+ char x_1;
+ char x_2[37];
+ };
+ };
+ union {
+ struct { char _z[20]; } m;
+ struct { char _y[13]; } n;
+ } u;
+ char buf2[7];
+};
+
+size_t ret;
+
+// CHECK-LABEL: define dso_local i64 @test1(
+// CHECK-SAME: ptr noundef [[P:%.*]], i32 noundef [[IDX:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[P_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT: [[IDX_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT: store ptr [[P]], ptr [[P_ADDR]], align 8
+// CHECK-NEXT: store i32 [[IDX]], ptr [[IDX_ADDR]], align 4
+// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P_ADDR]], align 8
+// CHECK-NEXT: [[TMP1:%.*]] = call i64 @llvm.objectsize.i64.p0(ptr [[TMP0]], i1 false, i1 true, i1 true, i1 false, i64 0, i64 0)
+// CHECK-NEXT: ret i64 [[TMP1]]
+//
+size_t test1(struct test_struct *p, int idx) {
+ return __bdos(p);
+}
+
+// CHECK-LABEL: define dso_local i64 @test2(
+// CHECK-SAME: ptr noundef [[P:%.*]], i32 noundef [[IDX:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[P_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT: [[IDX_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT: store ptr [[P]], ptr [[P_ADDR]], align 8
+// CHECK-NEXT: store i32 [[IDX]], ptr [[IDX_ADDR]], align 4
+// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P_ADDR]], align 8
+// CHECK-NEXT: [[BUF1:%.*]] = getelementptr inbounds [[STRUCT_TEST_STRUCT:%.*]], ptr [[TMP0]], i32 0, i32 1
+// CHECK-NEXT: [[TMP1:%.*]] = load i32, ptr [[IDX_ADDR]], align 4
+// CHECK-NEXT: [[IDXPROM:%.*]] = sext i32 [[TMP1]] to i64
+// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [5 x i8], ptr [[BUF1]], i64 0, i64 [[IDXPROM]]
+// CHECK-NEXT: [[TMP2:%.*]] = call i64 @llvm.objectsize.i64.p0(ptr [[ARRAYIDX]], i1 false, i1 true, i1 true, i1 false, i64 5, i64 8)
+// CHECK-NEXT: ret i64 [[TMP2]]
+//
+size_t test2(struct test_struct *p, int idx) {
+ return __bdos(&p->buf1[idx]);
+}
+
+// CHECK-LABEL: define dso_local i64 @test3(
+// CHECK-SAME: ptr noundef [[P:%.*]], i32 noundef [[IDX:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[P_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT: [[IDX_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT: store ptr [[P]], ptr [[P_ADDR]], align 8
+// CHECK-NEXT: store i32 [[IDX]], ptr [[IDX_ADDR]], align 4
+// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P_ADDR]], align 8
+// CHECK-NEXT: [[Z:%.*]] = getelementptr inbounds [[STRUCT_TEST_STRUCT:%.*]], ptr [[TMP0]], i32 0, i32 2
+// CHECK-NEXT: [[TMP1:%.*]] = load i32, ptr [[IDX_ADDR]], align 4
+// CHECK-NEXT: [[IDXPROM:%.*]] = sext i32 [[TMP1]] to i64
+// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, ptr [[Z]], i64 [[IDXPROM]]
+// CHECK-NEXT: [[TMP2:%.*]] = call i64 @llvm.objectsize.i64.p0(ptr [[ARRAYIDX]], i1 false, i1 true, i1 true, i1 false, i64 116, i64 16)
+// CHECK-NEXT: ret i64 [[TMP2]]
+//
+size_t test3(struct test_struct *p, int idx) {
+ return __bdos(&((char *)&p->z)[idx]);
+}
+
+// CHECK-LABEL: define dso_local i64 @test4(
+// CHECK-SAME: ptr noundef [[P:%.*]], i32 noundef [[IDX:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[P_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT: [[IDX_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT: store ptr [[P]], ptr [[P_ADDR]], align 8
+// CHECK-NEXT: store i32 [[IDX]], ptr [[IDX_ADDR]], align 4
+// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P_ADDR]], align 8
+// CHECK-NEXT: [[Z:%.*]] = getelementptr inbounds [[STRUCT_TEST_STRUCT:%.*]], ptr [[TMP0]], i32 0, i32 2
+// CHECK-NEXT: [[TMP1:%.*]] = load i32, ptr [[IDX_ADDR]], align 4
+// CHECK-NEXT: [[IDXPROM:%.*]] = sext i32 [[TMP1]] to i64
+// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, ptr [[Z]], i64 [[IDXPROM]]
+// CHECK-NEXT: [[TMP2:%.*]] = call i64 @llvm.objectsize.i64.p0(ptr [[ARRAYIDX]], i1 false, i1 true, i1 true, i1 false, i64 116, i64 16)
+// CHECK-NEXT: ret i64 [[TMP2]]
+//
+size_t test4(struct test_struct *p, int idx) {
+ return __bdos(&((char *)&p->z)[idx]);
+}
+
+// CHECK-LABEL: define dso_local i64 @test5(
+// CHECK-SAME: ptr noundef [[P:%.*]], i32 noundef [[IDX:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[P_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT: [[IDX_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT: store ptr [[P]], ptr [[P_ADDR]], align 8
+// CHECK-NEXT: store i32 [[IDX]], ptr [[IDX_ADDR]], align 4
+// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P_ADDR]], align 8
+// CHECK-NEXT: [[Z:%.*]] = getelementptr inbounds [[STRUCT_TEST_STRUCT:%.*]], ptr [[TMP0]], i32 0, i32 2
+// CHECK-NEXT: [[A:%.*]] = getelementptr inbounds [[STRUCT_I:%.*]], ptr [[Z]], i32 0, i32 0
+// CHECK-NEXT: [[TMP1:%.*]] = load i32, ptr [[IDX_ADDR]], align 4
+// CHECK-NEXT: [[IDXPROM:%.*]] = sext i32 [[TMP1]] to i64
+// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, ptr [[A]], i64 [[IDXPROM]]
+// CHECK-NEXT: [[TMP2:%.*]] = call i64 @llvm.objectsize.i64.p0(ptr [[ARRAYIDX]], i1 false, i1 true, i1 true, i1 false, i64 1, i64 16)
+// CHECK-NEXT: ret i64 [[TMP2]]
+//
+size_t test5(struct test_struct *p, int idx) {
+ return __bdos(&((char *)&p->z.a)[idx]);
+}
+
+// CHECK-LABEL: define dso_local i64 @test6(
+// CHECK-SAME: ptr noundef [[P:%.*]], i32 noundef [[IDX:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[P_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT: [[IDX_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT: store ptr [[P]], ptr [[P_ADDR]], align 8
+// CHECK-NEXT: store i32 [[IDX]], ptr [[IDX_ADDR]], align 4
+// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P_ADDR]], align 8
+// CHECK-NEXT: [[Z:%.*]] = getelementptr inbounds [[STRUCT_TEST_STRUCT:%.*]], ptr [[TMP0]], i32 0, i32 2
+// CHECK-NEXT: [[B:%.*]] = getelementptr inbounds [[STRUCT_I:%.*]], ptr [[Z]], i32 0, i32 1
+// CHECK-NEXT: [[TMP1:%.*]] = load i32, ptr [[IDX_ADDR]], align 4
+// CHECK-NEXT: [[IDXPROM:%.*]] = sext i32 [[TMP1]] to i64
+// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, ptr [[B]], i64 [[IDXPROM]]
+// CHECK-NEXT: [[TMP2:%.*]] = call i64 @llvm.objectsize.i64.p0(ptr [[ARRAYIDX]], i1 false, i1 true, i1 true, i1 false, i64 104, i64 20)
+// CHECK-NEXT: ret i64 [[TMP2]]
+//
+size_t test6(struct test_struct *p, int idx) {
+ return __bdos(&((char *)&p->z.b)[idx]);
+}
+
+// CHECK-LABEL: define dso_local i64 @test7(
+// CHECK-SAME: ptr noundef [[P:%.*]], i32 noundef [[IDX:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[P_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT: [[IDX_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT: store ptr [[P]], ptr [[P_ADDR]], align 8
+// CHECK-NEXT: store i32 [[IDX]], ptr [[IDX_ADDR]], align 4
+// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P_ADDR]], align 8
+// CHECK-NEXT: [[Z:%.*]] = getelementptr inbounds [[STRUCT_TEST_STRUCT:%.*]], ptr [[TMP0]], i32 0, i32 2
+// CHECK-NEXT: [[C:%.*]] = getelementptr inbounds [[STRUCT_I:%.*]], ptr [[Z]], i32 0, i32 2
+// CHECK-NEXT: [[TMP1:%.*]] = load i32, ptr [[IDX_ADDR]], align 4
+// CHECK-NEXT: [[IDXPROM:%.*]] = sext i32 [[TMP1]] to i64
+// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, ptr [[C]], i64 [[IDXPROM]]
+// CHECK-NEXT: [[TMP2:%.*]] = call i64 @llvm.objectsize.i64.p0(ptr [[ARRAYIDX]], i1 false, i1 true, i1 true, i1 false, i64 4, i64 124)
+// CHECK-NEXT: ret i64 [[TMP2]]
+//
+size_t test7(struct test_struct *p, int idx) {
+ return __bdos(&((char *)&p->z.c)[idx]);
+}
+
+// CHECK-LABEL: define dso_local i64 @test8(
+// CHECK-SAME: ptr noundef [[P:%.*]], i32 noundef [[IDX:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[P_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT: [[IDX_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT: store ptr [[P]], ptr [[P_ADDR]], align 8
+// CHECK-NEXT: store i32 [[IDX]], ptr [[IDX_ADDR]], align 4
+// CHECK-NEXT: [[TMP0:%.*]] = load ptr, ptr [[P_ADDR]], align 8
+// CHECK-NEXT: [[Z:%.*]] = getelementptr inbounds [[STRUCT_TEST_STRUCT:%.*]], ptr [[TMP0]], i32 0, i32 2
+// CHECK-NEXT: [[D:%.*]] = getelementptr inbounds [[STRUCT_I:%.*]], ptr [[Z]], i32 0, i32 3
+// CHECK-NEXT: [[TMP1:%.*]] = load i32, ptr [[IDX_ADDR]], align 4
+// CHECK-NEXT: [[IDXPROM:%.*]] = sext i32 [[TMP1]] to i64
+// CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i8, ptr [[D]], i64 [[IDXPROM]]
+// CHECK-NEXT: [[TMP2:%.*]] = call i64 @llvm.objectsize.i64.p0(ptr [[ARRAYIDX]], i1 false, i1 true, i1 true, i1 false, i64 4, i64 128)
+// CHECK-NEXT: ret i64 [[TMP2]]
+//
+size_t test8(struct test_struct *p, int idx) {
+ return __bdos(&((char *)&p->z.d)[idx]);
+}
+
+// CHECK-LABEL: define dso_local i64 @test9(
+// CHECK-SAME: ptr noundef [[P:%.*]], i32 noundef [[IDX:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[P_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT: [[IDX_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT: ...
[truncated]
The first PR attempt was here: #78526. It was NACK'ed because it used the LLVM IR representation of the structure, which wasn't appropriate. To solve that issue, I chose to expand the llvm.objectsize intrinsic instead. Note that there are many other things wrong with our handling of this builtin.
Friendly ping.

Another ping.
llvm/docs/LangRef.rst (Outdated)

    - If non-zero, the sixth and seventh arguments encode the size and offset
      information, respectively, of the original subobject's layout and is used
      when the fifth argument is ``false``.
    - The seventh argument encodes the offset information of the original
      subobject's layout and is used when the fifth argument is ``false``.
I think the information you're passing in here isn't quite what we'd want. If I'm reading the code correctly, the offset you're passing in is the field offset relative to the immediately-enclosing record type, which doesn't give us any information about either where the pointer is within the subobject, or where the subobject is within the complete object, so this doesn't seem like it can be enough information to produce a correct result.
Rather than passing in the offset of the subobject (relative to an unknown anchor point), I think it would be more useful to pass in a pointer to the start of the subobject. Or to pass in the offset from the start of the subobject to the pointer argument, but that would likely be harder for the frontend to calculate (eg, you'd need to work out the offset produced by array indexing).
I think the information you're passing in here isn't quite what we'd want. If I'm reading the code correctly, the offset you're passing in is the field offset relative to the immediately-enclosing record type, which doesn't give us any information about either where the pointer is within the subobject, or where the subobject is within the complete object, so this doesn't seem like it can be enough information to produce a correct result.
That's the information which leads to the correct calculation. If you have a pointer like this:
    struct S {
      int a;
      char c[234];
      int b;
    };

    void foo(struct S *ptr) {
      size_t x = __builtin_dynamic_object_size(ptr->a[22], 1);
      /* ... */
    }

the value of x should be 0. See https://godbolt.org/z/4xaY4191o for a list of examples that show this behavior (at least in GCC). Notice that this applies for the sub-object type only. If the __bdos value is 0, then your behavior is the correct behavior.
A lot of your review seems to be based on this difference of opinion about what to do when indexing outside of the object currently being pointed to. Let's get this sorted out before I make changes...
As far as I can see, the "offset" value you're passing in is a difference between two pointers both of which are unknown to the middle-end -- it's the difference from the start of the two-levels-out enclosing class subobject to the one-level-out enclosing field subobject, I think. I could be misunderstanding something, but I don't see how you can use that information to get correct results from the builtin. Can you explain how the LLVM intrinsic uses this information?
I'm not passing a difference between two pointers; it's the offset from the start of the outermost RecordDecl to the object. The LLVM intrinsic recursively walks back through the instructions (that define the pointer) to try to calculate the value. It's rather convoluted and hard to read, because it involves two visitor classes called one from another. Eventually, it returns a Size / Offset pair (Offset from the start of the structure). At that point, I use the extra information I added to determine if those values are within the range of the sub-object. If it's outside of that range, I return 0. Otherwise, I calculate the remaining size after adjusting the offset (i.e. the offset is adjusted to be from the beginning of the sub-object rather than the start of the structure).
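To restate that calculation, here is a rough sketch in plain C++ of the adjustment being described (hypothetical helper and parameter names; the actual logic lives in LLVM's object-size analysis and does not take this exact form):

    #include <cstdint>

    // Given the pointer's offset from the start of the structure (as computed
    // by the middle-end) and the sub-object's size/offset passed via the new
    // intrinsic arguments, report the bytes remaining in the sub-object.
    static uint64_t sizeWithinSubobject(uint64_t PtrOffset, uint64_t SubobjSize,
                                        uint64_t SubobjOffset) {
      // Pointer falls outside [SubobjOffset, SubobjOffset + SubobjSize):
      // the sub-object size is reported as 0.
      if (PtrOffset < SubobjOffset || PtrOffset >= SubobjOffset + SubobjSize)
        return 0;
      // Re-base the offset to the start of the sub-object and return the
      // bytes remaining inside it.
      return SubobjSize - (PtrOffset - SubobjOffset);
    }

In other words, an offset that lands outside the sub-object yields 0, matching the GCC behavior shown in the godbolt link above.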
You're passing the difference between the address of some arbitrary class object and a field. The address of that class object is not the base address that the LLVM intrinsic will calculate. So the offset you're passing in is not meaningful. And there's no way it even can be meaningful. Consider something like this:
    struct A {
      int n;
    };

    struct B {
      int data[100];
      A a;
    };

    int f(A *p) { return __builtin_dynamic_object_size(&p->n, 1); }

    int g() {
      A a;
      return f(&a);
    }

    int h() {
      B b;
      return f(&b);
    }

After inlining, the BDOS intrinsic in g will see a base pointer of &a, and the intrinsic in h will see a base pointer of &b. The offset from the base pointer to the n subobject is different in each case. So there's no way you can compute a constant offset in Clang's lowering and pass it to the intrinsic.
What you can do is pass a pointer to p->n to the intrinsic as the start of the subobject. That'll do the right thing regardless of what LLVM computes to be the complete enclosing object.
You really need to make up your mind. First you tell me that I can't use LLVM's IR to determine the subobject, even though I did and it worked just fine, and now you're saying that I can't use the front-end's knowledge about the structure. In your example, you've explicitly lied to the compiler about the types being passed in. Unlike in GCC, we don't do any inlining in the front-end, so we can't properly handle this. I have no clue what you mean by pass a "pointer to p->n to the intrinsic" as that's already the first argument in the intrinsic.
First you tell me that I can't use LLVM's IR to determine the subobject, even though I did and it worked just fine
It definitely doesn't work fine, but sure, in some (possibly many or most) test cases it does happen to work currently.
now you're saying that I can't use the front-end's knowledge about the structure.
I'm not. I'm saying that you can't assume that the LLVM IR idea of the complete object lines up with some enclosing subobject you've found in the frontend. You're still trying to use the IR notion of complete object and subobjects, and that still doesn't work for the same reasons as before.
You can absolutely compute where the subobject is in the frontend, and pass that information onto LLVM. But you'll need to pass it on in a way that is meaningful to LLVM. For example, if you pass a pointer and size to the intrinsic describing the complete range of addresses covering the subobject, that should be fine. But an offset is not useful if the frontend and middle-end can't agree on what it's an offset relative to.
In your example, you've explicitly lied to the compiler about the types being passed in.
Sorry, that was a typo in my example: the call in h() should be f(&b.a).
I have no clue what you mean by pass a "pointer to p->n to the intrinsic" as that's already the first argument in the intrinsic.
I mean, pass a pointer to the start of the subobject. For example, for p->arr[i], you'd pass in &p->arr.
in some simple test cases it does appear to work.
Please don't be insulting. I didn't just run a couple of "simple" tests and called it a day. I had many rather convoluted examples and they all worked.
Sorry, that was a typo in my example: the call in h() should be f(&b.a).
Okay. This patch generates the correct answer (modified so it's calculated by the middle end):
    $ clang -O2 ~/llvm/a.c && ./a.out
    &((char *)&p->n)[idx]:
    __builtin_dynamic_object_size(&((char *)&p->n)[idx], 1) = 3
    &((char *)&p->n)[idx]:
    __builtin_dynamic_object_size(&((char *)&p->n)[idx], 1) = 3
I mean, pass a pointer to the start of the subobject. For example, for p->arr[i], you'd pass in &p->arr.
I don't think that's necessary. If we're given the original size and the middle end is calculating the final size, it may be sufficient to ignore the offset altogether for sub-objects. I'll see if it works.
in some simple test cases it does appear to work.
Please don't be insulting. I didn't just run a couple of "simple" tests and called it a day. I had many rather convoluted examples and they all worked.
I apologize, I didn't mean to imply you hadn't been diligent here.
    const Expr *VisitUnaryDeref(const clang::UnaryOperator *E) {
      return Visit(E->getSubExpr());
    }
I think you'll need to be more careful when walking through address-of / dereferences -- the set of things you should step over when traversing a pointer to an object is different from the set of things you should step over when traversing an object lvalue. For example, the bounds to use for *p->member will be computed as the bounds of member, which isn't correct. I think you could address this by either having separate traversals for pointers versus lvalues, or by avoiding (for example) stepping through lvalue-to-rvalue conversions when stepping over CastExprs -- and in fact, the latter seems like a good idea in general, given that a CastExpr could do pretty much anything to the pointer / lvalue. In general, I think it's only really safe to step over casts that are a no-op for address purposes. Bitcasts seem OK, address space conversions seem OK, etc. but a lot of cast kinds are not going to be reasonable to step over.
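To make the *p->member concern concrete, here is a hypothetical example (the types and function are invented for illustration and are not part of the patch):

    struct Inner { char buf[128]; };
    struct Outer { Inner **member; };

    unsigned long pointee_size(Outer *p) {
      // SubobjectFinder steps over the dereference and the implicit casts and
      // stops at the MemberExpr for 'member', so the size/offset passed to the
      // intrinsic would describe the 8-byte pointer field, not the Inner
      // object that *p->member actually points to.
      return __builtin_dynamic_object_size(*p->member, 1);
    }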
    const Expr *VisitCastExpr(const CastExpr *E) {
      return Visit(E->getSubExpr());
    }
A derived-to-base cast is a traversal to a subobject in C++, so we should presumably terminate the traversal when we reach one and use the offset and size of the base class as the subobject.
That'd be a pretty big delta from what you have here, but it'd be a good idea to add a FIXME here.
    const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) {
      return Visit(E->getBase());
    }
    const Expr *VisitCastExpr(const CastExpr *E) {
Does GCC look through explicit casts? I wonder if this should be restricted to ImplicitCastExprs.
It seems to. See https://godbolt.org/z/4xaY4191o for an example (the &((char *)&var.z.a)[argc] example looks through them).
    for (const Decl *D : RD->decls()) {
      if (const auto *Record = dyn_cast<RecordDecl>(D)) {
        std::pair<uint64_t, uint64_t> Res =
            getFieldInfo(CGF, Record->getDefinition(), VD,
                         Offset + Layout.getFieldOffset(FieldNo));
        if (Res.first != 0)
          return Res;
        continue;
      }

      if (const auto *FD = dyn_cast<FieldDecl>(D); FD == VD) {
        Offset += Layout.getFieldOffset(FieldNo);
        return std::make_pair(Ctx.getTypeSizeInChars(FD->getType()).getQuantity(),
                              Ctx.toCharUnitsFromBits(Offset).getQuantity());
      }

      if (isa<FieldDecl>(D))
        ++FieldNo;
    }
This recursive loop through the fields is not necessary: this function only succeeds when VD is a FieldDecl, so you can dyn_cast it to that type, then get the enclosing DeclContext to find the record, and use FieldDecl::getFieldIndex to find the field number.
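A minimal sketch of what that simplification might look like in CGBuiltin.cpp (illustrative only, assuming VD is the declaration returned by SubobjectFinder; note it reports the offset relative to the field's immediately enclosing record):

    static std::pair<uint64_t, uint64_t> getFieldInfo(CodeGenFunction &CGF,
                                                      const ValueDecl *VD) {
      const auto *FD = dyn_cast_or_null<FieldDecl>(VD);
      if (!FD)
        return std::make_pair(0, 0);

      ASTContext &Ctx = CGF.getContext();
      const RecordDecl *RD = FD->getParent(); // immediately enclosing record
      const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
      uint64_t OffsetBits = Layout.getFieldOffset(FD->getFieldIndex());

      return std::make_pair(
          Ctx.getTypeSizeInChars(FD->getType()).getQuantity(),
          Ctx.toCharUnitsFromBits(OffsetBits).getQuantity());
    }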
    QualType Ty = VD->getType();
    if (Ty->isPointerType())
      Ty = Ty->getPointeeType();
    OuterRD = Ty->getAsRecordDecl();
If I'm reading this correctly, I think this case is redundant: getFieldInfo only succeeds when VD is a field, but we're not going to have an evaluated DeclRefExpr that names a field. Can we return 0, 0 in this case, like we do for compound literals? I think the only case when we have non-zero values to return is when we've found a FieldDecl.
✅ With the latest revision this PR passed the C/C++ code formatter.