ahatanak
Collaborator

This addresses an issue introduced by 0a9c08c, which implemented P2280R4.

This fixes the issue reported in
#95474 (comment).

rdar://149897839

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Sep 10, 2025
@llvmbot
Member

llvmbot commented Sep 10, 2025

@llvm/pr-subscribers-clang

Author: Akira Hatanaka (ahatanak)

Changes

This addresses an issue introduced by 0a9c08c, which implemented P2280R4.

This fixes the issue reported in
#95474 (comment).

rdar://149897839


Full diff: https://github.com/llvm/llvm-project/pull/157778.diff

2 Files Affected:

  • (modified) clang/lib/AST/ExprConstant.cpp (+15-1)
  • (added) clang/test/Sema/builtin-object-size.cpp (+48)
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index 2376e482a19f5..229211c4f2cb7 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -13279,6 +13279,9 @@ static bool refersToCompleteObject(const LValue &LVal) {
   if (LVal.Designator.Invalid)
     return false;
 
+  if (LVal.AllowConstexprUnknown)
+    return false;
+
   if (!LVal.Designator.Entries.empty())
     return LVal.Designator.isMostDerivedAnUnsizedArray();
 
@@ -13328,7 +13331,7 @@ static bool isUserWritingOffTheEnd(const ASTContext &Ctx, const LValue &LVal) {
     return false;
   };
 
-  return LVal.InvalidBase &&
+  return (LVal.InvalidBase || LVal.AllowConstexprUnknown) &&
          Designator.Entries.size() == Designator.MostDerivedPathLength &&
          Designator.MostDerivedIsArrayElement && isFlexibleArrayMember() &&
          isDesignatorAtObjectEnd(Ctx, LVal);
@@ -13396,6 +13399,17 @@ static bool determineEndOffset(EvalInfo &Info, SourceLocation ExprLoc,
     if (LVal.InvalidBase)
       return false;
 
+    // We cannot determine the end offset of the entire object if this is an
+    // unknown reference.
+    if (Type == 0 && LVal.AllowConstexprUnknown)
+      return false;
+
+    // We cannot determine the end offset of the subobject if this is an
+    // unknown reference and the subobject designator is invalid (e.g., unsized
+    // array designator).
+    if (Type == 1 && LVal.Designator.Invalid && LVal.AllowConstexprUnknown)
+      return false;
+
     QualType BaseTy = getObjectType(LVal.getLValueBase());
     const bool Ret = CheckedHandleSizeof(BaseTy, EndOffset);
     addFlexibleArrayMemberInitSize(Info, BaseTy, LVal, EndOffset);
diff --git a/clang/test/Sema/builtin-object-size.cpp b/clang/test/Sema/builtin-object-size.cpp
new file mode 100644
index 0000000000000..3995e1880ec81
--- /dev/null
+++ b/clang/test/Sema/builtin-object-size.cpp
@@ -0,0 +1,48 @@
+// RUN: %clang_cc1 -triple x86_64-apple-darwin -fsyntax-only -fstrict-flex-arrays=0 -DSTRICT0 -std=c++23 -verify %s
+// RUN: %clang_cc1 -triple x86_64-apple-darwin -fsyntax-only -fstrict-flex-arrays=1 -DSTRICT1 -std=c++23 -verify %s
+// RUN: %clang_cc1 -triple x86_64-apple-darwin -fsyntax-only -fstrict-flex-arrays=2 -DSTRICT2 -std=c++23 -verify %s
+// RUN: %clang_cc1 -triple x86_64-apple-darwin -fsyntax-only -fstrict-flex-arrays=3 -DSTRICT3 -std=c++23 -verify %s
+
+struct EmptyS {
+  int i;
+  char a[];
+};
+
+template <unsigned N>
+struct S {
+  int i;
+  char a[N];
+};
+
+extern S<2> &s2;
+static_assert(__builtin_object_size(s2.a, 0)); // expected-error {{static assertion expression is not an integral constant expression}}
+static_assert(__builtin_object_size(s2.a, 1) == 2);
+#if defined(STRICT0)
+// expected-error@-2 {{static assertion expression is not an integral constant expression}}
+#endif
+static_assert(__builtin_object_size(s2.a, 2) == 4);
+static_assert(__builtin_object_size(s2.a, 3) == 2);
+
+extern S<1> &s1;
+static_assert(__builtin_object_size(s1.a, 0)); // expected-error {{static assertion expression is not an integral constant expression}}
+static_assert(__builtin_object_size(s1.a, 1) == 1);
+#if defined(STRICT0) || defined(STRICT1)
+// expected-error@-2 {{static assertion expression is not an integral constant expression}}
+#endif
+static_assert(__builtin_object_size(s1.a, 2) == 4);
+static_assert(__builtin_object_size(s1.a, 3) == 1);
+
+extern S<0> &s0;
+static_assert(__builtin_object_size(s0.a, 0)); // expected-error {{static assertion expression is not an integral constant expression}}
+static_assert(__builtin_object_size(s0.a, 1) == 0);
+#if defined(STRICT0) || defined(STRICT1) || defined(STRICT2)
+// expected-error@-2 {{static assertion expression is not an integral constant expression}}
+#endif
+static_assert(__builtin_object_size(s0.a, 2) == 0);
+static_assert(__builtin_object_size(s0.a, 3) == 0);
+
+extern EmptyS &empty;
+static_assert(__builtin_object_size(empty.a, 0)); // expected-error {{static assertion expression is not an integral constant expression}}
+static_assert(__builtin_object_size(empty.a, 1)); // expected-error {{static assertion expression is not an integral constant expression}}
+static_assert(__builtin_object_size(empty.a, 2) == 0);
+static_assert(__builtin_object_size(empty.a, 3)); // expected-error {{static assertion expression is not an integral constant expression}}
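The `#if defined(STRICTn)` guards in the test above track the `-fstrict-flex-arrays` levels: when a level treats the trailing array as a flexible array member, the Type 1 query on an unknown reference is expected not to be a constant expression. A minimal sketch of that mapping, written as a standalone model (the function name and the `-1` encoding for `char a[]` are assumptions made for illustration, not Clang code):

```cpp
// Standalone model of the -fstrict-flex-arrays levels; not Clang's implementation.
// `extent` is the declared array bound, with -1 standing in for `char a[]`
// (a hypothetical encoding chosen for this sketch).
constexpr bool treatedAsFlexibleArray(int strictLevel, int extent) {
  return strictLevel == 0   ? true         // level 0: any trailing array
         : strictLevel == 1 ? extent <= 1  // level 1: [], [0], [1]
         : strictLevel == 2 ? extent <= 0  // level 2: [], [0]
                            : extent < 0;  // level 3: [] only
}

// Mirrors the test: S<2> is flexible only at level 0, EmptyS at every level.
static_assert(treatedAsFlexibleArray(0, 2), "S<2> at level 0");
static_assert(!treatedAsFlexibleArray(1, 2), "S<2> at level 1");
static_assert(treatedAsFlexibleArray(3, -1), "EmptyS at level 3");
```

Read against the test, each `expected-error@-2` line corresponds to a level for which this model returns `true` for the struct being queried.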

Collaborator

@efriedma-quic efriedma-quic left a comment

I suspect we actually need to bail out in more cases: if the underlying object could become known later, we need to refuse to evaluate __builtin_object_size. If you have extern S<2> &s2;, the definition could be completed later, and we need to ensure that we don't produce different values based on that.

I'd also like to see some tests for potential constant expression checking (see #151053).
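For context on "more cases": the checks visible in the `determineEndOffset` hunk refuse only Type 0, and Type 1 with an invalid designator, when the lvalue is an unknown reference. A rough standalone model of those conditions follows. The struct and function names are hypothetical and merely mirror the field names in the diff; the rest of the real function is omitted.

```cpp
// Hypothetical model of the three early returns shown in the hunk.
// It is not Clang's LValue type and ignores everything outside the hunk.
struct LValueModel {
  bool InvalidBase;
  bool AllowConstexprUnknown;
  bool DesignatorInvalid;
};

// Returns true where the modeled code would `return false` (refuse to
// compute an end offset).
constexpr bool bailsOut(const LValueModel &LV, unsigned Type) {
  return LV.InvalidBase ||
         (Type == 0 && LV.AllowConstexprUnknown) ||
         (Type == 1 && LV.DesignatorInvalid && LV.AllowConstexprUnknown);
}

// An unknown reference with a valid designator: Type 0 is refused,
// Types 1 through 3 are not.
static_assert(bailsOut(LValueModel{false, true, false}, 0), "Type 0 refused");
static_assert(!bailsOut(LValueModel{false, true, false}, 2), "Type 2 evaluated");
```

In this model, Types 2 and 3 still yield a value for an unknown reference, which appears to be the situation the comment above is concerned about.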

@ahatanak
Collaborator Author

I'd also like to see some tests for potential constant expression checking (see #151053).

Something like this?

constexpr const int* add(const int &a) { return &a+3; }
constexpr int arr[4]{0, 1, 2, 3};
static_assert(__builtin_object_size(add(arr[0]), 0) == 4);

Could you elaborate on what kind of tests you have in mind?

@efriedma-quic
Collaborator

Try the following with -std=c++23 -Winvalid-constexpr:

constexpr int f(int &a) {
  return 1 / (__builtin_object_size(&a, 0) - 4);
}
int a[2];
static_assert(f(a[0]) == 0);
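One way to read this test: while `f` is checked as a potential constant expression, `a` is an unknown `int` reference, so an evaluator that assumed a 4-byte object would see a division by zero, whereas the actual call sees the 8-byte array. The arithmetic can be sketched with the object size passed in explicitly (`fModel` is a hypothetical stand-in for `f`, and a 4-byte `int` is assumed):

```cpp
#include <cstddef>

// Hypothetical stand-in for f: the object size is a parameter instead of
// coming from __builtin_object_size(&a, 0).
constexpr int fModel(std::size_t objectSize) {
  return static_cast<int>(1 / (objectSize - 4));
}

// int a[2] with a 4-byte int: 8 bytes remain from &a[0], so 1 / (8 - 4) == 0.
static_assert(fModel(8) == 0, "matches static_assert(f(a[0]) == 0)");
// fModel(4) would divide by zero and is therefore not a constant expression.
```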

@ahatanak
Collaborator Author

I suspect we actually need to bail out in more cases: if the underlying object could become known later, we need to refuse to evaluate __builtin_object_size.

Does this mean that determineEndOffset should never use a conservative upper or lower bound? If that's the case, I think the following piece of code in the function is wrong.

// If we cannot determine the size of the initial allocation, then we can't
// give an accurate upper-bound. However, we are still able to give
// conservative lower-bounds for Type=3.
if (Type == 1)
  return false;

Shouldn't it return false when Type is 3 too?

@efriedma-quic
Collaborator

In any context where the standard requires constant evaluation, we have to return false, probably. (This corresponds, roughly, to Info.InConstantContext. I don't think we have a bit that precisely corresponds, though.)

We probably do want to fold in tryEvaluateObjectSize, though.

@ahatanak
Collaborator Author

Shouldn't it return false even in contexts that don't require constant evaluation?

The following function (adapted from test/AST/ByteCode/builtin-object-size-codegen.cpp) returns 16 because determineEndOffset uses a conservative lower bound. gcc returns 32.

#include <cstdlib>

int foo() {
  struct A { char buf[16]; };
  struct B : A {};
  struct C { int i; B bs[1]; } *c = (C*)malloc(sizeof(C) + sizeof(B));

  int gi;
  gi = __builtin_object_size(&c->bs[0], 3);
  return gi;
}

This is exactly the case where the underlying object could become known later.
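The two reported values are consistent with two different bounds on the same allocation: the declared subobject `bs` alone, or everything from `bs` to the end of the `malloc`ed block. A small sketch of the arithmetic, assuming a 4-byte, 4-byte-aligned `int` (the constant names are hypothetical):

```cpp
#include <cstddef>

struct A { char buf[16]; };
struct B : A {};
struct C { int i; B bs[1]; };

// Bytes requested from malloc in the example above.
constexpr std::size_t kAllocSize = sizeof(C) + sizeof(B);
// Bound if only the declared subobject bs (one B) is counted.
constexpr std::size_t kSubobjectBound = sizeof(B);
// Bound if bs is treated as running to the end of the allocation.
constexpr std::size_t kAllocationBound = kAllocSize - offsetof(C, bs);

static_assert(kSubobjectBound == 16, "the value reported for Clang");
static_assert(kAllocationBound == 32, "the value reported for GCC");
```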

@efriedma-quic
Collaborator

We can't represent Type==3 in LLVM IR; CodeGenFunction::emitBuiltinObjectSize unconditionally bails. So we either evaluate in the frontend, or not at all.

@ahatanak
Collaborator Author

We can't represent Type==3 in LLVM IR; CodeGenFunction::emitBuiltinObjectSize unconditionally bails. So we either evaluate in the frontend, or not at all.

I see, thank you for clarifying. In that case, we shouldn't return false unless constant evaluation is required.

- Bail out in more cases.
- Test potential constant expression.
@ahatanak
Collaborator Author

In any context where the standard requires constant evaluation, we have to return false, probably. (This corresponds, roughly, to Info.InConstantContext. I don't think we have a bit that precisely corresponds, though.)

For Type=3, conditionally returning false when InConstantContext is set triggered undesirable warnings (e.g., clang emitted a division-by-zero warning for function f3 in constant-expression-p2280r4.cpp), so I removed that check. As a result, CodeGen now emits a lower bound of 0 in cases where it could otherwise emit a tighter bound using the conservative lower bound (see clang/test/CodeGenCXX/builtin-object-size.cpp).
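The "lower bound of 0" is the documented fallback of `__builtin_object_size` when the size cannot be determined: `(size_t)-1` for Types 0 and 1, which are maximum estimates, and `0` for Types 2 and 3, which are minimum estimates. As a small sketch (`unknownObjectSize` is a hypothetical helper, not part of Clang or GCC):

```cpp
#include <cstddef>

// Value the builtin yields when nothing is known about the object:
// bit 1 of the type selects a minimum (0) rather than a maximum (-1).
constexpr std::size_t unknownObjectSize(unsigned type) {
  return (type & 2) ? 0 : static_cast<std::size_t>(-1);
}

static_assert(unknownObjectSize(3) == 0, "Type 3 falls back to 0");
static_assert(unknownObjectSize(0) == static_cast<std::size_t>(-1),
              "Type 0 falls back to the maximum value");
```

A result of 0 for Type 3 is always a valid lower bound; it is just less useful than the tighter conservative bound the frontend could previously produce.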
