Skip to content

Conversation

@AaronBallman
Copy link
Collaborator

@AaronBallman AaronBallman commented Mar 28, 2025

This feature largely models the same behavior as in C++11. It is technically a breaking change between C99 and C11, so the paper is not being backported to older language modes.

One difference between C++ and C is that things which are rvalues in C are often lvalues in C++ (such as the result of a ternary operator or a comma operator).

Fixes #96486

This feature largely models the same behavior as in C++11. It is
technically a breaking change between C99 and C11, so the paper is not
being backported to older language modes.

One difference between C++ and C is that things which are rvalues in C
are often lvalues in C++ (such as the result of a ternary operator or a
comma operator).
@AaronBallman AaronBallman added clang Clang issues not falling into any other category c11 clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Mar 28, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 28, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Aaron Ballman (AaronBallman)

Changes

This feature largely models the same behavior as in C++11. It is technically a breaking change between C99 and C11, so the paper is not being backported to older language modes.

One difference between C++ and C is that things which are rvalues in C are often lvalues in C++ (such as the result of a ternary operator or a comma operator).


Patch is 21.41 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/133472.diff

8 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+9)
  • (modified) clang/lib/CodeGen/CGExpr.cpp (+2-2)
  • (modified) clang/lib/Sema/Sema.cpp (+3-2)
  • (modified) clang/lib/Sema/SemaInit.cpp (+1-1)
  • (modified) clang/test/C/C11/n1285.c (+70-13)
  • (added) clang/test/C/C11/n1285_1.c (+212)
  • (modified) clang/test/CodeGenObjC/property-array-type.m (+3-2)
  • (modified) clang/www/c_status.html (+1-1)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index b96780cec75d9..c51b3c44f3b3b 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -160,6 +160,15 @@ C23 Feature Support
   treated as if the compound literal were within the body rather than at file
   scope.
 
+C11 Feature Support
+^^^^^^^^^^^^^^^^^^^
+- Implemented `WG14 N1285 <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1285.htm>`_
+  which introduces the notion of objects with a temporary lifetime. When an
+  expression resulting in an rvalue with structure or union type and that type
+  contains a member of array type, the expression result is an automatic storage
+  duration object with temporary lifetime which begins when the expression is
+  evaluated and ends at the evaluation of the containing full expression.
+
 Non-comprehensive list of changes in this release
 -------------------------------------------------
 
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 5943ff9294e1a..2328ad0def736 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -387,8 +387,8 @@ pushTemporaryCleanup(CodeGenFunction &CGF, const MaterializeTemporaryExpr *M,
   if (const RecordType *RT =
           E->getType()->getBaseElementTypeUnsafe()->getAs<RecordType>()) {
     // Get the destructor for the reference temporary.
-    auto *ClassDecl = cast<CXXRecordDecl>(RT->getDecl());
-    if (!ClassDecl->hasTrivialDestructor())
+    if (auto *ClassDecl = dyn_cast<CXXRecordDecl>(RT->getDecl());
+        ClassDecl && !ClassDecl->hasTrivialDestructor())
       ReferenceTemporaryDtor = ClassDecl->getDestructor();
   }
 
diff --git a/clang/lib/Sema/Sema.cpp b/clang/lib/Sema/Sema.cpp
index 93a2d797679d4..4b573abb8efba 100644
--- a/clang/lib/Sema/Sema.cpp
+++ b/clang/lib/Sema/Sema.cpp
@@ -741,10 +741,11 @@ ExprResult Sema::ImpCastExprToType(Expr *E, QualType Ty,
   if (Kind == CK_ArrayToPointerDecay) {
     // C++1z [conv.array]: The temporary materialization conversion is applied.
     // We also use this to fuel C++ DR1213, which applies to C++11 onwards.
-    if (getLangOpts().CPlusPlus && E->isPRValue()) {
+    if ((getLangOpts().C11 || getLangOpts().CPlusPlus) && E->isPRValue()) {
       // The temporary is an lvalue in C++98 and an xvalue otherwise.
       ExprResult Materialized = CreateMaterializeTemporaryExpr(
-          E->getType(), E, !getLangOpts().CPlusPlus11);
+          E->getType(), E,
+          getLangOpts().CPlusPlus && !getLangOpts().CPlusPlus11);
       if (Materialized.isInvalid())
         return ExprError();
       E = Materialized.get();
diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp
index 9814c3f456f0d..c47ea71cbafb5 100644
--- a/clang/lib/Sema/SemaInit.cpp
+++ b/clang/lib/Sema/SemaInit.cpp
@@ -7654,7 +7654,7 @@ ExprResult Sema::TemporaryMaterializationConversion(Expr *E) {
   // FIXME: This means that AST consumers need to deal with "prvalues" that
   // denote materialized temporaries. Maybe we should add another ValueKind
   // for "xvalue pretending to be a prvalue" for C++98 support.
-  if (!E->isPRValue() || !getLangOpts().CPlusPlus11)
+  if (!E->isPRValue() || (!getLangOpts().CPlusPlus11 && !getLangOpts().C11))
     return E;
 
   // C++1z [conv.rval]/1: T shall be a complete type.
diff --git a/clang/test/C/C11/n1285.c b/clang/test/C/C11/n1285.c
index e7a9463e68103..c4fb199e33ae5 100644
--- a/clang/test/C/C11/n1285.c
+++ b/clang/test/C/C11/n1285.c
@@ -1,24 +1,81 @@
-// RUN: %clang_cc1 -verify=wrong -std=c99 %s
-// RUN: %clang_cc1 -verify=wrong -std=c11 %s
-// RUN: %clang_cc1 -verify=cpp -std=c++11 -x c++ %s
+// RUN: %clang_cc1 -fsyntax-only -verify=good -std=c99 %s
+// RUN: %clang_cc1 -fsyntax-only -verify=expected,c -std=c11 %s
+// RUN: %clang_cc1 -fsyntax-only -verify=expected,cpp -std=c++11 -x c++ %s
 
-/* WG14 N1285: No
+/* WG14 N1285: Clang 21
  * Extending the lifetime of temporary objects (factored approach)
  *
- * NB: we do not properly materialize temporary expressions in situations where
- * it would be expected; that is why the "no-diagnostics" marking is named
- * "wrong". We do issue the expected diagnostic in C++ mode.
+ * This paper introduced the notion of an object with a temporary lifetime. Any
+ * operation resulting in an rvalue of structure or union type which contains
+ * an array results in an object with temporary lifetime.
  */
 
-// wrong-no-diagnostics
+// good-no-diagnostics
+
+// C11 6.2.4p8: A non-lvalue expression with structure or union type, where the
+// structure or union contains a member with array type (including,
+// recursively, members of all contained structures and unions) refers to an
+// object with automatic storage duration and temporary lifetime. Its lifetime
+// begins when the expression is evaluated and its initial value is the value
+// of the expression. Its lifetime ends when the evaluation of the containing
+// full expression or full declarator ends. Any attempt to modify an object
+// with temporary lifetime results in undefined behavior.
 
 struct X { int a[5]; };
 struct X f(void);
 
-int foo(void) {
-  // FIXME: This diagnostic should be issued in C11 as well (though not in C99,
-  // as this paper was a breaking change between C99 and C11).
-  int *p = f().a; // cpp-warning {{temporary whose address is used as value of local variable 'p' will be destroyed at the end of the full-expression}}
-  return *p;
+union U { int a[10]; double d; };
+union U g(void);
+
+void sink(int *);
+
+int func_return(void) {
+  int *p = f().a; // expected-warning {{temporary whose address is used as value of local variable 'p' will be destroyed at the end of the full-expression}}
+  p = f().a;      // expected-warning {{object backing the pointer p will be destroyed at the end of the full-expression}}
+  p = g().a;      // expected-warning {{object backing the pointer p will be destroyed at the end of the full-expression}}
+  sink(f().a);    // Ok
+  return *f().a;  // Ok
+}
+
+int ternary(void) {
+  int *p = (1 ? (struct X){ 0 } : f()).a; // expected-warning {{temporary whose address is used as value of local variable 'p' will be destroyed at the end of the full-expression}}
+  int *r = (1 ? (union U){ 0 } : g()).a;  // expected-warning {{temporary whose address is used as value of local variable 'r' will be destroyed at the end of the full-expression}}
+  p = (1 ? (struct X){ 0 } : f()).a;      // expected-warning {{object backing the pointer p will be destroyed at the end of the full-expression}}
+  sink((1 ? (struct X){ 0 } : f()).a);    // Ok
+
+  // This intentionally gets one diagnostic in C and two in C++. In C, the
+  // compound literal results in an lvalue, not an rvalue as it does in C++. So
+  // only one branch results in a temporary in C but both branches do in C++.
+  int *q = 1 ? (struct X){ 0 }.a : f().a; // expected-warning {{temporary whose address is used as value of local variable 'q' will be destroyed at the end of the full-expression}} \
+                                             cpp-warning {{temporary whose address is used as value of local variable 'q' will be destroyed at the end of the full-expression}}
+  q = 1 ? (struct X){ 0 }.a : f().a; // expected-warning {{object backing the pointer q will be destroyed at the end of the full-expression}} \
+                                        cpp-warning {{object backing the pointer q will be destroyed at the end of the full-expression}}
+  q = 1 ? (struct X){ 0 }.a : g().a; // expected-warning {{object backing the pointer q will be destroyed at the end of the full-expression}} \
+                                        cpp-warning {{object backing the pointer q will be destroyed at the end of the full-expression}}
+  sink(1 ? (struct X){ 0 }.a : f().a); // Ok
+  return *(1 ? (struct X){ 0 }.a : f().a); // Ok
 }
 
+int comma(void) {
+  struct X x;
+  int *p = ((void)0, x).a; // c-warning {{temporary whose address is used as value of local variable 'p' will be destroyed at the end of the full-expression}}
+  p = ((void)0, x).a;      // c-warning {{object backing the pointer p will be destroyed at the end of the full-expression}}
+  sink(((void)0, x).a);    // Ok
+  return *(((void)0, x).a);// Ok
+}
+
+int cast(void) {
+  struct X x;
+  int *p = ((struct X)x).a;  // expected-warning {{temporary whose address is used as value of local variable 'p' will be destroyed at the end of the full-expression}}
+  p = ((struct X)x).a;       // expected-warning {{object backing the pointer p will be destroyed at the end of the full-expression}}
+  sink(((struct X)x).a);     // Ok
+  return *(((struct X)x).a); // Ok
+}
+
+int assign(void) {
+  struct X x, s;
+  int *p = (x = s).a;  // c-warning {{temporary whose address is used as value of local variable 'p' will be destroyed at the end of the full-expression}}
+  p = (x = s).a;       // c-warning {{object backing the pointer p will be destroyed at the end of the full-expression}}
+  sink((x = s).a);     // Ok
+  return *((x = s).a); // Ok
+}
diff --git a/clang/test/C/C11/n1285_1.c b/clang/test/C/C11/n1285_1.c
new file mode 100644
index 0000000000000..6a3df58e5a6a0
--- /dev/null
+++ b/clang/test/C/C11/n1285_1.c
@@ -0,0 +1,212 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
+// RUN: %clang_cc1 -std=c99 -Wno-dangling -emit-llvm -o - %s | FileCheck %s --check-prefix=C99
+// RUN: %clang_cc1 -std=c11 -Wno-dangling -emit-llvm -o - %s | FileCheck %s --check-prefix=C11
+
+// Ensure that codegen for temporary lifetimes makes sense.
+
+struct X { int a[5]; };
+struct X f(void);
+
+// C99-LABEL: define dso_local i32 @func_return(
+// C99-SAME: ) #[[ATTR0:[0-9]+]] {
+// C99-NEXT:  [[ENTRY:.*:]]
+// C99-NEXT:    [[P:%.*]] = alloca ptr, align 8
+// C99-NEXT:    [[TMP:%.*]] = alloca [[STRUCT_X:%.*]], align 4
+// C99-NEXT:    call void @f(ptr dead_on_unwind writable sret([[STRUCT_X]]) align 4 [[TMP]])
+// C99-NEXT:    [[A:%.*]] = getelementptr inbounds nuw [[STRUCT_X]], ptr [[TMP]], i32 0, i32 0
+// C99-NEXT:    [[ARRAYDECAY:%.*]] = getelementptr inbounds [5 x i32], ptr [[A]], i64 0, i64 0
+// C99-NEXT:    store ptr [[ARRAYDECAY]], ptr [[P]], align 8
+// C99-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[P]], align 8
+// C99-NEXT:    [[TMP1:%.*]] = load i32, ptr [[TMP0]], align 4
+// C99-NEXT:    ret i32 [[TMP1]]
+//
+// C11-LABEL: define dso_local i32 @func_return(
+// C11-SAME: ) #[[ATTR0:[0-9]+]] {
+// C11-NEXT:  [[ENTRY:.*:]]
+// C11-NEXT:    [[P:%.*]] = alloca ptr, align 8
+// C11-NEXT:    [[REF_TMP:%.*]] = alloca [[STRUCT_X:%.*]], align 4
+// C11-NEXT:    call void @f(ptr dead_on_unwind writable sret([[STRUCT_X]]) align 4 [[REF_TMP]])
+// C11-NEXT:    [[A:%.*]] = getelementptr inbounds nuw [[STRUCT_X]], ptr [[REF_TMP]], i32 0, i32 0
+// C11-NEXT:    [[ARRAYDECAY:%.*]] = getelementptr inbounds [5 x i32], ptr [[A]], i64 0, i64 0
+// C11-NEXT:    store ptr [[ARRAYDECAY]], ptr [[P]], align 8
+// C11-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[P]], align 8
+// C11-NEXT:    [[TMP1:%.*]] = load i32, ptr [[TMP0]], align 4
+// C11-NEXT:    ret i32 [[TMP1]]
+//
+int func_return(void) {
+  int *p = f().a;
+  return *p;
+}
+
+// C99-LABEL: define dso_local i32 @ternary(
+// C99-SAME: ) #[[ATTR0]] {
+// C99-NEXT:  [[ENTRY:.*:]]
+// C99-NEXT:    [[P:%.*]] = alloca ptr, align 8
+// C99-NEXT:    [[TMP:%.*]] = alloca [[STRUCT_X:%.*]], align 4
+// C99-NEXT:    [[Q:%.*]] = alloca ptr, align 8
+// C99-NEXT:    [[DOTCOMPOUNDLITERAL:%.*]] = alloca [[STRUCT_X]], align 4
+// C99-NEXT:    br i1 true, label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]]
+// C99:       [[COND_TRUE]]:
+// C99-NEXT:    call void @llvm.memset.p0.i64(ptr align 4 [[TMP]], i8 0, i64 20, i1 false)
+// C99-NEXT:    [[A:%.*]] = getelementptr inbounds nuw [[STRUCT_X]], ptr [[TMP]], i32 0, i32 0
+// C99-NEXT:    br label %[[COND_END:.*]]
+// C99:       [[COND_FALSE]]:
+// C99-NEXT:    call void @f(ptr dead_on_unwind writable sret([[STRUCT_X]]) align 4 [[TMP]])
+// C99-NEXT:    br label %[[COND_END]]
+// C99:       [[COND_END]]:
+// C99-NEXT:    [[A1:%.*]] = getelementptr inbounds nuw [[STRUCT_X]], ptr [[TMP]], i32 0, i32 0
+// C99-NEXT:    [[ARRAYDECAY:%.*]] = getelementptr inbounds [5 x i32], ptr [[A1]], i64 0, i64 0
+// C99-NEXT:    store ptr [[ARRAYDECAY]], ptr [[P]], align 8
+// C99-NEXT:    call void @llvm.memset.p0.i64(ptr align 4 [[DOTCOMPOUNDLITERAL]], i8 0, i64 20, i1 false)
+// C99-NEXT:    [[A2:%.*]] = getelementptr inbounds nuw [[STRUCT_X]], ptr [[DOTCOMPOUNDLITERAL]], i32 0, i32 0
+// C99-NEXT:    [[A3:%.*]] = getelementptr inbounds nuw [[STRUCT_X]], ptr [[DOTCOMPOUNDLITERAL]], i32 0, i32 0
+// C99-NEXT:    [[ARRAYDECAY4:%.*]] = getelementptr inbounds [5 x i32], ptr [[A3]], i64 0, i64 0
+// C99-NEXT:    store ptr [[ARRAYDECAY4]], ptr [[Q]], align 8
+// C99-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[P]], align 8
+// C99-NEXT:    [[TMP1:%.*]] = load i32, ptr [[TMP0]], align 4
+// C99-NEXT:    [[TMP2:%.*]] = load ptr, ptr [[Q]], align 8
+// C99-NEXT:    [[TMP3:%.*]] = load i32, ptr [[TMP2]], align 4
+// C99-NEXT:    [[ADD:%.*]] = add nsw i32 [[TMP1]], [[TMP3]]
+// C99-NEXT:    ret i32 [[ADD]]
+//
+// C11-LABEL: define dso_local i32 @ternary(
+// C11-SAME: ) #[[ATTR0]] {
+// C11-NEXT:  [[ENTRY:.*:]]
+// C11-NEXT:    [[P:%.*]] = alloca ptr, align 8
+// C11-NEXT:    [[REF_TMP:%.*]] = alloca [[STRUCT_X:%.*]], align 4
+// C11-NEXT:    [[Q:%.*]] = alloca ptr, align 8
+// C11-NEXT:    [[DOTCOMPOUNDLITERAL:%.*]] = alloca [[STRUCT_X]], align 4
+// C11-NEXT:    br i1 true, label %[[COND_TRUE:.*]], label %[[COND_FALSE:.*]]
+// C11:       [[COND_TRUE]]:
+// C11-NEXT:    call void @llvm.memset.p0.i64(ptr align 4 [[REF_TMP]], i8 0, i64 20, i1 false)
+// C11-NEXT:    [[A:%.*]] = getelementptr inbounds nuw [[STRUCT_X]], ptr [[REF_TMP]], i32 0, i32 0
+// C11-NEXT:    br label %[[COND_END:.*]]
+// C11:       [[COND_FALSE]]:
+// C11-NEXT:    call void @f(ptr dead_on_unwind writable sret([[STRUCT_X]]) align 4 [[REF_TMP]])
+// C11-NEXT:    br label %[[COND_END]]
+// C11:       [[COND_END]]:
+// C11-NEXT:    [[A1:%.*]] = getelementptr inbounds nuw [[STRUCT_X]], ptr [[REF_TMP]], i32 0, i32 0
+// C11-NEXT:    [[ARRAYDECAY:%.*]] = getelementptr inbounds [5 x i32], ptr [[A1]], i64 0, i64 0
+// C11-NEXT:    store ptr [[ARRAYDECAY]], ptr [[P]], align 8
+// C11-NEXT:    call void @llvm.memset.p0.i64(ptr align 4 [[DOTCOMPOUNDLITERAL]], i8 0, i64 20, i1 false)
+// C11-NEXT:    [[A2:%.*]] = getelementptr inbounds nuw [[STRUCT_X]], ptr [[DOTCOMPOUNDLITERAL]], i32 0, i32 0
+// C11-NEXT:    [[A3:%.*]] = getelementptr inbounds nuw [[STRUCT_X]], ptr [[DOTCOMPOUNDLITERAL]], i32 0, i32 0
+// C11-NEXT:    [[ARRAYDECAY4:%.*]] = getelementptr inbounds [5 x i32], ptr [[A3]], i64 0, i64 0
+// C11-NEXT:    store ptr [[ARRAYDECAY4]], ptr [[Q]], align 8
+// C11-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[P]], align 8
+// C11-NEXT:    [[TMP1:%.*]] = load i32, ptr [[TMP0]], align 4
+// C11-NEXT:    [[TMP2:%.*]] = load ptr, ptr [[Q]], align 8
+// C11-NEXT:    [[TMP3:%.*]] = load i32, ptr [[TMP2]], align 4
+// C11-NEXT:    [[ADD:%.*]] = add nsw i32 [[TMP1]], [[TMP3]]
+// C11-NEXT:    ret i32 [[ADD]]
+//
+int ternary(void) {
+  int *p;
+  p = (1 ? (struct X){ 0 } : f()).a;
+  int *q = 1 ? (struct X){ 0 }.a : f().a;
+
+  return *p + *q;
+}
+
+// C99-LABEL: define dso_local i32 @comma(
+// C99-SAME: ) #[[ATTR0]] {
+// C99-NEXT:  [[ENTRY:.*:]]
+// C99-NEXT:    [[X:%.*]] = alloca [[STRUCT_X:%.*]], align 4
+// C99-NEXT:    [[P:%.*]] = alloca ptr, align 8
+// C99-NEXT:    [[A:%.*]] = getelementptr inbounds nuw [[STRUCT_X]], ptr [[X]], i32 0, i32 0
+// C99-NEXT:    [[ARRAYDECAY:%.*]] = getelementptr inbounds [5 x i32], ptr [[A]], i64 0, i64 0
+// C99-NEXT:    store ptr [[ARRAYDECAY]], ptr [[P]], align 8
+// C99-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[P]], align 8
+// C99-NEXT:    [[TMP1:%.*]] = load i32, ptr [[TMP0]], align 4
+// C99-NEXT:    ret i32 [[TMP1]]
+//
+// C11-LABEL: define dso_local i32 @comma(
+// C11-SAME: ) #[[ATTR0]] {
+// C11-NEXT:  [[ENTRY:.*:]]
+// C11-NEXT:    [[X:%.*]] = alloca [[STRUCT_X:%.*]], align 4
+// C11-NEXT:    [[P:%.*]] = alloca ptr, align 8
+// C11-NEXT:    [[REF_TMP:%.*]] = alloca [[STRUCT_X]], align 4
+// C11-NEXT:    call void @llvm.memcpy.p0.p0.i64(ptr align 4 [[REF_TMP]], ptr align 4 [[X]], i64 20, i1 false)
+// C11-NEXT:    [[A:%.*]] = getelementptr inbounds nuw [[STRUCT_X]], ptr [[REF_TMP]], i32 0, i32 0
+// C11-NEXT:    [[ARRAYDECAY:%.*]] = getelementptr inbounds [5 x i32], ptr [[A]], i64 0, i64 0
+// C11-NEXT:    store ptr [[ARRAYDECAY]], ptr [[P]], align 8
+// C11-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[P]], align 8
+// C11-NEXT:    [[TMP1:%.*]] = load i32, ptr [[TMP0]], align 4
+// C11-NEXT:    ret i32 [[TMP1]]
+//
+int comma(void) {
+  struct X x;
+  int *p = ((void)0, x).a;
+  return *p;
+}
+
+// C99-LABEL: define dso_local i32 @cast(
+// C99-SAME: ) #[[ATTR0]] {
+// C99-NEXT:  [[ENTRY:.*:]]
+// C99-NEXT:    [[X:%.*]] = alloca [[STRUCT_X:%.*]], align 4
+// C99-NEXT:    [[P:%.*]] = alloca ptr, align 8
+// C99-NEXT:    [[A:%.*]] = getelementptr inbounds nuw [[STRUCT_X]], ptr [[X]], i32 0, i32 0
+// C99-NEXT:    [[ARRAYDECAY:%.*]] = getelementptr inbounds [5 x i32], ptr [[A]], i64 0, i64 0
+// C99-NEXT:    store ptr [[ARRAYDECAY]], ptr [[P]], align 8
+// C99-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[P]], align 8
+// C99-NEXT:    [[TMP1:%.*]] = load i32, ptr [[TMP0]], align 4
+// C99-NEXT:    ret i32 [[TMP1]]
+//
+// C11-LABEL: define dso_local i32 @cast(
+// C11-SAME: ) #[[ATTR0]] {
+// C11-NEXT:  [[ENTRY:.*:]]
+// C11-NEXT:    [[X:%.*]] = alloca [[STRUCT_X:%.*]], align 4
+// C11-NEXT:    [[P:%.*]] = alloca ptr, align 8
+// C11-NEXT:    [[REF_TMP:%.*]] = alloca [[STRUCT_X]], align 4
+// C11-NEXT:    call void @llvm.memcpy.p0.p0.i64(ptr align 4 [[REF_TMP]], ptr align 4 [[X]], i64 20, i1 false)
+// C11-NEXT:    [[A:%.*]] = getelementptr inbounds nuw [[STRUCT_X]], ptr [[REF_TMP]], i32 0, i32 0
+// C11-NEXT:    [[ARRAYDECAY:%.*]] = getelementptr inbounds [5 x i32], ptr [[A]], i64 0, i64 0
+// C11-NEXT:    store ptr [[ARRAYDECAY]], ptr [[P]], align 8
+// C11-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[P]], align 8
+// C11-NEXT:    [[TMP1:%.*]] = load i32, ptr [[TMP0]], align 4
+// C11-NEXT:    ret i32 [[TMP1]]
+//
+int cast(void) {
+  struct X x;
+  int *p;
+  p = ((struct X)x).a;
+  return *p;
+}
+
+// C99-LABEL: define dso_local i32 @assign(
+// C99-SAME: ) #[[ATTR0]] {
+// C99-NEXT:  [[ENTRY:.*:]]
+// C99-NEXT:    [[X:%.*]] = alloca [[STRUCT_X:%.*]], align 4
+// C99-NEXT:    [[S:%.*]] = alloca [[STRUCT_X]], align 4
+// C99-NEXT:    [[P:%.*]] = alloca ptr, align 8
+// C99-NEXT:    [[TMP:%.*]] = alloca [[STRUCT_X]], align 4
+// C99-NEXT:    call void @llvm.memcpy.p0.p0.i64(ptr align 4 [[X]], ptr align 4 [[S]], i64 20, i1 false)
+// C99-NEXT:    call void @llvm.memcpy.p0.p0.i64(ptr align 4 [[TMP]], ptr align 4 [[X]], i64 20, i1 false)
+// C99-NEXT:    [[A:%.*]] = getelementptr inbounds nuw [[STRUCT_X]], ptr [[TMP]], i32 0, i32 0
+// C99-NEXT:    [[ARRAYDECAY:%.*]] = getelementptr inbounds [5 x i32], ptr [[A]], i64 0, i64 0
+// C99-NEXT:    store ptr [[ARRAYDECAY]], ptr [[P]], align 8
+// C99-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[P]], align 8
+// C99-NEXT:    [[TMP1:%.*]] = load i32, ptr [[TMP0]], align 4
+// C99-NEXT:    ret i32 [[TMP1]]
+//
+// C11-LABEL: define dso_local i32 @assign(
+// C11-SAME: ) #[[ATTR0]] {
+// C11-NEXT:  [[ENTRY:.*:]]
+// C11-NEXT:    [[X:%.*]] = alloca [[STRUCT_X:%.*]], align 4
+// C11-NEXT:    [[S:%.*]] = alloca [[STRUCT_X]], align 4
+// C11-NEXT:    [[P:%.*]] = alloca ptr, align 8
+// C11-NEXT:    [[REF_TMP:%.*]] = alloca [[STRUCT_X]], align 4
+// C11-NEXT:    call void @llvm.memcpy.p0.p0.i64(ptr align 4 [[X]], ptr align 4 [[S]], i64 20, i1 false)
+// C11-NEXT:    call void @llvm.memcpy.p0.p0.i64(ptr align 4 [[REF_TMP]], ptr align 4 [[X]], i64 20, i1 false)
+// C11-NEXT:    [[A:%.*]] = getelementptr inbounds nuw [[STRUCT_X]], ptr [[REF_TMP]], i32 0, i32 0
+// C11-NEXT:    [[ARRAYDECAY:%.*]] = getelementptr inbounds [5 x i32], ptr [[A]], i64 0, i64 0
+// C11-NEXT:    store ptr [[ARRAYDECAY]], ptr [[P]], align 8
+// C11-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[P]], align 8
+// C11-NEXT:    [[TMP1:%.*]] = load i32, ptr [[TMP0]], align 4
+// C11-NEXT:    ret i32 [[TMP1]]
+//
+int assign(void) {
+  struct X x, s;
+  int *p = (x = s).a;
+  return *p;
+}
diff --git a/clang/test/CodeGenObjC/property-array-type.m b/clang/test/CodeGenObjC/property-array-type.m
index a0ad8db...
[truncated]

@AaronBallman
Copy link
Collaborator Author

Note, this will fail in precommit CI due to test/CodeGenObjC/strong-in-c-struct.c. I am posting this PR to see if there are other failures on Linux platforms that I'm missing locally, and to get an idea whether folks agree that this is heading in the right direction. I'm leaning very heavily on the C++ implementation.

@github-actions
Copy link

github-actions bot commented Mar 31, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jyknight
Copy link
Member

  • In C89: it's impossible to get the address of a temporary struct, so the lifetime is irrelevant.
  • In C99: you now can get the address (from an array subobject), but the lifetime was defined as "until the next sequence point". This made it technically illegal to access even non-array subobjects from a temporary struct in many cases where it seems like it should be fine and was in C89.
  • In C11, the lifetime was extended to be the containing full expression.

(LMK if I've got that wrong!) Given that, how is the C11 change a breaking change? It seems like the C99 rule was just bogus, and we should forget it ever existed?

@frederick-vs-ja
Copy link
Contributor

(LMK if I've got that wrong!) Given that, how is the C11 change a breaking change? It seems like the C99 rule was just bogus, and we should forget it ever existed?

IIUC, the "breaking change" is that in C99 a pointer value to a temporary array element never becomes indeterminate (despite access through it raising UB), but in C11 it does.

E.g.

struct S { int a[1]; };
struct S fun(void) {
  return (struct S){0};
}

int main(void) {
  int* p = fun().a;
  int m = *p; // UB in C99 and C11
  *p += 42;   // UB in C99 and C11

  *p;         // UB in C11, but OK in C99?
  int* q = p; // q is more indeterminate in C11?
}

@AaronBallman
Copy link
Collaborator Author

(LMK if I've got that wrong!) Given that, how is the C11 change a breaking change? It seems like the C99 rule was just bogus, and we should forget it ever existed?

IIUC, the "breaking change" is that in C99 a pointer value to a temporary array element never becomes indeterminate (despite access through it raising UB), but in C11 it does.

E.g.

struct S { int a[1]; };
struct S fun(void) {
  return (struct S){0};
}

int main(void) {
  int* p = fun().a;
  int m = *p; // UB in C99 and C11
  *p += 42;   // UB in C99 and C11

  *p;         // UB in C11, but OK in C99?
  int* q = p; // q is more indeterminate in C11?
}

From the original paper:

Please note that this approach additionally declares an example like this, which was conforming under C99, to be non-conforming:

    struct X { int a[5]; } f();
    int *p = f().a;
    printf("%p\n", p);

Because it uses the (indeterminate) value of a pointer to a temporary object. Personally, I think this is a good thing, but I wanted to be very clear and up front about this change.

@efriedma-quic
Copy link
Collaborator

In practice, when you're using LLVM, pointers never become indeterminate. So I don't think that "breakage" can actually break anything.

@AaronBallman
Copy link
Collaborator Author

So is the request to remove the checks for C11 and just treat C as being this way always? (If so, does anyone want to see extension/compatibility warnings in C99 mode?) Or are we fine as-is without backporting it?

@rjmccall
Copy link
Contributor

rjmccall commented Apr 1, 2025

I mean, let's be clear here: the old C rule is broken. The standard failed to understand that the memory could be addressed, so it failed to give rules about the lifetime of addresses within it; as a result, it is formally well-defined to have pointers perpetually referring to it. That is unimplementable without dynamically allocating and leaking memory, which we certainly should not do and which nobody would thank us for doing. Given that, I think we have to understand the wording change here as fully retroactive.

@efriedma-quic
Copy link
Collaborator

I think I'd prefer to make -std=c11 and -std=c99 behave the same way where possible, I think. Every difference we have between language modes is a maintenance burden. I don't really care if we produce a "will be destroyed at the end of the full-expression" in C99 even if that isn't strictly correct... it's better than the current behavior of not generating address-of-temporary diagnostics at all.

@AaronBallman
Copy link
Collaborator Author

Okay, I'll change this so it's not specific to C11. Is anyone going to be upset if I don't implement "is a C11 extension" diagnostics, or should I take care of that too?

@jyknight
Copy link
Member

jyknight commented Apr 1, 2025

I was trying to figure out what Clang previously, in practice, thought the lifetime of this C99 temporary-which-isn't-a-temporary was.

AFAICT, it's treating it a bit inconsistently as either the full expression (where is where we emit e.g. objc-arc cleanups for the reference-counted values), or else the entire-function scope -- since we didn't emit any llvm.lifetime instrinsics for the object, the alloca lives until the end of the function.

So, in practice this PR reduces reduce the observed lifetime. E.g.

void test2(int a, int b);
void test() {
  struct X { int a[5]; } f(void);
  int *g = f().a;
  int *g2 = f().a;
  test2(g[0], g2[0]);
}

would've previously worked in practice (though UB per the standard), but now is broken in practice. It also observerably-works in GCC.

I don't know if any code actually does this, but it's maybe something to consider for the decision on whether to backport to C99.

@frederick-vs-ja
Copy link
Contributor

That is unimplementable without dynamically allocating and leaking memory, which we certainly should not do and which nobody would thank us for doing.

Ah, I think DR452 (adding "Such an object need not have a unique address.") also addressed this.

And then the patched C99 rule would be - the storage of the temporary object can be deallocated or reused later, but a pointer to it never becomes indeterminate. This is implementable but extremely weird, since it would indicate that some aspect of temporary objects is more persistent than that of other objects.

@AaronBallman
Copy link
Collaborator Author

Ping

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AaronBallman AaronBallman merged commit 5c8ba28 into llvm:main Apr 10, 2025
6 of 10 checks passed
@AaronBallman AaronBallman deleted the aballman-wg14-n1285 branch April 10, 2025 12:12
@nico
Copy link
Contributor

nico commented Apr 10, 2025

Looks like this breaks tests on macOS: http://45.33.8.238/macm1/104220/step_6.txt

Please take a look and revert for now if it takes a while to fix.

@AaronBallman
Copy link
Collaborator Author

Looks like this breaks tests on macOS: http://45.33.8.238/macm1/104220/step_6.txt

Please take a look and revert for now if it takes a while to fix.

This should already be addressed by 5a1b4ec, I'm watching the bots to make sure they go back to green.

@nico
Copy link
Contributor

nico commented Apr 10, 2025

Confirmed, bots are happy again. Thanks for the super fast fix, and sorry for the noise :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

c11 clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[C11] Properly handle materializing temporaries

7 participants