Skip to content

Conversation

@to268
Copy link
Contributor

@to268 to268 commented Jan 29, 2024

The N3006 Underspecified object declarations paper describes that underspecified declarations should be diagnosed as an error.
Example snippet:

struct S1 { int x, y; };
union U1 { int a; double b; };
enum E1 { FOO, BAR };

auto valid_struct = (struct S1){ 1, 2 };
auto underspecified_struct = (struct S2 { int x, y; }){ 1, 2 }; // Error

auto valid_union = (union U1){ .a = 12 };
auto underspecified_union = (union U2 { int a; double b; }){ .a = 34 }; // Error

auto valid_enum = (enum E1){ FOO };
auto underspecified_enum = (enum E2 { BAZ, QUX }){ BAZ }; // Error

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Jan 29, 2024
@llvmbot
Copy link
Member

llvmbot commented Jan 29, 2024

@llvm/pr-subscribers-clang

Author: Guillot Tony (to268)

Changes

The N3006 Underspecified object declarations paper describes that underspecified declarations should be diagnosed as an error.
Example snippet:

struct S1 { int x, y; };
union U1 { int a; double b; };
enum E1 { FOO, BAR };

auto valid_struct = (struct S1){ 1, 2 };
auto underspecified_struct = (struct S2 { int x, y; }){ 1, 2 }; // Error

auto valid_union = (union U1){ .a = 12 };
auto underspecified_union = (union U2 { int a; double b; }){ .a = 34 }; // Error

auto valid_enum = (enum E1){ FOO };
auto underspecified_enum = (enum E2 { BAZ, QUX }){ BAZ }; // Error

Full diff: https://github.com/llvm/llvm-project/pull/79845.diff

6 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+3)
  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+2)
  • (modified) clang/lib/Sema/SemaExpr.cpp (+27)
  • (added) clang/test/C/C2x/n3006.c (+27)
  • (added) clang/test/Parser/c2x-underspecified-decls.c (+12)
  • (modified) clang/www/c_status.html (+1-1)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 254e0a9cb72979..65885968204a64 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -83,6 +83,9 @@ C Language Changes
 C23 Feature Support
 ^^^^^^^^^^^^^^^^^^^
 
+- Clang now diagnoses `N3006 Underspecified object declarations
+  <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3006.htm>`_.
+
 Non-comprehensive list of changes in this release
 -------------------------------------------------
 
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 24d32cb87c89e2..8508208a3acedd 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -7639,6 +7639,8 @@ def err_attribute_arm_mve_polymorphism : Error<
   "'__clang_arm_mve_strict_polymorphism' attribute can only be applied to an MVE/NEON vector type">;
 def err_attribute_webassembly_funcref : Error<
   "'__funcref' attribute can only be applied to a function pointer type">;
+def err_c23_underspecified_object_declaration: Error<
+  "'%select{struct|<ERROR>|union|<ERROR>|enum}0 %1' is defined as an underspecified object initializer">;
 
 def warn_setter_getter_impl_required : Warning<
   "property %0 requires method %1 to be defined - "
diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 2f1ddfb215116d..8166748c38e6e9 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -7805,6 +7805,33 @@ Sema::BuildCompoundLiteralExpr(SourceLocation LParenLoc, TypeSourceInfo *TInfo,
                                            diag::err_variable_object_no_init))
         return ExprError();
     }
+  } else if (LangOpts.C23 &&
+             (literalType->isRecordType() || literalType->isEnumeralType())) {
+    // C23 6.2.1p7: Structure, union, and enumeration tags have scope that
+    // begins just after the appearance of the tag in a type specifier that
+    // declares the tag.
+    // [...]
+    // An ordinary identifier that has an underspecified definition has scope
+    // that starts when the definition is completed; if the same ordinary
+    // identifier declares another entity with a scope that encloses the current
+    // block, that declaration is hidden as soon as the inner declarator is
+    // completed*.)
+    // [...]
+    // *) That means, that the outer declaration is not visible for the
+    // initializer.
+    auto Range = SourceRange(LParenLoc, RParenLoc);
+    const auto *Tag = literalType->castAs<TagType>();
+    const auto &TagRange = Tag->getDecl()->getSourceRange();
+
+    // We should diagnose underspecified declaration, unless the identifier has
+    // been diagnosed as being a redefinition, since the tag is made anonymous.
+    if (Range.fullyContains(TagRange) && Tag->getDecl()->getIdentifier()) {
+      Diag(TagRange.getBegin(),
+           diag::err_c23_underspecified_object_declaration)
+          << (unsigned)Tag->getDecl()->getTagKind()
+          << Tag->getDecl()->getName() << TagRange;
+      return ExprError();
+    }
   } else if (!literalType->isDependentType() &&
              RequireCompleteType(LParenLoc, literalType,
                diag::err_typecheck_decl_incomplete_type,
diff --git a/clang/test/C/C2x/n3006.c b/clang/test/C/C2x/n3006.c
new file mode 100644
index 00000000000000..15efc0ccd6d323
--- /dev/null
+++ b/clang/test/C/C2x/n3006.c
@@ -0,0 +1,27 @@
+// RUN: %clang_cc1 -std=c2x -verify %s
+
+/* WG14 N3006: Full
+ * Underspecified object declarations
+ */
+
+struct S1 { int x, y; };        // expected-note {{previous definition is here}}
+union U1 { int a; double b; };  // expected-note {{previous definition is here}}
+enum E1 { FOO, BAR };           // expected-note {{previous definition is here}}
+
+auto normal_struct = (struct S1){ 1, 2 };
+auto underspecified_struct = (struct S2 { int x, y; }){ 1, 2 };           // expected-error {{'struct S2' is defined as an underspecified object initializer}}
+auto underspecified_struct_redef = (struct S1 { char x, y; }){ 'A', 'B'}; // expected-error {{redefinition of 'S1'}}
+auto underspecified_empty_struct = (struct S3 { }){ };                    // expected-error {{'struct S3' is defined as an underspecified object initializer}}
+
+auto normal_union_int = (union U1){ .a = 12 };
+auto normal_union_double = (union U1){ .b = 2.4 };
+auto underspecified_union = (union U2 { int a; double b; }){ .a = 34 };         // expected-error {{'union U2' is defined as an underspecified object initializer}}
+auto underspecified_union_redef = (union U1 { char a; double b; }){ .a = 'A' }; // expected-error {{redefinition of 'U1'}}
+auto underspecified_empty_union = (union U3 {  }){  };                          // expected-error {{'union U3' is defined as an underspecified object initializer}}
+
+auto normal_enum_foo = (enum E1){ FOO };
+auto normal_enum_bar = (enum E1){ BAR };
+auto underspecified_enum = (enum E2 { BAZ, QUX }){ BAZ };       // expected-error {{'enum E2' is defined as an underspecified object initializer}}
+auto underspecified_enum_redef = (enum E1 { ONE, TWO }){ ONE }; // expected-error {{redefinition of 'E1'}}
+auto underspecified_empty_enum = (enum E3 {  }){ };             // expected-error {{'enum E3' is defined as an underspecified object initializer}} \
+                                                                   expected-error {{use of empty enum}}
diff --git a/clang/test/Parser/c2x-underspecified-decls.c b/clang/test/Parser/c2x-underspecified-decls.c
new file mode 100644
index 00000000000000..5a7e935cda5444
--- /dev/null
+++ b/clang/test/Parser/c2x-underspecified-decls.c
@@ -0,0 +1,12 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=expected,c23 -std=c23 %s
+// RUN: %clang_cc1 -fsyntax-only -verify=expected,c17 -std=c17 %s
+
+auto underspecified_struct = (struct S1 { int x, y; }){ 1, 2 };         // c23-error {{'struct S1' is defined as an underspecified object initializer}} \
+                                                                           c17-error {{type specifier missing, defaults to 'int'; ISO C99 and later do not support implicit int}} \
+                                                                           c17-error {{illegal storage class on file-scoped variable}}
+auto underspecified_union = (union U1 { int a; double b; }){ .a = 34 }; // c23-error {{'union U1' is defined as an underspecified object initializer}} \
+                                                                           c17-error {{type specifier missing, defaults to 'int'; ISO C99 and later do not support implicit int}} \
+                                                                           c17-error {{illegal storage class on file-scoped variable}}
+auto underspecified_enum = (enum E1 { FOO, BAR }){ BAR };               // c23-error {{'enum E1' is defined as an underspecified object initializer}} \
+                                                                           c17-error {{type specifier missing, defaults to 'int'; ISO C99 and later do not support implicit int}} \
+                                                                           c17-error {{illegal storage class on file-scoped variable}}
diff --git a/clang/www/c_status.html b/clang/www/c_status.html
index 3955a1d7963976..fc7d3f5f8ad79c 100644
--- a/clang/www/c_status.html
+++ b/clang/www/c_status.html
@@ -1191,7 +1191,7 @@ <h2 id="c2x">C23 implementation status</h2>
     <tr>
       <td>Underspecified object definitions</td>
       <td><a href="https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3006.htm">N3006</a></td>
-      <td class="none" align="center">No</td>
+      <td class="unreleased" align="center">Clang 19</td>
     </tr>
     <tr>
       <td>Type inference for object declarations</td>

def err_attribute_webassembly_funcref : Error<
"'__funcref' attribute can only be applied to a function pointer type">;
def err_c23_underspecified_object_declaration: Error<
"'%select{struct|<ERROR>|union|<ERROR>|enum}0 %1' is defined as an underspecified object initializer">;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the diagnostic message need to be corrected.

Copy link
Contributor

@cor3ntin cor3ntin Jan 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think underspecified object is useful for users
Maybe error: the deduced type of foo refers to a struct|union|enum bar which is defined in its initializer.

Copy link
Contributor

@Fznamznon Fznamznon Jan 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK "underspecified" is how the standard calls it. And constexpr declarations in C are also "underspecified".
gcc agrees https://godbolt.org/z/1KcW9zTcd .
I agree that underspecified sounds weird for ordinary people though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also didn't wanted to just use the same wording as GCC.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the terminology used in the standard is really hard for mortals to understand, but I also think the definition of "underspecified" makes it hard to capture this in a single diagnostic and still be precise. But I think existing diagnostics cover the other cases, mostly.

C23 6.7.1p12: A declaration such that the declaration specifiers contain no type specifier or that is declared with constexpr is said to be underspecified. If such a declaration is not a definition, if it declares no or more than one ordinary identifier, if the declared identifier already has a declaration in the same scope, if the declared entity is not an object, or if anywhere within the sequence of tokens making up the declaration identifiers that are not ordinary are declared, the behavior is implementation-defined.

So I think the best I can come up with is:

%select{declaration using a deduced type|constexpr declaration}0 %1 also declares %2 which is not an ordinary identifier

I think "if such a declaration is not a definition" is already covered by constexpr variable 'i' must be initialized by a constant expression: https://godbolt.org/z/nvKEzv975
I think "if it declares no or more than one ordinary identifier" is already covered by expected identifier or '(' or is supported by Clang: https://godbolt.org/z/Yas4od5Ev (Note, we should add documentation for the implementation-defined extension accepting multiple declarations in a group.)
I think "if the declared identifier already has a declaration in the same scope" is not correctly supported: https://godbolt.org/z/Kab7GvW85 but this should be handled by an existing redeclaration diagnostic when we fix it.
I think "if the declared entity is not an object" is already covered by "'auto' not allowed in function return type": https://godbolt.org/z/KzeTeczY7
I think "if anywhere within the sequence of tokens making up the declaration identifiers that are not ordinary are declared" is the only thing we're missing a diagnostic for, and the above suggested wording would handle it.

So I think we can get away without inflicting "underspecified" on users.

The other half of this paper has to do with scope of identifiers, we may end up needing to get creative with diagnostic wording when lookup fails to find an identifier from an underspecified declaration.

@github-actions
Copy link

github-actions bot commented Jan 29, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@to268
Copy link
Contributor Author

to268 commented Jan 29, 2024

CC: @AaronBallman

Comment on lines +7835 to +7839
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

White space changes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the formatting CI was yelling at me for this.
I'll probably revert this since it's not part of the feature.

def err_attribute_webassembly_funcref : Error<
"'__funcref' attribute can only be applied to a function pointer type">;
def err_c23_underspecified_object_declaration: Error<
"'%select{struct|<ERROR>|union|<ERROR>|enum}0 %1' is defined as an underspecified object initializer">;
Copy link
Contributor

@cor3ntin cor3ntin Jan 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think underspecified object is useful for users
Maybe error: the deduced type of foo refers to a struct|union|enum bar which is defined in its initializer.

@to268
Copy link
Contributor Author

to268 commented Feb 5, 2024

Ping: @AaronBallman

@to268
Copy link
Contributor Author

to268 commented Mar 18, 2024

I have rebased my PR to include N3018 constexpr feature and reverted a formatted snippet of code that is not part of the PR.
I will add constexpr tests but I need to investigate why the constexpr keyword does not trigger the diagnostic (not sure if it's a constexpr implementation bug or something I need to handle in my PR (it does not really matter TBO)).

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this! I think there's something here worth considering: how an implementation handles use of underspecified declarations is left implementation-defined. I think we want to diagnose many of the situations it covers, but I think we already have most of those diagnostics covered.

The one thing that's not covered is declaring or defining new identifiers as part of an underspecified declaration. However, we might want to accept that as an extension rather than follow GCC's footsteps. C users are used to the fact that you can define a type basically anywhere you can spell a type name. Is it confusing? Absolutely. But I think it's just as confusing to allow users to define a type in some places but not others, along fairly arbitrary lines. In other words, it's weird to me that we would allow:

void func() {
  int x = (struct Foo { int x; }){ 0 }.x;
}

but would reject:

void func() {
  constexpr int x = (struct Foo { int x; }){ 0 }.x;
}

So support for this paper might entail adding documentation for what our implementation-defined behavior is when we accept code + test coverage, rather than rejecting code because GCC rejects it.

CC @zygoloid @jyknight @jrtc27 @nickdesaulniers for some more opinions on this.

def err_attribute_webassembly_funcref : Error<
"'__funcref' attribute can only be applied to a function pointer type">;
def err_c23_underspecified_object_declaration: Error<
"'%select{struct|<ERROR>|union|<ERROR>|enum}0 %1' is defined as an underspecified object initializer">;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the terminology used in the standard is really hard for mortals to understand, but I also think the definition of "underspecified" makes it hard to capture this in a single diagnostic and still be precise. But I think existing diagnostics cover the other cases, mostly.

C23 6.7.1p12: A declaration such that the declaration specifiers contain no type specifier or that is declared with constexpr is said to be underspecified. If such a declaration is not a definition, if it declares no or more than one ordinary identifier, if the declared identifier already has a declaration in the same scope, if the declared entity is not an object, or if anywhere within the sequence of tokens making up the declaration identifiers that are not ordinary are declared, the behavior is implementation-defined.

So I think the best I can come up with is:

%select{declaration using a deduced type|constexpr declaration}0 %1 also declares %2 which is not an ordinary identifier

I think "if such a declaration is not a definition" is already covered by constexpr variable 'i' must be initialized by a constant expression: https://godbolt.org/z/nvKEzv975
I think "if it declares no or more than one ordinary identifier" is already covered by expected identifier or '(' or is supported by Clang: https://godbolt.org/z/Yas4od5Ev (Note, we should add documentation for the implementation-defined extension accepting multiple declarations in a group.)
I think "if the declared identifier already has a declaration in the same scope" is not correctly supported: https://godbolt.org/z/Kab7GvW85 but this should be handled by an existing redeclaration diagnostic when we fix it.
I think "if the declared entity is not an object" is already covered by "'auto' not allowed in function return type": https://godbolt.org/z/KzeTeczY7
I think "if anywhere within the sequence of tokens making up the declaration identifiers that are not ordinary are declared" is the only thing we're missing a diagnostic for, and the above suggested wording would handle it.

So I think we can get away without inflicting "underspecified" on users.

The other half of this paper has to do with scope of identifiers, we may end up needing to get creative with diagnostic wording when lookup fails to find an identifier from an underspecified declaration.

diag::err_variable_object_no_init))
return ExprError();
}
} else if (LangOpts.C23 &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these changes are quite sufficient because I think we need to handle more cases than just compound literals. Consider:

constexpr typeof(struct s *) x = 0; // declares `s` which is not an ordinary identifier
constexpr struct S { int a, b; } y = { 0 }; // declares `S`, `a`, and `b`, none of which are ordinary identifiers
constexpr int a = 0, b = 0; // declares `a` and `b` as ordinary identifiers
auto c = (struct T { int x, y; }){0, 0}; // declares `T`, `x`, and `y`, none of which are ordinary identifiers
constexpr int (*fp)(struct X { int x; } val) = 0; // declares `X` and `x` which are not ordinary identifiers

(https://godbolt.org/z/Pq87aMnch)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 out of these 5 cases are not conforming curently:

constexpr typeof(struct s *) x = 0;               // Wrong, no underspecifed diagnostics
constexpr struct S { int a, b; } y = { 0 };       // Wrong, no underspecifed diagnostics
constexpr int a = 0, b = 0;                       // Ok, no underspecifed diagnostics
auto c = (struct T { int x, y; }){0, 0};          // Ok, `struct T` diagnosed as underspecifed
constexpr int (*fp)(struct X { int x; } val) = 0; // Wrong, no underspecifed diagnostics but dignoses warning: `declaration of 'struct X' will not be visible outside of this function`

I need to do more work outside of compound literals.

auto underspecified_enum = (enum E2 { BAZ, QUX }){ BAZ }; // expected-error {{'enum E2' is defined as an underspecified object initializer}}
auto underspecified_enum_redef = (enum E1 { ONE, TWO }){ ONE }; // expected-error {{redefinition of 'E1'}}
auto underspecified_empty_enum = (enum E3 { }){ }; // expected-error {{'enum E3' is defined as an underspecified object initializer}} \
expected-error {{use of empty enum}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need additional test coverage for the changes to scope that came from the paper, but we should already have diagnostics implemented for them. I read that as changing the behavior of this code:

constexpr int i = i;
auto j = j;

The only bit I'm not certain of is "... if the same ordinary identifier declares another entity with a scope that encloses the current block, that declaration is hidden as soon as the inner declarator is completed".

int func() {
  struct S { int x, y; };
  constexpr int i = (struct T { int a, b; }){0, 1}.a;

  struct T t = { 1, 2 };
}

so that the use of struct T outside of the compound literal is no longer valid. I'm not certain we want to conform to that restriction if we're deciding we want to accept the use in the compound literal. Again, it's mysterious behavior for the constexpr specifier to change behavior of type definitions.

Copy link
Contributor Author

@to268 to268 Jul 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if declaring a non ordinary identifier t of type struct T should be diagnosed as an invalid declaration, since the definition of the struct T type is actually a underspecified declaration.
Interestingly, if you remove the constexpr keyword, there is currently no change in behavior in this case.
Moreover, the diagnostic about constexpr variable 'i' must be initialized by a constant expression is not emitted in this case, I should probably look into it.

@to268
Copy link
Contributor Author

to268 commented Jun 29, 2024

I've rebased the branch from main

@AaronBallman
Copy link
Collaborator

@to268 and I spoke about this offline and we decided to close this PR because it's heading in the wrong direction. We discussed a different direction that focuses solely on implementing the logic to handle %select{declaration using a deduced type|constexpr declaration}0 %1 also declares %2 which is not an ordinary identifier as the initial steps towards full support.

@to268 to268 deleted the feature-N3006 branch February 7, 2025 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants