Skip to content

Conversation

@higher-performance
Copy link
Contributor

Fixes #141103

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels May 22, 2025
@llvmbot
Copy link
Member

llvmbot commented May 22, 2025

@llvm/pr-subscribers-clang

Author: None (higher-performance)

Changes

Fixes #141103


Full diff: https://github.com/llvm/llvm-project/pull/141133.diff

1 Files Affected:

  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+2-1)
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 78b36ceb88125..8a3aafeabdad3 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -2423,7 +2423,8 @@ def note_uninit_reference_member : Note<
   "uninitialized reference member is here">;
 def warn_field_requires_explicit_init : Warning<
   "field %select{%1|in %1}0 requires explicit initialization but is not "
-   "explicitly initialized">, InGroup<UninitializedExplicitInit>;
+   "explicitly initialized">, InGroup<UninitializedExplicitInit>,
+   ShowInSystemHeader;
 def warn_field_is_uninit : Warning<"field %0 is uninitialized when used here">,
   InGroup<Uninitialized>;
 def warn_base_class_is_uninit : Warning<

@efriedma-quic
Copy link
Collaborator

Testcase?

This seems like the sort of thing where we might want to cooperate with the C++ library somehow to get better diagnostics, but getting some diagnostic seems like an improvement.

@higher-performance
Copy link
Contributor Author

@efriedma-quic: I'm not sure what the appropriate way to test this is. The file clang/test/SemaCXX/uninitialized.cpp doesn't include any system headers. Should I just add a system header there and include it?

Re: libc++ cooperation: I thought about something along those lines too, but I couldn't think of a satisfactory solution. Especially because people can have their own templated wrappers too, and also because "system headers" could be anything third-party -- not just the standard library. I think the current approach might actually be best, as inelegant as it feels.

@cor3ntin
Copy link
Contributor

I've been considering similar issues recently.
e.g https://compiler-explorer.com/z/5xcKf8Kj1

A more general solution would be to emit diagnostics in system headers if one of the involved types/expressions names an entity declared outside of a system header.
I haven't had a brilliant idea on how to do that yet. Note that looking at the instantiation stack doesn't help because warnings unrelated to user code can be produced in system headers, caused by an instantiation in user code.

So we need to look at entities.... somehow.

@higher-performance
Copy link
Contributor Author

@efriedma-quic Ah I figured out how to test this - just added a test.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine, but I'd like a second approval; I haven't really looked at this template instantiation stuff recently.

@cor3ntin
Copy link
Contributor

cor3ntin commented May 24, 2025

Let me chat with @AaronBallman and @erichkeane on Monday about this.

This feels like a very narrow fix for a widespread issue.

  • Presumably, we don't want to warn if library implementers (mis)use std::construct_at internally
  • There are tons of other warnings that are silenced.

So I think that either

  • We need a better heuristic to hide warnings than "comes from a system header".
  • We should add the ShowInSystemHeader flags to a lot of warnings, not just this one.

@higher-performance
Copy link
Contributor Author

@cor3ntin feel free to chat, but in this case we do actually want to warn (and error, under -Werror) if anybody (mis)uses std::construct_at (i.e. neglects to explicitly initialize a field that declares itself as such). It's the exact analog of using std::invoke to call a function that has a mandatory positional parameter -- it doesn't make sense to allow anyone to bypass it. The entire goal is to make sure that field/parameter has an explicitly provided value in all cases. So I think this really is the right fix for this particular warning, regardless of the more general problem.

@higher-performance
Copy link
Contributor Author

What do folks think?

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a good idea, warning on EVERY use of this is incorrect in system headers. IT is going to result in a ton of we-want-to-be-suppressed-positives

I also don't think we can have a heuristic that wouldn't have an absurd amount of false positives.

IMO, the only thing we COULD do here is to have some sort of #pragma clang that wraps the template and says "show this even in system headers", which is an opt-in at the library level.

@higher-performance
Copy link
Contributor Author

I don't think this is a good idea, warning on EVERY use of this is incorrect in system headers. IT is going to result in a ton of we-want-to-be-suppressed-positives

Thanks @erichkeane. I think I fundamentally disagree with this. The entire intention of this attribute is to flag such a thing -- it's no different from calling std::invoke(foo, x) and expecting to get a diagnostic on every invocation when the function is void foo(int x, int y), especially when you consider that these are basically simulating named-parameters. Do you have a plausible use case in mind that makes you believe people would actually want to suppress the warning in system headers?

@erichkeane
Copy link
Collaborator

I don't think this is a good idea, warning on EVERY use of this is incorrect in system headers. IT is going to result in a ton of we-want-to-be-suppressed-positives

Thanks @erichkeane. I think I fundamentally disagree with this. The entire intention of this attribute is to flag such a thing -- it's no different from calling std::invoke(foo, x) and expecting to get a diagnostic on every invocation when the function is void foo(int x, int y), especially when you consider that these are basically simulating named-parameters. Do you have a plausible use case in mind that makes you believe people would actually want to suppress the warning in system headers?

Hmm... I just spent a while looking at the attribute more and changed how I was thinking about this problem, and I'm less confident in my above statement. SINCE this is competely opt-in we could presumably document this better to mean "we mean this even if it is in the nitty-gritty of the standard library". Typically these sorts of attributes don't really mean that/are intended for use in violations by the 'users' code. Of course, construct-at and invoke/etc all make this somewhat more complicated of a differentiation.

My biggest 'plausible' concern is cases where the system header needs to make irrelevant/unused temporaries for the purposes of various operations along the way, which still DO the right thing but are wrong along the way. BUT I also wonder how much sympathy I have for those cases when the author of the type has opted into this.

@higher-performance
Copy link
Contributor Author

Yeah. I'm not even sure I can think of a case with the irrelevant/unused temporaries you mention, to be honest. The way I would think about this is: this attribute is intended to be equivalent to adding an in-class member initializer that is impossible for anyone to actually invoke. The only reason we don't do that directly is that the error messages are terrible, but otherwise they are solving the same problem. And, similarly, there's no reason to allow that to be bypassed either. (But, just as you said, I don't see reasons to have sympathy for it even if that case was possible to bypass.)

@zygoloid
Copy link
Collaborator

zygoloid commented May 27, 2025

There are techniques for determining the number of fields in an aggregate type T by attempting to initialize with T{} then T{arg} then T{arg, arg}, and so on (with an arg that converts to anything), until the initialization fails. Such techniques might in principle show up in a system header (eg, boost probably has this somewhere).

However, any such code should be in an unevaluated operand, so as long as this warning is suppressed in unevaluated operands I think it's fine for it to be enabled in system headers. (For extra assurance, we could make this a DiagRuntimeBehavior to also suppress it in unreachable code, but maybe that's going too far.)

@higher-performance
Copy link
Contributor Author

Interesting point! A few thoughts:

  • Are system headers really any different from non-system headers in that case? It feels like it might be an orthogonal issue we should address in a separate PR.
  • Do you mean SFINAE contexts rather than merely unevaluated contexts (?)
  • How would I detect if a given Expr (say, an InitListExpr) is unevaluated? Is that easy or would it require a huge rework of the implementation?

@zygoloid
Copy link
Collaborator

I don't think this concern is specific to system headers.

I don't think we have precedent for disabling a warning in a SFINAE context. We do suppress warnings produced while substituting into a function template specialization during deduction if we end up not selecting that overload, but I don't think we have something analogous for concepts, and it's not clear that that would help here, given that the warning would be produced in the case where the concept check succeeds.

You can check if you're in an unevaluated context with Sema::isUnevaluatedContext.

@higher-performance
Copy link
Contributor Author

Got it, thanks. I updated this to ignore unevaluated contexts.

@erichkeane
Copy link
Collaborator

I think the unevaluated contexts DOES help a bunch, though I'm still concerned about the ability of the library to create 'temproary' objects/etc via some sort of default construction/aggregate construction, to then fill in the rest.

@higher-performance
Copy link
Contributor Author

@erichkeane the question of whether S s{}; s.f = ...; should be deemed as a relaxation of S s{.f = ...}; came up during the initial RFC. From memory, the conclusion was that it's something we could relax in a v2 of the RFC, but it is a significant amount of work, and the initial design shouldn't be blocked on it. Furthermore, we haven't really seen significant demand for that on our side. By contrast, the current bug is actually a loophole in the original design that is causing bugs to slip through for current users who only needed the guarantees from the original attribute, so blocking this on that doesn't really feel justified at this stage.

@higher-performance
Copy link
Contributor Author

Could I move forward with merging this? Or does anyone feel there are blockers here?

@erichkeane
Copy link
Collaborator

@erichkeane the question of whether S s{}; s.f = ...; should be deemed as a relaxation of S s{.f = ...}; came up during the initial RFC. From memory, the conclusion was that it's something we could relax in a v2 of the RFC, but it is a significant amount of work, and the initial design shouldn't be blocked on it. Furthermore, we haven't really seen significant demand for that on our side. By contrast, the current bug is actually a loophole in the original design that is causing bugs to slip through for current users who only needed the guarantees from the original attribute, so blocking this on that doesn't really feel justified at this stage.

Its more that we assume/like to assume that the Standard Library always 'does the right thing', even if it does so in ways that look 'wrong'. So we suppress warnings in the standard library headers, since they are assumed to all be false-positives.

This of course is an interesting 'that assumption is not perfectly true' anymore type of situation.

Could I move forward with merging this? Or does anyone feel there are blockers here?

@AaronBallman , @cor3ntin , and need to discuss this still I think and make sure we are sure. Unfortunately Aaron's next 'office hours' (when we'd do this typically) aren't for a few weeks, but I'll see if we can discuss this offline in the near future.

@higher-performance
Copy link
Contributor Author

I see, thanks for explaining. That would be great if you can discuss it offline in the near future.

In the meantime, let me just emphasize a some points from my side before I forget:

  • The question of delayed initialization is (so far as I can tell) entirely orthogonal to the question of what the standard library should be doing here. If we keep the current spec, then obviously the existing behavior is buggy, and this PR is required for the fix. If we do relax the attribute to allow delayed initialization, then it would require dataflow analysis out of the standard library (since it would now be okay to do std::construct_at(...) followed by filling in the fields). Either way, I struggle to see how treating the standard library differently would solve anything.
  • Insofar as the current spec goes -- just as std::construct_at cannot possibly detect (let alone try to avoid) an in-class member initializer that goes awry when invoked, I think it also cannot (and should not) try to avoid the same issue w.r.t. this attribute. It simply does not have enough information, and this is true for every function out there that constructs a user-specified type (emplace_back, etc.), not just those in the standard library.

@zygoloid
Copy link
Collaborator

Its more that we assume/like to assume that the Standard Library always 'does the right thing', even if it does so in ways that look 'wrong'. So we suppress warnings in the standard library headers, since they are assumed to all be false-positives.

This of course is an interesting 'that assumption is not perfectly true' anymore type of situation.

For me the concern here is minimal: the standard library should never be "inventing" its own way of initializing a user-defined type. If the standard library is performing an initialization, it should either have a reason to believe the initialization works (either because it owns the type being initialized -- never a problem here unless the standard library itself starts using this attribute -- or because the user of the standard library used it in a way that implies that the initialization works -- which is exactly the case in which we want a warning), or it should be testing whether the initialization works (eg, in a SFINAE context).

I wonder if a different approach to this problem would be to make the diagnostic produced by this attribute be an error rather than a warning. That'd make it rather more clear that it should be produced even in system headers :) (And we then shouldn't disable it in unevaluated operands -- instead SFINAE could handle the relevant cases there.)

@higher-performance
Copy link
Contributor Author

Hi all~ where do folks stand on the PR currently, given the last discussion?

@higher-performance
Copy link
Contributor Author

Are folks still on the fence after the latest discussion? If anyone still has objections I can try to suggest some path forward, but I'm not sure what everyone is thinking right now (or what this is pending on in general), and it would be lovely to move forward somehow.

@AaronBallman
Copy link
Collaborator

Are folks still on the fence after the latest discussion? If anyone still has objections I can try to suggest some path forward, but I'm not sure what everyone is thinking right now (or what this is pending on in general), and it would be lovely to move forward somehow.

Apologies for the delayed response, I've been trying to think about this topic a bit over the past few weeks to see if your explanation on the issue changed my concerns. I'm still pretty worried about false positives showing up in system headers, I don't think we're in agreement about how problematic that is in practice. But I'm not certain whether others share that concern; there are WG21 meetings happening this week and so folks like @erichkeane and @cor3ntin don't have the bandwidth to weigh in just yet unfortunately.

@higher-performance
Copy link
Contributor Author

Got it, thanks for the update! Okay, let's see what they think when they have a chance to reply. If there's still disagreement, I'll try to suggest some other ways to move forward.

@erichkeane
Copy link
Collaborator

Here is where my mind is at:

1- We need, as a project to re-think/come up with a better way to figure out how to suppress warnings in header files. I don't think there is a silver-bullet here, and no one has come up with a reasonable way forward on this.

2- I think the idea that 'warnings in the STL aren't actionable by users' is correct in SOME cases, but not all. It is frustrating we don't have a way to figure out when an STL function is a 'proxy' for user action (like construct_at/invoke/emplace_back/ etc), where we should no longer suppress warnings in that instantiation-context. PERHAPS we could find a way to mark it that way?

3- This one somewhat scares me for things where the STL will need to 'invent' an object (eg, sees it is 'default constructible'/'trivially copyable', so does a default-construct + memcpy instead of a construct-in-place because reasons), which will result in false-positives, because as I said, the STL is 'assumed right'.

4- It is PROBABLY not a huge deal here, and is something we could perhaps use to fix 'later'. This is an opt-in and perhaps a bit of a 'you get what you asked for' (even if you didn't know what you were asking for) via the attribute. IMO, I wonder if we should INSTEAD start marking this diagnostic as an error? It plays awkwardly with SFINAE IIRC, but we can mark it as SFINAE error anyway.

5- Even if we choose not to make it an error, I think I'm about 55/45 that we can accept this and hold our nose/push off the issues with figuring out STL proxy-function vs normal-function.

@higher-performance
Copy link
Contributor Author

Thanks!

One thing I want to also highlight here is that the general problem of figuring out "whom to blame" (i.e. where the diagnostic should be attributed to, and thus when to suppress the diagnostic) for an incorrect template instantiation seems difficult at best, and fundamentally impossible (without additional out-of-band information) at worst.

For example, consider this:

template<class T>
struct S;
template<class T>
T foo() { return typename S<typename T::value_type>::type(); }

If you see a call to foo<std::string>(), and it turns out that instantiation is wrong, whom exactly should take the blame/responsibility here? Any of these are possible:

  • The definition of foo would be to blame if its author has control over all the specializations of S.
  • The definition of S would be to blame if its author is guaranteeing that ::type is instantiable.
  • The foo<std::string>() expression would be to blame if S and foo are entirely orthogonal, and thus the author of foo<std::string>() is the one imposing instantiability as a requirement.

There is simply no way to know which is the correct choice -- and thus no way to know where the suppressions should be processed -- without knowing the implicit assumptions in the code. Any choice made is going to be wrong some of the time.

This is why I'm so weary of trying to solve this larger problem for the sake of this PR here -- it's not even clear to me that it's solvable in the first place, never mind the implementation challenges.


Now, responding to @erichkeane's thoughts above:

1- -W[no-]system-headers=... seems like a reasonable way to address this -- do you see issues with it?

2- (see highlight above)

3- Default-construct + memcpy still requires the default-construction to be valid. It doesn't really matter whether it's occurring in the STL or not, right? (Just like how std::make_unique<T>() requires default-construction of T to be valid, even if it's in the STL.) If the STL is instantiating T then it has to know that T is valid to instantiate that way. For a user type, it wouldn't possibly know that, so it wouldn't do that except on behalf of the user. For its own types, then of course it has to respect its own attributes, but then there's no problem. If you have an example that neither of these addresses, please share them. The only one I can think of is std::bit_cast, but that seems fine to ignore.

4- From my biased standpoint, I'm not inherently against error-by-default (we are currently using it under -Werror anyway), but I think it is (a) strictly worse than allowing it to be either a warning or an error? (Note that I'm assuming -Wno-system-headers=... as a further extension here, if we end up needing it.)

5- I very much agree. To me the worst case (if they come up at all) is that some people would stop using the attribute, which is... where we were before the attribute was introduced, and where other compilers are now. At least in that case they can resort to alternate (more inconvenient) methods for discovering all construction sites. Contrast that to the current situation, where unaware users mistakenly think the attribute is working when it is actually suppressed in some cases -- and thus may assume they have updated all call sites, when they actually haven't, which is worse than not having it at all.
(Important side note: we aren't running into this problem, because we have implemented a safety mechanism: we're hiding the attribute behind a macro that also adds an in-class initializer, so we at least get a linker error if the attribute diagnosis fails for some reason. The error message is terrible, but at least we get some error, and thus the attribute is still very valuable to us, even with its current flaw. But a user who is unaware of this failure mode wouldn't do that, and our solution isn't possible in C either.)

@erichkeane
Copy link
Collaborator

There is simply no way to know which is the correct choice -- and thus no way to know where the suppressions should be processed -- without knowing the implicit assumptions in the code. Any choice made is going to be wrong some of the time.

This is why I'm so weary of trying to solve this larger problem for the sake of this PR here -- it's not even clear to me that it's solvable in the first place, never mind the implementation challenges.

That is where I'm falling after the conversation too, we should just punt on the general solution. The only solution I can come up with at all that has any amount of 'legs' is to figure out how to identify STL entry points that are 'please diagnose warnings inside of me' (or worse, via a white-list!), and diagnose warnings that are a result of that instantiation.

However, of course, that has ADDITIONAL problems with instantiations (that is, the same warning would have been encountered via two different instantiations, one marked WITH and one WITHOUT said 'proxy' marking, so the order of calling the two changes the warning behavior), so it too seems imperfect.

SO all that to say, the way forward on THAT is cloudy. I think your response to point-5 above is what makes this change cromulent to me.

@higher-performance
Copy link
Contributor Author

Hi all! So given the discussion above and that @erichkeane also okayed this, can we finally proceed with this PR? I'd like to note that after discussing it for the past month and a half, it still seems like we have no idea what the ideal solution to the more-general problem could even look like in principle (if it's even possible), never mind how to get there in practice. Whereas this particular solution would address the issue we're facing, without closing the door to any improvements we may think of later.

@AaronBallman
Copy link
Collaborator

Hi all! So given the discussion above and that @erichkeane also okayed this, can we finally proceed with this PR? I'd like to note that after discussing it for the past month and a half, it still seems like we have no idea what the ideal solution to the more-general problem could even look like in principle (if it's even possible), never mind how to get there in practice. Whereas this particular solution would address the issue we're facing, without closing the door to any improvements we may think of later.

Yeah, I think we should move forward. This may not be ideal, but we also don't know what ideal will be and this is solving a problem.

Thank you for the really good discussion on this! I know it's frustrating to have long-running discussions, but we try to have a high bar for quality and that sometimes leads to this kind of thing, so we appreciate all the consideration given here,

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@higher-performance
Copy link
Contributor Author

Okay great, thanks so much! I'll go ahead and merge this then.

@higher-performance higher-performance merged commit 56679a8 into llvm:main Jul 9, 2025
10 checks passed
@higher-performance higher-performance deleted the required-fields branch July 9, 2025 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

std::construct_at() seems to bypass [[clang::require_explicit_initialization]]

7 participants