Skip to content

Conversation

@philnik777
Copy link
Contributor

@philnik777 philnik777 commented Oct 26, 2024

This is required to avoid macro clashing when using attributes like [[msvc::no_unique_address]].

This patch also refactor the logic for attribute scope __uglification__ into a single place to make it easier to add additional cases in case another one is required at some point again.

@github-actions
Copy link

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff d906ac52ab8ee46090a6696f4ffb34c40ee6abb7 68f3617027fa09b531b5e3eaea6a025c6783a258 --extensions cpp,h -- clang/include/clang/Basic/Attributes.h clang/lib/Basic/Attributes.cpp clang/lib/Sema/SemaCodeComplete.cpp clang/test/SemaCXX/cxx2a-ms-no-unique-address.cpp
View the diff from clang-format here.
diff --git a/clang/include/clang/Basic/Attributes.h b/clang/include/clang/Basic/Attributes.h
index 9fa7e5ce55..498a4a2e41 100644
--- a/clang/include/clang/Basic/Attributes.h
+++ b/clang/include/clang/Basic/Attributes.h
@@ -23,7 +23,7 @@ int hasAttribute(AttributeCommonInfo::Syntax Syntax,
                  const IdentifierInfo *Scope, const IdentifierInfo *Attr,
                  const TargetInfo &Target, const LangOptions &LangOpts);
 
-inline const char* deuglifyAttrScope(StringRef Scope) {
+inline const char *deuglifyAttrScope(StringRef Scope) {
   if (Scope == "_Clang")
     return "clang";
   if (Scope == "__gnu__")
@@ -33,7 +33,7 @@ inline const char* deuglifyAttrScope(StringRef Scope) {
   return nullptr;
 }
 
-inline const char* uglifyAttrScope(StringRef Scope) {
+inline const char *uglifyAttrScope(StringRef Scope) {
   if (Scope == "clang")
     return "_Clang";
   if (Scope == "gnu")

@philnik777 philnik777 marked this pull request as ready for review October 27, 2024 09:07
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Oct 27, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 27, 2024

@llvm/pr-subscribers-clang

Author: Nikolas Klauser (philnik777)

Changes

This is required to avoid macro clashing when using attributes like [[msvc::no_unique_address]].

This patch also refactor the logic for attribute scope uglification into a single place to make it easier to add additional cases in case another one is required at some point again.


Full diff: https://github.com/llvm/llvm-project/pull/113765.diff

4 Files Affected:

  • (modified) clang/include/clang/Basic/Attributes.h (+24)
  • (modified) clang/lib/Basic/Attributes.cpp (+9-14)
  • (modified) clang/lib/Sema/SemaCodeComplete.cpp (+4-19)
  • (modified) clang/test/SemaCXX/cxx2a-ms-no-unique-address.cpp (+3)
diff --git a/clang/include/clang/Basic/Attributes.h b/clang/include/clang/Basic/Attributes.h
index 61666a6f4d9ac4..9fa7e5ce55d491 100644
--- a/clang/include/clang/Basic/Attributes.h
+++ b/clang/include/clang/Basic/Attributes.h
@@ -23,6 +23,30 @@ int hasAttribute(AttributeCommonInfo::Syntax Syntax,
                  const IdentifierInfo *Scope, const IdentifierInfo *Attr,
                  const TargetInfo &Target, const LangOptions &LangOpts);
 
+inline const char* deuglifyAttrScope(StringRef Scope) {
+  if (Scope == "_Clang")
+    return "clang";
+  if (Scope == "__gnu__")
+    return "gnu";
+  if (Scope == "__msvc__")
+    return "msvc";
+  return nullptr;
+}
+
+inline const char* uglifyAttrScope(StringRef Scope) {
+  if (Scope == "clang")
+    return "_Clang";
+  if (Scope == "gnu")
+    return "__gnu__";
+  if (Scope == "msvc")
+    return "__msvc__";
+  return nullptr;
+}
+
+inline bool isPotentiallyUglyScope(StringRef Scope) {
+  return Scope == "gnu" || Scope == "clang" || Scope == "msvc";
+}
+
 } // end namespace clang
 
 #endif // LLVM_CLANG_BASIC_ATTRIBUTES_H
diff --git a/clang/lib/Basic/Attributes.cpp b/clang/lib/Basic/Attributes.cpp
index 867d241a2cf847..4afa129e3b2222 100644
--- a/clang/lib/Basic/Attributes.cpp
+++ b/clang/lib/Basic/Attributes.cpp
@@ -38,10 +38,8 @@ int clang::hasAttribute(AttributeCommonInfo::Syntax Syntax,
 
   // Normalize the scope name, but only for gnu and clang attributes.
   StringRef ScopeName = Scope ? Scope->getName() : "";
-  if (ScopeName == "__gnu__")
-    ScopeName = "gnu";
-  else if (ScopeName == "_Clang")
-    ScopeName = "clang";
+  if (const char *prettyName = deuglifyAttrScope(ScopeName))
+    ScopeName = prettyName;
 
   // As a special case, look for the omp::sequence and omp::directive
   // attributes. We support those, but not through the typical attribute
@@ -87,10 +85,8 @@ normalizeAttrScopeName(const IdentifierInfo *Scope,
   StringRef ScopeName = Scope->getName();
   if (SyntaxUsed == AttributeCommonInfo::AS_CXX11 ||
       SyntaxUsed == AttributeCommonInfo::AS_C23) {
-    if (ScopeName == "__gnu__")
-      ScopeName = "gnu";
-    else if (ScopeName == "_Clang")
-      ScopeName = "clang";
+    if (const char *prettySpelling = deuglifyAttrScope(ScopeName))
+      return prettySpelling;
   }
   return ScopeName;
 }
@@ -100,12 +96,11 @@ static StringRef normalizeAttrName(const IdentifierInfo *Name,
                                    AttributeCommonInfo::Syntax SyntaxUsed) {
   // Normalize the attribute name, __foo__ becomes foo. This is only allowable
   // for GNU attributes, and attributes using the double square bracket syntax.
-  bool ShouldNormalize =
-      SyntaxUsed == AttributeCommonInfo::AS_GNU ||
-      ((SyntaxUsed == AttributeCommonInfo::AS_CXX11 ||
-        SyntaxUsed == AttributeCommonInfo::AS_C23) &&
-       (NormalizedScopeName.empty() || NormalizedScopeName == "gnu" ||
-        NormalizedScopeName == "clang"));
+  bool ShouldNormalize = SyntaxUsed == AttributeCommonInfo::AS_GNU ||
+                         ((SyntaxUsed == AttributeCommonInfo::AS_CXX11 ||
+                           SyntaxUsed == AttributeCommonInfo::AS_C23) &&
+                          (NormalizedScopeName.empty() ||
+                           isPotentiallyUglyScope(NormalizedScopeName)));
   StringRef AttrName = Name->getName();
   if (ShouldNormalize && AttrName.size() >= 4 && AttrName.starts_with("__") &&
       AttrName.ends_with("__"))
diff --git a/clang/lib/Sema/SemaCodeComplete.cpp b/clang/lib/Sema/SemaCodeComplete.cpp
index 3e31f3d82657a3..732da9ceb0628f 100644
--- a/clang/lib/Sema/SemaCodeComplete.cpp
+++ b/clang/lib/Sema/SemaCodeComplete.cpp
@@ -25,6 +25,7 @@
 #include "clang/AST/RecursiveASTVisitor.h"
 #include "clang/AST/Type.h"
 #include "clang/Basic/AttributeCommonInfo.h"
+#include "clang/Basic/Attributes.h"
 #include "clang/Basic/CharInfo.h"
 #include "clang/Basic/OperatorKinds.h"
 #include "clang/Basic/Specifiers.h"
@@ -4579,22 +4580,6 @@ void SemaCodeCompletion::CodeCompleteDeclSpec(Scope *S, DeclSpec &DS,
                             Results.size());
 }
 
-static const char *underscoreAttrScope(llvm::StringRef Scope) {
-  if (Scope == "clang")
-    return "_Clang";
-  if (Scope == "gnu")
-    return "__gnu__";
-  return nullptr;
-}
-
-static const char *noUnderscoreAttrScope(llvm::StringRef Scope) {
-  if (Scope == "_Clang")
-    return "clang";
-  if (Scope == "__gnu__")
-    return "gnu";
-  return nullptr;
-}
-
 void SemaCodeCompletion::CodeCompleteAttribute(
     AttributeCommonInfo::Syntax Syntax, AttributeCompletion Completion,
     const IdentifierInfo *InScope) {
@@ -4618,7 +4603,7 @@ void SemaCodeCompletion::CodeCompleteAttribute(
   bool InScopeUnderscore = false;
   if (InScope) {
     InScopeName = InScope->getName();
-    if (const char *NoUnderscore = noUnderscoreAttrScope(InScopeName)) {
+    if (const char *NoUnderscore = deuglifyAttrScope(InScopeName)) {
       InScopeName = NoUnderscore;
       InScopeUnderscore = true;
     }
@@ -4653,7 +4638,7 @@ void SemaCodeCompletion::CodeCompleteAttribute(
           Results.AddResult(
               CodeCompletionResult(Results.getAllocator().CopyString(Scope)));
           // Include alternate form (__gnu__ instead of gnu).
-          if (const char *Scope2 = underscoreAttrScope(Scope))
+          if (const char *Scope2 = uglifyAttrScope(Scope))
             Results.AddResult(CodeCompletionResult(Scope2));
         }
         continue;
@@ -4712,7 +4697,7 @@ void SemaCodeCompletion::CodeCompleteAttribute(
         if (Scope.empty()) {
           Add(Scope, Name, /*Underscores=*/true);
         } else {
-          const char *GuardedScope = underscoreAttrScope(Scope);
+          const char *GuardedScope = uglifyAttrScope(Scope);
           if (!GuardedScope)
             continue;
           Add(GuardedScope, Name, /*Underscores=*/true);
diff --git a/clang/test/SemaCXX/cxx2a-ms-no-unique-address.cpp b/clang/test/SemaCXX/cxx2a-ms-no-unique-address.cpp
index 822ed752fa9c75..c9cf86f5270034 100644
--- a/clang/test/SemaCXX/cxx2a-ms-no-unique-address.cpp
+++ b/clang/test/SemaCXX/cxx2a-ms-no-unique-address.cpp
@@ -16,6 +16,9 @@ struct [[msvc::no_unique_address]] S { // expected-error {{only applies to non-b
   [[msvc::no_unique_address()]] int arglist; // expected-error {{cannot have an argument list}} unsupported-warning {{unknown}}
 
   int [[msvc::no_unique_address]] c; // expected-error {{cannot be applied to types}} unsupported-error {{cannot be applied to types}}
+  [[__msvc__::__no_unique_address__]] int d; // unsupported-warning {{unknown}}
+  [[__msvc__::no_unique_address]] int e; // unsupported-warning {{unknown}}
+  [[msvc::__no_unique_address__]] int g; // unsupported-warning {{unknown}}
 };
 
 struct CStructNoUniqueAddress {

@cor3ntin
Copy link
Contributor

How would that be used? MSVC seems to not like the ugly spelling https://compiler-explorer.com/z/Kb85YY18P

@philnik777
Copy link
Contributor Author

How would that be used? MSVC seems to not like the ugly spelling https://compiler-explorer.com/z/Kb85YY18P

We need [[msvc::no_unique_address]] in libc++ for Windows but don't support MSVC, only clang-cl, so we can make use of this spelling to avoid grabbing the tokens. FWIW the MSVC STL has a similar problem and I hope MSVC will follow suit. Maybe @StephanTLavavej has some thoughts as well.

@StephanTLavavej
Copy link
Member

Yeah, it would be great if C1XX (MSVC's frontend) supported __meow__ as a synonym for meow.

I've occasionally gone into the FE and just implemented features; this one might be within my reach.

@AaronBallman
Copy link
Collaborator

In principle, I'm not opposed, but if cl.exe doesn't support the syntax for their vendor string, I don't know that we should support it ourselves. The signal of support from @StephanTLavavej is definitely a point in favor of going this route, but I'd feel most comfortable if we were following cl's lead rather than going off on our own.

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only hold up is that we don't really own the design space. If @StephanTLavavej /Microsoft can assert us that this is the spelling they are going with, I'm all for it.

const IdentifierInfo *Scope, const IdentifierInfo *Attr,
const TargetInfo &Target, const LangOptions &LangOpts);

inline const char* deuglifyAttrScope(StringRef Scope) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a huge fan of ugly here as it isn't particularly descriptive, but mostly just on the name of it.

return nullptr;
}

inline bool isPotentiallyUglyScope(StringRef Scope) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this the inverse? These are the not-ugly ones, right?

@StephanTLavavej
Copy link
Member

StephanTLavavej commented Oct 29, 2024

The MSVC FE team hasn't expressed enthusiasm for adding ugly spellings. If I learn more I'll relay that info.

@AaronBallman
Copy link
Collaborator

The MSVC FE team hasn't expressed enthusiasm for adding ugly spellings. If I learn more I'll relay that info.

Thank you for checking! Unfortunately, I think that's a reason for Clang to not support it for the msvc vendor prefix either.

@philnik777
Copy link
Contributor Author

The MSVC FE team hasn't expressed enthusiasm for adding ugly spellings. If I learn more I'll relay that info.

Thank you for checking! Unfortunately, I think that's a reason for Clang to not support it for the msvc vendor prefix either.

What is the alternative?

@AaronBallman
Copy link
Collaborator

The MSVC FE team hasn't expressed enthusiasm for adding ugly spellings. If I learn more I'll relay that info.

Thank you for checking! Unfortunately, I think that's a reason for Clang to not support it for the msvc vendor prefix either.

What is the alternative?

Can you get away with not using [[msvc::no_unique_address]]? If so, I'd go that route. If not, I'd say use the attribute with the non-ugly spelling and woe unto anyone defining msvc as a macro despite that being a perfectly valid thing for them to do. You could be "kind" and do

#ifdef msvc
#error "Microsoft's vendor specific attribute prefix steals this identifier from the user's namespace, please file an issue with Microsoft if you see this diagnostic"
#endif

or something along those lines.

@philnik777
Copy link
Contributor Author

The MSVC FE team hasn't expressed enthusiasm for adding ugly spellings. If I learn more I'll relay that info.

Thank you for checking! Unfortunately, I think that's a reason for Clang to not support it for the msvc vendor prefix either.

What is the alternative?

Can you get away with not using [[msvc::no_unique_address]]? If so, I'd go that route.
No. We need it to have a reasonable ABI.
If not, I'd say use the attribute with the non-ugly spelling and woe unto anyone defining msvc as a macro despite that being a perfectly valid thing for them to do. You could be "kind" and do

#ifdef msvc
#error "Microsoft's vendor specific attribute prefix steals this identifier from the user's namespace, please file an issue with Microsoft if you see this diagnostic"
#endif

or something along those lines.

This steals both msvc and no_unique_address and does that on all platforms, not just MSVC, so that's not exactly a thrilling solution. Would defining a namespace like __clang_msvc__ be an option? libc++ only cares about Clang on MSVC targets, so it's pretty much irrelevant what MSVC does for us. Adding an alias like [[_Clang::__no_unique_address__]] would also work if that's more palatable. That'd have to be added to any msvc attributes libc++ cares about of course (though currently that's only [[msvc::no_unique_address]]).

@erichkeane
Copy link
Collaborator

The MSVC FE team hasn't expressed enthusiasm for adding ugly spellings. If I learn more I'll relay that info.

Thank you for checking! Unfortunately, I think that's a reason for Clang to not support it for the msvc vendor prefix either.

What is the alternative?

Can you get away with not using [[msvc::no_unique_address]]? If so, I'd go that route.
No. We need it to have a reasonable ABI.
If not, I'd say use the attribute with the non-ugly spelling and woe unto anyone defining msvc as a macro despite that being a perfectly valid thing for them to do. You could be "kind" and do

#ifdef msvc
#error "Microsoft's vendor specific attribute prefix steals this identifier from the user's namespace, please file an issue with Microsoft if you see this diagnostic"
#endif

or something along those lines.

This steals both msvc and no_unique_address and does that on all platforms, not just MSVC, so that's not exactly a thrilling solution. Would defining a namespace like __clang_msvc__ be an option? libc++ only cares about Clang on MSVC targets, so it's pretty much irrelevant what MSVC does for us. Adding an alias like [[_Clang::__no_unique_address__]] would also work if that's more palatable. That'd have to be added to any msvc attributes libc++ cares about of course (though currently that's only [[msvc::no_unique_address]]).

I'm personally sympathetic, and would be ok with something that we know MSVC won't step on (__clang_msvc__ would be fine to me). As far as the no_unique_address, I THINK we already get the double-underscore spelling of that, don't we? I don't think we exclude the MSVC spellings from that.

@philnik777
Copy link
Contributor Author

I'm personally sympathetic, and would be ok with something that we know MSVC won't step on (__clang_msvc__ would be fine to me). As far as the no_unique_address, I THINK we already get the double-underscore spelling of that, don't we? I don't think we exclude the MSVC spellings from that.

AFAICT you're only allowed to __uglify__ attributes with no namespace, clang, or gnu. At least clang 18 rejects msvc::__no_unique_address__ with a warning: https://godbolt.org/z/zq89ssnh1

@erichkeane
Copy link
Collaborator

I'm personally sympathetic, and would be ok with something that we know MSVC won't step on (__clang_msvc__ would be fine to me). As far as the no_unique_address, I THINK we already get the double-underscore spelling of that, don't we? I don't think we exclude the MSVC spellings from that.

AFAICT you're only allowed to uglify attributes with no namespace, clang, or gnu. At least clang 18 rejects msvc::__no_unique_address__ with a warning: https://godbolt.org/z/zq89ssnh1

I see that now :/ I managed to mis-remember that, and did the research AFTER posting!

@AaronBallman
Copy link
Collaborator

The MSVC FE team hasn't expressed enthusiasm for adding ugly spellings. If I learn more I'll relay that info.

Thank you for checking! Unfortunately, I think that's a reason for Clang to not support it for the msvc vendor prefix either.

What is the alternative?

Can you get away with not using [[msvc::no_unique_address]]? If so, I'd go that route.
No. We need it to have a reasonable ABI.
If not, I'd say use the attribute with the non-ugly spelling and woe unto anyone defining msvc as a macro despite that being a perfectly valid thing for them to do. You could be "kind" and do

#ifdef msvc
#error "Microsoft's vendor specific attribute prefix steals this identifier from the user's namespace, please file an issue with Microsoft if you see this diagnostic"
#endif

or something along those lines.

This steals both msvc and no_unique_address and does that on all platforms, not just MSVC, so that's not exactly a thrilling solution.

You can put guards in to only steal msvc on Microsoft's platform; there's no need to steal it everywhere. And no_unique_address... I was thinking that we already supported that with the underscores but it seems we don't: https://godbolt.org/z/fn5v19hx7 so that's unfortunate.

Would defining a namespace like __clang_msvc__ be an option?

It's an option, but not one I'm thrilled with. The point to vendor namespaces is that the vendor picks them and other vendors can be compatible. Do we want EDG to have to support __clang_msvc__ as a prefix for Clang's compatibility with Microsoft? That seems... a bit silly when Microsoft could elect to provide a spelling that doesn't steal from the user's namespace.

libc++ only cares about Clang on MSVC targets, so it's pretty much irrelevant what MSVC does for us. Adding an alias like [[_Clang::__no_unique_address__]] would also work if that's more palatable. That'd have to be added to any msvc attributes libc++ cares about of course (though currently that's only [[msvc::no_unique_address]]).

That's somewhat more palatable to me, but I'm still wondering whether we're making a mountain out of a mole hill with the vendor prefix:
https://sourcegraph.com/search?q=context:global+%5E%23define%5Cs%2Bmsvc%5Cb&patternType=regexp&case=yes&sm=0

That said, the attribute name itself does seem to be an issue:
https://sourcegraph.com/search?q=context:global+%5E%23define%5Cs%2Bno_unique_address%5Cb&patternType=regexp&case=yes&sm=0

So yeah, I think I'd be happier exposing [[_Clang::__no_unique_address__]] as an alias to the standard attribute, but I am wondering if @StephanTLavavej can maybe push back on the MSVC team a little bit regarding support for attribute usage in system headers. Both Clang and GCC already provide such a mechanism for exactly that reason, and I have to imagine Microsoft would want something similar so they can protect their own system headers.

@philnik777
Copy link
Contributor Author

@AaronBallman I'm not super worried about these specific cases, but I'd like to know how to proceed with msvc attributes we'd like to use in libc++ generally. e.g. msvc::intrinsic would be quite nice and doesn't actually need to be bound to the MSVC ABI. Also, as more and more attributes can't be _Uglified due to Microsoft not feeling like it there will inevitably be a clash somewhere. I'd also be fine with not providing anything for MSVC-only attributes but providing an alternate spelling for attributes that are useful outside the MSVC ABI. It'd suck having to work around that my not using the obviously correct tool __has_attribute(msvc::no_unique_address) though.

@AaronBallman
Copy link
Collaborator

@AaronBallman I'm not super worried about these specific cases, but I'd like to know how to proceed with msvc attributes we'd like to use in libc++ generally.

I think there's three options.

  1. Don't use msvc attributes in libc++ as they're not fit for use in system headers.
  2. Use the msvc attributes in libc++ and do not support any uses where msvc attributes (or the vendor prefix) are in conflict with user macros.
  3. Expose the msvc attributes under a new vendor prefix (in addition to the msvc namespace) and use those from libc++. This includes spelling it __msvc__ or _Msvc (lol).

As a non-libc++ person, it's easy for me to support (1). :-D I really think Microsoft needs to address this issue on their end. Also as a non-libc++ person, I could live with (2). I'm not keen on (3) because this would mean that Clang has multiple vendor prefixes that mean "Clang" and makes it a bit confusing as to whether the attributes under the new vendor prefix are ours or Microsoft's when conflicts arise. e.g., we start having the risk of Clang-specific behaviors under one vendor spelling and Microsoft-specific behaviors under the other.

@philnik777
Copy link
Contributor Author

Closing for now. I hope MSVC will add it, but I don't really expect it unfortunately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants