-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[Clang] Reland '__has_builtin should return false for aux triple builtins' #126324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter. |
@llvm/pr-subscribers-clang Author: Nick Sarnie (sarnex) ChangesAs a follow-up to #121839, where we wanted to make Instead, introduce a new macro Full diff: https://github.com/llvm/llvm-project/pull/126324.diff 4 Files Affected:
diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index 973cf8f9d091c30..057ad564f970bb4 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -67,6 +67,10 @@ It can be used like this:
``__has_builtin`` should not be used to detect support for a builtin macro;
use ``#ifdef`` instead.
+ When using device offloading, a builtin is considered available if it is
+ available on either the host or the device targets.
+ Use ``__has_target_builtin`` to consider only the current target.
+
``__has_constexpr_builtin``
---------------------------
@@ -96,6 +100,35 @@ the ``<cmath>`` header file to conditionally make a function constexpr whenever
the constant evaluation of the corresponding builtin (for example,
``std::fmax`` calls ``__builtin_fmax``) is supported in Clang.
+``__has_target_builtin``
+------------------------
+
+This function-like macro takes a single identifier argument that is the name of
+a builtin function, a builtin pseudo-function (taking one or more type
+arguments), or a builtin template.
+It evaluates to 1 if the builtin is supported on the current target or 0 if not.
+The behavior is different than ``__has_builtin`` when there is an auxiliary target,
+such when offloading to a target device.
+It can be used like this:
+
+.. code-block:: c++
+
+ #ifndef __has_target_builtin // Optional of course.
+ #define __has_target_builtin(x) 0 // Compatibility with non-clang compilers.
+ #endif
+
+ ...
+ #if __has_target_builtin(__builtin_trap)
+ __builtin_trap();
+ #else
+ abort();
+ #endif
+ ...
+
+.. note::
+ ``__has_target_builtin`` should not be used to detect support for a builtin macro;
+ use ``#ifdef`` instead.
+
.. _langext-__has_feature-__has_extension:
``__has_feature`` and ``__has_extension``
diff --git a/clang/include/clang/Lex/Preprocessor.h b/clang/include/clang/Lex/Preprocessor.h
index 2bf4d1a16699430..240fe28aba93e33 100644
--- a/clang/include/clang/Lex/Preprocessor.h
+++ b/clang/include/clang/Lex/Preprocessor.h
@@ -174,6 +174,7 @@ class Preprocessor {
IdentifierInfo *Ident__has_extension; // __has_extension
IdentifierInfo *Ident__has_builtin; // __has_builtin
IdentifierInfo *Ident__has_constexpr_builtin; // __has_constexpr_builtin
+ IdentifierInfo *Ident__has_target_builtin; // __has_target_builtin
IdentifierInfo *Ident__has_attribute; // __has_attribute
IdentifierInfo *Ident__has_embed; // __has_embed
IdentifierInfo *Ident__has_include; // __has_include
diff --git a/clang/lib/Lex/PPMacroExpansion.cpp b/clang/lib/Lex/PPMacroExpansion.cpp
index 347c13da0ad215a..23a693b105fca3a 100644
--- a/clang/lib/Lex/PPMacroExpansion.cpp
+++ b/clang/lib/Lex/PPMacroExpansion.cpp
@@ -357,6 +357,7 @@ void Preprocessor::RegisterBuiltinMacros() {
Ident__has_builtin = RegisterBuiltinMacro("__has_builtin");
Ident__has_constexpr_builtin =
RegisterBuiltinMacro("__has_constexpr_builtin");
+ Ident__has_target_builtin = RegisterBuiltinMacro("__has_target_builtin");
Ident__has_attribute = RegisterBuiltinMacro("__has_attribute");
if (!getLangOpts().CPlusPlus)
Ident__has_c_attribute = RegisterBuiltinMacro("__has_c_attribute");
@@ -1797,55 +1798,62 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
diag::err_feature_check_malformed);
return II && HasExtension(*this, II->getName());
});
- } else if (II == Ident__has_builtin) {
- EvaluateFeatureLikeBuiltinMacro(OS, Tok, II, *this, false,
- [this](Token &Tok, bool &HasLexedNextToken) -> int {
- IdentifierInfo *II = ExpectFeatureIdentifierInfo(Tok, *this,
- diag::err_feature_check_malformed);
- if (!II)
- return false;
- else if (II->getBuiltinID() != 0) {
- switch (II->getBuiltinID()) {
- case Builtin::BI__builtin_cpu_is:
- return getTargetInfo().supportsCpuIs();
- case Builtin::BI__builtin_cpu_init:
- return getTargetInfo().supportsCpuInit();
- case Builtin::BI__builtin_cpu_supports:
- return getTargetInfo().supportsCpuSupports();
- case Builtin::BI__builtin_operator_new:
- case Builtin::BI__builtin_operator_delete:
- // denotes date of behavior change to support calling arbitrary
- // usual allocation and deallocation functions. Required by libc++
- return 201802;
- default:
- return Builtin::evaluateRequiredTargetFeatures(
- getBuiltinInfo().getRequiredFeatures(II->getBuiltinID()),
- getTargetInfo().getTargetOpts().FeatureMap);
+ } else if (II == Ident__has_builtin || II == Ident__has_target_builtin) {
+ bool IsHasTargetBuiltin = II == Ident__has_target_builtin;
+ EvaluateFeatureLikeBuiltinMacro(
+ OS, Tok, II, *this, false,
+ [this, IsHasTargetBuiltin](Token &Tok, bool &HasLexedNextToken) -> int {
+ IdentifierInfo *II = ExpectFeatureIdentifierInfo(
+ Tok, *this, diag::err_feature_check_malformed);
+ if (!II)
+ return false;
+ auto BuiltinID = II->getBuiltinID();
+ if (BuiltinID != 0) {
+ switch (BuiltinID) {
+ case Builtin::BI__builtin_cpu_is:
+ return getTargetInfo().supportsCpuIs();
+ case Builtin::BI__builtin_cpu_init:
+ return getTargetInfo().supportsCpuInit();
+ case Builtin::BI__builtin_cpu_supports:
+ return getTargetInfo().supportsCpuSupports();
+ case Builtin::BI__builtin_operator_new:
+ case Builtin::BI__builtin_operator_delete:
+ // denotes date of behavior change to support calling arbitrary
+ // usual allocation and deallocation functions. Required by libc++
+ return 201802;
+ default:
+ // __has_target_builtin should return false for aux builtins.
+ if (IsHasTargetBuiltin &&
+ getBuiltinInfo().isAuxBuiltinID(BuiltinID))
+ return false;
+ return Builtin::evaluateRequiredTargetFeatures(
+ getBuiltinInfo().getRequiredFeatures(BuiltinID),
+ getTargetInfo().getTargetOpts().FeatureMap);
+ }
+ return true;
+ } else if (IsBuiltinTrait(Tok)) {
+ return true;
+ } else if (II->getTokenID() != tok::identifier &&
+ II->getName().starts_with("__builtin_")) {
+ return true;
+ } else {
+ return llvm::StringSwitch<bool>(II->getName())
+ // Report builtin templates as being builtins.
+ .Case("__make_integer_seq", getLangOpts().CPlusPlus)
+ .Case("__type_pack_element", getLangOpts().CPlusPlus)
+ .Case("__builtin_common_type", getLangOpts().CPlusPlus)
+ // Likewise for some builtin preprocessor macros.
+ // FIXME: This is inconsistent; we usually suggest detecting
+ // builtin macros via #ifdef. Don't add more cases here.
+ .Case("__is_target_arch", true)
+ .Case("__is_target_vendor", true)
+ .Case("__is_target_os", true)
+ .Case("__is_target_environment", true)
+ .Case("__is_target_variant_os", true)
+ .Case("__is_target_variant_environment", true)
+ .Default(false);
}
- return true;
- } else if (IsBuiltinTrait(Tok)) {
- return true;
- } else if (II->getTokenID() != tok::identifier &&
- II->getName().starts_with("__builtin_")) {
- return true;
- } else {
- return llvm::StringSwitch<bool>(II->getName())
- // Report builtin templates as being builtins.
- .Case("__make_integer_seq", getLangOpts().CPlusPlus)
- .Case("__type_pack_element", getLangOpts().CPlusPlus)
- .Case("__builtin_common_type", getLangOpts().CPlusPlus)
- // Likewise for some builtin preprocessor macros.
- // FIXME: This is inconsistent; we usually suggest detecting
- // builtin macros via #ifdef. Don't add more cases here.
- .Case("__is_target_arch", true)
- .Case("__is_target_vendor", true)
- .Case("__is_target_os", true)
- .Case("__is_target_environment", true)
- .Case("__is_target_variant_os", true)
- .Case("__is_target_variant_environment", true)
- .Default(false);
- }
- });
+ });
} else if (II == Ident__has_constexpr_builtin) {
EvaluateFeatureLikeBuiltinMacro(
OS, Tok, II, *this, false,
@@ -1886,8 +1894,7 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
return false;
});
- } else if (II == Ident__has_cpp_attribute ||
- II == Ident__has_c_attribute) {
+ } else if (II == Ident__has_cpp_attribute || II == Ident__has_c_attribute) {
bool IsCXX = II == Ident__has_cpp_attribute;
EvaluateFeatureLikeBuiltinMacro(OS, Tok, II, *this, true,
[&](Token &Tok, bool &HasLexedNextToken) -> int {
@@ -1917,8 +1924,7 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
getLangOpts())
: 0;
});
- } else if (II == Ident__has_include ||
- II == Ident__has_include_next) {
+ } else if (II == Ident__has_include || II == Ident__has_include_next) {
// The argument to these two builtins should be a parenthesized
// file name string literal using angle brackets (<>) or
// double-quotes ("").
diff --git a/clang/test/Preprocessor/has_target_builtin.cpp b/clang/test/Preprocessor/has_target_builtin.cpp
new file mode 100644
index 000000000000000..64b2d7e1b35d9ef
--- /dev/null
+++ b/clang/test/Preprocessor/has_target_builtin.cpp
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 -fopenmp -triple=spirv64 -fopenmp-is-target-device \
+// RUN: -aux-triple x86_64-linux-unknown -E %s | FileCheck -implicit-check-not=BAD %s
+
+// RUN: %clang_cc1 -fopenmp -triple=nvptx64 -fopenmp-is-target-device \
+// RUN: -aux-triple x86_64-linux-unknown -E %s | FileCheck -implicit-check-not=BAD %s
+
+// RUN: %clang_cc1 -fopenmp -triple=amdgcn-amd-amdhsa -fopenmp-is-target-device \
+// RUN: -aux-triple x86_64-linux-unknown -E %s | FileCheck -implicit-check-not=BAD %s
+
+// RUN: %clang_cc1 -fopenmp -triple=aarch64 -fopenmp-is-target-device \
+// RUN: -aux-triple x86_64-linux-unknown -E %s | FileCheck -implicit-check-not=BAD %s
+
+// CHECK: GOOD
+#if __has_target_builtin(__builtin_ia32_pause)
+ BAD
+#else
+ GOOD
+#endif
|
clang/lib/Lex/PPMacroExpansion.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may want to define this macro for offloading languages only. The reason is that non-offloading languages do not need this macro but if they start to use this macro then it will break again in offloading languages like __has_builtin
did.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the usage would commonly be
#if defined(__has_target_builtin) && __has_target_builtin(foo)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My fear is that some C++ library headers start to use this macro __has_target_builtin
in place of __has_builtin
, and we cannot modify such headers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, will do this. I can't find a good way to detect offloading languages in general here, so I'm just going to check for CUDA/HIP/SYCLDevice/OpenMPDevice, let me know if there's some common logic I can rely on that I missed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in latest commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My fear is that some C++ library headers start to use this macro
__has_target_builtin
in place of__has_builtin
, and we cannot modify such headers.
IMO, now that we do document semantics of __has_target_builtin()
, its misuse on the library side will be their problem to fix. The problem with __has_builtin()
was that it was never intended to handle heterogeneous compilation, and that's what created the issue when CUDA/HIP made builtins from both host and device visible to the compiler, but not all of them codegen-able. __has_target_builtin()
clearly states what to expect. Sure, it's possible to misuse it, but having it available unconditionally will make it much less cumbersome to use in the headers shared between CUDA and C++, and that's a fairly common use case.
I'd prefer to have __has_target_builtin()
generally available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, __has_target_builtin()
is probably identical to __has_builtin
on non-offloading related things. It's up to them if they keep it portable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
made it unconditional again in the latest commit
i kept the code example in langref because probably we still want to recommend only using this for offloading targets, even though it will work on non-offloading targets. let met know if you disagree.
need release note |
Feedback from #126324 Signed-off-by: Sarnie, Nick <[email protected]>
…26571) Feedback from llvm/llvm-project#126324 Signed-off-by: Sarnie, Nick <[email protected]>
clang/docs/LanguageExtensions.rst
Outdated
``__has_target_builtin`` should not be used to detect support for a builtin macro; | ||
use ``#ifdef`` instead. | ||
|
||
``__has_target_built`` is only defined for offloading targets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
``__has_target_built`` is only defined for offloading targets. | |
``__has_target_builtin`` is only defined for offloading targets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow thats embarrassing, thanks
clang/docs/LanguageExtensions.rst
Outdated
arguments), or a builtin template. | ||
It evaluates to 1 if the builtin is supported on the current target or 0 if not. | ||
The behavior is different than ``__has_builtin`` when there is an auxiliary target, | ||
such when offloading to a target device. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
such when offloading to a target device. | |
such as when offloading to a target device. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cant english today, thanks
clang/docs/LanguageExtensions.rst
Outdated
#ifndef __has_target_builtin // Optional of course. | ||
#define __has_target_builtin(x) 0 // Compatibility with non-clang compilers. | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be more helpful to do something like ifdef CUDA ... else __has_builtin
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hopefully the latest commit has the use case youre looking for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we make it available to C++, we'd better document the following invalid usage which originally leads to this extension:
#if !__has_target_builtin(__wfi)
static __inline__ void __attribute__((__always_inline__, __nodebug__)) __wfi(void) {
__builtin_arm_wfi();
}
#endif
we should emphasize that a C++ header may be used by offloading languages, and in offloading language, the same source is compiled for host and device target separately. A builtin not available for the current target does not justify defining the builtin for both host and device targets. In this case, better to use __has_builtin(__wfi)
since it makes sure the condition is true for both hosts and device targets so that the code won't break when used in offloading languages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have somewhat conflicting requirements:
- On C++ side, writers do not care about offloading (and we can't force them to). They only have
__has_builtin()
and it does what they need -- if the given builtin exists it will be compileable. - On offloading side, we want C++ headers to work out of the box for the host side. Ideally with the host and device compilations seeing the same code after preprocessing, and that's where we get into this problem. We can't tell whether the original C++ code needs
__has_builtin()
(works well enough for most uses inside of host function bodies) or if it needs__has_target_builtin()
(e.g. when it's used inside a lambda or constexpr function which is implicitly HD, and we do need to generate code for it).
I'm not sure we can find a universal solution. That said, __has_target_builtin()
gives us some flexibility on the offloading side. C++ side should stick with __has_builtin()
. __has_target_builtin()
should only be used when offloading comes into the picture, but it includes the possibility that it will be used in the headers shared with C++ and therefore the builtin itself should be available there.
clang/docs/LanguageExtensions.rst
Outdated
This function-like macro takes a single identifier argument that is the name of | ||
a builtin function, a builtin pseudo-function (taking one or more type | ||
arguments), or a builtin template. | ||
It evaluates to 1 if the builtin is supported on the current target or 0 if not. | ||
The behavior is different than ``__has_builtin`` when there is an auxiliary target, | ||
such when offloading to a target device. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rephrase it to be more specific in terms of what the difference is rather than when it occurs.
__has_builtin()
and __has_target_builtin()
behave identically for normal C++ compilations.
For heterogeneous compilations that see source code intended for more than one target
__has_builtin()
returns true if the builtin is known to the compiler (i.e. it's available via one of the targets), but makes no promises whether it's available on the current target. We can parse it, but not necessarily codegen it.__has_target_builtin()
returns true if the builtin can actually be codegen'ed for the current target.
__has_target_builtin()
is, effectively, functional superset of CUDA's __CUDA_ARCH__
-- it allows distinguishing both host and target architectures. It has to be treated with similar caution so it does not break consistency of the TU source code seen by the compiler across sub-compilations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, i like the way you worded it so i'll use most of this verbatim
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only some tiny nits from me
clang/lib/Lex/PPMacroExpansion.cpp
Outdated
return false; | ||
else if (II->getBuiltinID() != 0) { | ||
switch (II->getBuiltinID()) { | ||
auto BuiltinID = II->getBuiltinID(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please spell out the type explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in latest commit, thx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to have been undone.
// CHECK-NOTOFFLOAD: DOESNT | ||
#ifdef __has_target_builtin | ||
HAS | ||
#if __has_target_builtin(__builtin_ia32_pause) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add test coverage for when the target does have the builtin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in latest commit, thx
|
||
// CHECK-NOTOFFLOAD: DOESNT | ||
#ifdef __has_target_builtin | ||
HAS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should probably check for HAS
explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the macro is unconditionally available in the latest commit so i removed the checking for the macro being defined
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGMT with a nit.
clang/docs/LanguageExtensions.rst
Outdated
#else // !CUDA | ||
#if __has_builtin(__builtin_trap) | ||
__builtin_trap(); | ||
#else | ||
abort(); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#else // !CUDA | |
#if __has_builtin(__builtin_trap) | |
__builtin_trap(); | |
#else | |
abort(); | |
#endif | |
#else // !CUDA | |
#if __has_builtin(__builtin_trap) | |
__builtin_trap(); | |
#else | |
abort(); | |
#endif |
Is this still necessary now that we allow it on all targets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, its not necessary, but i was thinking we leave it to suggest users to only use this for offloading, even if it does work for normal compilation modes. if you think the explanation of the differences between has_builtin
and has_target_builtin
is enough for users, ill remove the offload/nooffload check from the example
clang/docs/LanguageExtensions.rst
Outdated
arguments), or a builtin template. | ||
It evaluates to 1 if the builtin is supported on the current target or 0 if not. | ||
|
||
``__has_builtin`` and ``__has_target_builtin`` behave identically for normal C++ compilations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this means users should use __has_target_builtin
in place of __has_builtin
so that their code will correctly check for the builtin on host or device?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably better to be portable with GCC for average users, but it's the recommended solution for code that's intended to be run on the GPU I'd say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the user's goal is to check that the builtin can be codegen'd on the current target being compiled, then yes they should use __has_target_builtin
. if they want to confirm it can be parsed but not necessarily codegen'd, then they should use __has_builtin
. if that's unclear let me know and i can try to improve the doc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the user's goal is to check that the builtin can be codegen'd on the current target being compiled, then yes they should use __has_target_builtin. if they want to confirm it can be parsed but not necessarily codegen'd, then they should use __has_builtin. if that's unclear let me know and i can try to improve the doc
Users don't typically think in terms of "parsed" and "codegenned", but more "works" and "doesn't work", which I think means "codegenned" in general. The line we're drawing here is pretty subtle, so perhaps more real world examples would help; it's hard to understand why a builtin can be parsed but cannot be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah that's a fair point, so it sounds like you'd prefer an example of when codegen doesn't work even though the user is checking with has_builtin
, and for that i could show the motiving example for this change which is something like
void foo() {
#if __has_builtin(__builtin_ia32_pause)
__builtin_ia32_pause();
#else
abort();
#endif
}
and if the current target is an offloading target (amdgpu/nvptx/spirv) and the aux target is x86, we will get a error saying it can't be codegen'd
would that kind of example address your concern? thx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think it would, along with the prose explaining why that's not a bug but is actually by design for __has_builtin
. I think users would reasonably look at that and say it should not be an error, so it'd be nice to make sure they understand the intent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to including a concrete example with an explanation. The fact that a builtin may be both visible but not usable will be a surprise for a C++ user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, ill add this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added in the latest commit, please take a look and let me know if it's clear
Feedback from llvm#126324 Signed-off-by: Sarnie, Nick <[email protected]>
clang/docs/LanguageExtensions.rst
Outdated
Compilation of this code results in a compiler error because ``__builtin_ia32_pause`` is known to the compiler because | ||
it is a builtin supported by the host x86-64 compilation so ``__has_builtin`` returns true. However, code cannot | ||
be generated for ``__builtin_ia32_pause`` during the offload AMDGPU compilation as it is not supported on that target. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, this is helping somewhat. But I think a user is still going to ask themselves "why does __has_builtin
return true
in this case?". The host and the offload are separate compilations, so logically, it stands to reason that __has_builtin
would return true
for the host and false
for the offload. And because of:
__has_builtin
and__has_target_builtin
behave identically for normal C++ compilations.
They're going to wonder why they wouldn't just replace all uses of __has_builtin
with __has_target_builtin
, which begs the question of why __has_builtin
isn't just being "fixed".
(I'll be honest, I still don't fully understand the answer to that myself.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Marking as requesting changes so we don't accidentally land, because I've got questions about whether we want to go down this route.
- GCC has
__has_builtin
, so how do they handle offloading targets? Do they have the same odd behavior where__has_builtin
returns true for builtins it cannot actually emit code for? - Given that
__has_target_builtin
seems to have the semantics everyone would expect from__has_builtin
, do we want to consider deprecating__has_builtin
so that downstreams have time to adjust but we eventually end up with a less confusing builtin?
It really seems to me that __has_builtin
has a broken design because of compilations where we try to hide a two-pass compilation as though it were one-pass and it seems like we're going to confuse folks with that behavior. If I could wave a magic wand, I would say __has_builtin
should behave exactly how __has_target_builtin
is behaving in this PR and that a reliance on __has_builtin
behaving how it does today is relying on a bug.
Signed-off-by: Sarnie, Nick <[email protected]>
Yeah just noticed the same thing, looks like that feedback was only in this PR and not the original merged one, my bad. Fixed now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please document this and add it to the clang release notes.
ah right, sorry. been away from this pr for too long :P |
Signed-off-by: Sarnie, Nick <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG, thanks!
I'll merge once @AaronBallman gets a chance to take a final look |
He showed up in the original patch where I revived this and seemed to agree with the consensus we reached. |
Sounds like you need this patch quickly, so I'll merge in about an hour unless I get new feedback to address. |
It's not an urgent need, I was just experimenting with my RPC interface through HIP and was getting annoyed with keeping a workaround in tree. Thanks for picking this up again. |
Sure, thanks for the reminder/reviews |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM aside from a nit with the release notes.
clang/docs/ReleaseNotes.rst
Outdated
This feature is enabled by default but can be disabled by compiling with | ||
``-fno-sanitize-annotate-debug-info-traps``. | ||
|
||
- The ``__has_builtin`` function now only considers the currently active target when being used with target offloading. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be moved to the potentially breaking changes section, because the behavioral change could catch folks off-guard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do then merge, thanks!
Signed-off-by: Sarnie, Nick <[email protected]>
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/18/builds/19671 Here is the relevant piece of the build log for the reference
|
@boomanaiden154 Just FYI that I relanded this patch which disables a test, you agreed to take a look at it in the previous PR here, thank you! |
…ple builtins' (llvm#126324)" This reverts commit 0efcb83.
It does not look like __cpuidex builtins are being incorrectly included in compilations for offload targets anymore, so change up the test to assume that we are defining __cpuidex as static in cpuid.h now that \llvm#126324 updates the behavior of __has_builtin on offload compilations. This ensures we are still testing that we are avoiding the conflicts around offloading that were first pointed out in https://reviews.llvm.org/D150646.
It does not look like __cpuidex builtins are being incorrectly included in compilations for offload targets anymore, so change up the test to assume that we are defining __cpuidex as static in cpuid.h now that \#126324 updates the behavior of __has_builtin on offload compilations. This ensures we are still testing that we are avoiding the conflicts around offloading that were first pointed out in https://reviews.llvm.org/D150646.
Some code from NCCL (https://github.com/NVIDIA/nccl/blob/master/src/graph/xml.cc#L16) started failing to compile after this patch. A standalone test (https://gcc.godbolt.org/z/n1dbEGr1M):
This started producing the following error:
|
That looks like exactly the issue @boomanaiden154 had the test for in #151220. @boomanaiden154 can you take a look? If it's really an issue with this patch and not something exposed by it let me know. Thanks! |
This is the cc1 invocation that is causing things to fail (on the example provided in the comment above):
It looks like something in the CUDA flags enables the |
The landing of llvm#126324 made it so that __has_builtin returns false for aux triple builtins. CUDA offloading can sometimes compile where the host is in the aux triple (ie x86_64). This patch explicitly carves out NVPTX so that we do not run into redefinition errors.
The landing of #126324 made it so that __has_builtin returns false for aux triple builtins. CUDA offloading can sometimes compile where the host is in the aux triple (ie x86_64). This patch explicitly carves out NVPTX so that we do not run into redefinition errors.
@alexfh Aiden landed a patch that should fix this, if you're still seeing it please let us know! |
Reland #121839 based on the results of the Discourse discussion here.