[Clang][HIP] Suppress availability diagnostics for mismatched host/device overloads #93546
…vice overloads
Outside of function bodies, the resolution of host/device overloads for
functions in HIP/CUDA operates as if in a host-device context. This means that
the device overload is used in the device compilation phase and the host
overload is used in the host compilation phase.
Therefore, the following code would cause a deprecation warning during host
compilation, even though val is only used as part of a device function:
__attribute__((host, deprecated)) constexpr int val(void) {return 1;}
__attribute__((device)) constexpr int val(void) {return 1;}
__attribute__((device)) std::enable_if<(val() > 0), int>::type fun(void) {
return 42;
}
As only the available device overload is used during device compilation, where
code for fun is actually generated, this diagnostic is spurious.
This patch suppresses availability diagnostics in such situations: When an
unavailable host function is used in a device context during host compilation
or when an unavailable device function is used in a host context during device
compilation.
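The suppression rule can be restated as a small standalone predicate. This is a minimal sketch, not the code from the patch: `Target` and `suppressAvailabilityDiag` are hypothetical names used for illustration, and the three inputs are assumed to be already known (the target of the unavailable function, the target of the declaration that uses it, and the compilation phase).

```cpp
// Hypothetical stand-in for clang's CUDAFunctionTarget (illustration only).
enum class Target { Host, Device, HostDevice };

// Suppress the diagnostic only when the unavailable overload cannot be the one
// for which code is generated: a host function named in a device context during
// host compilation, or a device function named in a host context during device
// compilation. Every other combination keeps the diagnostic.
constexpr bool suppressAvailabilityDiag(Target offending, Target context,
                                        bool compilingForDevice) {
  return (offending == Target::Host && context == Target::Device &&
          !compilingForDevice) ||
         (offending == Target::Device && context == Target::Host &&
          compilingForDevice);
}

// Deprecated host overload named in a device declaration, host phase: suppressed.
static_assert(suppressAvailabilityDiag(Target::Host, Target::Device, false), "");
// The same use from a host-device declaration is a real use: diagnostic kept.
static_assert(!suppressAvailabilityDiag(Target::Host, Target::HostDevice, false), "");
// Device-only deprecated function used in a host context, host phase: kept.
static_assert(!suppressAvailabilityDiag(Target::Device, Target::Host, false), "");
```

The last check corresponds to the constexpr case without a matching overload: there the device function really is evaluated in a host context during host compilation, so the diagnostic has to stay.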
|
@llvm/pr-subscribers-clang
Author: Fabian Ritter (ritter-x2a)
Changes
This change is necessary to avoid spurious warnings with #91478, e.g., in the rocPRIM library.
Full diff: https://github.com/llvm/llvm-project/pull/93546.diff
2 Files Affected:
diff --git a/clang/lib/Sema/SemaAvailability.cpp b/clang/lib/Sema/SemaAvailability.cpp
index 22f5a2f663477..984789489098a 100644
--- a/clang/lib/Sema/SemaAvailability.cpp
+++ b/clang/lib/Sema/SemaAvailability.cpp
@@ -20,6 +20,7 @@
#include "clang/Sema/DelayedDiagnostic.h"
#include "clang/Sema/ScopeInfo.h"
#include "clang/Sema/Sema.h"
+#include "clang/Sema/SemaCUDA.h"
#include "clang/Sema/SemaObjC.h"
#include "llvm/ADT/StringRef.h"
#include <optional>
@@ -156,6 +157,58 @@ static bool ShouldDiagnoseAvailabilityInContext(
}
}
+ if (S.getLangOpts().CUDA || S.getLangOpts().HIP) {
+ // In CUDA/HIP, do not diagnose uses of unavailable host or device function
+ // overloads when they occur in the context of a Decl with an explicitly
+ // given opposite target.
+ // We encounter this if the OffendingDecl is used outside of a function
+ // body, e.g., in template arguments for a function's return or parameter
+ // types. In this case, overloads of the called function are resolved as if
+ // in a host-device context, i.e., the device overload is chosen in the
+ // device compilation phase and the host overload in the host compilation
+ // phase. As code is only generated for the variant with matching targets,
+ // an availability diagnostic for the variant with non-matching targets would
+ // be spurious.
+
+ if (auto *OffendingFunDecl = llvm::dyn_cast<FunctionDecl>(OffendingDecl)) {
+ Decl *ActualCtx = Ctx;
+ if (auto *FTD = llvm::dyn_cast<FunctionTemplateDecl>(Ctx)) {
+ // Attributes of template Decls are only on the templated Decl
+ ActualCtx = FTD->getTemplatedDecl();
+ }
+ if (auto *CtxFun = llvm::dyn_cast<FunctionDecl>(ActualCtx)) {
+ auto TargetIs = [&S](const FunctionDecl *FD, CUDAFunctionTarget FT) {
+ return S.CUDA().IdentifyTarget(FD, /* IgnoreImplicitHDAttr */ true) ==
+ FT;
+ };
+
+ bool CtxIsHost = TargetIs(CtxFun, CUDAFunctionTarget::Host);
+ bool CtxIsDevice = TargetIs(CtxFun, CUDAFunctionTarget::Device);
+
+ bool OffendingDeclIsHost =
+ TargetIs(OffendingFunDecl, CUDAFunctionTarget::Host);
+ bool OffendingDeclIsDevice =
+ TargetIs(OffendingFunDecl, CUDAFunctionTarget::Device);
+
+ // There is a way to call a device function from host code (and vice
+ // versa, analogously) that passes semantic analysis: As constexprs,
+ // when there is no host overload. In this case, a diagnostic is
+ // necessary. Characteristic for this situation is that the device
+ // function will also be used in a host context during host compilation.
+ // Therefore, only suppress diagnostics if a host function is used in a
+ // device context during host compilation or a device function is used
+ // in a host context during device compilation.
+ bool CompilingForDevice = S.getLangOpts().CUDAIsDevice;
+ bool CompilingForHost = !CompilingForDevice;
+
+ if ((OffendingDeclIsHost && CtxIsDevice && CompilingForHost) ||
+ (OffendingDeclIsDevice && CtxIsHost && CompilingForDevice)) {
+ return false;
+ }
+ }
+ }
+ }
+
// Checks if we should emit the availability diagnostic in the context of C.
auto CheckContext = [&](const Decl *C) {
if (K == AR_NotYetIntroduced) {
diff --git a/clang/test/SemaCUDA/suppress-availability-warnings-mismatched-attributes.cu b/clang/test/SemaCUDA/suppress-availability-warnings-mismatched-attributes.cu
new file mode 100644
index 0000000000000..c3023d16565cf
--- /dev/null
+++ b/clang/test/SemaCUDA/suppress-availability-warnings-mismatched-attributes.cu
@@ -0,0 +1,149 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fsyntax-only -verify=expected,onhost %s
+// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fsyntax-only -fcuda-is-device -verify=expected,ondevice %s
+
+template <bool C, class T = void> struct my_enable_if {};
+
+template <class T> struct my_enable_if<true, T> {
+ typedef T type;
+};
+
+__attribute__((host, device)) void use(int x);
+
+__attribute__((device)) constexpr int OverloadFunHostDepr(void) { return 1; }
+__attribute__((host, deprecated("Host variant"))) constexpr int OverloadFunHostDepr(void) { return 1; } // expected-note 0+ {{has been explicitly marked deprecated here}}
+
+
+__attribute__((device, deprecated("Device variant"))) constexpr int OverloadFunDeviceDepr(void) { return 1; } // expected-note 0+ {{has been explicitly marked deprecated here}}
+__attribute__((host)) constexpr int OverloadFunDeviceDepr(void) { return 1; }
+
+
+template<typename T>
+__attribute__((device)) constexpr T TemplateOverloadFun(void) { return 1; }
+
+template<typename T>
+__attribute__((host, deprecated("Host variant"))) constexpr T TemplateOverloadFun(void) { return 1; } // expected-note 0+ {{has been explicitly marked deprecated here}}
+
+
+__attribute__((device, deprecated)) constexpr int // expected-note 0+ {{has been explicitly marked deprecated here}}
+DeviceOnlyFunDeprecated(void) { return 1; }
+
+__attribute__((host, deprecated)) constexpr int // expected-note 0+ {{has been explicitly marked deprecated here}}
+HostOnlyFunDeprecated(void) { return 1; }
+
+class FunSelector {
+public:
+ template<int X> __attribute__((device))
+ auto devicefun(void) -> typename my_enable_if<(X == OverloadFunHostDepr()), int>::type {
+ return 1;
+ }
+
+ template<int X> __attribute__((device))
+ auto devicefun(void) -> typename my_enable_if<(X != OverloadFunHostDepr()), int>::type {
+ return 0;
+ }
+
+ template<int X> __attribute__((device))
+ auto devicefun_wrong(void) -> typename my_enable_if<(X == OverloadFunDeviceDepr()), int>::type { // ondevice-warning {{'OverloadFunDeviceDepr' is deprecated: Device variant}}
+ return 1;
+ }
+
+ template<int X> __attribute__((device))
+ auto devicefun_wrong(void) -> typename my_enable_if<(X != OverloadFunDeviceDepr()), int>::type { // ondevice-warning {{'OverloadFunDeviceDepr' is deprecated: Device variant}}
+ return 0;
+ }
+
+ template<int X> __attribute__((host))
+ auto hostfun(void) -> typename my_enable_if<(X == OverloadFunDeviceDepr()), int>::type {
+ return 1;
+ }
+
+ template<int X> __attribute__((host))
+ auto hostfun(void) -> typename my_enable_if<(X != OverloadFunDeviceDepr()), int>::type {
+ return 0;
+ }
+
+ template<int X> __attribute__((host))
+ auto hostfun_wrong(void) -> typename my_enable_if<(X == OverloadFunHostDepr()), int>::type { // onhost-warning {{'OverloadFunHostDepr' is deprecated: Host variant}}
+ return 1;
+ }
+
+ template<int X> __attribute__((host))
+ auto hostfun_wrong(void) -> typename my_enable_if<(X != OverloadFunHostDepr()), int>::type { // onhost-warning {{'OverloadFunHostDepr' is deprecated: Host variant}}
+ return 0;
+ }
+};
+
+
+// These should not be diagnosed since the device overload of
+// OverloadFunHostDepr is not deprecated:
+__attribute__((device)) my_enable_if<(OverloadFunHostDepr() > 0), int>::type
+DeviceUserOverloadFunHostDepr1(void) { return 2; }
+
+my_enable_if<(OverloadFunHostDepr() > 0), int>::type __attribute__((device))
+DeviceUserOverloadFunHostDepr2(void) { return 2; }
+
+__attribute__((device))
+my_enable_if<(OverloadFunHostDepr() > 0), int>::type constexpr
+DeviceUserOverloadFunHostDeprConstexpr(void) { return 2; }
+
+
+// Analogously for OverloadFunDeviceDepr:
+__attribute__((host)) my_enable_if<(OverloadFunDeviceDepr() > 0), int>::type
+DeviceUserOverloadFunDeviceDepr1(void) { return 2; }
+
+my_enable_if<(OverloadFunDeviceDepr() > 0), int>::type __attribute__((host))
+DeviceUserOverloadFunDeviceDepr2(void) { return 2; }
+
+__attribute__((host))
+my_enable_if<(OverloadFunDeviceDepr() > 0), int>::type constexpr
+DeviceUserOverloadFunDeviceDeprConstexpr(void) { return 2; }
+
+
+// Actual uses of the deprecated overloads should be diagnosed:
+__attribute__((host, device)) my_enable_if<(OverloadFunHostDepr() > 0), int>::type // onhost-warning {{'OverloadFunHostDepr' is deprecated: Host variant}}
+HostDeviceUserOverloadFunHostDepr(void) { return 3; }
+
+__attribute__((host)) my_enable_if<(OverloadFunHostDepr() > 0), int>::type constexpr // onhost-warning {{'OverloadFunHostDepr' is deprecated: Host variant}}
+HostUserOverloadFunHostDeprConstexpr(void) { return 3; }
+
+__attribute__((device)) my_enable_if<(OverloadFunDeviceDepr() > 0), int>::type constexpr // ondevice-warning {{'OverloadFunDeviceDepr' is deprecated: Device variant}}
+HostUserOverloadFunDeviceDeprConstexpr(void) { return 3; }
+
+
+// Making the offending decl a template shouldn't change anything:
+__attribute__((host)) my_enable_if<(TemplateOverloadFun<int>() > 0), int>::type // onhost-warning {{'TemplateOverloadFun<int>' is deprecated: Host variant}}
+HostUserTemplateOverloadFun(void) { return 3; }
+
+__attribute__((device)) my_enable_if<(TemplateOverloadFun<int>() > 0), int>::type
+DeviceUserTemplateOverloadFun(void) { return 3; }
+
+
+// If the constexpr function is actually called from the mismatched context, diagnostics should be issued:
+__attribute__((host))
+my_enable_if<(DeviceOnlyFunDeprecated() > 0), int>::type constexpr // onhost-warning {{'DeviceOnlyFunDeprecated' is deprecated}}
+HostUserDeviceOnlyFunDeprecated(void) { return 3; }
+
+__attribute__((device))
+my_enable_if<(HostOnlyFunDeprecated() > 0), int>::type constexpr // ondevice-warning {{'HostOnlyFunDeprecated' is deprecated}}
+DeviceUserHostOnlyFunDeprecated(void) { return 3; }
+
+// Diagnostics for uses in function bodies should work as expected:
+__attribute__((device, deprecated)) constexpr int DeviceVarConstDepr = 1; // expected-note 0+ {{has been explicitly marked deprecated here}}
+
+__attribute__((host)) void HostUser(void) {
+ use(DeviceVarConstDepr); // expected-warning {{'DeviceVarConstDepr' is deprecated}}
+ use(HostOnlyFunDeprecated()); // expected-warning {{'HostOnlyFunDeprecated' is deprecated}}
+ use(OverloadFunHostDepr()); // expected-warning {{'OverloadFunHostDepr' is deprecated: Host variant}}
+ use(TemplateOverloadFun<int>()); // expected-warning {{'TemplateOverloadFun<int>' is deprecated: Host variant}}
+
+ use(OverloadFunDeviceDepr());
+}
+
+__attribute__((device)) void DeviceUser(void) {
+ use(DeviceVarConstDepr); // expected-warning {{'DeviceVarConstDepr' is deprecated}}
+ use(DeviceOnlyFunDeprecated()); // expected-warning {{'DeviceOnlyFunDeprecated' is deprecated}}
+ use(OverloadFunDeviceDepr()); // expected-warning {{'OverloadFunDeviceDepr' is deprecated: Device variant}}
+
+ use(OverloadFunHostDepr());
+ use(TemplateOverloadFun<int>());
+}
|
This is where we may need help from @zygoloid.
In this case, function overload resolution fails, and the patch tries to suppress such an error. If we are adding a special case for handling overloads, perhaps a better approach would be to infer the caller context from the enclosing function declaration's attributes and allow overload resolution to pick a device function instead. That would avoid the errors you're trying to suppress, and it would arguably make things more consistent -- the function declaration would have the same signature in both host and device compilations. @zygoloid -- do you think such a change would create other issues? |
One problem I encountered when I looked into such a solution is that the host/device attribute(s) are not necessarily parsed at the time when the overload resolution in template arguments of the return type happens, because the |
|
Ping. |
Your situation is similar to overload resolution for functions called in a global variable initializer. You may consider using an approach similar to https://reviews.llvm.org/D158247 |
Thank you for your suggestion. When considering a similar approach as is used for global variable initializers, I found that I would need to set a context with the correct
For example:
__attribute__((device)) constexpr int OverloadFunHostDepr(void) { return 1; }
__attribute__((host, deprecated("Host variant"))) constexpr int OverloadFunHostDepr(void) { return 1; }
// The device attribute here is parsed after the OverloadFunHostDepr call is resolved:
std::enable_if<(OverloadFunHostDepr() > 0), int>::type __attribute__((device))
foo(void) { return 2; }
As far as I can see, we would need to go in one of the following directions:
Please let me know if I missed something, or if you are aware of a situation that is similar to the latter option, on which I could build a solution for this one. |
|
Ping @zygoloid, your input would be highly appreciated. Thanks! |
…specifiers
So far, the resolution of host/device overloads for functions in HIP/CUDA
operates as if in a host-device context for code outside of function bodies,
e.g., in expressions that are part of template arguments in top-level
declarations. This means that, if separate host and device overloads are
declared, the device overload is used in the device compilation phase and the
host overload is used in the host compilation phase.
This patch changes overload resolution in such cases to prefer overloads that
match the target of the declaration in which they occur. For example:
__device__ constexpr int get_n() { return 64; }
__host__ constexpr int get_n() { return -1; }
__device__ std::enable_if<(get_n() > 32)>::type foo() { }
Before, this code would not compile, because get_n resolved to the host
overload during host compilation, causing an error. With this patch, the call
to get_n in the declaration of the device function foo resolves to the device
overload in host and device compilation.
If attributes that affect the declaration's target occur after a call with
target-dependent overload resolution, a warning is issued. This is realized by
registering the Kinds of relevant attributes in the CUDATargetContext when they
are parsed.
This is an alternative to PR llvm#93546, which is required for PR llvm#91478.
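The difference between the two resolution strategies, and the condition for the late-attribute warning, can be modeled in plain C++. This is a rough sketch under stated assumptions and does not mirror the actual CUDATargetContext interface: `Target`, `Event`, `resolveByPhase`, `resolveByDeclTarget`, and `lateTargetAttr` are hypothetical names, and a declaration is reduced to the left-to-right sequence of target attributes and overloaded calls that the parser encounters.

```cpp
// Hypothetical stand-in for clang's CUDAFunctionTarget (illustration only).
enum class Target { Host, Device, HostDevice };

// Previous behavior outside function bodies: act as if in a host-device
// context, so the compilation phase alone selects the overload.
constexpr Target resolveByPhase(bool compilingForDevice) {
  return compilingForDevice ? Target::Device : Target::Host;
}

// Behavior described above: prefer the overload matching the target of the
// enclosing declaration; host-device declarations still follow the phase.
constexpr Target resolveByDeclTarget(Target declTarget, bool compilingForDevice) {
  return declTarget == Target::HostDevice ? resolveByPhase(compilingForDevice)
                                          : declTarget;
}

// What the parser sees, in source order, while processing one declaration.
enum class Event { HostAttr, DeviceAttr, OverloadedCall };

// True if a target attribute is parsed after a call whose overload resolution
// depends on the target, which is the case that would trigger the warning.
constexpr bool lateTargetAttr(const Event *ev, int n, bool sawCall = false) {
  return n == 0 ? false
         : (ev[0] != Event::OverloadedCall && sawCall)
             ? true
             : lateTargetAttr(ev + 1, n - 1,
                              sawCall || ev[0] == Event::OverloadedCall);
}

// __device__ std::enable_if<(get_n() > 32)>::type foo() { }
constexpr Event attrFirst[] = {Event::DeviceAttr, Event::OverloadedCall};
// std::enable_if<(get_n() > 32)>::type __device__ foo() { }
constexpr Event attrLast[] = {Event::OverloadedCall, Event::DeviceAttr};

// During host compilation, get_n in a device declaration used to resolve to
// the host overload; with declaration-based resolution it is the device one.
static_assert(resolveByPhase(false) == Target::Host, "");
static_assert(resolveByDeclTarget(Target::Device, false) == Target::Device, "");
static_assert(!lateTargetAttr(attrFirst, 2), "");
static_assert(lateTargetAttr(attrLast, 2), "");
```

In the `attrLast` ordering the call has already been resolved by the time the device attribute is seen, so the declaration's target cannot influence the choice of overload. That is the situation the warning is meant to flag.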
…ithout relying on target-dependent overload resolution
The __AMDGCN_WAVEFRONT_SIZE and __AMDGCN_WAVEFRONT_SIZE__ macros in HIP can only provide meaningful values during device compilation. They are currently usable in host code, but only contain the default value of 64, independent of the target device(s).
This patch checks for numeric literals in clearly identifiable host code if they are the result of expanding the wavefront-size macros and issues a diagnostic if that's the case.
An alternative PR, llvm#91478, relied on constexpr functions with host and device overloads (where the host overload is marked as deprecated) to diagnose uses of these macros in host code. A problem with this approach is the use of the macros outside of function bodies, e.g., in template arguments of return types, or default template arguments of functions. In these cases, calls to functions with target overloads are resolved to the host variant during host compilation and to the device variant during device compilation, independently of the target of the function they belong to. Therefore, using the wavefront-size macros in such cases leads to diagnostics during host compilation with llvm#91478, even if they are only associated with a device function.
PR llvm#93546 is a proposal to suppress these spurious diagnostics.
PR llvm#103031 is a proposal to change the behavior of target-dependent overload resolution outside of function bodies to use the target attributes that occur before the overloaded call to select the overload candidate.
In contrast to llvm#91478, this PR will not diagnose uses of the wavefront-size macros outside of function bodies or initializers of global host variables.
Implements SWDEV-449015.
|
Closing this PR in favor of a more comprehensive treatment of the AMDGCN_WAVEFRONT_SIZE situation. |