[Clang][HIP] Suppress availability diagnostics for mismatched host/device overloads #93546

ritter-x2a · 2024-05-28T13:27:15Z

Outside of function bodies, the resolution of host/device overloads for functions in HIP/CUDA operates as if in a host-device context. This means that the device overload is used in the device compilation phase and the host overload is used in the host compilation phase.

Therefore, the following code would cause a deprecation warning during host compilation, even though val is only used as part of a device function:

__attribute__((host, deprecated)) constexpr int val(void) {return 1;}
__attribute__((device)) constexpr int val(void) {return 1;}
__attribute__((device)) std::enable_if<(val() > 0), int>::type fun(void) {
    return 42;
}

As only the available device overload is used during device compilation, where code for fun is actually generated, this diagnostic is spurious.

This patch suppresses availability diagnostics in such situations: When an unavailable host function is used in a device context during host compilation or when an unavailable device function is used in a host context during device compilation.

This change is necessary to avoid spurious warnings with #91478, e.g., in the rocPRIM library.

…vice overloads Outside of function bodies, the resolution of host/device overloads for functions in HIP/CUDA operates as if in a host-device context. This means that the device overload is used in the device compilation phase and the host overload is used in the host compilation phase. Therefore, the following code would cause a deprecation warning during host compilation, even though val is only used as part of a device function: __attribute__((host, deprecated)) constexpr int val(void) {return 1;} __attribute__((device)) constexpr int val(void) {return 1;} __attribute__((device)) std::enable_if<(val() > 0), int>::type fun(void) { return 42; } As only the available device overload is used during device compilation, where code for fun is actually generated, this diagnostic is spurious. This patch suppresses availability diagnostics in such situations: When an unavailable host function is used in a device context during host compilation or when an unavailable device function is used in a host context during device compilation.

llvmbot · 2024-05-28T13:27:52Z

@llvm/pr-subscribers-clang

Author: Fabian Ritter (ritter-x2a)

Changes

Outside of function bodies, the resolution of host/device overloads for functions in HIP/CUDA operates as if in a host-device context. This means that the device overload is used in the device compilation phase and the host overload is used in the host compilation phase.

Therefore, the following code would cause a deprecation warning during host compilation, even though val is only used as part of a device function:

__attribute__((host, deprecated)) constexpr int val(void) {return 1;}
__attribute__((device)) constexpr int val(void) {return 1;}
__attribute__((device)) std::enable_if&lt;(val() &gt; 0), int&gt;::type fun(void) {
    return 42;
}

As only the available device overload is used during device compilation, where code for fun is actually generated, this diagnostic is spurious.

This patch suppresses availability diagnostics in such situations: When an unavailable host function is used in a device context during host compilation or when an unavailable device function is used in a host context during device compilation.

This change is necessary to avoid spurious warnings with #91478, e.g., in the rocPRIM library.

Full diff: https://github.com/llvm/llvm-project/pull/93546.diff

2 Files Affected:

(modified) clang/lib/Sema/SemaAvailability.cpp (+53)
(added) clang/test/SemaCUDA/suppress-availability-warnings-mismatched-attributes.cu (+149)

diff --git a/clang/lib/Sema/SemaAvailability.cpp b/clang/lib/Sema/SemaAvailability.cpp
index 22f5a2f663477..984789489098a 100644
--- a/clang/lib/Sema/SemaAvailability.cpp
+++ b/clang/lib/Sema/SemaAvailability.cpp
@@ -20,6 +20,7 @@
 #include "clang/Sema/DelayedDiagnostic.h"
 #include "clang/Sema/ScopeInfo.h"
 #include "clang/Sema/Sema.h"
+#include "clang/Sema/SemaCUDA.h"
 #include "clang/Sema/SemaObjC.h"
 #include "llvm/ADT/StringRef.h"
 #include <optional>
@@ -156,6 +157,58 @@ static bool ShouldDiagnoseAvailabilityInContext(
     }
   }
 
+  if (S.getLangOpts().CUDA || S.getLangOpts().HIP) {
+    // In CUDA/HIP, do not diagnose uses of unavailable host or device function
+    // overloads when they occur in the context of a Decl with an explicitly
+    // given opposite target.
+    // We encounter this if the OffendingDecl is used outside of a function
+    // body, e.g., in template arguments for a function's return or parameter
+    // types. In this case, overloads of the called function are resolved as if
+    // in a host-device context, i.e., the device overload is chosen in the
+    // device compilation phase and the host overload in the host compilation
+    // phase. As code is only generated for the variant with matching targets,
+    // an availabiliy diagnostic for the variant with non-matching targets would
+    // be spurious.
+
+    if (auto *OffendingFunDecl = llvm::dyn_cast<FunctionDecl>(OffendingDecl)) {
+      Decl *ActualCtx = Ctx;
+      if (auto *FTD = llvm::dyn_cast<FunctionTemplateDecl>(Ctx)) {
+        // Attributes of template Decls are only on the templated Decl
+        ActualCtx = FTD->getTemplatedDecl();
+      }
+      if (auto *CtxFun = llvm::dyn_cast<FunctionDecl>(ActualCtx)) {
+        auto TargetIs = [&S](const FunctionDecl *FD, CUDAFunctionTarget FT) {
+          return S.CUDA().IdentifyTarget(FD, /* IgnoreImplicitHDAttr */ true) ==
+                 FT;
+        };
+
+        bool CtxIsHost = TargetIs(CtxFun, CUDAFunctionTarget::Host);
+        bool CtxIsDevice = TargetIs(CtxFun, CUDAFunctionTarget::Device);
+
+        bool OffendingDeclIsHost =
+            TargetIs(OffendingFunDecl, CUDAFunctionTarget::Host);
+        bool OffendingDeclIsDevice =
+            TargetIs(OffendingFunDecl, CUDAFunctionTarget::Device);
+
+        // There is a way to call a device function from host code (and vice
+        // versa, analogously) that passes semantic analysis: As constexprs,
+        // when there is no host overload. In this case, a diagnostic is
+        // necessary. Characteristic for this situation is that the device
+        // function will also be used in a host context during host compilation.
+        // Therefore, only suppress diagnostics if a host function is used in a
+        // device context during host compilation or a device function is used
+        // in a host context during device compilation.
+        bool CompilingForDevice = S.getLangOpts().CUDAIsDevice;
+        bool CompilingForHost = !CompilingForDevice;
+
+        if ((OffendingDeclIsHost && CtxIsDevice && CompilingForHost) ||
+            (OffendingDeclIsDevice && CtxIsHost && CompilingForDevice)) {
+          return false;
+        }
+      }
+    }
+  }
+
   // Checks if we should emit the availability diagnostic in the context of C.
   auto CheckContext = [&](const Decl *C) {
     if (K == AR_NotYetIntroduced) {
diff --git a/clang/test/SemaCUDA/suppress-availability-warnings-mismatched-attributes.cu b/clang/test/SemaCUDA/suppress-availability-warnings-mismatched-attributes.cu
new file mode 100644
index 0000000000000..c3023d16565cf
--- /dev/null
+++ b/clang/test/SemaCUDA/suppress-availability-warnings-mismatched-attributes.cu
@@ -0,0 +1,149 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fsyntax-only -verify=expected,onhost %s
+// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fsyntax-only -fcuda-is-device -verify=expected,ondevice %s
+
+template <bool C, class T = void> struct my_enable_if {};
+
+template <class T> struct my_enable_if<true, T> {
+  typedef T type;
+};
+
+__attribute__((host, device)) void use(int x);
+
+__attribute__((device)) constexpr int OverloadFunHostDepr(void) { return 1; }
+__attribute__((host, deprecated("Host variant"))) constexpr int OverloadFunHostDepr(void) { return 1; } // expected-note 0+ {{has been explicitly marked deprecated here}}
+
+
+__attribute__((device, deprecated("Device variant"))) constexpr int OverloadFunDeviceDepr(void) { return 1; } // expected-note 0+ {{has been explicitly marked deprecated here}}
+__attribute__((host)) constexpr int OverloadFunDeviceDepr(void) { return 1; }
+
+
+template<typename T>
+__attribute__((device)) constexpr T TemplateOverloadFun(void) { return 1; }
+
+template<typename T>
+__attribute__((host, deprecated("Host variant"))) constexpr T TemplateOverloadFun(void) { return 1; } // expected-note 0+ {{has been explicitly marked deprecated here}}
+
+
+__attribute__((device, deprecated)) constexpr int // expected-note 0+ {{has been explicitly marked deprecated here}}
+DeviceOnlyFunDeprecated(void) { return 1; }
+
+__attribute__((host, deprecated)) constexpr int // expected-note 0+ {{has been explicitly marked deprecated here}}
+HostOnlyFunDeprecated(void) { return 1; }
+
+class FunSelector {
+public:
+  template<int X> __attribute__((device))
+  auto devicefun(void) -> typename my_enable_if<(X == OverloadFunHostDepr()), int>::type {
+    return 1;
+  }
+
+  template<int X> __attribute__((device))
+  auto devicefun(void) -> typename my_enable_if<(X != OverloadFunHostDepr()), int>::type {
+      return 0;
+  }
+
+  template<int X> __attribute__((device))
+  auto devicefun_wrong(void) -> typename my_enable_if<(X == OverloadFunDeviceDepr()), int>::type { // ondevice-warning {{'OverloadFunDeviceDepr' is deprecated: Device variant}}
+    return 1;
+  }
+
+  template<int X> __attribute__((device))
+  auto devicefun_wrong(void) -> typename my_enable_if<(X != OverloadFunDeviceDepr()), int>::type { // ondevice-warning {{'OverloadFunDeviceDepr' is deprecated: Device variant}}
+      return 0;
+  }
+
+  template<int X> __attribute__((host))
+  auto hostfun(void) -> typename my_enable_if<(X == OverloadFunDeviceDepr()), int>::type {
+    return 1;
+  }
+
+  template<int X> __attribute__((host))
+  auto hostfun(void) -> typename my_enable_if<(X != OverloadFunDeviceDepr()), int>::type {
+      return 0;
+  }
+
+  template<int X> __attribute__((host))
+  auto hostfun_wrong(void) -> typename my_enable_if<(X == OverloadFunHostDepr()), int>::type { // onhost-warning {{'OverloadFunHostDepr' is deprecated: Host variant}}
+    return 1;
+  }
+
+  template<int X> __attribute__((host))
+  auto hostfun_wrong(void) -> typename my_enable_if<(X != OverloadFunHostDepr()), int>::type { // onhost-warning {{'OverloadFunHostDepr' is deprecated: Host variant}}
+      return 0;
+  }
+};
+
+
+// These should not be diagnosed since the device overload of
+// OverloadFunHostDepr is not deprecated:
+__attribute__((device)) my_enable_if<(OverloadFunHostDepr() > 0), int>::type
+DeviceUserOverloadFunHostDepr1(void) { return 2; }
+
+my_enable_if<(OverloadFunHostDepr() > 0), int>::type __attribute__((device))
+DeviceUserOverloadFunHostDepr2(void) { return 2; }
+
+__attribute__((device))
+my_enable_if<(OverloadFunHostDepr() > 0), int>::type constexpr
+DeviceUserOverloadFunHostDeprConstexpr(void) { return 2; }
+
+
+// Analogously for OverloadFunDeviceDepr:
+__attribute__((host)) my_enable_if<(OverloadFunDeviceDepr() > 0), int>::type
+DeviceUserOverloadFunDeviceDepr1(void) { return 2; }
+
+my_enable_if<(OverloadFunDeviceDepr() > 0), int>::type __attribute__((host))
+DeviceUserOverloadFunDeviceDepr2(void) { return 2; }
+
+__attribute__((host))
+my_enable_if<(OverloadFunDeviceDepr() > 0), int>::type constexpr
+DeviceUserOverloadFunDeviceDeprConstexpr(void) { return 2; }
+
+
+// Actual uses of the deprecated overloads should be diagnosed:
+__attribute__((host, device)) my_enable_if<(OverloadFunHostDepr() > 0), int>::type // onhost-warning {{'OverloadFunHostDepr' is deprecated: Host variant}}
+HostDeviceUserOverloadFunHostDepr(void) { return 3; }
+
+__attribute__((host)) my_enable_if<(OverloadFunHostDepr() > 0), int>::type constexpr // onhost-warning {{'OverloadFunHostDepr' is deprecated: Host variant}}
+HostUserOverloadFunHostDeprConstexpr(void) { return 3; }
+
+__attribute__((device)) my_enable_if<(OverloadFunDeviceDepr() > 0), int>::type constexpr // ondevice-warning {{'OverloadFunDeviceDepr' is deprecated: Device variant}}
+HostUserOverloadFunDeviceDeprConstexpr(void) { return 3; }
+
+
+// Making the offending decl a template shouldn't change anything:
+__attribute__((host)) my_enable_if<(TemplateOverloadFun<int>() > 0), int>::type // onhost-warning {{'TemplateOverloadFun<int>' is deprecated: Host variant}}
+HostUserTemplateOverloadFun(void) { return 3; }
+
+__attribute__((device)) my_enable_if<(TemplateOverloadFun<int>() > 0), int>::type
+DeviceUserTemplateOverloadFun(void) { return 3; }
+
+
+// If the constexpr function is actually called from the mismatched context, diagnostics should be issued:
+__attribute__((host))
+my_enable_if<(DeviceOnlyFunDeprecated() > 0), int>::type constexpr // onhost-warning {{'DeviceOnlyFunDeprecated' is deprecated}}
+HostUserDeviceOnlyFunDeprecated(void) { return 3; }
+
+__attribute__((device))
+my_enable_if<(HostOnlyFunDeprecated() > 0), int>::type constexpr // ondevice-warning {{'HostOnlyFunDeprecated' is deprecated}}
+DeviceUserHostOnlyFunDeprecated(void) { return 3; }
+
+// Diagnostics for uses in function bodies should work as expected:
+__attribute__((device, deprecated)) constexpr int DeviceVarConstDepr = 1; // expected-note 0+ {{has been explicitly marked deprecated here}}
+
+__attribute__((host)) void HostUser(void) {
+  use(DeviceVarConstDepr); // expected-warning {{'DeviceVarConstDepr' is deprecated}}
+  use(HostOnlyFunDeprecated()); // expected-warning {{'HostOnlyFunDeprecated' is deprecated}}
+  use(OverloadFunHostDepr()); // expected-warning {{'OverloadFunHostDepr' is deprecated: Host variant}}
+  use(TemplateOverloadFun<int>()); // expected-warning {{'TemplateOverloadFun<int>' is deprecated: Host variant}}
+
+  use(OverloadFunDeviceDepr());
+}
+
+__attribute__((device)) void DeviceUser(void) {
+  use(DeviceVarConstDepr); // expected-warning {{'DeviceVarConstDepr' is deprecated}}
+  use(DeviceOnlyFunDeprecated()); // expected-warning {{'DeviceOnlyFunDeprecated' is deprecated}}
+  use(OverloadFunDeviceDepr()); // expected-warning {{'OverloadFunDeviceDepr' is deprecated: Device variant}}
+
+  use(OverloadFunHostDepr());
+  use(TemplateOverloadFun<int>());
+}

Artem-B · 2024-05-28T18:08:28Z

Therefore, the following code would cause a deprecation warning during host compilation, even though val is only used as part of a device function:

This is where we may need help from @zygoloid.

attribute((device)) std::enable_if<(val() > 0), int>::type fun(void)

Here val() is evaluated in global context as it does not have a caller function. As such, overload resolution picking a host function during host compilation is a WAI, even if it happens to be part of a GPU-side function declaration.

In this case, function overload fails, and the patch tries to suppress such an error.

If we are adding a special case for handling overloads, perhaps a better approach would be to consider inferring the caller context from the enveloping function declaration attributes, and allow overload resolution to pick a device function instead. It would avoid the errors you're trying to suppress, and it will arguably make things more consistent -- the function declaration will have the same signature in both host and device compilations.

@zygoloid -- do you think such a change will create other issues?

ritter-x2a · 2024-05-29T06:35:22Z

If we are adding a special case for handling overloads, perhaps a better approach would be to consider inferring the caller context from the enveloping function declaration attributes, and allow overload resolution to pick a device function instead. It would avoid the errors you're trying to suppress, and it will arguably make things more consistent -- the function declaration will have the same signature in both host and device compilations.

One problem I encountered when I looked into such a solution is that the host/device attribute(s) are not necessarily parsed at the time when the overload resolution in template arguments of the return type happens, because the __attribute__((device)) can come after the return type specifier (cf. the DeviceUserOverloadFunHostDepr2 function in the test case).
So far I haven't found an existing mechanism in clang to solve this without a bigger change that introduces some sort of backtracking. I'd be happy about pointers in that direction.

ritter-x2a · 2024-06-05T08:51:52Z

Ping.

yxsamliu · 2024-06-05T17:01:31Z

Ping.

You situation is similar to overloading resolution of functions called in global variable initializer. You may consider using a similar approach as https://reviews.llvm.org/D158247

ritter-x2a · 2024-06-06T09:42:56Z

You situation is similar to overloading resolution of functions called in global variable initializer. You may consider using a similar approach as https://reviews.llvm.org/D158247

Thank you for your suggestion. When considering a similar approach as is used for global variable initializers, I found that I would need to set a context with the correct CUDAFunctionTarget before parsing the specifier of the function's return type, since that specifier can include template arguments with function calls.
This is problematic because relevant attributes may not have been parsed yet when the function's return type specifier is parsed, so that determining the correct CUDAFunctionTarget is not yet possible.

For example:

__attribute__((device)) constexpr int OverloadFunHostDepr(void) { return 1; }
__attribute__((host, deprecated("Host variant"))) constexpr int OverloadFunHostDepr(void) { return 1; }

// The device attribute here is parsed after the OverloadFunHostDepr call is resolved:
std::enable_if<(OverloadFunHostDepr() > 0), int>::type __attribute__((device))
foo(void) { return 2; }

As far as I can see, we would need to go in one of the following directions:

Disallow device/host/... target attributes after the type specifier, so that the available information at the type specifier is always sufficient. This seems problematic for backwards compatibility.
Resolve overloads with the information that is available, which means generating potentially different types depending on if the device attribute comes before or after the type specifier. This would enable surprising bugs for compiler users. We might be able to reduce such bugs by issuing a warning for encountered attributes that are not used for overload resolution.
Parse all attributes before overloads in the return type specifier are resolved, possibly by delaying overload resolution until the attributes are available. It is not clear to me whether this would be preferable over this PR as it would require a significant change to how clang parses function declarations.

Please let me know if I missed something, or if you are aware of a situation that is similar to the latter option, on which I could build a solution for this one.

ritter-x2a · 2024-06-14T07:18:23Z

Ping @zygoloid, your input would be highly appreciated. Thanks!

…specifiers So far, the resolution of host/device overloads for functions in HIP/CUDA operates as if in a host-device context for code outside of function bodies, e.g., in expressions that are part of template arguments in top-level declarations. This means that, if separate host and device overloads are declared, the device overload is used in the device compilation phase and the host overload is used in the host compilation phase. This patch changes overload resolution in such cases to prefer overloads that match the target of the declaration in which they occur. For example: __device__ constexpr int get_n() { return 64; } __host__ constexpr int get_n() { return -1; } __device__ std::enable_if<(get_n() > 32)>::type foo() { } Before, this code would not compile, because get_n resolved to the host overload during host compilation, causing an error. With this patch, the call to get_n in the declaration of the device function foo resolves to the device overload in host and device compilation. If attributes that affect the declaration's target occur after a call with target-dependent overload resolution, a warning is issued. This is realized by registering the Kinds of relevant attributes in the CUDATargetContext when they are parsed. This is an alternative to PR llvm#93546, which is required for PR llvm#91478.

…ithout relying on target-dependent overload resolution The __AMDGCN_WAVEFRONT_SIZE and __AMDGCN_WAVEFRONT_SIZE__ macros in HIP can only provide meaningful values during device compilation. They are currently usable in host code, but only contain the default value of 64, independent of the target device(s). This patch checks for numeric literals in clearly identifiable host code if they are the result of expanding the wavefront-size macros and issues a diagnostic if that's the case. A alternative PR, llvm#91478, relied on constexpr functions with host and device overloads (where the host overload is marked as deprecated) to diagnose uses of these macros in host code. A problem with this approach are uses of the macros outside of function bodies, e.g., in template arguments of return types, or default template arguments of functions. In these cases, calls to functions with target overloads are resolved to the host variant during host compilation and to the device variant during device compilation - independently of the target of the function they belong to. Therefore, using the wavefront size macros in such cases leads to diagnostics during host compilation with llvm#91478, even if they are only associated to a device function. PR llvm#93546 is a proposal to suppress these spurious diagnostics. PR llvm#103031 is a proposal to change the behavior of target-dependent overload resolution outside of function bodies to use the target attributes that occur before the overloaded call to select the overload candidate. In contrast to llvm#91478, this PR will not diagnose uses of the wavefront-size macros outside of function bodies or initializers of global host variables. Implements SWDEV-449015.

ritter-x2a · 2024-10-17T06:38:11Z

Closing this PR in favor of a more comprehensive treatment of the AMDGCN_WAVEFRONT_SIZE situation.

ritter-x2a requested review from AlexVlx and yxsamliu May 28, 2024 13:27

ritter-x2a self-assigned this May 28, 2024

llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels May 28, 2024

yxsamliu requested a review from Artem-B May 28, 2024 13:51

ritter-x2a requested a review from zygoloid June 13, 2024 08:31

ritter-x2a mentioned this pull request Aug 13, 2024

[Clang][HIP] Target-dependent overload resolution in declarators and specifiers #103031

Closed

ritter-x2a mentioned this pull request Sep 23, 2024

[Clang][HIP] Warn when __AMDGCN_WAVEFRONT_SIZE is used in host code without relying on target-dependent overload resolution #109663

Closed

ritter-x2a closed this Oct 17, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Clang][HIP] Suppress availability diagnostics for mismatched host/device overloads #93546

[Clang][HIP] Suppress availability diagnostics for mismatched host/device overloads #93546

Uh oh!

ritter-x2a commented May 28, 2024

Uh oh!

llvmbot commented May 28, 2024

Uh oh!

Artem-B commented May 28, 2024

Uh oh!

ritter-x2a commented May 29, 2024

Uh oh!

ritter-x2a commented Jun 5, 2024

Uh oh!

yxsamliu commented Jun 5, 2024

Uh oh!

ritter-x2a commented Jun 6, 2024 •

edited

Loading

Uh oh!

ritter-x2a commented Jun 14, 2024

Uh oh!

ritter-x2a commented Oct 17, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

[Clang][HIP] Suppress availability diagnostics for mismatched host/device overloads #93546

[Clang][HIP] Suppress availability diagnostics for mismatched host/device overloads #93546

Uh oh!

Conversation

ritter-x2a commented May 28, 2024

Uh oh!

llvmbot commented May 28, 2024

Uh oh!

Artem-B commented May 28, 2024

Uh oh!

ritter-x2a commented May 29, 2024

Uh oh!

ritter-x2a commented Jun 5, 2024

Uh oh!

yxsamliu commented Jun 5, 2024

Uh oh!

ritter-x2a commented Jun 6, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ritter-x2a commented Jun 14, 2024

Uh oh!

ritter-x2a commented Oct 17, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ritter-x2a commented Jun 6, 2024 •

edited

Loading