-
Notifications
You must be signed in to change notification settings - Fork 15.4k
Reland [CUDA][HIP] Fix CTAD for host/device constructors (#168711) #170481
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,32 @@ | ||
| // RUN: %clang_cc1 -std=c++17 -triple nvptx64-nvidia-cuda -fsyntax-only \ | ||
| // RUN: -fcuda-is-device -verify %s | ||
| // RUN: %clang_cc1 -std=c++17 -triple nvptx64-nvidia-cuda -fsyntax-only \ | ||
| // RUN: -verify %s | ||
|
|
||
| #include "Inputs/cuda.h" | ||
|
|
||
| template <typename T> | ||
| struct S { | ||
| __host__ __device__ S(T); | ||
| }; | ||
|
|
||
| // A host+device deduction guide is allowed and participates in CTAD, but its | ||
| // explicit target attributes are deprecated and will be rejected in a future | ||
| // Clang version. | ||
| template <typename T> | ||
| __host__ __device__ S(T) -> S<T>; // expected-warning {{use of CUDA/HIP target attributes on deduction guides is deprecated; they will be rejected in a future version of Clang}} | ||
|
|
||
| __host__ __device__ void use_hd_guide() { | ||
| S s(42); // uses the explicit __host__ __device__ deduction guide above | ||
| } | ||
|
|
||
| // CUDA/HIP target attributes on deduction guides are rejected when they make | ||
| // the guide host-only, device-only, or a kernel. | ||
| template <typename U> | ||
| __host__ S(U) -> S<U>; // expected-error {{in CUDA/HIP, deduction guides may only be annotated with '__host__ __device__'; '__host__'-only, '__device__'-only, or '__global__' deduction guides are not allowed}} | ||
|
|
||
| template <typename V> | ||
| __device__ S(V) -> S<V>; // expected-error {{in CUDA/HIP, deduction guides may only be annotated with '__host__ __device__'; '__host__'-only, '__device__'-only, or '__global__' deduction guides are not allowed}} | ||
|
|
||
| template <typename W> | ||
| __global__ S(W) -> S<W>; // expected-error {{in CUDA/HIP, deduction guides may only be annotated with '__host__ __device__'; '__host__'-only, '__device__'-only, or '__global__' deduction guides are not allowed}} |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,111 @@ | ||
| // RUN: %clang_cc1 -std=c++20 -triple nvptx64-nvidia-cuda -fsyntax-only \ | ||
| // RUN: -fcuda-is-device -verify %s | ||
| // RUN: %clang_cc1 -std=c++20 -triple nvptx64-nvidia-cuda -fsyntax-only \ | ||
| // RUN: -verify %s | ||
| // expected-no-diagnostics | ||
|
|
||
| #include "Inputs/cuda.h" | ||
|
|
||
| // This test exercises class template argument deduction (CTAD) when there are | ||
| // multiple constructors that differ only by constraints. In CUDA/HIP mode, the | ||
| // implementation must *not* collapse implicit deduction guides that have the | ||
| // same function type but different constraints; otherwise, CTAD can lose viable | ||
| // candidates. | ||
|
|
||
| template <typename T> | ||
| concept Signed = __is_signed(T); | ||
|
|
||
| template <typename T> | ||
| concept NotSigned = !Signed<T>; | ||
|
|
||
| // 1) Constrained ctors with different constraints: ensure we keep | ||
| // deduction guides that differ only by constraints. | ||
|
|
||
| template <typename T> | ||
| struct OverloadCTAD { | ||
| __host__ __device__ OverloadCTAD(T) requires Signed<T>; | ||
| __host__ __device__ OverloadCTAD(T) requires NotSigned<T>; | ||
| }; | ||
|
|
||
| __host__ __device__ void use_overload_ctad_hd() { | ||
| OverloadCTAD a(1); // T = int, uses Signed-constrained guide | ||
| OverloadCTAD b(1u); // T = unsigned int, uses NotSigned-constrained guide | ||
| } | ||
|
|
||
| __device__ void use_overload_ctad_dev() { | ||
| OverloadCTAD c(1); | ||
| OverloadCTAD d(1u); | ||
| } | ||
|
|
||
| __global__ void use_overload_ctad_global() { | ||
| OverloadCTAD e(1); | ||
| OverloadCTAD f(1u); | ||
| } | ||
|
|
||
| // 2) Add a pair of constructors that have the same signature and the same | ||
| // constraint but differ only by CUDA target attributes. This exercises the | ||
| // case where two implicit deduction guides would be identical except for | ||
| // their originating constructor's CUDA target. | ||
|
|
||
| template <typename T> | ||
| struct OverloadCTADTargets { | ||
| __host__ OverloadCTADTargets(T) requires Signed<T>; | ||
| __device__ OverloadCTADTargets(T) requires Signed<T>; | ||
| }; | ||
|
|
||
| __host__ void use_overload_ctad_targets_host() { | ||
| OverloadCTADTargets g(1); | ||
| } | ||
|
|
||
| __device__ void use_overload_ctad_targets_device() { | ||
| OverloadCTADTargets h(1); | ||
| } | ||
|
|
||
| // 3) Unconstrained host/device duplicates: identical signatures and no | ||
| // constraints, differing only by CUDA target attributes. | ||
|
|
||
| template <typename T> | ||
| struct UnconstrainedHD { | ||
| __host__ UnconstrainedHD(T); | ||
| __device__ UnconstrainedHD(T); | ||
| }; | ||
|
|
||
| __host__ __device__ void use_unconstrained_hd_hd() { | ||
| UnconstrainedHD u1(1); | ||
| } | ||
|
|
||
| __device__ void use_unconstrained_hd_dev() { | ||
| UnconstrainedHD u2(1); | ||
| } | ||
|
|
||
| __global__ void use_unconstrained_hd_global() { | ||
| UnconstrainedHD u3(1); | ||
| } | ||
|
|
||
| // 4) Constrained vs unconstrained ctors with the same signature: guides | ||
| // must not be collapsed away when constraints differ. | ||
|
|
||
| template <typename T> | ||
| concept IsInt = __is_same(T, int); | ||
|
|
||
| template <typename T> | ||
| struct ConstrainedVsUnconstrained { | ||
| __host__ __device__ ConstrainedVsUnconstrained(T); | ||
| __host__ __device__ ConstrainedVsUnconstrained(T) requires IsInt<T>; | ||
| }; | ||
|
|
||
| __host__ __device__ void use_constrained_vs_unconstrained_hd() { | ||
| ConstrainedVsUnconstrained a(1); // T = int, constrained guide viable | ||
| ConstrainedVsUnconstrained b(1u); // T = unsigned, only unconstrained guide | ||
| } | ||
|
|
||
| __device__ void use_constrained_vs_unconstrained_dev() { | ||
| ConstrainedVsUnconstrained c(1); | ||
| ConstrainedVsUnconstrained d(1u); | ||
| } | ||
|
|
||
| __global__ void use_constrained_vs_unconstrained_global() { | ||
| ConstrainedVsUnconstrained e(1); | ||
| ConstrainedVsUnconstrained f(1u); | ||
| } | ||
|
|
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to stick with "no target attributes on guides" policy? Or do we treat it as "guides are always HD, implicitly or explicitly". Considering that target overloads do play the role in the guide selection, I would be biased towards the latter, as it would be a case of business as usual, but with some restrictions. One could reason about overload resolution behavior of the guide using the same rules as we do for other functions.
If we stick with "no target attributes", we'd still need to mention implicit HD treatment, which makes it equivalent to the case above, but with the oddity that the function behaves as HD, but we are not allowed to state it explicitly, though we are generally allowed to make implicitly-HD functions of other kinds explicitly HD. I do not think this extra quirk is buying us anything useful, other than making it harder to start using target args on the guides. However, with non-HD variants being diagnosed as errors, we already have enough guardrails to protect us from unsupported (for now, at least) use of target attributes.
I'd drop the deprecation for explicit HD.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, Deduction guides aren't really anything, they should be the host/device of their target. They aren't even implicitly ANYTHING, they are a 'guide', not really even a function themselves.
I think disallowing target attributes is completely sensible here, and IMO, is the right way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. no target attributes it is, then.