51 changes: 51 additions & 0 deletions clang/docs/HIPSupport.rst
@@ -412,6 +412,57 @@ Example Usage
__host__ __device__ int Four(void) __attribute__((weak, alias("_Z6__Fourv")));
__host__ __device__ float Four(float f) __attribute__((weak, alias("_Z6__Fourf")));

C++17 Class Template Argument Deduction (CTAD) Support
======================================================

Clang supports C++17 Class Template Argument Deduction (CTAD) in both host and
device code for HIP. This allows you to omit template arguments when creating
class template instances, letting the compiler deduce them from constructor
arguments.

.. code-block:: c++

   #include <tuple>

   __host__ __device__ void func() {
     std::tuple<int, int> t = std::tuple(1, 1);
   }

In the above example, ``std::tuple(1, 1)`` automatically deduces the type to be
``std::tuple<int, int>``.

Deduction Guides
----------------

User-defined deduction guides are also supported. Since deduction guides are not
executable code and only participate in type deduction, they semantically behave
as ``__host__ __device__``. This ensures they are available for deduction in both
host and device contexts, and CTAD continues to respect any constraints on the
corresponding constructors in the usual C++ way.

.. code-block:: c++

   template <typename T>
   struct MyType {
     T value;
     __device__ MyType(T v) : value(v) {}
   };

   MyType(float) -> MyType<double>;

   __device__ void deviceFunc() {
     MyType m(1.0f); // Deduces MyType<double>
   }

.. note::

   Explicit HIP target attributes such as ``__host__`` or ``__device__``
   are currently only permitted on deduction guides when both are present
   (``__host__ __device__``). This usage is deprecated and will be rejected
   in a future version of Clang; prefer omitting HIP target attributes on
   deduction guides entirely. Clang treats all deduction guides as if they
   were ``__host__ __device__``, so ``__host__``-only, ``__device__``-only,
   or ``__global__`` deduction guides are rejected as ill-formed.

Host and Device Attributes of Default Destructors
===================================================
17 changes: 17 additions & 0 deletions clang/docs/ReleaseNotes.rst
@@ -659,6 +659,23 @@ RISC-V Support
CUDA/HIP Language Changes
^^^^^^^^^^^^^^^^^^^^^^^^^

- Clang now supports C++17 Class Template Argument Deduction (CTAD) in CUDA/HIP
device code by treating deduction guides as if they were ``__host__ __device__``.

- Clang avoids ambiguous CTAD in CUDA/HIP by not synthesizing duplicate implicit
deduction guides when ``__host__`` and ``__device__`` constructors differ only
in CUDA target attributes (same signature and constraints).

- Clang diagnoses CUDA/HIP deduction guides that are annotated as host-only,
device-only, or ``__global__`` as errors. Explicit ``__host__ __device__``
deduction guides remain accepted for now but are deprecated and will be
rejected in a future version of Clang; deduction guides do not participate
in code generation and are treated as implicitly host+device.

- Clang preserves distinct implicit deduction guides for constructors that differ
by constraints, so constraint-based CTAD works in CUDA/HIP device code as in
standard C++.

CUDA Support
^^^^^^^^^^^^

8 changes: 8 additions & 0 deletions clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -2769,6 +2769,14 @@
def err_deduction_guide_name_not_class_template : Error<
"cannot specify deduction guide for "
"%select{<error>|function template|variable template|alias template|"
"template template parameter|concept|dependent template name}0 %1">;
def err_deduction_guide_target_attr : Error<
"in CUDA/HIP, deduction guides may only be annotated with "
"'__host__ __device__'; '__host__'-only, '__device__'-only, or "
"'__global__' deduction guides are not allowed">;
def warn_deduction_guide_target_attr_deprecated : Warning<
"use of CUDA/HIP target attributes on deduction guides is deprecated; "
"they will be rejected in a future version of Clang">,
InGroup<DeprecatedAttributes>;
def err_deduction_guide_wrong_scope : Error<
"deduction guide must be declared in the same scope as template %q0">;
def err_deduction_guide_defines_function : Error<
12 changes: 12 additions & 0 deletions clang/lib/Sema/SemaCUDA.cpp
@@ -215,6 +215,12 @@
CUDAFunctionTarget SemaCUDA::IdentifyTarget(const FunctionDecl *D,
if (D == nullptr)
return CurCUDATargetCtx.Target;

// C++ deduction guides are never codegen'ed and only participate in template
// argument deduction. Treat them as if they were always host+device so that
// CUDA/HIP target checking never rejects their use based solely on target.
if (isa<CXXDeductionGuideDecl>(D))
return CUDAFunctionTarget::HostDevice;

if (D->hasAttr<CUDAInvalidTargetAttr>())
return CUDAFunctionTarget::InvalidTarget;

@@ -986,6 +992,12 @@
bool SemaCUDA::CheckCall(SourceLocation Loc, FunctionDecl *Callee) {
if (ExprEvalCtx.isUnevaluated() || ExprEvalCtx.isConstantEvaluated())
return true;

// C++ deduction guides participate in overload resolution but are not
// callable functions and are never codegen'ed. Treat them as always
// allowed for CUDA/HIP compatibility checking.
if (isa<CXXDeductionGuideDecl>(Callee))
return true;

// FIXME: Is bailing out early correct here? Should we instead assume that
// the caller is a global initializer?
FunctionDecl *Caller = SemaRef.getCurFunctionDecl(/*AllowLambda=*/true);
24 changes: 24 additions & 0 deletions clang/lib/Sema/SemaDeclAttr.cpp
@@ -8056,6 +8056,30 @@
void Sema::ProcessDeclAttributeList(
}
}

// CUDA/HIP: restrict explicit CUDA target attributes on deduction guides.
//
// Deduction guides are not callable functions and never participate in
// codegen; they are always treated as host+device for CUDA/HIP semantic
// checks. We therefore allow either no CUDA target attributes or an explicit
// '__host__ __device__' annotation, but reject guides that are host-only,
// device-only, or marked '__global__'. The use of explicit CUDA/HIP target
// attributes on deduction guides is deprecated and will be rejected in a
// future Clang version.
if (getLangOpts().CUDA)
if (auto *Guide = dyn_cast<CXXDeductionGuideDecl>(D)) {
bool HasHost = Guide->hasAttr<CUDAHostAttr>();
bool HasDevice = Guide->hasAttr<CUDADeviceAttr>();
bool HasGlobal = Guide->hasAttr<CUDAGlobalAttr>();

if (HasGlobal || HasHost != HasDevice) {
Diag(Guide->getLocation(), diag::err_deduction_guide_target_attr);
Guide->setInvalidDecl();
} else if (HasHost && HasDevice) {
Diag(Guide->getLocation(),
diag::warn_deduction_guide_target_attr_deprecated);

[Review discussion]

Member: Do we want to stick with the "no target attributes on guides" policy, or treat it as "guides are always HD, implicitly or explicitly"? Considering that target overloads do play a role in guide selection, I would lean towards the latter: it would be business as usual, with some restrictions, and one could reason about a guide's overload-resolution behavior using the same rules as for other functions.

If we stick with "no target attributes", we would still need to document the implicit HD treatment, which makes it equivalent to the case above, but with the oddity that the function behaves as HD while we are not allowed to state that explicitly, even though implicitly-HD functions of other kinds may generally be made explicitly HD. I do not think this extra quirk buys us anything useful beyond making it harder to start using target attributes on guides. With the non-HD variants diagnosed as errors, we already have enough guardrails against unsupported use of target attributes. I'd drop the deprecation for explicit HD.

Collaborator: IMO, deduction guides aren't really anything; they should take on the host/device of their target. They aren't even implicitly anything: they are a guide, not really a function themselves. I think disallowing target attributes is completely sensible here and the right way to go.

Member: OK, no target attributes it is, then.

}
}

// Do not permit 'constructor' or 'destructor' attributes on __device__ code.
if (getLangOpts().CUDAIsDevice && D->hasAttr<CUDADeviceAttr>() &&
(D->hasAttr<ConstructorAttr>() || D->hasAttr<DestructorAttr>()) &&
66 changes: 64 additions & 2 deletions clang/lib/Sema/SemaTemplateDeductionGuide.cpp
@@ -54,6 +54,26 @@
using namespace clang;
using namespace sema;

namespace {

/// Return true if two associated-constraint sets are semantically equal.
static bool HaveSameAssociatedConstraints(
Sema &SemaRef, const NamedDecl *Old, ArrayRef<AssociatedConstraint> OldACs,
const NamedDecl *New, ArrayRef<AssociatedConstraint> NewACs) {
if (OldACs.size() != NewACs.size())
return false;
if (OldACs.empty())
return true;

// General case: pairwise compare each associated constraint expression.
Sema::TemplateCompareNewDeclInfo NewInfo(New);
for (size_t I = 0, E = OldACs.size(); I != E; ++I)
if (!SemaRef.AreConstraintExpressionsEqual(
Old, OldACs[I].ConstraintExpr, NewInfo, NewACs[I].ConstraintExpr))
return false;

return true;
}

/// Tree transform to "extract" a transformed type from a class template's
/// constructor to a deduction guide.
class ExtractTypeForDeductionGuide
@@ -218,9 +238,51 @@
buildDeductionGuide(Sema &SemaRef, TemplateDecl *OriginalTemplate,
TInfo->getTypeLoc().castAs<FunctionProtoTypeLoc>().getParams();

// Build the implicit deduction guide template.
QualType GuideType = TInfo->getType();

// In CUDA/HIP mode, avoid duplicate implicit guides that differ only in CUDA
// target attributes (same constructor signature and constraints).
if (IsImplicit && Ctor && SemaRef.getLangOpts().CUDA) {
SmallVector<AssociatedConstraint, 4> NewACs;
Ctor->getAssociatedConstraints(NewACs);

for (NamedDecl *Existing : DC->lookup(DeductionGuideName)) {
auto *ExistingFT = dyn_cast<FunctionTemplateDecl>(Existing);
auto *ExistingGuide =
ExistingFT
? dyn_cast<CXXDeductionGuideDecl>(ExistingFT->getTemplatedDecl())
: dyn_cast<CXXDeductionGuideDecl>(Existing);
if (!ExistingGuide)
continue;

// Only consider guides that were also synthesized from a constructor.
auto *ExistingCtor = ExistingGuide->getCorrespondingConstructor();
if (!ExistingCtor)
continue;

// If the underlying constructors are overloads (different signatures once
// CUDA attributes are ignored), they should each get their own guides.
if (SemaRef.IsOverload(Ctor, ExistingCtor,
/*UseMemberUsingDeclRules=*/false,
/*ConsiderCudaAttrs=*/false))
continue;

// At this point, the constructors have the same signature ignoring CUDA
// attributes. Decide whether their associated constraints are also the
// same; only in that case do we treat one guide as a duplicate of the
// other.
SmallVector<AssociatedConstraint, 4> ExistingACs;
ExistingCtor->getAssociatedConstraints(ExistingACs);

if (HaveSameAssociatedConstraints(SemaRef, ExistingCtor, ExistingACs,
Ctor, NewACs))
return Existing;
}
}

  auto *Guide = CXXDeductionGuideDecl::Create(
-     SemaRef.Context, DC, LocStart, ES, Name, TInfo->getType(), TInfo, LocEnd,
-     Ctor, DeductionCandidate::Normal, FunctionTrailingRC);
+     SemaRef.Context, DC, LocStart, ES, Name, GuideType, TInfo, LocEnd, Ctor,
+     DeductionCandidate::Normal, FunctionTrailingRC);
Guide->setImplicit(IsImplicit);
Guide->setParams(Params);

32 changes: 32 additions & 0 deletions clang/test/SemaCUDA/deduction-guide-attrs.cu
@@ -0,0 +1,32 @@
// RUN: %clang_cc1 -std=c++17 -triple nvptx64-nvidia-cuda -fsyntax-only \
// RUN: -fcuda-is-device -verify %s
// RUN: %clang_cc1 -std=c++17 -triple nvptx64-nvidia-cuda -fsyntax-only \
// RUN: -verify %s

#include "Inputs/cuda.h"

template <typename T>
struct S {
__host__ __device__ S(T);
};

// A host+device deduction guide is allowed and participates in CTAD, but its
// explicit target attributes are deprecated and will be rejected in a future
// Clang version.
template <typename T>
__host__ __device__ S(T) -> S<T>; // expected-warning {{use of CUDA/HIP target attributes on deduction guides is deprecated; they will be rejected in a future version of Clang}}

__host__ __device__ void use_hd_guide() {
S s(42); // uses the explicit __host__ __device__ deduction guide above
}

// CUDA/HIP target attributes on deduction guides are rejected when they make
// the guide host-only, device-only, or a kernel.
template <typename U>
__host__ S(U) -> S<U>; // expected-error {{in CUDA/HIP, deduction guides may only be annotated with '__host__ __device__'; '__host__'-only, '__device__'-only, or '__global__' deduction guides are not allowed}}

template <typename V>
__device__ S(V) -> S<V>; // expected-error {{in CUDA/HIP, deduction guides may only be annotated with '__host__ __device__'; '__host__'-only, '__device__'-only, or '__global__' deduction guides are not allowed}}

template <typename W>
__global__ S(W) -> S<W>; // expected-error {{in CUDA/HIP, deduction guides may only be annotated with '__host__ __device__'; '__host__'-only, '__device__'-only, or '__global__' deduction guides are not allowed}}
111 changes: 111 additions & 0 deletions clang/test/SemaCUDA/deduction-guide-overload.cu
@@ -0,0 +1,111 @@
// RUN: %clang_cc1 -std=c++20 -triple nvptx64-nvidia-cuda -fsyntax-only \
// RUN: -fcuda-is-device -verify %s
// RUN: %clang_cc1 -std=c++20 -triple nvptx64-nvidia-cuda -fsyntax-only \
// RUN: -verify %s
// expected-no-diagnostics

#include "Inputs/cuda.h"

// This test exercises class template argument deduction (CTAD) when there are
// multiple constructors that differ only by constraints. In CUDA/HIP mode, the
// implementation must *not* collapse implicit deduction guides that have the
// same function type but different constraints; otherwise, CTAD can lose viable
// candidates.

template <typename T>
concept Signed = __is_signed(T);

template <typename T>
concept NotSigned = !Signed<T>;

// 1) Constrained ctors with different constraints: ensure we keep
// deduction guides that differ only by constraints.

template <typename T>
struct OverloadCTAD {
__host__ __device__ OverloadCTAD(T) requires Signed<T>;
__host__ __device__ OverloadCTAD(T) requires NotSigned<T>;
};

__host__ __device__ void use_overload_ctad_hd() {
OverloadCTAD a(1); // T = int, uses Signed-constrained guide
OverloadCTAD b(1u); // T = unsigned int, uses NotSigned-constrained guide
}

__device__ void use_overload_ctad_dev() {
OverloadCTAD c(1);
OverloadCTAD d(1u);
}

__global__ void use_overload_ctad_global() {
OverloadCTAD e(1);
OverloadCTAD f(1u);
}

// 2) Add a pair of constructors that have the same signature and the same
// constraint but differ only by CUDA target attributes. This exercises the
// case where two implicit deduction guides would be identical except for
// their originating constructor's CUDA target.

template <typename T>
struct OverloadCTADTargets {
__host__ OverloadCTADTargets(T) requires Signed<T>;
__device__ OverloadCTADTargets(T) requires Signed<T>;
};

__host__ void use_overload_ctad_targets_host() {
OverloadCTADTargets g(1);
}

__device__ void use_overload_ctad_targets_device() {
OverloadCTADTargets h(1);
}

// 3) Unconstrained host/device duplicates: identical signatures and no
// constraints, differing only by CUDA target attributes.

template <typename T>
struct UnconstrainedHD {
__host__ UnconstrainedHD(T);
__device__ UnconstrainedHD(T);
};

__host__ __device__ void use_unconstrained_hd_hd() {
UnconstrainedHD u1(1);
}

__device__ void use_unconstrained_hd_dev() {
UnconstrainedHD u2(1);
}

__global__ void use_unconstrained_hd_global() {
UnconstrainedHD u3(1);
}

// 4) Constrained vs unconstrained ctors with the same signature: guides
// must not be collapsed away when constraints differ.

template <typename T>
concept IsInt = __is_same(T, int);

template <typename T>
struct ConstrainedVsUnconstrained {
__host__ __device__ ConstrainedVsUnconstrained(T);
__host__ __device__ ConstrainedVsUnconstrained(T) requires IsInt<T>;
};

__host__ __device__ void use_constrained_vs_unconstrained_hd() {
ConstrainedVsUnconstrained a(1); // T = int, constrained guide viable
ConstrainedVsUnconstrained b(1u); // T = unsigned, only unconstrained guide
}

__device__ void use_constrained_vs_unconstrained_dev() {
ConstrainedVsUnconstrained c(1);
ConstrainedVsUnconstrained d(1u);
}

__global__ void use_constrained_vs_unconstrained_global() {
ConstrainedVsUnconstrained e(1);
ConstrainedVsUnconstrained f(1u);
}
