Skip to content
Open
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 9 additions & 9 deletions clang/include/clang/AST/SYCLKernelInfo.h
Original file line number Diff line number Diff line change
Expand Up @@ -21,15 +21,12 @@ namespace clang {

class SYCLKernelInfo {
public:
SYCLKernelInfo(
CanQualType KernelNameType,
const FunctionDecl *KernelEntryPointDecl,
const std::string &KernelName)
:
KernelNameType(KernelNameType),
KernelEntryPointDecl(KernelEntryPointDecl),
KernelName(KernelName)
{}
SYCLKernelInfo(CanQualType KernelNameType,
const FunctionDecl *KernelEntryPointDecl,
const std::string &KernelName, int ParamCount)
: KernelNameType(KernelNameType),
KernelEntryPointDecl(KernelEntryPointDecl), KernelName(KernelName),
ParamCount(ParamCount) {}

CanQualType GetKernelNameType() const {
return KernelNameType;
Expand All @@ -43,10 +40,13 @@ class SYCLKernelInfo {
return KernelName;
}

const int &GetParamCount() const { return ParamCount; }

private:
CanQualType KernelNameType;
const FunctionDecl *KernelEntryPointDecl;
std::string KernelName;
int ParamCount;
};

} // namespace clang
Expand Down
1 change: 1 addition & 0 deletions clang/include/clang/Basic/Builtins.h
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,7 @@ enum LanguageID : uint16_t {
OCL_DSE = 0x400, // builtin requires OpenCL device side enqueue.
ALL_OCL_LANGUAGES = 0x800, // builtin for OCL languages.
HLSL_LANG = 0x1000, // builtin requires HLSL.
SYCL_LANG = 0x2000, // builtin requires SYCL.
ALL_LANGUAGES = C_LANG | CXX_LANG | OBJC_LANG, // builtin for all languages.
ALL_GNU_LANGUAGES = ALL_LANGUAGES | GNU_LANG, // builtin requires GNU mode.
ALL_MS_LANGUAGES = ALL_LANGUAGES | MS_LANG // builtin requires MS mode.
Expand Down
13 changes: 13 additions & 0 deletions clang/include/clang/Basic/Builtins.td
Original file line number Diff line number Diff line change
Expand Up @@ -4598,6 +4598,19 @@ def GetDeviceSideMangledName : LangBuiltin<"CUDA_LANG"> {
let Prototype = "char const*(...)";
}

// SYCL
def SYCLKernelName : LangBuiltin<"SYCL_LANG"> {
let Spellings = ["__builtin_sycl_kernel_name"];
let Attributes = [NoThrow, CustomTypeChecking];
let Prototype = "char const*(...)";
}

def SYCLKernelParamCount : LangBuiltin<"SYCL_LANG"> {
let Spellings = ["__builtin_sycl_kernel_param_count"];
let Attributes = [NoThrow, CustomTypeChecking];
let Prototype = "int(...)";
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add the Const and Constexpr attributes here if doing so doesn't cause significant issues. I can't think of any reason that these builtins shouldn't work during constant evaluation. We'll need to ensure that these builtins are always evaluated at compile-time and that constant arguments are passed for those that accept arguments (it is possible this is already accounted for and I just haven't read the code far enough yet).


// HLSL
def HLSLAll : LangBuiltin<"HLSL_LANG"> {
let Spellings = ["__builtin_hlsl_elementwise_all"];
Expand Down
3 changes: 3 additions & 0 deletions clang/include/clang/Basic/DiagnosticSemaKinds.td
Original file line number Diff line number Diff line change
Expand Up @@ -12199,6 +12199,9 @@ def err_sycl_kernel_name_type : Error<
def err_sycl_kernel_name_conflict : Error<
"'sycl_kernel_entry_point' kernel name argument conflicts with a previous"
" declaration">;
def err_builtin_invalid_argument_count : Error<"builtin requires exactly 1 argument">;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised that a generic diagnostic for this doesn't already exist, but I went looking and didn't find one.

I suggest we model this after err_attribute_wrong_number_arguments to make it more reusable.

Suggested change
def err_builtin_invalid_argument_count : Error<"builtin requires exactly 1 argument">;
def err_builtin_invalid_argument_count : Error<
"%0 %plural{0:takes no arguments|1:takes one argument|"
":requires exactly %1 arguments}1">;

Also, how about colocating this with the other generic builtin diagnostics? Search for "/// Built-in functions." in the same file.

def err_builtin_invalid_argument : Error<"invalid argument; argument must be a class or struct type"
" with a member type alias named 'type'">;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at some other diagnostics for wording inspiration. These included err_pragma_loop_invalid_argument_type, err_typecheck_unary_expr, err_hip_invalid_args_builtin_mangled_name, err_opencl_builtin_pipe_invalid_arg, err_opencl_builtin_to_addr_invalid_arg, err_attribute_vecreturn_only_vector_member, and err_typename_nested_not_found_enable_if. There isn't strong consistency across these, but the following is a little more consistent.

The name of the diagnostic should be changed; err_builtin_invalid_argument is a pretty generic name for a rather specific diagnostic.

Suggested change
def err_builtin_invalid_argument : Error<"invalid argument; argument must be a class or struct type"
" with a member type alias named 'type'">;
def err_sycl_kernel_name_argument_invalid : Error<
"invalid argument of type %0; expected a class or structure "
"with a member typedef or type alias named 'type'">;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took inspiration from some existing diagnostics with "argument must be a xyz" wording. I personally prefer that to "expected a...". I have no strong feelings though and so I have made this modification.

I did not specify the type via %0 in the diagnostic because we have the ^ pointing to the problematic type when we emit the diagnostic. Way back, someone told me to not include unnecessary information in diagnostics to keep it short, if we are already pointing to the problematic type.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the "argument must be a xyz" wording; that is consistent with other diagnostics too.

My motivation for including the type in the diagnostic is because the carat won't actually point to the type; I expect it will point to the expression that produces an object of the type. Given this:

template<typename KernelName>
struct kernel_id_t {
  using type = KernelName;
};
template<typename KernelName>
void f() {
  __builtin_sycl_kernel_name(kernel_id_t<KernelName>());
}

I would expect a diagnostic to look something like:

invalid argument; argument must be a class or structure with a member typedef or type alias named 'type'
  __builtin_sycl_kernel_name(kernel_id_t<KernelName>());
                             ^^^^^^^^^^^^^^^^^^^^^^^^^

Our expectation is that the expression will readily identify the type in expected usage as in the above case, but that isn't guaranteed. For example, the expression could call some function (I don't know of any good reason to do so, but style preferences could lead someone to do so).


def warn_cuda_maxclusterrank_sm_90 : Warning<
"maxclusterrank requires sm_90 or higher, CUDA arch provided: %0, ignoring "
Expand Down
47 changes: 41 additions & 6 deletions clang/lib/AST/ASTContext.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -11994,8 +11994,15 @@ bool ASTContext::DeclMustBeEmitted(const Decl *D) {

// SYCL kernel entry point functions are used to generate and emit
// the offload kernel.
if (LangOpts.SYCLIsDevice && FD->hasAttr<SYCLKernelEntryPointAttr>())
if (LangOpts.SYCLIsDevice) {
if (D->hasAttr<SYCLKernelEntryPointAttr>())
return true;
// FIXME: Existing tests fail if we limit emission to the kernel caller
// function and functions called from it. Once the sycl_device attribute
// is implemented, modify this check (and tests) to include it and
// return false.
Comment on lines +12000 to +12003
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since sycl_device doesn't exist upstream yet, I think we should refrain from mentioning it here. The following suggested change defers to the SYCL specification for expected future behavior.

I suspect more changes will be needed in this function. For example, should we be returning true for declarations attributed with AliasAttr or UsedAttr? Or for global variables with external linkage?

(I tend to prefix "FIXME" on every line of a FIXME comment so that a grep for FIXME yields the entire comment. It also helps to separate FIXME content from other content. Not everyone does this of course).

Suggested change
// FIXME: Existing tests fail if we limit emission to the kernel caller
// function and functions called from it. Once the sycl_device attribute
// is implemented, modify this check (and tests) to include it and
// return false.
// FIXME: The only functions that must be emitted during device compilation
// FIXME: are the kernel entry points and functions declared with SYCL_EXTERNAL.
// FIXME: However, some existing tests fail if the set of emitted functions is
// FIXME: limited to those functions and the ones they call; further investigation
// FIXME: is needed to determine how to address those test failures.

// return false;
}

// Constructors and destructors are required.
if (FD->hasAttr<ConstructorAttr>() || FD->hasAttr<DestructorAttr>())
Expand Down Expand Up @@ -13781,9 +13788,8 @@ void ASTContext::getFunctionFeatureMap(llvm::StringMap<bool> &FeatureMap,
}
}

static SYCLKernelInfo BuildSYCLKernelInfo(ASTContext &Context,
CanQualType KernelNameType,
const FunctionDecl *FD) {
static std::string GetSYCLKernelCallerName(ASTContext &Context,
CanQualType KernelNameType) {
// FIXME: Host and device compilations must agree on a name for the generated
// FIXME: SYCL kernel caller function. The name is provided to the SYCL
// FIXME: library on the host via __builtin_sycl_kernel_name() and the SYCL
Expand Down Expand Up @@ -13820,9 +13826,38 @@ static SYCLKernelInfo BuildSYCLKernelInfo(ASTContext &Context,
Buffer.reserve(128);
llvm::raw_string_ostream Out(Buffer);
MC->mangleSYCLKernelCallerName(KernelNameType, Out);
std::string KernelName = Out.str();
return Out.str();
}

static int GetSYCLKernelCallerParamCount(const FunctionDecl *FD) {
// The parameters of the compiler generated SYCL kernel caller function
// are generated using the parameters of the SYCL kernel entry point
// function. The first parameter of the SYCL kernel entry point function
// (i.e. FD), is the SYCL kernel object. If the SYCL kernel object does
// not contain SYCL special types, it can be passed as-is to the device.
// In this case, parameters of the SYCL kernel caller function are the
// same as that of the the SYCL kernel entry point function. However, if
// the SYCL kernel object contains a SYCL special type, it is decomposed
// to it's data members and the parameters of the kernel caller function
// are generated using these decomposed fields, instead of the kernel
// object.

// FIXME: SYCL special types are not currently supported and therefore we
// assume there is no decomposition and return the parameter count of SYCL
// kernel entry point function.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the initial PR, I think we should avoid discussing decomposition at all since we don't yet understand the motivation for it well enough to defend it. When/If we add support for decomposition, whether for SYCL special types or otherwise, we'll be in a better position to answer any questions. I suggest removing this comment for now.

Also, we don't actually require the SYCL kernel object to be the first parameter.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will we be upstreaming builtins in initial PR? I thought this was just for internal testing. If we are upstreaming all this initially I assume I don't even need this function and can just do FD->param_size() in BuildSYCLKernelInfo

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. My memory is fuzzy on that. Let's talk about that during the FE sync meeting today.

return FD->param_size();
}

static SYCLKernelInfo BuildSYCLKernelInfo(ASTContext &Context,
CanQualType KernelNameType,
const FunctionDecl *FD) {
// Get the mangled name.
std::string KernelCallerName =
GetSYCLKernelCallerName(Context, KernelNameType);
// Get number of arguments.
int ParamCount = GetSYCLKernelCallerParamCount(FD);

return { KernelNameType, FD, KernelName };
return {KernelNameType, FD, KernelCallerName, ParamCount};
}

void ASTContext::registerSYCLEntryPointFunction(FunctionDecl *FD) {
Expand Down
3 changes: 3 additions & 0 deletions clang/lib/Basic/Builtins.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -116,6 +116,9 @@ static bool builtinIsSupported(const Builtin::Info &BuiltinInfo,
/* CUDA Unsupported */
if (!LangOpts.CUDA && BuiltinInfo.Langs == CUDA_LANG)
return false;
/* SYCL Unsupported */
if (!LangOpts.isSYCL() && BuiltinInfo.Langs == SYCL_LANG)
return false;
/* CPlusPlus Unsupported */
if (!LangOpts.CPlusPlus && BuiltinInfo.Langs == CXX_LANG)
return false;
Expand Down
32 changes: 32 additions & 0 deletions clang/lib/CodeGen/CGBuiltin.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@
#include "clang/Basic/TargetBuiltins.h"
#include "clang/Basic/TargetInfo.h"
#include "clang/Basic/TargetOptions.h"
#include "clang/Basic/IdentifierTable.h"
#include "clang/CodeGen/CGFunctionInfo.h"
#include "clang/Frontend/FrontendDiagnostic.h"
#include "llvm/ADT/APFloat.h"
Expand Down Expand Up @@ -2538,6 +2539,20 @@ static RValue EmitHipStdParUnsupportedBuiltin(CodeGenFunction *CGF,
return RValue::get(CGF->Builder.CreateCall(UBF, Args));
}

const SYCLKernelInfo *GetSYCLKernelInfo(ASTContext &Ctx, const CallExpr *E) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const SYCLKernelInfo *GetSYCLKernelInfo(ASTContext &Ctx, const CallExpr *E) {
static const SYCLKernelInfo *GetSYCLKernelInfo(ASTContext &Ctx, const CallExpr *E) {

// Argument to the builtin is a kernel_id_t type trait which is used
// to retrieve the kernel name type.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kernel_id_t isn't a meaningful name here. The closest terminology the C++ standard has is a "transformation trait", but that doesn't include requirements such as default constructibility. I suggest:

Suggested change
// Argument to the builtin is a kernel_id_t type trait which is used
// to retrieve the kernel name type.
// The first argument to the builtin is an object that designates the type
// used as the name of a SYCL kernel. The value of the argument is evaluated
// and discarded; the kernel is identified by the argument's type. The type is
// is required to be a class or structure type that is default constructible, copy
// constructible, and that has a member typedef or type alias named 'type'.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please explain why we need it to be default constructible and copy constructible?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually can you explain this comment? How is the kernel identified by the argument's type? Won't the type of the argument be the class/struct enclosing type alias named type, not the kernel name type itself?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling the builtin requires constructing an object of the type and passing it as an argument to the builtin. However, you are right that we don't need to be overly prescriptive here; from the perspective of the builtin, it doesn't matter how the object is constructed.

Upon re-reading my suggested update, I agree some of it is ambiguous regarding which type it is referring to. How does this sound?

Suggested change
// Argument to the builtin is a kernel_id_t type trait which is used
// to retrieve the kernel name type.
// The first argument to the builtin is an object that designates the SYCL kernel.
// The argument is evaluated and its value is discarded; the SYCL kernel is
// identified based on the argument type. The argument type is required to be
// a class or structure with a member typedef or type alias named 'type'. The
// target type of the 'type' member is the SYCL kernel name type.

RecordDecl *RD = E->getArg(0)->getType()->castAs<RecordType>()->getDecl();
IdentifierTable &IdentTable = Ctx.Idents;
auto Name = DeclarationName(&(IdentTable.get("type")));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IdentifierTable::get() will create and return an IdentifierInfo reference even if no such identifier as been encountered yet. Perhaps it would be better to use find() here and to assert the identifier is found.

Suggested change
auto Name = DeclarationName(&(IdentTable.get("type")));
auto TypeIdentInfo = IdentTable.find("type");
assert(TypeIdentInfo != IdentTable.end();
auto Name = DeclarationName(*TypeIdentInfo);

NamedDecl *ND = (RD->lookup(Name)).front();
TypeAliasDecl *TD = cast<TypeAliasDecl>(ND);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TypeAliasDecl is specific to C++11 type aliases; I think TypedefNameDecl is needed here to support both typedefs and type aliases.

CanQualType KernelNameType = Ctx.getCanonicalType(TD->getUnderlyingType());

// Retrieve KernelInfo using the kernel name.
return Ctx.findSYCLKernelInfo(KernelNameType);
}

RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
const CallExpr *E,
ReturnValueSlot ReturnValue) {
Expand Down Expand Up @@ -5992,6 +6007,23 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
auto Str = CGM.GetAddrOfConstantCString(Name, "");
return RValue::get(Str.getPointer());
}
case Builtin::BI__builtin_sycl_kernel_name: {
// Retrieve the kernel info corresponding to kernel name type.
const SYCLKernelInfo *KernelInfo = GetSYCLKernelInfo(getContext(), E);
assert(KernelInfo && "Type does not correspond to a SYCL kernel name.");

// Emit the mangled name.
auto Str = CGM.GetAddrOfConstantCString(KernelInfo->GetKernelName(), "");
return RValue::get(Str.getPointer());
}
case Builtin::BI__builtin_sycl_kernel_param_count: {
// Retrieve the kernel info corresponding to kernel name type.
const SYCLKernelInfo *KernelInfo = GetSYCLKernelInfo(getContext(), E);
assert(KernelInfo && "Type does not correspond to a SYCL kernel name.");
// Emit total number of parameters of kernel caller function.
int test = KernelInfo->GetParamCount();
return RValue::get(llvm::ConstantInt::get(Int32Ty, test));
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Earlier comments apply to the above builtins as well. E.g., prefer IntTy over Int32Ty and error checking of a parameter index argument.

}

// If this is an alias for a lib function (e.g. __builtin_sin), emit
Expand Down
34 changes: 19 additions & 15 deletions clang/lib/CodeGen/CodeGenModule.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3168,6 +3168,22 @@ void CodeGenModule::EmitDeferred() {
CurDeclsToEmit.swap(DeferredDeclsToEmit);

for (GlobalDecl &D : CurDeclsToEmit) {
// If the Decl corresponds to a SYCL kernel entry point function, generate
// and emit the corresponding the SYCL kernel caller function i.e the
// offload kernel.
if (const auto *FD = D.getDecl()->getAsFunction()) {
if (LangOpts.SYCLIsDevice && FD->hasAttr<SYCLKernelEntryPointAttr>() &&
FD->isDefined()) {
// Generate and emit the offload kernel
EmitSYCLKernelCaller(FD, getContext());
// The offload kernel invokes the operator method of the SYCL kernel
// object i.e. the SYCL kernel function is invoked. Emit this function.
EmitDeferred();
// Do not emit the SYCL kernel entry point function.
continue;
}
}

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My motivation for moving this chunk of code later in the function was to avoid the recursive call to EmitDeferred() here since that recursion will already happen further below (along with an assertion that all deferred declarations were actually emitted; that is missing here).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is GetAddrOfGlobal below will result in an emit of the entry-point function. We need this block in the beginning to avoid emitting anything related the the entry-point function.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I didn't realize that. That makes sense. Perhaps we should add a comment to that effect. And update tests to validate that the entry point function isn't emitted on the device side.

// We should call GetAddrOfGlobal with IsForDefinition set to true in order
// to get GlobalValue with exactly the type we need, not something that
// might had been created for another decl with the same mangled name but
Expand Down Expand Up @@ -3197,21 +3213,6 @@ void CodeGenModule::EmitDeferred() {
if (LangOpts.OpenMP && OpenMPRuntime && OpenMPRuntime->emitTargetGlobal(D))
continue;

// If the Decl corresponds to a SYCL kernel entry point function, generate
// and emit the corresponding SYCL kernel caller function, i.e the
// offload kernel. Otherwise, emit the definition and move on to the next
// one.
const FunctionDecl *FD = nullptr;
if (LangOpts.SYCLIsDevice &&
(FD = D.getDecl()->getAsFunction()) != nullptr &&
FD->hasAttr<SYCLKernelEntryPointAttr>() &&
FD->isDefined()) {
// Generate and emit the offload kernel
EmitSYCLKernelCaller(FD, getContext());
} else {
EmitGlobalDefinition(D, GV);
}

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing this code requires restoring the original code I had replaced. The else branch was there to preserve existing functionality. I'm surprised this didn't cause any tests to fail!

    // Otherwise, emit the definition and move on to the next one.
    EmitGlobalDefinition(D, GV);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I think I accidently removed that. It is there in my fork. I'll put it back!

// If we found out that we need to emit more decls, do that recursively.
// This has the advantage that the decls are emitted in a DFS and related
// ones are close together, which is convenient for testing.
Expand Down Expand Up @@ -3515,6 +3516,9 @@ bool CodeGenModule::MayBeEmittedEagerly(const ValueDecl *Global) {
// Defer until all versions have been semantically checked.
if (FD->hasAttr<TargetVersionAttr>() && !FD->isMultiVersion())
return false;
// Defer emission of SYCL kernel entry point functions.
if (LangOpts.SYCLIsDevice && FD->hasAttr<SYCLKernelEntryPointAttr>())
return false;
}
if (const auto *VD = dyn_cast<VarDecl>(Global)) {
if (Context.getInlineVariableDefinitionKind(VD) ==
Expand Down
12 changes: 7 additions & 5 deletions clang/lib/CodeGen/CodeGenSYCL.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -45,12 +45,14 @@ void CodeGenModule::EmitSYCLKernelCaller(const FunctionDecl *KernelEntryPointFn,
llvm::FunctionType *FnTy = getTypes().GetFunctionType(FnInfo);

// Retrieve the generated name for the SYCL kernel caller function
const auto *SKEPAttr = KernelEntryPointFn->getAttr<SYCLKernelEntryPointAttr>();
assert(SKEPAttr && "Missing sycl_kernel_entry_point attribute");
CanQualType KernelNameType = Ctx.getCanonicalType(SKEPAttr->getKernelName());
const SYCLKernelInfo &SKI = Ctx.SYCLKernels.at(KernelNameType);
const auto *KernelEntryPointAttr =
KernelEntryPointFn->getAttr<SYCLKernelEntryPointAttr>();
CanQualType KernelNameType =
Ctx.getCanonicalType(KernelEntryPointAttr->getKernelName());
const SYCLKernelInfo *KernelInfo = Ctx.findSYCLKernelInfo(KernelNameType);
assert(KernelInfo && "Type does not correspond to a kernel name");
auto *Fn = llvm::Function::Create(FnTy, llvm::GlobalVariable::ExternalLinkage,
SKI.GetKernelName(), &getModule());
KernelInfo->GetKernelName(), &getModule());

// Emit the SYCL kernel caller function
CodeGenFunction CGF(*this);
Expand Down
34 changes: 34 additions & 0 deletions clang/lib/Sema/SemaChecking.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2450,6 +2450,25 @@ static bool BuiltinCountZeroBitsGeneric(Sema &S, CallExpr *TheCall) {
return false;
}

// The argument must be a class or struct with a member
// named type.
static bool CheckBuiltinSyclKernelName(Sema &S, CallExpr *TheCall) {
QualType ArgTy = TheCall->getArg(0)->getType();
const auto *RT = ArgTy->getAs<RecordType>();
Comment on lines +2453 to +2457
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The primary intent of this function appears to be to validate that the type that identifies the kernel meets specific requirements. This is useful and is separable from the fact that the type is obtained by querying the type of the first argument in a call expression; I think it would be clearer to colocate the retrieval of the type with the code that checks for the number of arguments below.

Suggested change
// The argument must be a class or struct with a member
// named type.
static bool CheckBuiltinSyclKernelName(Sema &S, CallExpr *TheCall) {
QualType ArgTy = TheCall->getArg(0)->getType();
const auto *RT = ArgTy->getAs<RecordType>();
// The SYCL kernel builtin functions identify the kernel to operate on
// by the type of one of the arguments. That type is required to be a
// class or structure type with a member typedef or type alias named
// "type".
static bool CheckBuiltinSyclKernelIDType(Sema &S, QualType T) {
const auto *RT = T->getAs<RecordType>();


if(!RT)
return true;

RecordDecl *RD = RT->getDecl();
IdentifierTable &IdentTable = S.Context.Idents;
auto Name = DeclarationName(&(IdentTable.get("type")));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if I should be worried about this, but IdentifierTable::get() will create an IdentifierInfo object if one doesn't already exist. That makes me nervous. There is a find() member function to do a lookup without creating a new object, but it isn't particularly ergonomic to use. The lack of a getExisting() or similar function that can return null is a little surprising to me though, so maybe I'm being overly paranoid. Maybe it is worth using find() here to avoid unlikely potential problems?

DeclContext::lookup_result Lookup = RD->lookup(Name);
if (Lookup.empty() || !Lookup.isSingleResult() || !isa<TypeAliasDecl>(Lookup.front()))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (Lookup.empty() || !Lookup.isSingleResult() || !isa<TypeAliasDecl>(Lookup.front()))
if (Lookup.empty() || !Lookup.isSingleResult() || !isa<TypedefNameDecl>(Lookup.front()))

return true;

return false;
}

ExprResult
Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
CallExpr *TheCall) {
Expand Down Expand Up @@ -3252,6 +3271,21 @@ Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
}
break;
}
case Builtin::BI__builtin_sycl_kernel_name:
case Builtin::BI__builtin_sycl_kernel_param_count: {
// Builtin takes 1 argument
if (TheCall->getNumArgs() != 1) {
Diag(TheCall->getBeginLoc(), diag::err_builtin_invalid_argument_count);
return ExprError();
}

if (CheckBuiltinSyclKernelName(*this, TheCall)) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per prior comment:

Suggested change
if (CheckBuiltinSyclKernelName(*this, TheCall)) {
if (CheckBuiltinSyclKernelName(*this, TheCall->getArg(0)->getType())) {

Diag(TheCall->getArg(0)->getBeginLoc(), diag::err_builtin_invalid_argument);
return ExprError();
}

break;
}
case Builtin::BI__builtin_popcountg:
if (BuiltinPopcountg(*this, TheCall))
return ExprError();
Expand Down
60 changes: 60 additions & 0 deletions clang/test/CodeGenSYCL/builtin-sycl-kernel-name.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
// RUN: %clang_cc1 -fsycl-is-host -emit-llvm -triple x86_64 %s -o - | FileCheck %s

// Test IR generated by __builtin_sycl_kernel_name(). This builtin accepts a SYCL
// kernel name type and returns it's mangled name.

class kernel_name_1;
class kernel_name_2;
typedef kernel_name_2 kernel_name_TD;
class kernel_name_3;
class kernel_name_4;
typedef kernel_name_4 kernel_name_TD2;

template<typename KN>
struct kernel_id_t {
using type = KN;
};

struct kernel_id_nt {
using type = kernel_name_3;
};

template <typename name, typename Func>
__attribute__((sycl_kernel_entry_point(name))) void kernel_single_task(const Func kernelFunc) {
kernelFunc();
}

struct SYCLKernel {
int m;
public:
void operator()() const {}
};

void test() {
SYCLKernel Obj;
kernel_single_task<kernel_name_1>(Obj);
kernel_single_task<kernel_name_TD>(Obj);
kernel_single_task<kernel_name_3>(Obj);
kernel_single_task<kernel_name_TD2>(Obj);
const char* test1 = __builtin_sycl_kernel_name(kernel_id_t<kernel_name_1>());
const char* test2 = __builtin_sycl_kernel_name(kernel_id_t<kernel_name_TD>());
const char* test3 = __builtin_sycl_kernel_name(kernel_id_nt());
const char* test4 = __builtin_sycl_kernel_name(kernel_id_t<kernel_name_4>());
}

// Kernel names retrieved from KernelInfo map
// CHECK: @0 = private unnamed_addr constant [44 x i8] c"_Z20__sycl_kernel_callerI13kernel_name_1Evv\00", align 1
// CHECK: @1 = private unnamed_addr constant [44 x i8] c"_Z20__sycl_kernel_callerI13kernel_name_2Evv\00", align 1
// CHECK: @2 = private unnamed_addr constant [44 x i8] c"_Z20__sycl_kernel_callerI13kernel_name_3Evv\00", align 1
// CHECK: @3 = private unnamed_addr constant [44 x i8] c"_Z20__sycl_kernel_callerI13kernel_name_4Evv\00", align 1

// CHECK: define dso_local void @_Z4testv()
// CHECK: %test1 = alloca ptr, align 8
// CHECK: %test2 = alloca ptr, align 8
// CHECK: %test3 = alloca ptr, align 8
// CHECK: %test4 = alloca ptr, align 8
// CHECK: store ptr @0, ptr %test1, align 8
// CHECK: store ptr @1, ptr %test2, align 8
// CHECK: store ptr @2, ptr %test3, align 8
// CHECK: store ptr @3, ptr %test4, align 8

Loading