Skip to content
Merged
Show file tree
Hide file tree
Changes from 118 commits
Commits
Show all changes
138 commits
Select commit Hold shift + click to select a range
652caa8
Preliminary implementation of work_group_memory extension
lbushi25 Aug 14, 2024
76daf77
Preliminary implementation of work_group_memory extension
lbushi25 Aug 14, 2024
21e082b
Implement work_group_memory extension
lbushi25 Aug 16, 2024
025cbc4
Implement work_group_memory extension
lbushi25 Aug 19, 2024
b94f7c9
Implement work group memory
lbushi25 Aug 20, 2024
0d6d694
Remove debug dumps
lbushi25 Aug 22, 2024
448071f
Merge branch 'intel:sycl' into work_group_memoy_new
lbushi25 Aug 22, 2024
9f2973a
Update work_group_memory.hpp
lbushi25 Aug 22, 2024
852315f
Remove include of deleted header file
lbushi25 Aug 22, 2024
4234022
Fix SPIRV compilation errors
lbushi25 Aug 22, 2024
ae5eb7e
Remove accidental change
lbushi25 Aug 23, 2024
8ce0280
Formatting changes
lbushi25 Aug 23, 2024
4ee31a5
Formatting changes
lbushi25 Aug 23, 2024
7b1b90b
Put the work group memory doc to supported
lbushi25 Aug 23, 2024
cf7476e
More formatting changes
lbushi25 Aug 23, 2024
50c0954
Delete sycl/include/sycl/ext/oneapi/experimental/test.cpp
lbushi25 Aug 23, 2024
44811b8
Yet more formatting changes
lbushi25 Aug 23, 2024
ad1046f
Merge branch 'work_group_memoy_new' of https://github.com/lbushi25/ll…
lbushi25 Aug 23, 2024
d343a2e
Fix warnings on Linux
lbushi25 Aug 23, 2024
3f1bc30
Remove unnecessary forward declaration from handler.hpp
lbushi25 Aug 23, 2024
103e233
Remove rvalue references in favor of const lvalue references
lbushi25 Aug 23, 2024
bfa5830
Fix syntax errors
lbushi25 Aug 23, 2024
2031478
Fix syntax errors
lbushi25 Aug 23, 2024
76f0acc
Don't explicitly make the work_group_memory class device-copyable as …
lbushi25 Aug 23, 2024
e0ad435
Don't explicitly make the work_group_memory class device-copyable as …
lbushi25 Aug 23, 2024
3513251
Remove some more unnecessary code
lbushi25 Aug 23, 2024
2cec997
Update work_group_memory.hpp
lbushi25 Aug 23, 2024
4c8b196
Update work_group_memory.hpp
lbushi25 Aug 23, 2024
a0b70e2
Formatting
lbushi25 Aug 23, 2024
ed8f125
Move doc to experimental folder
lbushi25 Aug 23, 2024
9460876
Update status section in doc
lbushi25 Aug 23, 2024
ac7130a
Merge branch 'work_group_memoy_new' of https://github.com/lbushi25/ll…
lbushi25 Aug 23, 2024
e2889b3
Final fixes
lbushi25 Aug 23, 2024
d6c78b9
Remove unnecessary include
lbushi25 Aug 23, 2024
8f7a07b
Add initial tests for work_group_memory extension
lbushi25 Aug 28, 2024
ae59899
Add E2E tests for work group memory
lbushi25 Aug 29, 2024
8cff603
Fix formatting
lbushi25 Aug 29, 2024
3ceead1
Resolve merge conflict
lbushi25 Sep 24, 2024
3228aeb
Revamp tests for work group memory extension
lbushi25 Sep 27, 2024
0e95ee5
Remove sanity test
lbushi25 Sep 27, 2024
52f13f0
Move extension doc to proposed
lbushi25 Sep 27, 2024
71d1013
Restore proposed status of work group memory doc
lbushi25 Sep 27, 2024
d48bc42
Fix unusd variable warning
lbushi25 Sep 27, 2024
f6515bc
Reduce test size to make sure UR does not run out or resources
lbushi25 Sep 27, 2024
3e4c73c
Replace sycl.hpp with core.hpp in the includes of E2E test
lbushi25 Sep 27, 2024
c84229e
Remove sycl.hpp include from tests
lbushi25 Sep 30, 2024
d2fddd8
Add support for unbounded arrays
lbushi25 Oct 2, 2024
0f677c2
Fix compilation errors
lbushi25 Oct 2, 2024
6ef823e
Improve swap test
lbushi25 Oct 2, 2024
6dc262a
Merge branch 'work_group_memoy_new' of https://github.com/lbushi25/ll…
lbushi25 Oct 2, 2024
4de6d50
Refactor CodeGenTypes.cpp changes
lbushi25 Oct 3, 2024
5653f04
Refactor CodeGenTypes.cpp changes
lbushi25 Oct 3, 2024
40eb63e
translate unbounded arrays to 1-sized arrays in LLVM IR in device com…
lbushi25 Oct 3, 2024
f6a0df7
Remove trailing spaces
lbushi25 Oct 3, 2024
026501c
Add unbounded array support by modifying LLVM IR -> SPIRV type lowering
lbushi25 Oct 3, 2024
2ce21b3
Revert CodeGenTypes.cpp changes
lbushi25 Oct 3, 2024
a9b2875
Fix merge conflicts
lbushi25 Oct 3, 2024
3821df4
Merge branch 'intel:sycl' into work_group_memoy_new
lbushi25 Oct 7, 2024
d73b0b1
Update SPIRVWriter.cpp
lbushi25 Oct 7, 2024
c1087ad
Merge branch 'work_group_memoy_new' of https://github.com/lbushi25/ll…
lbushi25 Oct 7, 2024
31481b8
Revert SPIRV translator changes
lbushi25 Oct 10, 2024
396169f
Revert SPIRV translator changes
lbushi25 Oct 10, 2024
dc37b2c
Merge branch 'intel:sycl' into work_group_memoy_new
lbushi25 Oct 10, 2024
236139f
Revert SPIRV translator changes
lbushi25 Oct 10, 2024
2beda8e
Stash changes
lbushi25 Oct 11, 2024
dbafe31
Add inital free function kernel support for work group memory
lbushi25 Oct 11, 2024
1b968df
merge latest upstream changes
lbushi25 Oct 11, 2024
e6b66c3
Apply suggestions
lbushi25 Oct 11, 2024
84ef6a8
Merge branch 'intel:sycl' into work_group_memoy_new
lbushi25 Oct 11, 2024
f24af09
Merge branch 'intel:sycl' into work_group_memoy_new
lbushi25 Oct 14, 2024
7dfa80b
Address reviews
lbushi25 Oct 14, 2024
3957cb5
Address reviews
lbushi25 Oct 14, 2024
91820d8
Address reviews
lbushi25 Oct 14, 2024
34bc23d
Update WorkGroupMemoryBackendArgument.cpp
lbushi25 Oct 14, 2024
3acf835
Update WorkGroupMemoryBackendArgument.cpp
lbushi25 Oct 14, 2024
604c640
Fix unit test implementation for backend kernel argument
lbushi25 Oct 15, 2024
5510208
Merge branch 'work_group_memoy_new' of https://github.com/lbushi25/ll…
lbushi25 Oct 15, 2024
d9418f9
Merge branch 'sycl' into work_group_memoy_new
lbushi25 Oct 15, 2024
e90a3b7
Add frontend tests
lbushi25 Oct 15, 2024
3b9a55a
Merge branch 'work_group_memoy_new' of https://github.com/lbushi25/ll…
lbushi25 Oct 15, 2024
b9ed6f4
Update work_group_memory.cpp
lbushi25 Oct 15, 2024
af08c19
Fix segmentation fault
lbushi25 Oct 15, 2024
b2a97a2
Merge branch 'work_group_memoy_new' of https://github.com/lbushi25/ll…
lbushi25 Oct 15, 2024
77a6de1
Move implementation details away from handler to handler_impl
lbushi25 Oct 16, 2024
3cb0ba4
Fix ABI breakage
lbushi25 Oct 16, 2024
1783f75
Update handler.hpp
lbushi25 Oct 16, 2024
5a6085f
Merge branch 'intel:sycl' into work_group_memoy_new
lbushi25 Oct 16, 2024
6affbc3
Add missing symbols to dumps
lbushi25 Oct 16, 2024
ed3c60f
Merge branch 'sycl' into work_group_memoy_new
lbushi25 Oct 16, 2024
fd89473
Fix compilation errors from changes in handler.hpp in another commit
lbushi25 Oct 17, 2024
24f87b0
Merge branch 'intel:sycl' into work_group_memoy_new
lbushi25 Oct 17, 2024
3f8ced8
Add free function kernel support
Oct 18, 2024
084eb7d
Refactor handler.hpp
lbushi25 Oct 22, 2024
06a3c37
Polish free function kernel implementation and improve test
lbushi25 Oct 24, 2024
d96f2e1
Complete free function kernel support and add frontend tests
lbushi25 Oct 24, 2024
a2b44a2
Formatting changes
lbushi25 Oct 24, 2024
494870a
Fix bug in tests
lbushi25 Oct 24, 2024
13dd2b6
Merge branch 'sycl' into free_funtion_kernel_work_group_memory_parameter
lbushi25 Oct 24, 2024
4be6df0
Improve test and add comments
lbushi25 Oct 24, 2024
42edb22
Formatting changes
lbushi25 Oct 24, 2024
6bc951e
Formatting changes
lbushi25 Oct 24, 2024
5a91bfa
Add missing file
lbushi25 Oct 24, 2024
6f356ea
Update handler.hpp
lbushi25 Oct 24, 2024
ca4b228
Update handler.hpp
lbushi25 Oct 24, 2024
98186a5
move doc to experimental and update status
lbushi25 Oct 24, 2024
377945b
Merge branch 'free_funtion_kernel_work_group_memory_parameter' of htt…
lbushi25 Oct 24, 2024
c00c1f7
Define feature macro
lbushi25 Oct 24, 2024
5cb2d21
more formatting
lbushi25 Oct 24, 2024
f4b9b1a
Fix undefined behavior in tests and add unsupported directive on cuda
lbushi25 Oct 25, 2024
c3426d7
Update work_group_memory_forward_decl.hpp
lbushi25 Oct 25, 2024
7a6516c
Update work_group_memory_forward_decl.hpp
lbushi25 Oct 25, 2024
872f671
Update work_group_memory_forward_decl.hpp
lbushi25 Oct 25, 2024
805b00f
Update work_group_memory_forward_decl.hpp
lbushi25 Oct 25, 2024
c3b494c
Update free_function_kernel.cpp
lbushi25 Oct 25, 2024
bff19f4
Update free_function_kernel.cpp
lbushi25 Oct 25, 2024
9bab966
Make header files more lightweight
lbushi25 Oct 25, 2024
69388f0
Merge branch 'free_funtion_kernel_work_group_memory_parameter' of htt…
lbushi25 Oct 25, 2024
740f389
Modify integration header changes
lbushi25 Oct 25, 2024
0fa7af4
Update work_group_memory.cpp
lbushi25 Oct 28, 2024
883549a
Update work_group_memory.cpp
lbushi25 Oct 28, 2024
0997c60
Merge branch 'free_funtion_kernel_work_group_memory_parameter' of htt…
lbushi25 Oct 31, 2024
535890a
Remove work_group_memory_forward_decl.hpp
lbushi25 Oct 31, 2024
dcff700
Merge branch 'intel:sycl' into free_funtion_kernel_work_group_memory_…
lbushi25 Oct 31, 2024
4b2a148
Revert "Remove work_group_memory_forward_decl.hpp"
lbushi25 Nov 1, 2024
3578639
Merge branch 'free_funtion_kernel_work_group_memory_parameter' of htt…
lbushi25 Nov 6, 2024
54ab379
Fix bug in integration header free function kernel declaration
lbushi25 Nov 7, 2024
336c90e
Refactor tests
lbushi25 Nov 12, 2024
5bd6e3b
Merge branch 'sycl' into free_funtion_kernel_work_group_memory_parameter
lbushi25 Nov 18, 2024
ec91887
Implement the default constructor access hack for handling of special…
lbushi25 Nov 18, 2024
f08478e
Merge branch 'intel:sycl' into free_funtion_kernel_work_group_memory_…
lbushi25 Nov 18, 2024
3d326a3
Delete sycl/include/sycl/ext/oneapi/experimental/work_group_memory_fo…
lbushi25 Nov 18, 2024
dab03c5
Delete clang/test/SemaSYCL/Inputs/sycl/ext/oneapi/experimental/work_g…
lbushi25 Nov 18, 2024
ca4f2f3
Make work group memory inheritance private
lbushi25 Nov 19, 2024
299ecde
Merge branch 'free_funtion_kernel_work_group_memory_parameter' of htt…
lbushi25 Nov 19, 2024
d35ebb5
Merge branch 'intel:sycl' into free_funtion_kernel_work_group_memory_…
lbushi25 Nov 19, 2024
e3899f7
Remove XFAIL from free function test
lbushi25 Nov 19, 2024
7b3e5e3
Remove unnecessary includes
lbushi25 Nov 19, 2024
71ee15f
Merge branch 'intel:sycl' into free_funtion_kernel_work_group_memory_…
lbushi25 Nov 19, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
132 changes: 114 additions & 18 deletions clang/lib/Sema/SemaSYCL.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1522,7 +1522,7 @@ class KernelObjVisitor {
void visitParam(ParmVarDecl *Param, QualType ParamTy,
HandlerTys &...Handlers) {
if (isSyclSpecialType(ParamTy, SemaSYCLRef))
KP_FOR_EACH(handleOtherType, Param, ParamTy);
KP_FOR_EACH(handleSyclSpecialType, Param, ParamTy);
else if (ParamTy->isStructureOrClassType()) {
if (KP_FOR_EACH(handleStructType, Param, ParamTy)) {
CXXRecordDecl *RD = ParamTy->getAsCXXRecordDecl();
Expand Down Expand Up @@ -2070,8 +2070,11 @@ class SyclKernelFieldChecker : public SyclKernelFieldHandler {
}

bool handleSyclSpecialType(ParmVarDecl *PD, QualType ParamTy) final {
Diag.Report(PD->getLocation(), diag::err_bad_kernel_param_type) << ParamTy;
IsInvalid = true;
if (!SemaSYCL::isSyclType(ParamTy, SYCLTypeAttr::work_group_memory)) {
Diag.Report(PD->getLocation(), diag::err_bad_kernel_param_type)
<< ParamTy;
IsInvalid = true;
}
return isValid();
}

Expand Down Expand Up @@ -2223,8 +2226,8 @@ class SyclKernelUnionChecker : public SyclKernelFieldHandler {
}

bool handleSyclSpecialType(ParmVarDecl *PD, QualType ParamTy) final {
// TODO
unsupportedFreeFunctionParamType();
if (!SemaSYCL::isSyclType(ParamTy, SYCLTypeAttr::work_group_memory))
unsupportedFreeFunctionParamType(); // TODO
return true;
}

Expand Down Expand Up @@ -3008,9 +3011,26 @@ class SyclKernelDeclCreator : public SyclKernelFieldHandler {
return handleSpecialType(FD, FieldTy);
}

bool handleSyclSpecialType(ParmVarDecl *, QualType) final {
// TODO
unsupportedFreeFunctionParamType();
bool handleSyclSpecialType(ParmVarDecl *PD, QualType ParamTy) final {
if (SemaSYCL::isSyclType(ParamTy, SYCLTypeAttr::work_group_memory)) {
const auto *RecordDecl = ParamTy->getAsCXXRecordDecl();
assert(RecordDecl && "The type must be a RecordDecl");
CXXMethodDecl *InitMethod = getMethodByName(RecordDecl, InitMethodName);
assert(InitMethod && "The type must have the __init method");
// Don't do -1 here because we count on this to be the first parameter
// added (if any).
size_t ParamIndex = Params.size();
for (const ParmVarDecl *Param : InitMethod->parameters()) {
QualType ParamTy = Param->getType();
addParam(Param, ParamTy.getCanonicalType());
// Propagate add_ir_attributes_kernel_parameter attribute.
if (const auto *AddIRAttr =
Param->getAttr<SYCLAddIRAttributesKernelParameterAttr>())
Params.back()->addAttr(AddIRAttr->clone(SemaSYCLRef.getASTContext()));
}
LastParamIndex = ParamIndex;
} else // TODO
unsupportedFreeFunctionParamType();
return true;
}

Expand Down Expand Up @@ -3286,9 +3306,7 @@ class SyclKernelArgsSizeChecker : public SyclKernelFieldHandler {
}

bool handleSyclSpecialType(ParmVarDecl *PD, QualType ParamTy) final {
// TODO
unsupportedFreeFunctionParamType();
return true;
return handleSpecialType(ParamTy);
}

bool handleSyclSpecialType(const CXXRecordDecl *, const CXXBaseSpecifier &BS,
Expand Down Expand Up @@ -4416,6 +4434,45 @@ class FreeFunctionKernelBodyCreator : public SyclKernelFieldHandler {
{});
}

MemberExpr *buildMemberExpr(Expr *Base, ValueDecl *Member) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a new issue but there is a significant overlap between code for kernel body and free function body creator. I wonder if this can be refactored (not in this PR since it is an orthogonal issue) so that we don't duplicate so much code. @Fznamznon can you weigh in here since you are implementing free function functionality now.

DeclAccessPair MemberDAP = DeclAccessPair::make(Member, AS_none);
MemberExpr *Result = SemaSYCLRef.SemaRef.BuildMemberExpr(
Base, /*IsArrow */ false, FreeFunctionSrcLoc, NestedNameSpecifierLoc(),
FreeFunctionSrcLoc, Member, MemberDAP,
/*HadMultipleCandidates*/ false,
DeclarationNameInfo(Member->getDeclName(), FreeFunctionSrcLoc),
Member->getType(), VK_LValue, OK_Ordinary);
return Result;
}

void createSpecialMethodCall(const CXXRecordDecl *RD, StringRef MethodName,
Expr *MemberBaseExpr,
SmallVectorImpl<Stmt *> &AddTo) {
CXXMethodDecl *Method = getMethodByName(RD, MethodName);
if (!Method)
return;
unsigned NumParams = Method->getNumParams();
llvm::SmallVector<Expr *, 4> ParamDREs(NumParams);
llvm::ArrayRef<ParmVarDecl *> KernelParameters =
DeclCreator.getParamVarDeclsForCurrentField();
for (size_t I = 0; I < NumParams; ++I) {
QualType ParamType = KernelParameters[I]->getOriginalType();
ParamDREs[I] = SemaSYCLRef.SemaRef.BuildDeclRefExpr(
KernelParameters[I], ParamType, VK_LValue, FreeFunctionSrcLoc);
}
MemberExpr *MethodME = buildMemberExpr(MemberBaseExpr, Method);
QualType ResultTy = Method->getReturnType();
ExprValueKind VK = Expr::getValueKindForType(ResultTy);
ResultTy = ResultTy.getNonLValueExprType(SemaSYCLRef.getASTContext());
llvm::SmallVector<Expr *, 4> ParamStmts;
const auto *Proto = cast<FunctionProtoType>(Method->getType());
SemaSYCLRef.SemaRef.GatherArgumentsForCall(FreeFunctionSrcLoc, Method,
Proto, 0, ParamDREs, ParamStmts);
AddTo.push_back(CXXMemberCallExpr::Create(
SemaSYCLRef.getASTContext(), MethodME, ParamStmts, ResultTy, VK,
FreeFunctionSrcLoc, FPOptionsOverride()));
}

public:
static constexpr const bool VisitInsideSimpleContainers = false;

Expand All @@ -4435,9 +4492,37 @@ class FreeFunctionKernelBodyCreator : public SyclKernelFieldHandler {
return true;
}

bool handleSyclSpecialType(ParmVarDecl *, QualType) final {
// TODO
unsupportedFreeFunctionParamType();
bool handleSyclSpecialType(ParmVarDecl *PD, QualType ParamTy) final {
if (SemaSYCL::isSyclType(ParamTy, SYCLTypeAttr::work_group_memory)) {
const auto *RecordDecl = ParamTy->getAsCXXRecordDecl();
QualType Ty = PD->getOriginalType();
ASTContext &Ctx = SemaSYCLRef.SemaRef.getASTContext();
VarDecl *WorkGroupMemoryClone = VarDecl::Create(
Ctx, DeclCreator.getKernelDecl(), FreeFunctionSrcLoc,
FreeFunctionSrcLoc, PD->getIdentifier(), PD->getType(),
Ctx.getTrivialTypeSourceInfo(Ty), SC_None);
InitializedEntity VarEntity =
InitializedEntity::InitializeVariable(WorkGroupMemoryClone);
InitializationKind InitKind =
InitializationKind::CreateDefault(FreeFunctionSrcLoc);
InitializationSequence InitSeq(SemaSYCLRef.SemaRef, VarEntity, InitKind,
std::nullopt);
ExprResult Init = InitSeq.Perform(SemaSYCLRef.SemaRef, VarEntity,
InitKind, std::nullopt);
WorkGroupMemoryClone->setInit(
SemaSYCLRef.SemaRef.MaybeCreateExprWithCleanups(Init.get()));
WorkGroupMemoryClone->setInitStyle(VarDecl::CallInit);
Stmt *DS = new (SemaSYCLRef.getASTContext())
DeclStmt(DeclGroupRef(WorkGroupMemoryClone), FreeFunctionSrcLoc,
FreeFunctionSrcLoc);
BodyStmts.push_back(DS);
Expr *MemberBaseExpr = SemaSYCLRef.SemaRef.BuildDeclRefExpr(
WorkGroupMemoryClone, Ty, VK_PRValue, FreeFunctionSrcLoc);
createSpecialMethodCall(RecordDecl, InitMethodName, MemberBaseExpr,
BodyStmts);
ArgExprs.push_back(MemberBaseExpr);
} else // TODO
unsupportedFreeFunctionParamType();
return true;
}

Expand Down Expand Up @@ -4717,9 +4802,11 @@ class SyclKernelIntHeaderCreator : public SyclKernelFieldHandler {
return true;
}

bool handleSyclSpecialType(ParmVarDecl *, QualType) final {
// TODO
unsupportedFreeFunctionParamType();
bool handleSyclSpecialType(ParmVarDecl *PD, QualType ParamTy) final {
if (SemaSYCL::isSyclType(ParamTy, SYCLTypeAttr::work_group_memory))
addParam(PD, ParamTy, SYCLIntegrationHeader::kind_work_group_memory);
else
unsupportedFreeFunctionParamType(); // TODO
return true;
}

Expand Down Expand Up @@ -6196,7 +6283,16 @@ void SYCLIntegrationHeader::emit(raw_ostream &O) {
O << "#include <sycl/detail/defines_elementary.hpp>\n";
O << "#include <sycl/detail/kernel_desc.hpp>\n";
O << "#include <sycl/ext/oneapi/experimental/free_function_traits.hpp>\n";

// When using work group memory parameters in free kernel functions, the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this with an example? I don't fully understand what you are trying to solve. What gets emitted by integration header and what does sycl/ext/oneapi/experimental/work_group_memory_forward_decl.hpp include?

Copy link
Contributor Author

@lbushi25 lbushi25 Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the work group memory type, the integration header emits a forward declaration of the form

template<typename, typename>
class work_group_memory;

The second parameter actually has a default argument as per the spec, but clearly it is not included here(I guess to avoid redefinition of default argument which would be a compiler error).
So when you use a type such as work_group_memory<int> as a free function kernel parameter, the compiler complains because it thinks it should have two parameters and not just one.

The file sycl/ext/oneapi/experimental/work_group_memory_forward_decl.hpp, as the name suggests, contains the forward declaration of work_group_memory where the default argument is also included. So I just manually include this file into the integration header. Unfortunately, this fixed my initial issue but it has broken 2 tests in Windows pre-commit CI. This solution may be a hack, so perhaps the user you tagged(@Fznamznon ) may have a better solution in mind.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@lbushi25 lbushi25 Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To follow up on this, I'm not quite sure how this issue has not come up yet. It seems to me that in general, every templated class with a default argument, special or not, should be problematic to use as a free function kernel parameter but it looks like current tests are not catching this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation of free function kernels is not complete, and I know there are many limitations currently. My guess is that no one has added support for parameters that are templated types, and no one tested this yet. :-(

Copy link
Contributor Author

@lbushi25 lbushi25 Oct 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. For ease of reference, here is how work_group_memory is declared in work_group_memory_forward_decl.hpp:

template <typename DataT, typename PropertiesT = int>
class work_group_memory;

This indeed looks like a bug in the integration header generation. The definition of __sycl_shim1 should be declared as follows (with the PropertiesT default argument int inserted as the second template argument of work_group_memory in both locations):

extern "C" void foo(sycl::ext::oneapi::experimental::work_group_memory<int, int> mem);
static constexpr auto __sycl_shim1() {
  return (void (*)(class sycl::ext::oneapi::experimental::work_group_memory<int, int>))foo;
}

Oops, looks like you referenced the dummy work_group_memory_forward_decl.hpp from the clang/test directory. That is purely for testing purposes so the value of the default argument does not matter so I just put int.
Further down in the list of changed files should be the real work_group_memory_forward_decl.hpp in the sycl/ext/oneapi/experimental/ directory. But yes, the principle is the same

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. For ease of reference, here is how work_group_memory is declared in work_group_memory_forward_decl.hpp:

template <typename DataT, typename PropertiesT = int>
class work_group_memory;

This indeed looks like a bug in the integration header generation. The definition of __sycl_shim1 should be declared as follows (with the PropertiesT default argument int inserted as the second template argument of work_group_memory in both locations):

extern "C" void foo(sycl::ext::oneapi::experimental::work_group_memory<int, int> mem);
static constexpr auto __sycl_shim1() {
  return (void (*)(class sycl::ext::oneapi::experimental::work_group_memory<int, int>))foo;
}

Would the frontend team be willing to open a PR soon to correct this assuming its not too much work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This indeed looks like a bug in the integration header generation. The definition of __sycl_shim1 should be declared as follows (with the PropertiesT default argument int inserted as the second template argument of work_group_memory in both locations):

I think this is not quite right. The second template parameter should not be int. It should be empty_properties_t aka properties<std::tuple<>>. See my comment above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This indeed looks like a bug in the integration header generation. The definition of __sycl_shim1 should be declared as follows (with the PropertiesT default argument int inserted as the second template argument of work_group_memory in both locations):

I think this is not quite right. The second template parameter should not be int. It should be empty_properties_t aka properties<std::tuple<>>. See my comment above.

Yeah, @tahonermann was misled by the dummy work_group_memory_forward_decl.hpp that I had to create for clang testing purposes. Now that we've decided to go a different route, that file is actually not needed at all so I'll just remove it to avoid confusion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I neglected to check that my quick search actually found the thing I thought I was searching for :)

That doesn't change the analysis though. @lbushi25, I recommend you open a Jira issue for the FE team. I suspect a fix won't be too difficult, but I'm not very familiar with that code either.

// integration header emits incorrect forward declarations for the work group
// memory type because it ignores default arguments. This means the user
// cannot use work group memory types with parameters omitted such as
// work_group_memory<int> where the hidden second parameter has a default
// value. To circumvent this, we include the correct forward declaration
// ourselves.
O << "#include <tuple>\n";
O << "#include "
"<sycl/ext/oneapi/experimental/work_group_memory_forward_decl.hpp>\n";
O << "\n";

LangOptions LO;
Expand Down
40 changes: 39 additions & 1 deletion clang/test/CodeGenSYCL/free_function_int_header.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
// RUN: FileCheck -input-file=%t.h %s
//
// This test checks integration header contents for free functions with scalar,
// pointer and non-decomposed struct parameters.
// pointer, non-decomposed struct parameters and work group memory parameters.

#include "mock_properties.hpp"
#include "sycl.hpp"
Expand Down Expand Up @@ -96,6 +96,12 @@ void ff_7(KArgWithPtrArray<ArrSize> KArg) {

template void ff_7(KArgWithPtrArray<TestArrSize> KArg);

__attribute__((sycl_device))
[[__sycl_detail__::add_ir_attributes_function("sycl-nd-range-kernel", 0)]]
void ff_8(sycl::work_group_memory<int>) {
}


// CHECK: const char* const kernel_names[] = {
// CHECK-NEXT: {{.*}}__sycl_kernel_ff_2Piii
// CHECK-NEXT: {{.*}}__sycl_kernel_ff_2Piiii
Expand All @@ -105,6 +111,7 @@ template void ff_7(KArgWithPtrArray<TestArrSize> KArg);
// CHECK-NEXT: {{.*}}__sycl_kernel_ff_410NoPointers8Pointers3Agg
// CHECK-NEXT: {{.*}}__sycl_kernel_ff_6I3Agg7DerivedEvT_T0_i
// CHECK-NEXT: {{.*}}__sycl_kernel_ff_7ILi3EEv16KArgWithPtrArrayIXT_EE
// CHECK-NEXT: {{.*}}__sycl_kernel_ff_8N4sycl3_V117work_group_memoryIiEE
// CHECK-NEXT: ""
// CHECK-NEXT: };

Expand Down Expand Up @@ -148,6 +155,9 @@ template void ff_7(KArgWithPtrArray<TestArrSize> KArg);
// CHECK: //--- _Z18__sycl_kernel_ff_7ILi3EEv16KArgWithPtrArrayIXT_EE
// CHECK-NEXT: { kernel_param_kind_t::kind_std_layout, 48, 0 },

// CHECK: //--- _Z18__sycl_kernel_ff_8N4sycl3_V117work_group_memoryIiEE
// CHECK-NEXT: { kernel_param_kind_t::kind_work_group_memory, 8, 0 },

// CHECK: { kernel_param_kind_t::kind_invalid, -987654321, -987654321 },
// CHECK-NEXT: };

Expand Down Expand Up @@ -294,6 +304,26 @@ template void ff_7(KArgWithPtrArray<TestArrSize> KArg);
// CHECK-NEXT: };
// CHECK-NEXT: }

// CHECK: Definition of _Z18__sycl_kernel_ff_8N4sycl3_V117work_group_memoryIiEE as a free function kernel

// CHECK: Forward declarations of kernel and its argument types:
// CHECK: template <typename DataT> class work_group_memory;

// CHECK: void ff_8(sycl::work_group_memory<int>);
// CHECK-NEXT: static constexpr auto __sycl_shim9() {
// CHECK-NEXT: return (void (*)(class sycl::work_group_memory<int>))ff_8;
// CHECK-NEXT: }
// CHECK-NEXT: namespace sycl {
// CHECK-NEXT: template <>
// CHECK-NEXT: struct ext::oneapi::experimental::is_kernel<__sycl_shim9()> {
// CHECK-NEXT: static constexpr bool value = true;
// CHECK-NEXT: };
// CHECK-NEXT: template <>
// CHECK-NEXT: struct ext::oneapi::experimental::is_single_task_kernel<__sycl_shim9()> {
// CHECK-NEXT: static constexpr bool value = true;
// CHECK-NEXT: };
// CHECK-NEXT: }

// CHECK: #include <sycl/kernel_bundle.hpp>

// CHECK: Definition of kernel_id of _Z18__sycl_kernel_ff_2Piii
Expand Down Expand Up @@ -359,3 +389,11 @@ template void ff_7(KArgWithPtrArray<TestArrSize> KArg);
// CHECK-NEXT: return sycl::detail::get_kernel_id_impl(std::string_view{"_Z18__sycl_kernel_ff_7ILi3EEv16KArgWithPtrArrayIXT_EE"});
// CHECK-NEXT: }
// CHECK-NEXT: }

// CHECK: Definition of kernel_id of _Z18__sycl_kernel_ff_8N4sycl3_V117work_group_memoryIiEE
// CHECK-NEXT: namespace sycl {
// CHECK-NEXT: template <>
// CHECK-NEXT: kernel_id ext::oneapi::experimental::get_kernel_id<__sycl_shim9()>() {
// CHECK-NEXT: return sycl::detail::get_kernel_id_impl(std::string_view{"_Z18__sycl_kernel_ff_8N4sycl3_V117work_group_memoryIiEE"});
// CHECK-NEXT: }
// CHECK-NEXT: }
17 changes: 16 additions & 1 deletion clang/test/CodeGenSYCL/free_function_kernel_params.cpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
// RUN: %clang_cc1 -internal-isystem %S/Inputs -fsycl-is-device -triple spir64 \
// RUN: -emit-llvm %s -o - | FileCheck %s
// This test checks parameter IR generation for free functions with parameters
// of non-decomposed struct type.
// of non-decomposed struct type and work group memory type.

#include "sycl.hpp"

Expand Down Expand Up @@ -56,3 +56,18 @@ template void ff_6(KArgWithPtrArray<TestArrSize> KArg);
// CHECK: %struct.KArgWithPtrArray = type { [3 x ptr addrspace(4)], [3 x i32], [3 x i32] }
// CHECK: define dso_local spir_kernel void @{{.*}}__sycl_kernel{{.*}}(ptr noundef byval(%struct.NoPointers) align 4 %__arg_S1, ptr noundef byval(%struct.__generated_Pointers) align 8 %__arg_S2, ptr noundef byval(%struct.__generated_Agg) align 8 %__arg_S3)
// CHECK: define dso_local spir_kernel void @{{.*}}__sycl_kernel_ff_6{{.*}}(ptr noundef byval(%struct.__generated_KArgWithPtrArray) align 8 %__arg_KArg)

__attribute__((sycl_device))
[[__sycl_detail__::add_ir_attributes_function("sycl-nd-range-kernel", 0)]]
void ff_7(sycl::work_group_memory<int> mem) {
}

// CHECK: define dso_local spir_kernel void @{{.*}}__sycl_kernel_ff_7{{.*}}(ptr addrspace(3) noundef align 4 %__arg_Ptr)
// CHECK: %__arg_Ptr.addr = alloca ptr addrspace(3), align 8
// CHECK-NEXT: %mem = alloca %"class.sycl::_V1::work_group_memory", align 8
// CHECK: %__arg_Ptr.addr.ascast = addrspacecast ptr %__arg_Ptr.addr to ptr addrspace(4)
// CHECK-NEXT: %mem.ascast = addrspacecast ptr %mem to ptr addrspace(4)
// CHECK: store ptr addrspace(3) %__arg_Ptr, ptr addrspace(4) %__arg_Ptr.addr.ascast, align 8
// CHECK-NEXT: [[REGISTER:%[a-zA-Z0-9_]+]] = load ptr addrspace(3), ptr addrspace(4) %__arg_Ptr.addr.ascast, align 8
// CHECK-NEXT: call spir_func void @{{.*}}work_group_memory{{.*}}__init{{.*}}(ptr addrspace(4) noundef align 8 dereferenceable_or_null(8) %mem.ascast, ptr addrspace(3) noundef [[REGISTER]])

2 changes: 1 addition & 1 deletion clang/test/CodeGenSYCL/work_group_memory.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@
//
// CHECK-INT-HEADER: const kernel_param_desc_t kernel_signatures[] = {
// CHECK-INT-HEADER-NEXT: //--- _ZTSZZ4mainENKUlRN4sycl3_V17handlerEE_clES2_EUlNS0_4itemILi1EEEE_
// CHECK-INT-HEADER-NEXT: { kernel_param_kind_t::kind_work_group_memory, {{[4,8]}}, 0 },
// CHECK-INT-HEADER-NEXT: { kernel_param_kind_t::kind_work_group_memory, 8, 0 },
// CHECK-INT-HEADER-EMPTY:
// CHECK-INT-HEADER-NEXT: { kernel_param_kind_t::kind_invalid, -987654321, -987654321 },
// CHECK-INT-HEADER-NEXT: };
Expand Down
17 changes: 17 additions & 0 deletions clang/test/SemaSYCL/Inputs/sycl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -448,6 +448,23 @@ class __SYCL_TYPE(multi_ptr) multi_ptr<T, AS, access::decorated::legacy> {
pointer_t m_Pointer;
};

// Dummy implementation of work_group_memory for use in SemaSYCL tests.
template <typename DataT>
class __attribute__((sycl_special_class))
__SYCL_TYPE(work_group_memory) work_group_memory {
public:
work_group_memory(handler &CGH) {}
#ifdef __SYCL_DEVICE_ONLY__
// Default constructor for objects later initialized with __init member.
work_group_memory() = default;
#endif

void __init(__attribute((opencl_local)) DataT *Ptr) { this->Ptr = Ptr; }

private:
__attribute((opencl_local)) DataT *Ptr;
};

namespace ext {
namespace oneapi {
namespace experimental {
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
#pragma once
// Dummy header file for the purpose of SemaSYCL testing.
// It shadows the file
// sycl/include/sycl/ext/oneapi/experimental/work_group_memory_forward_decl.hpp
namespace sycl {
inline namespace _V1 {
namespace ext {
namespace oneapi {
namespace experimental {
template <typename DataT, typename PropertiesT = int>
class work_group_memory;
} // namespace experimental
} // namespace oneapi
} // namespace ext
} // namespace _V1
} // namespace sycl
22 changes: 21 additions & 1 deletion clang/test/SemaSYCL/free_function_kernel_params.cpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
// RUN: %clang_cc1 -internal-isystem %S/Inputs -fsycl-is-device -ast-dump \
// RUN: %s -o - | FileCheck %s
// This test checks parameter rewriting for free functions with parameters
// of type scalar, pointer and non-decomposed struct.
// of type scalar, pointer, non-decomposed struct and work group memory.

#include "sycl.hpp"

Expand Down Expand Up @@ -171,3 +171,23 @@ template void ff_6(Agg S1, Derived1 S2, int);
// CHECK-NEXT: DeclRefExpr {{.*}} '__generated_Derived1' lvalue ParmVar {{.*}} '__arg_S2' '__generated_Derived1'
// CHECK-NEXT: ImplicitCastExpr {{.*}} 'int' <LValueToRValue>
// CHECK-NEXT: DeclRefExpr {{.*}} 'int' lvalue ParmVar {{.*}} '__arg_end' 'int'

__attribute__((sycl_device))
[[__sycl_detail__::add_ir_attributes_function("sycl-nd-range-kernel", 0)]]
void ff_7(sycl::work_group_memory<int> mem) {
}
// CHECK: FunctionDecl {{.*}}__sycl_kernel{{.*}}'void (__local int *)'
// CHECK-NEXT: ParmVarDecl {{.*}} used __arg_Ptr '__local int *'
// CHECK-NEXT: CompoundStmt
// CHECK-NEXT: DeclStmt
// CHECK-NEXT: VarDecl {{.*}} used mem 'sycl::work_group_memory<int>' callinit
// CHECK-NEXT: CXXConstructExpr {{.*}} 'sycl::work_group_memory<int>' 'void () noexcept'
// CHECK-NEXT: CXXMemberCallExpr {{.*}} 'void'
// CHECK-NEXT: MemberExpr {{.*}} 'void (__local int *)' lvalue .__init
// CHECK-NEXT: DeclRefExpr {{.*}} 'sycl::work_group_memory<int>' Var {{.*}} 'mem' 'sycl::work_group_memory<int>'
// CHECK-NEXT: ImplicitCastExpr {{.*}} '__local int *' <LValueToRValue>
// CHECK-NEXT: DeclRefExpr {{.*}} '__local int *' lvalue ParmVar {{.*}} '__arg_Ptr' '__local int *'
// CHECK-NEXT: CallExpr {{.*}} 'void'
// CHECK-NEXT: ImplicitCastExpr {{.*}} 'void (*)(sycl::work_group_memory<int>)' <FunctionToPointerDecay>
// CHECK-NEXT: DeclRefExpr {{.*}} 'void (sycl::work_group_memory<int>)' lvalue Function {{.*}} 'ff_7' 'void (sycl::work_group_memory<int>)'
// CHECK-NEXT: DeclRefExpr {{.*}} 'sycl::work_group_memory<int>' Var {{.*}} 'mem' 'sycl::work_group_memory<int>'
Loading
Loading