-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[clang][CodeGen][SPIRV] Translate amdgpu_flat_work_group_size into max_work_group_size.
#116820
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 4 commits
c5efdd2
4349def
2dd4456
958d1cd
27d5c6e
59d057f
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -64,6 +64,8 @@ class SPIRVTargetCodeGenInfo : public CommonSPIRTargetCodeGenInfo { | |
| void setCUDAKernelCallingConvention(const FunctionType *&FT) const override; | ||
| LangAS getGlobalVarAddressSpace(CodeGenModule &CGM, | ||
| const VarDecl *D) const override; | ||
| void setTargetAttributes(const Decl *D, llvm::GlobalValue *GV, | ||
| CodeGen::CodeGenModule &M) const override; | ||
| llvm::SyncScope::ID getLLVMSyncScopeID(const LangOptions &LangOpts, | ||
| SyncScope Scope, | ||
| llvm::AtomicOrdering Ordering, | ||
|
|
@@ -245,6 +247,41 @@ SPIRVTargetCodeGenInfo::getGlobalVarAddressSpace(CodeGenModule &CGM, | |
| return DefaultGlobalAS; | ||
| } | ||
|
|
||
| void SPIRVTargetCodeGenInfo::setTargetAttributes( | ||
| const Decl *D, llvm::GlobalValue *GV, CodeGen::CodeGenModule &M) const { | ||
| if (!M.getLangOpts().HIP || | ||
| M.getTarget().getTriple().getVendor() != llvm::Triple::AMD) | ||
| return; | ||
| if (GV->isDeclaration()) | ||
| return; | ||
|
|
||
| auto F = dyn_cast<llvm::Function>(GV); | ||
| if (!F) | ||
| return; | ||
|
Comment on lines
+258
to
+260
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can this fail if the FunctionDecl below fails? |
||
|
|
||
| auto FD = dyn_cast_or_null<FunctionDecl>(D); | ||
| if (!FD) | ||
| return; | ||
| if (!FD->hasAttr<CUDAGlobalAttr>()) | ||
| return; | ||
|
|
||
| unsigned N = M.getLangOpts().GPUMaxThreadsPerBlock; | ||
| if (auto FlatWGS = FD->getAttr<AMDGPUFlatWorkGroupSizeAttr>()) | ||
| N = FlatWGS->getMax()->EvaluateKnownConstInt(M.getContext()).getExtValue(); | ||
|
|
||
| // We encode the maximum flat WG size in the first component of the 3D | ||
| // max_work_group_size attribute, which will get reverse translated into the | ||
| // original AMDGPU attribute when targeting AMDGPU. | ||
|
Comment on lines
+272
to
+274
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm still confused about why this is supposed to be "OK"
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this is OK in HIP because the language (only) defines In general we will eventually replace this with processing for all AMDGPU attributes, but that has some challenges in that it'd be more infectious in the translator (or the BE and any eventual SPIR-V consumer, if they were to manifest). Conversely we cannot just drop the original attribute on the floor as correctness depends on it. Hence the PR.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That reminds me, the way launch_bounds is implemented is also problematic: #91468 Launch bounds should not be implemented in terms of the flat workgroup size attribute in a header. clang should be directly interpreting it. So yes, there is a clang attribute for it. It's just HIP has a hackier implementation of it than the CUDA attribute which we just ignore
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We are talking across eachother. I am saying that the SPIR-V attribute cannot be generated via Clang, i.e. that you cannot write // An AST node is created for this attribute, but is not used by other parts
// of the compiler. However, this node needs to exist in the AST because
// non-LLVM backends may be relying on the attribute's presence.So this is a glorified annotation / we'd still have to decide on how to lower it into IR, which would likely end up atop flat workgroup size, unless we choose to spam yet another attribute. We also use flat workgroup size implicitly to control / implement That being said, the idea in #91468 is sound, but it will require a bit of work to get done; I think we'd still have to choose a way to pass the info through SPIR-V (what this PR tries to do). |
||
| auto Int32Ty = llvm::IntegerType::getInt32Ty(M.getLLVMContext()); | ||
| llvm::Metadata *AttrMDArgs[] = { | ||
| llvm::ConstantAsMetadata::get(llvm::ConstantInt::get(Int32Ty, N)), | ||
| llvm::ConstantAsMetadata::get(llvm::ConstantInt::get(Int32Ty, 1)), | ||
| llvm::ConstantAsMetadata::get(llvm::ConstantInt::get(Int32Ty, 1))}; | ||
|
|
||
| F->setMetadata("max_work_group_size", | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why metadata? I know the OpenCL stuff uses metadata, but I think that's because it predates arbitrary string attributes. This is also "setTargetAttributes".
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Because the Translator and other tools interact/expect it as metadata. |
||
| llvm::MDNode::get(M.getLLVMContext(), AttrMDArgs)); | ||
| } | ||
|
|
||
| llvm::SyncScope::ID | ||
| SPIRVTargetCodeGenInfo::getLLVMSyncScopeID(const LangOptions &, SyncScope Scope, | ||
| llvm::AtomicOrdering, | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the vendor check. The language check is also suspect, this is interpretation of a target attribute
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not super suspect in context as at the moment this is only for AMDGCN flavoured SPIR-V, which we only support in HIP, and the mapping from flat to dim X only makes sense there.