-
Notifications
You must be signed in to change notification settings - Fork 0
[WIP][SYCL][Driver] Initial support to enable --offload-arch option for SYCL. #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| /// added to the host compilation step. | ||
| void addSYCLTargetMacro(const llvm::opt::ArgList &Args, | ||
| StringRef Macro) const { | ||
| SYCLTargetMacro.push_back(Args.MakeArgString(Macro)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize that this patch doesn't currently have the macro addition steps - but this may be a good opportunity to reduce macro duplication that is added to the host compilation by only adding unique macro values to the SYCLTargetMacro array.
|
|
||
|
|
||
| static std::optional<llvm::Triple> | ||
| getINTELOffloadTargetTriple(const Driver &D, const ArgList &Args, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name doesn't seem to fit the triple being returned. The value here is a spirv value, so is this more of a 'default SYCL JIT' triple?
| : Warning<"OpenACC directives will result in no runtime behavior; use " | ||
| "-fclangir to enable runtime effect">, | ||
| InGroup<SourceUsesOpenACC>; | ||
| def err_drv_sycl_offload_arch_missing_value : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// Vector of Macros that need to be added to the Host compilation in a | ||
| /// SYCL based offloading scenario. These macros are gathered during | ||
| /// construction of the device compilations. | ||
| mutable std::vector<std::string> SYCLTargetMacro; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| mutable std::vector<std::string> SYCLTargetMacro; | |
| mutable std::vector<std::string> SYCLTargetMacros; |
| return llvm::Triple(HostTriple.isArch64Bit() ? "spirv64-intel-sycl" | ||
| : "spirv32-intel-sycl"); | ||
| } | ||
| return std::nullopt; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we emit something if user specifies -offload= for SYCL offloading? Or atleast add an assert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we emit a diagnostic for empty --offload-arch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This question was about -offload. What will happen if user says '-fsycl -offload=abc'?
| TargetTriple.setVendor(llvm::Triple::UnknownVendor); | ||
| TargetTriple.setOS(llvm::Triple::UnknownOS); | ||
| TargetTriple.setVendor(llvm::Triple::Intel); | ||
| TargetTriple.setOS(llvm::Triple::SYCL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm..This is a bit confusing. Is it correct to set OS as SYCL? Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to set it here as this will be used to set the target triple string for the JIT flow (spirv64-unknown-unknown). I need to do some refactor for the JIT flow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, it looks like spirv64-unknown-unknown is used for AOT and spirv64-intel-sycl is being used for JIT? Why use a different triple? Why not just rely on arch=<val> in the package and have the clang-linker-wrapper determine AOT based on that? The triple is used for the device compilation which shouldn't need to know if the generated LLVM-IR file is going to be used for JIT or AOT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. clang-linker-wrapper can determine AOT/JIT based on arch=.
Also, just a nit. I suppose you meant to write: "So, it looks like spirv64-unknown-unknown is used for JIT and spirv64-intel-sycl is being used for AOT?"
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right - I must have read the testing wrong. For some reason when I was looking at the testing, I was associating the target and triple backwards. Regardless, it looks like we are in agreement on using a single triple and having the clang-linker-wrapper understand the AOT target based on the arch= value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mdtoguchi
Why not just rely on arch=<val> in the package and have the clang-linker-wrapper determine AOT based on that
This is still the case.
The proposal to have spirv64-intel-sycl as the target triple string for AOT is to have a more descriptive string that has info about the vendor and OS as opposed to having 'unknown' for both. This also matches what is done for CUDA and HIP.
This also updates triple values in clang-offload-packager call and other places where -triple is passed:
"--image=file=test.bc,triple=spirv64-intel-sycl,arch=bmg_g21_gpu,kind=sycl"
We could probably drop 'sycl' and use spirv64-intel-unknown which is already used for OpenMP AOT.
I'm wondering why we aren't aligning with OpenMP Intel AOT, CUDA, and HIP. I don't see any issues with using a more descriptive target triple string for SYCL AOT to Intel targets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be no issues with moving to a more descriptive triple, but the usage of the triple during the device compilation isn't Intel specific. It is just generating generic IR. We were already using the spirv64-unknown-unknown for JIT, so I don't see why we should move away from that for AOT. We are able to use the same generated device binary with spirv64-unknown-unknown for JIT and AOT. triple=spirv64-unknown-unknown,arch=bmg_g21_gpu,kind=sycl should be plenty of information for the clang-linker-wrapper to decipher what needs to be done with the packaged binary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An LLVM target triple is a string that describes the target architecture, operating system, and vendor for which LLVM is compiling code.
For SYCL AOT to Intel GPUs or CPUS, a target triple string such as spirv64-intel-sycl/unknown describes that we are compiling code for an Intel target compatible with SPIRV.
For JIT, the generated SPIRV is not tied to Intel targets, so it seems reasonable to have 'unknown' for the target vendor and OS.
AFAIK, even with SYCL offloading to CUDA targets, the generated LLVM IR is generic and the NVPTX Back End adds additional CUDA specific libraries.
We could still generate generic SPIV/LLVM IR and yet have a target triple string that describes for which LLVM is compiling code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @srividya-sundaram. I'm OK with using spirv64-intel-unknown for AOT.
|
Hi @srividya-sundaram and @mdtoguchi In llvm#146594 which was just merged, they use --offload-targets. Should we also use that instead of --offload-arch? Thanks |
@asudarsa |
AH. I understand now. In the upstream PR, they are trying to simplify the way we specify offload targets by using two options instead of one - offload-targets and offload-arch. For Intel targets, I think --offload-targets is easy to decipher. spirv64 for JIT and spirv64-intel-sycl for AOT. So user need not specify. Thanks |
| : Warning<"OpenACC directives will result in no runtime behavior; use " | ||
| "-fclangir to enable runtime effect">, | ||
| InGroup<SourceUsesOpenACC>; | ||
| def err_drv_sycl_offload_arch_missing_value : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these warnings SYCL specific?
Thanks
| // public one. | ||
| // Intel CPUs | ||
| GRANITERAPIDS, | ||
| GRANITERAPIDS_CPU, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the format we will follow going forward? Processor name + "_" + CPU/GPU? I am ok with it.
Adding @bader for comment.
Thanks
This patch is an attempt to have a single target triple string (
spirv64-intel-sycl) to describe all the Intel devices (currently GPUs and CPUs) and the corresponding offloading device architecture is specified by the--offload-archcommand-line argument, for the AOT compilation flow.Example:
would imply
spirv64-intel-syclas target triple string for both the Intel CPU and GPU.For JIT compilation, the default SYCL target triple string would be
spirv-unknown-unknownTo Do:
Implement target macro additions to SYCL device compilation flow.
Fix default SYCL target triple string code for JIT compilation.