Owner

srividya-sundaram commented Jul 1, 2025

This patch attempts to use a single target triple string (spirv64-intel-sycl) to describe all Intel devices (currently GPUs and CPUs) in the AOT compilation flow. The offloading device architecture is specified by the --offload-arch command-line argument.

Example:

clang -fsycl --offload-arch=bmg_g21_gpu syclfile.cpp 
clang -fsycl --offload-arch=graniterapids_cpu syclfile.cpp

would imply spirv64-intel-sycl as target triple string for both the Intel CPU and GPU.

For JIT compilation, the default SYCL target triple string would be spirv-unknown-unknown

AOT flow: spirv32/64-intel-sycl
JIT flow: spirv32/64-unknown-unknown
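
The triple selection described above could be sketched roughly as follows. This is only an illustration of the stated AOT/JIT rule, not the driver code: `selectSYCLTriple` is a hypothetical helper, and treating a non-empty `--offload-arch` value as the AOT signal is an assumption taken from the description.

```cpp
#include <string>

// Hypothetical helper (not part of the patch): returns the SYCL device
// triple string for the flow described in the PR description.
// Assumption: a non-empty --offload-arch value selects the AOT flow.
std::string selectSYCLTriple(bool HostIs64Bit, const std::string &OffloadArch) {
  const std::string Arch = HostIs64Bit ? "spirv64" : "spirv32";
  if (OffloadArch.empty())
    return Arch + "-unknown-unknown"; // JIT flow
  return Arch + "-intel-sycl";        // AOT flow
}
```

Under these assumptions, both `bmg_g21_gpu` and `graniterapids_cpu` map to the same triple on a 64-bit host, which is the point of the proposal.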

To Do:
Implement target macro additions to SYCL device compilation flow.
Fix default SYCL target triple string code for JIT compilation.

@srividya-sundaram srividya-sundaram marked this pull request as draft July 1, 2025 16:55
/// added to the host compilation step.
void addSYCLTargetMacro(const llvm::opt::ArgList &Args,
StringRef Macro) const {
SYCLTargetMacro.push_back(Args.MakeArgString(Macro));
Collaborator

I realize that this patch doesn't currently have the macro addition steps - but this may be a good opportunity to reduce macro duplication that is added to the host compilation by only adding unique macro values to the SYCLTargetMacro array.
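
A minimal sketch of the de-duplication idea, assuming a plain `std::vector<std::string>` stands in for the toolchain member. `SYCLMacroList` and `addUnique` are hypothetical names used only for illustration; the real code stores strings created through `Args.MakeArgString`.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the SYCLTargetMacro member discussed above.
struct SYCLMacroList {
  std::vector<std::string> Macros;

  // Appends Macro only if an identical string is not already stored.
  // Returns true when the macro was actually added.
  bool addUnique(const std::string &Macro) {
    if (std::find(Macros.begin(), Macros.end(), Macro) != Macros.end())
      return false;
    Macros.push_back(Macro);
    return true;
  }
};
```

A linear search keeps insertion order stable, which may matter for the order in which macros reach the host compilation. A set would also work if order is irrelevant.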



static std::optional<llvm::Triple>
getINTELOffloadTargetTriple(const Driver &D, const ArgList &Args,
Collaborator

The name doesn't seem to fit the triple being returned. The value here is a spirv value, so is this more of a 'default SYCL JIT' triple?

srividya-sundaram changed the title from "[SYCL][Driver] Initial support to enable --offload-arch option for SYCL." to "[WIP][SYCL][Driver] Initial support to enable --offload-arch option for SYCL." Jul 3, 2025
: Warning<"OpenACC directives will result in no runtime behavior; use "
"-fclangir to enable runtime effect">,
InGroup<SourceUsesOpenACC>;
def err_drv_sycl_offload_arch_missing_value :

Hi @srividya-sundaram

I will take a look in a bit.

Thanks

/// Vector of Macros that need to be added to the Host compilation in a
/// SYCL based offloading scenario. These macros are gathered during
/// construction of the device compilations.
mutable std::vector<std::string> SYCLTargetMacro;

Suggested change
- mutable std::vector<std::string> SYCLTargetMacro;
+ mutable std::vector<std::string> SYCLTargetMacros;

return llvm::Triple(HostTriple.isArch64Bit() ? "spirv64-intel-sycl"
: "spirv32-intel-sycl");
}
return std::nullopt;

Should we emit something if the user specifies -offload= for SYCL offloading? Or at least add an assert?

Owner Author

Currently we emit a diagnostic for an empty --offload-arch value.


This question was about -offload. What will happen if the user passes '-fsycl -offload=abc'?

TargetTriple.setVendor(llvm::Triple::UnknownVendor);
TargetTriple.setOS(llvm::Triple::UnknownOS);
TargetTriple.setVendor(llvm::Triple::Intel);
TargetTriple.setOS(llvm::Triple::SYCL);

Hmm. This is a bit confusing. Is it correct to set the OS to SYCL? Thanks

Owner Author

We don't need to set it here, as this will be used to set the target triple string for the JIT flow (spirv64-unknown-unknown). I need to refactor the JIT flow.

Collaborator

So, it looks like spirv64-unknown-unknown is used for AOT and spirv64-intel-sycl is being used for JIT? Why use a different triple? Why not just rely on arch=<val> in the package and have the clang-linker-wrapper determine AOT based on that? The triple is used for the device compilation which shouldn't need to know if the generated LLVM-IR file is going to be used for JIT or AOT.


I agree. clang-linker-wrapper can determine AOT/JIT based on arch=.
Also, just a nit. I suppose you meant to write: "So, it looks like spirv64-unknown-unknown is used for JIT and spirv64-intel-sycl is being used for AOT?"

Thanks

Collaborator

Right - I must have read the testing wrong. For some reason when I was looking at the testing, I was associating the target and triple backwards. Regardless, it looks like we are in agreement on using a single triple and having the clang-linker-wrapper understand the AOT target based on the arch= value.

Owner Author

@mdtoguchi
> Why not just rely on arch=<val> in the package and have the clang-linker-wrapper determine AOT based on that

This is still the case.
The proposal to have spirv64-intel-sycl as the target triple string for AOT is to have a more descriptive string that has info about the vendor and OS as opposed to having 'unknown' for both. This also matches what is done for CUDA and HIP.
This also updates triple values in clang-offload-packager call and other places where -triple is passed:
"--image=file=test.bc,triple=spirv64-intel-sycl,arch=bmg_g21_gpu,kind=sycl"

We could probably drop 'sycl' and use spirv64-intel-unknown which is already used for OpenMP AOT.

I'm wondering why we aren't aligning with OpenMP Intel AOT, CUDA, and HIP. I don't see any issues with using a more descriptive target triple string for SYCL AOT to Intel targets.

Collaborator

There should be no issues with moving to a more descriptive triple, but the usage of the triple during the device compilation isn't Intel specific. It is just generating generic IR. We were already using the spirv64-unknown-unknown for JIT, so I don't see why we should move away from that for AOT. We are able to use the same generated device binary with spirv64-unknown-unknown for JIT and AOT. triple=spirv64-unknown-unknown,arch=bmg_g21_gpu,kind=sycl should be plenty of information for the clang-linker-wrapper to decipher what needs to be done with the packaged binary.
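
To make the point concrete, here is a rough sketch of how a consumer of the packaged image could tell AOT from JIT using only the `arch=` field. `parseImageFields` and `isAOTImage` are hypothetical helpers, not clang-linker-wrapper APIs, and the input is assumed to be the comma-separated part that follows `--image=`.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: splits "key=value,key=value" into a map.
// Items without '=' are skipped.
std::map<std::string, std::string> parseImageFields(const std::string &Image) {
  std::map<std::string, std::string> Fields;
  std::istringstream Stream(Image);
  std::string Item;
  while (std::getline(Stream, Item, ',')) {
    const auto Eq = Item.find('=');
    if (Eq == std::string::npos)
      continue;
    Fields[Item.substr(0, Eq)] = Item.substr(Eq + 1);
  }
  return Fields;
}

// Assumption for this sketch: an image is AOT exactly when it carries a
// non-empty arch= value; the triple is not consulted at all.
bool isAOTImage(const std::map<std::string, std::string> &Fields) {
  const auto It = Fields.find("arch");
  return It != Fields.end() && !It->second.empty();
}
```

With this rule the same spirv64-unknown-unknown triple can serve both flows, and only the presence of `arch=` changes what is done with the binary.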

Owner Author

An LLVM target triple is a string that describes the target architecture, operating system, and vendor for which LLVM is compiling code.

For SYCL AOT to Intel GPUs or CPUs, a target triple string such as spirv64-intel-sycl/unknown indicates that we are compiling code for an Intel target compatible with SPIR-V.

For JIT, the generated SPIRV is not tied to Intel targets, so it seems reasonable to have 'unknown' for the target vendor and OS.

AFAIK, even with SYCL offloading to CUDA targets, the generated LLVM IR is generic and the NVPTX Back End adds additional CUDA specific libraries.

We could still generate generic SPIR-V/LLVM IR and yet have a target triple string that describes the target for which LLVM is compiling code.

Collaborator

Thanks @srividya-sundaram. I'm OK with using spirv64-intel-unknown for AOT.

asudarsa commented Jul 7, 2025

Hi @srividya-sundaram and @mdtoguchi

In llvm#146594 which was just merged, they use --offload-targets. Should we also use that instead of --offload-arch?

Thanks

@srividya-sundaram
Owner Author

> Hi @srividya-sundaram and @mdtoguchi
>
> In llvm#146594 which was just merged, they use --offload-targets. Should we also use that instead of --offload-arch?
>
> Thanks

@asudarsa
The HelpText in that patch for --offload-targets reads: "Specify a list of target architectures to use for offloading."
However, the example usage in the tests seems to pass a target triple string as the value:
--offload-targets=amdgcn-amd-amdhsa
If we need to pass info about the offloading device architecture, do we still need to pass --offload-arch?
Example:
clang -fsycl --offload-targets=spirv64 --offload-arch=pvc syclfile.cpp
This is not clear to me from the patch.

asudarsa commented Jul 7, 2025

> > Hi @srividya-sundaram and @mdtoguchi
> > In llvm#146594 which was just merged, they use --offload-targets. Should we also use that instead of --offload-arch?
> > Thanks
>
> @asudarsa The HelpText in his patch for --offload-targets reads : Specify a list of target architectures to use for offloading.
> Although, the example usage in the tests seems to use the target triple string as the value.
> --offload-targets=amdgcn-amd-amdhsa
> If we need to pass info about the offloading device architecture, do we still need to pass --offload-arch?
> Example:
> clang -fsycl --offload-targets=spirv64 --offload-arch=pvc syclfile.cpp
> This is not clear to me from his patch.

Ah, I understand now. In the upstream PR, they are trying to simplify the way we specify offload targets by using two options instead of one: --offload-targets and --offload-arch. For Intel targets, I think the --offload-targets value is easy to infer: spirv64 for JIT and spirv64-intel-sycl for AOT. So the user need not specify it.

Thanks

: Warning<"OpenACC directives will result in no runtime behavior; use "
"-fclangir to enable runtime effect">,
InGroup<SourceUsesOpenACC>;
def err_drv_sycl_offload_arch_missing_value :

@asudarsa asudarsa Jul 7, 2025

Why are these warnings SYCL specific?

Thanks

// public one.
// Intel CPUs
GRANITERAPIDS,
GRANITERAPIDS_CPU,

Is this the format we will follow going forward? Processor name + "_" + CPU/GPU? I am ok with it.
Adding @bader for comment.

Thanks
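
If the processor name + "_" + cpu/gpu convention is adopted, the device kind could be recovered from the architecture string alone. The following is a sketch under that assumption; `DeviceKind` and `deviceKindFromArch` are hypothetical names, and the lowercase suffixes follow the `--offload-arch` values shown in the description.

```cpp
#include <optional>
#include <string>

enum class DeviceKind { CPU, GPU };

// Hypothetical helper: inspects the text after the last '_' in an
// --offload-arch value and maps "cpu"/"gpu" to a DeviceKind.
// Returns std::nullopt for names that do not follow the convention.
std::optional<DeviceKind> deviceKindFromArch(const std::string &Arch) {
  const auto Pos = Arch.rfind('_');
  if (Pos == std::string::npos)
    return std::nullopt;
  const std::string Suffix = Arch.substr(Pos + 1);
  if (Suffix == "cpu")
    return DeviceKind::CPU;
  if (Suffix == "gpu")
    return DeviceKind::GPU;
  return std::nullopt;
}
```

Names without a suffix, such as `pvc`, would need a separate lookup under this scheme.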

4 participants