28 changes: 14 additions & 14 deletions clang/test/Driver/linker-wrapper-sycl.cpp
@@ -19,7 +19,7 @@
// CHK-CMDS-NEXT: "{{.*}}llvm-link" -only-needed [[FIRSTLLVMLINKOUT]].bc {{.*}}.bc -o [[SECONDLLVMLINKOUT:.*]].bc --suppress-warnings
// CHK-CMDS-NEXT: "{{.*}}sycl-post-link"{{.*}} SYCL_POST_LINK_OPTIONS -o [[SYCLPOSTLINKOUT:.*]].table [[SECONDLLVMLINKOUT]].bc
// CHK-CMDS-NEXT: "{{.*}}llvm-spirv"{{.*}} LLVM_SPIRV_OPTIONS -o {{.*}}
// CHK-CMDS-NEXT: offload-wrapper: input: {{.*}}, output: [[WRAPPEROUT:.*]].bc
// CHK-CMDS-NEXT: offload-wrapper: input: {{.*}}, output: [[WRAPPEROUT:.*]].bc, compile-opts: , link-opts:
// CHK-CMDS-NEXT: "{{.*}}clang"{{.*}} -c -o [[LLCOUT:.*]] [[WRAPPEROUT]].bc
// CHK-CMDS-NEXT: "{{.*}}/ld" -- HOST_LINKER_FLAGS -dynamic-linker HOST_DYN_LIB -o a.out [[LLCOUT]] HOST_LIB_PATH HOST_STAT_LIB {{.*}}.o

@@ -32,7 +32,7 @@
// CHK-SPLIT-CMDS-NEXT: sycl-module-split: input: [[SECONDLLVMLINKOUT]].bc, output: [[SYCLMODULESPLITOUT:.*]].bc
// CHK-SPLIT-CMDS-NEXT: "{{.*}}llvm-spirv"{{.*}} LLVM_SPIRV_OPTIONS -o [[SPIRVOUT:.*]].spv [[SYCLMODULESPLITOUT]].bc
// LLVM-SPIRV is not called in dry-run
// CHK-SPLIT-CMDS-NEXT: offload-wrapper: input: [[SPIRVOUT]].spv, output: [[WRAPPEROUT:.*]].bc
// CHK-SPLIT-CMDS-NEXT: offload-wrapper: input: [[SPIRVOUT]].spv, output: [[WRAPPEROUT:.*]].bc, compile-opts: , link-opts:
Contributor

My understanding is that clang-linker-wrapper --dry-run prints only external commands.
Why do we print offload-wrapper, which is not an external command?

On the other hand, LLVM-SPIRV is not called in dry-run. Why?

Contributor

To be fair, the previous line checks that the llvm-spirv tool is called, so the comment seems to be misleading.

Contributor Author

> My understanding is that clang-linker-wrapper --dry-run prints only external commands.

That is right. The 'offload-wrapper' line was printed to improve debuggability and transparency.
It should be changed to follow the initial approach.

Contributor Author

On the other hand, we have complicated option parsing, and it is not clear how to test it without verbose and dry-run modes.

/// Add any sycl-post-link options that rely on a specific Triple in addition
/// to user supplied options.
/// NOTE: Any changes made here should be reflected in the similarly named
/// function in clang/lib/Driver/ToolChains/Clang.cpp.
static void
getTripleBasedSYCLPostLinkOpts(const ArgList &Args,
                               SmallVector<StringRef, 8> &PostLinkArgs,
                               const llvm::Triple Triple) {
  const llvm::Triple HostTriple(Args.getLastArgValue(OPT_host_triple_EQ));
  bool SYCLNativeCPU = (HostTriple == Triple);
  bool SpecConstsSupported = (!Triple.isNVPTX() && !Triple.isAMDGCN() &&
                              !Triple.isSPIRAOT() && !SYCLNativeCPU);
  if (SpecConstsSupported)
    PostLinkArgs.push_back("-spec-const=native");
  else
    PostLinkArgs.push_back("-spec-const=emulation");
  // TODO: If we ever pass -ir-output-only based on the triple,
  // make sure we don't pass -properties.
  PostLinkArgs.push_back("-properties");
  // See if device code splitting is already requested. If not requested, then
  // set -split=auto for non-FPGA targets.
  bool NoSplit = true;
  for (auto Arg : PostLinkArgs)
    if (Arg.contains("-split=")) {
      NoSplit = false;
      break;
    }
  if (NoSplit && (Triple.getSubArch() != llvm::Triple::SPIRSubArch_fpga))
    PostLinkArgs.push_back("-split=auto");
  // On Intel targets we don't need non-kernel functions as entry points,
  // because it only increases amount of code for device compiler to handle,
  // without any actual benefits.
  // TODO: Try to extend this feature for non-Intel GPUs.
  if ((!Args.hasFlag(OPT_no_sycl_remove_unused_external_funcs,
                     OPT_sycl_remove_unused_external_funcs, false) &&
       !SYCLNativeCPU) &&
      !Args.hasArg(OPT_sycl_allow_device_image_dependencies) &&
      !Triple.isNVPTX() && !Triple.isAMDGPU())
    PostLinkArgs.push_back("-emit-only-kernels-as-entry-points");
  if (!Triple.isAMDGCN())
    PostLinkArgs.push_back("-emit-param-info");
  // Enable program metadata
  if (Triple.isNVPTX() || Triple.isAMDGCN() || SYCLNativeCPU)
    PostLinkArgs.push_back("-emit-program-metadata");
  bool SplitEsimdByDefault = Triple.isSPIROrSPIRV();
  bool SplitEsimd =
      Args.hasFlag(OPT_sycl_device_code_split_esimd,
                   OPT_no_sycl_device_code_split_esimd, SplitEsimdByDefault);
  if (!Args.hasArg(OPT_sycl_thin_lto))
    PostLinkArgs.push_back("-symbols");
  // Specialization constant info generation is mandatory -
  // add options unconditionally
  PostLinkArgs.push_back("-emit-exported-symbols");
  PostLinkArgs.push_back("-emit-imported-symbols");
  if (SplitEsimd)
    PostLinkArgs.push_back("-split-esimd");
  PostLinkArgs.push_back("-lower-esimd");
  bool IsAOT = Triple.isNVPTX() || Triple.isAMDGCN() || Triple.isSPIRAOT();
  if (Args.hasFlag(OPT_sycl_add_default_spec_consts_image,
                   OPT_no_sycl_add_default_spec_consts_image, false) &&
      IsAOT)
    PostLinkArgs.push_back("-generate-device-image-default-spec-consts");
}
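
To make the testing concern concrete, the first two decisions in the function above can be modelled in isolation. This is only a minimal standalone sketch, not the clang-linker-wrapper code: `TargetTraits` and `addDefaultPostLinkOpts` are hypothetical names, plain `std::string` stands in for `StringRef`, and boolean flags stand in for the `llvm::Triple` queries. It covers only the `-spec-const` choice and the `-split=auto` default.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the few llvm::Triple queries used below.
struct TargetTraits {
  bool IsNVPTX = false;
  bool IsAMDGCN = false;
  bool IsSPIRAOT = false;
  bool IsFPGA = false;
  bool IsNativeCPU = false;
};

// Appends target-dependent defaults to the user-supplied post-link options.
// Simplified model: only -spec-const and -split handling are shown.
void addDefaultPostLinkOpts(const TargetTraits &T,
                            std::vector<std::string> &Opts) {
  const bool SpecConstsSupported =
      !T.IsNVPTX && !T.IsAMDGCN && !T.IsSPIRAOT && !T.IsNativeCPU;
  Opts.push_back(SpecConstsSupported ? "-spec-const=native"
                                     : "-spec-const=emulation");

  // Keep an explicit user choice of split mode; otherwise default to
  // -split=auto for non-FPGA targets.
  const bool SplitRequested =
      std::any_of(Opts.begin(), Opts.end(), [](const std::string &Opt) {
        return Opt.find("-split=") != std::string::npos;
      });
  if (!SplitRequested && !T.IsFPGA)
    Opts.push_back("-split=auto");
}
```

A function of this shape can be checked directly on its returned option list, which is one possible alternative to observing the options through printed commands.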

Contributor

The work done by the tool can be split into two areas:

  1. Link offload code by calling an external linker tool.
  2. Wrap offload code.

Technically, the tool also calls the host linker, but that is tested the same way as (1). (1) is tested by printing external commands in dry-run mode.

(2) is tested by printing the wrapper added by the tool. There is a special option for that.

The community version of the tool does not do the linking itself. It calls an external linker, and the test checks that the external linker is invoked correctly. Assuming that the external linker is itself tested, that is a good enough check.

The main problem here is that clang-linker-wrapper is not supposed to call sycl-post-link directly. sycl-post-link should be invoked by clang-sycl-linker.

We can add an option to print the LLVM IR module, which SYCL code produces.

Eventually, we should use dry-run mode to validate that clang-linker-wrapper calls the device linker in SYCL mode.
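
The distinction drawn in this comment between external commands and in-process work can be sketched as follows. This is an illustrative model only, assuming a hypothetical `CommandRunner` class; it does not reflect how clang-linker-wrapper is actually structured. External steps are always printed and are executed only outside dry-run mode, while the in-process wrapping step runs in both modes and prints nothing.

```cpp
#include <cassert>
#include <cstdlib>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical runner that separates external commands from in-process work.
class CommandRunner {
public:
  CommandRunner(bool DryRun, std::ostream &Log) : DryRun(DryRun), Log(Log) {}

  // External step: the command line is always printed, but it is executed
  // only when dry-run mode is off. No shell quoting is done in this sketch.
  int runExternal(const std::vector<std::string> &Argv) {
    std::string Line;
    for (const std::string &Arg : Argv) {
      if (!Line.empty())
        Line += ' ';
      Line += Arg;
    }
    Log << Line << '\n';
    if (DryRun)
      return 0;
    return std::system(Line.c_str());
  }

  // In-process step: runs in both modes and is not reported as a command.
  std::string wrapImage(const std::string &Image) const {
    return "wrapped(" + Image + ")";
  }

private:
  bool DryRun;
  std::ostream &Log;
};
```

With this shape, a dry-run test would see only the external command lines in the log, and the wrapper output would have to be inspected through a separate mechanism, such as the printing option mentioned above.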

Contributor Author

@maksimsab maksimsab May 20, 2025

> sycl-post-link should be invoked by clang-sycl-linker.
>
> We can add an option to print the LLVM IR module, which SYCL code produces.

This could be tested by inspecting the output of clang-sycl-linker.

Removed the printing of offload-wrapper.

// CHK-SPLIT-CMDS-NEXT: "{{.*}}clang"{{.*}} -c -o [[LLCOUT:.*]] [[WRAPPEROUT]].bc
// CHK-SPLIT-CMDS-NEXT: "{{.*}}/ld" -- HOST_LINKER_FLAGS -dynamic-linker HOST_DYN_LIB -o a.out [[LLCOUT]] HOST_LIB_PATH HOST_STAT_LIB {{.*}}.o

@@ -117,8 +117,8 @@
// CHK-CMDS-AOT-NV-NEXT: "{{.*}}clang"{{.*}} -o [[CLANGOUT:.*]] --target=nvptx64-nvidia-cuda -march={{.*}}
// CHK-CMDS-AOT-NV-NEXT: "{{.*}}ptxas"{{.*}} --output-file [[PTXASOUT:.*]] [[CLANGOUT]]
// CHK-CMDS-AOT-NV-NEXT: "{{.*}}fatbinary"{{.*}} --create [[FATBINOUT:.*]] --image=profile={{.*}},file=[[CLANGOUT]] --image=profile={{.*}},file=[[PTXASOUT]]
// CHK-CMDS-AOT-NV-NEXT: offload-wrapper: input: [[FATBINOUT]], output: [[WRAPPEROUT:.*]]
// CHK-CMDS-AOT-NV-NEXT: "{{.*}}clang"{{.*}} -c -o [[LLCOUT:.*]] [[WRAPPEROUT]]
// CHK-CMDS-AOT-NV-NEXT: offload-wrapper: input: [[FATBINOUT]], output: [[WRAPPEROUT:.*]].bc,
// CHK-CMDS-AOT-NV-NEXT: "{{.*}}clang"{{.*}} -c -o [[LLCOUT:.*]] [[WRAPPEROUT]].bc
// CHK-CMDS-AOT-NV-NEXT: "{{.*}}ld" -- HOST_LINKER_FLAGS -dynamic-linker HOST_DYN_LIB -o a.out [[LLCOUT]] HOST_LIB_PATH HOST_STAT_LIB {{.*}}.o

/// Check for list of commands for standalone clang-linker-wrapper run for sycl (AOT for AMD)
@@ -135,8 +135,8 @@
// CHK-CMDS-AOT-AMD-NEXT: "{{.*}}sycl-post-link"{{.*}} SYCL_POST_LINK_OPTIONS -o [[SYCLPOSTLINKOUT:.*]].table [[FIRSTLLVMLINKOUT]].bc
// CHK-CMDS-AOT-AMD-NEXT: "{{.*}}clang"{{.*}} -o [[CLANGOUT:.*]] --target=amdgcn-amd-amdhsa -mcpu={{.*}}
// CHK-CMDS-AOT-AMD-NEXT: "{{.*}}clang-offload-bundler"{{.*}} -targets=host-x86_64-unknown-linux-gnu,hip-amdgcn-amd-amdhsa--gfx803 -input=/dev/null -input=[[CLANGOUT]] -output=[[BUNDLEROUT:.*]]
// CHK-CMDS-AOT-AMD-NEXT: offload-wrapper: input: [[BUNDLEROUT]], output: [[WRAPPEROUT:.*]]
// CHK-CMDS-AOT-AMD-NEXT: "{{.*}}clang"{{.*}} -c -o [[LLCOUT:.*]] [[WRAPPEROUT]]
// CHK-CMDS-AOT-AMD-NEXT: offload-wrapper: input: [[BUNDLEROUT]], output: [[WRAPPEROUT:.*]].bc,
// CHK-CMDS-AOT-AMD-NEXT: "{{.*}}clang"{{.*}} -c -o [[LLCOUT:.*]] [[WRAPPEROUT]].bc
// CHK-CMDS-AOT-AMD-NEXT: "{{.*}}ld" -- HOST_LINKER_FLAGS -dynamic-linker HOST_DYN_LIB -o a.out [[LLCOUT]] HOST_LIB_PATH HOST_STAT_LIB {{.*}}.o

/// Check for -sycl-embed-ir for standalone clang-linker-wrapper run for sycl (NVPTX)
@@ -157,13 +157,13 @@
// CHK-CMDS-AOT-NV-EMBED-IR-NEXT: "{{.*}}llvm-link" [[FIRSTLLVMLINKIN]].bc -o [[FIRSTLLVMLINKOUT:.*]].bc --suppress-warnings
// CHK-CMDS-AOT-NV-EMBED-IR-NEXT: "{{.*}}llvm-link" -only-needed [[FIRSTLLVMLINKOUT]].bc {{.*}}.bc -o [[SECONDLLVMLINKOUT:.*]].bc --suppress-warnings
// CHK-CMDS-AOT-NV-EMBED-IR-NEXT: "{{.*}}sycl-post-link"{{.*}} SYCL_POST_LINK_OPTIONS -o [[SYCLPOSTLINKOUT:.*]].table [[SECONDLLVMLINKOUT]].bc
// CHK-CMDS-AOT-NV-EMBED-IR-NEXT: offload-wrapper: input: {{.*}}.bc, output: [[WRAPPEROUT1:.*]]
// CHK-CMDS-AOT-NV-EMBED-IR-NEXT: "{{.*}}clang"{{.*}} -c -o [[LLCOUT1:.*]] [[WRAPPEROUT1]]
// CHK-CMDS-AOT-NV-EMBED-IR-NEXT: offload-wrapper: input: {{.*}}.bc, output: [[WRAPPEROUT1:.*]].bc,
// CHK-CMDS-AOT-NV-EMBED-IR-NEXT: "{{.*}}clang"{{.*}} -c -o [[LLCOUT1:.*]] [[WRAPPEROUT1]].bc
// CHK-CMDS-AOT-NV-EMBED-IR-NEXT: "{{.*}}clang"{{.*}} -o [[CLANGOUT:.*]] --target=nvptx64-nvidia-cuda -march={{.*}}
// CHK-CMDS-AOT-NV-EMBED-IR-NEXT: "{{.*}}ptxas"{{.*}} --output-file [[PTXASOUT:.*]] [[CLANGOUT]]
// CHK-CMDS-AOT-NV-EMBED-IR-NEXT: "{{.*}}fatbinary"{{.*}} --create [[FATBINOUT:.*]] --image=profile={{.*}},file=[[CLANGOUT]] --image=profile={{.*}},file=[[PTXASOUT]]
// CHK-CMDS-AOT-NV-EMBED-IR-NEXT: offload-wrapper: input: [[FATBINOUT]], output: [[WRAPPEROUT:.*]]
// CHK-CMDS-AOT-NV-EMBED-IR-NEXT: "{{.*}}clang"{{.*}} -c -o [[LLCOUT2:.*]] [[WRAPPEROUT]]
// CHK-CMDS-AOT-NV-EMBED-IR-NEXT: offload-wrapper: input: [[FATBINOUT]], output: [[WRAPPEROUT:.*]].bc,
// CHK-CMDS-AOT-NV-EMBED-IR-NEXT: "{{.*}}clang"{{.*}} -c -o [[LLCOUT2:.*]] [[WRAPPEROUT]].bc
// CHK-CMDS-AOT-NV-EMBED-IR-NEXT: "{{.*}}ld" -- HOST_LINKER_FLAGS -dynamic-linker HOST_DYN_LIB -o a.out [[LLCOUT1]] [[LLCOUT2]] HOST_LIB_PATH HOST_STAT_LIB {{.*}}.o

/// Check for -sycl-embed-ir for standalone clang-linker-wrapper run for sycl (AMD)
@@ -178,12 +178,12 @@
// CHK-CMDS-AOT-AMD-EMBED-IR: "{{.*}}spirv-to-ir-wrapper" {{.*}} -o [[FIRSTLLVMLINKIN:.*]].bc --llvm-spirv-opts --spirv-preserve-auxdata --spirv-target-env=SPV-IR --spirv-builtin-format=global
// CHK-CMDS-AOT-AMD-EMBED-IR-NEXT: "{{.*}}llvm-link" [[FIRSTLLVMLINKIN]].bc -o [[FIRSTLLVMLINKOUT:.*]].bc --suppress-warnings
// CHK-CMDS-AOT-AMD-EMBED-IR-NEXT: "{{.*}}sycl-post-link"{{.*}} SYCL_POST_LINK_OPTIONS -o [[SYCLPOSTLINKOUT:.*]].table [[FIRSTLLVMLINKOUT]].bc
// CHK-CMDS-AOT-AMD-EMBED-IR-NEXT: offload-wrapper: input: {{.*}}.bc, output: [[WRAPPEROUT1:.*]]
// CHK-CMDS-AOT-AMD-EMBED-IR-NEXT: "{{.*}}clang"{{.*}} -c -o [[LLCOUT1:.*]] [[WRAPPEROUT1]]
// CHK-CMDS-AOT-AMD-EMBED-IR-NEXT: offload-wrapper: input: {{.*}}.bc, output: [[WRAPPEROUT1:.*]].bc,
// CHK-CMDS-AOT-AMD-EMBED-IR-NEXT: "{{.*}}clang"{{.*}} -c -o [[LLCOUT1:.*]] [[WRAPPEROUT1]].bc
// CHK-CMDS-AOT-AMD-EMBED-IR-NEXT: "{{.*}}clang"{{.*}} -o [[CLANGOUT:.*]] --target=amdgcn-amd-amdhsa -mcpu={{.*}}
// CHK-CMDS-AOT-AMD-EMBED-IR-NEXT: "{{.*}}clang-offload-bundler"{{.*}} -input=[[CLANGOUT]] -output=[[BUNDLEROUT:.*]]
// CHK-CMDS-AOT-AMD-EMBED-IR-NEXT: offload-wrapper: input: [[BUNDLEROUT]], output: [[WRAPPEROUT2:.*]]
// CHK-CMDS-AOT-AMD-EMBED-IR-NEXT: "{{.*}}clang"{{.*}} -c -o [[LLCOUT2:.*]] [[WRAPPEROUT2]]
// CHK-CMDS-AOT-AMD-EMBED-IR-NEXT: offload-wrapper: input: [[BUNDLEROUT]], output: [[WRAPPEROUT2:.*]].bc,
// CHK-CMDS-AOT-AMD-EMBED-IR-NEXT: "{{.*}}clang"{{.*}} -c -o [[LLCOUT2:.*]] [[WRAPPEROUT2]].bc
// CHK-CMDS-AOT-AMD-EMBED-IR-NEXT: "{{.*}}ld" -- HOST_LINKER_FLAGS -dynamic-linker HOST_DYN_LIB -o a.out [[LLCOUT1]] [[LLCOUT2]] HOST_LIB_PATH HOST_STAT_LIB {{.*}}.o

// Error handling when --linker-path is not provided for clang-linker-wrapper
111 changes: 72 additions & 39 deletions clang/test/Driver/sycl-linker-wrapper-image.cpp
@@ -12,11 +12,11 @@
// RUN: touch %t.devicelib.cpp
// RUN: %clang %t.devicelib.cpp -fsycl -fsycl-targets=spir64-unknown-unknown -c --offload-new-driver -o %t.devicelib.o
//
// Run clang-linker-wrapper test
// Check SYCL Offload Wrapper in non-dry-run mode since we need to cover each of its functions.
Contributor

My understanding is that offload wrapping functionality must work in dry-run mode as well.
Dry-run mode skips running 3rd party executables, but all logic implemented in the clang-linker-wrapper tool itself should be executed and therefore tested.

Contributor Author

> My understanding is that offload wrapping functionality must work in dry-run mode as well.

Yes, and I added such testing below in this file.

SYCL Offloading contains parts like this:

addPropertySetRegistry(const PropertySetRegistry &PropRegistry) {
  // transform all property sets to IR and get the middle column image into
  // the PropSetsInits
  SmallVector<Constant *> PropSetsInits;
  for (const auto &PropSet : PropRegistry) {
    // create content in the rightmost column and get begin/end pointers
    std::pair<Constant *, Constant *> Props =
        addPropertySetToModule(PropSet.second);
    // get the next the middle column element
    auto *Category = addStringToModule(PropSet.first, "SYCL_PropSetName");
    PropSetsInits.push_back(ConstantStruct::get(SyclPropSetTy, Category,
                                                Props.first, Props.second));
  }
  // now get content for the leftmost column - create the top-level
  // PropertySetsBegin/PropertySetsBegin entries and return pointers to them
  return addStructArrayToModule(PropSetsInits, SyclPropSetTy);
}

The loop in this function is covered by tests only if the properties are non-empty. As you already noticed, it is not a good idea to add arbitrary stub values in dry-run mode, as I initially tried with the following code:

    // Arbitrary values are used for the testing of SYCL Offload Wrapping.
    auto Properties = util::PropertySetRegistry();
    Properties.add(util::PropertySetRegistry::SYCL_DEVICE_REQUIREMENTS, "key",
                   util::PropertyValue(uint32_t(0)));
    std::vector Modules = {module_split::SplitModule(
        *ImageFileOrErr, std::move(Properties), "entry1\nentry2")};

To cover the whole SYCLOffloadWrapper, that leaves me with a choice between:

  1. Do the testing in both dry-run and non-dry-run modes.
  2. Do the testing only in dry-run mode but insert various stub values.

If I test only in dry-run mode with the default value {module_split::SplitModule(*ImageFileOrErr, util::PropertySetRegistry(), "")}, code coverage would be lower.
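
The coverage argument can be illustrated with a reduced model of the loop. This is a sketch under stated assumptions rather than the wrapper's real code: `PropertySet`, `PropertySetRegistry`, `PropSetEntry` and `buildPropSetTable` are hypothetical stand-ins built on standard containers, and the emitted table is a plain vector instead of LLVM IR constants.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the registry and the emitted rows.
using PropertySet = std::map<std::string, std::uint32_t>;
using PropertySetRegistry = std::map<std::string, PropertySet>;

struct PropSetEntry {
  std::string Category;  // property set name
  std::size_t NumProps;  // number of properties in the set
};

// Builds one table row per property set. For an empty registry the loop
// body never executes, so a test using only the default (empty) registry
// cannot exercise the per-set logic.
std::vector<PropSetEntry>
buildPropSetTable(const PropertySetRegistry &Registry) {
  std::vector<PropSetEntry> Table;
  for (const auto &PropSet : Registry)
    Table.push_back({PropSet.first, PropSet.second.size()});
  return Table;
}
```

An empty registry yields an empty table, which corresponds to the dry-run case with default values, whereas a registry with at least one set is needed to reach the loop body.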

//
//// RUN: clang-linker-wrapper --print-wrapped-module --host-triple=x86_64-unknown-linux-gnu \
// RUN: clang-linker-wrapper --print-wrapped-module --host-triple=x86_64-unknown-linux-gnu \
// RUN: -sycl-device-libraries=%t.devicelib.o \
// RUN: -sycl-post-link-options="-split=auto -symbols -properties" %t.o -o %t.out 2>&1 --linker-path="/usr/bin/ld" | FileCheck %s
// RUN: -sycl-post-link-options="-split=auto -symbols -properties" %t.o -o %t.out 2>&1 --linker-path="/usr/bin/ld" | FileCheck %s --check-prefix=CHECK-FULL

template <typename t, typename Func>
__attribute__((sycl_kernel)) void kernel(const Func &func) {
@@ -35,42 +35,75 @@ int main() {

//#endif

// CHECK-DAG: %_pi_device_binary_property_struct = type { ptr, ptr, i32, i64 }
// CHECK-DAG: %_pi_device_binary_property_set_struct = type { ptr, ptr, ptr }
// CHECK-DAG: %struct.__tgt_offload_entry = type { i64, i16, i16, i32, ptr, ptr, i64, i64, ptr }
// CHECK-DAG: %__sycl.tgt_device_image = type { i16, i8, i8, ptr, ptr, ptr, ptr, ptr, ptr, ptr, ptr, ptr, ptr, ptr }
// CHECK-DAG: %__sycl.tgt_bin_desc = type { i16, i16, ptr, ptr, ptr }
// CHECK-FULL: %_pi_device_binary_property_struct = type { ptr, ptr, i32, i64 }
// CHECK-FULL-NEXT: %_pi_device_binary_property_set_struct = type { ptr, ptr, ptr }
// CHECK-FULL-NEXT: %struct.__tgt_offload_entry = type { i64, i16, i16, i32, ptr, ptr, i64, i64, ptr }
// CHECK-FULL-NEXT: %__sycl.tgt_device_image = type { i16, i8, i8, ptr, ptr, ptr, ptr, ptr, ptr, ptr, ptr, ptr, ptr, ptr }
// CHECK-FULL-NEXT: %__sycl.tgt_bin_desc = type { i16, i16, ptr, ptr, ptr }

// CHECK-DAG: @.sycl_offloading.target.0 = internal unnamed_addr constant [7 x i8] c"spir64\00"
// CHECK-DAG: @.sycl_offloading.opts.compile.0 = internal unnamed_addr constant [1 x i8] zeroinitializer
// CHECK-DAG: @.sycl_offloading.opts.link.0 = internal unnamed_addr constant [1 x i8] zeroinitializer
// CHECK-DAG: @prop = internal unnamed_addr constant [17 x i8] c"DeviceLibReqMask\00"
// CHECK-DAG: @__sycl_offload_prop_sets_arr = internal constant [1 x %_pi_device_binary_property_struct] [%_pi_device_binary_property_struct { ptr @prop, ptr null, i32 1, i64 0 }]
// CHECK-DAG: @SYCL_PropSetName = internal unnamed_addr constant [24 x i8] c"SYCL/devicelib req mask\00"
// CHECK-DAG: @prop.1 = internal unnamed_addr constant [8 x i8] c"aspects\00"
// CHECK-DAG: @prop_val = internal unnamed_addr constant [8 x i8] zeroinitializer
// CHECK-DAG: @__sycl_offload_prop_sets_arr.2 = internal constant [1 x %_pi_device_binary_property_struct] [%_pi_device_binary_property_struct { ptr @prop.1, ptr @prop_val, i32 2, i64 8 }]
// CHECK-DAG: @SYCL_PropSetName.3 = internal unnamed_addr constant [25 x i8] c"SYCL/device requirements\00"
// CHECK-DAG: @SYCL_PropSetName.4 = internal unnamed_addr constant [22 x i8] c"SYCL/kernel param opt\00"
// CHECK-DAG: @__sycl_offload_prop_sets_arr.5 = internal constant [3 x %_pi_device_binary_property_set_struct] [%_pi_device_binary_property_set_struct { ptr @SYCL_PropSetName, ptr @__sycl_offload_prop_sets_arr, ptr getelementptr ([1 x %_pi_device_binary_property_struct], ptr @__sycl_offload_prop_sets_arr, i64 0, i64 1) }, %_pi_device_binary_property_set_struct { ptr @SYCL_PropSetName.3, ptr @__sycl_offload_prop_sets_arr.2, ptr getelementptr ([1 x %_pi_device_binary_property_struct], ptr @__sycl_offload_prop_sets_arr.2, i64 0, i64 1) }, %_pi_device_binary_property_set_struct { ptr @SYCL_PropSetName.4, ptr null, ptr null }]
// CHECK-DAG: @.sycl_offloading.0.data = internal unnamed_addr constant [912 x i8]
// CHECK-DAG: @__sycl_offload_entry_name = internal unnamed_addr constant [25 x i8] c"_ZTSZ4mainE11fake_kernel\00"
// CHECK-DAG: @__sycl_offload_entries_arr = internal constant [1 x %struct.__tgt_offload_entry] [%struct.__tgt_offload_entry { i64 0, i16 1, i16 4, i32 0, ptr null, ptr @__sycl_offload_entry_name, i64 0, i64 0, ptr null }]
// CHECK-DAG: @.sycl_offloading.0.info = internal local_unnamed_addr constant [2 x i64] [i64 ptrtoint (ptr @.sycl_offloading.0.data to i64), i64 912], section ".tgtimg", align 16
// CHECK-DAG: @llvm.used = appending global [1 x ptr] [ptr @.sycl_offloading.0.info], section "llvm.metadata"
// CHECK-DAG: @.sycl_offloading.device_images = internal unnamed_addr constant [1 x %__sycl.tgt_device_image] [%__sycl.tgt_device_image { i16 2, i8 4, i8 0, ptr @.sycl_offloading.target.0, ptr @.sycl_offloading.opts.compile.0, ptr @.sycl_offloading.opts.link.0, ptr null, ptr null, ptr @.sycl_offloading.0.data, ptr getelementptr ([912 x i8], ptr @.sycl_offloading.0.data, i64 0, i64 912), ptr @__sycl_offload_entries_arr, ptr getelementptr ([1 x %struct.__tgt_offload_entry], ptr @__sycl_offload_entries_arr, i64 0, i64 1), ptr @__sycl_offload_prop_sets_arr.5, ptr getelementptr ([3 x %_pi_device_binary_property_set_struct], ptr @__sycl_offload_prop_sets_arr.5, i64 0, i64 3) }]
// CHECK-DAG: @.sycl_offloading.descriptor = internal constant %__sycl.tgt_bin_desc { i16 1, i16 1, ptr @.sycl_offloading.device_images, ptr null, ptr null }
// CHECK-DAG: @llvm.global_ctors = {{.*}} { i32 1, ptr @sycl.descriptor_reg, ptr null }]
// CHECK-DAG: @llvm.global_dtors = {{.*}} { i32 1, ptr @sycl.descriptor_unreg, ptr null }]
// CHECK-FULL: @.sycl_offloading.target.0 = internal unnamed_addr constant [7 x i8] c"spir64\00"
// CHECK-FULL-NEXT: @.sycl_offloading.opts.compile.0 = internal unnamed_addr constant [1 x i8] zeroinitializer
// CHECK-FULL-NEXT: @.sycl_offloading.opts.link.0 = internal unnamed_addr constant [1 x i8] zeroinitializer
// CHECK-FULL-NEXT: @prop = internal unnamed_addr constant [17 x i8] c"DeviceLibReqMask\00"
// CHECK-FULL-NEXT: @__sycl_offload_prop_sets_arr = internal constant [1 x %_pi_device_binary_property_struct] [%_pi_device_binary_property_struct { ptr @prop, ptr null, i32 1, i64 0 }]
// CHECK-FULL-NEXT: @SYCL_PropSetName = internal unnamed_addr constant [24 x i8] c"SYCL/devicelib req mask\00"
// CHECK-FULL-NEXT: @prop.1 = internal unnamed_addr constant [8 x i8] c"aspects\00"
// CHECK-FULL-NEXT: @prop_val = internal unnamed_addr constant [8 x i8] zeroinitializer
// CHECK-FULL-NEXT: @__sycl_offload_prop_sets_arr.2 = internal constant [1 x %_pi_device_binary_property_struct] [%_pi_device_binary_property_struct { ptr @prop.1, ptr @prop_val, i32 2, i64 8 }]
// CHECK-FULL-NEXT: @SYCL_PropSetName.3 = internal unnamed_addr constant [25 x i8] c"SYCL/device requirements\00"
// CHECK-FULL-NEXT: @SYCL_PropSetName.4 = internal unnamed_addr constant [22 x i8] c"SYCL/kernel param opt\00"
// CHECK-FULL-NEXT: @__sycl_offload_prop_sets_arr.5 = internal constant [3 x %_pi_device_binary_property_set_struct] [%_pi_device_binary_property_set_struct { ptr @SYCL_PropSetName, ptr @__sycl_offload_prop_sets_arr, ptr getelementptr ([1 x %_pi_device_binary_property_struct], ptr @__sycl_offload_prop_sets_arr, i64 0, i64 1) }, %_pi_device_binary_property_set_struct { ptr @SYCL_PropSetName.3, ptr @__sycl_offload_prop_sets_arr.2, ptr getelementptr ([1 x %_pi_device_binary_property_struct], ptr @__sycl_offload_prop_sets_arr.2, i64 0, i64 1) }, %_pi_device_binary_property_set_struct { ptr @SYCL_PropSetName.4, ptr null, ptr null }]
// CHECK-FULL-NEXT: @.sycl_offloading.0.data = internal unnamed_addr constant [912 x i8]
// CHECK-FULL-NEXT: @__sycl_offload_entry_name = internal unnamed_addr constant [25 x i8] c"_ZTSZ4mainE11fake_kernel\00"
// CHECK-FULL-NEXT: @__sycl_offload_entries_arr = internal constant [1 x %struct.__tgt_offload_entry] [%struct.__tgt_offload_entry { i64 0, i16 1, i16 4, i32 0, ptr null, ptr @__sycl_offload_entry_name, i64 0, i64 0, ptr null }]
// CHECK-FULL-NEXT: @.sycl_offloading.0.info = internal local_unnamed_addr constant [2 x i64] [i64 ptrtoint (ptr @.sycl_offloading.0.data to i64), i64 912], section ".tgtimg", align 16
// CHECK-FULL-NEXT: @llvm.used = appending global [1 x ptr] [ptr @.sycl_offloading.0.info], section "llvm.metadata"
// CHECK-FULL-NEXT: @.sycl_offloading.device_images = internal unnamed_addr constant [1 x %__sycl.tgt_device_image] [%__sycl.tgt_device_image { i16 2, i8 4, i8 0, ptr @.sycl_offloading.target.0, ptr @.sycl_offloading.opts.compile.0, ptr @.sycl_offloading.opts.link.0, ptr null, ptr null, ptr @.sycl_offloading.0.data, ptr getelementptr ([912 x i8], ptr @.sycl_offloading.0.data, i64 0, i64 912), ptr @__sycl_offload_entries_arr, ptr getelementptr ([1 x %struct.__tgt_offload_entry], ptr @__sycl_offload_entries_arr, i64 0, i64 1), ptr @__sycl_offload_prop_sets_arr.5, ptr getelementptr ([3 x %_pi_device_binary_property_set_struct], ptr @__sycl_offload_prop_sets_arr.5, i64 0, i64 3) }]
// CHECK-FULL-NEXT: @.sycl_offloading.descriptor = internal constant %__sycl.tgt_bin_desc { i16 1, i16 1, ptr @.sycl_offloading.device_images, ptr null, ptr null }
// CHECK-FULL-NEXT: @llvm.global_ctors = {{.*}} { i32 1, ptr @sycl.descriptor_reg, ptr null }]
// CHECK-FULL-NEXT: @llvm.global_dtors = {{.*}} { i32 1, ptr @sycl.descriptor_unreg, ptr null }]

// CHECK: define internal void @sycl.descriptor_reg() section ".text.startup" {
// CHECK-NEXT: entry:
// CHECK-NEXT: call void @__sycl_register_lib(ptr @.sycl_offloading.descriptor)
// CHECK-NEXT: ret void
// CHECK-NEXT: }
// CHECK-FULL: define internal void @sycl.descriptor_reg() section ".text.startup" {
// CHECK-FULL-NEXT: entry:
// CHECK-FULL-NEXT: call void @__sycl_register_lib(ptr @.sycl_offloading.descriptor)
// CHECK-FULL-NEXT: ret void
// CHECK-FULL-NEXT: }

// CHECK: define internal void @sycl.descriptor_unreg() section ".text.startup" {
// CHECK-NEXT: entry:
// CHECK-NEXT: call void @__sycl_unregister_lib(ptr @.sycl_offloading.descriptor)
// CHECK-NEXT: ret void
// CHECK-NEXT: }
// CHECK-FULL: define internal void @sycl.descriptor_unreg() section ".text.startup" {
// CHECK-FULL-NEXT: entry:
// CHECK-FULL-NEXT: call void @__sycl_unregister_lib(ptr @.sycl_offloading.descriptor)
// CHECK-FULL-NEXT: ret void
// CHECK-FULL-NEXT: }


// Run Check SYCL Offload Wrapping in dry-run mode.
//
// RUN: clang-linker-wrapper --print-wrapped-module --dry-run --host-triple=x86_64-unknown-linux-gnu \
// RUN: -sycl-device-libraries=%t.devicelib.o \
// RUN: %t.o -o %t.out 2>&1 --linker-path="/usr/bin/ld" | FileCheck %s --check-prefix=CHECK-DRY

// CHECK-DRY: %__sycl.tgt_device_image = type { i16, i8, i8, ptr, ptr, ptr, ptr, ptr, ptr, ptr, ptr, ptr, ptr, ptr }
// CHECK-DRY-NEXT: %__sycl.tgt_bin_desc = type { i16, i16, ptr, ptr, ptr }

// CHECK-DRY: @.sycl_offloading.target.0 = internal unnamed_addr constant [7 x i8] c"spir64\00"
// CHECK-DRY-NEXT: @.sycl_offloading.opts.compile.0 = internal unnamed_addr constant [1 x i8] zeroinitializer
// CHECK-DRY-NEXT: @.sycl_offloading.opts.link.0 = internal unnamed_addr constant [1 x i8] zeroinitializer
// CHECK-DRY-NEXT: @.sycl_offloading.0.data = internal unnamed_addr constant [0 x i8] zeroinitializer, section "spir64"
// CHECK-DRY-NEXT: @.sycl_offloading.0.info = internal local_unnamed_addr constant [2 x i64] [i64 ptrtoint (ptr @.sycl_offloading.0.data to i64), i64 0], section ".tgtimg", align 16
// CHECK-DRY-NEXT: @llvm.used = appending global [1 x ptr] [ptr @.sycl_offloading.0.info], section "llvm.metadata"
// CHECK-DRY-NEXT: @.sycl_offloading.device_images = internal unnamed_addr constant [1 x %__sycl.tgt_device_image] [%__sycl.tgt_device_image { i16 2, i8 4, i8 0, ptr @.sycl_offloading.target.0, ptr @.sycl_offloading.opts.compile.0, ptr @.sycl_offloading.opts.link.0, ptr null, ptr null, ptr @.sycl_offloading.0.data, ptr @.sycl_offloading.0.data, ptr null, ptr null, ptr null, ptr null }]
// CHECK-DRY-NEXT: @.sycl_offloading.descriptor = internal constant %__sycl.tgt_bin_desc { i16 1, i16 1, ptr @.sycl_offloading.device_images, ptr null, ptr null }
// CHECK-DRY-NEXT: @llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 1, ptr @sycl.descriptor_reg, ptr null }]
// CHECK-DRY-NEXT: @llvm.global_dtors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 1, ptr @sycl.descriptor_unreg, ptr null }]

// CHECK-DRY: define internal void @sycl.descriptor_reg() section ".text.startup" {
// CHECK-DRY-NEXT: entry:
// CHECK-DRY-NEXT: call void @__sycl_register_lib(ptr @.sycl_offloading.descriptor)
// CHECK-DRY-NEXT: ret void
// CHECK-DRY-NEXT: }

// CHECK-DRY: define internal void @sycl.descriptor_unreg() section ".text.startup" {
// CHECK-DRY-NEXT: entry:
// CHECK-DRY-NEXT: call void @__sycl_unregister_lib(ptr @.sycl_offloading.descriptor)
// CHECK-DRY-NEXT: ret void
// CHECK-DRY-NEXT: }
Contributor

Suggested change
// CHECK-DRY-NEXT: }
// CHECK-DRY-NEXT: }
