From 17576dd5b915198f3238a83678adda30275de25d Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Mon, 3 Feb 2025 07:48:27 -0800 Subject: [PATCH 01/30] [SYCL][Docs] Add SYCLBIN feature and format design document This commit adds a design document detailing the SYCLBIN binary format for representing SYCL device kernel binaries to be loaded dynamically at runtime. Additionally, the design document details how this is to be handled by the SYCL runtime, driver and clang tooling. Signed-off-by: Larsen, Steffen --- sycl/doc/design/SYCLBINDesign.asciidoc | 255 ++++++++++++++++++ .../SYCLBIN_file_format_illustration.svg | 1 + 2 files changed, 256 insertions(+) create mode 100644 sycl/doc/design/SYCLBINDesign.asciidoc create mode 100644 sycl/doc/design/SYCLBIN_file_format_illustration.svg diff --git a/sycl/doc/design/SYCLBINDesign.asciidoc b/sycl/doc/design/SYCLBINDesign.asciidoc new file mode 100644 index 0000000000000..af6962c6efe04 --- /dev/null +++ b/sycl/doc/design/SYCLBINDesign.asciidoc @@ -0,0 +1,255 @@ += SYCLBIN - A format for separately compiled SYCL device code + +This design document details the SYCLBIN binary format used for storing SYCL +device binaries to be loaded dynamically by the SYCL runtime. It also details +how the toolchain produces, links and packages these binaries, as well as how +the SYCL runtime library handles files of this format. + +[[syclbin_format]] +== SYCLBIN binary format + +The files produced by the new compilation path will follow the format described +in this section. The intention of defining a new format for these is to give +the DPC++ implementation an extendable and lightweight wrapper around the +multiple modules and corresponding metadata captured in the SYCLBIN file. +The content of the SYCLBIN may be contained as an entry in the offloading binary +format produced by the clang-offload-packager, as described in +link:../../clang/docs/ClangOffloadPackager.rst[ClangOffloadPackager.rst]. + +The following illustration gives an overview of how the file format is +structured. + +image::SYCLBIN_file_format_illustration.svg["SYCLBIN binary file format illustration", width=40%] + +=== Header + +The header segment appears as the first part of the SYCLBIN file. Like many +other file-formats, it defines a magic number to help identify the format, which +is 0x53594249 (or "SYBI".) Immediately following the magic number is the version +number, which is used by SYCLBIN consumers when parsing data in the rest of the +file. + +[cols="1,3,1"] +|=== +| Type | Description | Value variable + +| `uint32_t` | Magic number. (0x53594249) | +| `uint32_t` | SYCLBIN version number. | +| `uint8_t` | `sycl::bundle_state` corresponding to the contents of the SYCLBIN. | +|=== + +The `sycl::bundle_state` is an integer with the values as follows: + +[cols="3,1"] +|=== +| `sycl::bundle_state` | Value + +| `input` | 0 +| `object` | 1 +| `executable` | 2 +|=== + + +=== Body + +Immediately after the header is the body of the SYCLBIN file. The body consists +of a list of abstract modules. + +[cols="1,3,1"] +|=== +| Type | Description | Value variable + +| `uint64_t` | Byte size of the list of abstract modules. | `B` +| `B` | List of abstract modules. | +|=== + + +==== Abstract module + +Each abstract module represents a set of kernels, the corresponding metadata, 0 +or more IR modules containing these kernels, and 0 or more native device code +images containing the kernels. + +[cols="1,3,1"] +|=== +| Type | Description | Value variable + +| `uint64_t` | Byte size of the list of the metadata. | `M` +| `M` | Module metadata. | +| `uint64_t` | Byte size of list of IR modules. | `IR` +| `IR` | List of IR modules. | +| `uint64_t` | Byte size of list of native device code images. | `ND` +| `ND` | List of native device code images. | +|=== + + +===== Module metadata + +The module metadata contains the following information about the contents of the +module. + +[cols="1,3,1"] +|=== +| Type | Description | Value variable + +| `uint32_t` | Byte size of the list of kernel names. | `K` +| `K` | List of kernel names. (String list) | +| `uint32_t` | Byte size of the list of imported symbols. | `I` +| `I` | List of imported symbols. (String list) | +| `uint32_t` | Byte size of the list of exported symbols. | `E` +| `E` | List of exported symbols. (String list) | +| `uint32_t` | Byte size of property set data. | `P` +| `P` | Property set data. | +|=== + + +NOTE: Optional features used is embedded in the property set data. +TODO: Consolidate and/or document the property set data in this document. + +====== String list + +A string list in this binary format consists of a `uint32_t` at the beginning +containing the number of elements in the list, followed by that number of +entries with the format: + +[cols="1,3,1"] +|=== +| Type | Description | Value variable + +| `uint32_t` | Byte size of the string. | `S` +| `S` | String bytes. | +|=== + + +===== IR module + +An IR module contains the binary data for the corresponding module compiled to a +given IR representation, identified by the IR type field. + +[cols="1,3,1"] +|=== +| Type | Description | Value variable + +| `uint8_t` | IR type. | +| `uint32_t` | Byte size of the raw IR bytes. | `IB` +| `IB` | Raw IR bytes. | +|=== + +TODO: Do we need a target-specific blob inside this structure? E.g. for CUDA we + may want to embed the SM version. + + +====== IR types + +The IR types must be one of the following values: + +[cols="3,1"] +|=== +| IR type | Value + +| SPIR-V | 0 +|=== + + +===== Native device code image + +An native device code image contains the binary data for the corresponding +module AOT compiled for a specific device, identified by the architecture +string. + +[cols="1,3,1"] +|=== +| Type | Description | Value variable + +| `uint32_t` | Byte size of the architecture string. | `A` +| `A` | Architecture string. | +| `uint32_t` | Byte size of the native device code image bytes. | `NB` +| NB | Native device code image bytes. | +|=== + + +=== SYCLBIN version changelog + +The SYCLBIN format is subject to change, but any such changes must come with an +increment to the version number in the header and a subsection to this section +describing the change. + +==== Version 1 + + * Initial version of the layout. + + +== Clang driver changes + +The clang driver needs to accept the following new flags: + +[cols="1,3"] +|=== +| Option | Description + +| `-fsyclbin` +| If this option is set, the output of the invocation is a SYCLBIN file with the + .syclbin file extension. This skips the host-compilation invocation of the + typical `-fsycl` pipeline, instead passing the output of the + clang-offloat-packager invocation to clang-linker-wrapper together with the + new `--syclbin` flag. + + Setting this option will override `-fsycl` and `-fsycl-device-only`. + + This option currently requires `--offload-new-driver` to be set. + +| `--offload-ir` +| TODO + +| `--offload-rdc` +| This is an alias of `-fgpu-rdc`. +|=== + +Additionally, `-fsycl-link` should work with .syclbin files. Semantics of how +SYCLBIN files are linked together is yet to be specified. + + +== clang-linker-wrapper changes + +The clang-linker-wrapper is responsible for doing post-processing and linking of +device binaries, as described in +link:OffloadDesign.rst[OffloadDesign.md]. +However, to support SYCLBIN files, the clang-linker-wrapper must be able to +unpack an offload binary (as described in +link:../../clang/docs/ClangOffloadPackager.rst[ClangOffloadPackager.rst]) +directly, instead of extracting it from a host binary. This should be done when +a new flag, `--syclbin`, is passed. In this case, the clang-linker-wrapper is +responsible to package the resulting device binaries and produced metadata into +the format described in <>. Additionally, in this case the +clang-linker-wrapper will skip the wrapping of the device code and the host code +linking stage, as there is no host code to wrap the device code in and link. + +TODO: Describe the details of linking SYCLBIN files. + + +== SYCL runtime library changes + +Using the interfaces from the +link:../extensions/proposed/sycl_ext_oneapi_syclbin.asciidoc[sycl_ext_oneapi_syclbin] +extension, the runtime must be able to parse the SYCLBIN format, as described in +<>. To avoid large amounts of code duplication, the runtime uses +the implementation of SYCLBIN reading and writing implemented in LLVM. + +When creating a `kernel_bundle` from a SYCLBIN file, the runtime reads the +contents of the SYCLBIN file and creates the corresponding data structure from +it. In order for the SYCL runtime library's existing logic to use the binaries, +the runtime then creates a collection of `sycl_device_binary_struct` objects and +its constituents, pointing to the data in the parsed SYCLBIN object. Passing +these objects to the runtime library's `ProgramManager` allows it to reuse the +logic for compiling, linking and building SYCL binaries. + +In the other direction, users can request the "contents" of a `kernel_bundle`. +When this is done, the runtime library must ensure that a SYCLBIN file is +available for the contents of the `kernel_bundle` and must then write the +SYCLBIN object to the corresponding binary representation in the format +described in <>. In cases where the `kernel_bundle` was created +with a SYCLBIN file, the SYCLBIN representation is immediately available and +can be serialized directly. In other cases, the runtime library creates a new +SYCLBIN object from the binaries associated with the `kernel_bundle`, then +serializes it and returns the result. + diff --git a/sycl/doc/design/SYCLBIN_file_format_illustration.svg b/sycl/doc/design/SYCLBIN_file_format_illustration.svg new file mode 100644 index 0000000000000..a274d63a5520d --- /dev/null +++ b/sycl/doc/design/SYCLBIN_file_format_illustration.svg @@ -0,0 +1 @@ +HeaderAbstract module 1Abstract module 2Abstract module N MetadataIR module 1IR module M Native device code image 1Native device code image OSYCLBIN file \ No newline at end of file From 60ff95fd99c1ca3037613776b5bd15b8d2e2db54 Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Tue, 4 Feb 2025 00:16:05 -0800 Subject: [PATCH 02/30] Move to Markdown format Signed-off-by: Larsen, Steffen --- ...YCLBINDesign.asciidoc => SYCLBINDesign.md} | 235 ++++++++---------- 1 file changed, 104 insertions(+), 131 deletions(-) rename sycl/doc/design/{SYCLBINDesign.asciidoc => SYCLBINDesign.md} (51%) diff --git a/sycl/doc/design/SYCLBINDesign.asciidoc b/sycl/doc/design/SYCLBINDesign.md similarity index 51% rename from sycl/doc/design/SYCLBINDesign.asciidoc rename to sycl/doc/design/SYCLBINDesign.md index af6962c6efe04..abea7a447eed9 100644 --- a/sycl/doc/design/SYCLBINDesign.asciidoc +++ b/sycl/doc/design/SYCLBINDesign.md @@ -1,12 +1,11 @@ -= SYCLBIN - A format for separately compiled SYCL device code +# SYCLBIN - A format for separately compiled SYCL device code This design document details the SYCLBIN binary format used for storing SYCL device binaries to be loaded dynamically by the SYCL runtime. It also details how the toolchain produces, links and packages these binaries, as well as how the SYCL runtime library handles files of this format. -[[syclbin_format]] -== SYCLBIN binary format +## SYCLBIN binary format The files produced by the new compilation path will follow the format described in this section. The intention of defining a new format for these is to give @@ -14,14 +13,14 @@ the DPC++ implementation an extendable and lightweight wrapper around the multiple modules and corresponding metadata captured in the SYCLBIN file. The content of the SYCLBIN may be contained as an entry in the offloading binary format produced by the clang-offload-packager, as described in -link:../../clang/docs/ClangOffloadPackager.rst[ClangOffloadPackager.rst]. +[../../clang/docs/ClangOffloadPackager.rst](ClangOffloadPackager.rst). The following illustration gives an overview of how the file format is structured. -image::SYCLBIN_file_format_illustration.svg["SYCLBIN binary file format illustration", width=40%] +![SYCLBIN binary file format illustration](SYCLBIN_file_format_illustration.svg) -=== Header +### Header The header segment appears as the first part of the SYCLBIN file. Like many other file-formats, it defines a magic number to help identify the format, which @@ -29,152 +28,127 @@ is 0x53594249 (or "SYBI".) Immediately following the magic number is the version number, which is used by SYCLBIN consumers when parsing data in the rest of the file. -[cols="1,3,1"] -|=== -| Type | Description | Value variable - -| `uint32_t` | Magic number. (0x53594249) | -| `uint32_t` | SYCLBIN version number. | -| `uint8_t` | `sycl::bundle_state` corresponding to the contents of the SYCLBIN. | -|=== +| Type | Description | Value variable | +| ---------- | ------------------------------------------------------------------ | -------------- | +| `uint32_t` | Magic number. (0x53594249) | | +| `uint32_t` | SYCLBIN version number. | | +| `uint8_t` | `sycl::bundle_state` corresponding to the contents of the SYCLBIN. | | The `sycl::bundle_state` is an integer with the values as follows: -[cols="3,1"] -|=== -| `sycl::bundle_state` | Value - -| `input` | 0 -| `object` | 1 -| `executable` | 2 -|=== +| `sycl::bundle_state` | Value | +| -------------------- | ----- | +| `input` | 0 | +| `object` | 1 | +| `executable` | 2 | -=== Body +### Body Immediately after the header is the body of the SYCLBIN file. The body consists of a list of abstract modules. -[cols="1,3,1"] -|=== -| Type | Description | Value variable +| Type | Description | Value variable | +| ---------- | ------------------------------------------ | -------------- | +| `uint64_t` | Byte size of the list of abstract modules. | `B` | +| `B` | List of abstract modules. | | -| `uint64_t` | Byte size of the list of abstract modules. | `B` -| `B` | List of abstract modules. | -|=== - -==== Abstract module +#### Abstract module Each abstract module represents a set of kernels, the corresponding metadata, 0 or more IR modules containing these kernels, and 0 or more native device code images containing the kernels. -[cols="1,3,1"] -|=== -| Type | Description | Value variable - -| `uint64_t` | Byte size of the list of the metadata. | `M` -| `M` | Module metadata. | -| `uint64_t` | Byte size of list of IR modules. | `IR` -| `IR` | List of IR modules. | -| `uint64_t` | Byte size of list of native device code images. | `ND` -| `ND` | List of native device code images. | -|=== +| Type | Description | Value variable | +| ---------- | ----------------------------------------------- | -------------- | +| `uint64_t` | Byte size of the list of the metadata. | `M` | +| `M` | Module metadata. | | +| `uint64_t` | Byte size of list of IR modules. | `IR` | +| `IR` | List of IR modules. | | +| `uint64_t` | Byte size of list of native device code images. | `ND` | +| `ND` | List of native device code images. | | -===== Module metadata +##### Module metadata The module metadata contains the following information about the contents of the module. -[cols="1,3,1"] -|=== -| Type | Description | Value variable +| Type | Description | Value variable | +| ---------- | -------------------------------------------------------------- | -------------- | +| `uint32_t` | Byte size of the list of kernel names. | `K` | +| `K` | List of kernel names. (String list) | | +| `uint32_t` | Byte size of the list of imported symbols. | `I` | +| `I` | List of imported symbols. (String list) | | +| `uint32_t` | Byte size of the list of exported symbols. | `E` | +| `E` | List of exported symbols. (String list) | | +| `uint32_t` | Byte size of property set data. | `P` | +| `P` | Property set data. | | -| `uint32_t` | Byte size of the list of kernel names. | `K` -| `K` | List of kernel names. (String list) | -| `uint32_t` | Byte size of the list of imported symbols. | `I` -| `I` | List of imported symbols. (String list) | -| `uint32_t` | Byte size of the list of exported symbols. | `E` -| `E` | List of exported symbols. (String list) | -| `uint32_t` | Byte size of property set data. | `P` -| `P` | Property set data. | -|=== +*NOTE:* Optional features used is embedded in the property set data. +*TODO:* Consolidate and/or document the property set data in this document. -NOTE: Optional features used is embedded in the property set data. -TODO: Consolidate and/or document the property set data in this document. - -====== String list +##### String list A string list in this binary format consists of a `uint32_t` at the beginning containing the number of elements in the list, followed by that number of entries with the format: -[cols="1,3,1"] -|=== -| Type | Description | Value variable - -| `uint32_t` | Byte size of the string. | `S` -| `S` | String bytes. | -|=== +| Type | Description | Value variable | +| ---------- | ------------------------ | -------------- | +| `uint32_t` | Byte size of the string. | `S` | +| `S` | String bytes. | | -===== IR module +##### IR module An IR module contains the binary data for the corresponding module compiled to a given IR representation, identified by the IR type field. -[cols="1,3,1"] -|=== -| Type | Description | Value variable - -| `uint8_t` | IR type. | -| `uint32_t` | Byte size of the raw IR bytes. | `IB` -| `IB` | Raw IR bytes. | -|=== +| Type | Description | Value variable | +| ---------- | ------------------------------ | -------------- | +| `uint8_t` | IR type. | | +| `uint32_t` | Byte size of the raw IR bytes. | `IB` | +| `IB` | Raw IR bytes. | | -TODO: Do we need a target-specific blob inside this structure? E.g. for CUDA we - may want to embed the SM version. +*TODO:* Do we need a target-specific blob inside this structure? E.g. for CUDA + we may want to embed the SM version. -====== IR types +##### IR types The IR types must be one of the following values: -[cols="3,1"] -|=== -| IR type | Value +| IR type | Value | +| ------- | ----- | +| SPIR-V | 0 | +| PTX | 1 | +| AMDGCN | 2 | -| SPIR-V | 0 -|=== - -===== Native device code image +##### Native device code image An native device code image contains the binary data for the corresponding module AOT compiled for a specific device, identified by the architecture string. -[cols="1,3,1"] -|=== -| Type | Description | Value variable - -| `uint32_t` | Byte size of the architecture string. | `A` -| `A` | Architecture string. | -| `uint32_t` | Byte size of the native device code image bytes. | `NB` -| NB | Native device code image bytes. | -|=== +| Type | Description | Value variable | +| ---------- | ------------------------------------------------ | -------------- | +| `uint32_t` | Byte size of the architecture string. | `A` | +| `A` | Architecture string. | | +| `uint32_t` | Byte size of the native device code image bytes. | `NB` | +| NB | Native device code image bytes. | | -=== SYCLBIN version changelog +### SYCLBIN version changelog The SYCLBIN format is subject to change, but any such changes must come with an increment to the version number in the header and a subsection to this section describing the change. -==== Version 1 +#### Version 1 * Initial version of the layout. @@ -183,33 +157,30 @@ describing the change. The clang driver needs to accept the following new flags: -[cols="1,3"] -|=== -| Option | Description - -| `-fsyclbin` -| If this option is set, the output of the invocation is a SYCLBIN file with the - .syclbin file extension. This skips the host-compilation invocation of the - typical `-fsycl` pipeline, instead passing the output of the - clang-offloat-packager invocation to clang-linker-wrapper together with the - new `--syclbin` flag. - - Setting this option will override `-fsycl` and `-fsycl-device-only`. - - This option currently requires `--offload-new-driver` to be set. - -| `--offload-ir` -| TODO - -| `--offload-rdc` -| This is an alias of `-fgpu-rdc`. -|=== +| Option | Description | +| --------------- | ---------------------------------------------------------------- | +| `-fsyclbin` | If this option is set, the output of the invocation is a SYCLBIN | +| | file with the .syclbin file extension. This skips the | +| | host-compilation invocation of the typical `-fsycl` pipeline, | +| | instead passing the output of the clang-offloat-packager | +| | invocation to clang-linker-wrapper together with the new | +| | `--syclbin` flag. | +| | | +| | Setting this option will override `-fsycl` and | +| |`-fsycl-device-only`. | +| | | +| | This option currently requires `--offload-new-driver` to be set. | +| --------------- | ---------------------------------------------------------------- | +| `--offload-ir` | *TODO* | +| --------------- | ---------------------------------------------------------------- | +| `--offload-rdc` | This is an alias of `-fgpu-rdc`. | +| --------------- | ---------------------------------------------------------------- | Additionally, `-fsycl-link` should work with .syclbin files. Semantics of how SYCLBIN files are linked together is yet to be specified. -== clang-linker-wrapper changes +## clang-linker-wrapper changes The clang-linker-wrapper is responsible for doing post-processing and linking of device binaries, as described in @@ -220,20 +191,22 @@ link:../../clang/docs/ClangOffloadPackager.rst[ClangOffloadPackager.rst]) directly, instead of extracting it from a host binary. This should be done when a new flag, `--syclbin`, is passed. In this case, the clang-linker-wrapper is responsible to package the resulting device binaries and produced metadata into -the format described in <>. Additionally, in this case the -clang-linker-wrapper will skip the wrapping of the device code and the host code -linking stage, as there is no host code to wrap the device code in and link. +the format described in [SYCLBIN binary format section](#syclbin_format). +Additionally, in this case the clang-linker-wrapper will skip the wrapping of +the device code and the host code linking stage, as there is no host code to +wrap the device code in and link. -TODO: Describe the details of linking SYCLBIN files. +*TODO:* Describe the details of linking SYCLBIN files. -== SYCL runtime library changes +## SYCL runtime library changes Using the interfaces from the -link:../extensions/proposed/sycl_ext_oneapi_syclbin.asciidoc[sycl_ext_oneapi_syclbin] +[../extensions/proposed/sycl_ext_oneapi_syclbin.asciidoc](sycl_ext_oneapi_syclbin) extension, the runtime must be able to parse the SYCLBIN format, as described in -<>. To avoid large amounts of code duplication, the runtime uses -the implementation of SYCLBIN reading and writing implemented in LLVM. +the [SYCLBIN binary format section](#syclbin_format). To avoid large amounts of +code duplication, the runtime uses the implementation of SYCLBIN reading and +writing implemented in LLVM. When creating a `kernel_bundle` from a SYCLBIN file, the runtime reads the contents of the SYCLBIN file and creates the corresponding data structure from @@ -247,9 +220,9 @@ In the other direction, users can request the "contents" of a `kernel_bundle`. When this is done, the runtime library must ensure that a SYCLBIN file is available for the contents of the `kernel_bundle` and must then write the SYCLBIN object to the corresponding binary representation in the format -described in <>. In cases where the `kernel_bundle` was created -with a SYCLBIN file, the SYCLBIN representation is immediately available and -can be serialized directly. In other cases, the runtime library creates a new -SYCLBIN object from the binaries associated with the `kernel_bundle`, then -serializes it and returns the result. +described in the [SYCLBIN binary format section](#syclbin_format). In cases +where the `kernel_bundle` was created with a SYCLBIN file, the SYCLBIN +representation is immediately available and can be serialized directly. In other +cases, the runtime library creates a new SYCLBIN object from the binaries +associated with the `kernel_bundle`, then serializes it and returns the result. From b54afa8582531b908dd312ea2d01aacd1c7c29e8 Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Tue, 4 Feb 2025 00:26:21 -0800 Subject: [PATCH 03/30] Fix tables, links and titles Signed-off-by: Larsen, Steffen --- sycl/doc/design/SYCLBINDesign.md | 56 ++++++++++++++++++-------------- 1 file changed, 32 insertions(+), 24 deletions(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index abea7a447eed9..8bf5801d03525 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -13,7 +13,7 @@ the DPC++ implementation an extendable and lightweight wrapper around the multiple modules and corresponding metadata captured in the SYCLBIN file. The content of the SYCLBIN may be contained as an entry in the offloading binary format produced by the clang-offload-packager, as described in -[../../clang/docs/ClangOffloadPackager.rst](ClangOffloadPackager.rst). +[ClangOffloadPackager.rst](../../clang/docs/ClangOffloadPackager.rst). The following illustration gives an overview of how the file format is structured. @@ -153,28 +153,37 @@ describing the change. * Initial version of the layout. -== Clang driver changes +## Clang driver changes The clang driver needs to accept the following new flags: -| Option | Description | -| --------------- | ---------------------------------------------------------------- | -| `-fsyclbin` | If this option is set, the output of the invocation is a SYCLBIN | -| | file with the .syclbin file extension. This skips the | -| | host-compilation invocation of the typical `-fsycl` pipeline, | -| | instead passing the output of the clang-offloat-packager | -| | invocation to clang-linker-wrapper together with the new | -| | `--syclbin` flag. | -| | | -| | Setting this option will override `-fsycl` and | -| |`-fsycl-device-only`. | -| | | -| | This option currently requires `--offload-new-driver` to be set. | -| --------------- | ---------------------------------------------------------------- | -| `--offload-ir` | *TODO* | -| --------------- | ---------------------------------------------------------------- | -| `--offload-rdc` | This is an alias of `-fgpu-rdc`. | -| --------------- | ---------------------------------------------------------------- | + + + + + + + + + + + + + + + + + +
OptionDescription
`-fsyclbin` +If this option is set, the output of the invocation is a SYCLBIN file with the +.syclbin file extension. This skips the host-compilation invocation of the typical +`-fsycl` pipeline, instead passing the output of the clang-offloat-packager +invocation to clang-linker-wrapper together with the new `--syclbin` flag. + +Setting this option will override `-fsycl` and `-fsycl-device-only`. + +This option currently requires `--offload-new-driver` to be set. +
`--offload-ir`*TODO*
`--offload-rdc`This is an alias of `-fgpu-rdc`.
Additionally, `-fsycl-link` should work with .syclbin files. Semantics of how SYCLBIN files are linked together is yet to be specified. @@ -183,11 +192,10 @@ SYCLBIN files are linked together is yet to be specified. ## clang-linker-wrapper changes The clang-linker-wrapper is responsible for doing post-processing and linking of -device binaries, as described in -link:OffloadDesign.rst[OffloadDesign.md]. +device binaries, as described in [OffloadDesign.md](OffloadDesign.md). However, to support SYCLBIN files, the clang-linker-wrapper must be able to unpack an offload binary (as described in -link:../../clang/docs/ClangOffloadPackager.rst[ClangOffloadPackager.rst]) +[ClangOffloadPackager.rst](../../clang/docs/ClangOffloadPackager.rst)) directly, instead of extracting it from a host binary. This should be done when a new flag, `--syclbin`, is passed. In this case, the clang-linker-wrapper is responsible to package the resulting device binaries and produced metadata into @@ -202,7 +210,7 @@ wrap the device code in and link. ## SYCL runtime library changes Using the interfaces from the -[../extensions/proposed/sycl_ext_oneapi_syclbin.asciidoc](sycl_ext_oneapi_syclbin) +[sycl_ext_oneapi_syclbin](../extensions/proposed/sycl_ext_oneapi_syclbin.asciidoc) extension, the runtime must be able to parse the SYCLBIN format, as described in the [SYCLBIN binary format section](#syclbin_format). To avoid large amounts of code duplication, the runtime uses the implementation of SYCLBIN reading and From 12a6cad4102a9e52730ce67a4fbc29e955a2ff54 Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Tue, 4 Feb 2025 03:18:46 -0800 Subject: [PATCH 04/30] Fix xrefs Signed-off-by: Larsen, Steffen --- sycl/doc/design/SYCLBINDesign.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index 8bf5801d03525..0cced1df097cd 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -5,7 +5,8 @@ device binaries to be loaded dynamically by the SYCL runtime. It also details how the toolchain produces, links and packages these binaries, as well as how the SYCL runtime library handles files of this format. -## SYCLBIN binary format +(syclbin_format)= +## SYCLBIN binary format The files produced by the new compilation path will follow the format described in this section. The intention of defining a new format for these is to give @@ -13,7 +14,7 @@ the DPC++ implementation an extendable and lightweight wrapper around the multiple modules and corresponding metadata captured in the SYCLBIN file. The content of the SYCLBIN may be contained as an entry in the offloading binary format produced by the clang-offload-packager, as described in -[ClangOffloadPackager.rst](../../clang/docs/ClangOffloadPackager.rst). +[ClangOffloadPackager.rst](../../../clang/docs/ClangOffloadPackager.rst). The following illustration gives an overview of how the file format is structured. @@ -195,7 +196,7 @@ The clang-linker-wrapper is responsible for doing post-processing and linking of device binaries, as described in [OffloadDesign.md](OffloadDesign.md). However, to support SYCLBIN files, the clang-linker-wrapper must be able to unpack an offload binary (as described in -[ClangOffloadPackager.rst](../../clang/docs/ClangOffloadPackager.rst)) +[ClangOffloadPackager.rst](../../../clang/docs/ClangOffloadPackager.rst)) directly, instead of extracting it from a host binary. This should be done when a new flag, `--syclbin`, is passed. In this case, the clang-linker-wrapper is responsible to package the resulting device binaries and produced metadata into From 0aad200a1aba9c97d2c2dbccd441630767248d6f Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Tue, 4 Feb 2025 03:41:43 -0800 Subject: [PATCH 05/30] Use link for clang design Signed-off-by: Larsen, Steffen --- sycl/doc/design/SYCLBINDesign.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index 0cced1df097cd..4a8fe1b27a377 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -14,7 +14,7 @@ the DPC++ implementation an extendable and lightweight wrapper around the multiple modules and corresponding metadata captured in the SYCLBIN file. The content of the SYCLBIN may be contained as an entry in the offloading binary format produced by the clang-offload-packager, as described in -[ClangOffloadPackager.rst](../../../clang/docs/ClangOffloadPackager.rst). +[ClangOffloadPackager.rst](https://github.com/intel/llvm/blob/sycl/clang/docs/ClangOffloadPackager.rst). The following illustration gives an overview of how the file format is structured. @@ -196,7 +196,7 @@ The clang-linker-wrapper is responsible for doing post-processing and linking of device binaries, as described in [OffloadDesign.md](OffloadDesign.md). However, to support SYCLBIN files, the clang-linker-wrapper must be able to unpack an offload binary (as described in -[ClangOffloadPackager.rst](../../../clang/docs/ClangOffloadPackager.rst)) +[ClangOffloadPackager.rst](https://github.com/intel/llvm/blob/sycl/clang/docs/ClangOffloadPackager.rst)) directly, instead of extracting it from a host binary. This should be done when a new flag, `--syclbin`, is passed. In this case, the clang-linker-wrapper is responsible to package the resulting device binaries and produced metadata into From 1277bd96230f42d0a9651224b3ed69410f75f6e7 Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Wed, 5 Feb 2025 01:11:23 -0800 Subject: [PATCH 06/30] Address first set of comments Signed-off-by: Larsen, Steffen --- sycl/doc/design/SYCLBINDesign.md | 51 ++++++++++--------- .../SYCLBIN_file_format_illustration.svg | 2 +- 2 files changed, 27 insertions(+), 26 deletions(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index 4a8fe1b27a377..83ab963db84dc 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -5,7 +5,6 @@ device binaries to be loaded dynamically by the SYCL runtime. It also details how the toolchain produces, links and packages these binaries, as well as how the SYCL runtime library handles files of this format. -(syclbin_format)= ## SYCLBIN binary format The files produced by the new compilation path will follow the format described @@ -33,6 +32,14 @@ file. | ---------- | ------------------------------------------------------------------ | -------------- | | `uint32_t` | Magic number. (0x53594249) | | | `uint32_t` | SYCLBIN version number. | | + +#### Global metadata + +Immediately after the header is the global metadata segment of the SYCLBIN, +containing information about the contained SYCLBIN file. + +| Type | Description | Value variable | +| ---------- | ------------------------------------------------------------------ | -------------- | | `uint8_t` | `sycl::bundle_state` corresponding to the contents of the SYCLBIN. | | The `sycl::bundle_state` is an integer with the values as follows: @@ -46,7 +53,7 @@ The `sycl::bundle_state` is an integer with the values as follows: ### Body -Immediately after the header is the body of the SYCLBIN file. The body consists +Following the global metadata is the body of the SYCLBIN file. The body consists of a list of abstract modules. | Type | Description | Value variable | @@ -57,6 +64,17 @@ of a list of abstract modules. #### Abstract module +An abstract module is a collection of device binaries that share properties, +including, but not limited to: kernel names, imported symbols, exported symbols, +aspect requirements, and specialization constants. + +The device binaries contained inside an abstract module must either be an IR +module or a native device code image. IR modules contain device binaries in some +known intermediate representation, such as SPIR-V, while the native device code +images can be an architecture-specific binary format. There is no requirement +that all device binaries in an abstract module is usable on the same device or +are specific to a single vendor. + Each abstract module represents a set of kernels, the corresponding metadata, 0 or more IR modules containing these kernels, and 0 or more native device code images containing the kernels. @@ -78,12 +96,6 @@ module. | Type | Description | Value variable | | ---------- | -------------------------------------------------------------- | -------------- | -| `uint32_t` | Byte size of the list of kernel names. | `K` | -| `K` | List of kernel names. (String list) | | -| `uint32_t` | Byte size of the list of imported symbols. | `I` | -| `I` | List of imported symbols. (String list) | | -| `uint32_t` | Byte size of the list of exported symbols. | `E` | -| `E` | List of exported symbols. (String list) | | | `uint32_t` | Byte size of property set data. | `P` | | `P` | Property set data. | | @@ -91,17 +103,6 @@ module. *NOTE:* Optional features used is embedded in the property set data. *TODO:* Consolidate and/or document the property set data in this document. -##### String list - -A string list in this binary format consists of a `uint32_t` at the beginning -containing the number of elements in the list, followed by that number of -entries with the format: - -| Type | Description | Value variable | -| ---------- | ------------------------ | -------------- | -| `uint32_t` | Byte size of the string. | `S` | -| `S` | String bytes. | | - ##### IR module @@ -168,7 +169,7 @@ The clang driver needs to accept the following new flags: If this option is set, the output of the invocation is a SYCLBIN file with the .syclbin file extension. This skips the host-compilation invocation of the typical -`-fsycl` pipeline, instead passing the output of the clang-offloat-packager +`-fsycl` pipeline, instead passing the output of the clang-offload-packager invocation to clang-linker-wrapper together with the new `--syclbin` flag. Setting this option will override `-fsycl` and `-fsycl-device-only`. @@ -200,7 +201,7 @@ unpack an offload binary (as described in directly, instead of extracting it from a host binary. This should be done when a new flag, `--syclbin`, is passed. In this case, the clang-linker-wrapper is responsible to package the resulting device binaries and produced metadata into -the format described in [SYCLBIN binary format section](#syclbin_format). +the format described in [SYCLBIN binary format section](#syclbin-binary-format). Additionally, in this case the clang-linker-wrapper will skip the wrapping of the device code and the host code linking stage, as there is no host code to wrap the device code in and link. @@ -213,9 +214,9 @@ wrap the device code in and link. Using the interfaces from the [sycl_ext_oneapi_syclbin](../extensions/proposed/sycl_ext_oneapi_syclbin.asciidoc) extension, the runtime must be able to parse the SYCLBIN format, as described in -the [SYCLBIN binary format section](#syclbin_format). To avoid large amounts of -code duplication, the runtime uses the implementation of SYCLBIN reading and -writing implemented in LLVM. +the [SYCLBIN binary format section](#syclbin-binary-format). To avoid large +amounts of code duplication, the runtime uses the implementation of SYCLBIN +reading and writing implemented in LLVM. When creating a `kernel_bundle` from a SYCLBIN file, the runtime reads the contents of the SYCLBIN file and creates the corresponding data structure from @@ -229,7 +230,7 @@ In the other direction, users can request the "contents" of a `kernel_bundle`. When this is done, the runtime library must ensure that a SYCLBIN file is available for the contents of the `kernel_bundle` and must then write the SYCLBIN object to the corresponding binary representation in the format -described in the [SYCLBIN binary format section](#syclbin_format). In cases +described in the [SYCLBIN binary format section](#syclbin-binary-format). In cases where the `kernel_bundle` was created with a SYCLBIN file, the SYCLBIN representation is immediately available and can be serialized directly. In other cases, the runtime library creates a new SYCLBIN object from the binaries diff --git a/sycl/doc/design/SYCLBIN_file_format_illustration.svg b/sycl/doc/design/SYCLBIN_file_format_illustration.svg index a274d63a5520d..e6c9d5788c522 100644 --- a/sycl/doc/design/SYCLBIN_file_format_illustration.svg +++ b/sycl/doc/design/SYCLBIN_file_format_illustration.svg @@ -1 +1 @@ -HeaderAbstract module 1Abstract module 2Abstract module N MetadataIR module 1IR module M Native device code image 1Native device code image OSYCLBIN file \ No newline at end of file +HeaderAbstract module 1Abstract module 2Abstract module N MetadataIR module 1IR module M Native device code image 1Native device code image OSYCLBIN imageHeaderMetadataRaw bytesMetadataMetadataRaw bytes \ No newline at end of file From 1d2b5c8db8d551e1827ef1842d95b8f761e785c0 Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Wed, 5 Feb 2025 01:13:25 -0800 Subject: [PATCH 07/30] Remove redundant description Signed-off-by: Larsen, Steffen --- sycl/doc/design/SYCLBINDesign.md | 4 ---- 1 file changed, 4 deletions(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index 83ab963db84dc..476a317685c41 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -75,10 +75,6 @@ images can be an architecture-specific binary format. There is no requirement that all device binaries in an abstract module is usable on the same device or are specific to a single vendor. -Each abstract module represents a set of kernels, the corresponding metadata, 0 -or more IR modules containing these kernels, and 0 or more native device code -images containing the kernels. - | Type | Description | Value variable | | ---------- | ----------------------------------------------- | -------------- | | `uint64_t` | Byte size of the list of the metadata. | `M` | From 73b9c072242c92e986c6c3113397bc5f5a78783b Mon Sep 17 00:00:00 2001 From: Steffen Larsen Date: Thu, 6 Feb 2025 07:54:27 +0100 Subject: [PATCH 08/30] Update sycl/doc/design/SYCLBINDesign.md Co-authored-by: Michael Toguchi --- sycl/doc/design/SYCLBINDesign.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index 476a317685c41..07a66355d22a9 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -170,7 +170,9 @@ invocation to clang-linker-wrapper together with the new `--syclbin` flag. Setting this option will override `-fsycl` and `-fsycl-device-only`. -This option currently requires `--offload-new-driver` to be set. +The behavior is dependent on using the clang-linker-wrapper. As the current default +offload compilation behavior is using the old offload model (driver based), this +option currently requires `--offload-new-driver` to be set. From c7c1512a6c6328abacdebfc23e52928532f10cff Mon Sep 17 00:00:00 2001 From: Steffen Larsen Date: Thu, 6 Feb 2025 07:54:51 +0100 Subject: [PATCH 09/30] Update sycl/doc/design/SYCLBINDesign.md Co-authored-by: Greg Lueck --- sycl/doc/design/SYCLBINDesign.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index 07a66355d22a9..1d475cb65d6be 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -72,7 +72,7 @@ The device binaries contained inside an abstract module must either be an IR module or a native device code image. IR modules contain device binaries in some known intermediate representation, such as SPIR-V, while the native device code images can be an architecture-specific binary format. There is no requirement -that all device binaries in an abstract module is usable on the same device or +that all device binaries in an abstract module are usable on the same device or are specific to a single vendor. | Type | Description | Value variable | From edca48ecd0e07653563b4a4b7ca090f85fd1a106 Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Thu, 6 Feb 2025 03:20:02 -0800 Subject: [PATCH 10/30] Add kernel names back and fix array types Signed-off-by: Larsen, Steffen --- sycl/doc/design/SYCLBINDesign.md | 68 +++++++++++++++++++------------- 1 file changed, 41 insertions(+), 27 deletions(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index 1d475cb65d6be..2ca4f253a3f00 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -56,10 +56,10 @@ The `sycl::bundle_state` is an integer with the values as follows: Following the global metadata is the body of the SYCLBIN file. The body consists of a list of abstract modules. -| Type | Description | Value variable | -| ---------- | ------------------------------------------ | -------------- | -| `uint64_t` | Byte size of the list of abstract modules. | `B` | -| `B` | List of abstract modules. | | +| Type | Description | Value variable | +| ------------ | ------------------------------------------ | -------------- | +| `uint64_t` | Byte size of the list of abstract modules. | `B` | +| `uint8_t[B]` | List of abstract modules. | | #### Abstract module @@ -75,14 +75,14 @@ images can be an architecture-specific binary format. There is no requirement that all device binaries in an abstract module are usable on the same device or are specific to a single vendor. -| Type | Description | Value variable | -| ---------- | ----------------------------------------------- | -------------- | -| `uint64_t` | Byte size of the list of the metadata. | `M` | -| `M` | Module metadata. | | -| `uint64_t` | Byte size of list of IR modules. | `IR` | -| `IR` | List of IR modules. | | -| `uint64_t` | Byte size of list of native device code images. | `ND` | -| `ND` | List of native device code images. | | +| Type | Description | Value variable | +| ------------- | ----------------------------------------------- | -------------- | +| `uint64_t` | Byte size of the list of the metadata. | `M` | +| `uint8_t[M]` | Module metadata. | | +| `uint64_t` | Byte size of list of IR modules. | `IR` | +| `uint8_t[IR]` | List of IR modules. | | +| `uint64_t` | Byte size of list of native device code images. | `ND` | +| `uint8_t[ND]` | List of native device code images. | | ##### Module metadata @@ -90,10 +90,12 @@ are specific to a single vendor. The module metadata contains the following information about the contents of the module. -| Type | Description | Value variable | -| ---------- | -------------------------------------------------------------- | -------------- | -| `uint32_t` | Byte size of property set data. | `P` | -| `P` | Property set data. | | +| Type | Description | Value variable | +| ------------ | -------------------------------------------------------------- | -------------- | +| `uint32_t` | Byte size of the list of kernel names. | `K` | +| `uint8_t[K]` | List of kernel names. (String list) | | +| `uint32_t` | Byte size of property set data. | `P` | +| `uint8_t[P]` | Property set data. | | *NOTE:* Optional features used is embedded in the property set data. @@ -105,16 +107,28 @@ module. An IR module contains the binary data for the corresponding module compiled to a given IR representation, identified by the IR type field. -| Type | Description | Value variable | -| ---------- | ------------------------------ | -------------- | -| `uint8_t` | IR type. | | -| `uint32_t` | Byte size of the raw IR bytes. | `IB` | -| `IB` | Raw IR bytes. | | +| Type | Description | Value variable | +| ------------- | ------------------------------ | -------------- | +| `uint8_t` | IR type. | | +| `uint32_t` | Byte size of the raw IR bytes. | `IB` | +| `uint8_t[IB]` | Raw IR bytes. | | *TODO:* Do we need a target-specific blob inside this structure? E.g. for CUDA we may want to embed the SM version. +##### String list + +A string list in this binary format consists of a `uint32_t` at the beginning +containing the number of elements in the list, followed by that number of +entries with the format: + +| Type | Description | Value variable | +| ------------ | ------------------------ | -------------- | +| `uint32_t` | Byte size of the string. | `S` | +| `uint8_t[S]` | String bytes. | | + + ##### IR types The IR types must be one of the following values: @@ -132,12 +146,12 @@ An native device code image contains the binary data for the corresponding module AOT compiled for a specific device, identified by the architecture string. -| Type | Description | Value variable | -| ---------- | ------------------------------------------------ | -------------- | -| `uint32_t` | Byte size of the architecture string. | `A` | -| `A` | Architecture string. | | -| `uint32_t` | Byte size of the native device code image bytes. | `NB` | -| NB | Native device code image bytes. | | +| Type | Description | Value variable | +| ------------- | ------------------------------------------------ | -------------- | +| `uint32_t` | Byte size of the architecture string. | `A` | +| `uint8_t[A]` | Architecture string. | | +| `uint32_t` | Byte size of the native device code image bytes. | `NB` | +| `uint8_t[NB]` | Native device code image bytes. | | ### SYCLBIN version changelog From 1361d4851589b68fe96d731fdbbb7c3f19a43d47 Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Mon, 24 Feb 2025 08:36:53 -0800 Subject: [PATCH 11/30] Switch to headers-based structure and add property set design document Signed-off-by: Larsen, Steffen --- sycl/doc/design/PropertySets.md | 245 ++++++++++++++++++ sycl/doc/design/SYCLBINDesign.md | 198 +++++++------- .../SYCLBIN_file_format_illustration.svg | 1 - 3 files changed, 355 insertions(+), 89 deletions(-) create mode 100644 sycl/doc/design/PropertySets.md delete mode 100644 sycl/doc/design/SYCLBIN_file_format_illustration.svg diff --git a/sycl/doc/design/PropertySets.md b/sycl/doc/design/PropertySets.md new file mode 100644 index 0000000000000..42db3593e2b3a --- /dev/null +++ b/sycl/doc/design/PropertySets.md @@ -0,0 +1,245 @@ +# SYCL binary property sets + +To communicate information about SYCL binaries to the SYCL runtime, the +implementation produces sets of properties. The intention of this design +document is to describe the structure of the property sets and define the +representation and meaning of pre-defined property set names. + +## Property sets structure + +A property set consists of a reserved name, enclosed in square brackets, +followed by a series of string key and value pairs. The set name and each entry +in the set are separated by a newline. + +The string key and value pairs have the following format: +``` +=| +``` + +The value type is a string and the value of it has the following meaning for the +corresponding value: + +| Value type | Description | +| ---------- | ----------------------------------------- | +| "0" | The value has no known type. | +| "1" | The value is a 32 bit integer. | +| "2" | The value is a base64 encoded byte array. | + +### Property sets + +This section describes the known property sets. + +#### SYCL/specialization constants + +__Key:__ Specialization constant name. + +__Value type:__ Byte array. ("2") + +__Value:__ Information about the specialization constant with the following +fields: + +```c++ +// Encodes ID of a scalar specialization constants which is a leaf of some +// composite specialization constant. +unsigned ID; +// Encodes offset from the beginning of composite, where scalar resides, i.e. +// location of the scalar value within a byte-array containing the whole +// composite specialization constant. If descriptor is used to represent a +// whole scalar specialization constant instead of an element of a composite, +// this field should be contain zero. +unsigned Offset; +// Encodes size of scalar specialization constant. +unsigned Size; +``` + + +#### SYCL/specialization constants default values + +__Key:__ Specialization constant name. + +__Value type:__ Byte array. ("2") + +__Value:__ Byte representation of the default value for the specialization +constant. + + +#### SYCL/devicelib req mask + +__Key:__ At most one entry with "DeviceLibReqMask". + +__Value type:__ 32 bit integer. ("1") + +__Value:__ A bitmask of which device libraries the binary uses. + + +#### SYCL/kernel param opt + +__Key:__ Kernel name. + +__Value type:__ Byte array. ("2") + +__Value:__ A bitmask identifying the arguments of the kernel that have been +removed by the dead-argument-elimination optimization pass. + + +#### SYCL/program metadata + +__Key:__ An arbitrary metadata key. This is often some identifier, such as a +kernel name, followed by a '@' and some metadata identifier. + +__Value type:__ Byte array. ("2") + +__Value:__ Unspecified. Depends on the metadata key. + + +#### SYCL/misc properties + +Miscellaneous properties: + +| Key | Value type | Value | +| ------------------------------- | --------------------- | --------------------------------------------------------------------------------------------- | +| "isEsimdImage" | 32 bit integer. ("1") | 1 if the image is ESIMD and 0 otherwise. | +| "sycl-register-alloc-mode" | 32 bit integer. ("1") | The register allocation mode: 0 for automatic and 2 for large. | +| "sycl-grf-size" | 32 bit integer. ("1") | The GRF size. Automatic if 0. | +| "optLevel" | 32 bit integer. ("1") | Optimization level, corresponding to the `-O` option used during compilation. | +| "sanUsed" | Byte array. ("2") | Specifying if address sanitization ("asan") or memory sanitization ("msan") is used. | +| "specConstsReplacedWithDefault" | 32 bit integer. ("1") | 1 if the specialization constants have been replaced by their default values and 0 otherwise. | + + +#### SYCL/assert used + +__Key:__ Kernel name. + +__Value type:__ 32 bit integer. ("1") + +__Value:__ 1. The key will not be in the set unless the kernel uses assertions. + + +#### SYCL/exported symbols + +__Key:__ Symbol name. + +__Value type:__ 32 bit integer. ("1") + +__Value:__ 1. The key will not be in the set unless the symbols is exported. + + +#### SYCL/imported symbols + +__Key:__ Symbol name. + +__Value type:__ 32 bit integer. ("1") + +__Value:__ 1. The key will not be in the set unless the symbols is exported. + + +#### SYCL/device globals + +__Key:__ Device global variable name. + +__Value type:__ Byte array. ("2") + +__Value:__ Information about the device global variable with the following +fields: + +```c++ + // Encodes size of the underlying type T of the device global variable. + uint32_t Size; + + // Either 1 (true) or 0 (false), telling whether the device global variable + // was declared with the device_image_scope property. + // We use uint32_t for a boolean value to eliminate padding after the field + // and suppress false positive reports from MemorySanitizer. + uint32_t DeviceImageScope; +``` + + +#### SYCL/device requirements + +Set of device requirements for the entire module: + +| Key | Value type | Value | +| ------------------------------- | ----------------- | -------------------------------------------------------------------------------------------------------------------------------- | +| "aspects" | Byte array. ("2") | A collection of 32 bit integers representing the SYCL aspects used. These correspond 1:1 with the enum values of `sycl::aspect`. | +| "fixed_target" | Byte array. ("2") | The string literals specified in `-fsycl-fixed-targets`. | +| "reqd_work_group_size_uint64_t" | Byte array. ("2") | At most three 64 bit unsigned integers representing the required work group size. | +| "joint_matrix" | Byte array. ("2") | A string containing a semi-colon-separated list of comma-separated descriptors for used matrices. The descriptors in the order they appear are:
  • sycl-joint-matrix-type
  • sycl-joint-matrix-use
  • sycl-joint-matrix-rows
  • sycl-joint-matrix-cols
| +| "joint_matrix_mad" | Byte array. ("2") | A string containing a semi-colon-separated list of comma-separated descriptors for used matrix MAD operations. The descriptors in the order they appear are:
  • sycl-joint-matrix-mad-type-A
  • sycl-joint-matrix-mad-type-B
  • sycl-joint-matrix-mad-type-C
  • sycl-joint-matrix-mad-type-D
  • sycl-joint-matrix-mad-size-M
  • sycl-joint-matrix-mad-size-K
  • sycl-joint-matrix-mad-size-N
| +| "reqd_sub_group_size" | Byte array. ("2") | At most three 32 bit unsigned integers representing the required sub-group size. | +| "work_group_num_dim" | Byte array. ("2") | At most three 32 bit unsigned integers representing the work group dimensionality. | + + +See also [OptionalDeviceFeatures.md](OptionalDeviceFeatures.md). + + +#### SYCL/host pipes + +__Key:__ Host pipe variable name. + +__Value type:__ Byte array. ("2") + +__Value:__ Information about the host pipe variable with the following +fields: + +```c++ + // Encodes size of the underlying type T of the host pipe variable. + uint32_t Size; +``` + + +#### SYCL/virtual functions + +Set of information about virtual function usage in the module. + +| Key | Value type | Value | +| ---------------------------- | ----------------- | ------------------------------------------------------------------------------------------- | +| "virtual-functions-set" | Byte array. ("2") | A string identifying the set of virtual functions contained in the module. | +| "uses-virtual-functions-set" | Byte array. ("2") | A string containing a comma-separated list of sets of virtual functions used by the module. | + + +#### SYCL/implicit local arg + +__Key:__ Kernel name. + +__Value type:__ 32 bit integer. ("1") + +__Value:__ Index of the implicit local memory argument. + + +#### SYCL/registered kernels + +__Key:__ "Registered" kernel name. + +__Value type:__ Byte array. ("2") + +__Value:__ The name of the kernel corresponding to the registered kernel name. + + +#### SYCLBIN/global metadata + +Set of global information about a SYCLBIN file. + +| Key | Value type | Value | +| ------- | --------------------- | ----- | +| "state" | 32 bit integer. ("1") | Integer representation of one of the possible states of the file, corresponding to the `sycl::bundle_state` enum. It must be one of the following:
  1. `sycl::bundle_state::input`
  2. `sycl::bundle_state::object`
  3. `sycl::bundle_state::executable`
| + + +#### SYCLBIN/ir module metadata + +Set of information about an IR module in a SYCLBIN file. + +| Key | Value type | Value | +| ------ | --------------------- | ----- | +| "type" | 32 bit integer. ("1") | Integer representation of one of the pre-defined IR types. It must be one of the following:
  1. SPIR-V
  2. PTX
  3. AMDGCN
| + + +#### SYCLBIN/native device code image metadata + +Set of information about an native device code image in a SYCLBIN file. + +| Key | Value type | Value | +| ------ | ----------------- | ----------------------------------------------------- | +| "arch" | Byte array. ("2") | A string representing the architecture of the binary. | + + + diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index 2ca4f253a3f00..26c95c2f4f408 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -5,6 +5,7 @@ device binaries to be loaded dynamically by the SYCL runtime. It also details how the toolchain produces, links and packages these binaries, as well as how the SYCL runtime library handles files of this format. + ## SYCLBIN binary format The files produced by the new compilation path will follow the format described @@ -15,54 +16,60 @@ The content of the SYCLBIN may be contained as an entry in the offloading binary format produced by the clang-offload-packager, as described in [ClangOffloadPackager.rst](https://github.com/intel/llvm/blob/sycl/clang/docs/ClangOffloadPackager.rst). -The following illustration gives an overview of how the file format is -structured. - -![SYCLBIN binary file format illustration](SYCLBIN_file_format_illustration.svg) - -### Header - -The header segment appears as the first part of the SYCLBIN file. Like many +The format of these files consist of a [file header](#file-header), containing +general information about the file, followed by three lists of headers: The +[abstract module header](#abstract-module-header) list, the +[IR module header](#ir-module-header) list and +[native device code image header](#native-device-code-image-header) list, +containing information about the [abstract modules](#abstract-module), +[IR modules](#ir-module) and +[native device code images](#native-device-code-image) respectively. + +| | +| --------------------------------------------------------------------- | +| [File header](#file-header) | +| [Abstract module header](#abstract-module-header) 1 | +| ... | +| [Abstract module header](#abstract-module-header) N | +| [IR module header](#ir-module-header) 1 | +| ... | +| [IR module header](#ir-module-header) M | +| [Native device code image header](#native-device-code-image-header) 1 | +| ... | +| [Native device code image header](#native-device-code-image-header) L | +| Byte table | + + +### File header + +The file header segment appears as the first part of the SYCLBIN file. Like many other file-formats, it defines a magic number to help identify the format, which is 0x53594249 (or "SYBI".) Immediately following the magic number is the version number, which is used by SYCLBIN consumers when parsing data in the rest of the file. -| Type | Description | Value variable | -| ---------- | ------------------------------------------------------------------ | -------------- | -| `uint32_t` | Magic number. (0x53594249) | | -| `uint32_t` | SYCLBIN version number. | | - -#### Global metadata - -Immediately after the header is the global metadata segment of the SYCLBIN, -containing information about the contained SYCLBIN file. +| Type | Description | +| ---------- | ------------------------------------------------------ | +| `uint32_t` | Magic number. (0x53594249) | +| `uint32_t` | SYCLBIN version number. | +| `uint32_t` | Number of abstract modules. | +| `uint32_t` | Number of IR modules. | +| `uint32_t` | Number of native device code images. | +| `uint64_t` | Byte size of the byte table. | +| `uint64_t` | Byte offset of the global metadata in the byte table. | +| `uint64_t` | Byte size of the global metadata in the byte table. | -| Type | Description | Value variable | -| ---------- | ------------------------------------------------------------------ | -------------- | -| `uint8_t` | `sycl::bundle_state` corresponding to the contents of the SYCLBIN. | | +__Alignment:__ 64 bits. -The `sycl::bundle_state` is an integer with the values as follows: -| `sycl::bundle_state` | Value | -| -------------------- | ----- | -| `input` | 0 | -| `object` | 1 | -| `executable` | 2 | - - -### Body - -Following the global metadata is the body of the SYCLBIN file. The body consists -of a list of abstract modules. +#### Global metadata -| Type | Description | Value variable | -| ------------ | ------------------------------------------ | -------------- | -| `uint64_t` | Byte size of the list of abstract modules. | `B` | -| `uint8_t[B]` | List of abstract modules. | | +The global metadata entry contains a single property set with the identifying +name "SYCLBIN/global metadata", as described in the +[PropertySets.md](PropertySets.md#syclbinglobal-metadata) design document. -#### Abstract module +### Abstract module An abstract module is a collection of device binaries that share properties, including, but not limited to: kernel names, imported symbols, exported symbols, @@ -75,83 +82,97 @@ images can be an architecture-specific binary format. There is no requirement that all device binaries in an abstract module are usable on the same device or are specific to a single vendor. -| Type | Description | Value variable | -| ------------- | ----------------------------------------------- | -------------- | -| `uint64_t` | Byte size of the list of the metadata. | `M` | -| `uint8_t[M]` | Module metadata. | | -| `uint64_t` | Byte size of list of IR modules. | `IR` | -| `uint8_t[IR]` | List of IR modules. | | -| `uint64_t` | Byte size of list of native device code images. | `ND` | -| `uint8_t[ND]` | List of native device code images. | | +#### Abstract module header + +A abstract module header contains the following fields in the stated order: -##### Module metadata +| Type | Description | +| ---------- | ----------------------------------------------------- | +| `uint64_t` | Byte offset of the metadata in the byte table. | +| `uint64_t` | Byte size of the metadata in the byte table. | +| `uint32_t` | Number of IR modules. | +| `uint32_t` | Offset of the first IR module header. | +| `uint32_t` | Number of native device code images. | +| `uint32_t` | Offset of the first native device code images header. | -The module metadata contains the following information about the contents of the -module. +__Alignment:__ 64 bits. -| Type | Description | Value variable | -| ------------ | -------------------------------------------------------------- | -------------- | -| `uint32_t` | Byte size of the list of kernel names. | `K` | -| `uint8_t[K]` | List of kernel names. (String list) | | -| `uint32_t` | Byte size of property set data. | `P` | -| `uint8_t[P]` | Property set data. | | +#### Abstract module metadata +An abstract module metadata entry contains any number of property sets, as +described in [PropertySets.md](PropertySets.md), excluding: -*NOTE:* Optional features used is embedded in the property set data. -*TODO:* Consolidate and/or document the property set data in this document. +* ["SYCLBIN/global metadata"](PropertySets.md#syclbinglobal-metadata) +* ["SYCLBIN/ir module metadata"](PropertySets.md#syclbinir-module-metadata) +* ["SYCLBIN/native device code image module metadata"](PropertySets.md#syclbinnative-device-code-image-metadata) -##### IR module +#### IR module An IR module contains the binary data for the corresponding module compiled to a given IR representation, identified by the IR type field. -| Type | Description | Value variable | -| ------------- | ------------------------------ | -------------- | -| `uint8_t` | IR type. | | -| `uint32_t` | Byte size of the raw IR bytes. | `IB` | -| `uint8_t[IB]` | Raw IR bytes. | | - -*TODO:* Do we need a target-specific blob inside this structure? E.g. for CUDA - we may want to embed the SM version. +##### IR module header -##### String list +A IR module header contains the following fields in the stated order: -A string list in this binary format consists of a `uint32_t` at the beginning -containing the number of elements in the list, followed by that number of -entries with the format: +| Type | Description | +| ---------- | ----------------------------------------------------- | +| `uint64_t` | Byte offset of the metadata in the byte table. | +| `uint64_t` | Byte size of the metadata in the byte table. | +| `uint64_t` | Byte offset of the raw IR bytes in the byte table. | +| `uint64_t` | Byte size of the raw IR bytes in the byte table. | -| Type | Description | Value variable | -| ------------ | ------------------------ | -------------- | -| `uint32_t` | Byte size of the string. | `S` | -| `uint8_t[S]` | String bytes. | | +__Alignment:__ 64 bits. -##### IR types +##### IR module metadata -The IR types must be one of the following values: +An IR module metadata entry contains a single property set with the identifying +name "SYCLBIN/ir module metadata", as described in the +[PropertySets.md](PropertySets.md#syclbinir-module-metadata) design document. -| IR type | Value | -| ------- | ----- | -| SPIR-V | 0 | -| PTX | 1 | -| AMDGCN | 2 | - -##### Native device code image +#### Native device code image An native device code image contains the binary data for the corresponding module AOT compiled for a specific device, identified by the architecture string. -| Type | Description | Value variable | -| ------------- | ------------------------------------------------ | -------------- | -| `uint32_t` | Byte size of the architecture string. | `A` | -| `uint8_t[A]` | Architecture string. | | -| `uint32_t` | Byte size of the native device code image bytes. | `NB` | -| `uint8_t[NB]` | Native device code image bytes. | | + +##### Native device code image header + +A native device code image header contains the following fields in the stated +order: + +| Type | Description | +| ---------- | ------------------------------------------------------------- | +| `uint64_t` | Byte offset of the metadata in the byte table. | +| `uint64_t` | Byte size of the metadata in the byte table. | +| `uint64_t` | Byte offset of the device code image bytes in the byte table. | +| `uint64_t` | Byte size of the device code image bytes in the byte table. | + +__Alignment:__ 64 bits. + + +##### Native device code image metadata + +A native device code image metadata entry contains a single property set with +the identifying name "SYCLBIN/native device code image module metadata", as +described in the +[PropertySets.md](PropertySets.md#syclbinnative-device-code-image-metadata) +design document. + + +### Byte table + +The byte table contains dynamic data, such as metadata and binary blobs. The +contents of it is generally referenced by an offset specified in the headers. + +__Alignment:__ 64 bits. This alignment guarantee does not apply to the +structures contained in the table. ### SYCLBIN version changelog @@ -160,6 +181,7 @@ The SYCLBIN format is subject to change, but any such changes must come with an increment to the version number in the header and a subsection to this section describing the change. + #### Version 1 * Initial version of the layout. diff --git a/sycl/doc/design/SYCLBIN_file_format_illustration.svg b/sycl/doc/design/SYCLBIN_file_format_illustration.svg deleted file mode 100644 index e6c9d5788c522..0000000000000 --- a/sycl/doc/design/SYCLBIN_file_format_illustration.svg +++ /dev/null @@ -1 +0,0 @@ -HeaderAbstract module 1Abstract module 2Abstract module N MetadataIR module 1IR module M Native device code image 1Native device code image OSYCLBIN imageHeaderMetadataRaw bytesMetadataMetadataRaw bytes \ No newline at end of file From 533e901e633a317170a764e9d7657ce6ce1dd541 Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Tue, 25 Feb 2025 22:50:52 -0800 Subject: [PATCH 12/30] Address PropertySets.md comments Signed-off-by: Larsen, Steffen --- sycl/doc/design/PropertySets.md | 48 +++++++++++++++++++-------------- 1 file changed, 28 insertions(+), 20 deletions(-) diff --git a/sycl/doc/design/PropertySets.md b/sycl/doc/design/PropertySets.md index 42db3593e2b3a..b2f08455ec146 100644 --- a/sycl/doc/design/PropertySets.md +++ b/sycl/doc/design/PropertySets.md @@ -21,7 +21,6 @@ corresponding value: | Value type | Description | | ---------- | ----------------------------------------- | -| "0" | The value has no known type. | | "1" | The value is a 32 bit integer. | | "2" | The value is a base64 encoded byte array. | @@ -29,7 +28,7 @@ corresponding value: This section describes the known property sets. -#### SYCL/specialization constants +#### [SYCL/specialization constants] __Key:__ Specialization constant name. @@ -53,7 +52,7 @@ unsigned Size; ``` -#### SYCL/specialization constants default values +#### [SYCL/specialization constants default values] __Key:__ Specialization constant name. @@ -63,7 +62,7 @@ __Value:__ Byte representation of the default value for the specialization constant. -#### SYCL/devicelib req mask +#### [SYCL/devicelib req mask] __Key:__ At most one entry with "DeviceLibReqMask". @@ -72,7 +71,7 @@ __Value type:__ 32 bit integer. ("1") __Value:__ A bitmask of which device libraries the binary uses. -#### SYCL/kernel param opt +#### [SYCL/kernel param opt] __Key:__ Kernel name. @@ -82,7 +81,7 @@ __Value:__ A bitmask identifying the arguments of the kernel that have been removed by the dead-argument-elimination optimization pass. -#### SYCL/program metadata +#### [SYCL/program metadata] __Key:__ An arbitrary metadata key. This is often some identifier, such as a kernel name, followed by a '@' and some metadata identifier. @@ -92,21 +91,24 @@ __Value type:__ Byte array. ("2") __Value:__ Unspecified. Depends on the metadata key. -#### SYCL/misc properties +#### [SYCL/misc properties] Miscellaneous properties: | Key | Value type | Value | | ------------------------------- | --------------------- | --------------------------------------------------------------------------------------------- | -| "isEsimdImage" | 32 bit integer. ("1") | 1 if the image is ESIMD and 0 otherwise. | +| "isEsimdImage" | 32 bit integer. ("1") | 1 if the image is ESIMD and 0 or missing otherwise. | | "sycl-register-alloc-mode" | 32 bit integer. ("1") | The register allocation mode: 0 for automatic and 2 for large. | | "sycl-grf-size" | 32 bit integer. ("1") | The GRF size. Automatic if 0. | | "optLevel" | 32 bit integer. ("1") | Optimization level, corresponding to the `-O` option used during compilation. | | "sanUsed" | Byte array. ("2") | Specifying if address sanitization ("asan") or memory sanitization ("msan") is used. | | "specConstsReplacedWithDefault" | 32 bit integer. ("1") | 1 if the specialization constants have been replaced by their default values and 0 otherwise. | +__NOTE:__ All of these properties are optional and not having them will result +in implementation defined behavior. -#### SYCL/assert used + +#### [SYCL/assert used] __Key:__ Kernel name. @@ -115,7 +117,7 @@ __Value type:__ 32 bit integer. ("1") __Value:__ 1. The key will not be in the set unless the kernel uses assertions. -#### SYCL/exported symbols +#### [SYCL/exported symbols] __Key:__ Symbol name. @@ -124,7 +126,7 @@ __Value type:__ 32 bit integer. ("1") __Value:__ 1. The key will not be in the set unless the symbols is exported. -#### SYCL/imported symbols +#### [SYCL/imported symbols] __Key:__ Symbol name. @@ -133,7 +135,7 @@ __Value type:__ 32 bit integer. ("1") __Value:__ 1. The key will not be in the set unless the symbols is exported. -#### SYCL/device globals +#### [SYCL/device globals] __Key:__ Device global variable name. @@ -154,7 +156,7 @@ fields: ``` -#### SYCL/device requirements +#### [SYCL/device requirements] Set of device requirements for the entire module: @@ -171,8 +173,11 @@ Set of device requirements for the entire module: See also [OptionalDeviceFeatures.md](OptionalDeviceFeatures.md). +__NOTE:__ All of these properties are optional and not having them will result +in implementation defined behavior. + -#### SYCL/host pipes +#### [SYCL/host pipes] __Key:__ Host pipe variable name. @@ -187,7 +192,7 @@ fields: ``` -#### SYCL/virtual functions +#### [SYCL/virtual functions] Set of information about virtual function usage in the module. @@ -196,8 +201,11 @@ Set of information about virtual function usage in the module. | "virtual-functions-set" | Byte array. ("2") | A string identifying the set of virtual functions contained in the module. | | "uses-virtual-functions-set" | Byte array. ("2") | A string containing a comma-separated list of sets of virtual functions used by the module. | +__NOTE:__ All of these properties are optional and not having them will result +in implementation defined behavior. + -#### SYCL/implicit local arg +#### [SYCL/implicit local arg] __Key:__ Kernel name. @@ -206,7 +214,7 @@ __Value type:__ 32 bit integer. ("1") __Value:__ Index of the implicit local memory argument. -#### SYCL/registered kernels +#### [SYCL/registered kernels] __Key:__ "Registered" kernel name. @@ -215,7 +223,7 @@ __Value type:__ Byte array. ("2") __Value:__ The name of the kernel corresponding to the registered kernel name. -#### SYCLBIN/global metadata +#### [SYCLBIN/global metadata] Set of global information about a SYCLBIN file. @@ -224,7 +232,7 @@ Set of global information about a SYCLBIN file. | "state" | 32 bit integer. ("1") | Integer representation of one of the possible states of the file, corresponding to the `sycl::bundle_state` enum. It must be one of the following:
  1. `sycl::bundle_state::input`
  2. `sycl::bundle_state::object`
  3. `sycl::bundle_state::executable`
| -#### SYCLBIN/ir module metadata +#### [SYCLBIN/ir module metadata] Set of information about an IR module in a SYCLBIN file. @@ -233,7 +241,7 @@ Set of information about an IR module in a SYCLBIN file. | "type" | 32 bit integer. ("1") | Integer representation of one of the pre-defined IR types. It must be one of the following:
  1. SPIR-V
  2. PTX
  3. AMDGCN
| -#### SYCLBIN/native device code image metadata +#### [SYCLBIN/native device code image metadata] Set of information about an native device code image in a SYCLBIN file. From 63d0f9aa0adb9f91776d72973ee5d5a57ff033e5 Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Tue, 25 Feb 2025 23:09:29 -0800 Subject: [PATCH 13/30] Address SYCLBIN design comments Signed-off-by: Larsen, Steffen --- sycl/doc/design/SYCLBINDesign.md | 73 ++++++++++++++++++-------------- 1 file changed, 41 insertions(+), 32 deletions(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index 26c95c2f4f408..7a815daa04683 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -37,7 +37,8 @@ containing information about the [abstract modules](#abstract-module), | [Native device code image header](#native-device-code-image-header) 1 | | ... | | [Native device code image header](#native-device-code-image-header) L | -| Byte table | +| Metadata byte table | +| Binary byte table | ### File header @@ -48,16 +49,17 @@ is 0x53594249 (or "SYBI".) Immediately following the magic number is the version number, which is used by SYCLBIN consumers when parsing data in the rest of the file. -| Type | Description | -| ---------- | ------------------------------------------------------ | -| `uint32_t` | Magic number. (0x53594249) | -| `uint32_t` | SYCLBIN version number. | -| `uint32_t` | Number of abstract modules. | -| `uint32_t` | Number of IR modules. | -| `uint32_t` | Number of native device code images. | -| `uint64_t` | Byte size of the byte table. | -| `uint64_t` | Byte offset of the global metadata in the byte table. | -| `uint64_t` | Byte size of the global metadata in the byte table. | +| Type | Description | +| ---------- | -------------------------------------------------------------- | +| `uint32_t` | Magic number. (0x53594249) | +| `uint32_t` | SYCLBIN version number. | +| `uint32_t` | Number of abstract modules. | +| `uint32_t` | Number of IR modules. | +| `uint32_t` | Number of native device code images. | +| `uint64_t` | Byte size of the metadata byte table. | +| `uint64_t` | Byte size of the binary byte table. | +| `uint64_t` | Byte offset of the global metadata in the metadata byte table. | +| `uint64_t` | Byte size of the global metadata in the metadata byte table. | __Alignment:__ 64 bits. @@ -87,14 +89,14 @@ are specific to a single vendor. A abstract module header contains the following fields in the stated order: -| Type | Description | -| ---------- | ----------------------------------------------------- | -| `uint64_t` | Byte offset of the metadata in the byte table. | -| `uint64_t` | Byte size of the metadata in the byte table. | -| `uint32_t` | Number of IR modules. | -| `uint32_t` | Offset of the first IR module header. | -| `uint32_t` | Number of native device code images. | -| `uint32_t` | Offset of the first native device code images header. | +| Type | Description | +| ---------- | ------------------------------------------------------------------------------------------ | +| `uint64_t` | Byte offset of the metadata in the metadata byte table. | +| `uint64_t` | Byte size of the metadata in the metadata byte table. | +| `uint32_t` | Number of IR modules. | +| `uint32_t` | Index of the first IR module header in the IR module header array. | +| `uint32_t` | Number of native device code images. | +| `uint32_t` | Index of the first native device code images header native device code image header array. | __Alignment:__ 64 bits. @@ -118,12 +120,12 @@ given IR representation, identified by the IR type field. A IR module header contains the following fields in the stated order: -| Type | Description | -| ---------- | ----------------------------------------------------- | -| `uint64_t` | Byte offset of the metadata in the byte table. | -| `uint64_t` | Byte size of the metadata in the byte table. | -| `uint64_t` | Byte offset of the raw IR bytes in the byte table. | -| `uint64_t` | Byte size of the raw IR bytes in the byte table. | +| Type | Description | +| ---------- | --------------------------------------------------------- | +| `uint64_t` | Byte offset of the metadata in the metadata byte table. | +| `uint64_t` | Byte size of the metadata in the metadata byte table. | +| `uint64_t` | Byte offset of the raw IR bytes in the binary byte table. | +| `uint64_t` | Byte size of the raw IR bytes in the binary byte table. | __Alignment:__ 64 bits. @@ -147,12 +149,12 @@ string. A native device code image header contains the following fields in the stated order: -| Type | Description | -| ---------- | ------------------------------------------------------------- | -| `uint64_t` | Byte offset of the metadata in the byte table. | -| `uint64_t` | Byte size of the metadata in the byte table. | -| `uint64_t` | Byte offset of the device code image bytes in the byte table. | -| `uint64_t` | Byte size of the device code image bytes in the byte table. | +| Type | Description | +| ---------- | -------------------------------------------------------------------- | +| `uint64_t` | Byte offset of the metadata in the metadata byte table. | +| `uint64_t` | Byte size of the metadata in the metadata byte table. | +| `uint64_t` | Byte offset of the device code image bytes in the binary byte table. | +| `uint64_t` | Byte size of the device code image bytes in the binary byte table. | __Alignment:__ 64 bits. @@ -168,8 +170,9 @@ design document. ### Byte table -The byte table contains dynamic data, such as metadata and binary blobs. The +A byte table contains dynamic data, such as metadata and binary blobs. The contents of it is generally referenced by an offset specified in the headers. +The implementation has two __Alignment:__ 64 bits. This alignment guarantee does not apply to the structures contained in the table. @@ -181,6 +184,12 @@ The SYCLBIN format is subject to change, but any such changes must come with an increment to the version number in the header and a subsection to this section describing the change. +Additionally, any changes to the property set structure that affects the way the +runtime has to parse the contained property sets will require an increase in the +SYCLBIN version. Adding new property set names or new predefined properties only +require a SYCLBIN version change if the the SYCLBIN consumer cannot safely +ignore the property. + #### Version 1 From fbf54ad13658d83faf7b1b181ce79539f4f86bb3 Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Tue, 25 Feb 2025 23:10:52 -0800 Subject: [PATCH 14/30] Removed unfinished line Signed-off-by: Larsen, Steffen --- sycl/doc/design/SYCLBINDesign.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index 7a815daa04683..aff06f3add6db 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -168,11 +168,10 @@ described in the design document. -### Byte table +### Byte tables A byte table contains dynamic data, such as metadata and binary blobs. The contents of it is generally referenced by an offset specified in the headers. -The implementation has two __Alignment:__ 64 bits. This alignment guarantee does not apply to the structures contained in the table. From 05481f16b4bd2e7e558b4908b235912d867eeae4 Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Mon, 3 Mar 2025 02:07:58 -0800 Subject: [PATCH 15/30] Move alignment and size guarantees Signed-off-by: Larsen, Steffen --- sycl/doc/design/SYCLBINDesign.md | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index aff06f3add6db..6e7ce298bfccc 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -40,6 +40,9 @@ containing information about the [abstract modules](#abstract-module), | Metadata byte table | | Binary byte table | +The headers and each byte table are all aligned to 8 bytes. The fields in the +headers use C/C++ type notation, including the fixed-size integer types defined +in the `` header, and will have the same size and alignment. ### File header @@ -61,8 +64,6 @@ file. | `uint64_t` | Byte offset of the global metadata in the metadata byte table. | | `uint64_t` | Byte size of the global metadata in the metadata byte table. | -__Alignment:__ 64 bits. - #### Global metadata @@ -98,8 +99,6 @@ A abstract module header contains the following fields in the stated order: | `uint32_t` | Number of native device code images. | | `uint32_t` | Index of the first native device code images header native device code image header array. | -__Alignment:__ 64 bits. - #### Abstract module metadata An abstract module metadata entry contains any number of property sets, as @@ -127,8 +126,6 @@ A IR module header contains the following fields in the stated order: | `uint64_t` | Byte offset of the raw IR bytes in the binary byte table. | | `uint64_t` | Byte size of the raw IR bytes in the binary byte table. | -__Alignment:__ 64 bits. - ##### IR module metadata @@ -156,8 +153,6 @@ order: | `uint64_t` | Byte offset of the device code image bytes in the binary byte table. | | `uint64_t` | Byte size of the device code image bytes in the binary byte table. | -__Alignment:__ 64 bits. - ##### Native device code image metadata @@ -173,9 +168,6 @@ design document. A byte table contains dynamic data, such as metadata and binary blobs. The contents of it is generally referenced by an offset specified in the headers. -__Alignment:__ 64 bits. This alignment guarantee does not apply to the -structures contained in the table. - ### SYCLBIN version changelog From f59fcabbb5e5a3980ab2f8b8947879afd370c535 Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Mon, 3 Mar 2025 02:10:06 -0800 Subject: [PATCH 16/30] Specify offset is in the byte table Signed-off-by: Larsen, Steffen --- sycl/doc/design/SYCLBINDesign.md | 48 ++++++++++++++++---------------- 1 file changed, 24 insertions(+), 24 deletions(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index 6e7ce298bfccc..3c64bc5229f1c 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -52,17 +52,17 @@ is 0x53594249 (or "SYBI".) Immediately following the magic number is the version number, which is used by SYCLBIN consumers when parsing data in the rest of the file. -| Type | Description | -| ---------- | -------------------------------------------------------------- | -| `uint32_t` | Magic number. (0x53594249) | -| `uint32_t` | SYCLBIN version number. | -| `uint32_t` | Number of abstract modules. | -| `uint32_t` | Number of IR modules. | -| `uint32_t` | Number of native device code images. | -| `uint64_t` | Byte size of the metadata byte table. | -| `uint64_t` | Byte size of the binary byte table. | -| `uint64_t` | Byte offset of the global metadata in the metadata byte table. | -| `uint64_t` | Byte size of the global metadata in the metadata byte table. | +| Type | Description | +| ---------- | ----------------------------------------------------------------------------- | +| `uint32_t` | Magic number. (0x53594249) | +| `uint32_t` | SYCLBIN version number. | +| `uint32_t` | Number of abstract modules. | +| `uint32_t` | Number of IR modules. | +| `uint32_t` | Number of native device code images. | +| `uint64_t` | Byte size of the metadata byte table. | +| `uint64_t` | Byte size of the binary byte table. | +| `uint64_t` | Byte offset of the global metadata from the start of the metadata byte table. | +| `uint64_t` | Byte size of the global metadata from the start of the metadata byte table. | #### Global metadata @@ -92,7 +92,7 @@ A abstract module header contains the following fields in the stated order: | Type | Description | | ---------- | ------------------------------------------------------------------------------------------ | -| `uint64_t` | Byte offset of the metadata in the metadata byte table. | +| `uint64_t` | Byte offset of the metadata from the start of the metadata byte table. | | `uint64_t` | Byte size of the metadata in the metadata byte table. | | `uint32_t` | Number of IR modules. | | `uint32_t` | Index of the first IR module header in the IR module header array. | @@ -119,12 +119,12 @@ given IR representation, identified by the IR type field. A IR module header contains the following fields in the stated order: -| Type | Description | -| ---------- | --------------------------------------------------------- | -| `uint64_t` | Byte offset of the metadata in the metadata byte table. | -| `uint64_t` | Byte size of the metadata in the metadata byte table. | -| `uint64_t` | Byte offset of the raw IR bytes in the binary byte table. | -| `uint64_t` | Byte size of the raw IR bytes in the binary byte table. | +| Type | Description | +| ---------- | ------------------------------------------------------------------------ | +| `uint64_t` | Byte offset of the metadata from the start of the metadata byte table. | +| `uint64_t` | Byte size of the metadata in the metadata byte table. | +| `uint64_t` | Byte offset of the raw IR bytes from the start of the binary byte table. | +| `uint64_t` | Byte size of the raw IR bytes in the binary byte table. | ##### IR module metadata @@ -146,12 +146,12 @@ string. A native device code image header contains the following fields in the stated order: -| Type | Description | -| ---------- | -------------------------------------------------------------------- | -| `uint64_t` | Byte offset of the metadata in the metadata byte table. | -| `uint64_t` | Byte size of the metadata in the metadata byte table. | -| `uint64_t` | Byte offset of the device code image bytes in the binary byte table. | -| `uint64_t` | Byte size of the device code image bytes in the binary byte table. | +| Type | Description | +| ---------- | ----------------------------------------------------------------------------------- | +| `uint64_t` | Byte offset of the metadata from the start of the metadata byte table. | +| `uint64_t` | Byte size of the metadata in the metadata byte table. | +| `uint64_t` | Byte offset of the device code image bytes from the start of the binary byte table. | +| `uint64_t` | Byte size of the device code image bytes in the binary byte table. | ##### Native device code image metadata From ad8251cfb4db9fa0ac1d1846a6292a84c72c4241 Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Mon, 3 Mar 2025 05:33:13 -0800 Subject: [PATCH 17/30] Be more specific in property sets Signed-off-by: Larsen, Steffen --- sycl/doc/design/PropertySets.md | 99 +++++++++++++++++++++------------ 1 file changed, 63 insertions(+), 36 deletions(-) diff --git a/sycl/doc/design/PropertySets.md b/sycl/doc/design/PropertySets.md index b2f08455ec146..45cfd0a79b80b 100644 --- a/sycl/doc/design/PropertySets.md +++ b/sycl/doc/design/PropertySets.md @@ -70,6 +70,10 @@ __Value type:__ 32 bit integer. ("1") __Value:__ A bitmask of which device libraries the binary uses. +__Notes:__ + +1. If this property set is missing, no device libraries are used by the binary. + #### [SYCL/kernel param opt] @@ -80,32 +84,39 @@ __Value type:__ Byte array. ("2") __Value:__ A bitmask identifying the arguments of the kernel that have been removed by the dead-argument-elimination optimization pass. +__Notes:__ -#### [SYCL/program metadata] +1. If no entry is present for a given kernel in the binary, no arguments have +been eliminated. +2. If this property set is missing, no kernels in the binary have any eliminated +arguments. -__Key:__ An arbitrary metadata key. This is often some identifier, such as a -kernel name, followed by a '@' and some metadata identifier. -__Value type:__ Byte array. ("2") +#### [SYCL/program metadata] -__Value:__ Unspecified. Depends on the metadata key. +Program metadata properties: + +| Key | Value type | Value | +| ---------------------------------------- | --------------------- | ---------------------------------------------------------------------------------------------------------------- | +| `kernel` + "@reqd_work_group_size" | Byte array. ("2") | Specifies the required work-group size for the kernel identified by the name `kernel`. | +| `kernel` + "@work_group_num_dim" | Byte array. ("2") | Specifies the work-group dimensionality of the kernel identified by the name `kernel`. | +| `kernel` + "@max_work_group_size" | Byte array. ("2") | Specifies the max work-group size for the kernel identified by the name `kernel`. | +| `kernel` + "@max_linear_work_group_size" | Byte array. ("2") | Specifies the max linear work-group size for the kernel identified by the name `kernel`. | +| `variable` + "@global_id_mapping" | Byte array. ("2") | Specifies the mapping between the global variable with unique identifier `variable` and its name in the binary. | #### [SYCL/misc properties] Miscellaneous properties: -| Key | Value type | Value | -| ------------------------------- | --------------------- | --------------------------------------------------------------------------------------------- | -| "isEsimdImage" | 32 bit integer. ("1") | 1 if the image is ESIMD and 0 or missing otherwise. | -| "sycl-register-alloc-mode" | 32 bit integer. ("1") | The register allocation mode: 0 for automatic and 2 for large. | -| "sycl-grf-size" | 32 bit integer. ("1") | The GRF size. Automatic if 0. | -| "optLevel" | 32 bit integer. ("1") | Optimization level, corresponding to the `-O` option used during compilation. | -| "sanUsed" | Byte array. ("2") | Specifying if address sanitization ("asan") or memory sanitization ("msan") is used. | -| "specConstsReplacedWithDefault" | 32 bit integer. ("1") | 1 if the specialization constants have been replaced by their default values and 0 otherwise. | - -__NOTE:__ All of these properties are optional and not having them will result -in implementation defined behavior. +| Key | Value type | Value | +| ------------------------------- | --------------------- | ---------------------------------------------------------------------------------------------------------------- | +| "isEsimdImage" | 32 bit integer. ("1") | 1 if the image is ESIMD and 0 or missing otherwise. | +| "sycl-register-alloc-mode" | 32 bit integer. ("1") | The register allocation mode: 2 for large and 0 or missing for automatic. | +| "sycl-grf-size" | 32 bit integer. ("1") | The GRF size. Automatic if 0 or missing. | +| "optLevel" | 32 bit integer. ("1") | Optimization level, corresponding to the `-O` option used during compilation. | +| "sanUsed" | Byte array. ("2") | Specifying if address sanitization ("asan") or memory sanitization ("msan") is used. Missing if neither is used. | +| "specConstsReplacedWithDefault" | 32 bit integer. ("1") | 1 if the specialization constants have been replaced by their default values and 0 or missing otherwise. | #### [SYCL/assert used] @@ -114,7 +125,7 @@ __Key:__ Kernel name. __Value type:__ 32 bit integer. ("1") -__Value:__ 1. The key will not be in the set unless the kernel uses assertions. +__Value:__ 1 if the kernel uses assertions and 0 or missing otherwise. #### [SYCL/exported symbols] @@ -123,7 +134,7 @@ __Key:__ Symbol name. __Value type:__ 32 bit integer. ("1") -__Value:__ 1. The key will not be in the set unless the symbols is exported. +__Value:__ 1 if the symbol is exported by the binary and 0 or missing otherwise. #### [SYCL/imported symbols] @@ -132,7 +143,7 @@ __Key:__ Symbol name. __Value type:__ 32 bit integer. ("1") -__Value:__ 1. The key will not be in the set unless the symbols is exported. +__Value:__ 1 if the symbol is imported by the binary and 0 or missing otherwise. #### [SYCL/device globals] @@ -155,27 +166,29 @@ fields: uint32_t DeviceImageScope; ``` +__Notes:__ + +1. If this property set is missing, the binary does not contain any device +global variables. + #### [SYCL/device requirements] Set of device requirements for the entire module: -| Key | Value type | Value | -| ------------------------------- | ----------------- | -------------------------------------------------------------------------------------------------------------------------------- | -| "aspects" | Byte array. ("2") | A collection of 32 bit integers representing the SYCL aspects used. These correspond 1:1 with the enum values of `sycl::aspect`. | -| "fixed_target" | Byte array. ("2") | The string literals specified in `-fsycl-fixed-targets`. | -| "reqd_work_group_size_uint64_t" | Byte array. ("2") | At most three 64 bit unsigned integers representing the required work group size. | +| Key | Value type | Value | +| ------------------------------- | ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| "aspects" | Byte array. ("2") | A collection of 32 bit integers representing the SYCL aspects used. These correspond 1:1 with the enum values of `sycl::aspect`. | +| "fixed_target" | Byte array. ("2") | The string literals specified in `-fsycl-fixed-targets`. | +| "reqd_work_group_size_uint64_t" | Byte array. ("2") | At most three 64 bit unsigned integers representing the required work-group size. If this entry is missing, there is no work-group size requirement. | | "joint_matrix" | Byte array. ("2") | A string containing a semi-colon-separated list of comma-separated descriptors for used matrices. The descriptors in the order they appear are:
  • sycl-joint-matrix-type
  • sycl-joint-matrix-use
  • sycl-joint-matrix-rows
  • sycl-joint-matrix-cols
| | "joint_matrix_mad" | Byte array. ("2") | A string containing a semi-colon-separated list of comma-separated descriptors for used matrix MAD operations. The descriptors in the order they appear are:
  • sycl-joint-matrix-mad-type-A
  • sycl-joint-matrix-mad-type-B
  • sycl-joint-matrix-mad-type-C
  • sycl-joint-matrix-mad-type-D
  • sycl-joint-matrix-mad-size-M
  • sycl-joint-matrix-mad-size-K
  • sycl-joint-matrix-mad-size-N
| -| "reqd_sub_group_size" | Byte array. ("2") | At most three 32 bit unsigned integers representing the required sub-group size. | -| "work_group_num_dim" | Byte array. ("2") | At most three 32 bit unsigned integers representing the work group dimensionality. | +| "reqd_sub_group_size" | Byte array. ("2") | At most three 32 bit unsigned integers representing the required sub-group size. If this entry is missing, there is no sub-group size requirement. | +| "work_group_num_dim" | Byte array. ("2") | At most three 32 bit unsigned integers representing the work-group dimensionality. If this entry is missing, there is no specified work-group dimensionality. | See also [OptionalDeviceFeatures.md](OptionalDeviceFeatures.md). -__NOTE:__ All of these properties are optional and not having them will result -in implementation defined behavior. - #### [SYCL/host pipes] @@ -191,18 +204,20 @@ fields: uint32_t Size; ``` +__Notes:__ + +1. If this property set is missing, the binary does not contain any host pipe +variables. + #### [SYCL/virtual functions] Set of information about virtual function usage in the module. -| Key | Value type | Value | -| ---------------------------- | ----------------- | ------------------------------------------------------------------------------------------- | -| "virtual-functions-set" | Byte array. ("2") | A string identifying the set of virtual functions contained in the module. | -| "uses-virtual-functions-set" | Byte array. ("2") | A string containing a comma-separated list of sets of virtual functions used by the module. | - -__NOTE:__ All of these properties are optional and not having them will result -in implementation defined behavior. +| Key | Value type | Value | +| ---------------------------- | ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| "virtual-functions-set" | Byte array. ("2") | A string identifying the set of virtual functions contained in the module. If this is missing, the module does not contain any virtual function sets. | +| "uses-virtual-functions-set" | Byte array. ("2") | A string containing a comma-separated list of sets of virtual functions used by the module. If this is missing, the module does not use any virtual function sets. | #### [SYCL/implicit local arg] @@ -213,6 +228,13 @@ __Value type:__ 32 bit integer. ("1") __Value:__ Index of the implicit local memory argument. +__Notes:__ + +1. If no entry is present for a given kernel in the binary, the kernel does not +have an implicit local memory argument. +2. If this property set is missing, no kernels in the binary have an implicit +local memory argument. + #### [SYCL/registered kernels] @@ -222,6 +244,11 @@ __Value type:__ Byte array. ("2") __Value:__ The name of the kernel corresponding to the registered kernel name. +__Notes:__ + +1. If this property set is missing, the binary does not have any registered +kernel names. + #### [SYCLBIN/global metadata] From 63aa5729449b000524282552abe001c465ff44ad Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Mon, 3 Mar 2025 05:56:54 -0800 Subject: [PATCH 18/30] Add new docs to TOC Signed-off-by: Larsen, Steffen --- sycl/doc/index.rst | 2 ++ 1 file changed, 2 insertions(+) diff --git a/sycl/doc/index.rst b/sycl/doc/index.rst index cc4961dd7f438..d49a788ace48d 100644 --- a/sycl/doc/index.rst +++ b/sycl/doc/index.rst @@ -54,6 +54,8 @@ Design Documents for the oneAPI DPC++ Compiler design/CommandGraph design/OffloadDesign design/PrivateAlloca + design/SYCLBINDesign + design/PropertySets New OpenCL Extensions New SPIR-V Extensions From f7e905d4bd0e94fa3d7cdbbc47258d3f3572ce77 Mon Sep 17 00:00:00 2001 From: Steffen Larsen Date: Thu, 6 Mar 2025 11:55:50 +0100 Subject: [PATCH 19/30] Update sycl/doc/design/SYCLBINDesign.md --- sycl/doc/design/SYCLBINDesign.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index 3c64bc5229f1c..8455494d2c926 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -42,7 +42,8 @@ containing information about the [abstract modules](#abstract-module), The headers and each byte table are all aligned to 8 bytes. The fields in the headers use C/C++ type notation, including the fixed-size integer types defined -in the `` header, and will have the same size and alignment. +in the `` header, and will have the same size and alignment. For +consistency, all these types use little endian layout. ### File header From 72f62ac0bf15874843bf84890a8005c8d846b1be Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Tue, 11 Mar 2025 08:32:49 -0700 Subject: [PATCH 20/30] Add note about whitespaces and minor editorial changes Signed-off-by: Larsen, Steffen --- sycl/doc/design/PropertySets.md | 47 ++++++++++++++++++--------------- 1 file changed, 26 insertions(+), 21 deletions(-) diff --git a/sycl/doc/design/PropertySets.md b/sycl/doc/design/PropertySets.md index 45cfd0a79b80b..e90cda49d2b04 100644 --- a/sycl/doc/design/PropertySets.md +++ b/sycl/doc/design/PropertySets.md @@ -5,6 +5,7 @@ implementation produces sets of properties. The intention of this design document is to describe the structure of the property sets and define the representation and meaning of pre-defined property set names. + ## Property sets structure A property set consists of a reserved name, enclosed in square brackets, @@ -24,11 +25,17 @@ corresponding value: | "1" | The value is a 32 bit integer. | | "2" | The value is a base64 encoded byte array. | -### Property sets +__Note:__ Whitespaces are __not__ ignored and are treated like any other +characters. As such, `some_key=1|1` is not the same as `some_key = 1 | 1` and +`[some property set]` is not the same as `[ some property set ]`. + + +## Property sets This section describes the known property sets. -#### [SYCL/specialization constants] + +### [SYCL/specialization constants] __Key:__ Specialization constant name. @@ -52,7 +59,7 @@ unsigned Size; ``` -#### [SYCL/specialization constants default values] +### [SYCL/specialization constants default values] __Key:__ Specialization constant name. @@ -62,7 +69,7 @@ __Value:__ Byte representation of the default value for the specialization constant. -#### [SYCL/devicelib req mask] +### [SYCL/devicelib req mask] __Key:__ At most one entry with "DeviceLibReqMask". @@ -75,7 +82,7 @@ __Notes:__ 1. If this property set is missing, no device libraries are used by the binary. -#### [SYCL/kernel param opt] +### [SYCL/kernel param opt] __Key:__ Kernel name. @@ -92,7 +99,7 @@ been eliminated. arguments. -#### [SYCL/program metadata] +### [SYCL/program metadata] Program metadata properties: @@ -105,7 +112,7 @@ Program metadata properties: | `variable` + "@global_id_mapping" | Byte array. ("2") | Specifies the mapping between the global variable with unique identifier `variable` and its name in the binary. | -#### [SYCL/misc properties] +### [SYCL/misc properties] Miscellaneous properties: @@ -119,7 +126,7 @@ Miscellaneous properties: | "specConstsReplacedWithDefault" | 32 bit integer. ("1") | 1 if the specialization constants have been replaced by their default values and 0 or missing otherwise. | -#### [SYCL/assert used] +### [SYCL/assert used] __Key:__ Kernel name. @@ -128,7 +135,7 @@ __Value type:__ 32 bit integer. ("1") __Value:__ 1 if the kernel uses assertions and 0 or missing otherwise. -#### [SYCL/exported symbols] +### [SYCL/exported symbols] __Key:__ Symbol name. @@ -137,7 +144,7 @@ __Value type:__ 32 bit integer. ("1") __Value:__ 1 if the symbol is exported by the binary and 0 or missing otherwise. -#### [SYCL/imported symbols] +### [SYCL/imported symbols] __Key:__ Symbol name. @@ -146,7 +153,7 @@ __Value type:__ 32 bit integer. ("1") __Value:__ 1 if the symbol is imported by the binary and 0 or missing otherwise. -#### [SYCL/device globals] +### [SYCL/device globals] __Key:__ Device global variable name. @@ -172,7 +179,7 @@ __Notes:__ global variables. -#### [SYCL/device requirements] +### [SYCL/device requirements] Set of device requirements for the entire module: @@ -190,7 +197,7 @@ Set of device requirements for the entire module: See also [OptionalDeviceFeatures.md](OptionalDeviceFeatures.md). -#### [SYCL/host pipes] +### [SYCL/host pipes] __Key:__ Host pipe variable name. @@ -210,7 +217,7 @@ __Notes:__ variables. -#### [SYCL/virtual functions] +### [SYCL/virtual functions] Set of information about virtual function usage in the module. @@ -220,7 +227,7 @@ Set of information about virtual function usage in the module. | "uses-virtual-functions-set" | Byte array. ("2") | A string containing a comma-separated list of sets of virtual functions used by the module. If this is missing, the module does not use any virtual function sets. | -#### [SYCL/implicit local arg] +### [SYCL/implicit local arg] __Key:__ Kernel name. @@ -236,7 +243,7 @@ have an implicit local memory argument. local memory argument. -#### [SYCL/registered kernels] +### [SYCL/registered kernels] __Key:__ "Registered" kernel name. @@ -250,7 +257,7 @@ __Notes:__ kernel names. -#### [SYCLBIN/global metadata] +### [SYCLBIN/global metadata] Set of global information about a SYCLBIN file. @@ -259,7 +266,7 @@ Set of global information about a SYCLBIN file. | "state" | 32 bit integer. ("1") | Integer representation of one of the possible states of the file, corresponding to the `sycl::bundle_state` enum. It must be one of the following:
  1. `sycl::bundle_state::input`
  2. `sycl::bundle_state::object`
  3. `sycl::bundle_state::executable`
| -#### [SYCLBIN/ir module metadata] +### [SYCLBIN/ir module metadata] Set of information about an IR module in a SYCLBIN file. @@ -268,7 +275,7 @@ Set of information about an IR module in a SYCLBIN file. | "type" | 32 bit integer. ("1") | Integer representation of one of the pre-defined IR types. It must be one of the following:
  1. SPIR-V
  2. PTX
  3. AMDGCN
| -#### [SYCLBIN/native device code image metadata] +### [SYCLBIN/native device code image metadata] Set of information about an native device code image in a SYCLBIN file. @@ -276,5 +283,3 @@ Set of information about an native device code image in a SYCLBIN file. | ------ | ----------------- | ----------------------------------------------------- | | "arch" | Byte array. ("2") | A string representing the architecture of the binary. | - - From 71892bc0b81f14f8feedf124461676e23da1041f Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Wed, 12 Mar 2025 02:55:12 -0700 Subject: [PATCH 21/30] Add motivation Signed-off-by: Larsen, Steffen --- sycl/doc/design/SYCLBINDesign.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index 8455494d2c926..ccf3e9f6d553e 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -1,5 +1,16 @@ # SYCLBIN - A format for separately compiled SYCL device code +Some applications may want to dynamically load device binaries at runtime, e.g. +for modularity and to avoid having to recompile the entire application. However, +doing so through the +[sycl_ext_oneapi_kernel_compiler](https://github.com/intel/llvm/blob/sycl/doc/extensions/experimental/sycl_ext_oneapi_kernel_compiler.asciidoc) +extension can be unnecessarily expensive, if the systems utilizing this +modularity are able to compile the binaries separate from the application's +execution. However, since the compiler may produce multiple binaries and related +metadata, e.g. through module splitting or multiple device targets, a new +SYCLBIN format is needed to define the interface between the compiler-produced +binaries and the runtime's handling of it. + This design document details the SYCLBIN binary format used for storing SYCL device binaries to be loaded dynamically by the SYCL runtime. It also details how the toolchain produces, links and packages these binaries, as well as how From 3d76c2a27b0eee29d866c93b480705326842400c Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Wed, 12 Mar 2025 03:51:22 -0700 Subject: [PATCH 22/30] Add target and specify arch Signed-off-by: Larsen, Steffen --- sycl/doc/design/PropertySets.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/sycl/doc/design/PropertySets.md b/sycl/doc/design/PropertySets.md index e90cda49d2b04..94226bf4f9c55 100644 --- a/sycl/doc/design/PropertySets.md +++ b/sycl/doc/design/PropertySets.md @@ -270,16 +270,17 @@ Set of global information about a SYCLBIN file. Set of information about an IR module in a SYCLBIN file. -| Key | Value type | Value | -| ------ | --------------------- | ----- | -| "type" | 32 bit integer. ("1") | Integer representation of one of the pre-defined IR types. It must be one of the following:
  1. SPIR-V
  2. PTX
  3. AMDGCN
| +| Key | Value type | Value | +| -------- | --------------------- | ----- | +| "type" | 32 bit integer. ("1") | Integer representation of one of the pre-defined IR types. It must be one of the following:
  1. SPIR-V
  2. PTX
  3. AMDGCN
| +| "target" | Byte array. ("2") | A string representing the architecture of the binary, corresponding to the value of `-fsycl-targets` option used when compiling this binary. This may be missing if no part of `-fsycl-targets` was used during the compilation of this binary or if `-fsycl-targets` was not used at all. | ### [SYCLBIN/native device code image metadata] Set of information about an native device code image in a SYCLBIN file. -| Key | Value type | Value | -| ------ | ----------------- | ----------------------------------------------------- | -| "arch" | Byte array. ("2") | A string representing the architecture of the binary. | +| Key | Value type | Value | +| ------ | ----------------- | ----- | +| "arch" | Byte array. ("2") | A string representing the architecture of the binary, corresponding to the value of `-fsycl-targets` option used when compiling this binary. | From 6e9a6b032c735feb777f32a39b130ca0f73b3a73 Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Fri, 14 Mar 2025 08:25:28 -0700 Subject: [PATCH 23/30] Remove undocumented option Signed-off-by: Larsen, Steffen --- sycl/doc/design/SYCLBINDesign.md | 4 ---- 1 file changed, 4 deletions(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index ccf3e9f6d553e..3b8b2834d65dd 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -224,10 +224,6 @@ option currently requires `--offload-new-driver` to be set. -`--offload-ir` -*TODO* - - `--offload-rdc` This is an alias of `-fgpu-rdc`. From ef7f2a2f1ce54b9f90d103a84d21735a22df73c0 Mon Sep 17 00:00:00 2001 From: Steffen Larsen Date: Wed, 26 Mar 2025 17:07:45 +0100 Subject: [PATCH 24/30] Update sycl/doc/design/SYCLBINDesign.md --- sycl/doc/design/SYCLBINDesign.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index 3b8b2834d65dd..1f7039623a976 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -216,7 +216,7 @@ If this option is set, the output of the invocation is a SYCLBIN file with the `-fsycl` pipeline, instead passing the output of the clang-offload-packager invocation to clang-linker-wrapper together with the new `--syclbin` flag. -Setting this option will override `-fsycl` and `-fsycl-device-only`. +Setting this option will imply `-fsycl` and override `-fsycl-device-only`. The behavior is dependent on using the clang-linker-wrapper. As the current default offload compilation behavior is using the old offload model (driver based), this From 0bc59d50906f8bea11c921049d50155a6eb0e432 Mon Sep 17 00:00:00 2001 From: Steffen Larsen Date: Thu, 27 Mar 2025 15:39:15 +0100 Subject: [PATCH 25/30] Apply suggestions from code review Co-authored-by: Alexey Sachkov --- sycl/doc/design/SYCLBINDesign.md | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index 1f7039623a976..a598504add9d5 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -6,10 +6,14 @@ doing so through the [sycl_ext_oneapi_kernel_compiler](https://github.com/intel/llvm/blob/sycl/doc/extensions/experimental/sycl_ext_oneapi_kernel_compiler.asciidoc) extension can be unnecessarily expensive, if the systems utilizing this modularity are able to compile the binaries separate from the application's -execution. However, since the compiler may produce multiple binaries and related -metadata, e.g. through module splitting or multiple device targets, a new -SYCLBIN format is needed to define the interface between the compiler-produced -binaries and the runtime's handling of it. +execution. + +To facilitate that a new SYCLBIN format is needed to define the interface between +the compiler-produced binaries and the runtime's handling of it. This necessity +comes from the overall design of our SYCL toolchain where runtime relies on +compiler-provided information/metadata to implement various features (like +support for specialization constants or shared libraries), i.e. device code alone +is not enough. This design document details the SYCLBIN binary format used for storing SYCL device binaries to be loaded dynamically by the SYCL runtime. It also details @@ -74,7 +78,7 @@ file. | `uint64_t` | Byte size of the metadata byte table. | | `uint64_t` | Byte size of the binary byte table. | | `uint64_t` | Byte offset of the global metadata from the start of the metadata byte table. | -| `uint64_t` | Byte size of the global metadata from the start of the metadata byte table. | +| `uint64_t` | Byte size of the global metadata. | #### Global metadata From 13fad438cb796019536f1ba7eb85204eda2a230e Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Fri, 28 Mar 2025 03:55:07 -0700 Subject: [PATCH 26/30] Address comments Signed-off-by: Larsen, Steffen --- sycl/doc/design/PropertySets.md | 10 ++++++++++ sycl/doc/design/SYCLBINDesign.md | 7 +++++-- 2 files changed, 15 insertions(+), 2 deletions(-) diff --git a/sycl/doc/design/PropertySets.md b/sycl/doc/design/PropertySets.md index 94226bf4f9c55..25b95d6c91337 100644 --- a/sycl/doc/design/PropertySets.md +++ b/sycl/doc/design/PropertySets.md @@ -58,6 +58,8 @@ unsigned Offset; unsigned Size; ``` +See also [SYCL2020-SpecializationConstants.md](./SYCL2020-SpecializationConstants.md). + ### [SYCL/specialization constants default values] @@ -68,6 +70,8 @@ __Value type:__ Byte array. ("2") __Value:__ Byte representation of the default value for the specialization constant. +See also [SYCL2020-SpecializationConstants.md](./SYCL2020-SpecializationConstants.md). + ### [SYCL/devicelib req mask] @@ -143,6 +147,8 @@ __Value type:__ 32 bit integer. ("1") __Value:__ 1 if the symbol is exported by the binary and 0 or missing otherwise. +See also [SharedLibraries.md](SharedLibraries.md). + ### [SYCL/imported symbols] @@ -152,6 +158,8 @@ __Value type:__ 32 bit integer. ("1") __Value:__ 1 if the symbol is imported by the binary and 0 or missing otherwise. +See also [SharedLibraries.md](SharedLibraries.md). + ### [SYCL/device globals] @@ -178,6 +186,8 @@ __Notes:__ 1. If this property set is missing, the binary does not contain any device global variables. +See also [DeviceGlobal.md](./DeviceGlobal.md). + ### [SYCL/device requirements] diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index a598504add9d5..5b2be36f8ba94 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -154,7 +154,10 @@ name "SYCLBIN/ir module metadata", as described in the An native device code image contains the binary data for the corresponding module AOT compiled for a specific device, identified by the architecture -string. +string. The runtime library will attempt to map these to the architecture +enumerators in the +[sycl_ext_oneapi_device_architecture](../extensions/experimental/sycl_ext_oneapi_device_architecture.asciidoc) +extension. ##### Native device code image header @@ -220,7 +223,7 @@ If this option is set, the output of the invocation is a SYCLBIN file with the `-fsycl` pipeline, instead passing the output of the clang-offload-packager invocation to clang-linker-wrapper together with the new `--syclbin` flag. -Setting this option will imply `-fsycl` and override `-fsycl-device-only`. +Setting this option implies `-fsycl` and `-fsycl-device-only`. The behavior is dependent on using the clang-linker-wrapper. As the current default offload compilation behavior is using the old offload model (driver based), this From 1ca7ec0dde1c792d5ff3e73e4739641ce735f63e Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Mon, 31 Mar 2025 23:16:45 -0700 Subject: [PATCH 27/30] Expand post-processing Signed-off-by: Larsen, Steffen --- sycl/doc/design/SYCLBINDesign.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index 5b2be36f8ba94..a2d392a99e2d5 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -242,10 +242,10 @@ SYCLBIN files are linked together is yet to be specified. ## clang-linker-wrapper changes -The clang-linker-wrapper is responsible for doing post-processing and linking of -device binaries, as described in [OffloadDesign.md](OffloadDesign.md). -However, to support SYCLBIN files, the clang-linker-wrapper must be able to -unpack an offload binary (as described in +The clang-linker-wrapper is responsible for doing module-splitting, metadata +extraction and linking of device binaries, as described in +[OffloadDesign.md](OffloadDesign.md). However, to support SYCLBIN files, the +clang-linker-wrapper must be able to unpack an offload binary (as described in [ClangOffloadPackager.rst](https://github.com/intel/llvm/blob/sycl/clang/docs/ClangOffloadPackager.rst)) directly, instead of extracting it from a host binary. This should be done when a new flag, `--syclbin`, is passed. In this case, the clang-linker-wrapper is From 874c3bf94833950bf7e3537f63ad50238bdba670 Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Mon, 31 Mar 2025 23:20:15 -0700 Subject: [PATCH 28/30] Specify -fsyclbin being ignored if used with -fsycl-device-only Signed-off-by: Larsen, Steffen --- sycl/doc/design/SYCLBINDesign.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index a2d392a99e2d5..c06fe79377603 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -219,15 +219,17 @@ The clang driver needs to accept the following new flags: `-fsyclbin` If this option is set, the output of the invocation is a SYCLBIN file with the -.syclbin file extension. This skips the host-compilation invocation of the typical -`-fsycl` pipeline, instead passing the output of the clang-offload-packager -invocation to clang-linker-wrapper together with the new `--syclbin` flag. +.syclbin file extension. This skips the host-compilation invocation of the +typical `-fsycl` pipeline, instead passing the output of the +clang-offload-packager invocation to clang-linker-wrapper together with the new +`--syclbin` flag. -Setting this option implies `-fsycl` and `-fsycl-device-only`. +Setting this option will override `-fsycl`. Passing`-fsycl-device-only` with +`-fsyclbin` will cause the latter to be ignored. -The behavior is dependent on using the clang-linker-wrapper. As the current default -offload compilation behavior is using the old offload model (driver based), this -option currently requires `--offload-new-driver` to be set. +The behavior is dependent on using the clang-linker-wrapper. As the current +default offload compilation behavior is using the old offload model (driver +based), this option currently requires `--offload-new-driver` to be set. From bd71eb5473a5cdd70cf134412c6d09b3a93f9231 Mon Sep 17 00:00:00 2001 From: "Larsen, Steffen" Date: Mon, 31 Mar 2025 23:22:08 -0700 Subject: [PATCH 29/30] ignored -> unused Signed-off-by: Larsen, Steffen --- sycl/doc/design/SYCLBINDesign.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index c06fe79377603..bffbc498a5029 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -225,7 +225,7 @@ clang-offload-packager invocation to clang-linker-wrapper together with the new `--syclbin` flag. Setting this option will override `-fsycl`. Passing`-fsycl-device-only` with -`-fsyclbin` will cause the latter to be ignored. +`-fsyclbin` will cause the latter will be considered unused. The behavior is dependent on using the clang-linker-wrapper. As the current default offload compilation behavior is using the old offload model (driver From e61efb2a1d1b9be4eb52b28c605e8ae685fa39d9 Mon Sep 17 00:00:00 2001 From: Steffen Larsen Date: Wed, 2 Apr 2025 07:56:21 +0200 Subject: [PATCH 30/30] Update sycl/doc/design/SYCLBINDesign.md Co-authored-by: Michael Toguchi --- sycl/doc/design/SYCLBINDesign.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sycl/doc/design/SYCLBINDesign.md b/sycl/doc/design/SYCLBINDesign.md index bffbc498a5029..bd32a59319de4 100644 --- a/sycl/doc/design/SYCLBINDesign.md +++ b/sycl/doc/design/SYCLBINDesign.md @@ -225,7 +225,7 @@ clang-offload-packager invocation to clang-linker-wrapper together with the new `--syclbin` flag. Setting this option will override `-fsycl`. Passing`-fsycl-device-only` with -`-fsyclbin` will cause the latter will be considered unused. +`-fsyclbin` will cause `-fsycl-device-only` to be considered unused. The behavior is dependent on using the clang-linker-wrapper. As the current default offload compilation behavior is using the old offload model (driver