intel · uditagarwal97 · Apr 16, 2025 · Apr 12, 2025 · Apr 15, 2025 · Apr 15, 2025
@@ -134,6 +134,9 @@ def do_configure(args, passthrough_args):
     if args.use_lld:
         llvm_enable_lld = "ON"
 
+    if args.use_zstd:
+        llvm_enable_zstd = "FORCE_ON"
+
     # CI Default conditionally appends to options, keep it at the bottom of
     # args handling
     if args.ci_defaults:
@@ -417,6 +420,9 @@ def main():
         "--native-cpu-libclc-targets",
         help="Target triples for libclc, used by the Native CPU backend",
     )
+    parser.add_argument(
+        "--use-zstd", action="store_true", help="Force zstd linkage while building."
+    )
     args, passthrough_args = parser.parse_known_intermixed_args()
 
     print("args:{}".format(args))

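The flag handling above can be illustrated with a minimal, self-contained sketch. It assumes that the script's default for `llvm_enable_zstd` is `"ON"` and that the value is forwarded to CMake as `-DLLVM_ENABLE_ZSTD=...`; neither is visible in this hunk. `zstd_cmake_args` is a hypothetical helper used only for illustration and is not part of `configure.py`.

```python
import argparse


def zstd_cmake_args(argv):
    """Hypothetical helper: map --use-zstd to a CMake option string."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--use-zstd", action="store_true", help="Force zstd linkage while building."
    )
    # Unknown flags are collected separately instead of causing an error.
    args, _passthrough = parser.parse_known_args(argv)

    # Assumption: the default lets CMake use zstd only if it is found.
    llvm_enable_zstd = "ON"
    if args.use_zstd:
        # FORCE_ON turns a missing zstd into a configuration error.
        llvm_enable_zstd = "FORCE_ON"

    # Assumption: the value is passed to CMake through LLVM_ENABLE_ZSTD.
    return ["-DLLVM_ENABLE_ZSTD={}".format(llvm_enable_zstd)]


print(zstd_cmake_args(["--use-zstd"]))
print(zstd_cmake_args([]))
```

Because the flag uses `store_true`, omitting it leaves the assumed default untouched, so existing invocations of the script would behave as before.
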
@@ -1113,9 +1113,13 @@ class BinaryWrapper {
         if (OffloadCompressDevImgs && !llvm::compression::zstd::isAvailable()) {
           return createStringError(
               inconvertibleErrorCode(),
-              "'--offload-compress' option is specified but zstd "
-              "is not available. The device image will not be "
-              "compressed.");
+              "'--offload-compress' option is specified but the compiler is "
+              "built without zstd support.\n"
+              "If you are using a custom DPC++ build, please refer to "
+              "https://github.com/intel/llvm/blob/sycl/sycl/doc/"
+              "GetStartedGuide.md#build-dpc-toolchain-with-device-image-"
+              "compression-support"
+              " for more information on how to build with zstd support.");
         }
 
         // Don't compress if the user explicitly specifies the binary image

@@ -15,6 +15,7 @@ and a wide range of compute accelerators such as GPU and FPGA.
     * [Build DPC++ toolchain with support for ARM processors](#build-dpc-toolchain-with-support-for-arm-processors)
     * [Build DPC++ toolchain with support for runtime kernel fusion and JIT compilation](#build-dpc-toolchain-with-support-for-runtime-kernel-fusion-and-jit-compilation)
     * [Build DPC++ toolchain with a custom Unified Runtime](#build-dpc-toolchain-with-a-custom-unified-runtime)
+    * [Build DPC++ toolchain with device image compression support](#build-dpc-toolchain-with-device-image-compression-support)
     * [Build Doxygen documentation](#build-doxygen-documentation)
     * [Deployment](#deployment)
   * [Use DPC++ toolchain](#use-dpc-toolchain)
@@ -47,6 +48,7 @@ and a wide range of compute accelerators such as GPU and FPGA.
 | [Ninja](https://github.com/ninja-build/ninja/wiki/Pre-built-Ninja-packages) |                                                                                                                                      |
 | `hwloc`                                                                     | >= 2.3 (Linux only, `libhwloc-dev` or `hwloc-devel`)                                                                                 |
 | C++ compiler                                                                | [See LLVM](https://github.com/intel/llvm/blob/sycl/llvm/docs/GettingStarted.rst#host-c-toolchain-both-compiler-and-standard-library) |
+| `zstd` (optional) | >= 1.4.8 (see [ZSTD](#build-dpc-toolchain-with-device-image-compression-support)) |
 
 Alternatively, you can create a Docker image that has everything you need for
 building pre-installed using the [Ubuntu 24.04 build Dockerfile](https://github.com/intel/llvm/blob/sycl/devops/containers/ubuntu2404_build.Dockerfile).
@@ -94,7 +96,8 @@ The easiest way to get started is to use the buildbot
 [compile](../../buildbot/compile.py) scripts.
 
 In case you want to configure CMake manually the up-to-date reference for
-variables is in these files.
+variables is in these files. Note that the CMake variables that the [configure.py](../../buildbot/configure.py) script sets by default are the ones commonly used by
+DPC++ developers and might not necessarily suffice for your project-specific needs.
 
 **Linux**:
 
@@ -127,6 +130,7 @@ flags can be found by launching the script with `--help`):
 * `-t` -> Build type (Debug or Release)
 * `-o` -> Path to build directory
 * `--cmake-gen` -> Set build system type (e.g. `--cmake-gen "Unix Makefiles"`)
+* `--use-zstd` -> Force linking against zstd while building LLVM (see [ZSTD](#build-dpc-toolchain-with-device-image-compression-support))
 
 You can use the following flags with `compile.py` (full list of available flags
 can be found by launching the script with `--help`):
@@ -320,6 +324,34 @@ DPC++ toolchain, but add the `--disable-jit` flag.
 Both kernel fusion and JIT compilation of AMD and Nvidia kernels are currently
 not yet supported on the Windows platform.
 
+### Build DPC++ toolchain with device image compression support
+
+Device image compression compresses device code (SYCL kernels) during compilation and decompresses it on demand at application runtime.
+This reduces the size of fat binaries for both Just-in-Time (JIT) and Ahead-of-Time (AOT) compilation. Refer to the [blog post](https://www.intel.com/content/www/us/en/developer/articles/technical/sycl-compilation-device-image-compression.html) for more details on this feature.
+
+To enable device image compression, you need to build the DPC++ toolchain with the
+zstd compression library. By default, zstd is optional for DPC++ builds: CMake searches for a zstd installation and, if none is found, the build does not fail
+and this feature is simply disabled.
+
+To override this behavior and force the build to use zstd, pass the `--use-zstd` flag to the `configure.py` script. With this flag, configuration fails if zstd cannot be found.
+
+#### How to obtain zstd?
+
+The minimum zstd version that we have tested with is *1.4.8*.
+
+**Linux**:
+
+You can install zstd using the package manager of your distribution. For example, on Ubuntu, you can run:
+```sh
+sudo apt-get install libzstd-dev
+```
+Note that the `libzstd-dev` package provided on Ubuntu 24.04 has a bug: the zstd static library is not built with the `-fPIC` flag, so linking to it results in a build failure (see, for example, [Issue #15935](https://github.com/intel/llvm/issues/15935)). As an alternative, zstd can be built from source either manually or by using the [build_zstd_1_5_6_ub24.sh](https://github.com/intel/llvm/blob/sycl/devops/scripts/build_zstd_1_5_6_ub24.sh) script.
+
+**Windows**:
+
+For Windows, prebuilt zstd binaries can be obtained from the [facebook/zstd](https://github.com/facebook/zstd/releases/tag/v1.5.6) release page. After obtaining the zstd binaries, add the path
+to the zstd installation directory to the `PATH` environment variable.
+
 ### Build Doxygen documentation
 
 Building Doxygen documentation is similar to building the product itself. First,

@@ -203,13 +203,15 @@ and not recommended to use in production environment.
 **`--offload-compress`**
 
     Enables device image compression for SYCL offloading. Device images
-    are compressed using `zstd` compression algorithm and only if their size
+    are compressed using the zstd compression algorithm and only if their size
     exceeds 512 bytes.
+    To use this option, DPC++ must be built with zstd support. Otherwise,
+    the compiler will report an error during compilation.
     Default value is false.
 
 **`--offload-compression-level=<int>`**
 
-    `zstd` compression level used to compress device images when `--offload-
+    zstd compression level used to compress device images when `--offload-
     compress` is enabled.
     The default value is 10.
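
The size threshold described for `--offload-compress` can be illustrated with a short sketch. This is not the compiler's implementation: `maybe_compress` is a hypothetical function, and Python's standard `zlib` module stands in for zstd, which is why the level here defaults to 9 (the highest zlib accepts) rather than the documented zstd default of 10.

```python
import zlib

# From the option description: only images larger than 512 bytes are compressed.
COMPRESSION_THRESHOLD_BYTES = 512


def maybe_compress(image: bytes, enabled: bool, level: int = 9):
    """Hypothetical helper: return (payload, was_compressed)."""
    if not enabled or len(image) <= COMPRESSION_THRESHOLD_BYTES:
        # Compression disabled, or image too small: keep the bytes as they are.
        return image, False
    # Assumption for illustration only: zlib is used in place of zstd.
    return zlib.compress(image, level), True


small_image = b"\x00" * 512   # exactly at the threshold, so left uncompressed
large_image = b"\x00" * 4096  # above the threshold, so compressed when enabled

print(maybe_compress(small_image, enabled=True)[1])
print(maybe_compress(large_image, enabled=True)[1])
print(maybe_compress(large_image, enabled=False)[1])
```

Returning a flag alongside the payload mirrors the idea that a consumer must know whether to decompress an image before using it.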