diff --git a/sycl/doc/GetStartedGuide.md b/sycl/doc/GetStartedGuide.md
index e5eb7b154f6d1..946da3a9fe367 100644
--- a/sycl/doc/GetStartedGuide.md
+++ b/sycl/doc/GetStartedGuide.md
@@ -12,6 +12,7 @@ and a wide range of compute accelerators such as GPU and FPGA.
   * [Build DPC++ toolchain with support for NVIDIA CUDA](#build-dpc-toolchain-with-support-for-nvidia-cuda)
   * [Build DPC++ toolchain with support for HIP AMD](#build-dpc-toolchain-with-support-for-hip-amd)
   * [Build DPC++ toolchain with support for HIP NVIDIA](#build-dpc-toolchain-with-support-for-hip-nvidia)
+  * [Build DPC++ toolchain with support for Native CPU](#build-dpc-toolchain-with-support-for-native-cpu)
   * [Build DPC++ toolchain with support for ARM processors](#build-dpc-toolchain-with-support-for-arm-processors)
   * [Build DPC++ toolchain with additional features enabled that require runtime/JIT compilation](#build-dpc-toolchain-with-additional-features-enabled-that-require-runtimejit-compilation)
   * [Build DPC++ toolchain with a custom Unified Runtime](#build-dpc-toolchain-with-a-custom-unified-runtime)
@@ -124,6 +125,7 @@ flags can be found by launching the script with `--help`):
 * `--hip-platform` -> select the platform used by the hip backend, `AMD` or
   `NVIDIA` (see [HIP AMD](#build-dpc-toolchain-with-support-for-hip-amd) or see
   [HIP NVIDIA](#build-dpc-toolchain-with-support-for-hip-nvidia))
+* `--native_cpu` -> use the Native CPU backend (see [Native CPU](#build-dpc-toolchain-with-support-for-native-cpu))
 * `--enable-all-llvm-targets` -> build compiler (but not a runtime) with all
   supported targets
 * `--shared-libs` -> Build shared libraries
@@ -298,6 +300,13 @@ as well as the CUDA Runtime API to be installed, see
 [NVIDIA CUDA Installation Guide for
 Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html).
 
+### Build DPC++ toolchain with support for Native CPU
+
+Native CPU is a CPU device which by default has no dependency other than DPC++. This device works with all CPU targets supported by the DPC++ runtime.
+Supported targets include x86, AArch64 and riscv_64.
+
+To enable Native CPU in a DPC++ build, add `--native_cpu` to the set of flags passed to `configure.py`.
+
 ### Build DPC++ toolchain with support for ARM processors
 
 There is no continuous integration for this, and there are no guarantees for supported platforms or configurations.
@@ -727,6 +736,13 @@ clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
   simple-sycl-app.cpp -o simple-sycl-app-cuda.exe
 ```
 
+When building for Native CPU, use the SYCL target `native_cpu`:
+
+```bash
+clang++ -fsycl -fsycl-targets=native_cpu simple-sycl-app.cpp -o simple-sycl-app.exe
+```
+More Native CPU build options can be found in [SYCLNativeCPU.md](design/SYCLNativeCPU.md).
+
 **Linux & Windows (64-bit)**:
 
 ```bash
diff --git a/sycl/doc/design/SYCLNativeCPU.md b/sycl/doc/design/SYCLNativeCPU.md
index b7fbb47d1064c..a47bad686b0f6 100644
--- a/sycl/doc/design/SYCLNativeCPU.md
+++ b/sycl/doc/design/SYCLNativeCPU.md
@@ -91,6 +91,8 @@ Whole Function Vectorization is enabled by default, and can be controlled throug
 * `-mllvm -sycl-native-cpu-no-vecz`: disable Whole Function Vectorization.
 * `-mllvm -sycl-native-cpu-vecz-width`: sets the vector width to the specified value, defaults to 8.
 
+The `-march=` option can be used to select a specific target CPU, which may improve the performance of the vectorized code.
+
 For more details on how the Whole Function Vectorizer is integrated for SYCL Native CPU, refer to the [Technical details](#technical-details) section.
 
 # Code coverage