@@ -90,8 +90,8 @@ git clone --config core.autocrlf=false https://github.com/intel/llvm -b sycl
 ## Build DPC++ toolchain

 The easiest way to get started is to use the buildbot
-[configure](../../buildbot/configure.py) and
-[compile](../../buildbot/compile.py) scripts.
+[configure](https://github.com/intel/llvm/blob/sycl/buildbot/configure.py) and
+[compile](https://github.com/intel/llvm/blob/sycl/buildbot/compile.py) scripts.

 In case you want to configure CMake manually the up-to-date reference for
 variables is in these files.
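As a quick sketch of how the two scripts are typically invoked (assuming the repository was cloned into `$DPCPP_HOME/llvm`, as in the clone step referenced at the top of this commit):

```shell
# Configure the build (creates $DPCPP_HOME/llvm/build by default),
# then compile the toolchain.
python $DPCPP_HOME/llvm/buildbot/configure.py
python $DPCPP_HOME/llvm/buildbot/compile.py
```

Backend-specific flags such as `--hip` or `--cuda` are passed to `configure.py`, as described in the sections below.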
@@ -233,21 +233,21 @@ LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$DPCPP_HOME/llvm/build/lib ./a.out
 
 ### Build DPC++ toolchain with support for HIP AMD

-There is beta support for oneAPI DPC++ for HIP on AMD devices. It is not feature
-complete and it still contains known and unknown bugs. Currently it has only
-been tried on Linux, with ROCm 4.2.0, 4.3.0, 4.5.2, 5.3.0, and 5.4.3, using the
-AMD Radeon Pro W6800 (gtx1030), MI50 (gfx906), MI100 (gfx908) and MI250x
-(gfx90a) devices. The backend is tested by a relevant device/toolkit prior to a
-oneAPI plugin release. Go to the plugin release
-[pages](https://developer.codeplay.com/products/oneapi/amd) for further details.
-
 To enable support for HIP devices, follow the instructions for the Linux DPC++
 toolchain, but add the `--hip` flag to `configure.py`.

 Enabling this flag requires an installation of ROCm on the system; for
 instructions on how to install it, refer to the
 [AMD ROCm Installation Guide for Linux](https://rocmdocs.amd.com/en/latest/Installation_Guide/Installation-Guide.html).

+ROCm versions above 5.7 are recommended, as earlier versions don't have graph
+support. DPC++ aims to support new ROCm versions as they come out, so there may
+be a delay, but generally the latest ROCm version should work. The ROCm support
+is mostly tested on AMD Radeon Pro W6800 (gfx1030) and MI250x (gfx90a);
+other architectures supported by LLVM may work just fine. The full list of ROCm
+versions tested prior to oneAPI releases is given on the plugin release
+[pages](https://developer.codeplay.com/products/oneapi/amd).
+
 The DPC++ build assumes that ROCm is installed in `/opt/rocm`; if it is
 installed somewhere else, the directory must be provided through the CMake
 variable `UR_HIP_ROCM_DIR`, which can be passed through to cmake using the
@@ -276,7 +276,10 @@ by default when configuring for HIP. For more details on building LLD refer to
 
 ### Build DPC++ toolchain with support for HIP NVIDIA

-There is experimental support for oneAPI DPC++ for HIP on Nvidia devices.
+HIP applications can be built to target Nvidia GPUs, so in theory it is possible
+to build the DPC++ HIP support for Nvidia; however, this is not supported, so it
+may not work.
+
 There is no continuous integration for this and there are no guarantees for
 supported platforms or configurations.
@@ -288,13 +291,12 @@ To enable support for HIP NVIDIA devices, follow the instructions for the Linux
 DPC++ toolchain, but add the `--hip` and `--hip-platform NVIDIA` flags to
 `configure.py`.

-Enabling this flag requires HIP to be installed, more specifically
-[HIP NVCC](https://rocmdocs.amd.com/en/latest/Installation_Guide/HIP-Installation.html#nvidia-platform),
-as well as the CUDA Runtime API to be installed, see
-[NVIDIA CUDA Installation Guide for Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html).
-
-Currently, this has only been tried on Linux, with ROCm 4.2.0 or 4.3.0, with
-CUDA 11, and using a GeForce 1060 device.
+Enabling this flag requires HIP to be installed, specifically for Nvidia; see
+the Nvidia tab of the HIP installation docs
+[here](https://rocm.docs.amd.com/projects/HIP/en/latest/install/install.html).
+The CUDA Runtime API must also be installed; see the
+[NVIDIA CUDA Installation Guide for Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html).

 ### Build DPC++ toolchain with support for ARM processors

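Putting the two flags named above together, a minimal configure invocation might look like this (assuming the buildbot scripts under `$DPCPP_HOME/llvm/buildbot`, as in the Linux instructions):

```shell
# Configure the toolchain with the HIP backend targeting the Nvidia
# platform, then build. Unsupported configuration: it may not work.
python $DPCPP_HOME/llvm/buildbot/configure.py --hip --hip-platform NVIDIA
python $DPCPP_HOME/llvm/buildbot/compile.py
```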
@@ -705,14 +707,6 @@ clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
 The results are correct!
 ```

-**NOTE**: Currently, when the application has been built with the CUDA target,
-the CUDA backend must be selected at runtime using the `ONEAPI_DEVICE_SELECTOR`
-environment variable.
-
-```bash
-ONEAPI_DEVICE_SELECTOR=cuda:* ./simple-sycl-app-cuda.exe
-```
-
 **NOTE**: oneAPI DPC++/SYCL developers can specify SYCL device for execution
 using device selectors (e.g. `sycl::cpu_selector_v`, `sycl::gpu_selector_v`,
 [Intel FPGA selector(s)](extensions/supported/sycl_ext_intel_fpga_device_selector.asciidoc))
@@ -746,6 +740,14 @@ clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
     -Xsycl-target-backend --cuda-gpu-arch=sm_80
 ```

+Additionally, the AMD and Nvidia targets also support aliases for the target to
+simplify passing the specific architectures; for example,
+`-fsycl-targets=nvidia_gpu_sm_80` is equivalent to
+`-fsycl-targets=nvptx64-nvidia-cuda -Xsycl-target-backend
+--cuda-gpu-arch=sm_80`. The full list of available aliases is documented in the
+[Users Manual](UsersManual.md#generic-options), under the `-fsycl-targets`
+option.
+
 To build simple-sycl-app ahead of time for GPU, CPU or Accelerator devices,
 specify the target architecture. The examples provided use a supported
 alias for the target, representing a full triple. Additional details can
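To illustrate the equivalence described above, the two spellings side by side (the source file name `simple-sycl-app.cpp` is taken from the surrounding guide; the output name is illustrative):

```shell
# Explicit target triple plus backend architecture flag:
clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
  -Xsycl-target-backend --cuda-gpu-arch=sm_80 \
  simple-sycl-app.cpp -o simple-sycl-app

# Equivalent alias form:
clang++ -fsycl -fsycl-targets=nvidia_gpu_sm_80 \
  simple-sycl-app.cpp -o simple-sycl-app
```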
749751To build simple-sycl-app ahead of time for GPU, CPU or Accelerator devices,
750752specify the target architecture. The examples provided use a supported
751753alias for the target, representing a full triple. Additional details can
@@ -914,11 +916,14 @@ int CUDASelector(const sycl::device &Device) {
 
 ### HIP back-end limitations

-* Requires a ROCm compatible operating system, for full details of supported
-  Operating System for ROCm, please refer to the
-  [ROCm Supported Operating Systems](https://github.com/RadeonOpenCompute/ROCm#supported-operating-systems).
-* Support is still in a beta state, but the backend is being actively developed.
-* Global offsets are currently not supported.
+* Requires a ROCm compatible system and GPU; see the supported SKUs for
+  [Linux](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-skus)
+  and for
+  [Windows](https://rocm.docs.amd.com/projects/install-on-windows/en/latest/reference/system-requirements.html#supported-skus).
+* Windows for HIP is not supported by DPC++ at the moment, so it may not work.
+* `printf` within kernels is not supported.
+* C++ standard library functions using complex types are not supported;
+  `sycl::complex` should be used instead.

 ## Find More
