@@ -92,8 +92,8 @@ git clone --config core.autocrlf=false https://github.com/intel/llvm -b sycl
 ## Build DPC++ toolchain
 
 The easiest way to get started is to use the buildbot
-[configure](../../buildbot/configure.py) and
-[compile](../../buildbot/compile.py) scripts.
+[configure](https://github.com/intel/llvm/blob/sycl/buildbot/configure.py) and
+[compile](https://github.com/intel/llvm/blob/sycl/buildbot/compile.py) scripts.
 
 In case you want to configure CMake manually the up-to-date reference for
 variables is in these files. Note that the CMake variables set by default by the [configure.py](../../buildbot/configure.py) script are the ones commonly used by
@@ -237,21 +237,21 @@ LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$DPCPP_HOME/llvm/build/lib ./a.out
 
 ### Build DPC++ toolchain with support for HIP AMD
 
-There is beta support for oneAPI DPC++ for HIP on AMD devices. It is not feature
-complete and it still contains known and unknown bugs. Currently it has only
-been tried on Linux, with ROCm 4.2.0, 4.3.0, 4.5.2, 5.3.0, and 5.4.3, using the
-AMD Radeon Pro W6800 (gtx1030), MI50 (gfx906), MI100 (gfx908) and MI250x
-(gfx90a) devices. The backend is tested by a relevant device/toolkit prior to a
-oneAPI plugin release. Go to the plugin release
-[pages](https://developer.codeplay.com/products/oneapi/amd) for further details.
-
 To enable support for HIP devices, follow the instructions for the Linux DPC++
 toolchain, but add the `--hip` flag to `configure.py`.
 
 Enabling this flag requires an installation of ROCm on the system; for
 instructions on how to install it, refer to the
 [AMD ROCm Installation Guide for Linux](https://rocmdocs.amd.com/en/latest/Installation_Guide/Installation-Guide.html).
 
+ROCm versions above 5.7 are recommended, as earlier versions don't have graph
+support. DPC++ aims to support new ROCm versions as they come out, so there may
+be a delay, but generally the latest ROCm version should work. The ROCm support
+is mostly tested on AMD Radeon Pro W6800 (gfx1030) and MI250x (gfx90a);
+however, other architectures supported by LLVM may work just fine. The full
+list of ROCm versions tested prior to oneAPI releases is listed on the plugin
+release [pages](https://developer.codeplay.com/products/oneapi/amd).
+
 The DPC++ build assumes that ROCm is installed in `/opt/rocm`; if it is
 installed somewhere else, the directory must be provided through the CMake
 variable `UR_HIP_ROCM_DIR`, which can be passed through to cmake using the
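Putting the pieces from this hunk together, a configure line with the HIP backend might look like the sketch below; the ROCm path is hypothetical and the `--cmake-opt` pass-through is only needed when ROCm is not in the default `/opt/rocm` location:

```shell
# Enable the HIP (AMD) backend when configuring the toolchain.
# The --cmake-opt flag forwards an extra variable to CMake; the ROCm
# path shown here is hypothetical, for a non-default install location.
python $DPCPP_HOME/llvm/buildbot/configure.py --hip \
  --cmake-opt=-DUR_HIP_ROCM_DIR=/opt/rocm-5.7.0
python $DPCPP_HOME/llvm/buildbot/compile.py
```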
@@ -280,7 +280,10 @@ by default when configuring for HIP. For more details on building LLD refer to
 
 ### Build DPC++ toolchain with support for HIP NVIDIA
 
-There is experimental support for oneAPI DPC++ for HIP on Nvidia devices.
+HIP applications can be built to target Nvidia GPUs, so in theory it is
+possible to build the DPC++ HIP support for Nvidia; however, this is not
+supported, so it may not work.
+
 There is no continuous integration for this and there are no guarantees for
 supported platforms or configurations.
@@ -292,13 +295,12 @@ To enable support for HIP NVIDIA devices, follow the instructions for the Linux
 DPC++ toolchain, but add the `--hip` and `--hip-platform NVIDIA` flags to
 `configure.py`.
 
-Enabling this flag requires HIP to be installed, more specifically
-[HIP NVCC](https://rocmdocs.amd.com/en/latest/Installation_Guide/HIP-Installation.html#nvidia-platform),
-as well as the CUDA Runtime API to be installed, see
-[NVIDIA CUDA Installation Guide for Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html).
-
-Currently, this has only been tried on Linux, with ROCm 4.2.0 or 4.3.0, with
-CUDA 11, and using a GeForce 1060 device.
+Enabling this flag requires HIP to be installed, specifically for Nvidia; see
+the Nvidia tab of the HIP installation docs
+[here](https://rocm.docs.amd.com/projects/HIP/en/latest/install/install.html),
+as well as the CUDA Runtime API; see the
+[NVIDIA CUDA Installation Guide for
+Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html).
 
 ### Build DPC++ toolchain with support for ARM processors
 
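As a sketch, the HIP NVIDIA variant of the configure step described above would be invoked along these lines (assuming the `$DPCPP_HOME` layout used elsewhere in this guide):

```shell
# Configure the HIP backend for Nvidia rather than the default AMD platform.
python $DPCPP_HOME/llvm/buildbot/configure.py --hip --hip-platform NVIDIA
python $DPCPP_HOME/llvm/buildbot/compile.py
```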
@@ -736,14 +738,6 @@ clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
 The results are correct!
 ```
 
-**NOTE**: Currently, when the application has been built with the CUDA target,
-the CUDA backend must be selected at runtime using the `ONEAPI_DEVICE_SELECTOR`
-environment variable.
-
-```bash
-ONEAPI_DEVICE_SELECTOR=cuda:* ./simple-sycl-app-cuda.exe
-```
-
 **NOTE**: oneAPI DPC++/SYCL developers can specify SYCL device for execution
 using device selectors (e.g. `sycl::cpu_selector_v`, `sycl::gpu_selector_v`,
 [Intel FPGA selector(s)](extensions/supported/sycl_ext_intel_fpga_device_selector.asciidoc))
@@ -777,6 +771,14 @@ clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
777771 -Xsycl-target-backend --cuda-gpu-arch=sm_80
778772` ` `
779773
774+ Additionally AMD and Nvidia targets also support aliases for the target to
775+ simplify passing the specific architectures, for example
776+ ` -fsycl-targets=nvidia_gpu_sm_80` is equivalent to
777+ ` -fsycl-targets=nvptx64-nvidia-cuda -Xsycl-target-backend
778+ --cuda-gpu-arch=sm_80` , the full list of available aliases is documented in the
779+ [Users Manual](UsersManual.md#generic-options), for the ` -fsycl-targets`
780+ option.
781+
780782To build simple-sycl-app ahead of time for GPU, CPU or Accelerator devices,
781783specify the target architecture. The examples provided use a supported
782784alias for the target, representing a full triple. Additional details can
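The alias form added in this hunk can be sketched as a single compile line; `simple-sycl-app.cpp` is the example source used elsewhere in this guide:

```shell
# Alias form: selects the CUDA target and the sm_80 architecture in one flag,
# equivalent to -fsycl-targets=nvptx64-nvidia-cuda plus
# -Xsycl-target-backend --cuda-gpu-arch=sm_80.
clang++ -fsycl -fsycl-targets=nvidia_gpu_sm_80 \
  simple-sycl-app.cpp -o simple-sycl-app-aot
```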
@@ -945,11 +947,14 @@ int CUDASelector(const sycl::device &Device) {
 
 ### HIP back-end limitations
 
-* Requires a ROCm compatible operating system, for full details of supported
-  Operating System for ROCm, please refer to the
-  [ROCm Supported Operating Systems](https://github.com/RadeonOpenCompute/ROCm#supported-operating-systems).
-* Support is still in a beta state, but the backend is being actively developed.
-* Global offsets are currently not supported.
+* Requires a ROCm compatible system and GPU; see the supported SKUs for
+  [Linux](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-skus)
+  and for
+  [Windows](https://rocm.docs.amd.com/projects/install-on-windows/en/latest/reference/system-requirements.html#supported-skus).
+* Windows for HIP is not supported by DPC++ at the moment, so it may not work.
+* `printf` within kernels is not supported.
+* C++ standard library functions using complex types are not supported;
+  `sycl::complex` should be used instead.
 
 ## Find More
 