@@ -103,13 +103,60 @@ time and link against it directly.
103103
104104Q: How to build an OpenMP AMDGPU offload capable compiler?
105105^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
106+ A subset of the `ROCm <https://github.com/ROCm>`_ toolchain is
107+ required to build the LLVM toolchain and to execute the OpenMP application.
108+ Either install ROCm somewhere that CMake's ``find_package`` can locate it, or
109+ build the required subcomponents ROCt and ROCr from source.
106110
107- The OpenMP AMDGPU offloading support depends on the ROCm math libraries and the
108- HSA ROCr / ROCt runtimes. These are normally provided by a standard ROCm
109- installation, but can be built and used independently if desired. Building the
110- libraries does not depend on these libraries by default by dynamically loading
111- the HSA runtime at program execution. As in the CUDA case, this can be change by
112- omitting ``amdgpu`` from the ``LIBOMPTARGET_DLOPEN_PLUGINS`` list.
111+ The two components used are ROCT-Thunk-Interface (roct) and ROCR-Runtime (rocr).
112+ Roct is the userspace part of the Linux driver. It calls into the driver, which
113+ ships with the Linux kernel. It is an implementation detail of Rocr from
114+ OpenMP's perspective. Rocr is an implementation of `HSA
115+ <http://www.hsafoundation.com>`_.
116+
117+ .. code-block:: text
118+
119+ SOURCE_DIR=same-as-llvm-source # e.g. the checkout of llvm-project, next to openmp
120+ BUILD_DIR=somewhere
121+ INSTALL_PREFIX=same-as-llvm-install
122+
123+ cd $SOURCE_DIR
124+ git clone git@github.com:ROCm/ROCT-Thunk-Interface.git -b roc-4.2.x \
125+ --single-branch
126+ git clone git@github.com:ROCm/ROCR-Runtime.git -b rocm-4.2.x \
127+ --single-branch
128+
129+ cd $BUILD_DIR && mkdir roct && cd roct
130+ cmake $SOURCE_DIR/ROCT-Thunk-Interface/ -DCMAKE_INSTALL_PREFIX=$INSTALL_PREFIX \
131+ -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF
132+ make && make install
133+
134+ cd $BUILD_DIR && mkdir rocr && cd rocr
135+ cmake $SOURCE_DIR/ROCR-Runtime/src -DIMAGE_SUPPORT=OFF \
136+ -DCMAKE_INSTALL_PREFIX=$INSTALL_PREFIX -DCMAKE_BUILD_TYPE=Release \
137+ -DBUILD_SHARED_LIBS=ON
138+ make && make install
139+
140+ ``IMAGE_SUPPORT`` requires building rocr with clang and is not used by OpenMP.
141+
142+ Provided CMake's ``find_package`` can find the ROCR-Runtime package, LLVM will
143+ build a tool, ``bin/amdgpu-arch``, which prints a string like ``gfx906`` when
144+ run if it recognises a GPU on the local system. LLVM will also build a shared
145+ library, ``libomptarget.rtl.amdgpu.so``, which is linked against rocr.
146+
147+ With those libraries installed, and LLVM then built and installed, try:
148+
149+ .. code-block:: shell
150+
151+ clang -O2 -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa example.c -o example && ./example
152+
153+ If your build machine is not the target machine or automatic detection of the
154+ available GPUs failed, you should also set:
155+
156+ - ``LIBOMPTARGET_DEVICE_ARCHITECTURES='gfx<xyz>;...'`` where ``<xyz>`` is the
157+ shader core instruction set architecture. For instance, set
158+ ``LIBOMPTARGET_DEVICE_ARCHITECTURES='gfx906;gfx90a'`` to target AMD GCN5
159+ and CDNA2 devices.
113160
114161Q: What are the known limitations of OpenMP AMDGPU offload?
115162^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^