@@ -144,13 +144,33 @@ installation layout of compatible version. The following plugins from CodePlay a
144144Building ``dpnp`` also requires `building Data Parallel Control Library for custom SYCL targets.
145145<https://intelpython.github.io/dpctl/latest/beginners_guides/installation.html#building-for-custom-sycl-targets>`_
146146
147- ``dpnp`` can be built for CUDA devices as follows:
147+ ``dpnp`` can be built for CUDA devices using the ``--target-cuda`` argument.
148+
149+ To target a specific architecture (e.g., ``sm_80``):
150+
151+ .. code-block:: bash
152+
153+     python scripts/build_locally.py --target-cuda=sm_80
154+
155+ To use the default architecture (``sm_50``), run:
148156
149157.. code-block:: bash
150158
151-     python scripts/build_locally.py --target=cuda
159+     python scripts/build_locally.py --target-cuda
160+
161+ Note that kernels are built for ``sm_50`` by default, allowing them to work on a wider
162+ range of architectures, but limiting the usage of more recent CUDA features.
163+
164+ For reference, compute architecture strings like ``sm_80`` correspond to specific
165+ CUDA Compute Capabilities (e.g., Compute Capability 8.0 corresponds to ``sm_80``).
166+ A complete mapping between NVIDIA GPU models and their respective
167+ Compute Capabilities can be found in the official
168+ `CUDA GPU Compute Capability <https://developer.nvidia.com/cuda-gpus>`_ documentation.
169+
170+ A full list of available SYCL alias targets is available in the
171+ `DPC++ Compiler User Manual <https://intel.github.io/llvm/UsersManual.html>`_.
152172
153- And for AMD devices:
173+ To build for AMD devices, use:
154174
155175.. code-block:: bash
156176
@@ -179,7 +199,7 @@ architecture all at once:
179199
180200.. code-block:: bash
181201
182-     python scripts/build_locally.py --target=cuda --target-hip=gfx90a
202+     python scripts/build_locally.py --target-cuda --target-hip=gfx90a
183203
184204
185205 Testing
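
The added text says that a Compute Capability such as 8.0 corresponds to the architecture string `sm_80`. A minimal sketch of that mapping as a shell helper is shown below. The function name `cc_to_sm` is hypothetical and is not part of `dpnp` or its build scripts; it only assumes the "drop the dot and prefix with `sm_`" convention described in the diff.

```shell
# Hypothetical helper (not part of dpnp): turn a CUDA Compute Capability
# such as "8.0" into the matching architecture string ("sm_80").
# It removes the dot from its first argument and prefixes the result with "sm_".
cc_to_sm() {
    printf 'sm_%s\n' "$(printf '%s' "$1" | tr -d '.')"
}
```

Under these assumptions, the result could be passed straight to the build script, for example `python scripts/build_locally.py --target-cuda="$(cc_to_sm 8.0)"`. On drivers whose `nvidia-smi` supports the `compute_cap` query field, the capability of the installed GPU may be obtainable with `nvidia-smi --query-gpu=compute_cap --format=csv,noheader`; otherwise the NVIDIA table linked in the diff remains the reference.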