Commit fde66f6

github-actions[doc-deploy-bot] committed
Docs for pull request 2098
1 parent c997447 · commit fde66f6

5 files changed: +78 / -12 lines

pulls/2098/_modules/dpctl/tensor/_set_functions.html

Lines changed: 1 addition & 1 deletion

@@ -1527,7 +1527,7 @@ Source code for dpctl.tensor._set_functions
     dep_evs = _manager.submitted_events

     if x_dt != dt:
-        x_buf = _empty_like_orderK(x_arr, dt, res_usm_type, sycl_dev)
+        x_buf = _empty_like_orderK(x_arr, dt, res_usm_type, exec_q)
         ht_ev, ev = _copy_usm_ndarray_into_usm_ndarray(
             src=x_arr, dst=x_buf, sycl_queue=exec_q, depends=dep_evs
         )
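For context, the one-line fix passes the execution queue (exec_q) rather than a device when staging a dtype-converted copy of the input. Below is a minimal, hedged sketch of that pattern using only the public dpctl.tensor API; the names x, dt, and exec_q mirror the diff, while the internal helpers _empty_like_orderK and _copy_usm_ndarray_into_usm_ndarray are deliberately not used here.

    # Illustrative sketch only (not the commit's internal code): stage a
    # dtype-converted copy of x on the execution queue using public dpctl API.
    import dpctl
    import dpctl.tensor as dpt

    exec_q = dpctl.SyclQueue()          # queue the set-function kernels run on
    x = dpt.arange(10, dtype="int32", sycl_queue=exec_q)
    dt = dpt.int64                      # dtype required by the computation

    if x.dtype != dt:
        # allocate the staging buffer against the execution queue (the point
        # of the fix), so the copy and later kernels target the same queue
        x_buf = dpt.empty_like(x, dtype=dt, sycl_queue=exec_q)
        x_buf[...] = x                  # casting copy submitted on exec_q
    else:
        x_buf = x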

pulls/2098/_sources/beginners_guides/installation.rst.txt

Lines changed: 37 additions & 4 deletions

@@ -159,13 +159,41 @@ The following plugins from CodePlay are supported:
 .. _codeplay_nv_plugin: https://developer.codeplay.com/products/oneapi/nvidia/
 .. _codeplay_amd_plugin: https://developer.codeplay.com/products/oneapi/amd/

-``dpctl`` can be built for CUDA devices as follows:
+Builds for CUDA and AMD devices internally use SYCL alias targets that are passed to the compiler.
+A full list of available SYCL alias targets can be found in the
+`DPC++ Compiler User Manual <https://intel.github.io/llvm/UsersManual.html>`_.
+
+CUDA build
+~~~~~~~~~~
+
+``dpctl`` can be built for CUDA devices using the ``DPCTL_TARGET_CUDA`` CMake option,
+which accepts a specific compute architecture string:
+
+.. code-block:: bash
+
+    python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_CUDA=sm_80"
+
+To use the default architecture (``sm_50``),
+set ``DPCTL_TARGET_CUDA`` to a value such as ``ON``, ``TRUE``, ``YES``, ``Y``, or ``1``:

 .. code-block:: bash

     python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_CUDA=ON"

-And for AMD devices
+Note that kernels are then built for the default architecture (``sm_50``), which allows them
+to run on a wider range of architectures but limits the use of more recent CUDA features.
+
+For reference, compute architecture strings like ``sm_80`` correspond to specific
+CUDA Compute Capabilities (e.g., Compute Capability 8.0 corresponds to ``sm_80``).
+A complete mapping between NVIDIA GPU models and their respective
+Compute Capabilities can be found in the official
+`CUDA GPU Compute Capability <https://developer.nvidia.com/cuda-gpus>`_ documentation.
+
+AMD build
+~~~~~~~~~
+
+``dpctl`` can be built for AMD devices using the ``DPCTL_TARGET_HIP`` CMake option,
+which requires specifying a compute architecture string:

 .. code-block:: bash

@@ -174,8 +202,13 @@ And for AMD devices
 Note that the `oneAPI for AMD GPUs` plugin requires the architecture be specified and only
 one architecture can be specified at a time.

-It is, however, possible to build for Intel devices, CUDA devices, and an AMD device
-architecture all at once:
+Multi-target build
+~~~~~~~~~~~~~~~~~~
+
+By default, ``dpctl`` is built from source with support for Intel devices only.
+Extending the build with a custom SYCL target additionally enables support for CUDA or
+AMD devices in ``dpctl``. Support can also be extended to enable both CUDA and AMD
+devices at the same time:

 .. code-block:: bash
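After a multi-target build like the one above, a quick way to confirm that the CUDA and HIP backends are actually visible is to enumerate devices from Python. This is a hedged sketch, not part of the commit; it assumes the CodePlay plugins are installed and a recent dpctl whose dpctl.get_devices accepts backend and device_type filters (including the hip backend).

    import dpctl

    # List every SYCL device the freshly built dpctl can see
    for dev in dpctl.get_devices():
        print(dev.backend, dev.device_type, dev.name)

    # Filter by backend; each call returns an empty list if nothing matches
    cuda_gpus = dpctl.get_devices(backend="cuda", device_type="gpu")
    hip_gpus = dpctl.get_devices(backend="hip", device_type="gpu")
    print(f"CUDA GPUs: {len(cuda_gpus)}, HIP GPUs: {len(hip_gpus)}")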

pulls/2098/beginners_guides/installation.html

Lines changed: 39 additions & 6 deletions

This is the built HTML counterpart of the installation.rst.txt change above.

@@ -867,7 +867,7 @@ <h2>Installation using pip
 <section id="installation-via-intel-r-distribution-for-python">
 <h2>Installation via Intel(R) Distribution for Python<a class="headerlink" href="#installation-via-intel-r-distribution-for-python" title="Permalink to this heading"></a></h2>
 <p><a class="reference external" href="https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html">Intel(R) Distribution for Python*</a> is distributed as a conda-based installer
-and includes <a class="reference internal" href="../api_reference/dpctl/index.html#module-dpctl" title="dpctl"><code class="xref py py-mod docutils literal notranslate"><span class="pre">dpctl</span></code></a> along with its dependencies and sister projects <a class="reference external" href="https://intelpython.github.io/dpnp/overview.html#module-dpnp" title="(in Data Parallel Extension for NumPy v0.19.0dev0+15.g0d012506707)"><code class="xref py py-mod docutils literal notranslate"><span class="pre">dpnp</span></code></a>
+and includes <a class="reference internal" href="../api_reference/dpctl/index.html#module-dpctl" title="dpctl"><code class="xref py py-mod docutils literal notranslate"><span class="pre">dpctl</span></code></a> along with its dependencies and sister projects <a class="reference external" href="https://intelpython.github.io/dpnp/overview.html#module-dpnp" title="(in Data Parallel Extension for NumPy v0.19.0dev0+18.gcedd0d171f9)"><code class="xref py py-mod docutils literal notranslate"><span class="pre">dpnp</span></code></a>
 and <a class="reference external" href="https://intelpython.github.io/numba-dpex/latest/index.html#module-numba_dpex" title="(in numba-dpex)"><code class="xref py py-mod docutils literal notranslate"><span class="pre">numba_dpex</span></code></a>.</p>
 <p>Once the installed environment is activated, <code class="docutils literal notranslate"><span class="pre">dpctl</span></code> should be ready to use.</p>
 </section>

@@ -932,24 +932,52 @@ <h3>Building for custom SYCL targets
The rendered body of this hunk mirrors the RST diff above: the paragraphs
"dpctl can be built for CUDA devices as follows:" and "And for AMD devices" are replaced by the
new prose and bash blocks, rendered as three new subsections with their own anchors,
<section id="cuda-build"> ("CUDA build"), <section id="amd-build"> ("AMD build"), and
<section id="multi-target-build"> ("Multi-target build").

@@ -1041,7 +1069,12 @@ <h3>Running the Python Tests
 <li><a class="reference internal" href="#system-requirements">System requirements</a></li>
 <li><a class="reference internal" href="#building-from-source">Building from source</a><ul>
 <li><a class="reference internal" href="#building-locally-for-use-with-oneapi-dpc-installation">Building locally for use with oneAPI DPC++ installation</a></li>
-<li><a class="reference internal" href="#building-for-custom-sycl-targets">Building for custom SYCL targets</a></li>
+<li><a class="reference internal" href="#building-for-custom-sycl-targets">Building for custom SYCL targets</a><ul>
+<li><a class="reference internal" href="#cuda-build">CUDA build</a></li>
+<li><a class="reference internal" href="#amd-build">AMD build</a></li>
+<li><a class="reference internal" href="#multi-target-build">Multi-target build</a></li>
+</ul>
+</li>
 </ul>
 </li>
 <li><a class="reference internal" href="#running-examples-and-tests">Running Examples and Tests</a><ul>

pulls/2098/objects.inv

Binary file (0 bytes changed); not shown.

pulls/2098/searchindex.js

Lines changed: 1 addition & 1 deletion
Generated file; diff not rendered by default.

0 commit comments
