Commit 6d9d7fb

author
github-actions[doc-deploy-bot]
committed
Docs for pull request 2098
1 parent 97475e9 commit 6d9d7fb

File tree

10 files changed

+323
-91
lines changed

pulls/2098/_modules/dpctl.html

Lines changed: 0 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -923,10 +923,6 @@ <h1>Source code for dpctl</h1><div class="highlight"><pre>
923923
<span class="s2">&quot;utils&quot;</span><span class="p">,</span>
924924
<span class="p">]</span>
925925

926-
<span class="k">if</span> <span class="nb">hasattr</span><span class="p">(</span><span class="n">os</span><span class="p">,</span> <span class="s2">&quot;add_dll_directory&quot;</span><span class="p">):</span>
927-
<span class="c1"># Include folder containing DPCTLSyclInterface.dll to search path</span>
928-
<span class="n">os</span><span class="o">.</span><span class="n">add_dll_directory</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="vm">__file__</span><span class="p">))</span>
929-
930926

931927
<div class="viewcode-block" id="get_include"><a class="viewcode-back" href="../api_reference/dpctl/generated/dpctl.get_include.html#dpctl.get_include">[docs]</a><span class="k">def</span><span class="w"> </span><span class="nf">get_include</span><span class="p">():</span>
932928
<span class="w"> </span><span class="sa">r</span><span class="sd">&quot;&quot;&quot;</span>

pulls/2098/_modules/dpctl/tensor/_copy_utils.html

Lines changed: 102 additions & 63 deletions
Large diffs are not rendered by default.

pulls/2098/_sources/beginners_guides/installation.rst.txt

Lines changed: 26 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -166,14 +166,27 @@ A full list of available SYCL alias targets is available in the
166166
CUDA build
167167
~~~~~~~~~~
168168

169-
``dpctl`` can be built for CUDA devices using the ``DPCTL_TARGET_CUDA`` CMake option,
170-
which accepts a specific compute architecture string:
169+
``dpctl`` can be built for CUDA devices using the ``--target-cuda`` argument.
170+
171+
To target a specific architecture (e.g., ``sm_80``):
172+
173+
.. code-block:: bash
174+
175+
python scripts/build_locally.py --verbose --target-cuda=sm_80
176+
177+
To use the default architecture (``sm_50``), omit the value:
178+
179+
.. code-block:: bash
180+
181+
python scripts/build_locally.py --verbose --target-cuda
182+
183+
Alternatively, you can use the ``DPCTL_TARGET_CUDA`` CMake option:
171184

172185
.. code-block:: bash
173186
174187
python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_CUDA=sm_80"
175188
176-
To use the default architecture (``sm_50``),
189+
To use the default architecture (``sm_50``) with CMake options,
177190
set ``DPCTL_TARGET_CUDA`` to a value such as ``ON``, ``TRUE``, ``YES``, ``Y``, or ``1``:
178191

179192
.. code-block:: bash
@@ -192,12 +205,11 @@ Compute Capabilities can be found in the official
192205
AMD build
193206
~~~~~~~~~
194207

195-
``dpctl`` can be built for AMD devices using the ``DPCTL_TARGET_HIP`` CMake option,
196-
which requires specifying a compute architecture string:
208+
``dpctl`` can be built for AMD devices using the ``--target-hip`` argument.
197209

198210
.. code-block:: bash
199211
200-
python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_HIP=<arch>"
212+
python scripts/build_locally.py --verbose --target-hip=<arch>
201213
202214
Note that the `oneAPI for AMD GPUs` plugin requires the architecture be specified and only
203215
one architecture can be specified at a time.
@@ -208,11 +220,17 @@ To determine the architecture code (``<arch>``) for your AMD GPU, run:
208220
rocminfo | grep 'Name: *gfx.*'
209221
210222
This will print names like ``gfx90a``, ``gfx1030``, etc.
211-
You can then use one of them as the argument to ``-DDPCTL_TARGET_HIP``.
223+
You can then use one of them as the argument to ``--target-hip``.
212224

213225
For example:
214226

215227
.. code-block:: bash
228+
python scripts/build_locally.py --verbose --target-hip=gfx1030
229+
230+
Alternatively, you can use the ``DPCTL_TARGET_HIP`` CMake option:
231+
232+
.. code-block:: bash
233+
216234
python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_HIP=gfx1030"
217235
218236
Multi-target build
@@ -225,8 +243,7 @@ devices at the same time:
225243

226244
.. code-block:: bash
227245
228-
python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_CUDA=ON \
229-
-DDPCTL_TARGET_HIP=gfx1030"
246+
python scripts/build_locally.py --verbose --target-cuda --target-hip=gfx1030
230247
231248
Running Examples and Tests
232249
==========================
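
The build invocations shown in this hunk can also be assembled programmatically. The following is a minimal sketch, not part of ``dpctl``: the helper name ``build_command`` and its parameters are hypothetical, and only the ``--verbose``, ``--target-cuda`` and ``--target-hip`` arguments that appear in the diff above are used.

```python
import shlex


def build_command(cuda_arch=None, use_cuda=False, hip_arch=None, verbose=True):
    """Assemble an argv list for scripts/build_locally.py.

    Hypothetical helper for illustration; it only uses flags shown in the
    installation guide diff above.
    """
    argv = ["python", "scripts/build_locally.py"]
    if verbose:
        argv.append("--verbose")
    if cuda_arch is not None:
        # Explicit CUDA architecture, e.g. sm_80
        argv.append(f"--target-cuda={cuda_arch}")
    elif use_cuda:
        # No value: the guide states the default architecture (sm_50) is used
        argv.append("--target-cuda")
    if hip_arch is not None:
        # The guide states exactly one AMD architecture must be given
        argv.append(f"--target-hip={hip_arch}")
    return argv


if __name__ == "__main__":
    # Multi-target build: default CUDA architecture plus one AMD architecture
    print(shlex.join(build_command(use_cuda=True, hip_arch="gfx1030")))
```

Returning a list rather than a string keeps the result safe to pass to ``subprocess.run`` without shell quoting concerns.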

pulls/2098/_sources/user_guides/environment_variables.rst.txt

Lines changed: 73 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,12 @@ Environment variables
66

77
Behavior of :py:mod:`dpctl` is affected by :dpcpp_envar:`environment variables <>` that
88
affect DPC++ compiler runtime.
9+
Other relevant environment variables that may not be documented here can be found in:
10+
11+
- `oneAPI DPC++ <https://intel.github.io/llvm/EnvironmentVariables.html>`_
12+
13+
- `Level Zero <https://oneapi-src.github.io/level-zero-spec/level-zero/latest/core/PROG.html#environment-variables>`_
14+
915

1016
Variable ``ONEAPI_DEVICE_SELECTOR``
1117
-----------------------------------
@@ -50,3 +56,70 @@ The value of the variable is a bit-mask, with the following supported values:
5056
- Enables tracing of PI calls
5157
* - ``-1``
5258
- Enables all levels of tracing
59+
60+
.. _env_var_ze_flat_device_hierarchy:
61+
62+
Variable ``ZE_FLAT_DEVICE_HIERARCHY``
63+
-------------------------------------
64+
Allows users to define the device hierarchy model exposed by the Level Zero driver implementation.
65+
Keep in mind that :py:func:`dpctl.get_composite_devices` works only when this variable is set to ``COMBINED``.
66+
67+
.. list-table::
68+
:header-rows: 1
69+
70+
* - Value
71+
- Description
72+
* - ``COMBINED``
73+
- Level Zero devices with multiple tiles will be exposed as a set of root devices, each corresponding to an individual tile. These root devices are component devices, which can be queried for their corresponding composite device, and the composite device can in turn be queried for components. Dedicated composite device APIs will return non-trivial results.
74+
* - ``COMPOSITE``
75+
- Level Zero devices with multiple tiles will be exposed as a singular root device, with tiles accessible as sub-devices.
76+
* - ``FLAT``
77+
- Level Zero devices with multiple tiles will be exposed as a set of root devices, each corresponding to an individual tile. Enabled by default.
78+
79+
Read more about the device hierarchy in the `Level Zero Specification <https://oneapi-src.github.io/level-zero-spec/level-zero/latest/core/PROG.html#device-hierarchy>`_ and in the `Intel GPU article <https://www.intel.com/content/www/us/en/developer/articles/technical/flattening-gpu-tile-hierarchy.html>`_.
80+
81+
Variable ``ZE_AFFINITY_MASK``
82+
-------------------------------
83+
Allows users to mask specific devices from being used by SYCL applications.
84+
If ``ZE_FLAT_DEVICE_HIERARCHY`` is set to ``COMPOSITE``, setting ``ZE_AFFINITY_MASK=1`` makes the application see only device #1; system devices 0 and 2+ become invisible.
85+
86+
If ``ZE_FLAT_DEVICE_HIERARCHY`` is set to ``FLAT``, setting ``ZE_AFFINITY_MASK=1`` makes the application see only the second tile in the system, exposed as logical device #0.
87+
On a system with four dual-tile GPUs, this is the second tile of the first GPU. In ``FLAT`` mode, mask values are system-wide tile (sub-device) indices in a flat numbering.
88+
Therefore, the second tile of each of the four dual-tile GPUs can be selected with ``ZE_AFFINITY_MASK=1,3,5,7``.
89+
90+
| If we have ``ZE_FLAT_DEVICE_HIERARCHY`` set to ``COMBINED``, the way tiles and composite devices are exposed depends on the physical devices present and the value of ``ZE_AFFINITY_MASK``:
91+
| **If all exposed tiles (as determined by ZE_AFFINITY_MASK) belong to the same physical device:**
92+
| - That composite device is available to the application, and each tile is accessible as a component device of that composite device.
93+
94+
| **If the exposed tiles belong to different physical devices:**
95+
| - A composite device is available for each physical device, and the tiles are accessible as component devices of their respective composite device.
96+
97+
Additional examples are given in the detailed documentation for ``ZE_AFFINITY_MASK`` in the `Level Zero Specification <https://oneapi-src.github.io/level-zero-spec/level-zero/latest/core/PROG.html#affinity-mask>`_.
98+
99+
Variable ``ZE_ENABLE_PCI_ID_DEVICE_ORDER``
100+
------------------------------------------
101+
Forces the driver to report devices in order from lowest to highest PCI bus ID.
102+
103+
.. list-table::
104+
:header-rows: 1
105+
106+
* - Value
107+
- Description
108+
* - ``0``
109+
- Disabled. Default value.
110+
* - ``1``
111+
- Enabled.
112+
113+
Variable ``ZE_SHARED_FORCE_DEVICE_ALLOC``
114+
-----------------------------------------
115+
Forces all shared allocations into device memory.
116+
117+
.. list-table::
118+
:header-rows: 1
119+
120+
* - Value
121+
- Description
122+
* - ``0``
123+
- Disabled. Default value.
124+
* - ``1``
125+
- Enabled.
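
The ``FLAT`` numbering example for ``ZE_AFFINITY_MASK`` in this hunk (``1,3,5,7`` for the second tile of each of four dual-tile GPUs) can be reproduced with a few lines of arithmetic. This is a sketch only: the helper names ``tile_mask`` and ``level_zero_env`` are hypothetical, and the index formula ``gpu * tiles_per_gpu + tile`` is an assumption chosen to match the example in the text, not something taken from the Level Zero specification.

```python
import os

# Values listed in the ZE_FLAT_DEVICE_HIERARCHY table above
HIERARCHY_VALUES = {"COMBINED", "COMPOSITE", "FLAT"}


def tile_mask(num_gpus, tiles_per_gpu=2, tile=1):
    """Return a ZE_AFFINITY_MASK value selecting one tile on every GPU.

    Assumes flat, system-wide tile numbering: gpu * tiles_per_gpu + tile.
    """
    if not 0 <= tile < tiles_per_gpu:
        raise ValueError("tile index out of range")
    indices = [gpu * tiles_per_gpu + tile for gpu in range(num_gpus)]
    return ",".join(str(i) for i in indices)


def level_zero_env(hierarchy, mask=None, base=None):
    """Return a copy of the environment with the Level Zero variables set."""
    if hierarchy not in HIERARCHY_VALUES:
        raise ValueError(f"unsupported hierarchy: {hierarchy!r}")
    env = dict(os.environ if base is None else base)
    env["ZE_FLAT_DEVICE_HIERARCHY"] = hierarchy
    if mask is not None:
        env["ZE_AFFINITY_MASK"] = mask
    return env


if __name__ == "__main__":
    # Second tile of each of four dual-tile GPUs, as in the text above
    env = level_zero_env("FLAT", mask=tile_mask(4), base={})
    print(env)
```

The resulting dictionary can be passed as ``env=`` to ``subprocess.run`` when launching a script, which is a simple way to make sure the variables are in place before the runtime starts in the child process.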

pulls/2098/api_reference/dpctl/generated/dpctl.get_composite_devices.html

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -807,7 +807,9 @@ <h1>dpctl.get_composite_devices<a class="headerlink" href="#dpctl-get-composite-
807807
instances.</p>
808808
<p>Only available when <cite>ZE_FLAT_DEVICE_HIERARCHY=COMBINED</cite> is set in
809809
the environment, and only for specific Level Zero devices
810-
(i.e., those which expose multiple tiles as root devices).</p>
810+
(i.e., those which expose multiple tiles as root devices).
811+
To read more about <cite>ZE_FLAT_DEVICE_HIERARCHY=COMBINED</cite>,
812+
see <a class="reference internal" href="../../../user_guides/environment_variables.html#env-var-ze-flat-device-hierarchy"><span class="std std-ref">Variable ZE_FLAT_DEVICE_HIERARCHY</span></a>.</p>
811813
<p>For more information, see:
812814
<a class="reference external" href="https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_composite_device.asciidoc">https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_composite_device.asciidoc</a></p>
813815
<dl class="field-list simple">

pulls/2098/api_reference/dpctl/generated/generated/dpctl.tensor.usm_ndarray.__dlpack_device__.html

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -808,8 +808,8 @@ <h1>dpctl.tensor.usm_ndarray.__dlpack_device__<a class="headerlink" href="#dpctl
808808
<p>The tuple describes the non-partitioned device where the array has been
809809
allocated, or the non-partitioned parent device of the allocation
810810
device.</p>
811-
<p>See <code class="docutils literal notranslate"><span class="pre">DLDeviceType</span></code> for a list of devices supported by the DLPack
812-
protocol.</p>
811+
<p>See <a class="reference internal" href="../../tensor.constants.html#dpctl.tensor.DLDeviceType" title="dpctl.tensor.DLDeviceType"><code class="xref py py-class docutils literal notranslate"><span class="pre">dpctl.tensor.DLDeviceType</span></code></a> for a list of devices supported
812+
by the DLPack protocol.</p>
813813
<dl class="field-list simple">
814814
<dt class="field-odd">Raises<span class="colon">:</span></dt>
815815
<dd class="field-odd"><p><strong>DLPackCreationError</strong> – when the <code class="docutils literal notranslate"><span class="pre">device_id</span></code> could not be determined.</p>

pulls/2098/beginners_guides/installation.html

Lines changed: 20 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -868,7 +868,7 @@ <h2>Installation using pip<a class="headerlink" href="#installation-using-pip" t
868868
<section id="installation-via-intel-r-distribution-for-python">
869869
<h2>Installation via Intel(R) Distribution for Python<a class="headerlink" href="#installation-via-intel-r-distribution-for-python" title="Permalink to this heading"></a></h2>
870870
<p><a class="reference external" href="https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html">Intel(R) Distribution for Python*</a> is distributed as a conda-based installer
871-
and includes <a class="reference internal" href="../api_reference/dpctl/index.html#module-dpctl" title="dpctl"><code class="xref py py-mod docutils literal notranslate"><span class="pre">dpctl</span></code></a> along with its dependencies and sister projects <a class="reference external" href="https://intelpython.github.io/dpnp/overview.html#module-dpnp" title="(in Data Parallel Extension for NumPy v0.19.0dev1+15.g876e9403a7e)"><code class="xref py py-mod docutils literal notranslate"><span class="pre">dpnp</span></code></a>
871+
and includes <a class="reference internal" href="../api_reference/dpctl/index.html#module-dpctl" title="dpctl"><code class="xref py py-mod docutils literal notranslate"><span class="pre">dpctl</span></code></a> along with its dependencies and sister projects <a class="reference external" href="https://intelpython.github.io/dpnp/overview.html#module-dpnp" title="(in Data Parallel Extension for NumPy v0.19.0dev3+26.gf244f40aede)"><code class="xref py py-mod docutils literal notranslate"><span class="pre">dpnp</span></code></a>
872872
and <a class="reference external" href="https://intelpython.github.io/numba-dpex/latest/index.html#module-numba_dpex" title="(in numba-dpex)"><code class="xref py py-mod docutils literal notranslate"><span class="pre">numba_dpex</span></code></a>.</p>
873873
<p>Once the installed environment is activated, <code class="docutils literal notranslate"><span class="pre">dpctl</span></code> should be ready to use.</p>
874874
</section>
@@ -938,12 +938,20 @@ <h3>Building for custom SYCL targets<a class="headerlink" href="#building-for-cu
938938
<a class="reference external" href="https://intel.github.io/llvm/UsersManual.html">DPC++ Compiler User Manual</a>.</p>
939939
<section id="cuda-build">
940940
<h4>CUDA build<a class="headerlink" href="#cuda-build" title="Permalink to this heading"></a></h4>
941-
<p><code class="docutils literal notranslate"><span class="pre">dpctl</span></code> can be built for CUDA devices using the <code class="docutils literal notranslate"><span class="pre">DPCTL_TARGET_CUDA</span></code> CMake option,
942-
which accepts a specific compute architecture string:</p>
941+
<p><code class="docutils literal notranslate"><span class="pre">dpctl</span></code> can be built for CUDA devices using the <code class="docutils literal notranslate"><span class="pre">--target-cuda</span></code> argument.</p>
942+
<p>To target a specific architecture (e.g., <code class="docutils literal notranslate"><span class="pre">sm_80</span></code>):</p>
943+
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>scripts/build_locally.py<span class="w"> </span>--verbose<span class="w"> </span>--target-cuda<span class="o">=</span>sm_80
944+
</pre></div>
945+
</div>
946+
<p>To use the default architecture (<code class="docutils literal notranslate"><span class="pre">sm_50</span></code>), omit the value:</p>
947+
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>scripts/build_locally.py<span class="w"> </span>--verbose<span class="w"> </span>--target-cuda
948+
</pre></div>
949+
</div>
950+
<p>Alternatively, you can use the <code class="docutils literal notranslate"><span class="pre">DPCTL_TARGET_CUDA</span></code> CMake option:</p>
943951
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>scripts/build_locally.py<span class="w"> </span>--verbose<span class="w"> </span>--cmake-opts<span class="o">=</span><span class="s2">&quot;-DDPCTL_TARGET_CUDA=sm_80&quot;</span>
944952
</pre></div>
945953
</div>
946-
<p>To use the default architecture (<code class="docutils literal notranslate"><span class="pre">sm_50</span></code>),
954+
<p>To use the default architecture (<code class="docutils literal notranslate"><span class="pre">sm_50</span></code>) with CMake options,
947955
set <code class="docutils literal notranslate"><span class="pre">DPCTL_TARGET_CUDA</span></code> to a value such as <code class="docutils literal notranslate"><span class="pre">ON</span></code>, <code class="docutils literal notranslate"><span class="pre">TRUE</span></code>, <code class="docutils literal notranslate"><span class="pre">YES</span></code>, <code class="docutils literal notranslate"><span class="pre">Y</span></code>, or <code class="docutils literal notranslate"><span class="pre">1</span></code>:</p>
948956
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>scripts/build_locally.py<span class="w"> </span>--verbose<span class="w"> </span>--cmake-opts<span class="o">=</span><span class="s2">&quot;-DDPCTL_TARGET_CUDA=ON&quot;</span>
949957
</pre></div>
@@ -958,26 +966,28 @@ <h4>CUDA build<a class="headerlink" href="#cuda-build" title="Permalink to this
958966
</section>
959967
<section id="amd-build">
960968
<h4>AMD build<a class="headerlink" href="#amd-build" title="Permalink to this heading"></a></h4>
961-
<p><code class="docutils literal notranslate"><span class="pre">dpctl</span></code> can be built for AMD devices using the <code class="docutils literal notranslate"><span class="pre">DPCTL_TARGET_HIP</span></code> CMake option,
962-
which requires specifying a compute architecture string:</p>
963-
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>scripts/build_locally.py<span class="w"> </span>--verbose<span class="w"> </span>--cmake-opts<span class="o">=</span><span class="s2">&quot;-DDPCTL_TARGET_HIP=&lt;arch&gt;&quot;</span>
969+
<p><code class="docutils literal notranslate"><span class="pre">dpctl</span></code> can be built for AMD devices using the <code class="docutils literal notranslate"><span class="pre">--target-hip</span></code> argument.</p>
970+
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>scripts/build_locally.py<span class="w"> </span>--verbose<span class="w"> </span>--target-hip<span class="o">=</span>&lt;arch&gt;
964971
</pre></div>
965972
</div>
966973
<p>Note that the <cite>oneAPI for AMD GPUs</cite> plugin requires the architecture be specified and only
967974
one architecture can be specified at a time.</p>
968975
<p>To determine the architecture code (<code class="docutils literal notranslate"><span class="pre">&lt;arch&gt;</span></code>) for your AMD GPU, run:</p>
969976
<p>This will print names like <code class="docutils literal notranslate"><span class="pre">gfx90a</span></code>, <code class="docutils literal notranslate"><span class="pre">gfx1030</span></code>, etc.
970-
You can then use one of them as the argument to <code class="docutils literal notranslate"><span class="pre">-DDPCTL_TARGET_HIP</span></code>.</p>
977+
You can then use one of them as the argument to <code class="docutils literal notranslate"><span class="pre">--target-hip</span></code>.</p>
971978
<p>For example:</p>
979+
<p>Alternatively, you can use the <code class="docutils literal notranslate"><span class="pre">DPCTL_TARGET_HIP</span></code> CMake option:</p>
980+
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>scripts/build_locally.py<span class="w"> </span>--verbose<span class="w"> </span>--cmake-opts<span class="o">=</span><span class="s2">&quot;-DDPCTL_TARGET_HIP=gfx1030&quot;</span>
981+
</pre></div>
982+
</div>
972983
</section>
973984
<section id="multi-target-build">
974985
<h4>Multi-target build<a class="headerlink" href="#multi-target-build" title="Permalink to this heading"></a></h4>
975986
<p>The default <code class="docutils literal notranslate"><span class="pre">dpctl</span></code> build from the source enables support of Intel devices only.
976987
Extending the build with a custom SYCL target additionally enables support of CUDA or AMD
977988
device in <code class="docutils literal notranslate"><span class="pre">dpctl</span></code>. Besides, the support can be also extended to enable both CUDA and AMD
978989
devices at the same time:</p>
979-
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>scripts/build_locally.py<span class="w"> </span>--verbose<span class="w"> </span>--cmake-opts<span class="o">=</span><span class="s2">&quot;-DDPCTL_TARGET_CUDA=ON \</span>
980-
<span class="s2">-DDPCTL_TARGET_HIP=gfx1030&quot;</span>
990+
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>scripts/build_locally.py<span class="w"> </span>--verbose<span class="w"> </span>--target-cuda<span class="w"> </span>--target-hip<span class="o">=</span>gfx1030
981991
</pre></div>
982992
</div>
983993
</section>

pulls/2098/objects.inv

74 Bytes
Binary file not shown.

pulls/2098/searchindex.js

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

0 commit comments
