<h2>Installation via Intel(R) Distribution for Python<a class="headerlink" href="#installation-via-intel-r-distribution-for-python" title="Permalink to this heading">¶</a></h2>
<p><a class="reference external" href="https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html">Intel(R) Distribution for Python*</a> is distributed as a conda-based installer
and includes <a class="reference internal" href="../api_reference/dpctl/index.html#module-dpctl" title="dpctl"><code class="xref py py-mod docutils literal notranslate"><span class="pre">dpctl</span></code></a> along with its dependencies and sister projects <a class="reference external" href="https://intelpython.github.io/dpnp/overview.html#module-dpnp" title="(in Data Parallel Extension for NumPy v0.19.0dev0+18.gcedd0d171f9)"><code class="xref py py-mod docutils literal notranslate"><span class="pre">dpnp</span></code></a>
<p>Once the installed environment is activated, <code class="docutils literal notranslate"><span class="pre">dpctl</span></code> should be ready to use.</p>
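<p>As an illustrative sketch only (the installation prefix below is an assumption and depends on how and where the installer was configured), activating the environment and running a quick smoke test might look like:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre># Hypothetical installation prefix; adjust to match your installer's layout
source /opt/intel/oneapi/intelpython/bin/activate
# Verify that dpctl imports and can enumerate the available SYCL platforms
python -c "import dpctl; dpctl.lsplatform()"</pre></div></div>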
<li><p><a class="reference external" href="https://developer.codeplay.com/products/oneapi/amd/">oneAPI for AMD GPUs</a></p></li>
</ul>
</div></blockquote>
<p>Builds for CUDA and AMD devices internally use SYCL alias targets that are passed to the compiler.
A full list of the available SYCL alias targets can be found in the
<aclass="reference external" href="https://intel.github.io/llvm/UsersManual.html">DPC++ Compiler User Manual</a>.</p>
<sectionid="cuda-build">
<h4>CUDA build<a class="headerlink" href="#cuda-build" title="Permalink to this heading">¶</a></h4>
<p><code class="docutils literal notranslate"><span class="pre">dpctl</span></code> can be built for CUDA devices using the <code class="docutils literal notranslate"><span class="pre">DPCTL_TARGET_CUDA</span></code> CMake option,
which accepts a specific compute architecture string:</p>
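<p>A minimal sketch, assuming the repository's <code class="docutils literal notranslate"><span class="pre">scripts/build_locally.py</span></code> helper and its <code class="docutils literal notranslate"><span class="pre">--cmake-opts</span></code> flag; <code class="docutils literal notranslate"><span class="pre">sm_80</span></code> is an example value and should be replaced with the architecture of your GPU:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre># Sketch: build dpctl for a specific CUDA architecture.
# sm_80 is an example; pick the value matching your device.
python scripts/build_locally.py --verbose \
    --cmake-opts="-DDPCTL_TARGET_CUDA=sm_80"</pre></div></div>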
<p>To use the default architecture (<code class="docutils literal notranslate"><span class="pre">sm_50</span></code>),
set <code class="docutils literal notranslate"><span class="pre">DPCTL_TARGET_CUDA</span></code> to a value such as <code class="docutils literal notranslate"><span class="pre">ON</span></code>, <code class="docutils literal notranslate"><span class="pre">TRUE</span></code>, <code class="docutils literal notranslate"><span class="pre">YES</span></code>, <code class="docutils literal notranslate"><span class="pre">Y</span></code>, or <code class="docutils literal notranslate"><span class="pre">1</span></code>:</p>
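<p>For example (again a sketch assuming the <code class="docutils literal notranslate"><span class="pre">scripts/build_locally.py</span></code> helper):</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre># Sketch: enable the CUDA target with the default sm_50 architecture
python scripts/build_locally.py --verbose \
    --cmake-opts="-DDPCTL_TARGET_CUDA=ON"</pre></div></div>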
<p>Note that kernels are built for the default architecture (<code class="docutils literal notranslate"><span class="pre">sm_50</span></code>), allowing them to work on a
wider range of architectures, but limiting the usage of more recent CUDA features.</p>
<p>For reference, compute architecture strings like <code class="docutils literal notranslate"><span class="pre">sm_80</span></code> correspond to specific
CUDA Compute Capabilities (e.g., Compute Capability 8.0 corresponds to <code class="docutils literal notranslate"><span class="pre">sm_80</span></code>).
A complete mapping between NVIDIA GPU models and their respective
<h4>AMD build<a class="headerlink" href="#amd-build" title="Permalink to this heading">¶</a></h4>
<p><code class="docutils literal notranslate"><span class="pre">dpctl</span></code> can be built for AMD devices using the <code class="docutils literal notranslate"><span class="pre">DPCTL_TARGET_HIP</span></code> CMake option,
which requires specifying a compute architecture string:</p>
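<p>A hedged sketch, again assuming the <code class="docutils literal notranslate"><span class="pre">scripts/build_locally.py</span></code> helper; <code class="docutils literal notranslate"><span class="pre">gfx90a</span></code> is an example architecture string and must be replaced with the value matching your AMD GPU:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre># Sketch: build dpctl for one AMD device architecture.
# gfx90a is an example; use the architecture of your GPU.
python scripts/build_locally.py --verbose \
    --cmake-opts="-DDPCTL_TARGET_HIP=gfx90a"</pre></div></div>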
<p>Note that the <cite>oneAPI for AMD GPUs</cite> plugin requires the architecture to be specified, and only
one architecture can be specified at a time.</p>
</section>
<sectionid="multi-target-build">
<h4>Multi-target build<a class="headerlink" href="#multi-target-build" title="Permalink to this heading">¶</a></h4>
<p>The default <code class="docutils literal notranslate"><span class="pre">dpctl</span></code> build from source enables support of Intel devices only.
Extending the build with a custom SYCL target additionally enables support of CUDA or AMD
devices in <code class="docutils literal notranslate"><span class="pre">dpctl</span></code>. The support can also be extended to enable both CUDA and AMD devices at once.</p>
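<p>As a sketch of such a combined configuration (same assumptions as above: the <code class="docutils literal notranslate"><span class="pre">scripts/build_locally.py</span></code> helper and example architecture values):</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre># Sketch: a multi-target build enabling Intel devices (default),
# CUDA devices, and one AMD device architecture at the same time.
python scripts/build_locally.py --verbose \
    --cmake-opts="-DDPCTL_TARGET_CUDA=ON -DDPCTL_TARGET_HIP=gfx90a"</pre></div></div>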
<li><aclass="reference internal" href="#building-from-source">Building from source</a><ul>
<li><aclass="reference internal" href="#building-locally-for-use-with-oneapi-dpc-installation">Building locally for use with oneAPI DPC++ installation</a></li>
<li><aclass="reference internal" href="#building-for-custom-sycl-targets">Building for custom SYCL targets</a><ul>