Commit c501180

author
github-actions[doc-deploy-bot]
committed
Docs for pull request 1872
1 parent b55b0de commit c501180

6 files changed

+85
-25
lines changed

pulls/1872/_modules/dpctl/_sycl_timer.html

Lines changed: 40 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -896,8 +896,7 @@ <h1>Source code for dpctl._sycl_timer</h1><div class="highlight"><pre>
896896

897897
<div class="viewcode-block" id="SyclTimer"><a class="viewcode-back" href="../../api_reference/dpctl/generated/dpctl.SyclTimer.html#dpctl.SyclTimer">[docs]</a><span class="k">class</span> <span class="nc">SyclTimer</span><span class="p">:</span>
898898
<span class="w"> </span><span class="sd">&quot;&quot;&quot;</span>
899-
<span class="sd"> Context to measure device time and host wall-time of execution</span>
900-
<span class="sd"> of commands submitted to :class:`dpctl.SyclQueue`.</span>
899+
<span class="sd"> Context to time execution of tasks submitted to :class:`dpctl.SyclQueue`.</span>
901900

902901
<span class="sd"> :Example:</span>
903902
<span class="sd"> .. code-block:: python</span>
@@ -911,13 +910,18 @@ <h1>Source code for dpctl._sycl_timer</h1><div class="highlight"><pre>
911910
<span class="sd"> milliseconds_sc = 1e3</span>
912911
<span class="sd"> timer = dpctl.SyclTimer(time_scale = milliseconds_sc)</span>
913912

913+
<span class="sd"> untimed_code_block_1</span>
914914
<span class="sd"> # use the timer</span>
915915
<span class="sd"> with timer(queue=q):</span>
916-
<span class="sd"> code_block1</span>
916+
<span class="sd"> timed_code_block1</span>
917+
918+
<span class="sd"> untimed_code_block_2</span>
917919

918920
<span class="sd"> # use the timer</span>
919921
<span class="sd"> with timer(queue=q):</span>
920-
<span class="sd"> code_block2</span>
922+
<span class="sd"> timed_code_block2</span>
923+
924+
<span class="sd"> untimed_code_block_3</span>
921925

922926
<span class="sd"> # retrieve elapsed times in milliseconds</span>
923927
<span class="sd"> wall_dt, device_dt = timer.dt</span>
@@ -928,16 +932,41 @@ <h1>Source code for dpctl._sycl_timer</h1><div class="highlight"><pre>
928932
<span class="sd"> associated with these submissions to perform the timing. Thus</span>
929933
<span class="sd"> :class:`dpctl.SyclTimer` requires the queue with ``&quot;enable_profiling&quot;``</span>
930934
<span class="sd"> property. In order to be able to collect the profiling information,</span>
931-
<span class="sd"> the ``dt`` property ensures that both submitted barriers complete their</span>
932-
<span class="sd"> execution and thus effectively synchronizes the queue.</span>
935+
<span class="sd"> the ``dt`` property ensures that both tasks submitted by the timer</span>
936+
<span class="sd"> complete their execution and thus effectively synchronizes the queue.</span>
937+
938+
<span class="sd"> Execution of the above example results in the following task graph,</span>
939+
<span class="sd"> where each group of tasks is ordered after the one preceding it,</span>
940+
<span class="sd"> ``[tasks_of_untimed_block1]``, ``[timer_fence_start_task]``,</span>
941+
<span class="sd"> ``[tasks_of_timed_block1]``, ``[timer_fence_finish_task]``,</span>
942+
<span class="sd"> ``[tasks_of_untimed_block2]``, ``[timer_fence_start_task]``,</span>
943+
<span class="sd"> ``[tasks_of_timed_block2]``, ``[timer_fence_finish_task]``,</span>
944+
<span class="sd"> ``[tasks_of_untimed_block3]``.</span>
933945

934-
<span class="sd"> `device_timer` keyword argument controls the type of tasks submitted.</span>
935-
<span class="sd"> With `device_timer=&quot;queue_barrier&quot;`, queue barrier tasks are used. With</span>
936-
<span class="sd"> `device_timer=&quot;order_manager&quot;`, a single empty body task is inserted</span>
937-
<span class="sd"> instead relying on order manager (used by `dpctl.tensor` operations) to</span>
946+
<span class="sd"> ``device_timer`` keyword argument controls the type of tasks submitted.</span>
947+
<span class="sd"> With ``&quot;queue_barrier&quot;`` value, queue barrier tasks are used. With</span>
948+
<span class="sd"> ``&quot;order_manager&quot;`` value, a single empty body task is inserted</span>
949+
<span class="sd"> and order manager (used by all `dpctl.tensor` operations) is used to</span>
938950
<span class="sd"> order these tasks so that they fence operations performed within</span>
939951
<span class="sd"> timer&#39;s context.</span>
940952

953+
<span class="sd"> Timing offloading operations that do not use the order manager with</span>
954+
<span class="sd"> the timer that uses ``&quot;order_manager&quot;`` as ``device_timer`` value</span>
955+
<span class="sd"> will be misleading becaused the tasks submitted by the timer will not</span>
956+
<span class="sd"> be ordered with respect to tasks we intend to time.</span>
957+
958+
<span class="sd"> Note that the host timer effectively measures the time of task</span>
959+
<span class="sd"> submissions. To measure host timer wall-time that includes execution</span>
960+
<span class="sd"> of submitted tasks, make sure to include synchronization point in</span>
961+
<span class="sd"> the timed block.</span>
962+
963+
<span class="sd"> :Example:</span>
964+
<span class="sd"> .. code-block:: python</span>
965+
966+
<span class="sd"> with timer(q):</span>
967+
<span class="sd"> timed_block</span>
968+
<span class="sd"> q.wait()</span>
969+
941970
<span class="sd"> Args:</span>
942971
<span class="sd"> host_timer (callable, optional):</span>
943972
<span class="sd"> A callable such that host_timer() returns current</span>
@@ -946,7 +975,7 @@ <h1>Source code for dpctl._sycl_timer</h1><div class="highlight"><pre>
946975
<span class="sd"> device_timer (Literal[&quot;queue_barrier&quot;, &quot;order_manager&quot;], optional):</span>
947976
<span class="sd"> Device timing method. Default: &quot;queue_barrier&quot;.</span>
948977
<span class="sd"> time_scale (Union[int, float], optional):</span>
949-
<span class="sd"> Ratio of the unit of time of interest and one second.</span>
978+
<span class="sd"> Ratio of one second and the unit of time-scale of interest.</span>
950979
<span class="sd"> Default: ``1``.</span>
951980
<span class="sd"> &quot;&quot;&quot;</span>
952981
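The docstring's note that the host timer measures only task submission can be illustrated without a SYCL device. The following is a minimal sketch, not dpctl code: a `ThreadPoolExecutor` stands in for an asynchronous queue, `fake_kernel` and `host_time` are hypothetical names introduced for illustration, and `future.result()` plays the role of `q.wait()` inside the timed block.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_kernel(duration_s):
    # Hypothetical stand-in for work that executes asynchronously on a device.
    time.sleep(duration_s)


def host_time(synchronize, duration_s=0.3, time_scale=1e3):
    """Return host wall-time of a timed block, multiplied by time_scale."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        t0 = time.perf_counter()
        # Submission returns immediately; the work runs in the background.
        future = pool.submit(fake_kernel, duration_s)
        if synchronize:
            # Analogous to calling q.wait() inside the timed block.
            future.result()
        t1 = time.perf_counter()
    return (t1 - t0) * time_scale


submit_only_ms = host_time(synchronize=False)
with_sync_ms = host_time(synchronize=True)
print(f"submission only:      {submit_only_ms:.3f} ms")
print(f"with synchronization: {with_sync_ms:.3f} ms")
```

Without the synchronization point, the measured interval covers only the cost of handing the task to the executor, which is the behavior the docstring warns about for the host timer.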

pulls/1872/api_reference/dpctl/generated/dpctl.SyclTimer.html

Lines changed: 42 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -821,8 +821,7 @@ <h1>dpctl.SyclTimer<a class="headerlink" href="#dpctl-sycltimer" title="Permalin
821821
<dl class="py class">
822822
<dt class="sig sig-object py" id="dpctl.SyclTimer">
823823
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">dpctl.</span></span><span class="sig-name descname"><span class="pre">SyclTimer</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">host_timer=&lt;built-in</span> <span class="pre">function</span> <span class="pre">perf_counter&gt;</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">device_timer=None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">time_scale=1</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="../../../_modules/dpctl/_sycl_timer.html#SyclTimer"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#dpctl.SyclTimer" title="Permalink to this definition"></a></dt>
824-
<dd><p>Context to measure device time and host wall-time of execution
825-
of commands submitted to <a class="reference internal" href="dpctl.SyclQueue.html#dpctl.SyclQueue" title="dpctl.SyclQueue"><code class="xref py py-class docutils literal notranslate"><span class="pre">dpctl.SyclQueue</span></code></a>.</p>
824+
<dd><p>Context to time execution of tasks submitted to <a class="reference internal" href="dpctl.SyclQueue.html#dpctl.SyclQueue" title="dpctl.SyclQueue"><code class="xref py py-class docutils literal notranslate"><span class="pre">dpctl.SyclQueue</span></code></a>.</p>
826825
<dl class="field-list">
827826
<dt class="field-odd">Example<span class="colon">:</span></dt>
828827
<dd class="field-odd"><div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">dpctl</span>
@@ -834,13 +833,18 @@ <h1>dpctl.SyclTimer<a class="headerlink" href="#dpctl-sycltimer" title="Permalin
834833
<span class="n">milliseconds_sc</span> <span class="o">=</span> <span class="mf">1e3</span>
835834
<span class="n">timer</span> <span class="o">=</span> <span class="n">dpctl</span><span class="o">.</span><span class="n">SyclTimer</span><span class="p">(</span><span class="n">time_scale</span> <span class="o">=</span> <span class="n">milliseconds_sc</span><span class="p">)</span>
836835

836+
<span class="n">untimed_code_block_1</span>
837837
<span class="c1"># use the timer</span>
838838
<span class="k">with</span> <span class="n">timer</span><span class="p">(</span><span class="n">queue</span><span class="o">=</span><span class="n">q</span><span class="p">):</span>
839-
<span class="n">code_block1</span>
839+
<span class="n">timed_code_block1</span>
840+
841+
<span class="n">untimed_code_block_2</span>
840842

841843
<span class="c1"># use the timer</span>
842844
<span class="k">with</span> <span class="n">timer</span><span class="p">(</span><span class="n">queue</span><span class="o">=</span><span class="n">q</span><span class="p">):</span>
843-
<span class="n">code_block2</span>
845+
<span class="n">timed_code_block2</span>
846+
847+
<span class="n">untimed_code_block_3</span>
844848

845849
<span class="c1"># retrieve elapsed times in milliseconds</span>
846850
<span class="n">wall_dt</span><span class="p">,</span> <span class="n">device_dt</span> <span class="o">=</span> <span class="n">timer</span><span class="o">.</span><span class="n">dt</span>
@@ -855,14 +859,41 @@ <h1>dpctl.SyclTimer<a class="headerlink" href="#dpctl-sycltimer" title="Permalin
855859
associated with these submissions to perform the timing. Thus
856860
<a class="reference internal" href="#dpctl.SyclTimer" title="dpctl.SyclTimer"><code class="xref py py-class docutils literal notranslate"><span class="pre">dpctl.SyclTimer</span></code></a> requires the queue with <code class="docutils literal notranslate"><span class="pre">&quot;enable_profiling&quot;</span></code>
857861
property. In order to be able to collect the profiling information,
858-
the <code class="docutils literal notranslate"><span class="pre">dt</span></code> property ensures that both submitted barriers complete their
859-
execution and thus effectively synchronizes the queue.</p>
860-
<p><cite>device_timer</cite> keyword argument controls the type of tasks submitted.
861-
With <cite>device_timer=”queue_barrier”</cite>, queue barrier tasks are used. With
862-
<cite>device_timer=”order_manager”</cite>, a single empty body task is inserted
863-
instead relying on order manager (used by <cite>dpctl.tensor</cite> operations) to
862+
the <code class="docutils literal notranslate"><span class="pre">dt</span></code> property ensures that both tasks submitted by the timer
863+
complete their execution and thus effectively synchronizes the queue.</p>
864+
<p>Execution of the above example results in the following task graph,
865+
where each group of tasks is ordered after the one preceding it,
866+
<code class="docutils literal notranslate"><span class="pre">[tasks_of_untimed_block1]</span></code>, <code class="docutils literal notranslate"><span class="pre">[timer_fence_start_task]</span></code>,
867+
<code class="docutils literal notranslate"><span class="pre">[tasks_of_timed_block1]</span></code>, <code class="docutils literal notranslate"><span class="pre">[timer_fence_finish_task]</span></code>,
868+
<code class="docutils literal notranslate"><span class="pre">[tasks_of_untimed_block2]</span></code>, <code class="docutils literal notranslate"><span class="pre">[timer_fence_start_task]</span></code>,
869+
<code class="docutils literal notranslate"><span class="pre">[tasks_of_timed_block2]</span></code>, <code class="docutils literal notranslate"><span class="pre">[timer_fence_finish_task]</span></code>,
870+
<code class="docutils literal notranslate"><span class="pre">[tasks_of_untimed_block3]</span></code>.</p>
871+
<p><code class="docutils literal notranslate"><span class="pre">device_timer</span></code> keyword argument controls the type of tasks submitted.
872+
With <code class="docutils literal notranslate"><span class="pre">&quot;queue_barrier&quot;</span></code> value, queue barrier tasks are used. With
873+
<code class="docutils literal notranslate"><span class="pre">&quot;order_manager&quot;</span></code> value, a single empty body task is inserted
874+
and order manager (used by all <cite>dpctl.tensor</cite> operations) is used to
864875
order these tasks so that they fence operations performed within
865876
timer’s context.</p>
877+
<p>Timing offloading operations that do not use the order manager with
878+
the timer that uses <code class="docutils literal notranslate"><span class="pre">&quot;order_manager&quot;</span></code> as <code class="docutils literal notranslate"><span class="pre">device_timer</span></code> value
879+
will be misleading because the tasks submitted by the timer will not
880+
be ordered with respect to tasks we intend to time.</p>
881+
<p>Note, that host timer effectively measures the time of task
882+
submissions. To measure host timer wall-time that includes execution
883+
of submitted tasks, make sure to include synchronization point in
884+
the timed block.</p>
885+
<dl class="field-list">
886+
<dt class="field-odd">Example<span class="colon">:</span></dt>
887+
<dd class="field-odd"><div class="highlight-python notranslate"><div class="highlight"><pre><span></span>
888+
</pre></div>
889+
</div>
890+
<dl class="simple">
891+
<dt>with timer(q):</dt><dd><p>timed_block
892+
q.wait()</p>
893+
</dd>
894+
</dl>
895+
</dd>
896+
</dl>
866897
</div>
867898
<dl class="field-list simple">
868899
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
@@ -871,7 +902,7 @@ <h1>dpctl.SyclTimer<a class="headerlink" href="#dpctl-sycltimer" title="Permalin
871902
host time in seconds.
872903
Default: <a class="reference external" href="https://docs.python.org/3/library/timeit.html#timeit.default_timer" title="(in Python v3.13)"><code class="xref py py-func docutils literal notranslate"><span class="pre">timeit.default_timer()</span></code></a>.</p></li>
873904
<li><p><strong>device_timer</strong> (<em>Literal</em><em>[</em><em>&quot;queue_barrier&quot;</em><em>, </em><em>&quot;order_manager&quot;</em><em>]</em><em>, </em><em>optional</em>) – Device timing method. Default: “queue_barrier”.</p></li>
874-
<li><p><strong>time_scale</strong> (<em>Union</em><em>[</em><a class="reference external" href="https://docs.python.org/3/library/functions.html#int" title="(in Python v3.13)"><em>int</em></a><em>, </em><a class="reference external" href="https://docs.python.org/3/library/functions.html#float" title="(in Python v3.13)"><em>float</em></a><em>]</em><em>, </em><em>optional</em>) – Ratio of the unit of time of interest and one second.
905+
<li><p><strong>time_scale</strong> (<em>Union</em><em>[</em><a class="reference external" href="https://docs.python.org/3/library/functions.html#int" title="(in Python v3.13)"><em>int</em></a><em>, </em><a class="reference external" href="https://docs.python.org/3/library/functions.html#float" title="(in Python v3.13)"><em>float</em></a><em>]</em><em>, </em><em>optional</em>) – Ratio of one second and the unit of time-scale of interest.
875906
Default: <code class="docutils literal notranslate"><span class="pre">1</span></code>.</p></li>
876907
</ul>
877908
</dd>
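The reworded `time_scale` description is easy to misread, so a small worked example may help. This sketch assumes, based only on the docstring example in which `time_scale = 1e3` yields milliseconds, that reported times are elapsed seconds multiplied by `time_scale`. The helper `scale_elapsed` is hypothetical and not part of dpctl.

```python
def scale_elapsed(elapsed_seconds, time_scale=1):
    # Assumption: reported time = elapsed seconds * time_scale,
    # i.e. time_scale is the number of target units in one second.
    return elapsed_seconds * time_scale


elapsed_s = 0.25  # a quarter of a second

in_seconds = scale_elapsed(elapsed_s)                # default scale of 1
in_millis = scale_elapsed(elapsed_s, time_scale=1e3)  # 1 s = 1e3 ms
in_micros = scale_elapsed(elapsed_s, time_scale=1e6)  # 1 s = 1e6 us

print(in_seconds, in_millis, in_micros)
```

Under this reading, `time_scale` is the count of the desired unit per second: `1e3` for milliseconds, `1e6` for microseconds.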

pulls/1872/api_reference/dpctl/index.html

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -861,7 +861,7 @@
861861
<td><p>Python class representing <code class="docutils literal notranslate"><span class="pre">sycl::platform</span></code> class.</p></td>
862862
</tr>
863863
<tr class="row-even"><td><p><a class="reference internal" href="generated/dpctl.SyclTimer.html#dpctl.SyclTimer" title="dpctl.SyclTimer"><code class="xref py py-obj docutils literal notranslate"><span class="pre">SyclTimer</span></code></a></p></td>
864-
<td><p>Context to measure device time and host wall-time of execution of commands submitted to <a class="reference internal" href="generated/dpctl.SyclQueue.html#dpctl.SyclQueue" title="dpctl.SyclQueue"><code class="xref py py-class docutils literal notranslate"><span class="pre">dpctl.SyclQueue</span></code></a>.</p></td>
864+
<td><p>Context to time execution of tasks submitted to <a class="reference internal" href="generated/dpctl.SyclQueue.html#dpctl.SyclQueue" title="dpctl.SyclQueue"><code class="xref py py-class docutils literal notranslate"><span class="pre">dpctl.SyclQueue</span></code></a>.</p></td>
865865
</tr>
866866
</tbody>
867867
</table>

pulls/1872/beginners_guides/installation.html

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -881,7 +881,7 @@ <h2>Installation using pip<a class="headerlink" href="#installation-using-pip" t
881881
<section id="installation-via-intel-r-distribution-for-python">
882882
<h2>Installation via Intel(R) Distribution for Python<a class="headerlink" href="#installation-via-intel-r-distribution-for-python" title="Permalink to this heading"></a></h2>
883883
<p><a class="reference external" href="https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html">Intel(R) Distribution for Python*</a> is distributed as a conda-based installer
884-
and includes <a class="reference internal" href="../api_reference/dpctl/index.html#module-dpctl" title="dpctl"><code class="xref py py-mod docutils literal notranslate"><span class="pre">dpctl</span></code></a> along with its dependencies and sister projects <a class="reference external" href="https://intelpython.github.io/dpnp/overview.html#module-dpnp" title="(in Data Parallel Extension for NumPy v0.17.0dev1+21.g078d9a33a18)"><code class="xref py py-mod docutils literal notranslate"><span class="pre">dpnp</span></code></a>
884+
and includes <a class="reference internal" href="../api_reference/dpctl/index.html#module-dpctl" title="dpctl"><code class="xref py py-mod docutils literal notranslate"><span class="pre">dpctl</span></code></a> along with its dependencies and sister projects <a class="reference external" href="https://intelpython.github.io/dpnp/overview.html#module-dpnp" title="(in Data Parallel Extension for NumPy v0.17.0dev2)"><code class="xref py py-mod docutils literal notranslate"><span class="pre">dpnp</span></code></a>
885885
and <a class="reference external" href="https://intelpython.github.io/numba-dpex/latest/index.html#module-numba_dpex" title="(in numba-dpex)"><code class="xref py py-mod docutils literal notranslate"><span class="pre">numba_dpex</span></code></a>.</p>
886886
<p>Once the installed environment is activated, <code class="docutils literal notranslate"><span class="pre">dpctl</span></code> should be ready to use.</p>
887887
</section>

pulls/1872/objects.inv

0 Bytes
Binary file not shown.

pulls/1872/searchindex.js

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.
