|
449 | 449 |
|
450 | 450 | <dl class="py function">
|
451 | 451 | <dt class="sig sig-object py" id="tilelang.language.reduce.reduce_absmax">
|
452 |
| -<span class="sig-prename descclassname"><span class="pre">tilelang.language.reduce.</span></span><span class="sig-name descname"><span class="pre">reduce_absmax</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">buffer</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Buffer</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">out</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Buffer</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">dim</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">int</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#tilelang.language.reduce.reduce_absmax" title="Permalink to this definition">#</a></dt> |
| 452 | +<span class="sig-prename descclassname"><span class="pre">tilelang.language.reduce.</span></span><span class="sig-name descname"><span class="pre">reduce_absmax</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">buffer</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Buffer</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">out</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Buffer</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">dim</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">int</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">clear</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">bool</span></span><span class="w"> </span><span class="o"><span class="pre">=</span></span><span class="w"> </span><span class="default_value"><span class="pre">True</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#tilelang.language.reduce.reduce_absmax" title="Permalink to this definition">#</a></dt> |
453 | 453 | <dd><p>Perform an absolute-max reduction on the input buffer and store the result in the output buffer.</p>
|
454 | 454 | <dl class="field-list simple">
|
455 | 455 | <dt class="field-odd">Parameters<span class="colon">:</span></dt>
|
|
535 | 535 |
|
536 | 536 | <dl class="py function">
|
537 | 537 | <dt class="sig sig-object py" id="tilelang.language.reduce.reduce_sum">
|
538 |
| -<span class="sig-prename descclassname"><span class="pre">tilelang.language.reduce.</span></span><span class="sig-name descname"><span class="pre">reduce_sum</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">buffer</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Buffer</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">out</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Buffer</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">dim</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">int</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#tilelang.language.reduce.reduce_sum" title="Permalink to this definition">#</a></dt> |
| 538 | +<span class="sig-prename descclassname"><span class="pre">tilelang.language.reduce.</span></span><span class="sig-name descname"><span class="pre">reduce_sum</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">buffer</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Buffer</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">out</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Buffer</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">dim</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">int</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">clear</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">bool</span></span><span class="w"> </span><span class="o"><span class="pre">=</span></span><span class="w"> </span><span class="default_value"><span class="pre">True</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#tilelang.language.reduce.reduce_sum" title="Permalink to this definition">#</a></dt> |
539 | 539 | <dd><p>Perform a sum reduction on the input buffer and store the result in the output buffer.</p>
|
540 | 540 | <dl class="field-list simple">
|
541 | 541 | <dt class="field-odd">Parameters<span class="colon">:</span></dt>
|
542 | 542 | <dd class="field-odd"><ul class="simple">
|
543 | 543 | <li><p><strong>buffer</strong> (<em>tir.Buffer</em>) – The input buffer</p></li>
|
544 | 544 | <li><p><strong>out</strong> (<em>tir.Buffer</em>) – The output buffer</p></li>
|
545 | 545 | <li><p><strong>dim</strong> (<em>int</em>) – The dimension to perform reduce on</p></li>
|
| 546 | +<li><p><strong>clear</strong> (<em>bool</em><em>, </em><em>optional</em>) – If True, output buffer will be cleared before reduction. |
| 547 | +If False, results will be accumulated on existing values. |
| 548 | +Defaults to True.</p></li> |
546 | 549 | </ul>
|
547 | 550 | </dd>
|
548 |
| -<dt class="field-even">Returns<span class="colon">:</span></dt> |
549 |
| -<dd class="field-even"><p>Handle to the reduction operation</p> |
| 551 | +</dl> |
| 552 | +<dl> |
| 553 | +<dt>Note: When clear=False, reduce_sum does not accumulate directly into the output buffer, because</dt><dd><blockquote> |
| 554 | +<div><p>during warp reduction the existing value in out would be added once per thread in the warp |
| 555 | +instead of once. The implementation with clear=False therefore follows these steps:</p> |
| 556 | +</div></blockquote> |
| 557 | +<ol class="arabic simple"> |
| 558 | +<li><p>Create a temp buffer with the same shape and dtype as out</p></li> |
| 559 | +<li><p>Copy out to the temp buffer</p></li> |
| 560 | +<li><p>Call reduce_sum with the temp buffer and out</p></li> |
| 561 | +<li><p>Add the temp buffer to out</p></li> |
| 562 | +</ol> |
550 | 563 | </dd>
|
551 |
| -<dt class="field-odd">Return type<span class="colon">:</span></dt> |
552 |
| -<dd class="field-odd"><p>tir.Call</p> |
| 564 | +</dl> |
| 565 | +<dl class="field-list simple"> |
| 566 | +<dt class="field-odd">Returns<span class="colon">:</span></dt> |
| 567 | +<dd class="field-odd"><p>Handle to the reduction operation</p> |
| 568 | +</dd> |
| 569 | +<dt class="field-even">Return type<span class="colon">:</span></dt> |
| 570 | +<dd class="field-even"><p>tir.Call</p> |
553 | 571 | </dd>
|
554 | 572 | </dl>
|
555 | 573 | </dd></dl>
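The over-counting described in the reduce_sum note can be illustrated with a small pure-Python model. This is only a sketch of the idea, not TileLang code: `WARP_SIZE`, `warp_sum`, `naive_accumulate`, and `temp_buffer_accumulate` are hypothetical names invented for the illustration, and the real lowering may differ. It assumes each lane of a warp holds a partial sum, and that accumulating onto an existing output value means every lane starts from a copy of that value.

```python
# Pure-Python model of the accumulation problem; every name is hypothetical.

WARP_SIZE = 4  # small stand-in for the lanes of a real warp


def warp_sum(lane_values):
    """Model a cross-lane reduction: the sum of every lane's partial value."""
    return sum(lane_values)


def naive_accumulate(row, out_value):
    """Seed every lane with the existing output, then reduce across lanes."""
    lanes = [out_value] * WARP_SIZE      # each lane holds a copy of out
    for i, x in enumerate(row):
        lanes[i % WARP_SIZE] += x        # lane-local partial sums
    return warp_sum(lanes)               # out_value is counted WARP_SIZE times


def temp_buffer_accumulate(row, out_value):
    """Follow the four steps listed in the note."""
    temp = out_value                     # steps 1-2: temp buffer keeps the old out
    lanes = [0] * WARP_SIZE              # step 3: reduce starting from zero
    for i, x in enumerate(row):
        lanes[i % WARP_SIZE] += x
    out_value = warp_sum(lanes)
    return out_value + temp              # step 4: add the saved value once


row = [1, 2, 3, 4, 5, 6, 7, 8]           # sums to 36
existing = 10                            # value already stored in out

naive = naive_accumulate(row, existing)          # 36 + 4 * 10
correct = temp_buffer_accumulate(row, existing)  # 36 + 10
```

In this model the naive path adds the existing value once per lane, while the temp-buffer path adds it exactly once, which matches the motivation given in the note.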
|
|