Skip to content

Commit 2556ae1

Browse files
Update docs
1 parent d976ea0 commit 2556ae1

File tree

9 files changed

+232
-151
lines changed

9 files changed

+232
-151
lines changed

_sources/autoapi/tilelang/engine/phase/index.rst.txt

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -36,5 +36,26 @@ Module Contents
3636
3737
.. py:function:: LowerAndLegalize(mod, target)
3838
39+
Bind target information and progressively legalize and lower frontend Tile IR into a form suitable for downstream optimization and codegen.
40+
41+
This pass pipeline:
42+
- Binds the provided target to the module.
43+
- Legalizes frontend Tile IR into TVM-compatible constructs.
44+
- Simplifies expressions.
45+
- Configures reducer layouts and performs layout inference for fragments and shared memory.
46+
- Lowers high-level tile operations and L2 persistent maps.
47+
- Legalizes vectorized loops and inserts safety checks for memory accesses.
48+
- Re-simplifies to remove redundancies introduced by safety checks.
49+
- Attempts loop vectorization for dynamic-shaped loops.
50+
51+
:param mod: The input IR module containing frontend Tile IR.
52+
:type mod: IRModule
53+
:param target: Target device information to bind into the module.
54+
:type target: Target
55+
56+
:returns: The transformed module, ready for target-specific optimization passes.
57+
:rtype: IRModule
58+
59+
3960
.. py:function:: OptimizeForTarget(mod, target)
4061

_sources/autoapi/tilelang/language/customize/index.rst.txt

Lines changed: 54 additions & 48 deletions
Original file line numberDiff line numberDiff line change
@@ -36,18 +36,23 @@ Module Contents
3636

3737
.. py:function:: region(buffer, access_type, *args)
3838
39-
Create a memory region descriptor for tile operations.
39+
Create a tile memory-region descriptor for a BufferLoad.
4040

41-
:param buffer: The buffer to create a region for
41+
Maps access_type ('r', 'w', 'rw') to the numeric codes expected by the `tl.region` intrinsic
42+
(1, 2, 3 respectively) and returns a tir.Call representing the region with the provided extents.
43+
44+
:param buffer: The BufferLoad that identifies the underlying buffer and indices.
4245
:type buffer: tir.BufferLoad
43-
:param access_type: Type of access - 'r' for read, 'w' for write, 'rw' for read-write
46+
:param access_type: One of 'r', 'w', or 'rw' indicating read, write, or read-write access.
4447
:type access_type: str
45-
:param \*args: Extent expressions defining the region size
48+
:param \*args: Extent expressions for each region dimension.
4649
:type \*args: tir.PrimExpr
4750

48-
:returns: A region descriptor for tile operations
51+
:returns: A call to the `tl.region` intrinsic describing the memory region.
4952
:rtype: tir.Call
5053

54+
:raises KeyError: If access_type is not one of 'r', 'w', or 'rw'.
55+
5156

5257
.. py:function:: buffer_to_tile_region(buffer, access_type)
5358
@@ -79,53 +84,61 @@ Module Contents
7984

8085
.. py:function:: buffer_region_to_tile_region(buffer_region, access_type, extents)
8186
82-
Convert a buffer region to a tile region descriptor.
87+
Create a tl region descriptor for the given BufferRegion.
8388

84-
:param buffer_region: The buffer region to convert
89+
:param buffer_region: Source buffer region whose `region` items provide mins and extents.
8590
:type buffer_region: tir.BufferRegion
86-
:param access_type: Type of access - 'r' for read, 'w' for write, 'rw' for read-write
91+
:param access_type: Access mode: "r", "w", or "rw".
8792
:type access_type: str
93+
:param extents: Requested extents; must have length <= the number of extents in buffer_region.region.
94+
:type extents: List[PrimExpr]
8895

89-
:returns: A region descriptor for the specified buffer region
96+
:returns: A tile-region descriptor (tl.region) covering the buffer_region.
9097
:rtype: tir.Call
9198

99+
:raises AssertionError: If the number of extents in buffer_region.region is smaller than len(extents).
100+
92101

93102
.. py:function:: atomic_max(dst, value, memory_order = None)
94103
95-
Perform an atomic maximum operation.
104+
Perform an atomic maximum on the value stored at dst with an optional memory-order.
96105

97-
:param dst: Destination buffer where the atomic maximum will be performed
106+
If memory_order is None the runtime extern "AtomicMax" is called without an explicit memory-order id; otherwise the provided memory_order string is mapped to a numeric id using the module's memory-order map and passed to the extern.
107+
108+
:param dst: Destination buffer/address to apply the atomic max.
98109
:type dst: Buffer
99-
:param value: Value to be atomically added
110+
:param value: Value to compare/store atomically.
100111
:type value: PrimExpr
112+
:param memory_order: Optional memory-order name (e.g. "relaxed", "acquire", "seq_cst").
113+
If provided, it is translated to the corresponding numeric memory-order id before the call.
114+
:type memory_order: str | None
101115

102-
:returns: Handle to the atomic maximum operation
116+
:returns: A handle/expression representing the issued atomic maximum operation.
103117
:rtype: PrimExpr
104118

105119

106120
.. py:function:: atomic_min(dst, value, memory_order = None)
107121
108-
Perform an atomic minimum operation.
122+
Atomically update the value at dst to the minimum of its current value and value.
109123

110-
:param dst: Destination buffer where the atomic minimum will be performed
111-
:type dst: Buffer
112-
:param value: Value to be atomically added
113-
:type value: PrimExpr
124+
If memory_order is provided, it selects the memory-order semantic used by the underlying extern call;
125+
allowed names are "relaxed", "consume", "acquire", "release", "acq_rel", and "seq_cst" (mapped internally
126+
to integer IDs). If memory_order is None, the extern is invoked without an explicit memory-order argument.
114127

115-
:returns: Handle to the atomic minimum operation
128+
:param memory_order: Optional memory-order name controlling the atomic operation's ordering.
129+
:type memory_order: str | None
130+
131+
:returns: A handle expression representing the atomic-min operation.
116132
:rtype: PrimExpr
117133

118134

119135
.. py:function:: atomic_add(dst, value, memory_order = None)
120136
121-
Perform an atomic addition operation.
137+
Atomically add `value` into `dst`, returning a handle to the operation.
122138

123-
:param dst: Destination buffer where the atomic addition will be performed
124-
:type dst: Buffer
125-
:param value: Value to be atomically added
126-
:type value: PrimExpr
139+
Supports scalar/addressed extern atomic add when neither argument exposes extents, or tile-region-based atomic add for Buffer/BufferRegion/BufferLoad inputs. If both arguments are plain Buffers their shapes must be structurally equal. If at least one side exposes extents, extents are aligned (missing dimensions are treated as size 1); an assertion is raised if extents cannot be deduced. The optional `memory_order` (one of "relaxed","consume","acquire","release","acq_rel","seq_cst") is used only for the direct extern `AtomicAdd` path when no extents are available — otherwise the tile-region path ignores `memory_order`.
127140

128-
:returns: Handle to the atomic addition operation
141+
:returns: A handle representing the atomic addition operation.
129142
:rtype: PrimExpr
130143

131144

@@ -196,44 +209,37 @@ Module Contents
196209

197210
.. py:function:: view(src, shape = None, dtype = None)
198211
199-
Views the input buffer with optionally modified shape and dtype.
212+
Return a Tensor view of the input buffer with an optional new shape and dtype.
200213

201-
:param src: Input buffer to be viewed
202-
:type src: Buffer
203-
:param shape: New shape for the buffer. Defaults to None.
204-
:type shape: Union[List[PrimExpr], None], optional
205-
:param dtype: New dtype for the buffer. Defaults to None.
206-
:type dtype: Union[str, None], optional
207-
208-
:returns: A new buffer view with the specified shape and dtype
209-
:rtype: Buffer
214+
If `shape` is None the source buffer's shape is used; if `dtype` is None the source buffer's dtype is used. The returned buffer shares the same underlying data as `src` (no copy).
210215

211216

212217
.. py:function:: atomic_load(src, memory_order = 'seq_cst')
213218
214-
Loads a value from the input buffer with specified memory_order.
219+
Load a value from the given buffer using the specified atomic memory ordering.
215220

216-
:param src: Input buffer to load from
217-
:type src: Buffer
218-
:param memory_order: Atomicity level for the load operation. Defaults to "seq_cst".
219-
:type memory_order: str, optional
220-
221-
:returns: The loaded value from the buffer
222-
:rtype: PrimExpr
221+
Performs an atomic load from `src` and returns a PrimExpr representing the loaded value.
222+
memory_order selects the ordering and must be one of: "relaxed", "consume", "acquire",
223+
"release", "acq_rel", or "seq_cst" (default).
224+
Raises KeyError if an unknown memory_order is provided.
223225

224226

225227
.. py:function:: atomic_store(dst, src, memory_order = 'seq_cst')
226228
227-
Stores a value to the input buffer with specified memory_order.
229+
Perform an atomic store of `src` into `dst` with the given memory ordering.
228230

229-
:param dst: Input buffer to store to
231+
:param dst: Destination buffer to store into.
230232
:type dst: Buffer
231-
:param src: Value to store
233+
:param src: Value to store.
232234
:type src: PrimExpr
233-
:param memory_order: Atomicity level for the load operation. Defaults to "seq_cst".
235+
:param memory_order: Memory ordering name; one of "relaxed", "consume",
236+
"acquire", "release", "acq_rel", or "seq_cst". Defaults to "seq_cst".
237+
The name is mapped to an internal numeric ID used by the underlying runtime.
234238
:type memory_order: str, optional
235239

236-
:returns: The handle of the store operation
240+
:returns: A handle representing the issued atomic store operation.
237241
:rtype: PrimExpr
238242

243+
:raises KeyError: If `memory_order` is not one of the supported names.
244+
239245

_sources/autoapi/tilelang/language/reduce/index.rst.txt

Lines changed: 11 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -142,29 +142,25 @@ Module Contents
142142
143143
.. py:function:: cumsum(src, dst = None, dim = 0, reverse = False)
144144
145-
Perform cumulative sum on input buffer, store the result to output buffer.
146-
147-
:param src: The input buffer
148-
:type src: tir.Buffer
149-
:param dst: The output buffer. Defaults to None.
150-
:type dst: tir.Buffer, optional
151-
:param dim: The dimension to perform cumulative sum on. Defaults to 0.
152-
:type dim: int, optional
153-
:param reverse: Whether to perform reverse cumulative sum. Defaults to False.
154-
:type reverse: bool, optional
155-
156-
:returns: Handle to the cumulative sum operation
145+
Compute the cumulative sum of `src` along `dim`, writing results to `dst`.
146+
147+
Negative `dim` indices are normalized (Python-style). If `dst` is None, the operation is performed in-place into `src`. Raises ValueError when `dim` is out of bounds for `src.shape`. When `src.scope() == "local.fragment"`, this delegates to `cumsum_fragment`; otherwise it emits the `tl.cumsum` intrinsic.
148+
149+
:returns: A handle to the emitted cumulative-sum operation.
157150
:rtype: tir.Call
158151

159152

160153
.. py:function:: finalize_reducer(reducer)
161154
162-
Finalize the reducer buffer.
155+
Finalize a reducer buffer by emitting the `tl.finalize_reducer` intrinsic.
156+
157+
This returns a TVM `tir.Call` handle that finalizes the given reducer using its writable pointer.
158+
The call does not modify Python objects directly; it produces the low-level intrinsic call used by the IR.
163159

164-
:param reducer: The reducer buffer
160+
:param reducer: Reducer buffer whose writable pointer will be finalized.
165161
:type reducer: tir.Buffer
166162

167-
:returns: Handle to the finalize reducer operation
163+
:returns: Handle to the finalize reducer intrinsic call.
168164
:rtype: tir.Call
169165

170166

_sources/autoapi/tilelang/transform/index.rst.txt

Lines changed: 10 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -372,13 +372,21 @@ Package Contents
372372

373373
.. py:function:: LowerDeviceKernelLaunch()
374374
375-
LowerDeviceKernelLaunch
375+
Create and return a transform pass that lowers device kernel launch constructs to target-specific IR.
376376

377+
This pass transforms high-level device kernel launch and related intrinsics into lower-level
378+
IR suitable for backend code generation and device-side lowering.
379+
380+
:returns: The transform pass that performs device kernel launch lowering.
381+
:rtype: tvm.transform.Pass
377382

378383

379384
.. py:function:: LayoutReducer()
380385
381-
LayoutReducer
386+
Return a TVM transform pass that performs layout reduction/normalization.
387+
388+
This wrapper delegates to the underlying FFI implementation and returns a pass object suitable for use in a PassContext or pass pipeline. The pass is intended to simplify or reduce tensor/layout-related representations during relay/tile transformations.
382389

390+
:returns: The transform pass object produced by the FFI backend.
383391

384392

autoapi/tilelang/engine/phase/index.html

Lines changed: 19 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -484,7 +484,7 @@ <h2>Functions<a class="headerlink" href="#functions" title="Link to this heading
484484
<td><p></p></td>
485485
</tr>
486486
<tr class="row-odd"><td><p><a class="reference internal" href="#tilelang.engine.phase.LowerAndLegalize" title="tilelang.engine.phase.LowerAndLegalize"><code class="xref py py-obj docutils literal notranslate"><span class="pre">LowerAndLegalize</span></code></a>(mod, target)</p></td>
487-
<td><p></p></td>
487+
<td><p>Bind target information and progressively legalize and lower frontend Tile IR into a form suitable for downstream optimization and codegen.</p></td>
488488
</tr>
489489
<tr class="row-even"><td><p><a class="reference internal" href="#tilelang.engine.phase.OptimizeForTarget" title="tilelang.engine.phase.OptimizeForTarget"><code class="xref py py-obj docutils literal notranslate"><span class="pre">OptimizeForTarget</span></code></a>(mod, target)</p></td>
490490
<td><p></p></td>
@@ -585,15 +585,28 @@ <h2>Module Contents<a class="headerlink" href="#module-contents" title="Link to
585585
<dl class="py function">
586586
<dt class="sig sig-object py" id="tilelang.engine.phase.LowerAndLegalize">
587587
<span class="sig-prename descclassname"><span class="pre">tilelang.engine.phase.</span></span><span class="sig-name descname"><span class="pre">LowerAndLegalize</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">mod</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">target</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#tilelang.engine.phase.LowerAndLegalize" title="Link to this definition"></a></dt>
588-
<dd><dl class="field-list simple">
588+
<dd><p>Bind target information and progressively legalize and lower frontend Tile IR into a form suitable for downstream optimization and codegen.</p>
589+
<p>This pass pipeline:
590+
- Binds the provided target to the module.
591+
- Legalizes frontend Tile IR into TVM-compatible constructs.
592+
- Simplifies expressions.
593+
- Configures reducer layouts and performs layout inference for fragments and shared memory.
594+
- Lowers high-level tile operations and L2 persistent maps.
595+
- Legalizes vectorized loops and inserts safety checks for memory accesses.
596+
- Re-simplifies to remove redundancies introduced by safety checks.
597+
- Attempts loop vectorization for dynamic-shaped loops.</p>
598+
<dl class="field-list simple">
589599
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
590600
<dd class="field-odd"><ul class="simple">
591-
<li><p><strong>mod</strong> (<em>tvm.IRModule</em>)</p></li>
592-
<li><p><strong>target</strong> (<em>tvm.target.Target</em>)</p></li>
601+
<li><p><strong>mod</strong> (<em>IRModule</em>) – The input IR module containing frontend Tile IR.</p></li>
602+
<li><p><strong>target</strong> (<em>Target</em>) – Target device information to bind into the module.</p></li>
593603
</ul>
594604
</dd>
595-
<dt class="field-even">Return type<span class="colon">:</span></dt>
596-
<dd class="field-even"><p>tvm.IRModule</p>
605+
<dt class="field-even">Returns<span class="colon">:</span></dt>
606+
<dd class="field-even"><p>The transformed module, ready for target-specific optimization passes.</p>
607+
</dd>
608+
<dt class="field-odd">Return type<span class="colon">:</span></dt>
609+
<dd class="field-odd"><p>IRModule</p>
597610
</dd>
598611
</dl>
599612
</dd></dl>

0 commit comments

Comments
 (0)