Commit 6877619

Update docs
1 parent bb55529 commit 6877619

184 files changed (+3311, -505 lines changed)

_sources/autoapi/tilelang/jit/adapter/libgen/index.rst.txt

Lines changed: 0 additions & 44 deletions
@@ -10,7 +10,6 @@ Attributes
 .. autoapisummary::
 
 tilelang.jit.adapter.libgen.logger
-tilelang.jit.adapter.libgen.is_nvrtc_available
 
 
 Classes
@@ -19,18 +18,13 @@ Classes
 .. autoapisummary::
 
 tilelang.jit.adapter.libgen.LibraryGenerator
-tilelang.jit.adapter.libgen.PyLibraryGenerator
 
 
 Module Contents
 ---------------
 
 .. py:data:: logger
 
-.. py:data:: is_nvrtc_available
-:value: False
-
-
 .. py:class:: LibraryGenerator(target, verbose = False)
 
 .. py:attribute:: srcpath
@@ -101,41 +95,3 @@ Module Contents
 .. py:method:: set_src_path(srcpath)
 
 
-.. py:class:: PyLibraryGenerator(target, verbose = False)
-
-Bases: :py:obj:`LibraryGenerator`
-
-
-.. py:attribute:: host_func
-:type: str | None
-:value: None
-
-
-
-.. py:attribute:: culib
-:value: None
-
-
-
-.. py:attribute:: pymodule
-:value: None
-
-
-
-.. py:method:: import_from_file(module_name, file_path)
-:staticmethod:
-
-
-
-.. py:method:: update_host_func(host_func)
-
-
-.. py:method:: load_lib(lib_path = None)
-
-
-.. py:method:: compile_lib(timeout = None)
-
-
-.. py:method:: __del__()
-
-

_sources/autoapi/tilelang/jit/adapter/nvrtc/index.rst.txt

Lines changed: 7 additions & 0 deletions
@@ -18,6 +18,8 @@ Submodules
 :maxdepth: 1
 
 /autoapi/tilelang/jit/adapter/nvrtc/adapter/index
+/autoapi/tilelang/jit/adapter/nvrtc/libgen/index
+/autoapi/tilelang/jit/adapter/nvrtc/wrapper/index
 
 
 Attributes
@@ -68,3 +70,8 @@ Package Contents
 Dummy NVRTCKernelAdapter that raises ImportError on instantiation.
 
 
+.. py:method:: from_database(*args, **kwargs)
+:classmethod:
+
+
+
Lines changed: 183 additions & 0 deletions
@@ -0,0 +1,183 @@
+tilelang.jit.adapter.nvrtc.libgen
+=================================
+
+.. py:module:: tilelang.jit.adapter.nvrtc.libgen
+
+.. autoapi-nested-parse::
+
+NVRTC Library Generator for TileLang.
+
+Compiles CUDA kernels at runtime using NVRTC and manages resulting binaries.
+
+Why NVRTC instead of nvcc:
+- No offline compilation step, enables true JIT workflows
+- Works without CUDA toolkit installed (only requires driver)
+- Allows kernel specialization based on runtime parameters
+
+Key responsibilities:
+- Compile CUDA source to cubin using NVRTC API
+- Generate accompanying Python launcher code
+- Load compiled cubin and extract kernel handles
+- Manage library lifecycle (load/unload)
+
+
+
+Attributes
+----------
+
+.. autoapisummary::
+
+tilelang.jit.adapter.nvrtc.libgen.logger
+
+
+Classes
+-------
+
+.. autoapisummary::
+
+tilelang.jit.adapter.nvrtc.libgen.NVRTCLibraryGenerator
+
+
+Module Contents
+---------------
+
+.. py:data:: logger
+
+.. py:class:: NVRTCLibraryGenerator(target, verbose = False)
+
+Bases: :py:obj:`tilelang.jit.adapter.libgen.LibraryGenerator`
+
+
+Runtime compiler and loader for NVRTC-compiled CUDA kernels.
+
+Lifecycle:
+1. compile_lib(): CUDA source → cubin + Python launcher
+2. load_lib(): cubin → loaded library + kernel handles
+3. pymodule.call(): Execute kernels via Python launcher
+4. __del__: Cleanup (unload library)
+
+Why three files (cu, cubin, py):
+- .cu: Source for debugging, kept in temp directory
+- .cubin: Compiled binary, loaded by CUDA driver
+- .py: Launch code, imported as Python module
+
+.. attribute:: host_func
+
+Generated Python launch code (from wrapper)
+
+.. attribute:: culib
+
+CUDA library handle (CUlibrary)
+
+.. attribute:: pymodule
+
+Imported Python module containing call() function
+
+
+.. py:attribute:: host_func
+:type: str | None
+:value: None
+
+
+
+.. py:attribute:: culib
+:type: cuda.bindings.driver.CUlibrary | None
+:value: None
+
+
+
+.. py:attribute:: pymodule
+:type: types.ModuleType | None
+:value: None
+
+
+
+.. py:attribute:: pypath
+:type: str | None
+:value: None
+
+
+
+.. py:method:: import_from_file(module_name, file_path)
+:staticmethod:
+
+
+Dynamically import Python module from file path.
+
+Standard importlib pattern for loading modules outside sys.path.
+Used to import generated .py launcher code from temp directory.
+
+:param module_name: Name to assign to imported module
+:param file_path: Absolute path to .py file
+
+:returns: Imported module object
+
+
+
+.. py:method:: update_host_func(host_func)
+
+Store generated Python launch code for later file write.
+
+Called by adapter after wrapper generates the launch code.
+This is the bridge between code generation and file output.
+
+:param host_func: Python source code containing call() function
+
+
+
+.. py:method:: load_lib(lib_path = None)
+
+Load compiled cubin and Python launcher into memory.
+
+Why two loads:
+1. Import Python module for launch logic
+2. Load cubin via CUDA Driver API for kernel handles
+
+Context synchronization: CUDA context must be current before loading.
+If not, use torch.cuda.synchronize() to establish context.
+
+:param lib_path: Path to .cubin file (optional, uses self.libpath if None)
+
+Side effects:
+- Sets self.pymodule to imported Python module
+- Sets self.culib to CUDA library handle
+
+
+
+.. py:method:: compile_lib(timeout = None)
+
+Compile CUDA source to cubin using NVRTC and write output files.
+
+Output artifacts (all in temp directory):
+- .cu: Source code (for debugging)
+- .cubin: Compiled binary (for execution)
+- .py: Python launcher (for calling kernels)
+
+Include paths setup:
+- TileLang templates: kernel primitives and utilities
+- CUTLASS: optimized GEMM/tensor ops
+- CUDA headers: driver/runtime APIs
+
+Why architecture detection:
+ARM64 servers (SBSA) have different header paths than x86_64.
+
+:param timeout: Compilation timeout in seconds (currently unsupported by NVRTC compiler)
+
+Side effects:
+- Writes .cu, .cubin, .py files to temp directory
+- Sets self.srcpath, self.libpath, self.pypath
+
+
+
+.. py:method:: __del__()
+
+Cleanup: unload CUDA library when object is destroyed.
+
+Critical for resource management - CUDA libraries consume GPU memory.
+Failure to unload is logged but not raised (destructor can't fail).
+
+Why explicit unload:
+Python GC doesn't know about GPU resources, must release manually.
+
+
+
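
The ``import_from_file`` docstring in the diff above refers to the standard importlib pattern for loading a module that lives outside ``sys.path``. The following is a minimal sketch of that pattern, not the TileLang implementation: the function body, the temporary ``launcher.py`` file, and its ``call()`` stub are illustrative assumptions standing in for the generated launcher.

```python
import importlib.util
import os
import tempfile


def import_from_file(module_name, file_path):
    """Import a module from an explicit file path (illustrative sketch)."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build an import spec for {file_path!r}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module


# Hypothetical launcher file standing in for the generated .py artifact.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "launcher.py")
    with open(path, "w") as f:
        f.write("def call(x):\n    return x * 2\n")
    launcher = import_from_file("demo_launcher", path)
    result = launcher.call(21)
```

The module is not registered in ``sys.modules`` here, which keeps repeated loads of differently generated launchers independent of each other; whether the real method does the same is not stated in the diff.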

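The ``__del__`` docstring of ``NVRTCLibraryGenerator`` says that a failed unload is logged rather than raised. A rough sketch of that cleanup style follows. ``LibraryHandle``, its ``unload`` callable, and the logger name are hypothetical and only stand in for a driver-level unload call; nothing here is taken from the TileLang source.

```python
import logging

logger = logging.getLogger("libgen_sketch")  # hypothetical logger name


class LibraryHandle:
    """Hypothetical owner of a native handle; not the TileLang class."""

    def __init__(self, unload):
        self._unload = unload   # callable that releases the native resource
        self.handle = object()  # placeholder for a driver-level handle

    def close(self):
        # getattr guards against a partially constructed object.
        if getattr(self, "handle", None) is None:
            return
        try:
            self._unload(self.handle)
        except Exception as exc:
            # Log instead of raising: errors escaping __del__ are only
            # reported as warnings by the interpreter anyway.
            logger.warning("failed to unload library: %s", exc)
        finally:
            self.handle = None

    def __del__(self):
        self.close()


released = []
lib = LibraryHandle(released.append)
lib.close()
lib.close()  # second call is a no-op


def failing_unload(handle):
    raise RuntimeError("driver error")


bad = LibraryHandle(failing_unload)
bad.close()  # the failure is logged, not raised
```

Routing ``__del__`` through an idempotent ``close()`` lets callers release the resource deterministically while keeping the destructor as a fallback.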