Commit ba94c21
Enabling support for enhanced customization of Nvidia ptxas options (triton-lang#6993)
# Enabling enhanced `ptxas` customization
This MR enables broader support for `ptxas` customization via the
following functionality:
* Ability to pass specific `ptxas` options. The available options are
documented in the
[NVCC `ptxas` options reference](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#ptxas-options).
* Ability to pass these options for specific kernel calls.
Benefits:
* Enables parameters to be passed to `ptxas`.
* Enables targeted customization of the compilation behavior for each
specific kernel call.
Usage:
Pass a string of `ptxas` options through the `ptx_options` parameter
of any given kernel call.
Example:
For the tutorial `03-matrix-multiplication.py`, one can set `--opt-level=3`
for `leaky_relu` and `--opt-level=0` for `matmul_kernel` like so:
```python
...
    # Inside `matmul_kernel`: options for the `leaky_relu` call
    if ACTIVATION == "leaky_relu":
        accumulator = leaky_relu(accumulator, ptx_options="--opt-level=3")
...
    # Kernel launch: options for `matmul_kernel`
    matmul_kernel[grid](
        a, b, c,  #
        M, N, K,  #
        a.stride(0), a.stride(1),  #
        b.stride(0), b.stride(1),  #
        c.stride(0), c.stride(1),  #
        ACTIVATION=activation,  #
        ptx_options="--opt-level=0",
    )
```
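To illustrate what passing an options string implies, here is a minimal sketch of how such a string could be split and appended to a `ptxas` command line. This is not the code from this commit: the helper `build_ptxas_command`, its parameters, and the file names are hypothetical, and only the `ptxas` flags themselves (`--gpu-name`, `--opt-level`, `--verbose`, `-o`) are real.

```python
import shlex


def build_ptxas_command(ptx_path, cubin_path, arch, ptx_options=None):
    """Assemble a hypothetical ptxas argv list with user options appended."""
    # Split the user string the way a shell would, so quoted values stay intact.
    extra = shlex.split(ptx_options) if ptx_options else []
    return ["ptxas", f"--gpu-name={arch}", *extra, ptx_path, "-o", cubin_path]


# Hypothetical file names and target architecture, for illustration only.
cmd = build_ptxas_command(
    "kernel.ptx", "kernel.cubin", "sm_80",
    ptx_options="--opt-level=0 --verbose",
)
print(cmd)
```

Keeping the options as a single string leaves the kernel-call interface small, while still allowing several flags to be combined in one call.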
Testing done:
This was tested by modifying the following python tutorials:
* `02-fused-softmax`
* `03-matrix-multiplication`
I also checked the behavior of cached compiles and can confirm that the
cache works as expected when different options are used for a given kernel.
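The cache behavior described above can be pictured with a small sketch: if the compile options are part of the cache key, the same kernel compiled with different `ptx_options` gets separate cache entries. This is only an illustration of the idea under that assumption, not Triton's cache implementation; `cache_key` and the placeholder source string are hypothetical.

```python
import hashlib
import json


def cache_key(kernel_source, options):
    """Hash the kernel source together with its compile options."""
    # sort_keys makes the serialized form independent of dict ordering.
    payload = json.dumps({"src": kernel_source, "opts": options}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Placeholder standing in for the real kernel source text.
src = "def matmul_kernel(...): ..."
key_o0 = cache_key(src, {"ptx_options": "--opt-level=0"})
key_o3 = cache_key(src, {"ptx_options": "--opt-level=3"})

# Different options give different keys; identical options reuse the same key.
print(key_o0 != key_o3)
print(key_o0 == cache_key(src, {"ptx_options": "--opt-level=0"}))
```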
---------
Co-authored-by: Pedro Torruella <[email protected]>

1 parent 5c9e545, commit ba94c21 (triton-lang#6993)
1 file changed, with 12 additions and 2 deletions: one line added at line 109, and original lines 410-411 replaced by new lines 411-421.