Commit f07b7f7
[SWDEV-539076] Initial naive foreach autotune support (#2377)
Adds initial, naive autotuning support for foreach kernels, required for
https://ontrack-internal.amd.com/browse/SWDEV-539076
Up to roughly 4x faster for some kernels (triton_for_fused_18: 4.986 ms to 1.273 ms, about 3.9x).
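For readers unfamiliar with the term, a naive autotuner can be sketched as "benchmark every candidate launch config and keep the fastest". The sketch below is only an illustration of that idea, not the code in this commit: `pick_fastest`, the `num_warps` search space, and the timings are all hypothetical placeholders, and the real change lives in the two torch/_inductor files listed further down.

```python
from typing import Callable, Iterable, Tuple


def pick_fastest(
    configs: Iterable[dict], bench_ms: Callable[[dict], float]
) -> Tuple[dict, float]:
    """Benchmark every candidate config and return (fastest config, its time in ms)."""
    best_cfg, best_ms = None, float("inf")
    for cfg in configs:
        ms = bench_ms(cfg)  # one measurement per candidate, no pruning
        if ms < best_ms:
            best_cfg, best_ms = cfg, ms
    if best_cfg is None:
        raise ValueError("no candidate configs were supplied")
    return best_cfg, best_ms


# Hypothetical search space: assume only num_warps varies between candidates.
candidates = [{"num_warps": w} for w in (1, 2, 4, 8)]

# Made-up placeholder timings (ms) standing in for real kernel launches.
fake_timings = {1: 4.9, 2: 2.7, 4: 1.3, 8: 1.6}

best, best_ms = pick_fastest(candidates, lambda cfg: fake_timings[cfg["num_warps"]])
print(best, best_ms)
```

In a real setup `bench_ms` would launch the compiled kernel and time it; it is passed in as a callable here so the selection logic can be exercised without a GPU.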
Before:
triton_for_fused_18.kd | 4.986 ms | 4.986 ms | 2.493 ms | 2 |
triton_for_fused_6.kd | 0.098 ms | 0.098 ms | 0.049 ms | 2 |
triton_for_fused_7.kd | 0.036 ms | 0.036 ms | 0.018 ms | 2 |
After:
triton_for_fused_18.kd | 1.273 ms | 1.273 ms | 0.636 ms | 2 |
triton_for_fused_6.kd | 0.044 ms | 0.044 ms | 0.022 ms | 2 |
triton_for_fused_7.kd | 0.024 ms | 0.024 ms | 0.012 ms | 2 |
1 parent 30508ff commit f07b7f7
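The per-kernel speedup implied by the Before/After rows can be checked with a few lines of Python. The numbers are copied from the first duration column of the commit message; the variable and function names are illustrative only.

```python
# Durations in ms, copied from the Before/After rows of the commit message.
before_ms = {
    "triton_for_fused_18": 4.986,
    "triton_for_fused_6": 0.098,
    "triton_for_fused_7": 0.036,
}
after_ms = {
    "triton_for_fused_18": 1.273,
    "triton_for_fused_6": 0.044,
    "triton_for_fused_7": 0.024,
}


def speedups(before: dict, after: dict) -> dict:
    """Return {kernel: before / after} for kernels present in both runs."""
    return {name: before[name] / after[name] for name in before if name in after}


ratios = speedups(before_ms, after_ms)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

This gives about 3.92x for triton_for_fused_18, 2.23x for triton_for_fused_6, and 1.50x for triton_for_fused_7, so the "4x" figure describes the largest kernel rather than all three.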
2 files changed, +13 -4 lines changed
- torch/_inductor
  - codegen
  - runtime
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| 614 | 614 | | |
| 615 | 615 | | |
| 616 | 616 | | |
| 617 | | - | |
| | 617 | + | |
| 618 | 618 | | |
| 619 | 619 | | |
| 620 | 620 | | |

| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| 2213 | 2213 | | |
| 2214 | 2214 | | |
| 2215 | 2215 | | |
| 2216 | | - | |
| | 2216 | + | |
| 2217 | 2217 | | |
| 2218 | 2218 | | |
| 2219 | 2219 | | |
| | 2220 | + | |
| | 2221 | + | |
| | 2222 | + | |
| | 2223 | + | |
| | 2224 | + | |
| | 2225 | + | |
| | 2226 | + | |
| | 2227 | + | |
| | 2228 | + | |
| | 2229 | + | |
| 2220 | 2230 | | |
| 2221 | 2231 | | |
| 2222 | | - | |
| | 2232 | + | |
| 2223 | 2233 | | |
| 2224 | 2234 | | |
| 2225 | 2235 | | |
| 2226 | 2236 | | |
| 2227 | 2237 | | |
| 2228 | 2238 | | |
| 2229 | | - | |
| 2230 | 2239 | | |
| 2231 | 2240 | | |
| 2232 | 2241 | | |