Commit 9801a7a
authored
Reapply "[KERNELS] Improve block sizes for batched matmul_ogs with small m/n/k (triton-lang#7897)" (triton-lang#8084)
This reverts commit 0a2e3a3.
(Verified that this is still faster on GB200 on top of recent fixes.)1 parent 1c03c46 commit 9801a7a
File tree
4 files changed
+80
-18
lines changed- python/triton_kernels
- tests
- triton_kernels
- matmul_ogs_details
- opt_flags_details
4 files changed
+80
-18
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
| 4 | + | |
4 | 5 | | |
5 | 6 | | |
6 | 7 | | |
| |||
470 | 471 | | |
471 | 472 | | |
472 | 473 | | |
| 474 | + | |
| 475 | + | |
| 476 | + | |
| 477 | + | |
| 478 | + | |
| 479 | + | |
| 480 | + | |
| 481 | + | |
| 482 | + | |
| 483 | + | |
| 484 | + | |
| 485 | + | |
| 486 | + | |
| 487 | + | |
| 488 | + | |
| 489 | + | |
| 490 | + | |
| 491 | + | |
| 492 | + | |
| 493 | + | |
| 494 | + | |
| 495 | + | |
| 496 | + | |
| 497 | + | |
| 498 | + | |
| 499 | + | |
| 500 | + | |
| 501 | + | |
| 502 | + | |
| 503 | + | |
| 504 | + | |
| 505 | + | |
| 506 | + | |
| 507 | + | |
| 508 | + | |
| 509 | + | |
| 510 | + | |
| 511 | + | |
| 512 | + | |
| 513 | + | |
| 514 | + | |
| 515 | + | |
| 516 | + | |
| 517 | + | |
| 518 | + | |
| 519 | + | |
| 520 | + | |
| 521 | + | |
| 522 | + | |
| 523 | + | |
| 524 | + | |
| 525 | + | |
473 | 526 | | |
474 | 527 | | |
475 | 528 | | |
476 | 529 | | |
477 | 530 | | |
478 | 531 | | |
479 | 532 | | |
480 | | - | |
| 533 | + | |
481 | 534 | | |
482 | 535 | | |
483 | 536 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
368 | 368 | | |
369 | 369 | | |
370 | 370 | | |
371 | | - | |
| 371 | + | |
372 | 372 | | |
373 | 373 | | |
374 | 374 | | |
| |||
Lines changed: 14 additions & 7 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
35 | 35 | | |
36 | 36 | | |
37 | 37 | | |
| 38 | + | |
38 | 39 | | |
39 | 40 | | |
40 | 41 | | |
| |||
133 | 134 | | |
134 | 135 | | |
135 | 136 | | |
| 137 | + | |
136 | 138 | | |
137 | 139 | | |
138 | 140 | | |
| |||
146 | 148 | | |
147 | 149 | | |
148 | 150 | | |
149 | | - | |
| 151 | + | |
150 | 152 | | |
151 | 153 | | |
152 | 154 | | |
| |||
164 | 166 | | |
165 | 167 | | |
166 | 168 | | |
167 | | - | |
| 169 | + | |
168 | 170 | | |
169 | | - | |
| 171 | + | |
170 | 172 | | |
171 | | - | |
| 173 | + | |
172 | 174 | | |
173 | 175 | | |
174 | 176 | | |
| |||
178 | 180 | | |
179 | 181 | | |
180 | 182 | | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
181 | 187 | | |
182 | 188 | | |
183 | 189 | | |
| |||
189 | 195 | | |
190 | 196 | | |
191 | 197 | | |
192 | | - | |
| 198 | + | |
193 | 199 | | |
194 | 200 | | |
195 | 201 | | |
| |||
224 | 230 | | |
225 | 231 | | |
226 | 232 | | |
227 | | - | |
| 233 | + | |
228 | 234 | | |
229 | 235 | | |
230 | 236 | | |
| |||
275 | 281 | | |
276 | 282 | | |
277 | 283 | | |
| 284 | + | |
278 | 285 | | |
279 | 286 | | |
280 | 287 | | |
| |||
291 | 298 | | |
292 | 299 | | |
293 | 300 | | |
294 | | - | |
| 301 | + | |
295 | 302 | | |
296 | 303 | | |
297 | 304 | | |
| |||
Lines changed: 11 additions & 9 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
6 | 6 | | |
7 | 7 | | |
8 | 8 | | |
9 | | - | |
10 | | - | |
| 9 | + | |
| 10 | + | |
11 | 11 | | |
12 | 12 | | |
13 | 13 | | |
14 | 14 | | |
15 | | - | |
| 15 | + | |
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
19 | 19 | | |
20 | 20 | | |
21 | 21 | | |
22 | | - | |
| 22 | + | |
23 | 23 | | |
24 | | - | |
| 24 | + | |
25 | 25 | | |
26 | | - | |
| 26 | + | |
| 27 | + | |
27 | 28 | | |
28 | 29 | | |
29 | 30 | | |
| |||
35 | 36 | | |
36 | 37 | | |
37 | 38 | | |
38 | | - | |
| 39 | + | |
| 40 | + | |
39 | 41 | | |
40 | 42 | | |
41 | 43 | | |
| |||
54 | 56 | | |
55 | 57 | | |
56 | 58 | | |
57 | | - | |
| 59 | + | |
58 | 60 | | |
59 | 61 | | |
60 | 62 | | |
61 | | - | |
| 63 | + | |
62 | 64 | | |
63 | 65 | | |
64 | 66 | | |
| |||
0 commit comments