Commit f277afd
perf: Enable 128x256 tile shapes for FP4 MOE CUTLASS backend (NVIDIA#5986)
Signed-off-by: Daniel Stokes <[email protected]>
1 parent c4ee535 commit f277afd
File tree
4 files changed, +13 -6 lines changed
- cpp/tensorrt_llm/kernels/cutlass_kernels
- moe_gemm
- launchers
- python

Lines changed: 4 additions & 2 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 383-385 | 383-385 | unchanged |
| 386-387 | | removed |
| | 386-389 | added |
| 388-390 | 390-392 | unchanged |

Lines changed: 7 additions & 2 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 342-344 | 342-344 | unchanged |
| | 345-349 | added |
| 345-346 | 350-351 | unchanged |
| 347 | | removed |
| | 352 | added |
| 348 | 353 | unchanged |
| 349 | | removed |
| | 354 | added |
| 350-352 | 355-357 | unchanged |

Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 159-161 | 159-161 | unchanged |
| 162 | | removed |
| | 162 | added |
| 163-165 | 163-165 | unchanged |

Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 359-361 | 359-361 | unchanged |
| 362 | | removed |
| | 362 | added |
| 363-365 | 363-365 | unchanged |