Commit e1fb6f6
[kernels] restore old behavior that output for tokens routed to zero experts should be zero-initialized (#7150)
#7140 introduced a subtle change in the semantics of `matmul_ogs`. We actually care that the output of rows that have `scatter_indx == -1` be zero-initialized, because some expert parallelism code may reduce them.

Also found a missing mask in the AMD implementation, which most likely explains the test failure.

1 parent 88a2851 commit e1fb6f6
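To illustrate why zero-initialization matters when outputs are reduced across expert-parallel ranks, here is a minimal pure-Python sketch. It is not the `matmul_ogs` kernel: the helpers `local_output` and `reduce_across_ranks`, the `GARBAGE` sentinel, and the reading of `scatter_indx` as "source row for each output row, or -1 if this rank has nothing for it" are all assumptions made for the example.

```python
# Hypothetical toy model, not the real triton_kernels code.
GARBAGE = 1e9  # stands in for whatever uninitialized memory might contain


def local_output(expert_rows, scatter_indx, zero_init):
    """Build one rank's output.

    Assumption: scatter_indx[r] is the source row for output row r,
    or -1 when no local expert produced that row.
    """
    n_cols = len(expert_rows[0])
    out = []
    for idx in scatter_indx:
        if idx == -1:
            fill = 0.0 if zero_init else GARBAGE
            out.append([fill] * n_cols)
        else:
            out.append(list(expert_rows[idx]))
    return out


def reduce_across_ranks(outputs):
    """Element-wise sum over ranks, as a sum-reduce would do."""
    n_rows, n_cols = len(outputs[0]), len(outputs[0][0])
    return [
        [sum(rank[r][c] for rank in outputs) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Two ranks, two tokens: rank 0 handles token 0, rank 1 handles token 1.
rank0 = ([[1.0, 2.0]], [0, -1])
rank1 = ([[3.0, 4.0]], [-1, 0])

good = reduce_across_ranks([local_output(*r, zero_init=True) for r in (rank0, rank1)])
bad = reduce_across_ranks([local_output(*r, zero_init=False) for r in (rank0, rank1)])
```

With zero-initialized rows the reduction recovers each token's value from the rank that computed it; with uninitialized rows the sum is polluted by whatever the unwritten rows held.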
2 files changed (+2, -2) in python/triton_kernels/triton_kernels/matmul_ogs_details

Lines changed: 1 addition & 1 deletion (line 294)
Lines changed: 1 addition & 1 deletion (line 390)