Commit 07dbc1a

ikawrakow and Iwan Kawrakow authored
Metal: much faster MoE prompt processing (#307)
* MoE improvements on Metal

  This version beats mainline, but there are things I don't understand:
  * Mainline has effectively gone to GEMV for MUL_MAT_ID. We can do the same, but we are 30% slower. Why?
  * Using actual GEMM, we beat mainline with a ubatch size of 128. But then performance degrades. Why?

* Some cleanup

* Much better

---------

Co-authored-by: Iwan Kawrakow <[email protected]>
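
For readers unfamiliar with the two strategies the message contrasts, here is a minimal sketch of an expert-indexed matrix multiplication computed both ways. It is not the Metal kernel code from this commit. The function names are hypothetical, and it assumes top-1 routing (one expert per token), whereas real MoE models typically route each token to several experts.

```python
# Illustrative only: plain-Python stand-ins, not the Metal kernels in this commit.
# Assumption: each token is routed to exactly one expert (top-1 routing).

def matvec(W, x):
    """y = W @ x for a row-major matrix W and a vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul_nt(X, W):
    """Y = X @ W^T: every row of X is multiplied by the same matrix W."""
    return [[sum(w * xi for w, xi in zip(row, x)) for row in W] for x in X]

def mul_mat_id_gemv(experts, tokens, ids):
    """GEMV style: one matrix-vector product per token with its routed expert."""
    return [matvec(experts[e], x) for x, e in zip(tokens, ids)]

def mul_mat_id_gemm(experts, tokens, ids):
    """GEMM style: gather tokens per expert, multiply as a batch, scatter back."""
    groups = {}
    for t, e in enumerate(ids):
        groups.setdefault(e, []).append(t)
    out = [None] * len(tokens)
    for e, rows in groups.items():
        # One matrix-matrix product per expert over all of its tokens.
        Y = matmul_nt([tokens[t] for t in rows], experts[e])
        for t, y in zip(rows, Y):
            out[t] = y
    return out
```

Both functions return the same values. They differ in how often each expert's weights are traversed: once per token in the GEMV form, once per group of tokens in the GEMM form. That is one plausible reason the batch size matters for the timings described above, though the commit message itself leaves the cause open.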
1 parent 6d405d1 commit 07dbc1a

File tree

2 files changed: +2118 -2042 lines changed

0 commit comments