Commit 27533e7
metal : improve FA + improve MoE (llama/12612)
* ggml : FA with different K, V head sizes (CPU)
ggml-ci
* metal : add FA with HS=192
* metal : extend FA to support different K and V head sizes
ggml-ci
* metal : add FA vector kernels for heads K 192 and V 128
ggml-ci
* ggml : restrict op on other backends to equal head sizes
ggml-ci
* metal : optimize FA-vec kernel
ggml-ci
* metal : FA remove mq registers
* metal : improve MoE mul_mat_id condition
ggml-ci
* metal : fix comments + remove unnecessary addition
ggml-ci
* metal : avoid too much shared memory usage with mul_mat_id
ggml-ci

1 parent 1b81415 commit 27533e7
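To illustrate what "different K and V head sizes" means for attention, here is a minimal reference sketch in plain Python. It is not the Metal or CPU kernel from this commit, and it does not use the fused (flash) formulation. It only shows that the query and keys share one head size while the output takes the head size of V. The function name `attention_single_query` and the toy inputs are hypothetical; the sizes 192 and 128 are taken from the commit message.

```python
import math

def attention_single_query(q, keys, values):
    """Plain scaled dot-product attention for one query vector.

    q      : list of DK floats
    keys   : n lists of DK floats
    values : n lists of DV floats (DV may differ from DK)
    Returns a list of DV floats.
    """
    dk = len(q)
    dv = len(values[0])
    assert len(keys) == len(values), "need one value row per key row"
    assert all(len(k) == dk for k in keys), "K head size must match Q"
    assert all(len(v) == dv for v in values), "V rows must share one size"

    scale = 1.0 / math.sqrt(dk)
    scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]

    # Numerically stable softmax over the scores.
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)

    # Weighted sum of value rows: the result has the V head size.
    out = [0.0] * dv
    for w, v in zip(weights, values):
        for j in range(dv):
            out[j] += (w / total) * v[j]
    return out

# Toy example with the head sizes named in the commit message.
DK, DV = 192, 128
q = [0.01] * DK
keys = [[0.0] * DK, [0.0] * DK]       # equal scores, so equal weights
values = [[1.0] * DV, [3.0] * DV]
out = attention_single_query(q, keys, values)
print(len(out))
```

Because the scores depend only on Q and K, and the output is a weighted sum of V rows, nothing in the math requires the two head sizes to match. Supporting the mismatch is a matter of how a kernel sizes its buffers and loops.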
File tree
8 files changed: +883 -678 lines changed
- ggml
  - include
  - src
    - ggml-cpu
    - ggml-cuda
    - ggml-metal
    - ggml-vulkan