Optimized DeepSeek V2/V3 implementation (MLA + flash attention) #12227

Closed
jukofyork wants to merge 5 commits into ggml-org:master from jukofyork:mla-with-flash-attention

Commits