
CUDA: 4D FlashAttention support #14628

Merged
ggerganov merged 2 commits into ggml-org:gg/llama-high-throughput from JohannesGaessler:cuda-fa-4d-3 on Jul 11, 2025
Commits

Commits on Jul 11, 2025