CUDA: use mma FA kernel for gqa > 4 on RTX 4000 #15035
Merged
JohannesGaessler merged 1 commit into ggml-org:master on Aug 2, 2025