CUDA: faster tile FA (Pascal/AMD), headsize 256 #15769
Merged
JohannesGaessler merged 1 commit into ggml-org:master on Sep 6, 2025