CUDA: use async data loading for FlashAttention #11894

Merged
JohannesGaessler merged 3 commits into ggml-org:master from JohannesGaessler:cuda-fa-mma-17
Feb 17, 2025

Commits

Commits on Feb 15, 2025

Commits on Feb 17, 2025