CUDA: broadcasting for FlashAttention mask #14500

Merged
JohannesGaessler merged 1 commit into ggml-org:gg/ggml-batch-soft-max-ops from JohannesGaessler:cuda-fa-mask-broadcast
Jul 2, 2025

Commits

Commits on Jul 2, 2025