Commit fdfc7de

metal : optimize multi-sequence FA vec kernel
ggml-ci
1 parent f078c79 commit fdfc7de

1 file changed: 5 additions, 0 deletions

ggml/src/ggml-metal/ggml-metal.metal

Lines changed: 5 additions & 0 deletions
@@ -3887,6 +3887,11 @@ kernel void kernel_flash_attn_ext_vec(
         sm[tiisg] = pm[ic + tiisg];
     }
 
+    // skip -INF blocks
+    if (simd_max(sm[tiisg]) == -INFINITY) {
+        continue;
+    }
+
     // Q*K^T
     {
         // each simdgroup processes 1 query and NE (NW/NL) head elements
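
Note on the added check: a block whose mask values are all -INF contributes exp(-INF) = 0 to every softmax term, so skipping it before the Q*K^T step should not change the result. The following is a minimal CPU sketch of that idea in C++, not the Metal kernel itself. The function `attend`, the block width `kBlock`, the scalar values, and the finite initial running maximum are all assumptions made for illustration; the inner loop over the mask stands in for `simd_max(sm[tiisg])`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical block width; stands in for the lanes of one simdgroup.
constexpr std::size_t kBlock = 4;

// Single-query attention over scalar values using an online softmax.
// scores[i] is the Q*K^T result for key i, mask[i] is an additive mask
// (0 or -INF), values[i] is a scalar stand-in for a row of V.
// With skip_masked == true, blocks whose mask is entirely -INF are skipped.
float attend(const std::vector<float>& scores,
             const std::vector<float>& mask,
             const std::vector<float>& values,
             bool skip_masked) {
    assert(scores.size() == mask.size() && scores.size() == values.size());
    assert(scores.size() % kBlock == 0);

    const float neg_inf = -std::numeric_limits<float>::infinity();

    // Assumption: a finite initial maximum keeps exp(M - m_new) free of NaN.
    float M = std::numeric_limits<float>::lowest() / 2.0f; // running maximum
    float S = 0.0f;                                        // running sum of exp
    float O = 0.0f;                                        // running weighted sum

    for (std::size_t ic = 0; ic < scores.size(); ic += kBlock) {
        // CPU analogue of simd_max(sm[tiisg]): largest mask value in the block.
        float block_max = neg_inf;
        for (std::size_t j = 0; j < kBlock; ++j) {
            block_max = std::max(block_max, mask[ic + j]);
        }

        // skip -INF blocks
        if (skip_masked && block_max == neg_inf) {
            continue;
        }

        for (std::size_t j = 0; j < kBlock; ++j) {
            const float s     = scores[ic + j] + mask[ic + j];
            const float m_new = std::max(M, s);
            const float scale = std::exp(M - m_new); // rescales earlier terms
            const float p     = std::exp(s - m_new); // 0 when s is -INF
            S = S * scale + p;
            O = O * scale + p * values[ic + j];
            M = m_new;
        }
    }

    // If every key was masked, return 0 rather than dividing by zero.
    return S > 0.0f ? O / S : 0.0f;
}
```

Calling `attend` on the same inputs with `skip_masked` set to `true` and to `false` should give matching outputs, since the skipped blocks only ever add zero to `S` and `O`. The benefit of the early `continue` is that the dot products for those blocks are never computed.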
