Commit 989f9e6

fixed incorrect padding for flash attn with swa
1 parent 186227f commit 989f9e6

1 file changed: +1 -0 lines changed

src/llama-kv-cache-unified-iswa.cpp

Lines changed: 1 addition & 0 deletions
@@ -31,6 +31,7 @@ llama_kv_cache_unified_iswa::llama_kv_cache_unified_iswa(
 
     //kcpp: pad the swa kv cache as well, similar to extra_context_handle_fragmentation
     size_swa += 32;
+    size_swa = GGML_PAD(size_swa, n_pad);
 
     // when using full-size SWA cache, we set the SWA cache size to be equal to the base cache size
     if (swa_full) {
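
To illustrate what the added line does, here is a minimal sketch of the arithmetic, not the project's actual code. It assumes GGML_PAD rounds its first argument up to the next multiple of a power-of-two second argument; the PAD_UP macro below is a local stand-in written under that assumption, and padded_swa_size is a hypothetical helper. The n_pad values used are illustrative only.

```cpp
#include <cstdint>

// Local stand-in for GGML_PAD (assumed behavior: round x up to a multiple
// of n, where n must be a power of two).
#define PAD_UP(x, n) (((x) + (n) - 1) & ~((n) - 1))

// Hypothetical helper mirroring the two lines in the hunk above:
// add 32 cells of slack, then round the result up to a multiple of n_pad.
constexpr uint32_t padded_swa_size(uint32_t size_swa, uint32_t n_pad) {
    return PAD_UP(size_swa + 32, n_pad);
}

// Without the re-pad, 1000 + 32 = 1032 is not a multiple of 256.
static_assert((1000u + 32u) % 256u != 0, "slack alone breaks alignment");

// With the re-pad, the size is rounded up to the next multiple of 256.
static_assert(padded_swa_size(1000, 256) == 1280, "rounded up to 5 * 256");

// A size that is already aligned after adding the slack is left unchanged.
static_assert(padded_swa_size(992, 256) == 1024, "already a multiple of 256");
```

The point of the sketch is that adding a fixed 32 to a previously padded size can leave it misaligned whenever n_pad is larger than 32, which appears to be the case the commit message refers to for flash attention.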
