Commit f8bcfe0

fix: Fix indexing into k_l for recurrent cache with filter

Branch: HybridCache
Signed-off-by: Gabe Goodhart <[email protected]>

1 parent: a02e242

File tree

1 file changed (+2 −2 lines)


src/llama-kv-cache.cpp (2 additions, 2 deletions)

@@ -1906,8 +1906,8 @@ llama_kv_cache_recurrent::llama_kv_cache_recurrent(
         ggml_tensor * v = ggml_new_tensor_1d(ctx, type_v, n_embd_v_gqa*kv_size);
         ggml_format_name(k, "cache_k_l%d", i);
         ggml_format_name(v, "cache_v_l%d", i);
-        k_l.push_back(k);
-        v_l.push_back(v);
+        k_l[i] = k;
+        v_l[i] = v;
     }

     // allocate tensors and initialize the buffers to avoid NaNs in the padding

Comments (0)