fix: llama_memory_seq_rm(mem, -1, ...) #15200
Conversation
The `-1` arguments trip `GGML_ASSERT(seq_id >= 0 && (size_t) seq_id < seq_to_stream.size());` in `llama_kv_cache_unified::seq_rm`.
Pull Request Overview
This PR fixes incorrect sequence ID parameters in `llama_memory_seq_rm` calls, changing `-1` to `0` to satisfy the function's assertion that sequence IDs must be non-negative.
- Changes the sequence ID parameter from `-1` to `0` in two `llama_memory_seq_rm` calls
- Addresses assertion failures in `llama_kv_cache_unified::seq_rm`, which requires `seq_id >= 0` (a minimal sketch of that contract follows this list)
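For context, here is a minimal sketch of the contract implied by that assertion. The `seq_to_stream` stand-in and the type aliases are illustrative, not llama.cpp source; only the assert itself is quoted from this PR:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using llama_seq_id = int32_t;
using llama_pos    = int32_t;

// hypothetical stand-in for the unified KV cache's per-sequence stream map
static std::vector<int> seq_to_stream(1);

static void seq_rm_sketch(llama_seq_id seq_id, llama_pos p0, llama_pos p1) {
    // the assertion quoted in this PR: the sequence ID must index a real stream
    assert(seq_id >= 0 && (size_t) seq_id < seq_to_stream.size());
    (void) p0; (void) p1; // negative positions mean "open-ended range", not wildcard
}

int main() {
    seq_rm_sketch(0, 5, -1);    // OK: clear sequence 0 from position 5 onward
    // seq_rm_sketch(-1, 5, -1); // would fire the assertion this PR works around
}
```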
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tools/main/main.cpp | Updates sequence ID parameter in memory sequence removal call |
| examples/lookahead/lookahead.cpp | Updates sequence ID parameter in memory sequence removal call |
In tools/main/main.cpp:

```diff
 // remove any "future" tokens that we might have inherited from the previous session
-llama_memory_seq_rm(mem, -1, n_matching_session_tokens, -1);
+llama_memory_seq_rm(mem, 0, n_matching_session_tokens, -1);
```
Copilot AI commented on Aug 9, 2025:
The fix changes the sequence ID from -1 to 0, but this may alter the intended behavior. Please verify that sequence ID 0 is the correct target sequence for removal, as this could affect which cached tokens are cleared.
Suggested change:

```diff
-llama_memory_seq_rm(mem, 0, n_matching_session_tokens, -1);
+llama_memory_seq_rm(mem, -1, n_matching_session_tokens, -1);
```
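For the behavioral question Copilot raises: older llama.cpp KV-cache removal APIs treated a negative sequence ID as "match any sequence". The sketch below illustrates that convention with a toy cell store (not llama.cpp source); under it, a tool like main.cpp that keeps all tokens in sequence 0 clears the same cells either way:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct cell { int32_t seq_id; int32_t pos; };

// toy removal: seq_id < 0 matches any sequence, seq_id >= 0 exactly one;
// p1 < 0 means "to the end"
static void seq_rm(std::vector<cell> & cells, int32_t seq_id, int32_t p0, int32_t p1) {
    std::erase_if(cells, [&](const cell & c) {
        const bool seq_match = seq_id < 0 || c.seq_id == seq_id;
        const bool pos_match = c.pos >= p0 && (p1 < 0 || c.pos < p1);
        return seq_match && pos_match;
    });
}

int main() {
    // single-sequence case: every cell lives in sequence 0
    std::vector<cell> a = { {0, 3}, {0, 4}, {0, 5} };
    std::vector<cell> b = a;
    seq_rm(a, -1, 4, -1); // wildcard
    seq_rm(b,  0, 4, -1); // explicit sequence 0
    return a.size() == b.size() ? 0 : 1; // both leave only {0, 3}
}
```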
In examples/lookahead/lookahead.cpp:

```diff
 // KV cache management
 // if no verification token matched, we simply remove all cells from this batch -> no fragmentation
-llama_memory_seq_rm(mem, -1, n_past, -1);
+llama_memory_seq_rm(mem, 0, n_past, -1);
```
Copilot AI commented on Aug 9, 2025:
The fix changes the sequence ID from -1 to 0, but this may alter the intended behavior. Please verify that sequence ID 0 is the correct target sequence for removal, as this could affect which cached tokens are cleared in the lookahead algorithm.
Suggested change:

```diff
-llama_memory_seq_rm(mem, 0, n_past, -1);
+llama_memory_seq_rm(mem, -1, n_past, -1);
```
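Lookahead, by contrast, appears to distribute its batch across several sequence IDs, so the wildcard and the single-sequence call are not interchangeable there. Continuing the illustrative `seq_rm` from the earlier sketch (still a toy model, not lookahead.cpp's actual cache state):

```cpp
// cells belonging to sequences 0..2, as lookahead-style multi-sequence decoding might leave them
std::vector<cell> w = { {0, 10}, {1, 10}, {2, 10} };
std::vector<cell> s = w;
seq_rm(w, -1, 5, -1); // wildcard: removes all three cells
seq_rm(s,  0, 5, -1); // removes only the seq-0 cell; sequences 1 and 2 keep theirs
```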
* llama : deprecate llama_kv_self_ API (ggml-ci)
* llama : allow llama_memory_(nullptr) (ggml-ci)
* memory : add flag for optional data clear in llama_memory_clear (ggml-ci)
Yes

Superseded by #15226
I believe it must be `llama_memory_seq_rm(mem, 0, ...)` because of the `GGML_ASSERT(seq_id >= 0 && (size_t) seq_id < seq_to_stream.size());` in `llama_kv_cache_unified::seq_rm`.

See: 745aa53#r163727848