Skip to content
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 4 additions & 3 deletions src/llama-memory-recurrent.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -151,7 +151,8 @@ bool llama_memory_recurrent::seq_rm(llama_seq_id seq_id, llama_pos p0, llama_pos
p1 = std::numeric_limits<llama_pos>::max();
}

// models like Mamba or RWKV can't have a state partially erased
// models like Mamba or RWKV can't have a state partially erased at the end
// of the sequence because their state isn't preserved for previous tokens
if (seq_id >= (int64_t) size) {
// could be fatal
return false;
Expand All @@ -160,8 +161,8 @@ bool llama_memory_recurrent::seq_rm(llama_seq_id seq_id, llama_pos p0, llama_pos
int32_t & tail_id = cells[seq_id].tail;
if (tail_id >= 0) {
const auto & cell = cells[tail_id];
// partial intersection is invalid
if ((0 < p0 && p0 < cell.pos) || (0 < p1 && p1 <= cell.pos)) {
// partial intersection is invalid if it includes the final pos
if ((0 < p0 && p1 > cell.pos)) {
//printf("[DEBUG] inside `llama_memory_recurrent::seq_rm`: partial intersection is invalid, so returning false\n");
return false;
}
Expand Down
5 changes: 4 additions & 1 deletion tools/main/main.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -354,7 +354,10 @@ int main(int argc, char ** argv) {
}

// remove any "future" tokens that we might have inherited from the previous session
llama_memory_seq_rm(mem, -1, n_matching_session_tokens, -1);
if (!llama_memory_seq_rm(mem, -1, n_matching_session_tokens, -1)) {
LOG_INF("%s: unable to resuse common prefix\n", __func__);
llama_memory_seq_rm(mem, -1, -1, -1);
Comment on lines 357 to 360
Copy link
Collaborator

@compilade compilade Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if this will correctly trigger reevaluation of the context afterwards.

n_matching_session_tokens seems to be used mostly for cosmetic purposes except when it's equal to embd_inp.size(), so maybe it could be set to 0.

But I think there might be something else necessary here. (For example, should session_tokens be cleared?)

(This is mostly relevant when reloading a saved session, and not necessarily during context shifting, though)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, yeah, I'm still a little murky on the logic going on here. I'll look more carefully and see what I can find.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry folks,
I am new to merge management of commits.
b8f9c56923ab34bf24d4fe5476efe9b02736918d
Don't know who and when presses merge button for the commit after all checks are passed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're holding on this until I resolve the question in this thread. I'll try to get to that today!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see what you're getting at here with the clause below that recalculates the cached logits for the final token. I think the right thing to do here is to set n_matching_session_tokens to 0 since we're essentially saying there are no matching tokens anymore.

}
}

LOG_DBG("recalculate the cached logits (check): embd_inp.size() %zu, n_matching_session_tokens %zu, embd_inp.size() %zu, session_tokens.size() %zu\n",
Expand Down
Loading