Name and Version
version: 6721 (56b4795)
built with clang version 19.1.5 for x86_64-pc-windows-msvc
Operating systems
Windows
Which llama.cpp modules do you know to be affected?
llama-server
Command line
llama-server -m LFM2-2.6B-Q8_0.gguf --jinja -ngl 999 -fa on -c 32768
Starting the server with --swa-full does not change the behavior either.
Problem description & steps to reproduce
For both LFM2-2.6B and LFM2-8B-A1 (arch lfm2 and lfm2moe), the server always logs "forcing full prompt re-processing" and clears the prompt cache on every request, even when most of the prompt is unchanged.
To reproduce: modify only the last part of the prompt and send the request again; the cached prefix is discarded and the whole prompt is re-evaluated (see the reproduction sketch below).
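A minimal reproduction sketch, assuming the server started with the command line above is listening on the default http://localhost:8080 (the prompt text itself is arbitrary). Two requests share a long prefix and differ only near the end, so the second request should be served mostly from the prompt cache instead of being fully re-processed:

```python
# Reproduction sketch (assumed host/port; prompt content is placeholder text).
import json
import urllib.request

URL = "http://localhost:8080/v1/chat/completions"  # assumed default llama-server address

def chat(user_text: str) -> str:
    """Send one chat completion request and return the generated text."""
    payload = {
        "messages": [{"role": "user", "content": user_text}],
        "max_tokens": 32,
    }
    req = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Long shared prefix so the slot is selected with a high lcs similarity.
prefix = "word " * 8000

chat(prefix + "First question?")
chat(prefix + "Second, slightly different question?")
# Watch the server log: the second request reports a high lcs similarity,
# yet still prints "forcing full prompt re-processing" and re-evaluates
# the entire prompt from n_past = 0.
```

With the log below, the slot is picked at similarity 0.922, but processing still restarts from n_past = 0.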
First Bad Commit
No response
Relevant log output
srv params_from_: Chat format: Content-only
slot get_availabl: id 0 | task 1912 | selected slot by lcs similarity, lcs_len = 7657, similarity = 0.922 (> 0.100 thold)
slot launch_slot_: id 0 | task 2068 | processing task
slot update_slots: id 0 | task 2068 | new prompt, n_ctx_slot = 32768, n_keep = 0, n_prompt_tokens = 8155
slot update_slots: id 0 | task 2068 | n_past = 7657, cache_tokens.size() = 8306, seq_id = 0, pos_min = 8305, n_swa = 1
slot update_slots: id 0 | task 2068 | forcing full prompt re-processing due to lack of cache data (likely due to SWA or hybrid/recurrent memory, see https://github.com/ggml-org/llama.cpp/pull/13194#issuecomment-2868343055)
slot update_slots: id 0 | task 2068 | erased invalidated context checkpoint (pos_min = 8092, pos_max = 8092, n_swa = 1, size = 0.344 MiB)
slot update_slots: id 0 | task 2068 | n_past = 0, memory_seq_rm [0, end)
slot update_slots: id 0 | task 2068 | prompt processing progress, n_past = 2048, n_tokens = 2048, progress = 0.251134
slot update_slots: id 0 | task 2068 | n_past = 2048, memory_seq_rm [2048, end)
slot update_slots: id 0 | task 2068 | prompt processing progress, n_past = 4096, n_tokens = 2048, progress = 0.502269
slot update_slots: id 0 | task 2068 | n_past = 4096, memory_seq_rm [4096, end)
slot update_slots: id 0 | task 2068 | prompt processing progress, n_past = 6144, n_tokens = 2048, progress = 0.753403
slot update_slots: id 0 | task 2068 | n_past = 6144, memory_seq_rm [6144, end)
slot update_slots: id 0 | task 2068 | prompt processing progress, n_past = 8091, n_tokens = 1947, progress = 0.992152
slot update_slots: id 0 | task 2068 | n_past = 8091, memory_seq_rm [8091, end)
slot update_slots: id 0 | task 2068 | prompt processing progress, n_past = 8155, n_tokens = 64, progress = 1.000000
slot update_slots: id 0 | task 2068 | prompt done, n_past = 8155, n_tokens = 64
slot update_slots: id 0 | task 2068 | saved context checkpoint 1 of 3 (pos_min = 8090, pos_max = 8090, size = 0.344 MiB)
srv cancel_tasks: cancel task, id_task = 2068
srv log_server_r: request: POST /v1/chat/completions 192.168.1.199 200
slot release: id 0 | task 2068 | stop processing: n_past = 8299, truncated = 0
srv update_slots: all slots are idle