Commit c642bc0
authored
kv-cache : separate recurrent vs non-recurrent impl (#12799)
* kv-cache : serparate recurrent vs non-recurrent impl (wip)
ggml-ci
* kv-cache : init -> contructor + add llama_memory_params
ggml-ci
* kv-cache : fix callback reference
ggml-ci
* context : llama_kv_cache -> llama_memory_i
ggml-ci
* context : move memory creation logic to model
ggml-ci
* llama : remove reference of memory during encode
ggml-ci
* kv-cache : hide padding details in the implementation
ggml-ci
* kv-cache : add ubatch_next()
ggml-ci
* context : simplify sbatch logic
ggml-ci
* kv-cache : hide defrag logic in the implementation
ggml-ci
* context : hide kv cache details in implementation
ggml-ci
* build : fix
ggml-ci
* cont : another fix
ggml-ci
* kv-cache : simplify interface (wip)
ggml-ci
* kv-cache : use separate KV cell structs for unified/recurrent
ggml-ci
* kv-cache : clean-up
ggml-ci
* model : better llama_model::create_model() signature
ggml-ci
* kv-cache : fix recurrent seq_rm()
ggml-ci
* kv-cache : replace `struct callbacks` with `llama_model &`
ggml-ci
* kv-cache : replace `struct graph_params` with `llama_context &`
ggml-ci
* kv-cache : fix offload check
ggml-ci
* context : avoid passing unique_ptr
ggml-ci
* kv-cache : avoid using the backends from the llama_context
ref #13113
ggml-ci
* kv-cache : more consistent debug logs [no ci]
* kv-cache : do not pass the full llama_context for kv graphs
ggml-ci
* kv-cache : remove comment
* kv-cache : ggml_rope_ext_inplace -> ggml_rope_ext
ggml-ci
* kv-cache : fix recurrent multi-user case
ggml-ci
* memory : remove comments [no ci]1 parent cb06a3c commit c642bc0
File tree
11 files changed
+1964
-1052
lines changed- src
11 files changed
+1964
-1052
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
189 | 189 | | |
190 | 190 | | |
191 | 191 | | |
192 | | - | |
| 192 | + | |
193 | 193 | | |
194 | 194 | | |
195 | 195 | | |
| |||
203 | 203 | | |
204 | 204 | | |
205 | 205 | | |
| 206 | + | |
206 | 207 | | |
207 | 208 | | |
208 | 209 | | |
| |||
212 | 213 | | |
213 | 214 | | |
214 | 215 | | |
| 216 | + | |
215 | 217 | | |
216 | 218 | | |
217 | 219 | | |
| |||
239 | 241 | | |
240 | 242 | | |
241 | 243 | | |
| 244 | + | |
242 | 245 | | |
243 | 246 | | |
244 | 247 | | |
| |||
262 | 265 | | |
263 | 266 | | |
264 | 267 | | |
| 268 | + | |
265 | 269 | | |
266 | 270 | | |
267 | 271 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
70 | 70 | | |
71 | 71 | | |
72 | 72 | | |
73 | | - | |
| 73 | + | |
| 74 | + | |
74 | 75 | | |
75 | 76 | | |
76 | 77 | | |
| |||
0 commit comments