Releases: s-Nick/llama.cpp
b5587
releases : use dl backend for linux release, remove arm64 linux relea…
b5518
convert : fix tensor naming conflict for llama 4 vision (#13836)
* convert : fix tensor naming conflict for llama 4 vision
* add comment
b5494
server: fix regression on streamed non-chat completion w/ stops (#13785)
* more forgiving message diffs: partial stop words aren't erased, full stops are
* Add (slow) server test for completion + stream + stop
b5466
server : support audio input (#13714)
* server : support audio input
* add audio support on webui
b5435
llama : remove llama_kv_cache_view API + remove deprecated (#13653)
ggml-ci
b5401
minja: sync (qwen3) (#13573)
* minja: sync https://github.com/google/minja/commit/f06140fa52fd140fe38e531ec373d8dc9c86aa06
- https://github.com/google/minja/pull/67 (@grf53)
- https://github.com/google/minja/pull/66 (@taha-yassine)
- https://github.com/google/minja/pull/63 (@grf53)
- https://github.com/google/minja/pull/58
---------
Co-authored-by: ochafik
b5353
CUDA: fix misaligned synchronization in FA (#13469)
b5318
ci : limit write permission to only the release step + fixes (#13392)
* ci : limit write permission to only the release step
* fix win cuda file name
* fix license file copy on multi-config generators
b5283
clip : fix confused naming ffn_up and ffn_down (#13290)
* clip : fix confused naming ffn_up and ffn_down
* rm ffn_i/o/g naming
* rename n_embd, n_ff
* small fix
* no check n_ff
b5209
llama : (mrope) allow using normal 1D position for text token (#13138)
* llama : (mrope) use normal position for text token
* rm n_pos_per_embd from llm_graph_input_attn_temp