Releases: EAddario/llama.cpp
b5497
server: fix streaming crashes (#13786)
* add preludes to content on partial regex match
* allow all parsers to parse non-tool-call content.
* tweak order of <|python_tag|> vs <function= parsing for functionary v3.1 format. still not ideal but hopefully less prone to crash
b5478
`server`: streaming of tool calls and thoughts when `--jinja` is on (…
b5476
releases : enable openmp in windows cpu backend build (#13756)
b5373
scripts : fix compare-llama-bench.py show parameter (#13514)
b5343
docs : Fix typo in InternVL3 model name (#13440)
b5269
llama : move end-user examples to tools directory (#13249)
Co-authored-by: Xuan Son Nguyen
b5215
model : Nomic Embed Text V2 with Mixture-of-Experts (MoE) architectur…
b5200
llama-bench : Add `--override-tensors` arg (#12922)
* Add --override-tensors option to llama-bench
* Correct llama-bench --override-tensors to --override-tensor
* llama-bench: Update --override-tensors parsing to match --tensor-split, appear in test matrix.
* Make new llama-bench util functions static to fix Ubuntu CI
* llama-bench: Correct -ot corner cases (No -ot calls, leading and trailing empty -ot spans, etc.)
b5191
llama : fix K-shift with quantized K and BLAS backend (#13113)
b5156
clip : refactor, add `image_manipulation` and `llava_uhd` classes (#1…