Releases: EAddario/llama.cpp
b5890
quantize : fix minor logic flaw in --tensor-type (#14572)
b5873
model : support LiquidAI LFM2 hybrid family (#14620)

**Important:** LFM2 was [merged](https://github.com/huggingface/transformers/pull/39340) into transformers, but has not yet been released. To convert to GGUF, install transformers from source:

```shell
pip install "transformers @ git+https://github.com/huggingface/transformers.git@main"
```
b5837
llama : remove ggml_cont where possible (#14568)
b5833
vulkan: Handle updated FA dim2/3 definition (#14518)

* vulkan: Handle updated FA dim2/3 definition. Pack the mask boolean and n_head_log2 into a single dword to keep the push constant block under the 128B limit.
* handle null mask for GQA
* allow GQA with dim3 > 1
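The packing trick described in that entry can be sketched in plain C++. The bit layout below (flag in the top bit, value in the low 31 bits) is an illustrative assumption, not the actual shader push-constant layout:

```cpp
#include <cassert>
#include <cstdint>

// Sketch: pack a boolean flag and a small integer into one 32-bit dword,
// since Vulkan only guarantees 128 bytes of push constants. Bit 31 holds
// the flag; bits 0..30 hold the value. Field widths are assumptions here.
static uint32_t pack_mask_nhead(bool has_mask, uint32_t n_head_log2) {
    assert(n_head_log2 < (1u << 31)); // value must fit in the low 31 bits
    return (static_cast<uint32_t>(has_mask) << 31) | n_head_log2;
}

static bool unpack_mask(uint32_t packed) {
    return (packed >> 31) != 0;
}

static uint32_t unpack_nhead(uint32_t packed) {
    return packed & 0x7FFFFFFFu;
}
```

On the GPU side the shader would unpack the same dword with identical shifts and masks, so both sides only need to agree on the bit layout.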
b5707
sycl: Cleanup codepaths in Get Rows in sycl backend (#14215) Addresses unused reorder path
b5672
quantize : change int to unsigned int for KV overrides (#14197)
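The kind of pitfall such an int-to-unsigned change guards against can be illustrated with a generic sketch (not the actual llama.cpp code; names are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// Generic illustration: a count or index that can never be negative is
// safer as an unsigned type, because a signed -1 silently wraps to a huge
// value when converted, and mixed signed/unsigned comparisons invite
// out-of-range accesses.
static bool index_in_range(unsigned int idx, unsigned int count) {
    return idx < count; // well-defined: both operands unsigned
}
```

The range check still rejects a wrapped negative value, but keeping the type unsigned end-to-end avoids the conversion in the first place.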
b5669
kv-cache : fix use-after-move of defrag info (#14189) ggml-ci
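The bug class behind this fix can be shown with a minimal sketch (types and names are hypothetical, not llama.cpp's actual defrag code):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the moved object.
struct defrag_info {
    std::vector<int> moves;
};

// Buggy pattern (for contrast): reading `info` after it was moved from
// leaves the result unspecified:
//     queue.push_back(std::move(info));
//     return info.moves.size(); // use-after-move
//
// Fixed pattern: capture what you need before the move.
static std::size_t apply_fixed(std::vector<defrag_info> & queue,
                               defrag_info && info) {
    const std::size_t n = info.moves.size(); // read before moving
    queue.push_back(std::move(info));
    return n;
}
```

A moved-from standard container is in a valid but unspecified state, so the buggy version may "work" in testing and still break later, which is why such fixes matter even without a visible crash.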
b5663
compare-llama-bench: add option to plot (#14169)

* compare llama-bench: add option to plot
* Address review comments: convert case + add type hints
* Add matplotlib to requirements
* fix tests
* Improve comment and fix assert condition for test
* Add back default test_name, add --plot_log_scale
* use log_scale regardless of x_values
b5649
vocab : prevent heap overflow when vocab is too small (#14145) ggml-ci
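The shape of the guard such a fix adds can be sketched as follows (a generic illustration with hypothetical names, not llama.cpp's vocab API):

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Sketch: validate a token id against the actual vocab size before
// indexing, instead of trusting that the model file's vocab is large
// enough. A too-small vocab must produce an error, not a heap overflow.
static const std::string * token_text(const std::vector<std::string> & vocab,
                                      int32_t id) {
    if (id < 0 || static_cast<std::size_t>(id) >= vocab.size()) {
        return nullptr; // out of range: reject instead of reading past the end
    }
    return &vocab[static_cast<std::size_t>(id)];
}
```

Returning a sentinel (here `nullptr`) pushes the error to the caller, which is the usual defensive pattern when the index originates from untrusted file contents.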
b5530
llama : add RobertaForSequenceClassification reranker support (#13875)