Releases: EAddario/llama.cpp

b6792

18 Oct 09:56
8138785

opencl: transposed gemm/gemv moe kernel with mxfp4,f32 (#16602)

* opencl: transposed gemm/gemv moe kernel with mxfp4,f32

* add restore kernel for moe transpose

* fix trailing whitespaces

* resolve compilation warnings

b6779

16 Oct 11:41
7a50cf3

CANN: format code using .clang-format (#15863)

This commit applies .clang-format rules to all source files under the
ggml-cann directory to ensure consistent coding style and readability.
The .clang-format option `SortIncludes: false` has been set to disable
automatic reordering of include directives.
No functional changes are introduced.

Co-authored-by: hipudding <[email protected]>

b6731

11 Oct 09:22
477a66b

convert : correctly handle LLaMA tokenizer for Jamba (#16470)

* fix: convert_hf_to_gguf - change Jamba non-sentencepiece mode (tokenizer.json) vocab construction

* fix: convert_hf_to_gguf - jamba non-sentencepiece tokenizer to use _set_vocab_llama_hf func

* fix: convert_hf_to_gguf - removed get_vocab_base_pre from jamba

b6727

10 Oct 11:28
cdb6da4

server : log requests to /v1/completions (#16495)

b6686

03 Oct 22:05
128d522

chat : support Magistral thinking (#16413)

* feat: added a dedicated Magistral chat format that preserves [THINK] spans, parses reasoning before tool calls

* feat: new flow in the chat template test suite for Magistral
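
To illustrate what separating a reasoning span from the rest of the output can look like, here is a minimal sketch. It is not the llama.cpp implementation: `ParsedMessage` and `split_think` are hypothetical names, and the closing marker `[/THINK]` is an assumption about the template.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not the llama.cpp API: splits a leading
// [THINK]...[/THINK] span from the remainder of the model output.
struct ParsedMessage {
    std::string reasoning;  // text found inside the [THINK] span
    std::string content;    // everything after the closing marker
};

inline ParsedMessage split_think(const std::string & text) {
    static const std::string open_tag  = "[THINK]";
    static const std::string close_tag = "[/THINK]";  // assumed closing marker

    ParsedMessage out;

    // No leading reasoning span: the whole text is regular content.
    if (text.compare(0, open_tag.size(), open_tag) != 0) {
        out.content = text;
        return out;
    }

    const std::size_t end = text.find(close_tag, open_tag.size());
    if (end == std::string::npos) {
        // Unterminated span: treat the remainder as reasoning.
        out.reasoning = text.substr(open_tag.size());
        return out;
    }

    out.reasoning = text.substr(open_tag.size(), end - open_tag.size());
    out.content   = text.substr(end + close_tag.size());
    return out;
}
```

In this sketch a tool-call parser would then run on `content` only, so the reasoning is extracted before any tool calls are looked for.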

b6683

03 Oct 15:00
946f71e

llama : fix shapes for bert/mpt q/k norm (#16409)

b6679

03 Oct 11:05
0e1f838

vulkan: Fix FA coopmat1 invalid array indexing (#16365)

When computing sinks, the cm1 shader was looping r from 0 to Br rather than
to rows_per_thread. I must have copied this from the scalar path (where it is
correct), and somehow it wasn't causing failures on current drivers.
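
The class of bug described above can be sketched in plain C++. This is an illustration only, not the shader code: the sizes, `PerThreadMax`, and `apply_sink` are hypothetical, and it assumes the per-thread state is an array with `rows_per_thread` entries, so a loop bounded by `Br` would index past its end.

```cpp
#include <array>
#include <cstddef>

// Hypothetical sizes for illustration; the real shader's values may differ.
constexpr std::size_t Br              = 16;  // rows in the workgroup tile
constexpr std::size_t rows_per_thread = 4;   // rows owned by a single thread

static_assert(rows_per_thread <= Br, "a thread owns at most Br rows");

// Assumed per-thread state: one running-max slot per owned row.
using PerThreadMax = std::array<float, rows_per_thread>;

// Fold a sink value into each owned row's running max.
// The loop bound matches the array extent (rows_per_thread). Using Br here
// would read and write out of bounds whenever rows_per_thread < Br.
inline void apply_sink(PerThreadMax & row_max, float sink) {
    for (std::size_t r = 0; r < rows_per_thread; ++r) {
        if (sink > row_max[r]) {
            row_max[r] = sink;
        }
    }
}
```

A scalar path in which each thread holds state for all `Br` rows can legitimately loop to `Br`, which is consistent with the note that the bound was correct there.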

b6660

01 Oct 18:28
4201dea

common: introduce http.h for httplib-based client (#16373)

* common: introduce http.h for httplib-based client

This change moves cpp-httplib based URL parsing and client setup into
a new header `common/http.h`, and integrates it in `arg.cpp` and `run.cpp`.

It is an iteration towards removing libcurl, while intentionally
minimizing changes to existing code to guarantee the same behavior when
`LLAMA_CURL` is used.

Signed-off-by: Adrien Gallouët <[email protected]>

* tools : add missing WIN32_LEAN_AND_MEAN

Signed-off-by: Adrien Gallouët <[email protected]>

---------

Signed-off-by: Adrien Gallouët <[email protected]>
Signed-off-by: Adrien Gallouët <[email protected]>

b6658

01 Oct 17:05
2a9b633

Improve code block color theming (#16325)

* feat: Improve code block theming

* chore: update webui build output

* chore: Update webui static build

b6527

20 Sep 21:43

sync : ggml