Skip to content

Releases: EAddario/llama.cpp

b6005

27 Jul 14:28
bf78f54

Choose a tag to compare

vulkan: add ops docs (#14900)

b5996

26 Jul 13:16
11dd5a4

Choose a tag to compare

CANN: Implement GLU ops (#14884)

Implement REGLU, GEGLU, SWIGLU ops according to #14158

b5995

26 Jul 06:52
9b8f3c6

Choose a tag to compare

musa: fix build warnings (unused variable) (#14869)

Signed-off-by: Xiaodong Ye <[email protected]>

b5994

25 Jul 22:12
c7f3169

Choose a tag to compare

ggml-cpu : disable GGML_NNPA by default due to instability (#14880)

* docs: update s390x document for sentencepiece

Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit e086c5e3a7ab3463d8e0906efcfa39352db0a48d)

* docs: update huggingface links + reword

Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit 8410b085ea8c46e22be38266147a1e94757ef108)

* ggml-cpu: disable ggml-nnpa compile flag by default

fixes #14877

Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit 412f4c7c88894b8f55846b4719c76892a23cfe09)

* docs: update s390x build docs to reflect nnpa disable

Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit c1eeae1d0c2edc74ab9fbeff2707b0d357cf0b4d)

---------

Signed-off-by: Aaron Teo <[email protected]>

b5971

23 Jul 14:07
221c0e0

Choose a tag to compare

ci : correct label refactor->refactoring (#14832)

b5964

22 Jul 22:27
acd6cb1

Choose a tag to compare

ggml : model card yaml tab->2xspace (#14819)

b5942

19 Jul 17:16
9008328

Choose a tag to compare

imatrix : use GGUF to store importance matrices (#9400)

* imatrix : allow processing multiple chunks per batch

* perplexity : simplify filling the batch

* imatrix : fix segfault when using a single chunk per batch

* imatrix : use GGUF to store imatrix data

* imatrix : fix conversion problems

* imatrix : use FMA and sort tensor names

* py : add requirements for legacy imatrix convert script

* perplexity : revert changes

* py : include imatrix converter requirements in toplevel requirements

* imatrix : avoid using designated initializers in C++

* imatrix : remove unused n_entries

* imatrix : allow loading mis-ordered tensors

Sums and counts tensors no longer need to be consecutive.

* imatrix : more sanity checks when loading multiple imatrix files

* imatrix : use ggml_format_name instead of std::string concatenation

Co-authored-by: Xuan Son Nguyen <[email protected]>

* quantize : use unused imatrix chunk_size with LLAMA_TRACE

* common : use GGUF for imatrix output by default

* imatrix : two-way conversion between old format and GGUF

* convert : remove imatrix to gguf python script

* imatrix : use the function name in more error messages

* imatrix : don't use FMA explicitly

This should make comparisons between the formats easier
because this matches the behavior of the previous version.

* imatrix : avoid returning from void function save_imatrix

* imatrix : support 3d tensors with MUL_MAT

* quantize : fix dataset name loading from gguf imatrix

* common : move string_remove_suffix from quantize and imatrix

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* imatrix : add warning when legacy format is written

* imatrix : warn when writing partial data, to help guess dataset coverage

Also make the legacy format store partial data
by using neutral values for missing data.
This matches what is done at read-time for the new format,
and so should get the same quality in case the old format is still used.

* imatrix : avoid loading model to convert or combine imatrix

* imatrix : avoid using imatrix.dat in README

---------

Co-authored-by: Xuan Son Nguyen <[email protected]>
Co-authored-by: Sigbjørn Skjæret <[email protected]>

b5939

19 Jul 12:23
f0d4d17

Choose a tag to compare

Documentation: Update build.md's Vulkan section (#14736)

* Documentation: Rewrote and updated the "Without docker" portion of the Vulkan backend build documentation.

* Documentation: Reorganize build.md's Vulkan section.

b5921

17 Jul 11:15
086cf81

Choose a tag to compare

llama : fix parallel processing for lfm2 (#14705)

b5906

16 Jul 08:07
4b91d6f

Choose a tag to compare

convert : only check for tokenizer folder if we need it (#14704)