Releases: gabriellarson/llama.cpp

b6475

15 Sep 07:56
b8e09f0

model : add grok-2 support (#15539)

* add grok-2 support

* type fix

* type fix

* type fix

* "fix" vocab for invalid sequences

* fix expert tensor mapping and spaces in vocab

* add chat template

* fix norm tensor mapping

* rename layer_out_norm to ffn_post_norm

* ensure ffn_post_norm is mapped

* fix experts merging

* remove erroneous FFN_GATE entry

* concatenate split tensors and add more metadata

* process all expert layers and try cat instead of hstack

* add support for community BPE vocab

* fix expert feed forward length and ffn_down concat

* commit this too

* add ffn_up/gate/down, unsure if sequence is right

* add ffn_gate/down/up to tensor names

* correct residual moe (still not working)

* mess--

* fix embedding scale being applied twice

* add built in chat template

* change beta fast for grok if default value

* remove spm vocab in favor of community bpe vocab

* change attention temp length metadata type to integer

* update attention temp length metadata

* remove comment

* replace M_SQRT2 with std::sqrt(2)

* add yarn metadata, move defaults to hparams

b6090

05 Aug 10:01
ee3a9fc

context : fix index overflow on huge outputs (#15080)

* context : fix overflow when re-ordering huge outputs

* context : fix logits size overflow for huge batches
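The patch itself is not shown in these notes, so the following is only a sketch of the general failure mode the two bullets describe: a count such as outputs times vocabulary size no longer fits in a signed 32-bit integer once the batch gets large enough. The helper names (`wrap_int32`, `logits_count_32`, `logits_count_64`) and the sizes are hypothetical and chosen for illustration; they are not taken from llama.cpp.

```python
def wrap_int32(x: int) -> int:
    """Reinterpret x as a 32-bit two's-complement signed integer."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def logits_count_32(n_outputs: int, n_vocab: int) -> int:
    # Simulates doing the multiplication in a 32-bit signed type.
    return wrap_int32(n_outputs * n_vocab)


def logits_count_64(n_outputs: int, n_vocab: int) -> int:
    # Simulates widening to a 64-bit type before multiplying.
    # Python integers do not overflow, so the plain product is exact.
    return n_outputs * n_vocab


# Hypothetical sizes: 20,000 output rows with a 128,000-entry vocabulary.
n_outputs, n_vocab = 20_000, 128_000

print(logits_count_32(n_outputs, n_vocab))  # wrapped, negative value
print(logits_count_64(n_outputs, n_vocab))  # true element count
```

The true product here is 2,560,000,000, which exceeds the 32-bit signed maximum of 2,147,483,647, so the narrow computation wraps to a negative number while the widened one stays correct.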

b6082

04 Aug 13:22
5aa1105

vulkan: fix build when using glslang that does not support coopmat2 (…

b6077

03 Aug 16:50
83bc2f2

model : add text-only support for Kimi-VL (and find special tokens in…

b6075

03 Aug 05:17
5c0eb5e

opencl: fix adreno compiler detection logic (#15029)

b6001

27 Jul 09:27
f1a4e72

vulkan: skip empty set_rows to avoid invalid API usage (#14860)

b5998

27 Jul 06:37
446595b

Docs: add instructions for adding backends (#14889)

b5994

26 Jul 01:53
c7f3169

ggml-cpu : disable GGML_NNPA by default due to instability (#14880)

* docs: update s390x document for sentencepiece

Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit e086c5e3a7ab3463d8e0906efcfa39352db0a48d)

* docs: update huggingface links + reword

Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit 8410b085ea8c46e22be38266147a1e94757ef108)

* ggml-cpu: disable ggml-nnpa compile flag by default

fixes #14877

Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit 412f4c7c88894b8f55846b4719c76892a23cfe09)

* docs: update s390x build docs to reflect nnpa disable

Signed-off-by: Aaron Teo <[email protected]>
(cherry picked from commit c1eeae1d0c2edc74ab9fbeff2707b0d357cf0b4d)

---------

Signed-off-by: Aaron Teo <[email protected]>

b5902

15 Jul 23:04
4a4f426

model : add Kimi-K2 support (#14654)

* Kimi-K2 conversion

* add Kimi_K2 pre type

* Kimi-K2

* Kimi-K2 unicode

* Kimi-K2

* LLAMA_MAX_EXPERTS 384

* fix vocab iteration

* regex space fix

* add kimi-k2 to pre_computed_hashes

* Updated with kimi-k2 get_vocab_base_pre hash

* fix whitespaces

* fix flake errors

* remove more unicode.cpp whitespaces

* change set_vocab() flow

* add moonshotai-Kimi-K2.jinja to /models/templates/

* update moonshotai-Kimi-K2.jinja

* add kimi-k2 chat template

* add kimi-k2

* update NotImplementedError

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* except Exception

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* LLM_CHAT_TEMPLATE_KIMI_K2 if(add_ass){}

---------

Co-authored-by: Sigbjørn Skjæret <[email protected]>

b5884

12 Jul 18:51
c31e606

tests : cover lfm2 cases in test_ssm_conv (#14651)