Releases: EAddario/llama.cpp

b6519

19 Sep 08:09
4b8560a

chat : fix build on arm64 (#16101)

b6475

15 Sep 07:39
b8e09f0

model : add grok-2 support (#15539)

* add grok-2 support

* type fix

* type fix

* type fix

* "fix" vocab for invalid sequences

* fix expert tensor mapping and spaces in vocab

* add chat template

* fix norm tensor mapping

* rename layer_out_norm to ffn_post_norm

* ensure ffn_post_norm is mapped

* fix experts merging

* remove erroneous FFN_GATE entry

* concatenate split tensors and add more metadata

* process all expert layers and try cat instead of hstack

* add support for community BPE vocab

* fix expert feed forward length and ffn_down concat

* commit this too

* add ffn_up/gate/down, unsure if sequence is right

* add ffn_gate/down/up to tensor names

* correct residual moe (still not working)

* mess--

* fix embedding scale being applied twice

* add built in chat template

* change beta fast for grok if default value

* remove spm vocab in favor of community bpe vocab

* change attention temp length metadata type to integer

* update attention temp length metadata

* remove comment

* replace M_SQRT2 with std::sqrt(2)

* add yarn metadata, move defaults to hparams

b6445

10 Sep 21:13
00681df

CUDA: Add `fastdiv` to `k_bin_bcast*`, giving 1-3% E2E performance (#…

b6399

06 Sep 12:53
61bdfd5

server : implement prompt processing progress report in stream mode (…

b6323

30 Aug 10:13
696fccf

vulkan: Skip syncing for prealloc_y when it is reused (#15544)

b6294

26 Aug 21:31
bcbddcd

tests : fix test-opt with GGML_BACKEND_DL (#15599)

b6275

25 Aug 18:36
4d917cd

vulkan: fix min subgroup 16 condition for mmid subgroup optimization …

b6264

24 Aug 20:32
043fb27

vulkan: apply MUL_MAT_ID subgroup optimization to non-coopmat devices…

b6239

21 Aug 19:07
cd36b5e

llama : remove deprecated llama_kv_self API (#15472)

ggml-ci

b6209

19 Aug 23:37
fb22dd0

opencl: mark `argsort` unsupported if cols exceed workgroup limit (#1…