Releases: EAddario/llama.cpp
b6519
chat : fix build on arm64 (#16101)
b6475
model : add grok-2 support (#15539)
* add grok-2 support
* type fix
* type fix
* type fix
* "fix" vocab for invalid sequences
* fix expert tensor mapping and spaces in vocab
* add chat template
* fix norm tensor mapping
* rename layer_out_norm to ffn_post_norm
* ensure ffn_post_norm is mapped
* fix experts merging
* remove erroneous FFN_GATE entry
* concatenate split tensors and add more metadata
* process all expert layers and try cat instead of hstack
* add support for community BPE vocab
* fix expert feed forward length and ffn_down concat
* commit this too
* add ffn_up/gate/down, unsure if sequence is right
* add ffn_gate/down/up to tensor names
* correct residual moe (still not working)
* mess--
* fix embedding scale being applied twice
* add built in chat template
* change beta fast for grok if default value
* remove spm vocab in favor of community bpe vocab
* change attention temp length metadata type to integer
* update attention temp length metadata
* remove comment
* replace M_SQRT2 with std::sqrt(2)
* add yarn metadata, move defaults to hparams
b6445
CUDA: Add `fastdiv` to `k_bin_bcast*`, giving 1-3% E2E performance (#…
b6399
server : implement prompt processing progress report in stream mode (…
b6323
vulkan: Skip syncing for prealloc_y when it is reused (#15544)
b6294
tests : fix test-opt with GGML_BACKEND_DL (#15599)
b6275
vulkan: fix min subgroup 16 condition for mmid subgroup optimization …
b6264
vulkan: apply MUL_MAT_ID subgroup optimization to non-coopmat devices…
b6239
llama : remove deprecated llama_kv_self API (#15472) ggml-ci
b6209
opencl: mark `argsort` unsupported if cols exceed workgroup limit (#1…