Releases: ReinForce-II/llama.cpp

b2972

23 May 04:03
cd93a28

CUDA: fix FA out-of-bounds reads (#7479)

b2961

22 May 02:40
201cc11

llama : add phi3 128K model support (#7225)

* add phi3 128k support in convert-hf-to-gguf

* add phi3 128k support in cuda

* address build warnings on llama.cpp

* adjust index value in cuda long rope freq factors

* add long rope support in ggml cpu backend

* make freq factors only depend on ctx size

* remove unused rope scaling type 'su' from gguf converter

* fix flint warnings on convert-hf-to-gguf.py

* set to the short freq factor when context size is smaller than trained context size

* add one line of comments

* metal : support rope freq_factors

* ggml : update ggml_rope_ext API to support freq. factors

* backends : add dev messages to support rope freq. factors

* minor : style

* tests : update to use new rope API

* backends : fix pragma semicolons

* minor : cleanup

* llama : move rope factors from KV header to tensors

* llama : remove tmp assert

* cuda : fix compile warning

* convert : read/write n_head_kv

* llama : fix uninitialized tensors

---------

Co-authored-by: Georgi Gerganov <[email protected]>

b2953

21 May 09:07
917dc8c

Tokenizer SPM fixes for phi-3 and llama-spm (#7375)

* Update brute force test: special tokens
* Fix added tokens
  - Try to read 'added_tokens.json'.
  - Try to read 'tokenizer_config.json'.
  - Try to read 'tokenizer.json'.
* Fix special tokens rtrim

Co-authored-by: Georgi Gerganov <[email protected]>
* server : fix test regexes

b2950

20 May 15:54
db10f01

rpc : track allocated buffers (#7411)

* rpc : track allocated buffers

ref: #7407

* rpc : pack rpc_tensor tightly

b2941

20 May 03:37
33c8d50

Add provisions for windows support for BF16 code including CMake prov…

b2876

14 May 11:42
5416002

llama : disable pipeline parallelism with nkvo (#7265)

b2837

10 May 09:18
d11afd6

llava : fix moondream support (#7163)

* Revert "Revert "llava : add support for moondream vision language model (#6899)""

This reverts commit 9da243b36ac0b9d609adfaaa4c8f1cc8c592f737.

* Fix num_positions and embeddings initialization