Releases: mingfeima/llama.cpp

b3542

08 Aug 04:13
15fa07a

make : use C compiler to build metal embed object (#8899)

* make : use C compiler to build metal embed object

* use rm + rmdir to avoid -r flag in rm

b3441

23 Jul 01:56
081fe43

llama : fix codeshell support (#8599)

* llama : fix codeshell support

* llama : move codeshell after smollm below to respect the enum order

b3401

16 Jul 06:10
7acfd4e

convert_hf : faster lazy safetensors (#8482)

* convert_hf : faster lazy safetensors

This makes '--dry-run' much, much faster.

* convert_hf : fix memory leak in lazy MoE conversion

The '_lazy' queue was sometimes self-referential,
which caused reference cycles of objects old enough
to avoid garbage collection until potential memory exhaustion.
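
The leak described above can be illustrated with a minimal sketch. It is not the convert_hf code: `LazyNode`, `make_cycle` and `make_acyclic` are hypothetical names, and the behaviour shown assumes CPython, where reference counting frees acyclic objects immediately while self-referential ones wait for the cycle collector.

```python
import gc
import weakref
from collections import deque


class LazyNode:
    """Hypothetical stand-in for a lazily evaluated tensor (not the real class)."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()  # pending work, loosely analogous to a '_lazy' queue


def make_cycle():
    node = LazyNode("expert.0")
    node.queue.append(node)  # the queue now refers back to its owner: a cycle
    return weakref.ref(node)


def make_acyclic():
    node = LazyNode("expert.1")
    node.queue.append("expert.1")  # store plain data instead of the owner
    return weakref.ref(node)


gc.disable()  # keep the cycle collector from running at an arbitrary moment
try:
    cyclic_ref = make_cycle()
    acyclic_ref = make_acyclic()

    # Reference counting alone frees the acyclic object on function return.
    acyclic_freed_early = acyclic_ref() is None

    # The self-referential object is still alive: only the cycle collector
    # can reclaim it.
    cyclic_alive_before_collect = cyclic_ref() is not None

    gc.collect()  # an explicit collection works even while gc is disabled
    cyclic_freed_after_collect = cyclic_ref() is None
finally:
    gc.enable()
```

In a long conversion, cyclic objects that survive the early collections are promoted to older generations, which CPython scans less often, so large buffers held by such cycles can pile up before they are reclaimed.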

b2972

23 May 02:58
cd93a28

CUDA: fix FA out-of-bounds reads (#7479)

b2688

17 Apr 06:40
facb8b5

convert : fix autoawq gemma (#6704)

* fix autoawq quantized gemma model convert error

Quantizing a Gemma model with AutoAWQ adds an lm_head.weight tensor to model-00001-of-00002.safetensors. convert-hf-to-gguf.py cannot map lm_head.weight, so the conversion fails. Skipping this tensor during loading avoids the error.

* change code to full string match and print necessary message

Match the tensor name against the full string and print a short message informing users that lm_head.weight has been skipped.

---------

Co-authored-by: Zheng.Deng
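
The difference between a full string match and a looser substring check can be sketched as follows. This is an illustration only, not the code in convert-hf-to-gguf.py: `filter_tensors`, the `skip` parameter, the message wording and the third tensor name in the sample list are all assumptions made for the example.

```python
def filter_tensors(names, skip=("lm_head.weight",)):
    """Split tensor names into kept and skipped lists using exact matching.

    Hypothetical helper: the real converter is structured differently.
    """
    kept, skipped = [], []
    for name in names:
        if name in skip:  # tuple membership compares whole strings
            print(f"Skipping tensor {name!r}")
            skipped.append(name)
        else:
            kept.append(name)
    return kept, skipped


names = [
    "model.embed_tokens.weight",
    "lm_head.weight",
    "model.layers.0.lm_head.weight_scale",  # made-up name containing the substring
]
kept, skipped = filter_tensors(names)
```

A substring test such as `"lm_head.weight" in name` would also drop the third entry, which is why exact matching is the safer choice when only one specific tensor should be ignored.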

b2586

03 Apr 04:50
5260486

[SYCL] Disable iqx on windows as WA (#6435)

* disable iqx on windows as a workaround (WA)

* array instead of global_memory

b2542

27 Mar 03:24
a4f569e

[SYCL] fix no file in win rel (#6314)