Releases: ggml-org/llama.cpp

b5010

31 Mar 13:25
a8a1f33
Vulkan: Add DP4A MMQ and Q8_1 quantization shader (#12135)

* Vulkan: Add DP4A MMQ and Q8_1 quantization shader

* Add q4_0 x q8_1 matrix matrix multiplication support

* Vulkan: Add int8 coopmat MMQ support

* Vulkan: Add q4_1, q5_0 and q5_1 quants, improve integer dot code

* Add GL_EXT_integer_dot_product check

* Remove ggml changes, fix mmq pipeline picker

* Remove ggml changes, restore Intel coopmat behaviour

* Fix glsl compile attempt when integer vec dot is not supported

* Remove redundant code, use non-saturating integer dot, enable all matmul sizes for mmq

* Remove redundant comment

* Fix integer dot check

* Fix compile issue with unsupported int dot glslc

* Update Windows build Vulkan SDK version
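The commits above add a quantized matmul path built on integer dot products: activations are quantized to a Q8_1-style block format (per-block scale plus int8 values), and the inner product is accumulated 4 lanes at a time the way a DP4A instruction does. As a rough CPU-side sketch of that idea (names and the 32-value block layout are illustrative, not the actual shader or ggml code):

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstdint>

// Q8_1-style block: one float scale for 32 int8 values.
struct BlockQ8 {
    float d;                   // per-block scale (amax / 127)
    std::array<int8_t, 32> qs; // quantized values
};

BlockQ8 quantize_block(const float * x) {
    float amax = 0.0f;
    for (int i = 0; i < 32; ++i) amax = std::max(amax, std::fabs(x[i]));
    const float d  = amax / 127.0f;
    const float id = d != 0.0f ? 1.0f / d : 0.0f;
    BlockQ8 b{d, {}};
    for (int i = 0; i < 32; ++i) b.qs[i] = (int8_t) std::lround(x[i] * id);
    return b;
}

// DP4A-style step: multiply-accumulate 4 int8 pairs into an int32.
int32_t dp4a(const int8_t * a, const int8_t * b, int32_t acc) {
    for (int i = 0; i < 4; ++i) acc += int32_t(a[i]) * int32_t(b[i]);
    return acc;
}

float dot_q8(const BlockQ8 & a, const BlockQ8 & b) {
    int32_t acc = 0;
    for (int i = 0; i < 32; i += 4) acc = dp4a(&a.qs[i], &b.qs[i], acc);
    return a.d * b.d * float(acc); // dequantize the integer sum once per block
}
```

Keeping the accumulation in int32 and applying the two scales only once per block is what makes the integer-dot path cheaper than dequantizing every element before the multiply.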

b5009

31 Mar 13:01
cmake : fix whitespace (#0)

b5006

31 Mar 11:36
1a85949
llava : proper description fix (#12668)

b5005

31 Mar 11:08
6c02a03
SYCL: Remove misleading ggml_sycl_op_flatten function (#12387)

* SYCL: Remove misleading ggml_sycl_op_flatten function

* remove trailing whitespace

* Fix L2 norm from rebase

* remove try catch block from element_wise.cpp

* remove comment from common.hpp

* ggml-sycl.cpp: Add try catch sycl::exception block in compute_forward

* norm.cpp: remove try catch exception block
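The pattern in these commits is consolidating exception handling: instead of every element-wise op carrying its own try/catch, exceptions propagate up and are caught once in `compute_forward`. A minimal sketch of that shape, with `backend_error` standing in for `sycl::exception` (the names here are illustrative, not the actual SYCL backend code):

```cpp
#include <cstdio>
#include <stdexcept>

// Stand-in for sycl::exception in this sketch.
struct backend_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Individual ops no longer wrap their bodies in try/catch;
// they simply throw on failure.
void op_l2_norm() { throw backend_error("device lost"); }

// One catch block at the dispatch level handles every op.
bool compute_forward() {
    try {
        op_l2_norm(); // any op in the graph may throw
        return true;
    } catch (const backend_error & e) {
        std::fprintf(stderr, "backend exception: %s\n", e.what());
        return false;
    }
}
```

This keeps per-op code free of boilerplate while still reporting device failures from a single place.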

b5004

31 Mar 10:51
f52d59d
llava : fix clip loading GGUFs with missing description (#12660)

b5003

31 Mar 10:18
52de2e5
tts : remove printfs (#12640)

* tts.cpp : llama tokens console output now uses LOG_INF instead of printf(), so the '--log-disable' and '--log-file' options have a uniform effect on all output.

b5002

30 Mar 21:11
2c3f8b8
llama : support BailingMoE (Ling) (#12634)

b5001

30 Mar 20:40
4663bd3
metal : use constexpr in FA kernels + fix typedef (#12659)

* metal : use constexpr in FA kernels

ggml-ci

* cont

ggml-ci

* cont : fix typedef

ggml-ci
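Using `constexpr` in a kernel means per-configuration decisions fold away at compile time instead of branching at runtime. A small illustration of the technique in templated C++ (the function and the numbers are hypothetical, not the actual Metal flash-attention kernel):

```cpp
// Choose a launch parameter from a compile-time head dimension D.
// With `if constexpr`, only the taken branch is compiled into each
// instantiation; there is no runtime check.
template <int D>
constexpr int simd_groups() {
    if constexpr (D <= 64) {
        return 4; // smaller head dim: fewer simdgroups (illustrative)
    } else {
        return 8; // larger head dim: more simdgroups (illustrative)
    }
}

static_assert(simd_groups<64>()  == 4);
static_assert(simd_groups<128>() == 8);
```

The `static_assert`s show the values are known at compile time, which is what lets the compiler size shared buffers and unroll loops accordingly.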

b4999

30 Mar 19:20
7242dd9
llama-chat : Add Yandex instruct model template support (#12621)

* add yandex template

* update yandex chat template

* fix tests

* adjust chat template

* fix style

* fix tool macro in template

* add clarifying comment

---------

Co-authored-by: Sergei Vorobev <[email protected]>
Co-authored-by: Xuan-Son Nguyen <[email protected]>
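A chat template like the one added here maps a list of role/content messages onto a single prompt string with model-specific markers. A minimal sketch of the mechanism, using placeholder markers (NOT Yandex's actual special tokens, which live in llama-chat.cpp):

```cpp
#include <string>
#include <vector>

struct chat_msg { std::string role, content; };

// Wrap each message in role markers and append a generation prompt
// so the model continues as the assistant. Marker syntax is invented
// for illustration only.
std::string apply_template(const std::vector<chat_msg> & msgs) {
    std::string out;
    for (const auto & m : msgs) {
        out += "<" + m.role + ">" + m.content + "</" + m.role + ">\n";
    }
    out += "<assistant>"; // model generates from here
    return out;
}
```

Adding a model family's template support means registering its detection and implementing exactly this kind of formatting with the family's real tokens, plus tests that pin the expected output string.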

b4998

30 Mar 10:20
492d7f1
musa: fix all warnings, re-enable `-DLLAMA_FATAL_WARNINGS=ON` in ci a…