Releases: ngxson/llama.cpp

b4881

13 Mar 11:22
e0dbec0
llama : refactor llama_context, llama_kv_cache, llm_build_context (#1…

b4880

13 Mar 10:55
2048b59
server : fix crash when using verbose output with input tokens that a…

b4879

12 Mar 19:51
f08f4b3
Update build.yml for Windows Vulkan builder to use Vulkan 1.4.304 SDK…

b4877

12 Mar 10:51
363f8c5
sycl : variable sg_size support for mmvq kernels (#12336)

b4876

12 Mar 09:56
34c961b
CUDA/HIP: Fix fattn-vec-* when device warp size is not 32 (#12315)

When fattn-wmma was ported over to warp64, various bits that also touch fattn-vec were converted to
selectable warp size. However, the fattn-vec kernels don't work with 64-wide warps for now, so we need
to avoid launching them with parameters for warp64.

b4875

12 Mar 09:19
7841fc7
llama : Add Gemma 3 support (+ experimental vision capability) (#12343)

* llama : Add Gemma 3 text-only support

* fix python coding style

* fix compile on ubuntu

* python: fix style

* fix ubuntu compile

* fix build on ubuntu (again)

* fix ubuntu build, finally

* clip : Experimental support for Gemma 3 vision (#12344)

* clip : Experimental support for Gemma 3 vision

* fix build

* PRId64

b4874

12 Mar 06:47
bf69cfe
vulkan: fix bug in coopmat1 mul_mat_id (#12316)

* tests: run mul_mat_id with a larger N

* vulkan: fix bug in coopmat1 mul_mat_id

b4873

11 Mar 19:59
10f2e81
CUDA/HIP: refactor mmqv to unify the calculation of nwarps and rows …

b4872

11 Mar 14:06
ba76543
ggml-backend : fix backend search path (#12330)

* Fix backend search path

* replace .native() with '/'

* reverted .native()

b4871

11 Mar 12:30
6ab2e47
metal : Cache the Metal library at the device context level (#12265)