Releases: agray3/llama.cpp

b6428

09 Sep 08:38
a972fae

CUDA: Add mul_mat_id support for the mmf kernel (#15767)

* CUDA: Add mul_mat_id support for the mmf kernel

Add support for mul_mat_id for bs < 16

* Review: use warp_size, fix should_use_mmf condition

* Launch one block per expert, stride along n_expert_used

* templatize mul_mat_id

* Pad shmem to 16 bytes, add helper function mul_mat_f_switch_ids

* Reduce compile times by dividing mmf into f16, bf16 and f32 variants

* Divide mmf by ncols_dst

* Add missing files

* Fix MUSA/HIP builds

b6206

19 Aug 17:15
d2fcd91

server : disable context shift by default (#15416)

* server : disable context shift by default

ggml-ci

* server : make scope of test parameters local

b6144

13 Aug 09:43
00f35d5

ggml : repack block_iq4_nlx8 (#14904)

ggml-ci

b6082

04 Aug 10:49
5aa1105

vulkan: fix build when using glslang that does not support coopmat2 (…

b6019

29 Jul 07:19
8ad7b3e

opencl : add ops docs (#14910)

b5967

23 Jul 07:51
6c88b3b

ggml: fix loongarch quantize_row_q8_1 error (#14827)

b5958

22 Jul 10:37
8e6f8bc

opencl: remove unreachable `return` (#14806)

b5707

19 Jun 11:40
600e3e9

sycl: Cleanup codepaths in Get Rows in sycl backend (#14215)

Addresses unused reorder path

b5622

10 Jun 12:55
97340b4

Vulkan: Don't default to CPU device (like llvmpipe), even if no other…

b5166

22 Apr 11:12
2434535

llava : update documentations (#13055)

* llava : update documentations

* fix typo