Releases: ggml-org/llama.cpp
b6148
common : add --override-tensor-draft, --cpu-moe-draft and --n-cpu-mo…
b6144
ggml : repack block_iq4_nlx8 (#14904)
b6143
CUDA: Optimize `reduce_rows_f32` kernel, leading up to 25x perf impro…
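For context on what the optimized kernel computes: `reduce_rows_f32` sums each row of an f32 matrix. A minimal CPU reference sketch of that operation is below; the function name and layout (row-major, `ncols` contiguous floats per row) are assumptions for illustration, not the actual CUDA kernel from this release.

```c
#include <stddef.h>

/* Hypothetical CPU reference for the row-reduction operation:
 * given an nrows x ncols row-major f32 matrix, write each row's
 * sum into dst. The release above optimizes the GPU version;
 * this sketch only illustrates the semantics. */
static void reduce_rows_f32_ref(const float *x, float *dst, int ncols, int nrows) {
    for (int r = 0; r < nrows; r++) {
        float sum = 0.0f;
        for (int c = 0; c < ncols; c++) {
            sum += x[(size_t) r * ncols + c];
        }
        dst[r] = sum;
    }
}
```

The CUDA kernel parallelizes this across rows (and within a row via warp-level reductions); the reference above is only the correctness baseline such a kernel would be tested against.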
b6141
ggml-rpc: chunk send()/recv() to avoid EINVAL for very large tensors …
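The idea behind this fix: some socket stacks reject a single `send()`/`recv()` call whose length is too large, so the transfer is split into bounded chunks and looped. A hedged sketch of that pattern follows; the callback, function name, and the 1 MiB limit are illustrative assumptions, not the actual ggml-rpc code.

```c
#include <stddef.h>

/* Sketch of chunked transfer: never hand the transport more than
 * MAX_CHUNK bytes per call. The write_fn callback stands in for
 * send(); in real code it would wrap the socket write. */
#define MAX_CHUNK ((size_t) 1 << 20) /* assumed per-call limit: 1 MiB */

/* writes up to len bytes, returns bytes written or -1 on error */
typedef long (*write_fn)(void *ctx, const void *buf, size_t len);

static int send_all_chunked(write_fn w, void *ctx, const void *buf, size_t len) {
    const char *p = (const char *) buf;
    while (len > 0) {
        size_t want  = len < MAX_CHUNK ? len : MAX_CHUNK;
        long   wrote = w(ctx, p, want);
        if (wrote <= 0) {
            return -1; /* error or unexpected zero-length write */
        }
        p   += (size_t) wrote;
        len -= (size_t) wrote;
    }
    return 0;
}
```

The loop also handles short writes (a call that sends fewer bytes than requested), which plain `send()` is allowed to do on stream sockets.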
b6140
HIP: disable sync warp shuffle operators from clr amd_warp_sync_funct…
b6139
sycl: Fix and disable more configurations of mul_mat (#15151)
b6138
opencl: allow mixed f16/f32 `add` (#15140)
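Mixed-precision `add` means one operand is stored as f16 and the other as f32; the f16 side is promoted to f32 before the elementwise sum. A portable C sketch of that promotion and add is below; the function names and the bit-twiddling conversion are illustrative assumptions, not the OpenCL kernel from this release.

```c
#include <stdint.h>
#include <string.h>

/* Sketch: convert an IEEE 754 binary16 value (stored as uint16_t)
 * to float, handling zero, subnormals, and inf/NaN. */
static float f16_to_f32(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;
    if (exp == 0) {
        if (mant == 0) {
            bits = sign; /* signed zero */
        } else {
            /* subnormal: shift mantissa up until the implicit bit appears */
            int e = -1;
            do { mant <<= 1; e++; } while (!(mant & 0x400u));
            mant &= 0x3FFu;
            bits = sign | ((uint32_t)(127 - 15 - e) << 23) | (mant << 13);
        }
    } else if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (mant << 13); /* inf / NaN */
    } else {
        bits = sign | ((exp + (127 - 15)) << 23) | (mant << 13);
    }
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Mixed-precision elementwise add: dst = f32(a_f16) + b_f32 */
static void add_f16_f32(float *dst, const uint16_t *a, const float *b, int n) {
    for (int i = 0; i < n; i++) {
        dst[i] = f16_to_f32(a[i]) + b[i];
    }
}
```

An OpenCL kernel would do the same promotion per work-item (typically via the built-in `half`-to-`float` conversion) rather than in a scalar loop.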
b6137
CUDA cmake: add `-lineinfo` for easier debug (#15260)
b6136
CANN: GGML_OP_CPY optimization (#15070)
b6135
musa: fix failures in test-backend-ops for mul_mat_id op (#15236) * musa: fix failures in test-backend-ops for mul_mat_id op Signed-off-by: Xiaodong Ye <[email protected]> * Address review comments Signed-off-by: Xiaodong Ye <[email protected]> --------- Signed-off-by: Xiaodong Ye <[email protected]>