Releases · ggml-org/llama.cpp
b6078
vocab : JetBrains Mellum pre-tokenizer (#15045)
b6076
vulkan: Use coopmat2 for conv2d (#14982)
b6075
opencl: fix adreno compiler detection logic (#15029)
b6074
CUDA: use mma FA kernel for gqa > 4 on RTX 4000 (#15035)
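Here gqa is the grouped-query-attention ratio, i.e. how many query heads share each KV head. A minimal sketch of the check (variable names are assumptions for illustration, not the PR's actual code):

```cpp
#include <cstdio>

int main() {
    // Hypothetical GQA model config: 32 query heads sharing 4 KV heads.
    const int n_head    = 32;
    const int n_head_kv = 4;
    const int gqa       = n_head / n_head_kv; // grouped-query ratio = 8
    // With gqa > 4, this release routes to the mma FlashAttention kernel.
    printf("gqa = %d -> %s\n", gqa, gqa > 4 ? "mma FA kernel" : "default kernel");
}
```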
b6073
cuda: make im2col a little faster (#15025)
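im2col unrolls convolution input patches into columns so that conv2d can run as a single GEMM. A minimal CPU reference sketch of the transform, assuming a single channel with no padding or dilation (illustrative only, not the CUDA kernel touched by the PR):

```cpp
#include <cstdio>
#include <vector>

// Unroll KH x KW patches of an H x W input into columns, so a convolution
// becomes one matrix multiply over the resulting (KH*KW) x (OH*OW) matrix.
std::vector<float> im2col(const std::vector<float> & src, int H, int W,
                          int KH, int KW, int stride) {
    const int OH = (H - KH) / stride + 1;
    const int OW = (W - KW) / stride + 1;
    std::vector<float> dst((size_t) KH * KW * OH * OW);
    for (int oh = 0; oh < OH; ++oh)
        for (int ow = 0; ow < OW; ++ow)
            for (int kh = 0; kh < KH; ++kh)
                for (int kw = 0; kw < KW; ++kw)
                    dst[((size_t) kh * KW + kw) * OH * OW + oh * OW + ow] =
                        src[(size_t) (oh * stride + kh) * W + (ow * stride + kw)];
    return dst;
}

int main() {
    std::vector<float> x(5 * 5, 1.0f);
    auto cols = im2col(x, 5, 5, 3, 3, 1); // 9 x 9 column matrix
    printf("columns: %zu\n", cols.size());
}
```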
b6071
llama : enable LLAMA_SET_ROWS=1 by default (#14959)
b6070
cuda, sycl : fix batched gemm when ne02 == 1 && ne03 > 1 (#15038)
- cont : fix cont types
- cont : adopt variable names and comment from the other branch
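In ggml's 4-D tensor layout, ne2 and ne3 are the batch dimensions, and a source dimension of size 1 is broadcast against the destination. A minimal sketch of the batch-index mapping for the shape this fix targets (an illustration, not the actual CUDA/SYCL code):

```cpp
#include <cstdio>

int main() {
    // dst batch dims (ggml naming: ne2, ne3); src0 broadcasts where its dim is 1.
    const int ne2 = 4, ne3 = 3;            // destination batch dims
    const int src0_ne2 = 1, src0_ne3 = 3;  // the previously mishandled shape
    for (int i3 = 0; i3 < ne3; ++i3) {
        for (int i2 = 0; i2 < ne2; ++i2) {
            // A source dimension of size 1 always maps to index 0.
            const int s2 = src0_ne2 == 1 ? 0 : i2;
            const int s3 = src0_ne3 == 1 ? 0 : i3;
            printf("dst batch (%d,%d) <- src0 batch (%d,%d)\n", i2, i3, s2, s3);
        }
    }
}
```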
b6067
chat : fix multiple tool_calls on hermes-2-pro (#14962)
b6066
vulkan: coopmat2 mul_mat optimizations (#14934)
- Increase tile size for k-quants, to match non-k-quants
- Choose more carefully between large and medium tiles, considering how it interacts with split_k
- Allow larger/non-power-of-two split_k, and make the splits a multiple of 256
- Use split_k==3 when >1/2 and <=2/3 of the SMs would have been used
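The split_k==3 rule is an occupancy heuristic: when a plain launch would leave a third to a half of the SMs idle, splitting the K dimension three ways puts them to work. A hedged sketch of the decision (the function name and the policy outside the stated band are assumptions; only the 1/2 to 2/3 band comes from the PR):

```cpp
#include <cstdio>

// Pick a split_k factor from the fraction of SMs a plain launch would occupy.
// Only the split_k == 3 band (> 1/2 and <= 2/3 of the SMs) comes from the
// change above; everything else here is an illustrative assumption.
int choose_split_k(int output_tiles, int num_sms) {
    const double used = (double) output_tiles / num_sms;
    if (used > 0.5 && used <= 2.0 / 3.0) {
        return 3; // three K-partials per tile put the idle SMs to work
    }
    return 1; // assumption: other occupancy bands handled by the real heuristic
}

int main() {
    // 60 of 100 SMs would be busy -> split_k == 3.
    printf("split_k = %d\n", choose_split_k(60, 100));
}
```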
b6065
llama-bench: rename DB table name from test to llama_bench (#15003)
Signed-off-by: Xiaodong Ye <[email protected]>