Releases · ggml-org/llama.cpp

12 Aug 10:17

60a7658

b6138

opencl: allow mixed f16/f32 `add` (#15140)

Assets 15

12 Aug 10:02

github-actions

b6137

efe3a90

b6137

CUDA cmake: add `-lineinfo` for easier debug (#15260)

Assets 15

12 Aug 08:26

github-actions

b6136

bbd57b7

b6136

CANN: GGML_OP_CPY optimization (#15070)

Signed-off-by: noemotiovon <[email protected]>

Assets 15

12 Aug 03:01

github-actions

b6135

25ff6f7

b6135

musa: fix failures in test-backend-ops for mul_mat_id op (#15236)

* musa: fix failures in test-backend-ops for mul_mat_id op

Signed-off-by: Xiaodong Ye <[email protected]>

* Address review comments

Signed-off-by: Xiaodong Ye <[email protected]>

---------

Signed-off-by: Xiaodong Ye <[email protected]>

Assets 15

11 Aug 15:25

github-actions

b6134

be48528

b6134

CANN: Add broadcast for softmax and FA (#15208)

* refactor softmax

* fix fa

* fix mask shape

* format

* add comments

* Remove whitespace

Assets 15

11 Aug 15:30

github-actions

b6133

cf9e564

b6133

mtmd : Fix MinicpmV model converter and clip to avoid using hardcode.…

Assets 15

11 Aug 14:42

github-actions

b6132

fba5c0d

b6132

chat : hotfix gpt-oss jinja raising an exception (#15243)

* chat : hotfix gpt-oss jinja raising an exception

* fix

Assets 15

11 Aug 13:06

github-actions

b6131

53d0a12

b6131

server : allow specifying reasoning_format in HTTP request (#15238)

Assets 15

11 Aug 11:28

github-actions

b6129

228f724

b6129

kv-cache : fix seq_rm with seq_id == -1 (#15226)

* kv-cache : fix seq_rm with seq_id == -1

ggml-ci

* cont : iterate over streams

ggml-ci

Assets 15

11 Aug 10:16

github-actions

b6128

cd3069d

b6128

kv-cache : log (debug) all streams in find_slot (#15176)

This commit updates `llama_kv_cache_unified::find_slot` to log
information for all streams when debug is enabled.

The motivation for this change is that currently if a non-unified
kv-cache is used, then only one stream will be logged because the
code was currently uses `seq_to_stream[1]`.

Assets 15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Releases: ggml-org/llama.cpp

b6138

Uh oh!

b6137

Uh oh!

b6136

Uh oh!

b6135

Uh oh!

b6134

Uh oh!

b6133

Uh oh!

b6132

Uh oh!

b6131

Uh oh!

b6129

Uh oh!

b6128

Uh oh!