Releases · ggml-org/llama.cpp

30 Mar 10:20

492d7f1

b4998

musa: fix all warnings, re-enable `-DLLAMA_FATAL_WARNINGS=ON` in ci a…

30 Mar 06:17

github-actions

b4997

d3f1f0a

sync : ggml

ggml-ci

29 Mar 23:48

github-actions

b4992

af6ae1e

llama : fix non-causal mask for gemma 3 (#12615)

29 Mar 14:13

github-actions

b4991

0bb2919

llama : change cpu_buft_list order: ACCEL -> GPU host -> CPU extra ->…

29 Mar 10:46

github-actions

b4990

a69f846

cmake : fix ccache conflict (#12522)

If a user has already set CMAKE_C_COMPILER_LAUNCHER globally, setting it
again in CMake causes a conflict and the build fails.

Signed-off-by: Jay <[email protected]>

28 Mar 21:56

github-actions

b4988

3714c3e

llama : fix incorrect Qwen2Moe ffn_moe_out graph callback (#12631)

28 Mar 19:06

github-actions

b4987

b4ae508

metal : improve FA + improve MoE (#12612)

* ggml : FA with different K, V head sizes (CPU)

ggml-ci

* metal : add FA with HS=192

* metal : extend FA to support different K and V head sizes

ggml-ci

* metal : add FA vector kernels for heads K 192 and V 128

ggml-ci

* ggml : restrict op on other backends to equal head sizes

ggml-ci

* metal : optimize FA-vec kernel

ggml-ci

* metal : FA remove mq registers

* metal : improve MoE mul_mat_id condition

ggml-ci

* metal : fix comments + remove unnecessary addition

ggml-ci

* metal : avoid too much shared memory usage with mul_mat_id

ggml-ci
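To illustrate what "different K and V head sizes" means for attention, here is a minimal reference sketch in plain Python. It is ordinary single-head scaled dot-product attention, not the tiled flash-attention algorithm and not the ggml or Metal kernel code; the function name `attention` and the list-of-lists layout are assumptions made for this example. The only point it shows is that the K head size determines the dot-product length and the softmax scale, while the V head size determines the output width, so the two need not match (for example K=192 and V=128, as in the vector kernels listed above).

```python
import math


def attention(q, k, v):
    """Reference single-head attention (hypothetical helper, not ggml's API).

    q: n_q query vectors, each of length d_k
    k: n_kv key vectors, each of length d_k
    v: n_kv value vectors, each of length d_v (d_v may differ from d_k)
    Returns n_q output vectors, each of length d_v.
    """
    d_k = len(k[0])
    d_v = len(v[0])
    if len(k) != len(v):
        raise ValueError("K and V must have the same number of rows")
    if any(len(row) != d_k for row in q):
        raise ValueError("Q rows must match the K head size")

    scale = 1.0 / math.sqrt(d_k)  # the scale depends on the K head size only
    out = []
    for qi in q:
        # Scores use the K head size: one dot product per key.
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        # Numerically stable softmax over the keys.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        # The output row has the V head size.
        out.append([
            sum(w * vj[c] for w, vj in zip(weights, v)) / total
            for c in range(d_v)
        ])
    return out


# Usage: K head size 192, V head size 128.
q = [[0.01] * 192 for _ in range(2)]
k = [[0.02] * 192 for _ in range(3)]
v = [[float(i)] * 128 for i in range(3)]
result = attention(q, k, v)
print(len(result), len(result[0]))
```

Each output row has the V head size (128 here) regardless of the K head size, which is why a backend that assumes a single shared head size cannot run this case.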

28 Mar 18:42

github-actions

b4986

b86f600

vulkan: fix coopmat shader generation when cross-compiling (#12272)

* vulkan: fix coopmat shader generation when cross-compiling

Previously, the status of coopmat{,2} support was not passed to the
vulkan-shaders-gen project built on the host. This caused a build
failure, because the cross-compiled code expected coopmat{,2} shaders
that were never generated.

Fix this by passing the coopmat{,2} support status to the vulkan-shaders
subproject.

Signed-off-by: Icenowy Zheng <[email protected]>

* Only call coop-mat shaders once

* Fix whitespace

---------

Signed-off-by: Icenowy Zheng <[email protected]>
Co-authored-by: bandoti <[email protected]>

28 Mar 18:02

github-actions

b4985

dd373dd

llama: fix error on bad grammar (#12628)

28 Mar 08:59

github-actions

b4984

5d01670

server : include speculative decoding stats when timings_per_token is…
