Releases: ngxson/llama.cpp

b4936

21 Mar 15:02
af04481
model : do not repack if a GPU device is present (#12498)

ggml-ci

b4935

21 Mar 10:08
960e726
chore : cleanup llama_model_loader::TENSOR_ usage (#12492)

b4934

21 Mar 10:04
ea1518e
llama-tts : avoid crashes related to bad model file paths (#12482)

b4933

21 Mar 07:55
1aa87ee
[SYCL] Fix build on Windows when ccache is enabled (#9954) (#9976)

* [SYCL] Fix build on Windows when ccache is enabled (#9954)

* Takes effect only on Windows and forces the compiler to icl

---------

Co-authored-by: Romain Biessy <[email protected]>

b4932

21 Mar 03:07
9ffcc9e
sycl: cleanup oneDNN related code (#12097)

b4930

20 Mar 12:47
dbb3a47
llama : make Qwen2MoE QKV bias optional (#12477)

b4929

20 Mar 12:18
3d82dbc
ggml : block interleaving support for Q4_K quantization for x86 AVX2 …

b4927

19 Mar 21:03
568013d
context : clear sets containing encoder output sequence ids before st…

b4926

19 Mar 20:53
517b5dd
CUDA: Improve flash decoding kernel GPU occupancy for BS=1 case (#12183)

- Find the number of active blocks per SM using the cudaOccupancyMaxActiveBlocksPerMultiprocessor API, and use this value to determine the optimal parallel_blocks value (see the sketch after this entry).
- Prefer the vector flash attention kernels over the MMA kernel for BS=1.

Fixes issue #12182
---------

Co-authored-by: Johannes Gäßler <[email protected]>
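As a rough illustration of the occupancy-based tuning described above, here is a minimal sketch, not llama.cpp's actual code: the kernel name flash_decode_kernel, the block size, the shared-memory figure, and the final parallel_blocks heuristic are all assumptions for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical placeholder kernel; the real flash-decoding kernels in
// llama.cpp are templated and considerably more involved.
__global__ void flash_decode_kernel(const float* q, const float* k,
                                    const float* v, float* out) {
    // ... attention math elided ...
}

int main() {
    int device = 0;
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, device);

    const int block_size = 128;        // threads per block (assumption)
    const size_t dyn_smem = 16 * 1024; // dynamic shared memory per block (assumption)

    // Ask the runtime how many blocks of this kernel can be resident on one
    // SM, given the kernel's register and shared-memory footprint.
    int blocks_per_sm = 0;
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(
        &blocks_per_sm, flash_decode_kernel, block_size, dyn_smem);

    // Scale by the SM count to estimate how many blocks keep the whole GPU
    // busy, and use that to pick the parallel_blocks split. Illustrative
    // heuristic only; the real code also weighs the workload shape.
    int max_resident_blocks = blocks_per_sm * prop.multiProcessorCount;
    int parallel_blocks = max_resident_blocks;

    printf("blocks/SM: %d, SMs: %d, parallel_blocks: %d\n",
           blocks_per_sm, prop.multiProcessorCount, parallel_blocks);
    return 0;
}
```

The point of the API call is that register and shared-memory usage, not just thread count, limit how many blocks fit on an SM, so querying the runtime gives a better parallelism target than a fixed constant.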

b4925

19 Mar 19:46
a9b5928
vulkan: optimize iq1 coopmat2 dequant functions (#12427)