Releases · ngxson/llama.cpp

26 Aug 05:03

34bdbbd

b6278

vulkan: Remove splitting for mul_mat_id (#15568)

row_ids only needs to hold the BN rows for the current tile.

25 Aug 22:14

github-actions

b6277

74f52f7

CUDA: Accelerate MXFP4 table lookup using `__byte_perm` (#15451)

* CUDA: optimize get_int_from_table_16

* CUDA: use v_perm_b32 to replace byte_perm on AMD GPUs

* revise documentation

---------

Co-authored-by: xix <[email protected]>
Co-authored-by: Johannes Gäßler <[email protected]>
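To illustrate the idea behind the `__byte_perm` table lookup, here is a minimal host-side sketch in plain C++. It is not the code from #15451: `byte_perm_emulated`, `pack4`, `lookup4`, and `example_table` are hypothetical names, the emulation only covers selector values 0..7, and the table values are illustrative. The sketch resolves four 4-bit indices into a 16-entry `int8_t` table with three byte permutes instead of four separate memory reads.

```cpp
#include <cstdint>

// Host-side emulation of the CUDA intrinsic __byte_perm(x, y, s), restricted to
// selector values 0..7: bytes 0..3 come from x, bytes 4..7 come from y, and
// nibble i of s picks the source byte for output byte i.
static uint32_t byte_perm_emulated(uint32_t x, uint32_t y, uint32_t s) {
    const uint64_t pool = (uint64_t(y) << 32) | x;
    uint32_t out = 0;
    for (int i = 0; i < 4; ++i) {
        const uint32_t sel = (s >> (4 * i)) & 0x7;
        out |= uint32_t((pool >> (8 * sel)) & 0xFF) << (8 * i);
    }
    return out;
}

// Pack four consecutive table entries into one 32-bit word,
// lowest index in the lowest byte.
static uint32_t pack4(const int8_t * t) {
    return  uint32_t(uint8_t(t[0]))        |
           (uint32_t(uint8_t(t[1])) <<  8) |
           (uint32_t(uint8_t(t[2])) << 16) |
           (uint32_t(uint8_t(t[3])) << 24);
}

// Illustrative 16-entry table (assumption: values chosen for the example only).
static const int8_t example_table[16] = {
    0, 1, 2, 3, 4, 6, 8, 12, 0, -1, -2, -3, -4, -6, -8, -12,
};

// Look up four 4-bit indices at once. idx4 holds one index per nibble in its
// low 16 bits; output byte i is table[nibble i of idx4].
static uint32_t lookup4(const int8_t table[16], uint32_t idx4) {
    idx4 &= 0xFFFF;
    const uint32_t t0 = pack4(table + 0), t1 = pack4(table + 4);
    const uint32_t t2 = pack4(table + 8), t3 = pack4(table + 12);

    // The low 3 bits of each index select within one half of the table.
    const uint32_t sel = idx4 & 0x7777;
    const uint32_t lo  = byte_perm_emulated(t0, t1, sel); // candidates from entries 0..7
    const uint32_t hi  = byte_perm_emulated(t2, t3, sel); // candidates from entries 8..15

    // Bit 3 of each index decides between lo and hi for that output byte:
    // selector i takes byte i of lo, selector i + 4 takes byte i of hi.
    const uint32_t pick = 0x3210 | ((idx4 & 0x8888) >> 1);
    return byte_perm_emulated(lo, hi, pick);
}
```

On a GPU each `byte_perm_emulated` call would map to a single permute instruction, which is where the gain over per-element table reads would come from. For example, `lookup4(example_table, 0xF190)` packs entries 0, 9, 1, and 15 into the four bytes of one word.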

25 Aug 21:38

github-actions

b6276

f7207b0

opencl: fix support ops condition for `rms_norm` (#15560)

25 Aug 16:50

github-actions

b6275

4d917cd

vulkan: fix min subgroup 16 condition for mmid subgroup optimization …

25 Aug 11:25

github-actions

b6269

6b64f74

batched-bench : fix unified KV cache handling + pp timing (#15562)

* batched-bench : fix unified KV cache handling + pp timing

* cont : run dummy token only with split KV cache

25 Aug 07:43

github-actions

b6267

b0ba31f

metal : add FA kernels for HS=40 (#15559)

ggml-ci

25 Aug 02:49

github-actions

b6265

c247d06

CANN: ROPE cache sin/cos repeat (#15501)

Signed-off-by: noemotiovon <[email protected]>

24 Aug 17:57

github-actions

b6264

043fb27

vulkan: apply MUL_MAT_ID subgroup optimization to non-coopmat devices…

24 Aug 09:47

github-actions

b6262

c9a24fb

vulkan: Support FA with any multiple of 8 head sizes (#15537)

The scalar FA shader already handled multiples of 8. The coopmat1 FA
shader assumed 16x16x16 and the shared memory allocations need the HSK
dimensions padded to a multiple of 16. NVIDIA's coopmat2 implementation
requires multiples of 16 for N and K, and needs the matrix dimensions
padded and loads clamped.

Store the FA pipelines in a map, indexed by the pipeline state.
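The two ideas in this note, padding head sizes to a multiple of 16 and keeping pipelines in a map keyed by their state, can be sketched roughly as follows. This is an assumption-laden illustration, not the Vulkan backend's code: `fa_pipeline_state`, `fa_pipeline`, `get_fa_pipeline`, and `round_up` are hypothetical names, and the real pipeline state presumably carries more fields than the two head sizes used here.

```cpp
#include <cstdint>
#include <map>
#include <tuple>

// Round x up to the next multiple of m (m > 0).
static uint32_t round_up(uint32_t x, uint32_t m) {
    return (x + m - 1) / m * m;
}

// Hypothetical pipeline state: only the K and V head sizes are modeled here.
struct fa_pipeline_state {
    uint32_t hsk;
    uint32_t hsv;

    // Strict weak ordering so the state can serve as a std::map key.
    bool operator<(const fa_pipeline_state & other) const {
        return std::tie(hsk, hsv) < std::tie(other.hsk, other.hsv);
    }
};

// Hypothetical stand-in for a compiled pipeline: it only records the padded
// dimensions that a 16x16x16 cooperative-matrix shader would work with.
struct fa_pipeline {
    uint32_t padded_hsk;
    uint32_t padded_hsv;
};

static std::map<fa_pipeline_state, fa_pipeline> fa_pipelines;
static int fa_pipelines_created = 0;

// Return the pipeline for a given state, creating it on first use.
static const fa_pipeline & get_fa_pipeline(const fa_pipeline_state & state) {
    auto it = fa_pipelines.find(state);
    if (it == fa_pipelines.end()) {
        ++fa_pipelines_created;
        const fa_pipeline p = { round_up(state.hsk, 16), round_up(state.hsv, 16) };
        it = fa_pipelines.emplace(state, p).first;
    }
    return it->second;
}
```

With a map, a head size such as 40 or 72 does not need a pre-declared slot: the first request creates the entry (padded to 48 or 80 in this sketch) and later requests with the same state reuse it.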

24 Aug 09:06

github-actions

b6261

a9c6ffc

vulkan: enable Conv2D for Apple after MoltenVK fixed the bug (#15526)
