Releases · ngxson/llama.cpp

26 Aug 14:59

8f5afa9

b6287

CUDA: return -1 for nonexistent compiled arch (#15587)

Assets 15

26 Aug 11:39

github-actions

b6286

b3964c1

b6286

metal : optimize FA vec for large sequences and BS <= 8 (#15566)

* metal : optimize FA vec for large heads and sequences

* metal : adjust small-batch mul mv kernels

ggml-ci

* batched-bench : fix total speed computation

ggml-ci

* cont : add comments

ggml-ci

Assets 15

26 Aug 10:03

github-actions

b6284

85cc1ae

b6284

context : print graph stats for memory-less contexts (#15586)

ggml-ci

Assets 15

26 Aug 08:22

github-actions

b6282

c4e9239

b6282

model : support MiniCPM-V 4.5 (#15575)

Assets 15

26 Aug 07:15

github-actions

b6280

0fd90db

b6280

metal : remove contiguous assertion for src0 in IM2COL (#15577)

* remove contiguous assertion for src0 in IM2COL

* add contiguous check in supports_op

Assets 15

26 Aug 06:37

github-actions

b6279

4c37636

b6279

Add a warning for special devices (#15563)

* Add warning

* Print the devices names

* Add newlines

* Apply suggestions from code review

Co-authored-by: Johannes Gäßler

* Fix vector names

---------

Co-authored-by: Johannes Gäßler

Assets 15

26 Aug 05:03

github-actions

b6278

34bdbbd

b6278

vulkan: Remove splitting for mul_mat_id (#15568)

row_ids only needs to hold the BN rows for the current tile.

Assets 15

25 Aug 22:14

github-actions

b6277

74f52f7

b6277

CUDA: Accelerate MXFP4 table lookup using `__byte_perm` (#15451)

* CUDA: optimize get_int_from_table_16

* CUDA: use v_perm_b32 to replace byte_perm on AMD GPUs

* revise documentation

---------

Co-authored-by: xix
Co-authored-by: Johannes Gäßler

Assets 15

25 Aug 21:38

github-actions

b6276

f7207b0

b6276

opencl: fix support ops condition for `rms_norm` (#15560)

Assets 15

25 Aug 16:50

github-actions

b6275

4d917cd

b6275

vulkan: fix min subgroup 16 condition for mmid subgroup optimization …

Assets 15
