Releases: alitariq4589/llama.cpp

b6191

18 Aug 08:31
f44f793

ggml-quants : fix make_qp_quants NANs and IQ1 assertion errors (#15379)

* ggml-quants : fix make_qp_quants NANs and IQ1 assertion errors

* ggml-quants : avoid division by zero in make_q3_quants

b6140

13 Aug 06:45
b049315

HIP: disable sync warp shuffle operators from clr amd_warp_sync_funct…

b5774

28 Jun 19:26
27208bf

CUDA: add bf16 and f32 support to cublas_mul_mat_batched (#14361)

* CUDA: add bf16 and f32 support to cublas_mul_mat_batched

* Review: add type traits and make function more generic

* Review: make check more explicit, add back comments, and fix formatting

* Review: fix formatting, remove useless type conversion, fix naming for bools

b5749

24 Jun 09:25
abf2410

main : honor --verbose-prompt on interactive prompts (#14350)