Releases · alitariq4589/llama.cpp

18 Aug 08:31

f44f793

b6191 Latest

ggml-quants : fix make_qp_quants NANs and IQ1 assertion errors (#15379)

* ggml-quants : fix make_qp_quants NANs and IQ1 assertion errors

* ggml-quants : avoid division by zero in make_q3_quants

Assets 15

cudart-llama-bin-win-cuda-12.4-x64.zip
sha256:8c79a9b226de4b3cacfd1f83d24f962d0773be79f1e7b75c6af4ded7e32ae1d6
373 MB 2025-08-18T08:31:41Z

llama-b6191-bin-macos-arm64.zip
sha256:b205f97f3a01047207b0763012d7e1d3c3cdb22c06e448a6885158d942227414
10.9 MB 2025-08-18T08:31:50Z

llama-b6191-bin-macos-x64.zip
sha256:087c0a1bc5399f09083404242b9123c2259c83483e5bbc3471a0724fbc818ff4
28 MB 2025-08-18T08:31:51Z

llama-b6191-bin-ubuntu-vulkan-x64.zip
sha256:c79aac76519d2766b785d4ee2e98f47b143a3b0f7a14b13bf072b98b2c1f698a
22.1 MB 2025-08-18T08:31:53Z

llama-b6191-bin-ubuntu-x64.zip
sha256:d47583f4e7ccf38e608b0b58bb07704f9b4062f3ee3198624b7a8fe54e5d00d4
12.9 MB 2025-08-18T08:31:54Z

llama-b6191-bin-win-cpu-arm64.zip
sha256:0917d515fca1786d68a3075a94d45abb28d5f6f60dbc4c2dd6f70c1a47ec9ea7
11.1 MB 2025-08-18T08:31:55Z

llama-b6191-bin-win-cpu-x64.zip
sha256:b54722de43f5c4011053bcd4ee3d04148999adf74ffefc387a3ce34e4d3253c5
14 MB 2025-08-18T08:31:56Z

llama-b6191-bin-win-cuda-12.4-x64.zip
sha256:f83b07efc3fba8d005332f70112696dc8dd09f9399473fc02cf3771a57929564
139 MB 2025-08-18T08:31:57Z

llama-b6191-bin-win-hip-radeon-x64.zip
sha256:7adaf91dc239bf4361dc3870075f9da9288f45142425a96540e34f5568b8e3b5
288 MB 2025-08-18T08:32:03Z

llama-b6191-bin-win-opencl-adreno-arm64.zip
sha256:a68eb9fb98f7451deab943d18199c58e555ad534d8ace186d81a81bcf8c3b310
11.5 MB 2025-08-18T08:32:10Z

Source code (zip)
2025-08-18T07:23:56Z

Source code (tar.gz)
2025-08-18T07:23:56Z
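
Each binary asset above is listed with a sha256 digest, so a downloaded archive can be checked against it before unpacking. The following is a minimal sketch of such a check, not part of the release itself. The helper names `sha256_of_file` and `matches_listed_digest` are hypothetical, and it assumes the archive has already been downloaded to a local path.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_digest(path, listed):
    """Compare a file's digest with a listed value such as 'sha256:<hex>'.

    Hypothetical helper: accepts the digest with or without the 'sha256:' prefix.
    """
    expected = listed.strip().lower()
    if expected.startswith("sha256:"):
        expected = expected[len("sha256:"):]
    return sha256_of_file(path) == expected
```

For example, `matches_listed_digest("llama-b6191-bin-ubuntu-x64.zip", "sha256:d47583f4e7ccf38e608b0b58bb07704f9b4062f3ee3198624b7a8fe54e5d00d4")` should return `True` only if the local file is byte-for-byte identical to the published asset.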

13 Aug 06:45

github-actions

b6140

b049315

HIP: disable sync warp shuffel operators from clr amd_warp_sync_funct…

Assets 15

28 Jun 19:26

github-actions

b5774

27208bf

CUDA: add bf16 and f32 support to cublas_mul_mat_batched (#14361)

* CUDA: add bf16 and f32 support to cublas_mul_mat_batched

* Review: add type traits and make function more generic

* Review: make check more explicit, add back comments, and fix formatting

* Review: fix formatting, remove useless type conversion, fix naming for bools

Assets 15

24 Jun 09:25

github-actions

b5749

abf2410

main : honor --verbose-prompt on interactive prompts (#14350)

Assets 15
