Releases · Yangxiaoz/llama.cpp
b5590
ggml-vulkan: adds support for op CONV_TRANSPOSE_1D (#13813)
* ggml-vulkan: adds op CONV_TRANSPOSE_1D
* test-backend-ops: adds more sophisticated tests for CONV_TRANSPOSE_1D
* Missing barrier added to shader. Number of additional tests reduced to 108.
* Fixes typo in variable name.
* Removes extra whitespaces.
* Adds int64->int32 casts to prevent possible warnings.
* Problem size reduced in tests to pass tests with llvmpipe.
* supports_op condition moved from unintended position
b5555
llama : deprecate explicit kv_self defrag/update calls (#13921) ggml-ci
b5536
gguf-py : add support for sub_type (in arrays) in GGUFWriter add_key_…
b5520
CUDA: add a flag "GGML_CUDA_JETSON_DEVICE" for optimization (#13856)
b5519
CUDA: fix FA tg at long context for CC >= 8.9 (#13852)
b5379
server : fix cache_tokens bug with no cache_prompt (#13533)