Releases · zhiyuan1i/llama.cpp

28 Jul 16:12

cd1fce6

b6015 Latest

SYCL: Add set_rows support for quantized types (#14883)

* SYCL: Add set_rows support for quantized types

This commit adds support for the GGML_OP_SET_ROWS operation for several
quantized tensor types (Q8_0, Q5_1, Q5_0, Q4_1, Q4_0, IQ4_NL) and the BF16
type in the SYCL backend.

The quantization/dequantization copy kernels were moved from cpy.cpp
to cpy.hpp to make them available for set_rows.cpp.

This addresses part of the TODOs mentioned in the code.
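
To illustrate what a quantizing set_rows does, here is a minimal CPU-side sketch for the Q8_0 case only. It is not the SYCL kernel from this commit: the names `block_q8_0_sketch`, `quantize_block_q8_0_sketch` and `set_rows_q8_0_sketch` are hypothetical, the scale is kept as a plain `float` for readability (an assumption; ggml stores it in half precision), and the float source rows and 64-bit row indices are likewise assumed for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified Q8_0 block: 32 values share one scale.
// Assumption: the scale is a float here; ggml keeps it in half precision.
constexpr int QK8_0_SKETCH = 32;

struct block_q8_0_sketch {
    float  d;                 // per-block scale
    int8_t qs[QK8_0_SKETCH];  // quantized values in [-127, 127]
};

// Quantize one block of 32 floats: scale by max |x| / 127, then round.
inline block_q8_0_sketch quantize_block_q8_0_sketch(const float * x) {
    float amax = 0.0f;
    for (int i = 0; i < QK8_0_SKETCH; ++i) {
        amax = std::fmax(amax, std::fabs(x[i]));
    }
    block_q8_0_sketch b{};
    b.d = amax / 127.0f;
    const float inv_d = b.d != 0.0f ? 1.0f / b.d : 0.0f;
    for (int i = 0; i < QK8_0_SKETCH; ++i) {
        b.qs[i] = static_cast<int8_t>(std::lround(x[i] * inv_d));
    }
    return b;
}

// set_rows: source row i is quantized and written to destination row idx[i].
// row_len is the number of floats per row and must be a multiple of 32.
inline void set_rows_q8_0_sketch(const std::vector<float> &         src,
                                 const std::vector<int64_t> &       idx,
                                 size_t                             row_len,
                                 std::vector<block_q8_0_sketch> &   dst,
                                 size_t                             dst_rows) {
    assert(row_len % QK8_0_SKETCH == 0);
    const size_t blocks_per_row = row_len / QK8_0_SKETCH;
    assert(src.size() == idx.size() * row_len);
    assert(dst.size() == dst_rows * blocks_per_row);

    for (size_t i = 0; i < idx.size(); ++i) {
        assert(idx[i] >= 0 && static_cast<size_t>(idx[i]) < dst_rows);
        const size_t dst_row = static_cast<size_t>(idx[i]);
        for (size_t b = 0; b < blocks_per_row; ++b) {
            dst[dst_row * blocks_per_row + b] =
                quantize_block_q8_0_sketch(&src[i * row_len + b * QK8_0_SKETCH]);
        }
    }
}
```

The other quantized types listed above differ in block layout and rounding, so this sketch should not be read as covering them.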

* Use get_global_linear_id() instead

ggml-ci
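
For readers unfamiliar with the call named above: in SYCL, `get_global_linear_id()` on an `nd_item` returns the work-item's global index flattened to a single number, with the last dimension varying fastest. The plain C++ sketch below reproduces that flattening for the 3D case so it can be run without a SYCL toolchain; `global_linear_id_3d_sketch` is a hypothetical helper, not code from this commit.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Flatten a 3D global id into one index, last dimension fastest:
// linear = (id0 * range1 + id1) * range2 + id2
inline size_t global_linear_id_3d_sketch(const std::array<size_t, 3> & id,
                                         const std::array<size_t, 3> & range) {
    assert(id[0] < range[0] && id[1] < range[1] && id[2] < range[2]);
    return (id[0] * range[1] + id[1]) * range[2] + id[2];
}
```

Using the built-in call avoids hand-written index arithmetic of this kind inside the kernel.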

* Fix formatting

ggml-ci

* Use const for ne11 and size_t variables in set_rows_sycl_q

ggml-ci

* Increase block size for q kernel to 256

ggml-ci
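
As a rough illustration of what a block size of 256 means for a launch, the sketch below computes how many work-groups are needed to cover `n` work-items and the padded global size that results. The helper names are hypothetical, and what one work-item processes in the actual kernel (an element or a quantized block) is not stated in the commit message, so `n` is left abstract.

```cpp
#include <cassert>
#include <cstddef>

constexpr size_t kQBlockSizeSketch = 256;  // block size named in the commit

// Number of work-groups needed to cover n work-items (ceiling division).
inline size_t num_work_groups_sketch(size_t n, size_t block_size = kQBlockSizeSketch) {
    assert(block_size > 0);
    return (n + block_size - 1) / block_size;
}

// Global size rounded up to a whole number of work-groups. Work-items with a
// linear id >= n must return early inside the kernel.
inline size_t padded_global_size_sketch(size_t n, size_t block_size = kQBlockSizeSketch) {
    return num_work_groups_sketch(n, block_size) * block_size;
}
```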

* Cleanup imports

* Add float.h to cpy.hpp

Assets 15

cudart-llama-bin-win-cuda-12.4-x64.zip
sha256:8c79a9b226de4b3cacfd1f83d24f962d0773be79f1e7b75c6af4ded7e32ae1d6
373 MB 2025-07-28T16:12:56Z

llama-b6015-bin-macos-arm64.zip
sha256:8983b335fca284bd6a8d75d8a43310ce46ec6fd0bc87520b27868a248ecacbce
10.7 MB 2025-07-28T16:13:10Z

llama-b6015-bin-macos-x64.zip
sha256:a55938d4fa84299777c670f7d27e8568fa8f48d681703fe9d08c457d058d14e2
27.2 MB 2025-07-28T16:13:12Z

llama-b6015-bin-ubuntu-vulkan-x64.zip
sha256:0faa45fa799b4ce7bdec7b019ef6a29d7f50c422182e23436e72896f7a1e8e52
20.9 MB 2025-07-28T16:13:13Z

llama-b6015-bin-ubuntu-x64.zip
sha256:59e37e5b503b2e402f78e0e0c2304101fdcd9ba5db19288d96b7b7c14dedd20d
12.5 MB 2025-07-28T16:13:15Z

llama-b6015-bin-win-cpu-arm64.zip
sha256:0c47ff5d2716021094714d3642576ab5cea7ba6fae4bb658596debf10d7d980d
10.9 MB 2025-07-28T16:13:17Z

llama-b6015-bin-win-cpu-x64.zip
sha256:e4db34fc9799a7086a19d8fea0d45439d8b047a159fbc5e69c75d4961d9671af
13.7 MB 2025-07-28T16:13:18Z

llama-b6015-bin-win-cuda-12.4-x64.zip
sha256:ab9d7b7d847d2bc4ea0302a5f0f24c8fdce2ea71b4ed81ba1457d6058e165e40
129 MB 2025-07-28T16:13:19Z

llama-b6015-bin-win-hip-radeon-x64.zip
sha256:8880c2d8483e8539b6a35e0a1547881379a69b126757cdc7550b71935ec4ae9f
298 MB 2025-07-28T16:13:26Z

llama-b6015-bin-win-opencl-adreno-arm64.zip
sha256:0bd568060b8f933c21d04f6770c59dd1de3477b8241f7d2b1e5eb4f0baf5aac0
11.2 MB 2025-07-28T16:13:39Z

Source code (zip)
2025-07-28T15:02:15Z

Source code (tar.gz)
2025-07-28T15:02:15Z
