
[SYCL] fix llama_kv_cache hang when kv_cache is huge: 5GB#21283

Merged

ggerganov merged 1 commit into ggml-org:master from arthw:fix_buffer_clear on Apr 2, 2026


Conversation

@arthw (Contributor) commented Apr 2, 2026

In llama_kv_cache, when the cache size is huge (e.g. 5 GB), the code hangs.
The root cause is that memset() cannot handle sizes larger than 4 GB.
Verified on Arc770.

@github-actions bot added the labels ggml (changes relating to the ggml tensor library for machine learning) and SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language) on Apr 2, 2026
@ggerganov ggerganov merged commit 4888137 into ggml-org:master Apr 2, 2026
45 of 46 checks passed
