UPSTREAM PR #21283: [SYCL] fix llama_kv_cache hang when kv_cache is huge: 5GB#1326

Open
loci-dev wants to merge 1 commit into main from loci/pr-21283-fix_buffer_clear

Conversation


@loci-dev loci-dev commented Apr 2, 2026

Note

Source pull request: ggml-org/llama.cpp#21283

In llama_kv_cache, when the cache size is very large (e.g. 5 GB), the code hangs.
The root cause is that memset() cannot handle more than 4 GB in a single call.
Verified on an Intel Arc A770.
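A minimal sketch of the chunking idea behind the fix: instead of one memset() over the whole buffer, the buffer is cleared in pieces that each stay below the 4 GB limit. The helper name `clear_in_chunks` and the exact chunk size are illustrative assumptions, not taken from the PR itself.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (name is illustrative): clear a large buffer in
// pieces so that no single memset() call exceeds the 4 GB limit that
// caused the hang. The real fix lives in the SYCL backend's buffer
// clear path and may use a different chunk size.
static void clear_in_chunks(void * data, size_t size) {
    // Keep each call strictly below 4 GB (assumed safe upper bound).
    const size_t max_chunk = 4ull * 1024 * 1024 * 1024 - 1;
    uint8_t * ptr = static_cast<uint8_t *>(data);
    while (size > 0) {
        const size_t n = size < max_chunk ? size : max_chunk;
        std::memset(ptr, 0, n);  // each call handles < 4 GB
        ptr  += n;
        size -= n;
    }
}
```

The loop is a no-op change for buffers under 4 GB (a single memset() call), so small KV caches behave exactly as before.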


loci-review bot commented Apr 2, 2026

No meaningful performance changes were detected across 123,165 analyzed functions in the following binaries: build.bin.libllama.so, build.bin.llama-bench, build.bin.libmtmd.so, build.bin.llama-cvector-generator, build.bin.llama-tts, build.bin.llama-quantize, build.bin.llama-qwen2vl-cli, build.bin.llama-tokenize, build.bin.llama-gemma3-cli, build.bin.llama-gguf-split, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli, build.bin.libggml-base.so, build.bin.libggml-cpu.so, build.bin.libggml.so.

🔎 Full breakdown: Loci Inspector
💬 Questions? Tag @loci-dev
