Releases · ngxson/llama.cpp
b4893
llama-tts : add '-o' option (#12398)
* added `-o` option to specify an output file name
* llama-tts returns ENOENT in case of a file write error

Note: PR #12042 is closed as superseded by this one.
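For readers skimming the changelog, a minimal sketch of the error behavior described above, assuming the tool buffers the synthesized audio and writes it to the path given by `-o`; `write_output` and its surroundings are hypothetical names, not the actual llama-tts code:

```cpp
#include <cerrno>
#include <cstdio>
#include <vector>

// Hypothetical helper: write the synthesized samples to the output path.
// Returns 0 on success, ENOENT on a file write error, mirroring the
// behavior described in the release note.
static int write_output(const char * path, const std::vector<char> & data) {
    FILE * f = std::fopen(path, "wb");
    if (f == nullptr) {
        return ENOENT; // cannot open the output file
    }
    const size_t n = std::fwrite(data.data(), 1, data.size(), f);
    std::fclose(f);
    return n == data.size() ? 0 : ENOENT; // short write also fails
}
```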
b4892
SYCL: Delete redundant plus sign and space (#12391)
b4891
SYCL : support non-contiguous tensors in binary ops (add, sub, etc) (…
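As background, a "non-contiguous" tensor is one whose elements are not densely packed in memory, so each index must be scaled by a per-dimension stride. The CPU sketch below illustrates the stride-aware addressing that the SYCL kernels now handle on-device; it is only an illustration, not the actual kernel:

```cpp
#include <cstddef>

// Generic stride-aware elementwise add over a 2-D view. Extents ne0/ne1
// are the logical shape; the s*0/s*1 arguments are per-tensor strides in
// elements. Contiguous tensors are the special case sa0 == 1, sa1 == ne0.
static void add_2d(const float * a, const float * b, float * dst,
                   size_t ne0, size_t ne1,
                   size_t sa0, size_t sa1,
                   size_t sb0, size_t sb1,
                   size_t sd0, size_t sd1) {
    for (size_t i1 = 0; i1 < ne1; i1++) {
        for (size_t i0 = 0; i0 < ne0; i0++) {
            dst[i1*sd1 + i0*sd0] = a[i1*sa1 + i0*sa0] + b[i1*sb1 + i0*sb0];
        }
    }
}
```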
b4890
[CANN] MUL_MAT optimization (#12382)
b4889
Add CLI arg to llama-run to adjust the number of threads used (#12370)
We default to 4 threads; sometimes we want to adjust this manually.

Signed-off-by: Eric Curtin <[email protected]>
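A hedged sketch of the parsing this entry implies, defaulting to 4 threads; the actual flag spelling in llama-run may differ from the `-t`/`--threads` assumed here:

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical argument parsing: default to 4 threads (the default named
// in the release note), with a manual override from the command line.
static int parse_threads(int argc, char ** argv) {
    int n_threads = 4;
    for (int i = 1; i < argc - 1; i++) {
        if (std::strcmp(argv[i], "-t") == 0 ||
            std::strcmp(argv[i], "--threads") == 0) {
            n_threads = std::atoi(argv[i + 1]);
        }
    }
    return n_threads;
}
```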
b4888
main : add -sysf / --system-prompt-file (#12249) (#12250)
* add system_prompt_file
* add -sysf / --system-prompt-file
* remove system_prompt_file
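A minimal sketch of what `--system-prompt-file` implies: read the whole file into a string and use it as the system prompt. `load_system_prompt` is a hypothetical helper, not the code from the PR:

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: load the system prompt from the file passed via
// -sysf / --system-prompt-file. Returns an empty string if the file is
// missing or empty.
static std::string load_system_prompt(const std::string & path) {
    std::ifstream file(path);
    std::stringstream ss;
    ss << file.rdbuf();
    return ss.str();
}
```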
b4887
Load all MoE experts during warmup (#11571)
* llama : introduce llama_set_warmup() API call that controls warmup mode; use all MoE experts during warmup
* common : use new API to enable warmup mode during model warmup

Co-authored-by: Stanisław Szymczyk <[email protected]>
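The first bullet names the new `llama_set_warmup()` call; a hedged usage sketch, assuming the caller has already prepared a context and a small dummy batch:

```cpp
#include "llama.h"

// Hedged sketch: run one dummy decode with warmup mode enabled so that all
// MoE experts are loaded, then switch back to normal routing. Assumes
// `ctx` and `batch` were set up by the caller.
static void warm_up(llama_context * ctx, llama_batch batch) {
    llama_set_warmup(ctx, true);   // warmup mode: touch all experts
    llama_decode(ctx, batch);      // dummy decode faults the weights in
    llama_set_warmup(ctx, false);  // normal inference from here on
}
```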
b4886
server: fix "--grammar-file" parameter (#12285)
b4885
graph : simplify attn input build for unified KV cache (#12381)
b4884
hparams : add SWA rope parameters (#12374)