Releases: aizip/llama.cpp
b3631
ggml : do not crash when quantizing q4_x_x with an imatrix (#9192)
b3618
lora : fix llama conversion script with ROPE_FREQS (#9117)
b3599
server : refactor middleware and /health endpoint (#9056)

* server : refactor middleware and /health endpoint
* move "fail_on_no_slot" to /slots
* Update examples/server/server.cpp
* fix server tests
* fix CI
* update server docs

Co-authored-by: Georgi Gerganov
b3579
ci : fix github workflow vulnerable to script injection (#9008)

Signed-off-by: Diogo Teles Sant'Anna
b3488
ggml: bugfix: fix the inactive elements is agnostic for risc-v vector…
b3466
server : add Speech Recognition & Synthesis to UI (#8679)

* server : add Speech Recognition & Synthesis to UI
* server : add Speech Recognition & Synthesis to UI (fixes)
b3423
ggml : fix quant dot product with odd number of blocks (#8549)

* ggml : fix iq4_nl dot product with odd number of blocks
* ggml : fix odd blocks for ARM_NEON (#8556)
* ggml : fix iq4_nl dot product with odd number of blocks
* ggml : fix q4_1
* ggml : fix q5_0
* ggml : fix q5_1
* ggml : fix iq4_nl metal ggml-ci
* ggml : fix q4_0
* ggml : fix q8_0 ggml-ci
* ggml : remove special Q4_0 code for first 2 blocks
* ggml : fix sumf redefinition

Co-authored-by: slaren
Co-authored-by: Georgi Gerganov
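The class of bug named in this entry can be illustrated with a minimal sketch: a block-wise dot product that consumes two blocks per iteration must still handle a leftover block when the block count is odd. The `Block` layout, `block_dot`, and `vec_dot` below are hypothetical simplifications for illustration only; they are not the ggml block formats or kernels touched by the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical simplified block: 4 signed 8-bit quants sharing one float scale.
// Real ggml formats (q4_0, q8_0, iq4_nl, ...) are laid out differently.
struct Block {
    float  d;      // per-block scale
    int8_t qs[4];  // quantized values
};

// Dot product of one pair of blocks: integer accumulate, then apply both scales.
static float block_dot(const Block & x, const Block & y) {
    int32_t sumi = 0;
    for (int j = 0; j < 4; ++j) {
        sumi += int32_t(x.qs[j]) * int32_t(y.qs[j]);
    }
    return x.d * y.d * float(sumi);
}

// Processes two blocks per iteration, then picks up a trailing block so that
// an odd block count is neither truncated nor read past the end.
float vec_dot(const std::vector<Block> & x, const std::vector<Block> & y) {
    assert(x.size() == y.size());
    const std::size_t nb = x.size();

    float sumf = 0.0f;
    std::size_t i = 0;

    // Main loop: pairs of blocks (mirrors a 2x-unrolled SIMD kernel).
    for (; i + 1 < nb; i += 2) {
        sumf += block_dot(x[i], y[i]) + block_dot(x[i + 1], y[i + 1]);
    }

    // Tail: at most one remaining block when nb is odd.
    for (; i < nb; ++i) {
        sumf += block_dot(x[i], y[i]);
    }
    return sumf;
}
```

Calling `vec_dot` on vectors of three blocks exercises both the paired loop and the tail; a kernel that only ran the paired loop would silently drop the third block.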
b3374
cuda : suppress 'noreturn' warn in no_device_code (#8414)

* cuda : suppress 'noreturn' warn in no_device_code

This commit adds a while(true) loop to the no_device_code function in common.cuh. This is done to suppress the warning:

```console
/ggml/src/ggml-cuda/template-instances/../common.cuh:346:1: warning: function declared 'noreturn' should not return [-Winvalid-noreturn]
  346 | }
      | ^
```

The motivation for this is to reduce the number of warnings when compiling with GGML_HIPBLAS=ON.

Signed-off-by: Daniel Bevenius

* squash! cuda : suppress 'noreturn' warn in no_device_code

Update __trap macro instead of using a while loop to suppress the warning.

Signed-off-by: Daniel Bevenius
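The general idea behind this kind of fix can be sketched in host-side C++: when a function declared `[[noreturn]]` ends in a call the compiler cannot prove is non-returning, making the trap macro itself end in a known non-returning call removes the warning. Everything below (`platform_trap`, `SKETCH_TRAP`, `fatal_unsupported`, `checked_div`) is a hypothetical illustration under that assumption, not the actual `__trap` definition or `no_device_code` body from common.cuh.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for a platform trap intrinsic whose declaration
// does not carry a noreturn attribute.
inline void platform_trap() { std::abort(); }

// std::abort() is declared [[noreturn]], so ending the macro with it tells
// the compiler that control never leaves the expansion. Without the second
// call, a [[noreturn]] function ending in platform_trap() may trigger
// -Winvalid-noreturn.
#define SKETCH_TRAP() do { platform_trap(); std::abort(); } while (0)

// Reports the location of an unsupported code path and never returns.
[[noreturn]] inline void fatal_unsupported(const char * file, int line) {
    std::fprintf(stderr, "%s:%d: unsupported code path\n", file, line);
    SKETCH_TRAP();
}

// Example caller: the failure branch needs no return statement because
// fatal_unsupported() is known not to return.
inline int checked_div(int a, int b) {
    if (b == 0) {
        fatal_unsupported(__FILE__, __LINE__);
    }
    return a / b;
}
```

Fixing the macro rather than adding a loop at each call site keeps the change in one place, which matches the direction the squash commit describes.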