Releases: ggml-org/llama.cpp
b5010
Vulkan: Add DP4A MMQ and Q8_1 quantization shader (#12135)
* Add q4_0 x q8_1 matrix-matrix multiplication support
* Vulkan: Add int8 coopmat MMQ support
* Vulkan: Add q4_1, q5_0 and q5_1 quants, improve integer dot code
* Add GL_EXT_integer_dot_product check
* Remove ggml changes, fix mmq pipeline picker
* Remove ggml changes, restore Intel coopmat behaviour
* Fix glsl compile attempt when integer vec dot is not supported
* Remove redundant code, use non-saturating integer dot, enable all matmul sizes for mmq
* Remove redundant comment
* Fix integer dot check
* Fix compile issue with unsupported int dot glslc
* Update Windows build Vulkan SDK version
b5009
cmake : fix whitespace (#0)
b5006
llava : proper description fix (#12668)
b5005
SYCL: Remove misleading ggml_sycl_op_flatten function (#12387)
* remove trailing whitespace
* Fix L2 norm from rebase
* remove try catch block from element_wise.cpp
* remove comment from common.hpp
* ggml-sycl.cpp: Add try catch sycl::exception block in compute_forward
* norm.cpp: remove try catch exception block
b5004
llava : fix clip loading GGUFs with missing description (#12660)
b5003
tts : remove printfs (#12640)
* tts.cpp : llama token console output now uses LOG_INF instead of printf(), so the '--log-disable' and '--log-file' options have a uniform effect on all output.
b5002
llama : support BailingMoE (Ling) (#12634)
b5001
metal : use constexpr in FA kernels + fix typedef (#12659)
* metal : use constexpr in FA kernels
* cont : fix typedef
b4999
llama-chat : Add Yandex instruct model template support (#12621)
* add yandex template
* update yandex chat template
* fix tests
* adjust chat template
* fix style
* fix tool macro in template
* add clarify comment

Co-authored-by: Sergei Vorobev <[email protected]>
Co-authored-by: Xuan-Son Nguyen <[email protected]>
b4998
musa: fix all warnings, re-enable `-DLLAMA_FATAL_WARNINGS=ON` in ci a…