Releases: ggml-org/llama.cpp
b5028
llama : add option to override model tensor buffers (#11397)
* llama : add option to override tensor buffers
* ggml : fix possible underflow in ggml_nbytes
b5026
vocab : BailingMoE : change possessive quantifiers to greedy (#12677)
b5025
common : remove json.hpp from common.cpp (#12697)
* common : remove json.hpp from common.cpp
* fix comment
b5022
opencl : fix memory allocation size (#12649)
issue: https://github.com/CodeLinaro/llama.cpp/pull/17#issuecomment-2760611283
This patch ensures that the memory allocation size does not exceed the maximum allocation size of the OpenCL device.
b5021
llama : use LLM_KV_GENERAL_FILE_TYPE instead of gguf_find_key (#12672)
b5019
metal : use F32 prec in FA kernels (#12688)
* metal : use F32 prec in FA kernels
* cont : fix FA vec kernel
b5018
Fix clang warning in gguf_check_reserved_keys (#12686)
* Fix clang warning in gguf_check_reserved_keys
* Fix typo
Signed-off-by: Xiaodong Ye <[email protected]>
b5017
vulkan: fix build when glslc doesn't support coopmat (#12683)
b5016
SYCL: Rename oneMKL to oneMath (#12192)
* Rename oneMKL Interface to oneMath
* Use oneMath for Intel vendor
* Rename occurrences to mkl
* clang-format
* Silence verbose warnings
* Set oneMath HIP_TARGETS
* Fix silence warnings
* Remove step to build oneMath from build instructions
* Use fixed oneMath version
* Remove INTEL_CPU
* Fold CMake oneDNN conditions
* Use Intel oneMKL for Intel devices
* Improve CMake message
* Link against MKL::MKL_SYCL::BLAS only
* Move oneMath documentation to Nvidia and AMD sections
b5015
SYCL: switch to SYCL namespace (#12674)