Releases: ochafik/llama.cpp
b5400
gguf : use ggml log system (#13571)
* gguf : use ggml log system
* llama : remove unnecessary new lines in exception messages
b5392
server : proper error handling for missing elements in messages array…
b5387
`common`: add partial regex support (#12808)
* move string_find_partial_stop & string_ends_with to common
* add common_regex (supports partial matches)
* Update common/regex-partial.cpp
* Update common/regex-partial.cpp
* Update common/regex-partial.h
* partial regex: add missing iterator end checks
* string utils: use string_views
* direct throw to avoid ggml.h include
* regex-partial: replace missed ggml_asserts
Co-authored-by: ochafik
Co-authored-by: Georgi Gerganov
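To illustrate what a "partial stop" match means in the entry above, here is a minimal sketch, assuming the goal is to detect that the tail of some generated text could be the beginning of a stop string. The names `ends_with_sketch` and `find_partial_stop_sketch` are hypothetical and are not the functions from the PR; the real `string_find_partial_stop` and `common_regex` may behave differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not the llama.cpp implementation):
// returns true if `s` ends with `suffix`.
static bool ends_with_sketch(std::string_view s, std::string_view suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper: finds the longest prefix of `stop` that is also a
// suffix of `text`. Returns the offset in `text` where that prefix starts,
// or std::string::npos if no non-empty prefix of `stop` ends the text.
static size_t find_partial_stop_sketch(std::string_view text, std::string_view stop) {
    const size_t max_len = std::min(text.size(), stop.size());
    for (size_t len = max_len; len > 0; --len) {
        if (ends_with_sketch(text, stop.substr(0, len))) {
            return text.size() - len;
        }
    }
    return std::string::npos;
}
```

For example, with the text `Hello </s` and the stop string `</s>`, the sketch returns offset 6, so a caller streaming output could hold back everything from that offset until the next token shows whether the stop string completes.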
b5382
CUDA: faster Deepseek FA, add Turing support (#13435)
b5152
SYCL: Refactor and enable FP16 in binary broadcast OPs (#12975)
* SYCL: refactor move to a separate file
* Fix binbcast
* Remove duplicates
* fix include formatting
* fix typo
b5117
sycl: Support sycl_ext_oneapi_limited_graph (#12873)
The current usage of the SYCL-Graph extension checks for the `sycl_ext_oneapi_graph` device aspect. However, it is also possible to support `sycl_ext_oneapi_limited_graph` devices that don't support update
b5072
hellaswag: display estimated score confidence interval (#12797)
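The entry above does not say how the interval is computed. As a rough illustration only, the sketch below uses the Wilson score interval, one common choice for a proportion such as an accuracy score; this is an assumption, not necessarily the method used in the PR. The function name `wilson_interval_sketch` and the default `z = 1.96` (about 95% two-sided confidence) are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Hypothetical helper (assumed method, not taken from the PR):
// Wilson score interval for `correct` successes out of `total` trials.
// Returns {lower, upper} bounds on the underlying success rate.
static std::pair<double, double> wilson_interval_sketch(double correct, double total,
                                                        double z = 1.96) {
    if (total <= 0.0) {
        return {0.0, 1.0};  // no data: the rate is unconstrained
    }
    const double p      = correct / total;
    const double z2     = z * z;
    const double denom  = 1.0 + z2 / total;
    const double center = (p + z2 / (2.0 * total)) / denom;
    const double half   = z * std::sqrt(p * (1.0 - p) / total +
                                        z2 / (4.0 * total * total)) / denom;
    return {center - half, center + half};
}
```

Called with 50 correct answers out of 100 tasks, the sketch yields an interval that is symmetric around 0.5 and roughly ten percentage points wide on each side, which shows why a score from a small number of tasks should be read with some caution.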
b5054
sync: minja (#12739)
* sync: minja https://github.com/google/minja/pull/57
* fix json include
b5040
ci : add env variable in ggml-ci and document the same in SYCL.md (#1…
b5002
llama : support BailingMoE (Ling) (#12634)