Releases: ServeurpersoCom/llama.cpp
b6660
common: introduce http.h for httplib-based client (#16373)

* common: introduce http.h for httplib-based client

  This change moves cpp-httplib based URL parsing and client setup into a new header `common/http.h`, and integrates it in `arg.cpp` and `run.cpp`.

  It is an iteration towards removing libcurl, while intentionally minimizing changes to existing code to guarantee the same behavior when `LLAMA_CURL` is used.

  Signed-off-by: Adrien Gallouët <[email protected]>

* tools : add missing WIN32_LEAN_AND_MEAN

  Signed-off-by: Adrien Gallouët <[email protected]>

---------

Signed-off-by: Adrien Gallouët <[email protected]>
Signed-off-by: Adrien Gallouët <[email protected]>
b6653
model : support GLM 4.6 (make a few NextN/MTP tensors not required) (…
b6651
common : disable progress bar without a tty (#16352)

* common : disable progress bar without a tty

  Signed-off-by: Adrien Gallouët <[email protected]>

* Add missing headers

  Signed-off-by: Adrien Gallouët <[email protected]>

---------

Signed-off-by: Adrien Gallouët <[email protected]>
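
The idea behind b6651 (only draw a progress bar when output goes to a terminal) can be sketched roughly as below. This is an illustration, not the code from the commit: it assumes POSIX `isatty`/`fileno` (or `_isatty`/`_fileno` on Windows), and the helper names `stream_is_tty` and `print_progress` are hypothetical.

```cpp
#include <stdio.h>

#if defined(_WIN32)
#include <io.h>
#define SKETCH_ISATTY(fd) _isatty(fd)
#define SKETCH_FILENO(f)  _fileno(f)
#else
#include <unistd.h>
#define SKETCH_ISATTY(fd) isatty(fd)
#define SKETCH_FILENO(f)  fileno(f)
#endif

// Hypothetical helper: true only when the descriptor refers to a terminal.
// Negative descriptors are rejected before calling into the platform API.
static bool stream_is_tty(int fd) {
    return fd >= 0 && SKETCH_ISATTY(fd) != 0;
}

// Hypothetical helper: draw a one-line progress indicator, but only when
// `out` is a terminal. Returns whether anything was written, so redirected
// output (files, pipes, CI logs) stays free of carriage-return noise.
static bool print_progress(FILE * out, int percent) {
    if (!stream_is_tty(SKETCH_FILENO(out))) {
        return false;
    }
    fprintf(out, "\r[%3d%%]", percent);
    fflush(out);
    return true;
}
```

Calling `print_progress(stderr, 42)` from an interactive shell would draw the indicator, while the same call with `stderr` redirected to a file would write nothing.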
b6648
ggml webgpu: support for rope,div,sub,glu,scale,cont operators (#16187)

* Work on rope
* Simplify inplace operation generation and combine mul/add generation
* Work on rope variants
* implement neox rope
* rope complete
* Add sub,div,glu operators
* implement scale op
* Update cpy shader to handle cont/more types
* formatting
* Update test vars printing for rope,rms_norm
* Avoid ROPE hardcoded constants
* Add TODO to change ROPE constants to enum

  Co-authored-by: Georgi Gerganov <[email protected]>

* fix TODO comment

---------

Co-authored-by: Georgi Gerganov <[email protected]>
b6646
common : remove common_has_curl() (#16351) `test-arg-parser.cpp` has been updated to work consistently, regardless of whether CURL or SSL support is available, and now always points to `ggml.ai`. The previous timeout test has been removed, but it can be added back by providing a dedicated URL under `ggml.ai`. Signed-off-by: Adrien Gallouët <[email protected]>
b6644
ggml : bump version to 0.9.4 (ggml/1363)
b6643
cuda : Enable CUDA Graph usage for Nemotron Nano v2 (NemotronH) (#16328)

* Fix Nemotron Nano v2 9B not executing as CUDA Graph on NVIDIA GPUs
* fix to ensure test-backend-ops check passes
b6639
codeowners: add codeowners for opencl backend (#16344)
b6636
ci : add AMD runners and workflows (#16249)

* ci : add AMD runners and workflows
* ci : move AMD jobs to separate workflow
* cont : fix paths
b6628
ggml-backend : add root cause in error message if loading backend lib…