Releases: ServeurpersoCom/llama.cpp

b6660

01 Oct 18:34
4201dea
common: introduce http.h for httplib-based client (#16373)

* common: introduce http.h for httplib-based client

This change moves cpp-httplib based URL parsing and client setup into
a new header `common/http.h`, and integrates it in `arg.cpp` and `run.cpp`.

It is an iteration towards removing libcurl, while intentionally
minimizing changes to existing code to guarantee the same behavior when
`LLAMA_CURL` is used.

Signed-off-by: Adrien Gallouët <[email protected]>

* tools : add missing WIN32_LEAN_AND_MEAN

Signed-off-by: Adrien Gallouët <[email protected]>

---------

Signed-off-by: Adrien Gallouët <[email protected]>
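
The URL parsing mentioned in the b6660 notes can be illustrated with a small standalone sketch. This is not the code in `common/http.h`: the `url_parts` struct and `parse_url` function are hypothetical names chosen for illustration, and the sketch assumes only `http` and `https` URLs without user info or IPv6 literals.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical result type; not the one defined in common/http.h.
struct url_parts {
    std::string scheme; // "http" or "https"
    std::string host;   // host name without port
    int         port;   // explicit port, or the scheme default
    std::string path;   // path plus query, "/" when absent
};

// Split "scheme://host[:port][/path]" into its parts.
// Returns std::nullopt for anything this sketch does not understand.
static std::optional<url_parts> parse_url(const std::string & url) {
    const std::string sep = "://";
    const size_t scheme_end = url.find(sep);
    if (scheme_end == std::string::npos || scheme_end == 0) {
        return std::nullopt;
    }

    url_parts out;
    out.scheme = url.substr(0, scheme_end);
    if (out.scheme == "http") {
        out.port = 80;
    } else if (out.scheme == "https") {
        out.port = 443;
    } else {
        return std::nullopt; // assumption: only HTTP(S) is supported
    }

    const size_t host_start = scheme_end + sep.size();
    const size_t path_start = url.find('/', host_start);
    std::string authority = path_start == std::string::npos
        ? url.substr(host_start)
        : url.substr(host_start, path_start - host_start);
    out.path = path_start == std::string::npos ? "/" : url.substr(path_start);

    // An optional ":port" suffix overrides the scheme default.
    const size_t colon = authority.rfind(':');
    if (colon != std::string::npos) {
        try {
            out.port = std::stoi(authority.substr(colon + 1));
        } catch (const std::exception &) {
            return std::nullopt;
        }
        authority.erase(colon);
    }
    if (authority.empty()) {
        return std::nullopt;
    }
    out.host = authority;
    return out;
}
```

A caller would pass `host` and `port` to whichever HTTP client it uses and send the request for `path`. Keeping the parser separate from the client setup lets it be tested without network access.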

b6653

01 Oct 03:56
e74c92e
model : support GLM 4.6 (make a few NextN/MTP tensors not required) (…

b6651

30 Sep 18:39
bf6f3b3
common : disable progress bar without a tty (#16352)

* common : disable progress bar without a tty

Signed-off-by: Adrien Gallouët <[email protected]>

* Add missing headers

Signed-off-by: Adrien Gallouët <[email protected]>

---------

Signed-off-by: Adrien Gallouët <[email protected]>
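
One common way to disable a progress bar when output is not a terminal is to check the stream with `isatty` before drawing. The sketch below shows that general technique under stated assumptions: `stream_is_tty`, `render_progress`, and `print_progress` are hypothetical helpers, not functions from llama.cpp, and the bar layout is invented for the example.

```cpp
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <io.h>
#else
#include <unistd.h>
#endif

// True when the stream is attached to an interactive terminal.
static bool stream_is_tty(FILE * stream) {
#ifdef _WIN32
    return _isatty(_fileno(stream)) != 0;
#else
    return isatty(fileno(stream)) != 0;
#endif
}

// Build a fixed-width textual bar such as "[=====     ] 50%".
static std::string render_progress(double fraction, int width) {
    if (fraction < 0.0) fraction = 0.0;
    if (fraction > 1.0) fraction = 1.0;
    const int filled = static_cast<int>(fraction * width);
    std::string bar = "[" + std::string(filled, '=') + std::string(width - filled, ' ') + "] ";
    bar += std::to_string(static_cast<int>(fraction * 100)) + "%";
    return bar;
}

// Draw the bar only on a terminal; stay silent when the stream is
// redirected to a file or a pipe, where "\r" redraws would pollute logs.
static void print_progress(FILE * stream, double fraction) {
    if (!stream_is_tty(stream)) {
        return;
    }
    std::fprintf(stream, "\r%s", render_progress(fraction, 20).c_str());
    std::fflush(stream);
}
```

With this shape, `print_progress(stderr, 0.5)` redraws the bar in an interactive shell and prints nothing under `2> log.txt`.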

b6648

30 Sep 17:49
8d78cd2
ggml webgpu: support for rope,div,sub,glu,scale,cont operators (#16187)

* Work on rope

* Simplify inplace operation generation and combine mul/add generation

* Work on rope variants

* implement neox rope

* rope complete

* Add sub,div,glu operators

* implement scale op

* Update cpy shader to handle cont/more types

* formatting

* Update test vars printing for rope,rms_norm

* Avoid ROPE hardcoded constants

* Add TODO to change ROPE constants to enum

Co-authored-by: Georgi Gerganov <[email protected]>

* fix TODO comment

---------

Co-authored-by: Georgi Gerganov <[email protected]>
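
For readers unfamiliar with the two RoPE layouts named in the b6648 notes, the following CPU-only sketch contrasts the standard variant, which rotates adjacent pairs, with the NeoX variant, which pairs each element with the one half a vector away. It is an illustration of the general technique, not the WebGPU shader from this release; `rope_mode` and `rope_apply` are hypothetical names, and frequency scaling and other extensions are omitted.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical selector for the pairing layout.
enum class rope_mode { normal, neox };

// Rotate a vector of even length in place for token position `pos`.
// Pair i is rotated by the angle theta_i = pos * freq_base^(-2i/n).
static void rope_apply(std::vector<float> & x, int pos, rope_mode mode,
                       float freq_base = 10000.0f) {
    const std::size_t n    = x.size(); // assumed even
    const std::size_t half = n / 2;
    for (std::size_t i = 0; i < half; ++i) {
        const float theta = pos * std::pow(freq_base, -2.0f * i / n);
        const float c = std::cos(theta);
        const float s = std::sin(theta);

        // normal: pairs (0,1), (2,3), ...   neox: pairs (0,n/2), (1,n/2+1), ...
        const std::size_t i0 = mode == rope_mode::normal ? 2 * i     : i;
        const std::size_t i1 = mode == rope_mode::normal ? 2 * i + 1 : i + half;

        const float x0 = x[i0];
        const float x1 = x[i1];
        x[i0] = x0 * c - x1 * s;
        x[i1] = x0 * s + x1 * c;
    }
}
```

Both modes apply the same 2D rotation; only the choice of which two elements form a pair differs, which is why a single routine with an index mapping can cover both.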

b6646

30 Sep 16:30
364a7a6
common : remove common_has_curl() (#16351)

`test-arg-parser.cpp` has been updated to work consistently,
regardless of whether CURL or SSL support is available, and
now always points to `ggml.ai`.

The previous timeout test has been removed, but it can be
added back by providing a dedicated URL under `ggml.ai`.

Signed-off-by: Adrien Gallouët <[email protected]>

b6644

30 Sep 11:23
ggml : bump version to 0.9.4 (ggml/1363)

b6643

30 Sep 08:52
a014310
cuda : Enable CUDA Graph usage for Nemotron Nano v2 (NemotronH) (#16328)

* Fix Nemotron Nano v2 9B not executing as CUDA Graph on NVIDIA GPUs

* fix to ensure test-backend-ops check passes

b6639

30 Sep 06:44
de41f2b
codeowners: add codeowners for opencl backend (#16344)

b6636

29 Sep 15:16
d72f5f7
ci : add AMD runners and workflows (#16249)

* ci : add AMD runners and workflows

* ci : move AMD jobs to separate workflow

* cont : fix paths

b6628

29 Sep 12:04
02463ab
ggml-backend : add root cause in error message if loading backend lib…