
Releases: ngxson/llama.cpp

b4786

28 Feb 08:04
fbeda90
vulkan: matmul dequantization improvements (#12015)

* faster dequant for old quants

* don't use unpack for iq4_nl

* vec2 unpack for q8

b4784

27 Feb 08:28
b95c8af
cmake: Fix ggml backend dependencies and installation (#11818)

* Fix dependencies between ggml and backends

ggml backends now link only to ggml-base, and ggml links to all backends.

* Fix installation of ggml backends

Set up GNUInstallDirs before setting the installation directory of the ggml backends.

b4783

26 Feb 15:03
a800ae4
llava : add struct for FFI bindgen (#12079)

* add struct for FFI bindgen

* Apply suggestions from code review

---------

Co-authored-by: Xuan-Son Nguyen <[email protected]>

b4778

25 Feb 16:09
a82c9e7
vulkan: fix assertion when qy_needs_dequant (#12068)

Looks like a copy/paste bug from qx_needs_dequant.

b4777

25 Feb 12:45
401af80
server: handle echo=false on /v1/completions (#12060)
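In the OpenAI-compatible completions API, `echo=false` means the prompt is not prepended to the returned text. A minimal request-payload sketch in Python (the URL and port are assumptions matching a default local llama-server, not part of this release note):

```python
import json

# Hypothetical local endpoint; adjust host/port for your llama-server instance.
url = "http://localhost:8080/v1/completions"

payload = {
    "prompt": "The capital of France is",
    "max_tokens": 8,
    "echo": False,  # response should contain only the completion, not the prompt
}

body = json.dumps(payload)
```

Send `body` with any HTTP client; with `echo=false` the `text` field of each returned choice holds only the generated continuation.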

b4776

25 Feb 12:20
c132239
add OP sigmoid (#12056)

Co-authored-by: Judd <[email protected]>
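For reference, the sigmoid operator added here computes σ(x) = 1 / (1 + e^(−x)). A minimal numerically stable sketch in Python (illustrative only, not the ggml implementation):

```python
import math

def sigmoid(x: float) -> float:
    # Branch on the sign so exp() never overflows for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

print(sigmoid(0.0))  # 0.5
```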

b4775

25 Feb 12:13
393fca6
ggml-cpu: Fix build with sve (#12059)

* ggml-cpu: Fix build with sve

Signed-off-by: Molly Sophia <[email protected]>

* ggml-cpu: Remove unused variable in sve q3_k vec dot

Signed-off-by: Molly Sophia <[email protected]>

---------

Signed-off-by: Molly Sophia <[email protected]>

b4774

25 Feb 11:53
61d4f39
vulkan: implement more backpropagation operators (#11914)

* vulkan: implement GGML_OP_ROPE_BACK

* vulkan: implement GGML_OP_RMS_NORM_BACK

* vulkan: implement GGML_OP_SILU_BACK

* vulkan: implement GGML_OP_SOFTMAX_BACK
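As a cross-check of what such a `*_BACK` operator computes: SILU_BACK applies the derivative of silu(x) = x·σ(x), which is σ(x)·(1 + x·(1 − σ(x))). A hedged Python sketch (a mathematical illustration, not the Vulkan shader):

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def silu(x: float) -> float:
    return x * sigmoid(x)

def silu_back(x: float, grad: float) -> float:
    # Chain rule: dL/dx = dL/dy * d(silu)/dx, with
    # d(silu)/dx = sigmoid(x) * (1 + x * (1 - sigmoid(x)))
    s = sigmoid(x)
    return grad * s * (1.0 + x * (1.0 - s))

# Sanity check against a central finite difference at x = 0.7:
x, h = 0.7, 1e-6
numeric = (silu(x + h) - silu(x - h)) / (2 * h)
analytic = silu_back(x, 1.0)
assert abs(numeric - analytic) < 1e-6
```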

b4773

25 Feb 11:21
0b52745
server: support add_generation_prompt query param (#12062)

b4771

25 Feb 10:16
3e9a286
llama : expose llama_model_n_head_kv in the API (#11997)

It's useful to have this available from the library layer, as it's a key
parameter of the model (e.g. for figuring out how much KV cache memory is
needed).
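One use mentioned above is estimating KV cache memory from the KV head count. A back-of-the-envelope sketch in Python (the formula and all parameter values are illustrative assumptions, not llama.cpp's internal accounting):

```python
def kv_cache_bytes(n_ctx: int, n_layer: int, n_head_kv: int,
                   head_dim: int, bytes_per_elem: int = 2) -> int:
    # K and V are each stored per layer, per KV head, per position:
    # 2 tensors * n_layer * n_ctx * n_head_kv * head_dim elements.
    return 2 * n_layer * n_ctx * n_head_kv * head_dim * bytes_per_elem

# Example: a 7B-class model with grouped-query attention (illustrative numbers).
gib = kv_cache_bytes(n_ctx=4096, n_layer=32, n_head_kv=8,
                     head_dim=128, bytes_per_elem=2) / 2**30
print(f"{gib:.2f} GiB")  # 0.50 GiB
```

With GQA, `n_head_kv` is smaller than the attention head count, which is exactly why reading it through the API (rather than assuming the full head count) matters for sizing.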