
Releases: ngxson/llama.cpp

b4805

03 Mar 13:53
d5c63cd
test-backend-ops : add option -p to filter by op params (#12155)

b4804

03 Mar 13:47
9660ffe
ggml : fix kleidiai build (#12159)

The libggml API has changed, but the kleidiai code had not been updated to match.

b4803

03 Mar 13:45
c950a1f
Adding UTF-8 support to llama.cpp (#12111)

For emojis, non-alpha characters, etc.

Signed-off-by: Eric Curtin <[email protected]>

b4801

03 Mar 10:49
ece9745
SYCL: Move CPY kernels to a separate file and add few missing kernels…

b4800

02 Mar 21:51
cc473ca
ggml-backend : keep paths in native string type when possible (#12144)

b4799

02 Mar 14:38
14dec0c
main: use jinja chat template system prompt by default (#12118)

* Use jinja chat template system prompt by default

* faster conditional order

* remove nested ternary

---------

Co-authored-by: Xuan Son Nguyen <[email protected]>

b4798

01 Mar 15:06
1782cdf
main: update outdated system prompt message (followup to #12131) (#12…

b4797

01 Mar 13:43
45a8e76
common : add --system-prompt parameter, replace behavior of -p in con…

b4796

01 Mar 12:40
80c41dd
CUDA: compress mode option and default to size (#12029)

CUDA 12.8 added the option to specify stronger compression for binaries, so we now default to "size".

b4793

28 Feb 14:32
70680c4
ggml : upgrade init_tensor API to return a ggml_status (#11854)

* Upgrade init_tensor API to return a ggml_status

To prepare for an 'abort-free' ggml
(so that ggml does not abort on OOM but returns an OOM status),
as agreed with Diego in the ggml repo,
upgrade the init_tensor() and view_init() APIs
to return a ggml_status.

* misc fixes

---------

Co-authored-by: slaren <[email protected]>