Releases: aizip/llama.cpp

b6653

30 Sep 23:22
e74c92e

model : support GLM 4.6 (make a few NextN/MTP tensors not required) (…

b6513

18 Sep 19:18
38dbdf4

CUDA: Optimize PAD_REFLECT_1D (#15957)

* CUDA: Optimize PAD_REFLECT_1D
feat: add more test cases for PAD_REFLECT_1D

* use fast_div to improve performance

* Apply suggestion from JohannesGaessler

Co-authored-by: Johannes Gäßler <[email protected]>

* Apply suggestion from JohannesGaessler

Co-authored-by: Johannes Gäßler <[email protected]>

* optimize

* use a concise expression to further speed up the CUDA kernel

---------

Co-authored-by: Johannes Gäßler <[email protected]>
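
For readers unfamiliar with the operator, here is a minimal CPU-side sketch of 1D reflect padding. It assumes the usual "reflect without repeating the edge element" convention. The function name, argument names, and bounds check are illustrative and are not taken from the CUDA kernel in this release.

```python
def pad_reflect_1d(row, p0, p1):
    """Reflect-pad a 1D sequence with p0 elements on the left and p1 on the right.

    Assumption: the edge element itself is not repeated, so each pad
    width must be smaller than the row length.
    """
    n = len(row)
    if p0 < 0 or p1 < 0 or p0 >= n or p1 >= n:
        raise ValueError("pad widths must be in [0, len(row))")
    out = []
    for i in range(-p0, n + p1):
        if i < 0:
            j = -i                 # mirror around index 0
        elif i >= n:
            j = 2 * (n - 1) - i    # mirror around index n - 1
        else:
            j = i                  # inside the original row
        out.append(row[j])
    return out


print(pad_reflect_1d([1, 2, 3, 4], 2, 1))  # [3, 2, 1, 2, 3, 4, 3]
```

Every output element needs one index computation like the one above, which is why index arithmetic is a natural target for optimization in a GPU kernel.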

b6479

15 Sep 18:10
b907255

SYCL: Add COUNT_EQUAL operator support (#15991)

* SYCL: Add COUNT_EQUAL operator support (rebased on master)

* SYCL: remove duplicate op_count_equal definition

* tests: remove test_count_equal_typed and use test_count_equal for all cases

* tests: keep only I32 case for COUNT_EQUAL as suggested

* tests: keep only I32 case for COUNT_EQUAL as requested
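
As a reference for what the operator computes, the sketch below counts the positions at which two equally sized integer sequences hold the same value. It is plain Python under that assumption, not the SYCL implementation, and the function name is illustrative.

```python
def count_equal(a, b):
    """Return how many positions hold equal values in a and b."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same number of elements")
    return sum(1 for x, y in zip(a, b) if x == y)


print(count_equal([1, 2, 3, 4], [1, 0, 3, 5]))  # 2
```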

b6392

05 Sep 18:49
5143fa8

CUDA: fastdiv, launch bounds for mmvq + q8_1 quant (#15802)

* CUDA: fastdiv, launch bounds for mmvq + q8_1 quant
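
The idea behind a "fastdiv" is to replace integer division by a value that is fixed for a whole kernel launch with a multiply and a shift. The sketch below shows one common multiply-shift scheme for 32-bit unsigned operands. Whether the CUDA code uses exactly these constants is an assumption, and `init_fastdiv` and `fastdiv` are illustrative names.

```python
def init_fastdiv(d):
    """Precompute (mp, L) for dividing 32-bit unsigned integers by d."""
    if not 0 < d < 2**32:
        raise ValueError("divisor must be a positive 32-bit integer")
    L = (d - 1).bit_length()                    # ceil(log2(d)), 0 for d == 1
    mp = ((1 << 32) * ((1 << L) - d)) // d + 1  # "magic" multiplier
    return mp, L


def fastdiv(n, mp, L):
    """Compute n // d using the constants from init_fastdiv(d)."""
    hi = (n * mp) >> 32   # high 32 bits of the product (CUDA: __umulhi)
    return (hi + n) >> L  # Python integers do not overflow here


mp, L = init_fastdiv(7)
print(fastdiv(100, mp, L))  # 14
```

The constants are computed once on the host, so each thread pays only for a multiply, an add, and a shift. On real 32-bit hardware the sum `hi + n` can overflow for large `n`, so an implementation has to either restrict the input range or widen that addition.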

b4920

19 Mar 00:40
d84635b

opencl: improve profiling (#12442)

* opencl: more profiling timing

* opencl: generate trace for profiling

* opencl: reduce profiling overhead

* Populate profiling timing info at the end rather than after each
  kernel run

* opencl: fix for chrome tracing

b4205

28 Nov 11:02
c6bc739

CANN: Update cann.md to display correctly in CLion (#10538)

b3920

15 Oct 12:21
fbc98b7

sampling : add XTC sampler (#9742)

* Initial XTC commit

Adds the XTC sampler. It is not activated by default, but the recommended settings are used as the defaults.

* Cleanup

* Simplified chances calculation

To be more in line with the original implementation, the chance is calculated once at the beginning.

* First fixes from review comments

Still need to look into sorting

* Fixed trailing whitespace

* Fixed RNG to be reproducible

Thanks to @slaren for directions

* Fixed forgotten header

* Moved `min_keep` 

Moved from conditions to a simple check at the end.

* Fixed broken randomization

Thanks to @slaren for explanation

* Swapped sorting for a custom algorithm

Shifts tokens to remove the penalized ones, then puts the penalized ones at the back. This should keep `min_keep` viable.

* Algorithm rework

1. Scan tokens from the top until the first non-penalizable one
2. Remove the last captured token (the least probable one above the threshold)
3. Shift all tokens to overwrite the remaining penalizable ones
4. Penalize them and put them at the bottom.
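
The four steps above can be sketched as follows. This is an illustration of the described steps, not the code merged in the PR: the function name, the `(token, prob)` tuple layout, the `>=` threshold comparison, and returning the penalized tokens as a separate list are all assumptions.

```python
import random


def xtc_split(candidates, threshold, probability, rng):
    """Split candidates into (kept, penalized) following the steps above.

    candidates: list of (token, prob) sorted by descending probability.
    """
    # The chance is drawn once; if it misses, XTC leaves everything alone.
    if rng.random() >= probability:
        return list(candidates), []
    # 1. Scan from the top until the first token below the threshold.
    n_above = 0
    for _, p in candidates:
        if p < threshold:
            break
        n_above += 1
    # XTC needs at least two tokens above the threshold to do anything.
    if n_above < 2:
        return list(candidates), []
    # 2. Spare the last captured token (least probable one above threshold).
    # 3./4. Everything before it is penalized and moved behind the kept tokens.
    penalized = list(candidates[: n_above - 1])
    kept = list(candidates[n_above - 1 :])
    return kept, penalized


cands = [("a", 0.5), ("b", 0.3), ("c", 0.15), ("d", 0.05)]
kept, penalized = xtc_split(cands, 0.1, 1.0, random.Random(0))
print(kept)       # [('c', 0.15), ('d', 0.05)]
print(penalized)  # [('a', 0.5), ('b', 0.3)]
```

In this sketch a threshold above 0.5 can never capture two tokens, because at most one probability can exceed one half, so such a setting leaves the candidates unchanged.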

* Added XTC to `test-sampling`

* Simplified algorithm and more tests

* Updated info in common and args

* Merged back lost commits in common and arg

* Update dump info in common

* Fixed incorrect min_keep check

* Added XTC to README

* Renamed parameters, fixed info and defaults

* probability is at 0 by default, but XTC is included in sampling queue
* threshold higher than 0.5 switches XTC off

* Initial server support

* Added XTC to server UIs

* Fixed labels in old server UI

* Made algorithm safer and more readable

* Removed xtc_threshold_max

* Fixed arg after update

* Quick fixes from review comments

* Simplified algorithm since threshold_max is removed

* Renamed random distribution

* Fixed tests and outdated README

* Small fixes

b3871

03 Oct 15:54
e3c355b

convert : handle tokenizer merges format from transformers 4.45 (#9696)

b3748

13 Sep 01:59
7820364

server : Add option to return token pieces in /tokenize endpoint (#9108)

* server : added with_pieces functionality to /tokenize endpoint

* server : Add tokenize with pieces tests to server.feature

* Handle the case where the tokenizer splits along UTF-8 continuation bytes

* Add example of token splitting

* Remove trailing ws

* Fix trailing ws

* Maybe fix CI

* Maybe this fixes the Windows CI?

---------

Co-authored-by: Xuan Son Nguyen <[email protected]>
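
The UTF-8 case matters because a single token may contain only part of a multi-byte character, so its piece cannot always be returned as a JSON string. Below is a minimal sketch of one way to handle that, assuming a piece is emitted as a string when it decodes cleanly and as a list of byte values otherwise. The helper name and the `id`/`piece` field names are illustrative and are not confirmed by these notes.

```python
import json


def piece_to_json(piece: bytes):
    """Return a JSON-friendly form of a token piece (hypothetical helper).

    Valid UTF-8 becomes a string; anything else, such as a token that ends
    in the middle of a multi-byte character, becomes a list of byte values.
    """
    try:
        return piece.decode("utf-8")
    except UnicodeDecodeError:
        return list(piece)


# "é" is two bytes in UTF-8; pretend the tokenizer split between them.
# The token ids below are made up for illustration.
raw = "é".encode("utf-8")
tokens = [
    {"id": 1, "piece": piece_to_json(b"caf")},
    {"id": 2, "piece": piece_to_json(raw[:1])},
    {"id": 3, "piece": piece_to_json(raw[1:])},
]
print(json.dumps(tokens))
```

A client can rebuild the original text by concatenating the bytes of all pieces in order and decoding the result once at the end.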

b3637

27 Aug 19:18
3246fe8

Fix minicpm example directory (#9111)