Releases: 2015aroras/llama.cpp

b6498

17 Sep 18:13
c959b67

CUDA: fix FA occupancy, optimize tile kernel (#15982)

b6445

10 Sep 21:23
00681df

CUDA: Add `fastdiv` to `k_bin_bcast*`, giving 1-3% E2E performance (#…
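
A note for context: the fastdiv trick replaces a runtime integer division by a fixed divisor with a precomputed multiply-and-shift, which is much cheaper than a hardware divide; broadcast kernels spend much of their index math turning flat indices into per-dimension coordinates with `/` and `%`, which is where such a trick pays off. The sketch below is a minimal host-side C++ illustration of the general idea (it relies on the GCC/Clang `unsigned __int128` extension for the wide multiply) and is not the exact scheme used by the ggml CUDA kernels.

```cpp
// Minimal illustration of the fastdiv idea: division by a fixed divisor d
// becomes "multiply by a precomputed constant, keep the high 64 bits".
// Correct for any 32-bit n and any divisor 2 <= d < 2^32.
#include <cassert>
#include <cstdint>

struct fastdiv32 {
    uint64_t mult; // ceil(2^64 / d); fits in 64 bits because d >= 2
    uint32_t d;
};

// Precompute once per divisor (on the host, before the kernel launch).
static fastdiv32 fastdiv32_init(uint32_t d) {
    assert(d >= 2);
    return { UINT64_MAX / d + 1, d };
}

// n / d without a divide instruction: high half of a 64x64 -> 128-bit multiply.
static uint32_t fastdiv32_div(uint32_t n, fastdiv32 f) {
    return (uint32_t) (((unsigned __int128) f.mult * n) >> 64);
}

int main() {
    const fastdiv32 by7 = fastdiv32_init(7);
    for (uint32_t n = 0; n < 1000000; ++n) {
        assert(fastdiv32_div(n, by7) == n / 7);
    }
    return 0;
}
```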

b4430

06 Jan 17:40
ecebbd2

llama : remove unused headers (#11109)

ggml-ci

b4173

26 Nov 02:17
0cc6375

Introduce llama-run (#10291)

It's like simple-chat but uses smart pointers to avoid manual memory
cleanup, so there are fewer memory leaks in the code now (see the sketch
below). It also avoids printing multiple dots, splits the code into
smaller functions, and uses no exception handling.

Signed-off-by: Eric Curtin <[email protected]>
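
For reference, here is a minimal sketch of the smart-pointer approach described above, written against the classic llama.h C API (llama_load_model_from_file / llama_free_model / llama_new_context_with_model / llama_free); the actual llama-run code may differ in names and structure.

```cpp
// std::unique_ptr with custom deleters: every return path, including the
// error paths, frees the model and context without explicit cleanup code.
#include <memory>

#include "llama.h"

struct llama_model_deleter   { void operator()(llama_model   * m) const { llama_free_model(m); } };
struct llama_context_deleter { void operator()(llama_context * c) const { llama_free(c); } };

using llama_model_ptr   = std::unique_ptr<llama_model,   llama_model_deleter>;
using llama_context_ptr = std::unique_ptr<llama_context, llama_context_deleter>;

// The model must outlive the context, so both live in one struct; members are
// destroyed in reverse declaration order (context first, then model).
struct llama_session {
    llama_model_ptr   model;
    llama_context_ptr ctx;
};

static llama_session make_session(const char * model_path) {
    llama_session s;
    s.model.reset(llama_load_model_from_file(model_path, llama_model_default_params()));
    if (!s.model) {
        return s; // load failed: nothing to clean up by hand
    }
    s.ctx.reset(llama_new_context_with_model(s.model.get(), llama_context_default_params()));
    return s; // if context creation failed, the model is still released automatically
}
```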

b4165

25 Nov 17:44
a9a678a

Add download chat feature to server chat (#10481)

* Add download chat feature to server chat

Adds a download feature next to the delete chat feature in the server Vue chat interface.

* code style

---------

Co-authored-by: Xuan Son Nguyen <[email protected]>

b4091

15 Nov 19:43

cmake : fix ppc64 check (whisper/0)

ggml-ci