Releases: 2015aroras/llama.cpp
b6498
b6445
CUDA: Add `fastdiv` to `k_bin_bcast*`, giving 1-3% E2E performance (#…
b4430
llama : remove unused headers (#11109) ggml-ci
b4173
Introduce llama-run (#10291)
  It's like simple-chat but it uses smart pointers to avoid manual memory cleanups.
  Less memory leaks in the code now.
  Avoid printing multiple dots.
  Split code into smaller functions.
  Uses no exception handling.
  Signed-off-by: Eric Curtin <[email protected]>
b4165
Add download chat feature to server chat (#10481)
  * Add download chat feature to server chat
    Add a download feature next to the delete chat feature in the server vue chat interface.
  * code style
  Co-authored-by: Xuan Son Nguyen <[email protected]>
b4091
cmake : fix ppc64 check (whisper/0) ggml-ci