Releases: jhen0409/llama.cpp

b4635

04 Feb 14:04
106045e

readme : add llm_client Rust crate to readme bindings (#11628)

[This crate](https://github.com/ShelbyJenkins/llm_client) has been in a usable state for quite a while, so I figured it's fair to add it now.

It installs from crates.io and automatically downloads the llama.cpp repo and builds it for the target platform, with the goal of making the user experience as easy as possible.

It also integrates model presets and chooses the largest quant that fits the target's available VRAM. So a user just has to specify one of the presets (I manually add the most popular models), and it will download the model from Hugging Face.
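The quant-selection idea above can be sketched as a small standalone function: pick the largest quantization whose file fits in the available VRAM. This is a minimal illustration, not llm_client's actual API; the type names and the quant sizes are assumptions for the example.

```rust
// Hypothetical sketch of "largest quant that fits VRAM" selection.
// `Quant` and the byte sizes below are illustrative, not llm_client's API.
#[derive(Debug)]
struct Quant {
    name: &'static str,
    size_bytes: u64,
}

/// Return the largest quant whose file size fits within `vram_bytes`,
/// or `None` if even the smallest one is too big.
fn pick_largest_fitting(quants: &[Quant], vram_bytes: u64) -> Option<&Quant> {
    quants
        .iter()
        .filter(|q| q.size_bytes <= vram_bytes)
        .max_by_key(|q| q.size_bytes)
}

fn main() {
    // Rough, assumed sizes for an 8B model's common quants.
    let quants = [
        Quant { name: "Q2_K", size_bytes: 3_200_000_000 },
        Quant { name: "Q4_K_M", size_bytes: 4_900_000_000 },
        Quant { name: "Q6_K", size_bytes: 6_600_000_000 },
        Quant { name: "Q8_0", size_bytes: 8_500_000_000 },
    ];
    let vram = 8_000_000_000u64; // e.g. an 8 GB GPU
    if let Some(q) = pick_largest_fitting(&quants, vram) {
        println!("selected quant: {}", q.name);
    }
}
```

In practice a real implementation would also reserve headroom for the KV cache and compute buffers rather than comparing raw file size to total VRAM.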

So it's like a Rust Ollama, but it's not really for chatting: it makes heavy use of llama.cpp's grammar system to produce structured output for decision-making and control-flow tasks.
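For context, llama.cpp's grammar system uses the GBNF format to constrain token sampling to a fixed shape. A minimal sketch (my own example, not taken from llm_client) that forces the model to emit a yes/no decision might look like:

```
# GBNF: restrict output to exactly "yes" or "no"
root ::= "yes" | "no"
```

Constraining generation this way is what makes the output reliable enough to drive control flow, since the caller never has to parse free-form text.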

b4066

11 Nov 09:01
b0cefea

metal : more precise Q*K in FA vec kernel (#10247)

b4062

09 Nov 14:16

Revert "convert : fix missing ftype for gemma (#5690)"

This reverts commit 54fbcd2ce6c48c9e22eca6fbf9e53fb68c3e72ea.

b4012

02 Nov 04:15

Revert "convert : fix missing ftype for gemma (#5690)"

This reverts commit 54fbcd2ce6c48c9e22eca6fbf9e53fb68c3e72ea.