Description
In ggml-org/llama.cpp#17824, a new llama.cpp CLI experience reusing the llama-server infrastructure was created, deprecating the previous implementation.
I created a Rust implementation here: https://github.com/galo/llama-cpp-rs/tree/main/examples/cli. This is interesting because it gives some types of applications the ability to directly reuse llama-server features (e.g., speculative decoding), along with parity with the rest of llama.cpp's features. To do this, I simply exported new bindings and a safe Rust implementation of the server components; the CLI is an example of how to use this infrastructure.
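For readers who want a feel for the shape of this before opening the repo, here is a minimal sketch of a CLI built on top of safe server bindings. All type and method names here (`ServerContext`, `generate`) are hypothetical placeholders, not the actual llama-cpp-rs API; see the linked example for the real code.

```rust
use std::io::{self, BufRead, Write};

// Hypothetical safe wrapper around the llama-server components.
// The real bindings live in llama-cpp-rs; this is only illustrative.
struct ServerContext;

impl ServerContext {
    // Load a model and initialize the server-side state (placeholder).
    fn new(model_path: &str) -> Result<Self, String> {
        let _ = model_path;
        Ok(ServerContext)
    }

    // Run a single completion through the server pipeline (placeholder).
    fn generate(&mut self, prompt: &str) -> Result<String, String> {
        Ok(format!("<completion for: {prompt}>"))
    }
}

fn main() -> Result<(), String> {
    let model = std::env::args().nth(1).ok_or("usage: cli <model.gguf>")?;
    let mut ctx = ServerContext::new(&model)?;

    // Simple interactive loop, mirroring the chat-style CLI experience.
    let stdin = io::stdin();
    print!("> ");
    io::stdout().flush().map_err(|e| e.to_string())?;
    for line in stdin.lock().lines() {
        let prompt = line.map_err(|e| e.to_string())?;
        let reply = ctx.generate(&prompt)?;
        println!("{reply}");
        print!("> ");
        io::stdout().flush().map_err(|e| e.to_string())?;
    }
    Ok(())
}
```

The point of this structure is that the CLI stays thin: everything that llama-server already does (batching, sampling, features like speculative decoding) is reused through the safe wrapper rather than reimplemented.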
Take a look and let me know if this is interesting. I have not done extensive testing, i.e., I did not test the Vulkan/CUDA/etc. backends.