
new CLI experience #904

@galo

Description

In ggml-org/llama.cpp#17824, llama.cpp introduced a new CLI experience that reuses the llama-server infrastructure and deprecated the previous implementation.

I created a Rust implementation here: https://github.com/galo/llama-cpp-rs/tree/main/examples/cli. This is interesting because it gives some types of application the ability to directly reuse llama-server features (e.g. speculative decoding), with the same parity as the rest of llama.cpp's features. For this I simply exported new bindings and a safe Rust implementation of the server components. The CLI is an example of how to use this infrastructure; a rough sketch of the shape follows below.
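To make the idea concrete, here is a minimal sketch of a REPL-style CLI built over such safe server bindings. All names here (`ServerContext`, `new`, `complete`) are hypothetical stand-ins, not the actual exported API; the real implementation lives in the linked `examples/cli`.

```rust
use std::io::{self, BufRead, Write};

// Hypothetical safe wrapper around the exported llama-server components.
struct ServerContext;

impl ServerContext {
    // Load a model and initialize the server infrastructure (hypothetical).
    fn new(model_path: &str) -> Result<Self, String> {
        println!("loading {model_path}...");
        Ok(ServerContext)
    }

    // Run one completion through the server pipeline (hypothetical).
    // Because the request goes through the server components, features
    // like speculative decoding would apply without extra CLI-side code.
    fn complete(&mut self, prompt: &str) -> Result<String, String> {
        Ok(format!("<completion for: {prompt}>"))
    }
}

fn main() -> Result<(), String> {
    let model = std::env::args()
        .nth(1)
        .ok_or("usage: cli <model.gguf>")?;
    let mut ctx = ServerContext::new(&model)?;

    // Simple REPL: read a prompt from stdin, print the completion.
    print!("> ");
    io::stdout().flush().map_err(|e| e.to_string())?;
    for line in io::stdin().lock().lines() {
        let prompt = line.map_err(|e| e.to_string())?;
        println!("{}", ctx.complete(&prompt)?);
        print!("> ");
        io::stdout().flush().map_err(|e| e.to_string())?;
    }
    Ok(())
}
```

The point of routing through the server components rather than a standalone generation loop is that the CLI inherits whatever the server pipeline already supports, instead of reimplementing those features in the frontend.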

Take a look and let me know if this is interesting. I have not done extensive testing, e.g. I did not test the Vulkan/CUDA/etc. backends.
