
Commit 958c6cb

docs : add "Quick start" section for non-technical users
1 parent aa6dff0 commit 958c6cb

File tree

3 files changed: 57 additions & 19 deletions

README.md

Lines changed: 30 additions & 11 deletions
@@ -28,6 +28,30 @@ Inference of Meta's [LLaMA](https://arxiv.org/abs/2302.13971) model (and others)

----

## Quick start

Getting started with llama.cpp is straightforward. Here are several ways to install it on your machine:

- **⭐ Recommended**: Install `llama.cpp` using [brew, flox, nix or winget](docs/install.md)
- Run with Docker - see our [Docker documentation](docs/docker.md)
- Download pre-built binaries from the [releases page](https://github.com/ggml-org/llama.cpp/releases)
- Build from source by cloning this repository - check out [our build guide](docs/build.md)

Once installed, you'll need a model to work with. Head to the [Obtaining and quantizing models](#obtaining-and-quantizing-models) section to learn more.

Example commands:

```sh
# Use a local model file
llama-cli -m my_model.gguf

# Or download and run a model directly from Hugging Face
llama-cli -hf ggml-org/gemma-3-1b-it-GGUF

# Launch OpenAI-compatible API server
llama-server -hf ggml-org/gemma-3-1b-it-GGUF
```

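Once `llama-server` is running it exposes an OpenAI-compatible HTTP API. A minimal sketch of querying it with `curl`, assuming the server's default listen address of `http://localhost:8080` and the standard `/v1/chat/completions` route:

```sh
# Ask the locally running server for a chat completion
# (host, port and endpoint are assumptions based on llama-server defaults)
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Write a haiku about llamas."}]}'
```
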
## Description

The main goal of `llama.cpp` is to enable LLM inference with minimal setup and state-of-the-art performance on a wide
@@ -229,6 +253,7 @@ Instructions for adding support for new models: [HOWTO-add-model.md](docs/develo

</details>

## Supported backends

| Backend | Target devices |
@@ -245,24 +270,18 @@ Instructions for adding support for new models: [HOWTO-add-model.md](docs/develo
| [OpenCL](docs/backend/OPENCL.md) | Adreno GPU |
| [RPC](https://github.com/ggml-org/llama.cpp/tree/master/tools/rpc) | All |

-## Building the project
-
-The main product of this project is the `llama` library. Its C-style interface can be found in [include/llama.h](include/llama.h).
-The project also includes many example programs and tools using the `llama` library. The examples range from simple, minimal code snippets to sophisticated sub-projects such as an OpenAI-compatible HTTP server. Possible methods for obtaining the binaries:
-
-- Clone this repository and build locally, see [how to build](docs/build.md)
-- On MacOS or Linux, install `llama.cpp` via [brew, flox or nix](docs/install.md)
-- Use a Docker image, see [documentation for Docker](docs/docker.md)
-- Download pre-built binaries from [releases](https://github.com/ggml-org/llama.cpp/releases)
-
## Obtaining and quantizing models

The [Hugging Face](https://huggingface.co) platform hosts a [number of LLMs](https://huggingface.co/models?library=gguf&sort=trending) compatible with `llama.cpp`:

- [Trending](https://huggingface.co/models?library=gguf&sort=trending)
- [LLaMA](https://huggingface.co/models?sort=trending&search=llama+gguf)

-You can either manually download the GGUF file or directly use any `llama.cpp`-compatible models from [Hugging Face](https://huggingface.co/) or other model hosting sites, such as [ModelScope](https://modelscope.cn/), by using this CLI argument: `-hf <user>/<model>[:quant]`.
You can either manually download the GGUF file or directly use any `llama.cpp`-compatible models from [Hugging Face](https://huggingface.co/) or other model hosting sites, such as [ModelScope](https://modelscope.cn/), by using this CLI argument: `-hf <user>/<model>[:quant]`. For example:

```sh
llama-cli -hf ggml-org/gemma-3-1b-it-GGUF
```
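
The optional `:quant` suffix selects a specific quantization file from the repository. A minimal sketch, assuming the repository publishes a `Q4_K_M` variant (the tag here is illustrative):

```sh
# Request a specific quantization; without the suffix, a default file from the repo is used
llama-cli -hf ggml-org/gemma-3-1b-it-GGUF:Q4_K_M
```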

By default, the CLI downloads from Hugging Face; you can switch to another host with the `MODEL_ENDPOINT` environment variable. For example, to download model checkpoints from ModelScope or other model-sharing communities instead, set `MODEL_ENDPOINT=https://www.modelscope.cn/`.
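
A minimal sketch of a one-off override, assuming a POSIX shell and that the referenced repository is also published on the chosen host:

```sh
# Fetch the model from ModelScope for this invocation only
MODEL_ENDPOINT=https://www.modelscope.cn/ llama-cli -hf <user>/<model>
```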

docs/build.md

Lines changed: 4 additions & 0 deletions
@@ -1,5 +1,9 @@
# Build llama.cpp locally

The main product of this project is the `llama` library. Its C-style interface can be found in [include/llama.h](include/llama.h).

The project also includes many example programs and tools using the `llama` library. The examples range from simple, minimal code snippets to sophisticated sub-projects such as an OpenAI-compatible HTTP server.

**To get the Code:**

```bash
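# The clone commands are cut off by this diff's context window; the lines below
# are a likely reconstruction, assuming the repository URL shown elsewhere on this page.
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
```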

docs/install.md

Lines changed: 23 additions & 8 deletions
@@ -1,28 +1,43 @@
# Install pre-built version of llama.cpp

-## Homebrew
| Install via | Windows | Mac | Linux |
|-------------|---------|-----|-------|
| Winget      | ✅      |     |       |
| Homebrew    |         | ✅  | ✅    |
| MacPorts    |         | ✅  |       |
| Nix         |         | ✅  | ✅    |
| Flox        |         | ✅  | ✅    |

-On Mac and Linux, the homebrew package manager can be used via
## Winget (Windows)

```sh
winget install llama.cpp
```

The package is automatically updated with new `llama.cpp` releases. More info: https://github.com/ggml-org/llama.cpp/issues/8188

## Homebrew (Mac and Linux)

```sh
brew install llama.cpp
```

The formula is automatically updated with new `llama.cpp` releases. More info: https://github.com/ggml-org/llama.cpp/discussions/7668

-## MacPorts
## MacPorts (Mac)

```sh
sudo port install llama.cpp
```
-see also: https://ports.macports.org/port/llama.cpp/details/

-## Nix
See also: https://ports.macports.org/port/llama.cpp/details/

-On Mac and Linux, the Nix package manager can be used via
## Nix (Mac and Linux)

```sh
nix profile install nixpkgs#llama-cpp
```

For flake enabled installs.

Or
@@ -35,9 +50,9 @@ For non-flake enabled installs.

This expression is automatically updated within the [nixpkgs repo](https://github.com/NixOS/nixpkgs/blob/nixos-24.05/pkgs/by-name/ll/llama-cpp/package.nix#L164).

-## Flox
## Flox (Mac and Linux)

-On Mac and Linux, Flox can be used to install llama.cpp within a Flox environment via
Flox can be used to install llama.cpp within a Flox environment via

```sh
flox install llama-cpp
```