Commit f9c2a7c

Merge branch 'master' into prune

2 parents 70842dc + ea1431b

220 files changed: +15936 additions, -9350 deletions

Note: large commits have some content hidden by default; only a subset of the changed files is shown below.

.editorconfig

Lines changed: 1 addition & 1 deletion

@@ -49,6 +49,6 @@ charset = unset
 trim_trailing_whitespace = unset
 insert_final_newline = unset
 
-[tools/mtmd/miniaudio.h]
+[vendor/miniaudio/miniaudio.h]
 trim_trailing_whitespace = unset
 insert_final_newline = unset

.github/workflows/build-linux-cross.yml

Lines changed: 15 additions & 15 deletions

@@ -26,12 +26,12 @@ jobs:
           sudo apt-get install -y --no-install-recommends \
             build-essential \
             gcc-14-riscv64-linux-gnu \
-            g++-14-riscv64-linux-gnu \
-            libcurl4-openssl-dev:riscv64
+            g++-14-riscv64-linux-gnu
 
       - name: Build
         run: |
-          cmake -B build -DCMAKE_BUILD_TYPE=Release \
+          cmake -B build -DLLAMA_CURL=OFF \
+            -DCMAKE_BUILD_TYPE=Release \
             -DGGML_OPENMP=OFF \
             -DLLAMA_BUILD_EXAMPLES=ON \
             -DLLAMA_BUILD_TOOLS=ON \
@@ -72,12 +72,12 @@ jobs:
             glslc \
             gcc-14-riscv64-linux-gnu \
             g++-14-riscv64-linux-gnu \
-            libvulkan-dev:riscv64 \
-            libcurl4-openssl-dev:riscv64
+            libvulkan-dev:riscv64
 
       - name: Build
         run: |
-          cmake -B build -DCMAKE_BUILD_TYPE=Release \
+          cmake -B build -DLLAMA_CURL=OFF \
+            -DCMAKE_BUILD_TYPE=Release \
             -DGGML_VULKAN=ON \
             -DGGML_OPENMP=OFF \
             -DLLAMA_BUILD_EXAMPLES=ON \
@@ -118,12 +118,12 @@ jobs:
             build-essential \
             glslc \
             crossbuild-essential-arm64 \
-            libvulkan-dev:arm64 \
-            libcurl4-openssl-dev:arm64
+            libvulkan-dev:arm64
 
       - name: Build
         run: |
-          cmake -B build -DCMAKE_BUILD_TYPE=Release \
+          cmake -B build -DLLAMA_CURL=OFF \
+            -DCMAKE_BUILD_TYPE=Release \
             -DGGML_VULKAN=ON \
             -DGGML_OPENMP=OFF \
             -DLLAMA_BUILD_EXAMPLES=ON \
@@ -163,12 +163,12 @@ jobs:
           sudo apt-get install -y --no-install-recommends \
             build-essential \
             gcc-14-powerpc64le-linux-gnu \
-            g++-14-powerpc64le-linux-gnu \
-            libcurl4-openssl-dev:ppc64el
+            g++-14-powerpc64le-linux-gnu
 
       - name: Build
         run: |
-          cmake -B build -DCMAKE_BUILD_TYPE=Release \
+          cmake -B build -DLLAMA_CURL=OFF \
+            -DCMAKE_BUILD_TYPE=Release \
             -DGGML_OPENMP=OFF \
             -DLLAMA_BUILD_EXAMPLES=ON \
             -DLLAMA_BUILD_TOOLS=ON \
@@ -209,12 +209,12 @@ jobs:
             glslc \
             gcc-14-powerpc64le-linux-gnu \
             g++-14-powerpc64le-linux-gnu \
-            libvulkan-dev:ppc64el \
-            libcurl4-openssl-dev:ppc64el
+            libvulkan-dev:ppc64el
 
       - name: Build
         run: |
-          cmake -B build -DCMAKE_BUILD_TYPE=Release \
+          cmake -B build -DLLAMA_CURL=OFF \
+            -DCMAKE_BUILD_TYPE=Release \
             -DGGML_VULKAN=ON \
             -DGGML_OPENMP=OFF \
             -DLLAMA_BUILD_EXAMPLES=ON \
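The pattern across all five cross-compile jobs is the same: the per-architecture `libcurl4-openssl-dev` package is dropped from `apt-get install`, and curl support is switched off at configure time instead. Below is a minimal sketch of the resulting configure step for the riscv64 job, assuming the cross toolchain packages from the workflow are installed; the compiler variables are illustrative assumptions, since the workflow's remaining cmake flags are truncated in this diff:

```sh
# Sketch only: mirrors the configure step above with curl disabled.
# The compiler names follow Ubuntu's gcc-14-riscv64-linux-gnu packages;
# they are assumptions, not lines visible in the hunks above.
cmake -B build -DLLAMA_CURL=OFF \
    -DCMAKE_BUILD_TYPE=Release \
    -DGGML_OPENMP=OFF \
    -DLLAMA_BUILD_EXAMPLES=ON \
    -DLLAMA_BUILD_TOOLS=ON \
    -DCMAKE_C_COMPILER=riscv64-linux-gnu-gcc-14 \
    -DCMAKE_CXX_COMPILER=riscv64-linux-gnu-g++-14
cmake --build build -j
```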

README.md

Lines changed: 31 additions & 11 deletions

@@ -28,6 +28,30 @@ Inference of Meta's [LLaMA](https://arxiv.org/abs/2302.13971) model (and others)
 
 ----
 
+## Quick start
+
+Getting started with llama.cpp is straightforward. Here are several ways to install it on your machine:
+
+- Install `llama.cpp` using [brew, nix or winget](docs/install.md)
+- Run with Docker - see our [Docker documentation](docs/docker.md)
+- Download pre-built binaries from the [releases page](https://github.com/ggml-org/llama.cpp/releases)
+- Build from source by cloning this repository - check out [our build guide](docs/build.md)
+
+Once installed, you'll need a model to work with. Head to the [Obtaining and quantizing models](#obtaining-and-quantizing-models) section to learn more.
+
+Example command:
+
+```sh
+# Use a local model file
+llama-cli -m my_model.gguf
+
+# Or download and run a model directly from Hugging Face
+llama-cli -hf ggml-org/gemma-3-1b-it-GGUF
+
+# Launch OpenAI-compatible API server
+llama-server -hf ggml-org/gemma-3-1b-it-GGUF
+```
+
 ## Description
 
 The main goal of `llama.cpp` is to enable LLM inference with minimal setup and state-of-the-art performance on a wide
@@ -130,6 +154,7 @@ Instructions for adding support for new models: [HOWTO-add-model.md](docs/develo
 <details>
 <summary>Bindings</summary>
 
+- Python: [ddh0/easy-llama](https://github.com/ddh0/easy-llama)
 - Python: [abetlen/llama-cpp-python](https://github.com/abetlen/llama-cpp-python)
 - Go: [go-skynet/go-llama.cpp](https://github.com/go-skynet/go-llama.cpp)
 - Node.js: [withcatai/node-llama-cpp](https://github.com/withcatai/node-llama-cpp)
@@ -229,6 +254,7 @@ Instructions for adding support for new models: [HOWTO-add-model.md](docs/develo
 
 </details>
 
+
 ## Supported backends
 
 | Backend | Target devices |
@@ -245,24 +271,18 @@ Instructions for adding support for new models: [HOWTO-add-model.md](docs/develo
 | [OpenCL](docs/backend/OPENCL.md) | Adreno GPU |
 | [RPC](https://github.com/ggml-org/llama.cpp/tree/master/tools/rpc) | All |
 
-## Building the project
-
-The main product of this project is the `llama` library. Its C-style interface can be found in [include/llama.h](include/llama.h).
-The project also includes many example programs and tools using the `llama` library. The examples range from simple, minimal code snippets to sophisticated sub-projects such as an OpenAI-compatible HTTP server. Possible methods for obtaining the binaries:
-
-- Clone this repository and build locally, see [how to build](docs/build.md)
-- On MacOS or Linux, install `llama.cpp` via [brew, flox or nix](docs/install.md)
-- Use a Docker image, see [documentation for Docker](docs/docker.md)
-- Download pre-built binaries from [releases](https://github.com/ggml-org/llama.cpp/releases)
-
 ## Obtaining and quantizing models
 
 The [Hugging Face](https://huggingface.co) platform hosts a [number of LLMs](https://huggingface.co/models?library=gguf&sort=trending) compatible with `llama.cpp`:
 
 - [Trending](https://huggingface.co/models?library=gguf&sort=trending)
 - [LLaMA](https://huggingface.co/models?sort=trending&search=llama+gguf)
 
-You can either manually download the GGUF file or directly use any `llama.cpp`-compatible models from [Hugging Face](https://huggingface.co/) or other model hosting sites, such as [ModelScope](https://modelscope.cn/), by using this CLI argument: `-hf <user>/<model>[:quant]`.
+You can either manually download the GGUF file or directly use any `llama.cpp`-compatible models from [Hugging Face](https://huggingface.co/) or other model hosting sites, such as [ModelScope](https://modelscope.cn/), by using this CLI argument: `-hf <user>/<model>[:quant]`. For example:
+
+```sh
+llama-cli -hf ggml-org/gemma-3-1b-it-GGUF
+```
 
 By default, the CLI would download from Hugging Face, you can switch to other options with the environment variable `MODEL_ENDPOINT`. For example, you may opt to downloading model checkpoints from ModelScope or other model sharing communities by setting the environment variable, e.g. `MODEL_ENDPOINT=https://www.modelscope.cn/`.
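A note on the `MODEL_ENDPOINT` context line above: since it is an ordinary environment variable, it can also be set for a single invocation. A minimal sketch, reusing the `<user>/<model>` placeholder from the README text itself; the model must actually be hosted on the chosen endpoint:

```sh
# Sketch: point the model downloader at ModelScope for one run only.
# <user>/<model> is a placeholder, as in the README text above.
MODEL_ENDPOINT=https://www.modelscope.cn/ llama-cli -hf <user>/<model>
```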

common/CMakeLists.txt

Lines changed: 5 additions & 8 deletions

@@ -58,23 +58,20 @@ add_library(${TARGET} STATIC
     arg.cpp
     arg.h
     base64.hpp
-    chat.cpp
-    chat.h
     chat-parser.cpp
     chat-parser.h
+    chat.cpp
+    chat.h
     common.cpp
     common.h
     console.cpp
     console.h
-    json-schema-to-grammar.cpp
-    json.hpp
-    json-partial.h
     json-partial.cpp
+    json-partial.h
+    json-schema-to-grammar.cpp
     llguidance.cpp
     log.cpp
     log.h
-    minja/chat-template.hpp
-    minja/minja.hpp
     ngram-cache.cpp
     ngram-cache.h
     regex-partial.cpp
@@ -147,7 +144,7 @@ if (LLAMA_LLGUIDANCE)
     set(LLAMA_COMMON_EXTRA_LIBS ${LLAMA_COMMON_EXTRA_LIBS} llguidance ${LLGUIDANCE_PLATFORM_LIBS})
 endif ()
 
-target_include_directories(${TARGET} PUBLIC .)
+target_include_directories(${TARGET} PUBLIC . ../vendor)
 target_compile_features   (${TARGET} PUBLIC cxx_std_17)
 target_link_libraries     (${TARGET} PRIVATE ${LLAMA_COMMON_EXTRA_LIBS} PUBLIC llama Threads::Threads)
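Taken together with the `.editorconfig` hunk above (miniaudio.h moving to `vendor/miniaudio/`), the new `../vendor` include path suggests the vendored headers removed from this source list (`json.hpp`, `minja/minja.hpp`, `minja/chat-template.hpp`) now live under a top-level `vendor/` directory. A sketch of checking that the library still configures and builds after such a move; the target name `common` is an assumption for `${TARGET}`:

```sh
# Sketch only: rebuild just the common library so that any vendored
# include that no longer resolves surfaces as a compile error.
# "common" assumes ${TARGET} is set to that name in common/CMakeLists.txt.
cmake -B build
cmake --build build --target common -j
```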
