
Commit d440f03

Update gguf-llamacpp.md
1 parent 0383e4e commit d440f03

1 file changed: +25 -1 lines changed

docs/hub/gguf-llamacpp.md (25 additions & 1 deletion)
@@ -1,6 +1,30 @@
# GGUF usage with llama.cpp

NEW: You can now deploy any llama.cpp-compatible GGUF on Hugging Face Endpoints, read more about it [here](https://huggingface.co/docs/inference-endpoints/en/others/llamacpp_container).

Llama.cpp allows you to download and run inference on a GGUF simply by providing the Hugging Face repo path and the file name. llama.cpp downloads the model checkpoint and automatically caches it. The location of the cache is defined by the `LLAMA_CACHE` environment variable; read more about it [here](https://github.com/ggerganov/llama.cpp/pull/7826).
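For example, if you want llama.cpp to cache downloaded checkpoints somewhere other than the default location, you can point `LLAMA_CACHE` at a directory of your choice before running any of the commands below (the path here is purely illustrative):

```bash
# Store downloaded GGUF checkpoints in a custom directory (illustrative path).
export LLAMA_CACHE=~/models/llama-cache
```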
Install llama.cpp through brew (works on Mac and Linux):

```bash
brew install llama.cpp
```
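After the install finishes, the `llama-cli` and `llama-server` binaries should be available on your `PATH`. A quick sanity check, assuming your build exposes the `--version` flag, is:

```bash
# Print version and build info to confirm the install worked.
llama-cli --version
```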
You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo.
Step 1: Clone llama.cpp from GitHub.

```bash
git clone https://github.com/ggerganov/llama.cpp
```
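If you only need the sources to build the tools and don't care about the full git history, a shallow clone is a lighter-weight alternative:

```bash
# Shallow clone: fetch only the latest commit, which is enough to build.
git clone --depth 1 https://github.com/ggerganov/llama.cpp
```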
Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (for example, `LLAMA_CUDA=1` for Nvidia GPUs on Linux).

```bash
cd llama.cpp && LLAMA_CURL=1 make
```
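As a sketch of the hardware-specific variant mentioned above, a CUDA-enabled build on a Linux machine with an Nvidia GPU might look like the following (this assumes the CUDA toolkit is already installed):

```bash
# Enable the CURL downloader and CUDA offload, and parallelize the build.
cd llama.cpp && LLAMA_CURL=1 LLAMA_CUDA=1 make -j$(nproc)
```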
Once installed, you can use `llama-cli` or `llama-server` as follows:

```bash
./llama-cli
```

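As a minimal sketch of a full invocation, the Hugging Face repo path and file name mentioned above map onto the `--hf-repo` and `--hf-file` flags; the repo and file names below are placeholders, so substitute any GGUF you want to run:

```bash
# Download (and cache) the GGUF from the Hub, then run a prompt against it.
./llama-cli \
  --hf-repo lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF \
  --hf-file Meta-Llama-3-8B-Instruct-Q8_0.gguf \
  -p "You are a helpful assistant"

# The same flags work with llama-server, which serves the model over HTTP.
./llama-server \
  --hf-repo lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF \
  --hf-file Meta-Llama-3-8B-Instruct-Q8_0.gguf
```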