
Commit 3dd77f3

Vaibhavs10 and ngxson authored
upd llama.cpp docs (#1580)
* upd llama.cpp docs
* Update docs/hub/gguf-llamacpp.md
Co-authored-by: Xuan Son Nguyen <[email protected]>
* suggestions from code review.
---------
Co-authored-by: Xuan Son Nguyen <[email protected]>
1 parent 9c3c831 · commit 3dd77f3

File tree

1 file changed: +3 −8 lines changed

docs/hub/gguf-llamacpp.md

Lines changed: 3 additions & 8 deletions
@@ -30,20 +30,15 @@ cd llama.cpp && LLAMA_CURL=1 make
 Once installed, you can use the `llama-cli` or `llama-server` as follows:
 
 ```bash
-llama-cli \
-  --hf-repo lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF \
-  --hf-file Meta-Llama-3-8B-Instruct-Q8_0.gguf \
-  -p "You are a helpful assistant" -cnv
+llama-cli -hf bartowski/Llama-3.2-3B-Instruct-GGUF:Q8_0
 ```
 
 Note: You can remove `-cnv` to run the CLI in chat completion mode.
 
 Additionally, you can invoke an OpenAI spec chat completions endpoint directly using the llama.cpp server:
 
 ```bash
-llama-server \
-  --hf-repo lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF \
-  --hf-file Meta-Llama-3-8B-Instruct-Q8_0.gguf
+llama-server -hf bartowski/Llama-3.2-3B-Instruct-GGUF:Q8_0
 ```
 
 After running the server you can simply utilise the endpoint as below:
@@ -66,6 +61,6 @@ curl http://localhost:8080/v1/chat/completions \
 }'
 ```
 
-Replace `--hf-repo` with any valid Hugging Face hub repo name and `--hf-file` with the GGUF file name in the hub repo - off you go! 🦙
+Replace `-hf` with any valid Hugging Face hub repo name - off you go! 🦙
 
 Note: Remember to `build` llama.cpp with `LLAMA_CURL=1` :)
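For readers comparing the two invocations in the first hunk: the new `-hf` flag folds the repo and file arguments into a single `repo:quantization` reference. A minimal sketch of the equivalence, assuming the `Q8_0` tag resolves to a file named `Llama-3.2-3B-Instruct-Q8_0.gguf` (a typical naming pattern, not verified against the repo):

```bash
# Old style: repository and GGUF file named separately
llama-cli --hf-repo bartowski/Llama-3.2-3B-Instruct-GGUF \
          --hf-file Llama-3.2-3B-Instruct-Q8_0.gguf   # hypothetical file name

# New style: repository plus quantization tag; llama.cpp resolves the file
llama-cli -hf bartowski/Llama-3.2-3B-Instruct-GGUF:Q8_0
```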

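The second hunk shows only the tail of the server example. Based on the hunk header (`curl http://localhost:8080/v1/chat/completions \`) and the standard OpenAI chat-completions request shape, the full call looks roughly like this; the headers and message bodies are illustrative, not the doc's exact payload:

```bash
# Sketch of the truncated request; content is assumed for illustration
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Hello!" }
    ]
  }'
```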
0 commit comments