Commit b72b9c2

readme : add CPU-only configs
1 parent 9407bdd commit b72b9c2

File tree: 1 file changed (+18 −0 lines changed)


README.md

Lines changed: 18 additions & 0 deletions
````diff
@@ -78,6 +78,24 @@ Here are recommended settings, depending on the amount of VRAM that you have:
 --ctx-size 0 --cache-reuse 256
 ```
 
+<details>
+<summary>CPU-only configs</summary>
+
+These are `llama-server` settings for CPU-only hardware. Note that the quality will be significantly lower:
+
+```bash
+llama-server \
+    -hf ggml-org/Qwen2.5-Coder-1.5B-Q8_0-GGUF \
+    --port 8012 -ub 512 -b 512 --ctx-size 0 --cache-reuse 256
+```
+
+```bash
+llama-server \
+    -hf ggml-org/Qwen2.5-Coder-0.5B-Q8_0-GGUF \
+    --port 8012 -ub 1024 -b 1024 --ctx-size 0 --cache-reuse 256
+```
+</details>
+
 You can use any other FIM-compatible model that your system can handle. By default, the models downloaded with the `-hf` flag are stored in:
 
 - Mac OS: `~/Library/Caches/llama.cpp/`
````
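For reference, the flags in these configs map to `llama-server` options as follows: `-b` sets the logical batch size and `-ub` the physical (micro) batch size, `--ctx-size 0` takes the context length from the model itself, and `--cache-reuse 256` lets cached KV chunks of at least 256 tokens be reused across requests. Below is a minimal smoke test against such a server, assuming it is listening on port 8012 as configured above; the `/health` and `/infill` endpoints are part of the llama-server HTTP API, but the exact `/infill` request shape may vary between versions, so treat this as an illustrative sketch rather than a canonical example:

```bash
# Check that the server is up (llama-server health endpoint).
curl http://localhost:8012/health

# Illustrative FIM request: ask the model to fill in the body of `add`,
# given the code before (input_prefix) and after (input_suffix) the cursor.
curl http://localhost:8012/infill -d '{
  "input_prefix": "def add(a, b):\n    ",
  "input_suffix": "\n\nprint(add(1, 2))\n",
  "n_predict": 32
}'
```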
