Here are recommended settings, depending on the amount of VRAM that you have:
    --ctx-size 0 --cache-reuse 256
```

<details>
<summary>CPU-only configs</summary>

These are `llama-server` settings for CPU-only hardware. Note that the completion quality will be significantly lower than with the GPU configurations above:

```bash
llama-server \
    -hf ggml-org/Qwen2.5-Coder-1.5B-Q8_0-GGUF \
    --port 8012 -ub 512 -b 512 --ctx-size 0 --cache-reuse 256
```

```bash
llama-server \
    -hf ggml-org/Qwen2.5-Coder-0.5B-Q8_0-GGUF \
    --port 8012 -ub 1024 -b 1024 --ctx-size 0 --cache-reuse 256
```
</details>

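Once one of these servers is running, you can sanity-check it from the command line before connecting your editor. The snippet below is a sketch: the `/health` and `/infill` endpoints and the request fields (`input_prefix`, `input_suffix`, `n_predict`) follow the llama.cpp server API as commonly documented, so verify them against your build.

```bash
# Confirm the server is up and the model has finished loading.
curl -s http://127.0.0.1:8012/health

# Ask the FIM endpoint to fill the gap between a prefix and a suffix.
curl -s http://127.0.0.1:8012/infill -d '{
    "input_prefix": "def add(a, b):\n    ",
    "input_suffix": "\n\nprint(add(1, 2))\n",
    "n_predict": 32
}'
```
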
You can use any other FIM-compatible model that your system can handle. By default, the models downloaded with the `-hf` flag are stored in:

- Mac OS: `~/Library/Caches/llama.cpp/`