Commit af409d6

readme : update llama-server command with presets
This commit updates the llama-server commands to use the new presets available in the latest version of llama.cpp.

Refs: ggml-org/llama.cpp#11945
Parent: 72c2ee2
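One quick way to check whether a local `llama-server` build ships these presets is to grep its help text; this assumes a llama.cpp build recent enough to include the referenced PR:

```bash
# List the FIM presets exposed by the local llama-server binary;
# prints nothing if the build predates the presets.
llama-server --help | grep -- '--fim-qwen'
```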

README.md

Lines changed: 3 additions & 12 deletions
@@ -100,28 +100,19 @@ Here are recommended settings, depending on the amount of VRAM that you have:
 - More than 16GB VRAM:
 
   ```bash
-  llama-server \
-      -hf ggml-org/Qwen2.5-Coder-7B-Q8_0-GGUF \
-      --port 8012 -ngl 99 -fa -ub 1024 -b 1024 \
-      --ctx-size 0 --cache-reuse 256
+  llama-server --fim-qwen-7b-default
   ```
 
 - Less than 16GB VRAM:
 
   ```bash
-  llama-server \
-      -hf ggml-org/Qwen2.5-Coder-3B-Q8_0-GGUF \
-      --port 8012 -ngl 99 -fa -ub 1024 -b 1024 \
-      --ctx-size 0 --cache-reuse 256
+  llama-server --fim-qwen-3b-default
   ```
 
 - Less than 8GB VRAM:
 
   ```bash
-  llama-server \
-      -hf ggml-org/Qwen2.5-Coder-1.5B-Q8_0-GGUF \
-      --port 8012 -ngl 99 -fa -ub 1024 -b 1024 \
-      --ctx-size 0 --cache-reuse 256
+  llama-server --fim-qwen-1.5b-default
   ```
 
 Use `:help llama` for more details.
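For reference, each `--fim-qwen-*-default` preset stands in for roughly the explicit invocation it replaces. Below is a sketch of the 7B case, reconstructed from the flags removed in this diff; the authoritative expansion is defined in llama.cpp's argument parser, not here:

```bash
# Approximate expansion of `llama-server --fim-qwen-7b-default`,
# based on the flags this commit removes from the README:
#   -hf             pull the model from Hugging Face
#   -ngl 99         offload all layers to the GPU
#   -fa             enable flash attention
#   -ub / -b 1024   micro-batch / batch size
#   --ctx-size 0    use the model's full training context
#   --cache-reuse   reuse matching KV cache chunks of >= 256 tokens
llama-server \
    -hf ggml-org/Qwen2.5-Coder-7B-Q8_0-GGUF \
    --port 8012 -ngl 99 -fa -ub 1024 -b 1024 \
    --ctx-size 0 --cache-reuse 256
```

The 3B and 1.5B presets presumably differ only in the model passed to `-hf`, mirroring the other two commands removed above.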
