Commit f10ebf7

parallel : update readme [no ci]
1 parent 1eb817f commit f10ebf7

1 file changed (+8, -0)

examples/parallel/README.md

Lines changed: 8 additions & 0 deletions
@@ -1,3 +1,11 @@
 # llama.cpp/example/parallel

 Simplified simulation of serving incoming requests in parallel
+
+## Example
+
+Generate 128 client requests (`-ns 128`), simulating 8 concurrent clients (`-np 8`). The system prompt is shared (`-pps`), meaning that it is computed once at the start. The client requests consist of 10 junk questions (`-j 10`) followed by the actual question.
+
+```bash
+llama-parallel -m model.gguf -np 8 -ns 128 --top-k 1 -pps --junk 10 -c 16384
+```
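
For readers who want to vary the numbers, the command can be assembled from named values. This is only a minimal sketch: the variable names (`MODEL`, `N_CLIENTS`, `N_REQUESTS`, `N_JUNK`, `N_CTX`) are hypothetical and not part of llama.cpp, and the flags are exactly those used in the README example. The script prints the command instead of running it, so it works without the binary or a model file.

```bash
# Hypothetical variable names, chosen for illustration only.
MODEL=model.gguf   # path to the GGUF model file
N_CLIENTS=8        # -np: concurrent clients being simulated
N_REQUESTS=128     # -ns: total client requests to generate
N_JUNK=10          # --junk (-j): junk questions placed before the real one
N_CTX=16384        # -c: context size

# Store the command in the positional parameters so each argument stays intact.
set -- llama-parallel -m "$MODEL" -np "$N_CLIENTS" -ns "$N_REQUESTS" \
    --top-k 1 -pps --junk "$N_JUNK" -c "$N_CTX"

# Print the command for inspection.
echo "$*"
```

Once the binary and model are available, replacing the final `echo "$*"` with `"$@"` should run the command as assembled.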
