
Commit bfa6076: Fix vLLM doc (#912)
1 parent: 7693a0f


docs/source/use-vllm-as-backend.mdx

Lines changed: 8 additions & 8 deletions
@@ -9,8 +9,8 @@ To use, simply change the `model_args` to reflect the arguments you want to pass
 
 ```bash
 lighteval vllm \
-    "model_name=HuggingFaceH4/zephyr-7b-beta,dtype=float16" \
-    "leaderboard|truthfulqa:mc|0|0"
+    "model_name=HuggingFaceH4/zephyr-7b-beta" \
+    "extended|ifeval|0|0"
 ```
 
 `vllm` is able to distribute the model across multiple GPUs using data
@@ -21,16 +21,16 @@ For example if you have 4 GPUs you can split it across using `tensor_parallelism`
 
 ```bash
 export VLLM_WORKER_MULTIPROC_METHOD=spawn && lighteval vllm \
-    "model_name=HuggingFaceH4/zephyr-7b-beta,dtype=float16,tensor_parallel_size=4" \
-    "leaderboard|truthfulqa:mc|0|0"
+    "model_name=HuggingFaceH4/zephyr-7b-beta,tensor_parallel_size=4" \
+    "extended|ifeval|0|0"
 ```
 
 Or, if your model fits on a single GPU, you can use `data_parallelism` to speed up the evaluation:
 
 ```bash
-lighteval vllm \
-    "model_name=HuggingFaceH4/zephyr-7b-beta,dtype=float16,data_parallel_size=4" \
-    "leaderboard|truthfulqa:mc|0|0"
+export VLLM_WORKER_MULTIPROC_METHOD=spawn && lighteval vllm \
+    "model_name=HuggingFaceH4/zephyr-7b-beta,data_parallel_size=4" \
+    "extended|ifeval|0|0"
 ```
 
 ## Use a config file
@@ -41,7 +41,7 @@ An example of a config file is shown below and can be found at `examples/model_c
 ```bash
 lighteval vllm \
     "examples/model_configs/vllm_model_config.yaml" \
-    "leaderboard|truthfulqa:mc|0|0"
+    "extended|ifeval|0|0"
 ```
 
 ```yaml
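
The last hunk ends at the opening of the YAML example, so the config contents themselves are not part of this diff. As a hypothetical sketch only, a config along these lines could carry the same `model_args` keys used in the CLI examples above (`model_name`, `dtype`, `tensor_parallel_size`, `data_parallel_size`); the layout and key names here are assumptions, and the actual `examples/model_configs/vllm_model_config.yaml` in the repository may differ.

```yaml
# Hypothetical sketch only; layout and key names are assumptions,
# not the verified contents of examples/model_configs/vllm_model_config.yaml.
model_parameters:
  model_name: "HuggingFaceH4/zephyr-7b-beta"  # model used in the CLI examples above
  dtype: "float16"          # dtype can be set here instead of on the command line
  tensor_parallel_size: 1   # >1 splits the model across GPUs
  data_parallel_size: 1     # >1 replicates the model to parallelize over data
```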
