Commit 6ed696d

change doc
1 parent b4c2d77

File tree

1 file changed: +29 −14 lines


docs/source/use-vllm-as-backend.mdx

Lines changed: 29 additions & 14 deletions
@@ -29,20 +29,35 @@ lighteval vllm \
     "leaderboard|truthfulqa:mc|0|0"
 ```
 
-Available arguments for `vllm` can be found in the `VLLMModelConfig`:
-
-- **pretrained** (str): HuggingFace Hub model ID name or the path to a pre-trained model to load.
-- **gpu_memory_utilisation** (float): The fraction of GPU memory to use.
-- **revision** (str): The revision of the model.
-- **dtype** (str, None): The data type to use for the model.
-- **tensor_parallel_size** (int): The number of tensor parallel units to use.
-- **data_parallel_size** (int): The number of data parallel units to use.
-- **max_model_length** (int): The maximum length of the model.
-- **swap_space** (int): The CPU swap space size (GiB) per GPU.
-- **seed** (int): The seed to use for the model.
-- **trust_remote_code** (bool): Whether to trust remote code during model loading.
-- **add_special_tokens** (bool): Whether to add special tokens to the input sequences.
-- **multichoice_continuations_start_space** (bool): Whether to add a space at the start of each continuation in multichoice generation.
+For more advanced configurations, you can use a config file for the model.
+An example of a config file is shown below and can be found at `examples/model_configs/vllm_model_config.yaml`.
+
+```bash
+lighteval vllm \
+    "examples/model_configs/vllm_model_config.yaml" \
+    "leaderboard|truthfulqa:mc|0|0"
+```
+
+```yaml
+model: # Model specific parameters
+  base_params:
+    model_args: "pretrained=HuggingFaceTB/SmolLM-1.7B,revision=main,dtype=bfloat16" # Model args that you would pass in the command line
+  generation: # Generation specific parameters
+    temperature: 0.3
+    early_stopping: 1
+    repetition_penalty: 1.0
+    frequency_penalty: 0.0
+    length_penalty: 0.0
+    presence_penalty: 0.0
+    max_new_tokens: 100
+    min_new_tokens: 1
+    seed: 42
+    stop_tokens: null
+    top_k: 0
+    min_p: 0.0
+    top_p: 0.9
+    truncate_prompt: false
+```
 
 > [!WARNING]
 > In the case of OOM issues, you might need to reduce the context size of the

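The diff above replaces the flat list of `VLLMModelConfig` keyword arguments with a single comma-separated `model_args` string in the config file. As an illustration of how such a string decomposes back into the keyword arguments it replaces, here is a minimal Python sketch; `parse_model_args` is a hypothetical helper for this note, not lighteval's actual parser (which may also coerce types such as ints and bools):

```python
def parse_model_args(model_args: str) -> dict:
    """Split a comma-separated "key=value" string into a dict.

    Illustrative sketch only; lighteval's real parsing may differ
    (e.g. type coercion and validation against VLLMModelConfig).
    """
    pairs = {}
    for item in model_args.split(","):
        key, _, value = item.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs


args = parse_model_args(
    "pretrained=HuggingFaceTB/SmolLM-1.7B,revision=main,dtype=bfloat16"
)
print(args["pretrained"])  # HuggingFaceTB/SmolLM-1.7B
```

Each resulting key (`pretrained`, `revision`, `dtype`, …) corresponds to one of the arguments that the removed bullet list documented.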