@@ -29,20 +29,35 @@ lighteval vllm \
     "leaderboard|truthfulqa:mc|0|0"
 ```
 
-Available arguments for `vllm` can be found in the `VLLMModelConfig`:
-
-- **pretrained** (str): HuggingFace Hub model ID name or the path to a pre-trained model to load.
-- **gpu_memory_utilisation** (float): The fraction of GPU memory to use.
-- **revision** (str): The revision of the model.
-- **dtype** (str, None): The data type to use for the model.
-- **tensor_parallel_size** (int): The number of tensor parallel units to use.
-- **data_parallel_size** (int): The number of data parallel units to use.
-- **max_model_length** (int): The maximum length of the model.
-- **swap_space** (int): The CPU swap space size (GiB) per GPU.
-- **seed** (int): The seed to use for the model.
-- **trust_remote_code** (bool): Whether to trust remote code during model loading.
-- **add_special_tokens** (bool): Whether to add special tokens to the input sequences.
-- **multichoice_continuations_start_space** (bool): Whether to add a space at the start of each continuation in multichoice generation.
+For more advanced configurations, you can use a config file for the model.
+An example of a config file is shown below and can be found at `examples/model_configs/vllm_model_config.yaml`.
+
+```bash
+lighteval vllm \
+    "examples/model_configs/vllm_model_config.yaml" \
+    "leaderboard|truthfulqa:mc|0|0"
+```
+
+```yaml
+model: # Model specific parameters
+  base_params:
+    model_args: "pretrained=HuggingFaceTB/SmolLM-1.7B,revision=main,dtype=bfloat16" # Model args that you would pass in the command line
+  generation: # Generation specific parameters
+    temperature: 0.3
+    early_stopping: 1
+    repetition_penalty: 1.0
+    frequency_penalty: 0.0
+    length_penalty: 0.0
+    presence_penalty: 0.0
+    max_new_tokens: 100
+    min_new_tokens: 1
+    seed: 42
+    stop_tokens: null
+    top_k: 0
+    min_p: 0.0
+    top_p: 0.9
+    truncate_prompt: false
+```
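The `model_args` entry under `base_params` is the same comma-separated `key=value` string you would pass on the command line. A minimal pure-Python sketch of how such a string splits into keyword arguments (the `parse_model_args` helper is illustrative only, not lighteval's internal parser):

```python
def parse_model_args(model_args: str) -> dict:
    """Split a comma-separated "key=value" string into a dict.

    Illustrative sketch only; lighteval's actual parsing may differ.
    """
    # Split on commas first, then on the first "=" of each pair,
    # so values themselves may contain "=".
    return dict(pair.split("=", 1) for pair in model_args.split(","))


args = parse_model_args("pretrained=HuggingFaceTB/SmolLM-1.7B,revision=main,dtype=bfloat16")
print(args)
# {'pretrained': 'HuggingFaceTB/SmolLM-1.7B', 'revision': 'main', 'dtype': 'bfloat16'}
```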
 
 > [!WARNING]
 > In the case of OOM issues, you might need to reduce the context size of the