Commit 11f5c5e

Update run_benchmark.sh
1 parent 1b6f6ad commit 11f5c5e

1 file changed (+1, -0)
  • examples/pytorch/multimodal-modeling/quantization/auto_round/llama4

examples/pytorch/multimodal-modeling/quantization/auto_round/llama4/run_benchmark.sh

Lines changed: 1 addition & 0 deletions
@@ -67,6 +67,7 @@ function run_benchmark {
         extra_model_args="max_model_len=66000,gpu_memory_utilization=0.7"
     else
         model="vllm"
+        extra_model_args="max_model_len=8192,max_num_seqs=1024,max_gen_toks=2048,gpu_memory_utilization=0.7"
     fi
 
     if [[ "${kv_cache_dtype}" == "fp8" ]]; then
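
The added line sets `extra_model_args` in the `else` branch, so the `vllm` path now carries its own comma-separated `key=value` settings. The rest of `run_benchmark.sh` is not shown in this commit, so how the string is consumed is an assumption; the sketch below only illustrates how such a string could be joined with a base argument and split back into pairs for inspection. The names `model_path`, `model_args`, and `pairs`, and the placeholder path, are hypothetical and do not come from the script.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: only `model` and `extra_model_args` come from the diff.
model="vllm"
extra_model_args="max_model_len=8192,max_num_seqs=1024,max_gen_toks=2048,gpu_memory_utilization=0.7"

# Placeholder path (assumption, not from the commit).
model_path="/path/to/quantized-model"

# Join a base argument with the extra settings into one comma-separated string.
model_args="pretrained=${model_path},${extra_model_args}"

# Split the string on commas and print each key/value pair for inspection.
IFS=',' read -ra pairs <<< "${model_args}"
for pair in "${pairs[@]}"; do
    printf '%s -> %s\n' "${pair%%=*}" "${pair#*=}"
done
```

Printing the pairs before launching a run is a cheap way to confirm that a branch such as the `else` case above does not leave `extra_model_args` empty.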
