Commit 44a5527 (1 parent: 6e06a3e)

update readme

Signed-off-by: He, Xin3 <xin3.he@intel.com>

examples/pytorch/nlp/huggingface_models/language-modeling/quantization/auto_round/llama3/README.md

Lines changed: 2 additions & 3 deletions
@@ -123,7 +123,7 @@ CUDA_VISIBLE_DEVICES=0 python quantize.py \
     --low_gpu_mem_usage \
     --export_format auto_round \
     --export_path llama3.1-8B-MXFP4-MXFP8 \
-    --tasks mmlu piqa hellaswag gsm8k \
+    --tasks mmlu_llama piqa hellaswag gsm8k_llama \
     --eval_batch_size 32
 ```

@@ -221,8 +221,7 @@ CUDA_VISIBLE_DEVICES=0,1 bash run_benchmark.sh --model_path=Llama-3.1-70B-MXFP8

 The script automatically:
 - Detects available GPUs from `CUDA_VISIBLE_DEVICES` and sets `tensor_parallel_size` accordingly
-- Handles different `add_bos_token` settings for different tasks (GSM8K requires `False`, others use `True`)
-- Runs default tasks: `piqa,hellaswag,mmlu,gsm8k` with batch size 8
+- Runs default tasks: `piqa,hellaswag,mmlu_llama,gsm8k_llama` with batch size 8
 - Supports custom task selection and batch size adjustment

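The GPU-detection behavior described in the README (deriving `tensor_parallel_size` from `CUDA_VISIBLE_DEVICES`) could be sketched like this. This is a minimal illustration, not the actual logic of `run_benchmark.sh`; the helper name `infer_tensor_parallel_size` is hypothetical:

```python
import os

def infer_tensor_parallel_size(default: int = 1) -> int:
    # Count the device ids listed in CUDA_VISIBLE_DEVICES; fall back to
    # `default` when the variable is unset or empty.
    devices = os.environ.get("CUDA_VISIBLE_DEVICES", "").strip()
    if not devices:
        return default
    return sum(1 for d in devices.split(",") if d.strip())

os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
print(infer_tensor_parallel_size())  # two visible GPUs -> 2
```

With `CUDA_VISIBLE_DEVICES=0,1` this yields a tensor-parallel degree of 2, matching the two-GPU benchmark invocation shown in the diff above.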