@@ -3,12 +3,14 @@ We use **lm-eval** for evaluation. For LLaMA, we enabled `add_bos_token` and
 in [modeling_llama.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py#L52C1-L52C40)
 to stabilize accuracy during evaluation. All other settings follow the default configurations of AutoRound and lm-eval.
 
-| Qwen3-8B W2G64 | Avg. | arc_challenge | hellaswag | gsm8k | lambada_openai | mmlu | mmlupro | truthfulqa_mc1 | winogrande |
-| :-------------------| :--------:| :--------:| :--------:| :--------:| :--------:| :--------:| :--------:| :--------:| :--------:|
-| AutoRound | 0.4373 | 0.4019 | 0.4437 | 0.4215 | 0.4826 | 0.5474 | 0.2630 | 0.3072 | 0.6314 |
-| AutoRound+alg_ext | 0.4787 | 0.4275 | 0.4516 | 0.5944 | 0.5181 | 0.5773 | 0.2807 | 0.3305 | 0.6496 |
+| Qwen3-8B W2G64 | Avg. | arc_challenge | hellaswag | gsm8k | lambada_openai | mmlu | mmlupro | truthfulqa_mc1 | winogrande |
+| :------------------------------| :------:| :-------------:| :---------:| :------:| :--------------:| :------:| :-------:| :--------------:| :----------:|
+| AutoRound | 0.4373 | 0.4019 | 0.4437 | 0.4215 | 0.4826 | 0.5474 | 0.2630 | 0.3072 | 0.6314 |
+| AutoRound+alg_ext | 0.4787 | 0.4275 | 0.4516 | 0.5944 | 0.5181 | 0.5773 | 0.2807 | 0.3305 | 0.6496 |
+| AutoRoundBest+alg_ext lr 2e-3 | 0.4937 | 0.4505 | 0.474 | 0.5906 | 0.5556 | 0.6028 | 0.3127 | 0.3109 | 0.6527 |
 
-| Llama3.1-8B-Instruct W2G64 | Avg. | arc_challenge | hellaswag | gsm8k | lambada_openai | mmlu | mmlupro | truthfulqa_mc1 | winogrande |
-| :---------------------------| :--------:| :--------:| :--------:| :--------:| :--------:| :--------:| :--------:| :--------:| :--------:|
-| AutoRound | 0.3820 | 0.3635 | 0.4562 | 0.1622 | 0.5069 | 0.4411 | 0.1661 | 0.3207 | 0.6393 |
-| AutoRound+alg_ext | 0.4166 | 0.3712 | 0.4729 | 0.2039 | 0.5946 | 0.4981 | 0.2163 | 0.3011 | 0.6748 |
+| Llama3.1-8B-Instruct W2G64 | Avg. | arc_challenge | hellaswag | gsm8k | lambada_openai | mmlu | mmlupro | truthfulqa_mc1 | winogrande |
+| :------------------------------| :------:| :-------------:| :---------:| :------:| :--------------:| :------:| :-------:| :--------------:| :----------:|
+| AutoRound | 0.3820 | 0.3635 | 0.4562 | 0.1622 | 0.5069 | 0.4411 | 0.1661 | 0.3207 | 0.6393 |
+| AutoRound+alg_ext | 0.4166 | 0.3712 | 0.4729 | 0.2039 | 0.5946 | 0.4981 | 0.2163 | 0.3011 | 0.6748 |
+| AutoRoundBest+alg_ext lr 2e-3 | 0.4539 | 0.4138 | 0.4999 | 0.3071 | 0.6233 | 0.5279 | 0.2364 | 0.3231 | 0.6993 |
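For context, the snippet below is a minimal sketch of how such a run can be launched through lm-eval's Python API. The checkpoint path, batch size, and exact task identifiers are placeholders rather than the settings behind the numbers above; `add_bos_token` is the LLaMA-specific switch mentioned in the prose, and whether it is accepted this way may depend on the installed lm-eval version.

```python
import lm_eval
from lm_eval.models.huggingface import HFLM

# Load the (already quantized) checkpoint through lm-eval's Hugging Face backend.
# The path is a placeholder; add_bos_token=True is only enabled for LLaMA models.
lm = HFLM(
    pretrained="path/to/Llama-3.1-8B-Instruct-w2g64",
    add_bos_token=True,
    batch_size=16,
)

# Task names mirror the table headers; exact identifiers can differ slightly
# between lm-eval versions (e.g. mmlu_pro vs. mmlupro).
results = lm_eval.simple_evaluate(
    model=lm,
    tasks=[
        "arc_challenge", "hellaswag", "gsm8k", "lambada_openai",
        "mmlu", "mmlu_pro", "truthfulqa_mc1", "winogrande",
    ],
)
print(results["results"])
```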