
Conversation

@ynankani

  • Add mixed precision nvModelOpt recipes for Llama-3.2-1B-Instruct
  • Add mixed precision nvModelOpt recipes for DeepSeek-R1-Distill-Qwen-1.5B
  • Add mixed precision nvModelOpt recipes for Qwen-Qwen2.5-1.5B-Instruct

Observed improvements in MMLU and perplexity scores for the above models with mixed-precision (INT4+INT8) quantization compared to standard INT4 quantization; an illustrative sketch of the mechanism follows the tables below.

**MMLU**

| Model | FP16-MB | Mixed AWQ-MO | Mixed RTN-MO | Pure INT4 AWQ-MO | Pure INT4 RTN-MO |
| --- | --- | --- | --- | --- | --- |
| DeepSeek R1 Distill Qwen 1.5B | 36.60% | 34.80% | 33.90% | 33.10% | 32.40% |
| Llama 3.2 1B Instruct | 47.30% | 44.40% | 44.70% | 43.20% | 39.90% |
| Qwen 2.5 1.5B Instruct | 60.00% | 57.50% | 57.50% | 56.60% | 56.70% |

**Perplexity (isl=1024, stride=512)**

| Model | FP16-MB | Mixed AWQ-MO | Mixed RTN-MO | Pure INT4 AWQ-MO | Pure INT4 RTN-MO |
| --- | --- | --- | --- | --- | --- |
| DeepSeek R1 Distill Qwen 1.5B | 39.447 | 41.699 | 44.332 | 44.213 | 46.304 |
| Llama 3.2 1B Instruct | 12.631 | 13.852 | 14.176 | 14.549 | 16.9 |
| Qwen 2.5 1.5B Instruct | 9.216 | 10.084 | 10.338 | 10.495 | 10.933 |
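
For reference, mixed precision in ModelOpt is expressed by overriding per-layer quantizer settings on top of a stock config. Below is a minimal sketch assuming the `nvidia-modelopt` Python API (`mtq.quantize`, `mtq.INT4_AWQ_CFG`); the `*down_proj*` pattern, the choice of layers kept at INT8, and the calibration data are hypothetical, not the exact recipes added in this PR.

```python
# Illustrative sketch of mixed INT4+INT8 weight-only quantization with
# NVIDIA ModelOpt. Layer patterns and calibration data are assumptions,
# not the recipes added in this PR.
import copy

import modelopt.torch.quantization as mtq
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.2-1B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Start from ModelOpt's stock INT4 AWQ config, then override
# quantization-sensitive modules to INT8 per-channel weights
# (hypothetical pattern; a real recipe would pick layers based on
# sensitivity analysis).
cfg = copy.deepcopy(mtq.INT4_AWQ_CFG)
cfg["quant_cfg"]["*down_proj*weight_quantizer"] = {"num_bits": 8, "axis": 0}

def forward_loop(m):
    # Single-sample calibration pass for brevity; a real recipe would
    # iterate over a calibration dataset here.
    inputs = tokenizer("Hello, world!", return_tensors="pt")
    m(**inputs)

model = mtq.quantize(model, cfg, forward_loop)
```

The RTN variants in the tables would correspond to plain max-calibration quantization without the AWQ algorithm; the per-layer override mechanism for mixing precisions is the same.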

@ynankani
Author

Please review and merge.

CC @devang-ml

devang-ml requested a review from shaahji November 8, 2025 00:26
@shaahji
Contributor

shaahji commented Nov 10, 2025

Please fix the failing pre-commit checks. Also, update the branch to the latest.
