v3.0.3
English Version
New Features
- Support the SequenceClassification architecture for multi-modal large models, enabling multi-modal classification tasks; see here.
- Support reward model training for multi-modal large models.
 
New Models
- Shanghai_AI_Laboratory/internlm3-8b-instruct
- OpenBMB/MiniCPM-o-2_6
- deepseek-ai/DeepSeek-R1, deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B series
- bytedance-research/Valley-Eagle-7B
- LLM-Research/phi-4
- Qwen/Qwen2.5-Math-PRM-7B, Qwen/Qwen2.5-Math-PRM-72B
- MiniMaxAI/MiniMax-Text-01, MiniMaxAI/MiniMax-VL-01
 
What's Changed
- update qlora shell by @Jintao-Huang in #2880
- fix docs by @Jintao-Huang in #2882
- support multi round dpo by @tastelikefeet in #2884
- Support infer n parameter by @tastelikefeet in #2893
- Fix qwen vl eval by @Jintao-Huang in #2892
- fix infer engine by @Jintao-Huang in #2898
- Add phi4 by @tastelikefeet in #2895
- fix link & bug by @Jintao-Huang in #2902
- update video infer examples by @Jintao-Huang in #2840
- Sampler by @tastelikefeet in #2905
- Fix a bug when lint code by @tastelikefeet in #2906
- Fix bugs by @Jintao-Huang in #2907
- update plugin doc by @tastelikefeet in #2908
- fix vllm tp stuck by @Jintao-Huang in #2909
- fix replace_video2image by @Jintao-Huang in #2913
- Fix read file mode by @tastelikefeet in #2915
- fix inspect init by @Jintao-Huang in #2916
- Update rm by @tastelikefeet in #2919
- Add internlm3 dense by @HIT-cwh in #2920
- internlm3 lint pass by @Jintao-Huang in #2923
- Fix web ui log by @tastelikefeet in #2924
- Support Valley by @lxline in #2921
- support minicpm-o by @Jintao-Huang in #2918
- fix vllm tp block by @Jintao-Huang in #2927
- update docs by @Jintao-Huang in #2929
- Support first prms by @tastelikefeet in #2926
- fix Valley by @lxline in #2931
- Support mllm seq_cls/rm by @Jintao-Huang in #2934
- fix bugs by @Jintao-Huang in #2938
- support deepseek-ai/DeepSeek-R1 by @Jintao-Huang in #2940
- Fix quant template by @Jintao-Huang in #2942
- Support minimax by @tastelikefeet in #2943
- Fix mllm seq cls by @Jintao-Huang in #2945
- support deepseek_r1_distill by @Jintao-Huang in #2946
- fix demo_hf by @Jintao-Huang in #2951
- fix infer_stream by @Jintao-Huang in #2952
- fix citest by @Jintao-Huang in #2953
- fix bugs by @Jintao-Huang in #2954
- update requirements by @Jintao-Huang in #2957
- update web-ui images by @tastelikefeet in #2958
- update quant_mllm shell by @Jintao-Huang in #2959
- fix max_length error print by @Jintao-Huang in #2960
- fix seq_cls patcher by @Jintao-Huang in #2963
- ppo compat transformers>=4.47.* by @Jintao-Huang in #2964
 
Full Changelog: v3.0.2...v3.0.3