v3.12.1
What's Changed
- [bugfix] fix glm4_7 agent_template by @Jintao-Huang in #7256
- [bugfix] fix DeepSeek-OCR vLLM deploy by @hjh0119 in #7258
- [feat] add async reward function support for GRPO training by @hjh0119 in #7252
- [model] support medgemma by @slin000111 in #7261
- [megatron] Support MiniMaxAI/MiniMax-M2.1 by @Jintao-Huang in #7262
- [feat] support muonclip optimizer by @vx120 in #7191
- add task_type by @slin000111 in #7265
- [bugfix] fix MTP save by @Jintao-Huang in #7267
- [feat] support megatron grpo entropy mask & log by @hjh0119 in #7263
- [model] support iquestcoder by @Jintao-Huang in #7271
- [bugfix] fix reward model adapters by @hjh0119 in #7293
- [bugfix] fix repeated inference in the multi-turn scheduler by @Simon-ss7 in #7279
- [bugfix] auto-enable async engine for vLLM encode tasks by @hjh0119 in #7301
- [bugfix] fix vllm_engine load_format by @Jintao-Huang in #7302
- [bugfix] fix NPU megatron CP by @addsubmuldiv in #7299
- [misc] Remove unnecessary clone operations during weight synchronization by @hjh0119 in #7308
- [model] support youtu-llm by @hjh0119 in #7306
- [megatron] fix gpt_bridge OOM by @Jintao-Huang in #7310
- [misc] fix youtu agent template type-checking by @hjh0119 in #7311
- [bugfix] Fix duplicate 'load_format' argument being passed in rollout by @hjh0119 in #7312
New Contributors
- @Simon-ss7 made their first contribution in #7279
Full Changelog: v3.12.0...v3.12.1