v3.12.1

Jintao-Huang released this 08 Jan 02:29

f842150

What's Changed

[bugfix] fix glm4_7 agent_template by @Jintao-Huang in #7256
[bugfix] fix DeepSeek-OCR vllm deploy by @hjh0119 in #7258
[feat] add async reward function support for GRPO training by @hjh0119 in #7252
[model] support medgemma by @slin000111 in #7261
[megatron] Support MiniMaxAI/MiniMax-M2.1 by @Jintao-Huang in #7262
Support muonclip optimizer by @vx120 in #7191
add task_type by @slin000111 in #7265
[bugfix] fix mtp save by @Jintao-Huang in #7267
[feat] support megatron grpo entropy mask & log by @hjh0119 in #7263
[model] support iquestcoder by @Jintao-Huang in #7271
[bugfix] fix reward model adapters by @hjh0119 in #7293
Fix repeated inference in the multi-turn scheduler by @Simon-ss7 in #7279
[bugfix] auto-enable async engine for vLLM encode tasks by @hjh0119 in #7301
[bugfix] fix vllm_engine load_format by @Jintao-Huang in #7302
fix npu megatron cp by @addsubmuldiv in #7299
[misc] Remove unnecessary clone operations during weight synchronization by @hjh0119 in #7308
[model] support youtu-llm by @hjh0119 in #7306
[megatron] fix gpt_bridge oom by @Jintao-Huang in #7310
[misc] fix youtu agent template type-checking by @hjh0119 in #7311
[bugfix] Fix duplicate 'load_format' argument being passed in rollout by @hjh0119 in #7312

New Contributors

@Simon-ss7 made their first contribution in #7279

Full Changelog: v3.12.0...v3.12.1

