v3.6.0

@Jintao-Huang released this 08 Jul 03:35 · 526 commits to main since this release

New Features

  1. Megatron-SWIFT:
    a. Support for more MoE model architectures, including: DeepseekV3ForCausalLM, Dots1ForCausalLM, and Ernie4_5_MoeForCausalLM. Training script reference: https://github.com/modelscope/ms-swift/tree/main/examples/train/megatron/moe
    b. Support for more Dense model architectures, including: MiMoForCausalLM, InternLM3ForCausalLM, and Ernie4_5_ForCausalLM. Training script reference: https://github.com/modelscope/ms-swift/tree/main/examples/train/megatron/dense
    c. DPO training supported. Training script reference: https://github.com/modelscope/ms-swift/tree/main/examples/train/megatron/rlhf/dpo
    d. FP8 training supported.
    e. More rope scaling types supported, including: default, linear, yarn, dynamic, longrope, llama3, etc.
    f. --test_convert_precision parameter optimized for easier testing of weight conversion precision between mcore and huggingface models.
  2. GRPO:
    a. GRPO multi-turn training refactored, supporting accelerated multi-turn inference with AsyncEngine. Documentation: https://swift.readthedocs.io/zh-cn/latest/Instruction/GRPO/DeveloperGuide/%E5%A4%9A%E8%BD%AE%E8%AE%AD%E7%BB%83.html
    b. The offload_model parameter now also offloads the reference model.
    c. Optimized GPU memory management under sleep_level and offload_model parameters.
    d. Added trainer_state as an input parameter to reward_funcs, making it easier to obtain the current and total training steps.
  3. Training:
    a. Reranker training supported. Training script reference: https://github.com/modelscope/ms-swift/tree/main/examples/train/reranker
    b. CPT/SFT/DPO/GRPO pure-text large model training supports ring-attention sequence length partitioning, reducing memory usage. Training script reference: https://github.com/modelscope/ms-swift/tree/main/examples/train/long_text/ring_attention
    c. Channel loss in CPT/SFT training is compatible with padding_free and packing. Thanks to the technical team at China Merchants Bank for their contribution.
    d. Optimized remove_unused_columns parameter. When set to False, extra dataset columns are passed to the Trainer for custom loss functions.
    e. The default value for split_dataset_ratio changed from 0.01 to 0, so the validation set is not split by default. You now need to manually set --split_dataset_ratio or --val_dataset.
    f. Fixed loss alignment issue between packing/padding_free for multimodal models. For details, see this PR: #4838
    g. Swanlab now supports Feishu (Lark Suite) notification callback after training is completed.
  4. RLHF:
    a. Pure-text and multimodal models support GKD training, with some scenarios supporting padding_free and packing. Training scripts:
    i. Large models: https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/gkd.sh
    ii. Multimodal large models: https://github.com/modelscope/ms-swift/blob/main/examples/train/multimodal/rlhf/gkd.sh
    b. Reward model training now supports the margin parameter. Documentation: https://swift.readthedocs.io/zh-cn/latest/Instruction/%E4%BA%BA%E7%B1%BB%E5%AF%B9%E9%BD%90.html#rm
  5. Full Pipeline:
    a. SGLang inference engine can be used to accelerate ms-swift inference/deployment/evaluation/ui modules, by setting --infer_backend sglang. Inference script reference: https://github.com/modelscope/ms-swift/tree/main/examples/infer/sglang
    b. FP8 quantization supported. Quantization script reference: https://github.com/modelscope/ms-swift/blob/main/examples/export/quantize/fp8.sh
  6. Web-UI:
    a. Supports SFT/RLHF/GRPO training on different Tab pages, and saves training command lines.
    b. Web-UI interface supports data sampling.
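
A minimal sketch for item 1f, using ms-swift's Python entry point for `swift export`. The model ID is a placeholder, and the argument names (to_mcore, test_convert_precision) are assumed to mirror the CLI flags; consult the Megatron-SWIFT docs for the authoritative invocation.

```python
# Sketch: export a Hugging Face checkpoint to mcore format and verify the
# conversion precision afterwards. Argument names are assumed to mirror the
# `swift export` CLI flags; the model ID is only a placeholder.
from swift.llm import export_main, ExportArguments

export_main(ExportArguments(
    model='Qwen/Qwen2.5-7B-Instruct',   # placeholder model
    to_mcore=True,                      # convert HF weights to mcore format
    torch_dtype='bfloat16',
    test_convert_precision=True,        # compare mcore vs. HF outputs after converting
    output_dir='Qwen2.5-7B-Instruct-mcore',
))
```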
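
For item 2d, a sketch of a custom reward function that consumes the new trainer_state argument. Registration through swift.plugin's orms dict follows the GRPO plugin examples; the class name and the annealing logic are illustrative assumptions, while global_step and max_steps are standard Hugging Face TrainerState fields.

```python
# Sketch: a GRPO reward function that reads the new trainer_state argument
# to anneal its behavior over training. The reward logic is illustrative.
from swift.plugin import ORM, orms

class StepAwareLengthReward(ORM):

    def __call__(self, completions, trainer_state=None, **kwargs):
        # trainer_state exposes training progress: the current step
        # (global_step) and the total number of steps (max_steps).
        progress = trainer_state.global_step / max(trainer_state.max_steps, 1)
        # Penalize long completions more strongly as training progresses.
        return [1.0 - progress * min(len(c), 1000) / 1000.0 for c in completions]

orms['step_aware_length'] = StepAwareLengthReward
```

Such a function would then be selected with --reward_funcs step_aware_length (with --external_plugins pointing at the file defining it), assuming the standard plugin mechanism.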
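
For item 3e, a sketch of restoring a validation split now that split_dataset_ratio defaults to 0; the model and dataset IDs are placeholders.

```python
# Sketch: split_dataset_ratio now defaults to 0, so request a validation
# split explicitly (or pass val_dataset) when one is needed.
from swift.llm import sft_main, TrainArguments

sft_main(TrainArguments(
    model='Qwen/Qwen2.5-7B-Instruct',               # placeholder model
    dataset=['AI-ModelScope/alpaca-gpt4-data-en'],  # placeholder dataset
    split_dataset_ratio=0.01,                       # re-enable the old 1% split
    # Alternatively: val_dataset=['path/or/id-of-val-dataset'],
))
```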
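
For item 4a, a sketch of launching GKD through the RLHF entry point. rlhf_type='gkd' and teacher_model follow the linked gkd.sh examples; the concrete student/teacher models and dataset are placeholders.

```python
# Sketch: GKD training via the RLHF entry point, distilling a student
# model from a larger teacher. Flags follow the linked gkd.sh examples.
from swift.llm import rlhf_main, RLHFArguments

rlhf_main(RLHFArguments(
    rlhf_type='gkd',
    model='Qwen/Qwen2.5-0.5B-Instruct',             # placeholder student
    teacher_model='Qwen/Qwen2.5-7B-Instruct',       # placeholder teacher
    dataset=['AI-ModelScope/alpaca-gpt4-data-en'],  # placeholder dataset
))
```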
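
For item 5a, a sketch of selecting the SGLang backend for inference; the same infer_backend switch applies to the deploy/eval/UI entry points.

```python
# Sketch: accelerate inference by selecting the SGLang engine as the
# backend (other backends include pt, vllm, and lmdeploy).
from swift.llm import infer_main, InferArguments

infer_main(InferArguments(
    model='Qwen/Qwen2.5-7B-Instruct',  # placeholder model
    infer_backend='sglang',
))
```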
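
For item 5b, a guess at the FP8 quantization invocation via the export entry point: quant_method='fp8' is an assumption inferred from the other quantization examples, so treat the linked fp8.sh as authoritative.

```python
# Sketch (assumption): FP8 quantization through the export entry point.
# The quant_method value is inferred, not verified; see fp8.sh.
from swift.llm import export_main, ExportArguments

export_main(ExportArguments(
    model='Qwen/Qwen2.5-7B-Instruct',  # placeholder model
    quant_method='fp8',
    output_dir='Qwen2.5-7B-Instruct-FP8',
))
```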

New Models

  1. Multimodal Models:
    a. ZhipuAI/GLM-4.1V-9B-Thinking series
    b. Kwai-Keye/Keye-VL-8B-Preview
    c. moonshotai/Kimi-VL-A3B-Thinking-2506
    d. google/gemma-3n-E2B-it series
  2. Pure Text Models:
    a. PaddlePaddle/ERNIE-4.5-21B-A3B-PT series
    b. rednote-hilab/dots.llm1.inst series
    c. Tencent-Hunyuan/Hunyuan-A13B-Instruct
    d. MiniMax/MiniMax-M1-80k series (inference)
    e. moonshotai/Kimi-Dev-72B
    f. cognitivecomputations/DeepSeek-R1-0528-AWQ


Full Changelog: v3.5.0...v3.6.0