Release v3.3.1 · modelscope/ms-swift

English Version

New Features

The agent training and deployment module introduces agent templates, with more than 10 available, including hermes, glm4_0414, and llama4. With these templates, the same agent dataset can be reused when switching training between different models. For details, see the documentation.
GRPO training now supports calling an external vLLM server, which allows GPU memory to be allocated more flexibly between training and deployment. See the example training script for usage.
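
To illustrate the idea behind agent templates, here is a minimal sketch of how one model-agnostic tool-call record might be rendered into different model-specific text formats. It is not the ms-swift implementation: the record layout, the function names, and the `AGENT_TEMPLATES` registry are all hypothetical, and the hermes-style and ReAct-style strings are simplified approximations of those formats.

```python
import json
from typing import Callable, Dict

# A single tool call in a model-agnostic form (hypothetical record layout).
ToolCall = Dict[str, object]


def render_hermes(call: ToolCall) -> str:
    """Hermes-style rendering: a JSON object wrapped in <tool_call> tags."""
    payload = json.dumps({"name": call["name"], "arguments": call["arguments"]})
    return f"<tool_call>\n{payload}\n</tool_call>"


def render_react(call: ToolCall) -> str:
    """ReAct-style rendering: plain-text Action / Action Input lines."""
    return f"Action: {call['name']}\nAction Input: {json.dumps(call['arguments'])}"


# Hypothetical registry mapping a template name to its renderer.
AGENT_TEMPLATES: Dict[str, Callable[[ToolCall], str]] = {
    "hermes": render_hermes,
    "react": render_react,
}


def render_tool_call(call: ToolCall, agent_template: str) -> str:
    """Render the same record with whichever template the target model expects."""
    try:
        renderer = AGENT_TEMPLATES[agent_template]
    except KeyError:
        raise ValueError(f"unknown agent template: {agent_template!r}") from None
    return renderer(call)


# The dataset record stays the same; only the template name changes per model.
call = {"name": "get_weather", "arguments": {"city": "Beijing"}}
for name in AGENT_TEMPLATES:
    print(f"--- {name} ---")
    print(render_tool_call(call, name))
```

Because the dataset stores only the neutral record, switching the target model in this sketch means changing the template name rather than rewriting the data.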

New Models

OpenGVLab/InternVL3-1B series
moonshotai/Kimi-VL-A3B-Instruct series
ZhipuAI/GLM-4-9B-0414, ZhipuAI/GLM-Z1-9B-0414 series

What's Changed

Fix sampling and rft by @tastelikefeet in #3847
Fix incorrect retry count check in LazyLLMDataset.getitem by @IamLihua in #3845
support internvl3 by @hjh0119 in #3842
fix grpo filter overlong by @hjh0119 in #3844
dapo-bug by @Evilxya in #3846
support agent packing by @Jintao-Huang in #3853
Fix internvl2.5/3 deepspeed packing by @Jintao-Huang in #3855
fix multimodal target_modules by @Jintao-Huang in #3856
Fix multimodal target modules by @Jintao-Huang in #3858
Update FAQ by @slin000111 in #3841
fix grpo completion length equal zero by @hjh0119 in #3857
support val_dataset_shuffle by @Jintao-Huang in #3860
Update swift docker by @Jintao-Huang in #3866
fix citest & minimax link by @Jintao-Huang in #3868
fix grpo save checkpoint by @hjh0119 in #3865
support glm4-z1 by @hjh0119 in #3862
add paper link by @tastelikefeet in #3886
refactor mm target_regex (compat peft/vllm) by @Jintao-Huang in #3879
Support kimi-vl by @Jintao-Huang in #3884
Fix glm4 z1 by @Jintao-Huang in #3889
fix bugs by @Jintao-Huang in #3893
fix typealias to be compatible with Python 3.9 by @hjh0119 in #3895
Fix ui by @tastelikefeet in #3903
Fix fp16 bf16 by @Jintao-Huang in #3909
add rm center_rewards_coefficient argument by @hjh0119 in #3917
revert swift_from_pretrained by @Jintao-Huang in #3914
fix grpo doc by @hjh0119 in #3920
update qwen2_5_omni by @Jintao-Huang in #3908
Support qwen3 by @Jintao-Huang in #3945
Decouple vLLM engine and GRPOTrainer. by @hjh0119 in #3911
Refactor Agent Template by @Jintao-Huang in #3918
update docs by @Jintao-Huang in #3961
fix bugs by @Jintao-Huang in #3962
Support hermes loss_scale by @Jintao-Huang in #3963
fix parse tools by @Jintao-Huang in #3975
Update unsloth compatibility by @tastelikefeet in #3970
Fix qwen2.5-omni use_audio_in_video by @Jintao-Huang in #3987
Fix web-ui by @tastelikefeet in #3997
fix get_toolcall & fix ci by @Jintao-Huang in #3999
fix bugs by @Jintao-Huang in #4001
fix seq_cls by @Jintao-Huang in #4002

New Contributors

@IamLihua made their first contribution in #3845
@Evilxya made their first contribution in #3846

Full Changelog: v3.3.0...v3.3.1
