Release v3.2.0 · modelscope/ms-swift

New Features

GRPO supports data-parallel sampling across multiple vLLM/LMDeploy instances, as well as asynchronous sampling. Usage details and the multimodal GRPO experiment records are linked from the original release notes.
swift deploy supports dynamic batching when infer_backend is set to pt. The streaming inference interface has changed (breaking change).
swift infer supports data parallelism when infer_backend is set to vllm or lmdeploy. An example is linked from the original release notes.
Added support for the Muon optimizer. Usage is linked from the original release notes.
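
As a rough illustration of what data-parallel sampling or inference means in the items above, the sketch below splits a batch of prompts across several workers and then restores the original order of the results. It is a generic, minimal sketch and not ms-swift's implementation; the function names shard_for_rank and gather_in_order are hypothetical, and the "inference" step is stood in for by a trivial string transform.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def shard_for_rank(items: Sequence[T], rank: int, world_size: int) -> List[T]:
    """Return the strided slice of `items` that worker `rank` processes.

    Hypothetical helper: worker r takes items r, r + world_size, ...
    """
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    return list(items[rank::world_size])


def gather_in_order(shards: Sequence[Sequence[T]]) -> List[T]:
    """Interleave per-rank results back into the original request order.

    Assumes the shards were produced by `shard_for_rank` and are listed
    in rank order.
    """
    world_size = len(shards)
    total = sum(len(shard) for shard in shards)
    merged: list = [None] * total
    for rank, shard in enumerate(shards):
        merged[rank::world_size] = list(shard)
    return merged


# Stand-in for running one inference engine replica per worker.
prompts = ["p0", "p1", "p2", "p3", "p4"]
world_size = 2
shards = [shard_for_rank(prompts, r, world_size) for r in range(world_size)]
outputs = [[p.upper() for p in shard] for shard in shards]
results = gather_in_order(outputs)
```

Strided sharding keeps the per-worker load within one item of each other, and the interleaving step means callers see results in the same order as the prompts they submitted.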

New Models

moonshotai/Moonlight-16B-A3B-Instruct
LLM-Research/Phi-4-mini-instruct, LLM-Research/Phi-4-multimodal-instruct
DeepSeek-V3-awq, deepseek-r1-awq
Baichuan-M1-14B-Instruct

New Datasets

Multi-modal GRPO:
- lmms-lab/multimodal-open-r1-8k-verified
- okwinds/clevr_cogen_a_train

What's Changed

fix setup.py by @Jintao-Huang in #3198
support vllm dp by @Jintao-Huang in #3201
update dataset & fix bugs by @Jintao-Huang in #3203
Support multiple vllms by @tastelikefeet in #3202
update distill docs by @tastelikefeet in #3216
compatible with trl0.16 by @hjh0119 in #3209
support r1 awq by @Jintao-Huang in #3206
fix grpo old_per_token_logps by @hjh0119 in #3220
Support the generation of JanusPro models by @DaozeZhang in #3218
Update the JanusPro-generation by @DaozeZhang in #3221
fix load args by @Jintao-Huang in #3226
update docs by @Jintao-Huang in #3230
Speed up GRPO by @tastelikefeet in #3229
fix docs zh by @Jintao-Huang in #3231
fix deepseek_vl2 by @Jintao-Huang in #3233
support moonlight by @Jintao-Huang in #3232
support muon optimizer by @Jintao-Huang in #3234
update docs by @Jintao-Huang in #3243
fix grpo npu vllm by @hjh0119 in #3242
fix grpo single card by @tastelikefeet in #3246
save val_dataset by @Jintao-Huang in #3248
fix grpo compat transformers==4.47.* by @Jintao-Huang in #3252
grpo_countdown & fix format reward by @mi804 in #3269
Support the base64 format of generated images for JanusPro by @DaozeZhang in #3265
Fix typos by @co63oc in #3266
compat lmdeploy 0.7 by @Jintao-Huang in #3256
fix lmdeploy by @Jintao-Huang in #3274
GRPO+LMDeploy 0.7 by @tastelikefeet in #3277
Support max memory by @Jintao-Huang in #3282
add lmdeploy dp shell by @Jintao-Huang in #3284
Support Baichuan-M1-14B-Instruct by @DaozeZhang in #3271
fix grpo top_k by @Jintao-Huang in #3293
fix lmdeploy mllm in grpo by @tastelikefeet in #3296
Update FAQ by @slin000111 in #3289
fix: error when uploading model to huggingface by @xavier-h-10 in #3297
add multimodal clevr exp by @mi804 in #3301
update docs by @Jintao-Huang in #3304
[refactor] patch_vllm by @Jintao-Huang in #3306
GRPO mllm script by @hjh0119 in #3305
[refactor & feat] support pt dynamic batch by @Jintao-Huang in #3278
Support ZeRO++ by @tastelikefeet in #3315
Revert pt engine batch infer by @Jintao-Huang in #3316
optimize model_type by @Jintao-Huang in #3318
Fix bugs & Update docs/datasets by @Jintao-Huang in #3322
fix grpo zero3 by @hjh0119 in #3324
fix grpo zero3 by @hjh0119 in #3326
compat vllm>=0.5.1 lmdeploy>=0.5.0 by @Jintao-Huang in #3332
update external plugins by @Jintao-Huang in #3334
fix generation_config by @Jintao-Huang in #3335
fix check_model error by @Jintao-Huang in #3336
update get_model_tokenizer_with_flash_attn by @Jintao-Huang in #3337
add geoqa grpo experiment by @mi804 in #3344
fix max_memory by @Jintao-Huang in #3347
support phi4-multimodal by @Jintao-Huang in #3350
fix: fix bugs in cosine reward of GRPO by @youyc22 in #3358
Remove entry including invalid ROADMAP link from English & Chinese documentation by @3manifold in #3357
update docs by @Jintao-Huang in #3349
Support the
update docs by @Jintao-Huang in #3365
add grpo openr1 multimodal experiment by @mi804 in #3368
fix swift app format by @Jintao-Huang in #3367

New Contributors

@xavier-h-10 made their first contribution in #3297
@youyc22 made their first contribution in #3358
@3manifold made their first contribution in #3357

Full Changelog: v3.1.1...v3.2.0
