v3.2.1
New Features
- GRPO supports vLLM's tensor parallel mode. Examples can be found here.
- GRPO supports co-locate mode with offloading of both the optimizer and the model, as well as loading weights in batches and merging LoRA, which saves GPU memory and enables training a 72B model on four A100 GPUs. Examples can be found here.
- GRPO supports code ORM (an outcome reward for generated code; see the sketch after this list). Best practices can be found here.
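
A code ORM scores a completion by running the generated code against test cases and returning a binary outcome reward. The following is a minimal, self-contained sketch of that idea only; the function name `code_reward`, its signature, and the plain `subprocess` execution (no sandboxing) are illustrative assumptions, not ms-swift's actual reward-plugin API.

```python
# Sketch of a code outcome reward (ORM): execute each completion together with
# its test cases in a subprocess and return 1.0 on success, 0.0 otherwise.
# Names and signature are hypothetical, not ms-swift's real interface.
import subprocess
import sys
from typing import List


def code_reward(completions: List[str], test_cases: List[str], timeout: float = 10.0) -> List[float]:
    """Return 1.0 for completions whose code passes its test case, else 0.0."""
    rewards = []
    for code, tests in zip(completions, test_cases):
        program = code + "\n" + tests  # append the asserts to the generated code
        try:
            result = subprocess.run(
                [sys.executable, "-c", program],
                capture_output=True,
                timeout=timeout,
            )
            rewards.append(1.0 if result.returncode == 0 else 0.0)
        except subprocess.TimeoutExpired:
            rewards.append(0.0)  # hanging code counts as a failure
    return rewards


if __name__ == "__main__":
    completion = "def add(a, b):\n    return a + b"
    tests = "assert add(1, 2) == 3"
    print(code_reward([completion], [tests]))  # -> [1.0]
```

In practice such a reward would be registered with the GRPO trainer and run inside a sandbox rather than the raw interpreter shown here.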
New Models
- Qwen/QwQ-32B series
- inclusionAI/Ling-lite series
What's Changed
- Support vllm LLMEngine by @Jintao-Huang in #3370
- update publish workflows by @Jintao-Huang in #3374
- support ling by @Jintao-Huang in #3379
- Support mp mode and hybrid mode of GRPO by @tastelikefeet in #3381
- fix name by @tastelikefeet in #3382
- fix web-ui infer by @Jintao-Huang in #3384
- fix bugs by @tastelikefeet in #3385
- fix bugs by @Jintao-Huang in #3386
- support Qwen/QwQ-32B by @Jintao-Huang in #3388
- support qwq-awq by @Jintao-Huang in #3391
- support lmdeploy qwen2_5_vl by @Jintao-Huang in #3394
- update infer_save by @Jintao-Huang in #3400
- update requirements by @Jintao-Huang in #3403
- fix ollama export by @Jintao-Huang in #3406
- Fix grpo engine by @tastelikefeet in #3412
- fix infer_stream by @Jintao-Huang in #3413
- Fix some comments, add dlc script by @tastelikefeet in #3419
- add comments and docs by @tastelikefeet in #3424
- fix issue 1663 by @Jintao-Huang in #3417
- Support GRPO model and optimizer offload, and split loading model by @tastelikefeet in #3427
- update wechat by @tastelikefeet in #3430
- Fix vllm random by @tastelikefeet in #3437
- fix seed by @Jintao-Huang in #3438
- fix_base_deploy by @Jintao-Huang in #3442
- fix GRPO device mismatch by @hjh0119 in #3440
- compat vllm==0.5.1 by @Jintao-Huang in #3444
- fix grpo multimodal doc by @mi804 in #3449
- support grpo code orm by @hjh0119 in #3431
- fix GRPO seed by @Jintao-Huang in #3458
- fix grpo multi nodes by @hjh0119 in #3462
- Fix tensor parallel hang by @tastelikefeet in #3464
- fix grpo trainer zero3 always gather parameters by @tcye in #3467
- fix grpo temperature inconsistency by @hjh0119 in #3468
- fix grad_norm nan by @Jintao-Huang in #3465
- fix grad_norm by @Jintao-Huang in #3469
- update minimax by @Jintao-Huang in #3471
- Support 72b script with 4 gpus by @tastelikefeet in #3472
- refactor packing by @Jintao-Huang in #3457
- Fix some docs by @tastelikefeet in #3475
- fix grpo ddp hang by @hjh0119 in #3476
- fix moe quant by @Jintao-Huang in #3478
- Delete duplicate parameters in train_72b_4gpu.sh by @Marquis03 in #3479
- fix image by @tastelikefeet in #3480
- fix infer gptq internvl2 by @Jintao-Huang in #3481
- Resume sample by @BC-A in #3460
- fix qwen2_vl flash_attn deepspeed by @Jintao-Huang in #3484
- Fix seed of tp=1 by @tastelikefeet in #3486
- fix use_cache by @Jintao-Huang in #3487
- Fix qwen2 5 vl grounding by @Jintao-Huang in #3491
- fix ovis2 device_map by @Jintao-Huang in #3496
- fix template.decode by @Jintao-Huang in #3497
New Contributors
- @tcye made their first contribution in #3467
- @Marquis03 made their first contribution in #3479
- @BC-A made their first contribution in #3460
Full Changelog: v3.2.0...v3.2.1