v3.2.0

@Jintao-Huang Jintao-Huang released this 04 Mar 15:48

New Features

  1. GRPO supports data-parallel sampling across multiple vLLM/lmdeploy instances, as well as asynchronous sampling; refer to here. Records of multi-modal GRPO experiments can be found here.
  2. swift deploy supports dynamic batching when infer_backend is set to pt. The streaming inference interface has been modified (breaking change).
  3. swift infer supports data parallelism when infer_backend is set to vllm or lmdeploy; refer to here.
  4. The Muon optimizer is supported; refer to here.
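
As a rough illustration of the data parallelism mentioned in items 1 and 3, the sketch below splits a list of prompts into one shard per engine replica, runs the shards concurrently, and concatenates the outputs in the original order. It is a generic, minimal sketch and not ms-swift's implementation: `shard`, `data_parallel_infer`, and `fake_engine` are hypothetical names introduced here, and the stand-in engine only tags each prompt instead of calling vLLM or lmdeploy.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

# Hypothetical type: an "engine" maps a batch of prompts to a batch of outputs.
Engine = Callable[[List[str]], List[str]]


def shard(items: Sequence[str], world_size: int) -> List[List[str]]:
    """Split items into world_size contiguous shards whose sizes differ by at most one."""
    base, extra = divmod(len(items), world_size)
    shards, start = [], 0
    for rank in range(world_size):
        size = base + (1 if rank < extra else 0)
        shards.append(list(items[start:start + size]))
        start += size
    return shards


def data_parallel_infer(prompts: Sequence[str], engines: Sequence[Engine]) -> List[str]:
    """Run one shard per engine concurrently and return outputs in prompt order."""
    shards = shard(prompts, len(engines))
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = [pool.submit(engine, part) for engine, part in zip(engines, shards)]
        results = [future.result() for future in futures]
    # Shards are contiguous, so concatenating them by rank restores the input order.
    return [output for part in results for output in part]


def fake_engine(tag: str) -> Engine:
    """Stand-in for a real inference replica (assumption: purely illustrative)."""
    def run(batch: List[str]) -> List[str]:
        return [f"{tag}:{prompt}" for prompt in batch]
    return run


if __name__ == "__main__":
    replicas = [fake_engine("rank0"), fake_engine("rank1")]
    print(data_parallel_infer(["a", "b", "c", "d", "e"], replicas))
```

Contiguous shards keep the merge step trivial. A real deployment would typically run each replica in its own process on its own GPU rather than in threads.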

New Models

  1. moonshotai/Moonlight-16B-A3B-Instruct
  2. LLM-Research/Phi-4-mini-instruct, LLM-Research/Phi-4-multimodal-instruct
  3. DeepSeek-V3-awq, deepseek-r1-awq
  4. Baichuan-M1-14B-Instruct

New Datasets

  1. Multi-modal GRPO:
    • lmms-lab/multimodal-open-r1-8k-verified
    • okwinds/clevr_cogen_a_train

Full Changelog: v3.1.1...v3.2.0