# v2.0.0

## New Features

- Support for peft 0.10.x, with the default value of the `tuner_backend` parameter changed to `peft`. The peft interface is dynamically patched to support parameters such as `lora_dtype`.
- Support for vLLM+LoRA inference.
- Refactored and updated the README file.
- Added English versions of the documentation. Currently, all documents have both English and Chinese versions.
- Support for training 70B models using FSDP+QLoRA on dual 24GB GPUs. Script available at: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/llama2_70b_chat/qlora_fsdp/sft.sh
- Support for training agents and using the ModelScopeAgent framework. Documentation available at: https://github.com/modelscope/swift/blob/main/docs/source/LLM/Agent%E5%BE%AE%E8%B0%83%E6%9C%80%E4%BD%B3%E5%AE%9E%E8%B7%B5.md
- Support for model evaluation and benchmarking. Documentation available at: https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM%E8%AF%84%E6%B5%8B%E6%96%87%E6%A1%A3.md
- Support for multi-task experiment management. Documentation available at: https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM%E5%AE%9E%E9%AA%8C%E6%96%87%E6%A1%A3.md
- Support for GaLore training.
- Support for training and inference of AQLM and AWQ quantized models.
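Of the features above, GaLore has a compact algorithmic core worth illustrating: it reduces optimizer-state memory by projecting each gradient matrix onto a low-rank subspace (obtained via SVD), running the optimizer step in that subspace, and projecting the update back. Below is a minimal NumPy sketch of that idea, with plain SGD standing in for the Adam-style optimizer used in practice; the function names are hypothetical and not part of Swift's API.

```python
import numpy as np

def galore_project(grad, rank):
    # Core GaLore idea: use the top-`rank` left singular vectors of the
    # gradient as a projection basis into a low-rank subspace.
    U, _, _ = np.linalg.svd(grad, full_matrices=False)
    P = U[:, :rank]             # projection matrix, shape (m, rank)
    return P, P.T @ grad        # compressed gradient, shape (rank, n)

def galore_step(weight, grad, rank=2, lr=0.1):
    P, low_rank_grad = galore_project(grad, rank)
    # In real GaLore the optimizer state (e.g. Adam moments) lives in the
    # low-rank space; plain SGD is used here for brevity.
    update = P @ low_rank_grad  # project the update back to full shape
    return weight - lr * update

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))   # toy weight matrix
G = rng.standard_normal((8, 4))   # toy gradient
W_new = galore_step(W, G, rank=2)
```

The memory saving comes from storing optimizer state of shape `(rank, n)` instead of `(m, n)`, which matters for large weight matrices.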
## New Models
- MAMBA series models. Script available at: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/mamba-1.4b/lora/sft.sh
- DeepSeek VL series models. Documentation available at: https://github.com/modelscope/swift/blob/main/docs/source_en/Multi-Modal/deepseek-vl-best-practice.md
- LLAVA series models. Documentation available at: https://github.com/modelscope/swift/blob/main/docs/source/Multi-Modal/llava%E6%9C%80%E4%BD%B3%E5%AE%9E%E8%B7%B5.md
- TeleChat models. Script available at: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/telechat_12b/lora/sft.sh
- Grok-1 models. Documentation available at: https://github.com/modelscope/swift/blob/main/docs/source_en/LLM/Grok-1-best-practice.md
- Qwen 1.5 MoE series models for training and inference.
- dbrx models for training and inference. Script available at: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/dbrx-instruct/lora_mp/sft.sh
- Mengzi3 models for training and inference. Script available at: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/mengzi3_13b_base/lora_ddp_ds/sft.sh
- Xverse MoE models for training and inference. Script available at: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/xverse_moe_a4_2b/lora/sft.sh
- c4ai-command-r series models for training and inference.
- MiniCPM series models for training and inference. Script available at: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/minicpm_moe_8x2b/lora_ddp/sft.sh
- Mixtral-8x22B-v0.1 models for training and inference. Script available at: https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/mixtral_moe_8x22b_v1/lora_ddp_ds/sft.sh
## New Datasets

- Support for the Ruozhiba dataset: https://github.com/modelscope/swift/blob/main/docs/source_en/LLM/Supported-models-datasets.md
## What's Changed
- Fix RsLoRA by @tastelikefeet in #567
- Fix yi-vl merge lora by @Jintao-Huang in #568
- Add doc for tuner module by @tastelikefeet in #571
- update agent documentation by @tastelikefeet in #572
- Update agent doc to fix some conflicts by @tastelikefeet in #573
- support vllm lora by @Jintao-Huang in #565
- Support llava by @Jintao-Huang in #577
- fix app-ui max_length is None by @Jintao-Huang in #580
- support `train_dataset_mix_ds` using custom_local_path by @Jintao-Huang in #582
- Fix LRScheduler by @tastelikefeet in #586
- compat with transformers==4.39 by @Jintao-Huang in #584
- Fix weight saving by @tastelikefeet in #589
- fix mix_dataset_sample float by @Jintao-Huang in #594
- Refactor all docs by @tastelikefeet in #599
- fix tiny bugs in docs by @tastelikefeet in #600
- fix issue template and add a pr one by @tastelikefeet in #601
- Fix/security template by @tastelikefeet in #603
- update docs by @Jintao-Huang in #604
- support Mistral-7b-v0.2 by @hjh0119 in #605
- fix deploy safe_response by @Jintao-Huang in #614
- Fix Adalora with devicemap by @tastelikefeet in #619
- update ui by @tastelikefeet in #621
- support TeleChat-12b by @hjh0119 in #607
- fix save dir (additional_files) by @Jintao-Huang in #622
- fix Telechat model by @hjh0119 in #623
- Add Grok model by @tastelikefeet in #629
- add missing files by @tastelikefeet in #631
- support qwen1.5-moe model by @hjh0119 in #627
- support Telechat-7b model by @hjh0119 in #630
- support model Dbrx by @hjh0119 in #643
- fix ui by @tastelikefeet in #648
- fix typing hint by @Jintao-Huang in #649
- support Mengzi-13b-base model by @hjh0119 in #646
- support Qwen1.5-32b models by @hjh0119 in #655
- fix plot error by @tastelikefeet in #651
- Support FSDP + QLoRA by @tastelikefeet in #659
- move fsdp config path by @tastelikefeet in #662
- change the default value of ddp_backend by @tastelikefeet in #667
- fix ui log by @tastelikefeet in #669
- support Xverse-MoE model by @hjh0119 in #668
- Support longlora for transformers 4.38 by @tastelikefeet in #456
- add ruozhiba datasets by @tastelikefeet in #670
- compatible with old versions of modelscope by @tastelikefeet in #671
- Fix data_collator by @tastelikefeet in #674
- [TorchAcc][Experimental] Integrate TorchAcc. by @baoleai in #647
- update Agent best practice with Modelscope-Agent by @hjh0119 in #676
- support c4ai-command-r model by @hjh0119 in #684
- Support Eval by @tastelikefeet in #494
- fix anchor by @tastelikefeet in #687
- Fix/0412 by @tastelikefeet in #690
- support minicpm and mixtral-moe model by @hjh0119 in #692
- fix device_map 4 (qwen-vl) by @Jintao-Huang in #695
- fix multimodal model image_mode = 'CMYK' (fix issue#677) by @Jintao-Huang in #697
- feat(model): support minicpm-v-2 (#699) by @YuzaChongyi in #699
## New Contributors
- @hjh0119 made their first contribution in #605
- @YuzaChongyi made their first contribution in #699
Full Changelog: v1.7.3...v2.0.0