## 📖 Table of Contents
- [Introduction](#-introduction)
- [Groups](#-groups)
- [News](#-news)
- [Installation](#%EF%B8%8F-installation)
- [Getting Started](#-getting-started)
- [Classroom](#-classroom)
- [License](#-license)
- [Citation](#-citation)

## 📝 Introduction
SWIFT supports training (pre-training, fine-tuning, RLHF), inference, evaluation, and deployment of **300+ LLMs and 50+ MLLMs** (multimodal large models). Developers can apply our framework directly in their own research and production environments to realize the complete workflow from model training and evaluation to application. In addition to the lightweight training solutions provided by [PEFT](https://github.com/huggingface/peft), we also provide a complete **Adapters library** supporting the latest training techniques such as NEFTune, LoRA+, and LLaMA-PRO. This adapter library can be used directly in your own custom workflow without our training scripts.
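
As a quick illustration of how such techniques surface in the CLI, here is a minimal sketch: the `--neftune_noise_alpha` and `--lora_lr_ratio` flags (for NEFTune and LoRA+) and the `blossom-math-zh` dataset id are assumptions, so verify them with `swift sft --help` on your installed version.

```shell
# Minimal sketch: LoRA fine-tuning with NEFTune noise and a LoRA+ learning-rate ratio.
# The flag names are assumptions; confirm them via `swift sft --help`.
CUDA_VISIBLE_DEVICES=0 \
swift sft \
    --model_type qwen1half-7b-chat \
    --dataset blossom-math-zh \
    --sft_type lora \
    --neftune_noise_alpha 5 \
    --lora_lr_ratio 16.0 \
    --output_dir output
```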

To facilitate use by users unfamiliar with deep learning, we provide a Gradio web-ui for controlling training and inference, as well as accompanying deep learning courses and best practices for beginners. The SWIFT web-ui is available on both [Huggingface space](https://huggingface.co/spaces/tastelikefeet/swift) and [ModelScope studio](https://www.modelscope.cn/studios/iic/Scalable-lightWeight-Infrastructure-for-Fine-Tuning/summary); please feel free to try it!
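
To run the web-ui locally instead, it can be launched from the command line. This is a minimal sketch, assuming SWIFT is already installed as described in the Installation section:

```shell
# Start the local Gradio web-ui for configuring training and inference runs.
swift web-ui
```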

SWIFT has rich documentation for users; please feel free to check our documentation website:
<p align="center">
        <a href="https://swift.readthedocs.io/en/latest/">English Documentation</a> &nbsp; | &nbsp; <a href="https://swift.readthedocs.io/zh-cn/latest/">中文文档</a>
</p>

## ☎ Groups

You can contact us and communicate with us by joining our groups:

[Discord Group](https://discord.gg/qQXTzNUp) | WeChat Group
:-------------------------:|:-------------------------:
<img src="asset/discord_qr.jpg" width="200" height="200"> | <img src="asset/wechat.png" width="200" height="200">

## 🎉 News
- 2024.07.08: Support cogvlm2-video-13b-chat. You can check the best practice [here](docs/source_en/Multi-Modal/cogvlm2-video-best-practice.md).
- 2024.07.08: Support internlm-xcomposer2_5-7b-chat. You can check the best practice [here](docs/source_en/Multi-Modal/internlm-xcomposer2-best-practice.md).
- 2024.07.06: Support for the llava-next-video series models: llava-next-video-7b-instruct, llava-next-video-7b-32k-instruct, llava-next-video-7b-dpo-instruct, llava-next-video-34b-instruct. You can refer to [llava-video best practice](docs/source_en/Multi-Modal/llava-video-best-practice.md) for more information.
- 2024.07.06: Support internvl2 series: internvl2-2b, internvl2-4b, internvl2-8b, internvl2-26b.

#### Pretraining

```shell
# Experimental Environment: 4 * A100
# GPU Memory Requirement: 4 * 30GB
# Runtime: 0.8 hours
NPROC_PER_NODE=4 \
CUDA_VISIBLE_DEVICES=0,1,2,3 \
swift pt \
    --model_type qwen1half-7b-chat \
    --dataset chinese_c4#10000 \
    --num_train_epochs 1 \
    --sft_type full \
    --deepspeed default-zero3 \
    --output_dir output
```

#### RLHF

```shell
# Supported rlhf_type values: dpo / cpo / simpo / orpo / kto
CUDA_VISIBLE_DEVICES=0 \
swift rlhf \
    --rlhf_type dpo \
    --model_type qwen1half-7b-chat \
    --dataset shareai-llama3-dpo-zh-en-emoji \
    --num_train_epochs 5 \
    --sft_type lora \
    --output_dir output
```

### Inference
Original model:
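
The command below is a minimal sketch rather than the exact snippet from the full README; it assumes `swift infer` with the same `qwen1half-7b-chat` model type used in the training examples above.

```shell
# Minimal sketch: start an interactive chat with the original (untrained) model.
# Assumes the qwen1half-7b-chat model type from the examples above.
CUDA_VISIBLE_DEVICES=0 \
swift infer --model_type qwen1half-7b-chat
```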

| Model | Introduction | Language | Model Size | Model Type |
| ----- | ------------ | -------- | ---------- | ---------- |
| XComposer2<br>XComposer2.5 | [Pujiang AI Lab InternLM vision model](https://github.com/InternLM/InternLM-XComposer) | Chinese<br>English | 7B | chat model |
| DeepSeek-VL | [DeepSeek series vision models](https://github.com/deepseek-ai) | Chinese<br>English | 1.3B-7B | chat model |
| MiniCPM-V<br>MiniCPM-V-2<br>MiniCPM-V-2_5 | [OpenBMB MiniCPM vision model](https://github.com/OpenBMB/MiniCPM) | Chinese<br>English | 3B-9B | chat model |
| CogVLM<br>CogAgent<br>CogVLM2<br>CogVLM2-Video<br>GLM4V | [Zhipu ChatGLM visual QA and Agent model](https://github.com/THUDM/) | Chinese<br>English | 9B-19B | chat model |
| Llava1.5<br>Llava1.6 | [Llava series models](https://github.com/haotian-liu/LLaVA) | English | 7B-34B | chat model |
| Llava-Next<br>Llava-Next-Video | [Llava-Next series models](https://github.com/LLaVA-VL/LLaVA-NeXT) | Chinese<br>English | 7B-110B | chat model |
| mPLUG-Owl | [mPLUG-Owl series models](https://github.com/X-PLUG/mPLUG-Owl) | English | 11B | chat model |
| InternVL<br>Mini-Internvl<br>Internvl2 | [InternVL](https://github.com/OpenGVLab/InternVL) | Chinese<br>English | 2B-40B<br>including quantized version | chat model |
| Llava-llama3 | [xtuner](https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-transformers) | English | 8B | chat model |
| Phi3-Vision | Microsoft | English | 4B | chat model |
| PaliGemma | Google | English | 3B | chat model |

Other variables like `CUDA_VISIBLE_DEVICES` are also supported, which are not listed here.
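
For example, variables are simply prefixed to the command. The block below is a sketch only: `MODELSCOPE_CACHE` is assumed here as ModelScope's cache-directory variable, and `blossom-math-zh` is an example dataset id.

```shell
# Sketch: pin the run to two GPUs and redirect the ModelScope download cache.
# MODELSCOPE_CACHE and the dataset id are assumptions; adjust for your setup.
CUDA_VISIBLE_DEVICES=0,1 \
MODELSCOPE_CACHE=/path/to/cache \
NPROC_PER_NODE=2 \
swift sft \
    --model_type qwen1half-7b-chat \
    --dataset blossom-math-zh \
    --sft_type lora
```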

## 📚 Classroom

| Tutorial Name |
|-------------------------------------------------------------- |

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=modelscope/swift&type=Date)](https://star-history.com/#modelscope/swift&Date)