
Commit 772e26c

update readme (#76)
1 parent 6fd1c17 commit 772e26c

4 files changed: +49 -78 lines changed


README.md

Lines changed: 18 additions & 33 deletions
@@ -14,7 +14,7 @@

 # Introduction

-SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible framwork designed to faciliate lightweight model fine-tuning. It integrates implementations for various efficient fine-tuning methods, by embracing approaches that is parameter-efficient, memory-efficient, and time-efficient. SWIFT integrates seamlessly into ModelScope ecosystem and offers the capabilities to finetune various modles, with a primary emphasis on LLMs and vision models. Additionally, SWIFT is fully compatible with [Peft](https://github.com/huggingface/peft), enabling users to leverage the familiar Peft interface to finetune ModelScope models.
+SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible framework designed to facilitate lightweight model fine-tuning and inference. It integrates implementations of various efficient fine-tuning methods, embracing approaches that are parameter-efficient, memory-efficient, and time-efficient. SWIFT integrates seamlessly into the ModelScope ecosystem and offers the capability to fine-tune various models, with a primary emphasis on LLMs and vision models. Additionally, SWIFT is fully compatible with [PEFT](https://github.com/huggingface/peft), enabling users to leverage the familiar PEFT interface to fine-tune ModelScope models.

 Currently supported approaches (and counting):

@@ -23,20 +23,20 @@ Currently supported approaches (and counting):
 3. Prompt Tuning: [Visual Prompt Tuning](https://arxiv.org/abs/2203.12119)
 4. Side: [Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks](https://arxiv.org/abs/1912.13503)
 5. ResTuning-Bypass
-7. All tuners offered on [Peft](https://github.com/huggingface/peft)
+6. All tuners offered on [PEFT](https://github.com/huggingface/peft)

 Key features:

 1. By integrating the ModelScope library, models can be readily obtained via a model-id.
-2. Tuners provided by SWIFT be combined together to allow exploration of multiple tuners on a model for best result.
-3. Support calling `activate_adapter``deactivate_adapter` to activate/deactivate a single tuner. User can use one model with multiple tuners in different threads.
+2. Tuners provided by SWIFT can be combined to explore multiple tuners on one model for the best result.
+3. Supports calling `activate_adapter`, `deactivate_adapter`, or `set_active_adapters` to activate/deactivate tuners; users can run inference with one model and multiple tuners independently in different threads (see the sketch after this hunk).

-Users can check the [documentation of Swift](./docs/Get Started/1.Introduction.md) to get detail tutorials.
+Users can check the [documentation of SWIFT](./docs/Get Started/1.Introduction.md) for detailed tutorials.

 ## LLM SFT Example
 [code link](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm)

-1. supported SFT methods: [lora](https://arxiv.org/abs/2106.09685), [qlora](https://arxiv.org/abs/2305.14314), full(full parameter fine-tuning)
+1. Supported SFT methods: [LoRA](https://arxiv.org/abs/2106.09685), [QLoRA](https://arxiv.org/abs/2305.14314), full (full-parameter fine-tuning)
 2. Supported models:
    1. qwen series: qwen-7b, [qwen-7b-chat](https://github.com/QwenLM/Qwen-7B)
    2. qwen-vl series: qwen-vl, [qwen-vl-chat](https://github.com/QwenLM/Qwen-VL)
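
A concrete picture of key feature 3 may help here. The following is a minimal sketch, assuming the `activate_adapter`, `deactivate_adapter`, and `set_active_adapters` methods named above; the toy model, adapter names, and config arguments are illustrative rather than project defaults:

```python
import torch.nn as nn
from swift import Swift, LoRAConfig

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.query = nn.Linear(16, 16)
        self.key = nn.Linear(16, 16)

# Attach two independently switchable tuners to one model.
model = Swift.prepare_model(ToyModel(), {
    'style_a': LoRAConfig(target_modules=['query', 'key']),
    'style_b': LoRAConfig(target_modules=['query', 'key']),
})
model.deactivate_adapter('style_b')     # serve requests with only style_a active
model.activate_adapter('style_b')       # switch style_b back on
model.set_active_adapters(['style_a'])  # or set the whole active set at once
```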
@@ -58,42 +58,39 @@ Users can check the [documentation of SWIFT](./docs/Get Started/1.Introduction.md) for detailed tutorials.

 SWIFT runs in a Python environment. Please make sure your Python version is higher than 3.8.

-Please install SWIFT by the `pip` command:
+- Install SWIFT with the `pip` command:

 ```shell
 pip install ms-swift -U
 ```

-If you want to install SWIFT by source code, please run:
+- Install SWIFT from source (required for running the sft/infer examples):

 ```shell
 git clone https://github.com/modelscope/swift.git
 cd swift
 pip install -e .
 ```

-If you are using source code, please remember install requirements by:
-```shell
-pip install -r requirements/framework.txt
-```
-
 SWIFT requires torch>=1.13.

-We also recommend to use SWIFT in our docker image:
+- Use SWIFT in our Docker image:

 ```shell
-docker pull registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.7.1-py38-torch2.0.1-tf1.15.5-1.8.0
+docker pull registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.8.0-py38-torch2.0.1-tf2.13.0-1.9.1
 ```

 # Getting Started

-SWIFT supports multiple tuners, as well as tuners provided by [Peft](https://github.com/huggingface/peft). To use the these tuners, simply call:
+SWIFT supports multiple tuners, as well as the tuners provided by [PEFT](https://github.com/huggingface/peft). To use these tuners, simply call:

 ```python
-from swift import Swift
+from swift import Swift, LoRAConfig
+config = LoRAConfig(...)
 model = Swift.prepare_model(model, config, extra_state_keys=['...'])
 ```

-The code snippet above initialized the tuner randomly. The input model is an instance of `torch.nn.Module`, config is a subclass instance of `SwiftConfig` or `PeftConfig`. extra_state_keys is
+The code snippet above initializes the tuner randomly. The input model is an instance of `torch.nn.Module`, the config is an instance of a `SwiftConfig` or `PeftConfig` subclass, and `extra_state_keys` lists
 the extra module weights (like the linear head) to be trained and stored in the output dir.
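
For concreteness, a self-contained sketch of the call above; the toy model is illustrative, and the assumption that `extra_state_keys` takes state-dict key names like `head.weight` is ours, not stated in the README:

```python
import torch.nn as nn
from swift import Swift, LoRAConfig

class ToyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.query = nn.Linear(32, 32)  # wrapped by the LoRA tuner
        self.head = nn.Linear(32, 2)    # plain linear head, trained and saved via extra_state_keys

model = Swift.prepare_model(
    ToyClassifier(),
    LoRAConfig(target_modules=['query']),
    extra_state_keys=['head.weight', 'head.bias'],
)
```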

 You may combine multiple tuners by:
@@ -103,7 +100,7 @@ from swift import Swift, LoRAConfig, PromptConfig
 model = Swift.prepare_model(model, {'lora': LoRAConfig(...), 'prompt': PromptConfig(...)})
 ```

-You can all `save_pretrained` and `push_to_hub` after finetuning:
+Call `save_pretrained` and `push_to_hub` after fine-tuning:

 ```python
 from swift import push_to_hub
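
The hunk is cut off inside the code block. Continuing from the `model` prepared above, the full flow it introduces looks like the following sketch; the repo id and token are the placeholders used elsewhere in this commit:

```python
from swift import push_to_hub

model.save_pretrained('some-output-folder')
# 'my-group/some-repo-id-modelscope' and 'some-ms-token' are placeholders.
push_to_hub('my-group/some-repo-id-modelscope', 'some-output-folder', token='some-ms-token')
```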
@@ -199,20 +196,8 @@ model_wrapped = Swift.prepare_model(model, lora_config)
 model_wrapped = Swift.from_pretrained(model, 'some-id-in-the-modelscope-modelhub')
 ```

-or:
-
-```python
-from swift import LoraConfig, get_peft_model, PeftModel
-from peft import TaskType
-lora_config = LoraConfig(target_modules=['query', 'key', 'value'], task_type=TaskType.CAUSAL_LM)
-model_wrapped = get_peft_model(model, lora_config)
-
-# or call from_pretrained to load weights in the modelhub
-model_wrapped = PeftModel.from_pretrained(model, 'some-id-in-the-modelscope-modelhub')
-```
-
-The saving strategy between Swift tuners and Peft tuners are slightly different. You can name a tuner of a SWIFT by:
+The saving strategies of SWIFT tuners and PEFT tuners differ slightly. You can name a tuner by:

 ```python
 model = Swift.prepare_model(model, {'default': LoRAConfig(...)})
@@ -230,7 +215,7 @@ output
 |-- adapter_model.bin
 ```

-The config/weights stored in the output dir is the config of `extra_state_keys` and the weights of it. This is different from Peft, which stores the weights and config of the `default` tuner.
+The config/weights stored in the output dir are the config and weights of `extra_state_keys`. This is different from PEFT, which stores the weights and config of the `default` tuner.
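
To close the loop, a hedged sketch of loading the saved tuner back onto the same base `model`; that `Swift.from_pretrained` accepts a local output dir as well as a hub id is our assumption here:

```python
from swift import Swift

# Reload the 'default' tuner and any extra_state_keys weights saved to ./output.
model = Swift.from_pretrained(model, './output')
```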

 # Learn More

README_CN.md

Lines changed: 25 additions & 40 deletions
@@ -13,28 +13,28 @@
 </p>

 # Introduction
-SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible framework designed to facilitate lightweight model fine-tuning. It integrates implementations of various efficient fine-tuning methods that are parameter-efficient, memory-efficient, and time-efficient. SWIFT integrates seamlessly into the ModelScope ecosystem and provides the ability to fine-tune various models, focusing mainly on LLMs and vision models. In addition, SWIFT is fully compatible with [Peft](https://github.com/huggingface/peft), enabling users to fine-tune ModelScope models through the familiar Peft interface.
+SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible, lightweight, one-stop deep-learning framework for training and inference. It integrates various efficient fine-tuning methods such as LoRA, QLoRA, and Alibaba Cloud's self-developed ResTuning-Bypass, along with ready-to-use training and inference scripts, letting developers fine-tune and run inference on LLM & AIGC models on a single commercial-grade GPU. In addition, SWIFT is fully compatible with [PEFT](https://github.com/huggingface/peft), so developers can use PEFT's capabilities within the ModelScope model ecosystem.

-Currently supported methods (and counting):
+Currently supported methods:

 1. LoRA: [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685)
 2. Adapter: [Parameter-Efficient Transfer Learning for NLP](http://arxiv.org/abs/1902.00751)
-3. Prompt Tuning: [Visual Prompt Tuning](https://arxiv.org/abs/2203.12119)
+3. Prompt: [Visual Prompt Tuning](https://arxiv.org/abs/2203.12119)
 4. Side: [Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks](https://arxiv.org/abs/1912.13503)
 5. ResTuning-Bypass
-6. All tuners offered on [Peft](https://github.com/huggingface/peft)
+6. All tuners offered on [PEFT](https://github.com/huggingface/peft)

-Key features:
-1. By integrating the ModelScope library, models can be readily obtained via a model-id.
-2. Tuners provided by SWIFT can be combined on one model to explore multiple tuners for the best result.
-3. Supports calling `activate_adapter` and `deactivate_adapter` to activate or deactivate a tuner; at inference time, users can use one model with multiple tuners in different threads without interference.
+Key capabilities:
+1. Models from the ModelScope Hub can be used with SWIFT or PEFT methods via a model-id.
+2. Multiple tuners can be used in a single training or inference run.
+3. Supports calling `activate_adapter`, `deactivate_adapter`, or `set_active_adapters` to activate or deactivate individual tuners; at inference time, users can load multiple independent tuners and use them in parallel in different threads.

 Users can check the [official SWIFT documentation](./docs/Get Started/1.Introduction.md) for details.

 ## LLM Fine-Tuning Example
 [code link](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm)

-1. Supported SFT methods: [lora](https://arxiv.org/abs/2106.09685), [qlora](https://arxiv.org/abs/2305.14314), full-parameter fine-tuning
+1. Supported SFT methods: [LoRA](https://arxiv.org/abs/2106.09685), [QLoRA](https://arxiv.org/abs/2305.14314), full-parameter fine-tuning
 2. Supported models:
    1. qwen series: qwen-7b, [qwen-7b-chat](https://github.com/QwenLM/Qwen-7B)
    2. qwen-vl series: qwen-vl, [qwen-vl-chat](https://github.com/QwenLM/Qwen-VL)
@@ -56,39 +56,36 @@ SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible

 SWIFT runs in a Python environment. Please make sure your Python version is higher than 3.8.

-Please install SWIFT using the pip command:
+- Option 1: install SWIFT using the pip command:

 ```shell
 pip install ms-swift -U
 ```

-If you want to install SWIFT from source, run the following commands:
+- Option 2: install SWIFT from source (convenient for running the training/inference scripts):

 ```shell
 git clone https://github.com/modelscope/swift.git
 cd swift
 pip install -e .
 ```

-If you are using the source code, remember to install the required dependencies with:
-```shell
-pip install -r requirements/framework.txt
-```
+SWIFT requires torch>=1.13.

-SWIFT requires torch>=1.13.
+- Option 3: use SWIFT in our Docker image:

-We also recommend using SWIFT in our Docker image:
 ```shell
-docker pull registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.7.1-py38-torch2.0.1-tf1.15.5-1.8.0
+docker pull registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.8.0-py38-torch2.0.1-tf2.13.0-1.9.1
 ```

 # Getting Started
-SWIFT supports multiple tuners, including the tuners provided by [Peft](https://github.com/huggingface/peft). To use these tuners, simply call:
+SWIFT supports multiple tuners, including the tuners provided by [PEFT](https://github.com/huggingface/peft). To use these tuners, simply call:
 ```python
-from swift import Swift
+from swift import Swift, LoRAConfig
+config = LoRAConfig(...)
 model = Swift.prepare_model(model, config, extra_state_keys=['...'])
 ```
-The code snippet above initializes the tuner randomly. The input model is an instance of torch.nn.Module, the configuration is an instance of a SwiftConfig or PeftConfig subclass, and extra_state_keys are the extra module weights (like a linear head) to be trained and stored in the output directory.
+The code snippet above initializes the tuner randomly. The input `model` is an instance of torch.nn.Module, `config` is an instance of a SwiftConfig or PeftConfig subclass, and extra_state_keys are the extra module weights (like a linear head) to be trained and stored in the output directory.

 You may combine multiple tuners by:
 ```python
@@ -105,7 +102,7 @@ push_to_hub('my-group/some-repo-id-modelscope', 'some-output-folder', token='some-ms-token')
 ```
 Assume `my-group/some-repo-id-modelscope` is the model-id on the Hub and `some-ms-token` is the token used for uploading.

-Use the model-id for subsequent inferring:
+Use the model-id for subsequent inference:

 ```python
 from swift import Swift
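
This hunk and the next show one flow split in two. Reassembled for readability as a sketch, using the placeholder ids from the hunks and assuming `Model` is importable from `modelscope` and `SwiftModel` from `swift`:

```python
from modelscope import Model
from swift import SwiftModel

# Load the base model, then attach the uploaded tuner ('my-group/swift_llama2' is a placeholder).
model = Model.from_pretrained('modelscope/Llama-2-7b-ms', device_map='auto')
model = SwiftModel.from_pretrained(model, 'my-group/swift_llama2', device_map='auto')
```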
@@ -139,7 +136,7 @@ model = Model.from_pretrained('modelscope/Llama-2-7b-ms', device_map='auto')
 model = SwiftModel.from_pretrained(model, 'my-group/swift_llama2', device_map='auto')
 ```

-This is an example of creating a model with the transformers library and then fine-tuning it efficiently with SWIFT.
+This is an example of instantiating a model with the transformers library and then fine-tuning it efficiently with SWIFT.

 ```python
 from swift import Swift, LoRAConfig, AdapterConfig, PromptConfig
@@ -180,7 +177,7 @@ model.get_trainable_parameters()
 # 'trainable params: 838,776 || all params: 87,406,432 || trainable%: 0.9596273189597764'
 ```

-You can use the features provided by Peft in SWIFT:
+You can use the features provided by PEFT in SWIFT:

 ```python
 from swift import LoraConfig, Swift
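
This hunk also stops at the import; the next hunk carries the continuation. Put together as a sketch, with the `target_modules` values borrowed from the removed PEFT block below and `model` assumed to be an existing `torch.nn.Module`:

```python
from swift import LoraConfig, Swift

# LoraConfig here is PEFT's config class re-exported by swift; the arguments are illustrative.
lora_config = LoraConfig(target_modules=['query', 'key', 'value'])
model_wrapped = Swift.prepare_model(model, lora_config)

# Or load tuner weights from the ModelScope hub (placeholder id).
model_wrapped = Swift.from_pretrained(model, 'some-id-in-the-modelscope-modelhub')
```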
@@ -192,26 +189,14 @@ model_wrapped = Swift.prepare_model(model, lora_config)
 model_wrapped = Swift.from_pretrained(model, 'some-id-in-the-modelscope-modelhub')
 ```

-Or:
-
-```python
-from swift import LoraConfig, get_peft_model, PeftModel
-from peft import TaskType
-lora_config = LoraConfig(target_modules=['query', 'key', 'value'], task_type=TaskType.CAUSAL_LM)
-model_wrapped = get_peft_model(model, lora_config)
-
-# Or call from_pretrained to load weights from the ModelScope hub.
-model_wrapped = PeftModel.from_pretrained(model, 'some-id-in-the-modelscope-modelhub')
-```
-
-The saving strategies of Swift tuners and Peft tuners differ slightly. You can name Swift tuners as follows:
+The saving strategies of SWIFT tuners and PEFT tuners differ slightly. You can name SWIFT tuners as follows:

 ```python
 model = Swift.prepare_model(model, {'default': LoRAConfig(...)})
 model.save_pretrained('./output')
 ```

-In the output directory, you will get a directory structure like the following:
+In the `output` directory you will get a directory structure like the following:

 ```text
 output
@@ -222,14 +207,14 @@ output
 |-- adapter_model.bin
 ```

-The config/weights stored in the output directory are the config and weights of extra_state_keys. This differs from Peft, which stores the weights and config of the default tuner.
+The config/weights stored in the `output` directory are the config and weights of extra_state_keys. This differs from PEFT, which stores the config/weights of the `default` tuner.


 # Learn More

 - [ModelScope library](https://github.com/modelscope/modelscope/)

-The ModelScope library is the model repository of the ModelScope project, containing a large number of popular models.
+The ModelScope library is the model repository of the ModelScope project, containing popular deep-learning models across modalities.

 - [Contribute your own model to ModelScope](https://modelscope.cn/docs/ModelScope%E6%A8%A1%E5%9E%8B%E6%8E%A5%E5%85%A5%E6%B5%81%E7%A8%8B%E6%A6%82%E8%A7%88)

docs/Modules/2.lora.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # LoRA

-LoRA is the lightweight training component introduced in the paper [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685). LoRA can be attached to operators such as Linear, Embedding, and Conv2d.
+LoRA is the lightweight training component introduced in the paper [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685). LoRA can be attached to operators such as Linear, Embedding, Conv2d, and Quantized-Linear.

 >```python
 >LoRAConfig (
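
For context on why the doc line above can list Linear, Embedding, Conv2d, and Quantized-Linear alike: LoRA leaves the original weight frozen and adds a trainable low-rank product next to it, so any matrix-multiply-like operator can be wrapped. In the paper's notation:

```latex
% LoRA forward pass for a wrapped layer: W_0 stays frozen, only A and B are trained.
h = W_0 x + \frac{\alpha}{r} B A x,
\qquad B \in \mathbb{R}^{d \times r},\ A \in \mathbb{R}^{r \times k},\ r \ll \min(d, k)
```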

examples/pytorch/llm/src/llm_sft.py

Lines changed: 5 additions & 4 deletions
@@ -15,10 +15,11 @@
                               Seq2SeqTrainingArguments, Swift, get_logger)
 from swift.utils import (add_version_to_work_dir, broadcast_string,
                          check_json_format, compute_nlg_metrics,
-                         data_collate_fn, get_dist_setting, is_ddp_plus_mp,
-                         is_dist, is_master, parse_args, plot_images,
-                         print_example, print_model_info, seed_everything,
-                         show_layers, sort_by_max_length, stat_dataset)
+                         data_collate_fn, find_all_linear_for_lora,
+                         get_dist_setting, is_ddp_plus_mp, is_dist, is_master,
+                         parse_args, plot_images, print_example,
+                         print_model_info, seed_everything, show_layers,
+                         sort_by_max_length, stat_dataset)

 logger = get_logger()
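
The only semantic change in this file is the new `find_all_linear_for_lora` import, whose name suggests discovering Linear layers to use as LoRA targets. Its signature is not shown in this diff; a generic, self-contained stand-in for the idea (not the project's implementation) could look like:

```python
import torch.nn as nn

def find_linear_module_names(model: nn.Module) -> list:
    """Collect leaf names of nn.Linear submodules, e.g. to feed LoRA target_modules."""
    names = set()
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            names.add(name.split('.')[-1])  # keep the leaf name, e.g. 'q_proj'
    return sorted(names)
```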