modelscope
diff --git a/‎.gitignore‎
Lines changed: 2 additions & 2 deletions b/‎.gitignore‎
Lines changed: 2 additions & 2 deletions
diff --git a/‎README.md‎
Lines changed: 52 additions & 46 deletions b/‎README.md‎
Lines changed: 52 additions & 46 deletions
diff --git a/‎README_CN.md‎
Lines changed: 221 additions & 0 deletions b/‎README_CN.md‎
Lines changed: 221 additions & 0 deletions
diff --git a/‎examples/pytorch/llm/README.md‎
Lines changed: 1 addition & 1 deletion b/‎examples/pytorch/llm/README.md‎
Lines changed: 1 addition & 1 deletion
@@ -124,12 +124,12 @@ replace.sh
 result.png
 result.jpg
 result.mp4
+runs/
+*.out
 
 # Pytorch
 *.pth
 *.pt
 
 # ast template
 ast_index_file.py
-
-runs/
@@ -1,10 +1,16 @@
+<h1>SWIFT(Scalable lightWeight Infrastructure for Fine-Tuning)</h1>
+
 <p align="center">
     <br>
     <img src="https://modelscope.oss-cn-beijing.aliyuncs.com/modelscope.gif" width="400"/>
     <br>
-    <h1>SWIFT(Scalable lightWeight Infrastructure for Fine-Tuning)</h1>
 <p>
 
+<p align="center">
+<a href="https://modelscope.cn/home">Modelscope Hub</a>
+<br>
+        <a href="README_CN.md">中文</a>&nbsp ｜ &nbspEnglish
+</p>
 
 # Introduction
 
@@ -25,11 +31,41 @@ Key features:
 ## LLM SFT Example
 [code link](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm)
 
-1. supported sft method: lora, qlora, full, ...
-2. supported models: [**qwen-7b**](https://github.com/QwenLM/Qwen-7B), baichuan-7b, baichuan-13b, chatglm2-6b, chatglm2-6b-32k, llama2-7b, llama2-13b, llama2-70b, openbuddy-llama2-13b, openbuddy-llama-65b, polylm-13b, ...
+1. supported sft method: [lora](https://arxiv.org/abs/2106.09685), [qlora](https://arxiv.org/abs/2305.14314), full(full parameter fine tuning), ...
+2. supported models: qwen-7b, [qwen-7b-chat](https://github.com/QwenLM/Qwen-7B), qwen-vl, [qwen-vl-chat](https://github.com/QwenLM/Qwen-VL), baichuan-7b, baichuan-13b, baichuan-13b-chat, chatglm2-6b, chatglm2-6b-32k, llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, llama2-70b-chat, openbuddy-llama2-13b, openbuddy-llama-65b, polylm-13b
 3. supported feature: quantization, ddp, model parallelism(device map), gradient checkpoint, gradient accumulation steps, push to modelscope hub, custom datasets, ...
-4. supported datasets: alpaca-en(gpt4), alpaca-zh(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, ...
+4. supported datasets: alpaca-en(gpt4), alpaca-zh(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, coco-en
+5. supported templates: chatml(qwen), baichuan, chatglm2, llama, openbuddy_llama, default
+
+# Installation
+
+SWIFT is running in Python environment. Please make sure your python version is higher than 3.8.
+
+Please install SWIFT by the `pip` command:
+
+```shell
+pip install ms-swift -U
+```
+
+If you want to install SWIFT by source code, please run:
+
+```shell
+git clone https://github.com/modelscope/swift.git
+cd swift
+pip install -e .
+```
 
+If you are using source code, please remember install requirements by:
+```shell
+pip install -r requirements/framework.txt
+```
+
+SWIFT requires torch>=1.13.
+
+We also recommend to use SWIFT in our docker image:
+```shell
+docker pull registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.7.1-py38-torch2.0.1-tf1.15.5-1.8.0
+```
 
 # Getting Started
 
@@ -104,26 +140,26 @@ model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16
 
 # init lora tuner config
 lora_config = LoRAConfig(
-    r=10,	# the rank of the LoRA module
-    target_modules=['query', 'key', 'value'],	# the modules to be replaced with the end of the module name
-    merge_weights=False	# whether to merge weights
+    r=10,  # the rank of the LoRA module
+    target_modules=['query', 'key', 'value'],  # the modules to be replaced with the end of the module name
+    merge_weights=False  # whether to merge weights
 )
 
 # init adapter tuner config
 adapter_config = AdapterConfig(
-    dim=768,	# the dimension of the hidden states
-    hidden_pos=0,	# the position of the hidden state to passed into the adapter
-    target_modules=r'.*attention.output.dense$',	# the modules to be replaced with regular expression
-    adapter_length=10	# the length of the adapter length
+    dim=768,  # the dimension of the hidden states
+    hidden_pos=0,  # the position of the hidden state to passed into the adapter
+    target_modules=r'.*attention.output.dense$',  # the modules to be replaced with regular expression
+    adapter_length=10  # the length of the adapter length
 )
 
 # init prompt tuner config
 prompt_config = PromptConfig(
-    dim=768,	# the dimension of the hidden states
-    target_modules=r'.*layer\.\d+$',	# the modules to be replaced with regular expression
-    embedding_pos=0,	# the position of the embedding tensor
-    prompt_length=10,	# the length of the prompt tokens
-    attach_front=False	# Whether prompt is attached in front of the embedding
+    dim=768,  # the dimension of the hidden states
+    target_modules=r'.*layer\.\d+$',  # the modules to be replaced with regular expression
+    embedding_pos=0,    # the position of the embedding tensor
+    prompt_length=10,   # the length of the prompt tokens
+    attach_front=False  # Whether prompt is attached in front of the embedding
 )
 
 # create model with swift. In practice, you can use any of these tuners or a combination of them.
@@ -179,36 +215,6 @@ output
 
 The config/weights stored in the output dir is the config of `extra_state_keys` and the weights of it. This is different from Peft, which stores the weights and config of the `default` tuner.
 
-# Installation
-
-SWIFT is running in Python environment. Please make sure your python version is higher than 3.8.
-
-Please install SWIFT by the `pip` command:
-
-```shell
-pip install swift -U
-```
-
-If you want to install SWIFT by source code, please run:
-
-```shell
-git clone https://github.com/modelscope/swift.git
-cd swift
-pip install -e .
-```
-
-If you are using source code, please remember install requirements by:
-```shell
-pip install -r requirements/framework.txt
-```
-
-SWIFT requires torch>=1.13.
-
-We also recommend to use SWIFT in our docker image:
-```shell
-docker pull registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.7.1-py38-torch2.0.1-tf1.15.5-1.8.0
-```
-
 
 # Learn More
 
 
@@ -0,0 +1,221 @@
+<h1>SWIFT(Scalable lightWeight Infrastructure for Fine-Tuning)</h1>
+
+<p align="center">
+    <br>
+    <img src="https://modelscope.oss-cn-beijing.aliyuncs.com/modelscope.gif" width="400"/>
+    <br>
+<p>
+
+<p align="center">
+<a href="https://modelscope.cn/home">魔搭社区</a>
+<br>
+        中文&nbsp ｜ &nbsp<a href="README.md">English</a>
+</p>
+
+# 简介
+SWIFT（Scalable lightWeight Infrastructure for Fine-Tuning）是一个可扩展的框架，旨在促进轻量级模型的微调。它集成了各种高效的微调方法的实现，采用了参数高效、内存高效和时间高效的方法。SWIFT可以无缝地集成到ModelScope生态系统中，并提供微调各种模型的能力，主要侧重于LLMs和视觉模型。此外，SWIFT与[Peft](https://github.com/huggingface/peft)完全兼容，使用户能够利用熟悉的Peft接口对ModelScope模型进行微调。
+
+目前支持的方法（数量持续增加）：
+
+1. LoRA：[LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS](https://arxiv.org/abs/2106.09685)
+2. Adapter：[Parameter-Efficient Transfer Learning for NLP](http://arxiv.org/abs/1902.00751)
+3. Prompt Tuning: [Visual Prompt Tuning](https://arxiv.org/abs/2203.12119)
+4. 所有在[Peft](https://github.com/huggingface/peft)上提供的tuners。
+
+关键特点：
+1. 通过集成ModelScope库，可以通过model id轻松获取模型。
+2. SWIFT提供的tuners可以组合在一起，以便在模型上探索多个tuners，以获得最佳结果。
+
+## 大模型微调的例子
+[code link](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm)
+
+1. 支持的sft方法: [lora](https://arxiv.org/abs/2106.09685), [qlora](https://arxiv.org/abs/2305.14314), 全参数微调, ...
+2. 支持的模型: qwen-7b, [qwen-7b-chat](https://github.com/QwenLM/Qwen-7B), qwen-vl, [qwen-vl-chat](https://github.com/QwenLM/Qwen-VL), baichuan-7b, baichuan-13b, baichuan-13b-chat, chatglm2-6b, chatglm2-6b-32k, llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, llama2-70b-chat, openbuddy-llama2-13b, openbuddy-llama-65b, polylm-13b
+3. 支持的特性: 模型量化, DDP, 模型并行(device_map), gradient checkpoint, 梯度累加, 支持推送modelscope hub, 支持自定义数据集, ...
+4. 支持的数据集: alpaca-en(gpt4), alpaca-zh(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, coco-en
+5. 支持的对话模板: chatml(qwen), baichuan, chatglm2, llama, openbuddy_llama, default
+
+# 安装
+
+SWIFT在Python环境中运行。请确保您的Python版本高于3.8。
+
+请使用pip命令安装SWIFT：
+
+```shell
+pip install ms-swift -U
+```
+
+如果您想通过源代码安装SWIFT，请运行以下命令：
+
+```shell
+git clone https://github.com/modelscope/swift.git
+cd swift
+pip install -e .
+```
+
+如果您在使用源代码，请记得通过以下方式安装所需的依赖项：
+```shell
+pip install -r requirements/framework.txt
+```
+
+SWIFT requires torch>=1.13.
+
+我们还建议在我们的Docker镜像中使用SWIFT
+```shell
+docker pull registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.7.1-py38-torch2.0.1-tf1.15.5-1.8.0
+```
+
+# 快速开始
+SWIFT支持多个tuners，包括由[Peft](https://github.com/huggingface/peft)提供的调谐器。要使用这些调谐器，只需调用:
+```python
+from swift import Swift
+model = Swift.prepare_model(model, config, extra_state_keys=['...'])
+```
+上面的代码片段随机初始化了tuner。输入模型是torch.nn.Module的一个实例，配置是SwiftConfig或PeftConfig的子类实例。extra_state_keys是要训练并存储在输出目录中的额外模块权重（如linear head）。
+
+您可以通过以下方式组合多个tuners：
+```python
+from swift import Swift, LoRAConfig, PromptConfig
+model = Swift.prepare_model(model, {'lora': LoRAConfig(...), 'prompt': PromptConfig(...)})
+```
+
+在微调之后，您可以调用save_pretrained和push_to_hub方法：
+
+```python
+from swift import push_to_hub
+model.save_pretrained('some-output-folder')
+push_to_hub('my-group/some-repo-id-modelscope', 'some-output-folder', token='some-ms-token')
+```
+假设`my-group/some-repo-id-modelscope`是Hub中的model-id，而`some-ms-token`是用于上传的令牌。
+
+使用model-id进行后续推断：
+
+```python
+from swift import Swift
+model = Swift.from_pretrained(model, 'my-group/some-repo-id-modelscope')
+```
+
+下面是一个可运行的示例：
+
+```python
+import os
+import tempfile
+
+# 请通过`pip install modelscope`安装modelscope
+from modelscope import Model
+
+from swift import LoRAConfig, SwiftModel, Swift, push_to_hub
+
+tmp_dir = tempfile.TemporaryDirectory().name
+if not os.path.exists(tmp_dir):
+    os.makedirs(tmp_dir)
+
+
+model = Model.from_pretrained('modelscope/Llama-2-7b-ms', device_map='auto')
+lora_config = LoRAConfig(target_modules=['q_proj', 'k_proj', 'v_proj'])
+model: SwiftModel = Swift.prepare_model(model, lora_config)
+# 在这里进行一些微调操作
+model.save_pretrained(tmp_dir)
+
+push_to_hub('my-group/swift_llama2', output_dir=tmp_dir)
+model = Model.from_pretrained('modelscope/Llama-2-7b-ms', device_map='auto')
+model = SwiftModel.from_pretrained(model, 'my-group/swift_llama2', device_map='auto')
+```
+
+这是一个使用transformers库创建模型，并使用SWIFT进行高效微调的示例。
+
+```python
+from swift import Swift, LoRAConfig, AdapterConfig, PromptConfig
+from transformers import AutoModelForImageClassification
+
+# 初始vit模型
+model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")
+
+# 初始化LoRA tuner配置
+lora_config = LoRAConfig(
+    r=10,  # LoRA模块的rank
+    target_modules=['query', 'key', 'value'],  # 将要被替换的模块的模块名后缀
+    merge_weights=False  # 是否合并权重
+)
+
+# 初始化adapter tuner配置
+adapter_config = AdapterConfig(
+    dim=768,  # hidden states的维度
+    hidden_pos=0,  # 要传递到adapter的hidden state的位置
+    target_modules=r'.*attention.output.dense$',  # 要使用正则表达式替换的模块
+    adapter_length=10  # adapter长度
+)
+
+# 初始化prompt tuner配置
+prompt_config = PromptConfig(
+    dim=768,  # hidden states的维度
+    target_modules=r'.*layer\.\d+$',  # 要使用正则表达式替换的模块
+    embedding_pos=0,    # embedding张量的位置
+    prompt_length=10,   # 提示符token的长度
+    attach_front=False  # 是否将提示符附加在embedding前面
+)
+
+# 使用swift创建模型。在实践中，您可以使用其中任何一个调谐器或它们的组合。
+model = Swift.prepare_model(model, {"lora_tuner": lora_config, "adapter_tuner": adapter_config, "prompt_tuner": prompt_config})
+
+# 获取模型的可训练参数。
+model.get_trainable_parameters()
+# 'trainable params: 838,776 || all params: 87,406,432 || trainable%: 0.9596273189597764'
+```
+
+您可以在SWIFT中使用Peft提供的功能：
+
+```python
+from swift import LoraConfig, Swift
+from peft import TaskType
+lora_config = LoraConfig(target_modules=['query', 'key', 'value'], task_type=TaskType.CAUSAL_LM)
+model_wrapped = Swift.prepare_model(model, lora_config)
+
+# 或者使用from_pretrained从modelscope hub中加载权重。
+model_wrapped = Swift.from_pretrained(model, 'some-id-in-the-modelscope-modelhub')
+```
+
+或者：
+
+```python
+from swift import LoraConfig, get_peft_model, PeftModel
+from peft import TaskType
+lora_config = LoraConfig(target_modules=['query', 'key', 'value'], task_type=TaskType.CAUSAL_LM)
+model_wrapped = get_peft_model(model, lora_config)
+
+# 或者使用from_pretrained从modelscope hub中加载权重。
+model_wrapped = PeftModel.from_pretrained(model, 'some-id-in-the-modelscope-modelhub')
+```
+
+Swift tuners和Peft tuners之间的保存策略略有不同。您可以通过以下方式为Swift tuners命名：
+
+```python
+model = Swift.prepare_model(model, {'default': LoRAConfig(...)})
+model.save_pretrained('./output')
+```
+
+在输出目录中，您将会得到以下类似的目录结构：
+
+```text
+output
+    |-- default
+        |-- adapter_config.json
+        |-- adapter_model.bin
+    |-- adapter_config.json
+    |-- adapter_model.bin
+```
+
+存储在输出目录中的config/weights是extra_state_keys的配置和权重。这与Peft不同，Peft存储了default调谐器的权重和配置。
+
+
+# Learn More
+
+- [ModelScope库](https://github.com/modelscope/modelscope/)
+
+  ModelScope库是ModelScope项目的模型库，包含大量热门模型。
+
+- [将自己的模型贡献给ModelScope](https://modelscope.cn/docs/ModelScope%E6%A8%A1%E5%9E%8B%E6%8E%A5%E5%85%A5%E6%B5%81%E7%A8%8B%E6%A6%82%E8%A7%88)
+
+# License
+
+本项目使用[Apache License (Version 2.0)](https://github.com/modelscope/modelscope/blob/master/LICENSE)进行许可。
@@ -16,7 +16,7 @@
 
 ## Features
 1. supported sft method: [lora](https://arxiv.org/abs/2106.09685), [qlora](https://arxiv.org/abs/2305.14314), full(full parameter fine tuning), ...
-2. supported models: [**qwen-7b**](https://github.com/QwenLM/Qwen-7B), qwen-7b-chat, qwen-vl, **qwen-vl-chat**, baichuan-7b, baichuan-13b, baichuan-13b-chat, chatglm2-6b, chatglm2-6b-32k, llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, llama2-70b-chat, openbuddy-llama2-13b, openbuddy-llama-65b, polylm-13b
+2. supported models: qwen-7b, [qwen-7b-chat](https://github.com/QwenLM/Qwen-7B), qwen-vl, [qwen-vl-chat](https://github.com/QwenLM/Qwen-VL), baichuan-7b, baichuan-13b, baichuan-13b-chat, chatglm2-6b, chatglm2-6b-32k, llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, llama2-70b-chat, openbuddy-llama2-13b, openbuddy-llama-65b, polylm-13b
 3. supported feature: quantization, ddp, model parallelism(device map), gradient checkpoint, gradient accumulation steps, push to modelscope hub, custom datasets, ...
 4. supported datasets: alpaca-en(gpt4), alpaca-zh(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, coco-en
 5. supported templates: chatml(qwen), baichuan, chatglm2, llama, openbuddy_llama, default