
Commit 32495cd

Merge branch 'main' into release/2.0
2 parents: 20e4db4 + a9882fb

File tree: 4 files changed, +58 −54 lines


README.md

Lines changed: 8 additions & 7 deletions
@@ -523,20 +523,21 @@ The complete list of supported models and datasets can be found at [Supported Mo
 
 ### Supported Technologies
 
-| Technology Name |
-|--------------------------------------------------------------- |
+| Technology Name |
+| ------------------------------------------------------------ |
 | 🔥LoRA: [LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS](https://arxiv.org/abs/2106.09685) |
 | 🔥LoRA+: [LoRA+: Efficient Low Rank Adaptation of Large Models](https://arxiv.org/pdf/2402.12354.pdf) |
+| 🔥GaLore: [GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection](https://arxiv.org/abs/2403.03507) |
+| 🔥LISA: [LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning](https://arxiv.org/abs/2403.17919) |
+| 🔥UnSloth: https://github.com/unslothai/unsloth |
 | 🔥LLaMA PRO: [LLAMA PRO: Progressive LLaMA with Block Expansion](https://arxiv.org/pdf/2401.02415.pdf) |
-| 🔥SCEdit: [SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing](https://arxiv.org/abs/2312.11392) < [arXiv](https://arxiv.org/abs/2312.11392) \| [Project Page](https://scedit.github.io/) > |
+| 🔥SCEdit: [SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing](https://arxiv.org/abs/2312.11392) < [arXiv](https://arxiv.org/abs/2312.11392) \| [Project Page](https://scedit.github.io/) > |
 | 🔥NEFTune: [Noisy Embeddings Improve Instruction Finetuning](https://arxiv.org/abs/2310.05914) |
-| QA-LoRA: [Quantization-Aware Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2309.14717) |
 | LongLoRA: [Efficient Fine-tuning of Long-Context Large Language Models](https://arxiv.org/abs/2309.12307) |
-| ROME: [Rank-One Editing of Encoder-Decoder Models](https://arxiv.org/abs/2211.13317) |
 | Adapter: [Parameter-Efficient Transfer Learning for NLP](http://arxiv.org/abs/1902.00751) |
-| Prompt Tuning: [Visual Prompt Tuning](https://arxiv.org/abs/2203.12119) |
+| Vision Prompt Tuning: [Visual Prompt Tuning](https://arxiv.org/abs/2203.12119) |
 | Side: [Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks](https://arxiv.org/abs/1912.13503) |
-| Res-Tuning: [Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone](https://arxiv.org/abs/2310.19859) < [arXiv](https://arxiv.org/abs/2310.19859) \| [Project Page](https://res-tuning.github.io/) \| [Usage](docs/source/GetStarted/ResTuning.md) > |
+| Res-Tuning: [Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone](https://arxiv.org/abs/2310.19859) < [arXiv](https://arxiv.org/abs/2310.19859) \| [Project Page](https://res-tuning.github.io/) \| [Usage](docs/source/GetStarted/ResTuning.md) > |
 | Tuners provided by [PEFT](https://github.com/huggingface/peft), such as IA3, AdaLoRA, etc. |
 
 ### Supported Hardware

README_CN.md

Lines changed: 16 additions & 15 deletions
@@ -522,21 +522,22 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
 
 ### Supported Technologies
 
-| Technology Name |
-| ------------------------------------------------------------ |
-| 🔥LoRA: [LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS](https://arxiv.org/abs/2106.09685) |
-| 🔥LoRA+: [LoRA+: Efficient Low Rank Adaptation of Large Models](https://arxiv.org/pdf/2402.12354.pdf) |
-| 🔥LLaMA PRO: [LLAMA PRO: Progressive LLaMA with Block Expansion](https://arxiv.org/pdf/2401.02415.pdf) |
-| 🔥SCEdit: [SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing](https://arxiv.org/abs/2312.11392) < [arXiv](https://arxiv.org/abs/2312.11392) \| [Project Page](https://scedit.github.io/) > |
-| 🔥NEFTune: [Noisy Embeddings Improve Instruction Finetuning](https://arxiv.org/abs/2310.05914) |
-| QA-LoRA: [Quantization-Aware Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2309.14717) |
-| LongLoRA: [Efficient Fine-tuning of Long-Context Large Language Models](https://arxiv.org/abs/2309.12307) |
-| ROME: [Rank-One Editing of Encoder-Decoder Models](https://arxiv.org/abs/2211.13317) |
-| Adapter: [Parameter-Efficient Transfer Learning for NLP](http://arxiv.org/abs/1902.00751) |
-| Prompt Tuning: [Visual Prompt Tuning](https://arxiv.org/abs/2203.12119) |
-| Side: [Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks](https://arxiv.org/abs/1912.13503) |
-| Res-Tuning: [Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone](https://arxiv.org/abs/2310.19859) < [arXiv](https://arxiv.org/abs/2310.19859) \| [Project Page](https://res-tuning.github.io/) \| [Usage](docs/source/GetStarted/ResTuning.md) > |
-| Tuners provided by [PEFT](https://github.com/huggingface/peft), such as IA3, AdaLoRA, etc. |
+| Technology Name |
+| ------------------------------------------------------------ |
+| 🔥LoRA: [LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS](https://arxiv.org/abs/2106.09685) |
+| 🔥LoRA+: [LoRA+: Efficient Low Rank Adaptation of Large Models](https://arxiv.org/pdf/2402.12354.pdf) |
+| 🔥LLaMA PRO: [LLAMA PRO: Progressive LLaMA with Block Expansion](https://arxiv.org/pdf/2401.02415.pdf) |
+| 🔥GaLore: [GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection](https://arxiv.org/abs/2403.03507) |
+| 🔥LISA: [LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning](https://arxiv.org/abs/2403.17919) |
+| 🔥UnSloth: https://github.com/unslothai/unsloth |
+| 🔥SCEdit: [SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing](https://arxiv.org/abs/2312.11392) < [arXiv](https://arxiv.org/abs/2312.11392) \| [Project Page](https://scedit.github.io/) > |
+| 🔥NEFTune: [Noisy Embeddings Improve Instruction Finetuning](https://arxiv.org/abs/2310.05914) |
+| LongLoRA: [Efficient Fine-tuning of Long-Context Large Language Models](https://arxiv.org/abs/2309.12307) |
+| Adapter: [Parameter-Efficient Transfer Learning for NLP](http://arxiv.org/abs/1902.00751) |
+| Vision Prompt Tuning: [Visual Prompt Tuning](https://arxiv.org/abs/2203.12119) |
+| Side: [Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks](https://arxiv.org/abs/1912.13503) |
+| Res-Tuning: [Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone](https://arxiv.org/abs/2310.19859) < [arXiv](https://arxiv.org/abs/2310.19859) \| [Project Page](https://res-tuning.github.io/) \| [Usage](docs/source/GetStarted/ResTuning.md) > |
+| Tuners provided by [PEFT](https://github.com/huggingface/peft), such as IA3, AdaLoRA, etc. |
 
 ### Supported Hardware
docs/source/GetStarted/使用tuners.md

Lines changed: 17 additions & 16 deletions
@@ -5,27 +5,28 @@ A tuner is an additional structural component attached to the model, used to reduce the number of trainable parameters
 1. LoRA: [LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS](https://arxiv.org/abs/2106.09685)
 2. LoRA+: [LoRA+: Efficient Low Rank Adaptation of Large Models](https://arxiv.org/pdf/2402.12354.pdf)
 3. LLaMA PRO: [LLAMA PRO: Progressive LLaMA with Block Expansion](https://arxiv.org/pdf/2401.02415.pdf)
-4. SCEdit: [SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing](https://arxiv.org/abs/2312.11392) < [arXiv](https://arxiv.org/abs/2312.11392) | [Project Page](https://scedit.github.io/) >
-5. NEFTune: [Noisy Embeddings Improve Instruction Finetuning](https://arxiv.org/abs/2310.05914)
-6. QA-LoRA: [Quantization-Aware Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2309.14717)
-7. LongLoRA: [Efficient Fine-tuning of Long-Context Large Language Models](https://arxiv.org/abs/2309.12307)
-8. ROME: [Rank-One Editing of Encoder-Decoder Models](https://arxiv.org/abs/2211.13317)
-9. Adapter: [Parameter-Efficient Transfer Learning for NLP](http://arxiv.org/abs/1902.00751)
-10. Prompt Tuning: [Visual Prompt Tuning](https://arxiv.org/abs/2203.12119)
-11. Side: [Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks](https://arxiv.org/abs/1912.13503)
-12. Res-Tuning: [Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone](https://arxiv.org/abs/2310.19859) < [arXiv](https://arxiv.org/abs/2310.19859) | [Project Page](https://res-tuning.github.io/) | [Usage](docs/source/GetStarted/ResTuning.md) >
-13. Tuners provided by [PEFT](https://github.com/huggingface/peft), such as IA3, AdaLoRA, etc.
+4. GaLore: [GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection](https://arxiv.org/abs/2403.03507)
+5. LISA: [LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning](https://arxiv.org/abs/2403.17919)
+6. UnSloth: https://github.com/unslothai/unsloth
+7. SCEdit: [SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing](https://arxiv.org/abs/2312.11392) < [arXiv](https://arxiv.org/abs/2312.11392) | [Project Page](https://scedit.github.io/) >
+8. NEFTune: [Noisy Embeddings Improve Instruction Finetuning](https://arxiv.org/abs/2310.05914)
+9. LongLoRA: [Efficient Fine-tuning of Long-Context Large Language Models](https://arxiv.org/abs/2309.12307)
+10. Adapter: [Parameter-Efficient Transfer Learning for NLP](http://arxiv.org/abs/1902.00751)
+11. Vision Prompt Tuning: [Visual Prompt Tuning](https://arxiv.org/abs/2203.12119)
+12. Side: [Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks](https://arxiv.org/abs/1912.13503)
+13. Res-Tuning: [Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone](https://arxiv.org/abs/2310.19859) < [arXiv](https://arxiv.org/abs/2310.19859) | [Project Page](https://res-tuning.github.io/) | [Usage](docs/source/GetStarted/ResTuning.md) >
+14. Tuners provided by [PEFT](https://github.com/huggingface/peft), such as IA3, AdaLoRA, etc.
 
 ## Using in Training
 
 Call `Swift.prepare_model()` to add tuners to the model:
 
 ```python
 from modelscope import Model
-from swift import Swift, LoRAConfig
+from swift import Swift, LoraConfig
 import torch
 model = Model.from_pretrained('ZhipuAI/chatglm3-6b', torch_dtype=torch.bfloat16, device_map='auto')
-lora_config = LoRAConfig(
+lora_config = LoraConfig(
     r=16,
     target_modules=['query_key_value'],
     lora_alpha=32,
@@ -37,10 +38,10 @@ model = Swift.prepare_model(model, lora_config)
 
 ```python
 from modelscope import Model
-from swift import Swift, LoRAConfig, AdapterConfig
+from swift import Swift, LoraConfig, AdapterConfig
 import torch
 model = Model.from_pretrained('ZhipuAI/chatglm3-6b', torch_dtype=torch.bfloat16, device_map='auto')
-lora_config = LoRAConfig(
+lora_config = LoraConfig(
     r=16,
     target_modules=['query_key_value'],
     lora_alpha=32,
@@ -105,13 +106,13 @@ model.save_pretrained(save_directory='./output')
 from swift import Seq2SeqTrainer, Seq2SeqTrainingArguments
 from modelscope import MsDataset, AutoTokenizer
 from modelscope import AutoModelForCausalLM
-from swift import Swift, LoRAConfig
+from swift import Swift, LoraConfig
 from swift.llm import get_template, TemplateType
 import torch
 
 # load model
 model = AutoModelForCausalLM.from_pretrained('ZhipuAI/chatglm3-6b', torch_dtype=torch.bfloat16, device_map='auto', trust_remote_code=True)
-lora_config = LoRAConfig(
+lora_config = LoraConfig(
     r=16,
     target_modules=['query_key_value'],
     lora_alpha=32,
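For reference, the renamed `LoraConfig` is used end-to-end as follows. This is a minimal sketch assembled from the snippets in this diff; the `lora_dropout` value is an illustrative assumption rather than something the commit specifies, while the save call mirrors the doc's later `model.save_pretrained(save_directory='./output')` example.

```python
from modelscope import Model
import torch

from swift import Swift, LoraConfig  # renamed in this commit from the old LoRAConfig

# Load the base model from ModelScope, as in the doc snippet.
model = Model.from_pretrained('ZhipuAI/chatglm3-6b',
                              torch_dtype=torch.bfloat16,
                              device_map='auto')

# LoRA hyperparameters shown in the diff; lora_dropout is an illustrative
# extra argument, not part of this commit.
lora_config = LoraConfig(
    r=16,
    target_modules=['query_key_value'],
    lora_alpha=32,
    lora_dropout=0.05,
)

# Attach the tuner: the base weights are frozen and only the LoRA
# parameters remain trainable.
model = Swift.prepare_model(model, lora_config)

# ... train ...

# Persist the tuner weights.
model.save_pretrained(save_directory='./output')
```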

docs/source_en/GetStarted/Tuners.md

Lines changed: 17 additions & 16 deletions
@@ -5,27 +5,28 @@
 1. LoRA: [LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS](https://arxiv.org/abs/2106.09685)
 2. LoRA+: [LoRA+: Efficient Low Rank Adaptation of Large Models](https://arxiv.org/pdf/2402.12354.pdf)
 3. LLaMA PRO: [LLAMA PRO: Progressive LLaMA with Block Expansion](https://arxiv.org/pdf/2401.02415.pdf)
-4. SCEdit: [SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing](https://arxiv.org/abs/2312.11392) < [arXiv](https://arxiv.org/abs/2312.11392) | [Project Page](https://scedit.github.io/) >
-5. NEFTune: [Noisy Embeddings Improve Instruction Finetuning](https://arxiv.org/abs/2310.05914)
-6. QA-LoRA: [Quantization-Aware Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2309.14717)
-7. LongLoRA: [Efficient Fine-tuning of Long-Context Large Language Models](https://arxiv.org/abs/2309.12307)
-8. ROME: [Rank-One Editing of Encoder-Decoder Models](https://arxiv.org/abs/2211.13317)
-9. Adapter: [Parameter-Efficient Transfer Learning for NLP](http://arxiv.org/abs/1902.00751)
-10. Prompt Tuning: [Visual Prompt Tuning](https://arxiv.org/abs/2203.12119)
-11. Side: [Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks](https://arxiv.org/abs/1912.13503)
-12. Res-Tuning: [Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone](https://arxiv.org/abs/2310.19859) < [arXiv](https://arxiv.org/abs/2310.19859) | [Project Page](https://res-tuning.github.io/) | [Usage](docs/source/GetStarted/ResTuning.md) >
-13. Tuners provided by [PEFT](https://github.com/huggingface/peft), such as IA3, AdaLoRA, etc.
+4. GaLore: [GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection](https://arxiv.org/abs/2403.03507)
+5. LISA: [LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning](https://arxiv.org/abs/2403.17919)
+6. UnSloth: https://github.com/unslothai/unsloth
+7. SCEdit: [SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing](https://arxiv.org/abs/2312.11392) < [arXiv](https://arxiv.org/abs/2312.11392) | [Project Page](https://scedit.github.io/) >
+8. NEFTune: [Noisy Embeddings Improve Instruction Finetuning](https://arxiv.org/abs/2310.05914)
+9. LongLoRA: [Efficient Fine-tuning of Long-Context Large Language Models](https://arxiv.org/abs/2309.12307)
+10. Adapter: [Parameter-Efficient Transfer Learning for NLP](http://arxiv.org/abs/1902.00751)
+11. Vision Prompt Tuning: [Visual Prompt Tuning](https://arxiv.org/abs/2203.12119)
+12. Side: [Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks](https://arxiv.org/abs/1912.13503)
+13. Res-Tuning: [Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone](https://arxiv.org/abs/2310.19859) < [arXiv](https://arxiv.org/abs/2310.19859) | [Project Page](https://res-tuning.github.io/) | [Usage](docs/source/GetStarted/ResTuning.md) >
+14. Tuners provided by [PEFT](https://github.com/huggingface/peft), such as IA3, AdaLoRA, etc.
 
 ## Using in Training
 
 Call `Swift.prepare_model()` to add tuners to the model:
 
 ```python
 from modelscope import Model
-from swift import Swift, LoRAConfig
+from swift import Swift, LoraConfig
 import torch
 model = Model.from_pretrained('ZhipuAI/chatglm3-6b', torch_dtype=torch.bfloat16, device_map='auto')
-lora_config = LoRAConfig(
+lora_config = LoraConfig(
     r=16,
     target_modules=['query_key_value'],
    lora_alpha=32,
@@ -37,10 +38,10 @@ Multiple tuners can also be used simultaneously:
 
 ```python
 from modelscope import Model
-from swift import Swift, LoRAConfig, AdapterConfig
+from swift import Swift, LoraConfig, AdapterConfig
 import torch
 model = Model.from_pretrained('ZhipuAI/chatglm3-6b', torch_dtype=torch.bfloat16, device_map='auto')
-lora_config = LoRAConfig(
+lora_config = LoraConfig(
     r=16,
     target_modules=['query_key_value'],
     lora_alpha=32,
@@ -105,13 +106,13 @@ If only a single config is passed in, the default name `default` will be used:
 from swift import Seq2SeqTrainer, Seq2SeqTrainingArguments
 from modelscope import MsDataset, AutoTokenizer
 from modelscope import AutoModelForCausalLM
-from swift import Swift, LoRAConfig
+from swift import Swift, LoraConfig
 from swift.llm import get_template, TemplateType
 import torch
 
 # load model
 model = AutoModelForCausalLM.from_pretrained('ZhipuAI/chatglm3-6b', torch_dtype=torch.bfloat16, device_map='auto', trust_remote_code=True)
-lora_config = LoRAConfig(
+lora_config = LoraConfig(
     r=16,
     target_modules=['query_key_value'],
     lora_alpha=32,
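To round out the "multiple tuners" change above, here is a hedged sketch of attaching several configs at once. The tuner names and the `AdapterConfig` arguments are illustrative assumptions (this diff only shows the imports and the LoRA portion); the dict form follows the doc's note that a single config falls back to the name `default`.

```python
from modelscope import Model
import torch

from swift import Swift, LoraConfig, AdapterConfig

model = Model.from_pretrained('ZhipuAI/chatglm3-6b',
                              torch_dtype=torch.bfloat16,
                              device_map='auto')

lora_config = LoraConfig(
    r=16,
    target_modules=['query_key_value'],
    lora_alpha=32,
)

# Illustrative adapter config; the field names and values here are
# placeholders, check the swift docs for the exact AdapterConfig signature
# in your version.
adapter_config = AdapterConfig(
    dim=model.config.hidden_size,
    target_modules=['mlp'],
)

# Passing a dict registers each tuner under an explicit name; passing a
# single config instead would register it under the default name 'default'.
model = Swift.prepare_model(model, {
    'lora': lora_config,
    'adapter': adapter_config,
})
```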
