
Commit 43be956

Merge branch 'main' into release/2.3

2 parents: 742d16b + e29cf5a


58 files changed, +1265 -635 lines

README.md

Lines changed: 8 additions & 1 deletion
```diff
@@ -55,6 +55,11 @@ You can contact us and communicate with us by adding our group:
 <img src="asset/discord_qr.jpg" width="200" height="200"> | <img src="asset/wechat.png" width="200" height="200">

 ## 🎉 News
+- 🔥2024.08.22: Support for the `reft` tuner from [ReFT](https://github.com/stanfordnlp/pyreft), which is 15x-65x more parameter-efficient than LoRA; use `--sft_type reft` to begin!
+- 2024.08.21: Support for phi3_5-mini-instruct, phi3_5-moe-instruct, and phi3_5-vision-instruct.
+- 2024.08.21: Support for idefics3-8b-llama3, llava-onevision-qwen2-0_5b-ov, llava-onevision-qwen2-7b-ov, and llava-onevision-qwen2-72b-ov.
+- 🔥2024.08.20: Support for fine-tuning multimodal large models using DeepSpeed-Zero3.
+- 2024.08.20: Supported models: longwriter-glm4-9b, longwriter-llama3_1-8b. Supported dataset: longwriter-6k.
 - 🔥2024.08.12: 🎉 The SWIFT paper has been published on arXiv. Check [this link](https://arxiv.org/abs/2408.05517) to read it.
 - 🔥2024.08.12: Support for packing with flash-attention without contaminating the attention_mask; use `--packing` to begin. See the [PR](https://github.com/huggingface/transformers/pull/31629/files).
 - 🔥2024.08.09: Support for inference and fine-tuning of the qwen2-audio model. Best practice can be found [here](https://github.com/modelscope/ms-swift/issues/1653).
@@ -68,6 +73,8 @@ You can contact us and communicate with us by adding our group:
 - 🔥2024.07.24: Support for the DPO/ORPO/SimPO/CPO alignment algorithms for vision MLLMs; training scripts can be found in the [document](docs/source_en/Multi-Modal/human-preference-alignment-training-documentation.md). Support for the RLAIF-V dataset.
 - 🔥2024.07.24: Support for using Megatron for CPT and SFT on the Qwen2 series. You can refer to the [Megatron training documentation](docs/source_en/LLM/Megatron-training.md).
 - 🔥2024.07.24: Support for the llama3.1 series models, including 8b, 70b, and 405b. Support for openbuddy-llama3_1-8b-chat.
+<details><summary>More</summary>
+
 - 2024.07.20: Support for the mistral-nemo series models. Use `--model_type mistral-nemo-base-2407` and `--model_type mistral-nemo-instruct-2407` to begin.
 - 2024.07.19: Support for [Q-Galore](https://arxiv.org/abs/2407.08296); this algorithm can reduce training memory cost by 60% (qwen-7b-chat, full, 80G -> 35G). Use `swift sft --model_type xxx --use_galore true --galore_quantization true` to begin!
 - 2024.07.17: Support for the newly released InternVL2 models; `model_type`s are internvl2-1b, internvl2-40b, internvl2-llama3-76b. For best practices, refer to [here](docs/source_en/Multi-Modal/internvl-best-practice.md).
@@ -81,7 +88,6 @@ You can contact us and communicate with us by adding our group:
 - 2024.07.04: Support for the internlm2_5-7b series: internlm2_5-7b, internlm2_5-7b-chat, internlm2_5-7b-chat-1m.
 - 2024.07.02: Support for `llava1_6-vicuna-7b-instruct`, `llava1_6-vicuna-13b-instruct`, and other llava-hf models. For best practices, refer to [here](docs/source_en/Multi-Modal/llava-best-practice.md).
 - 🔥2024.06.29: Support for [eval-scope](https://github.com/modelscope/eval-scope) & [open-compass](https://github.com/open-compass/opencompass) evaluation! We now support over 50 eval datasets such as `BoolQ, ocnli, humaneval, math, ceval, mmlu, gsm8k, ARC_e`; please check our [Eval Doc](https://github.com/modelscope/swift/blob/main/docs/source_en/LLM/LLM-eval.md) to begin! In the next sprint we will support multi-modal and Agent evaluation, so remember to follow us : )
-<details><summary>More</summary>

 - 🔥2024.06.28: Support for the **Florence** series models! See the [document](docs/source_en/Multi-Modal/florence-best-pratice.md).
 - 🔥2024.06.28: Support for the Gemma2 series models: gemma2-9b, gemma2-9b-instruct, gemma2-27b, gemma2-27b-instruct.
@@ -618,6 +624,7 @@ The complete list of supported models and datasets can be found at [Supported Mo
 | DeepSeek-VL | [DeepSeek series vision models](https://github.com/deepseek-ai) | Chinese<br>English | 1.3B-7B | chat model |
 | MiniCPM-V<br>MiniCPM-V-2<br>MiniCPM-V-2.5<br>MiniCPM-V-2.6 | [OpenBmB MiniCPM vision model](https://github.com/OpenBMB/MiniCPM) | Chinese<br>English | 3B-9B | chat model |
 | CogVLM<br>CogAgent<br>CogVLM2<br>CogVLM2-Video<br>GLM4V | [Zhipu ChatGLM visual QA and Agent model](https://github.com/THUDM/) | Chinese<br>English | 9B-19B | chat model |
+| Llava-HF | [Llava-HF series models](https://huggingface.co/llava-hf) | English | 0.5B-110B | chat model |
 | Llava1.5<br>Llava1.6 | [Llava series models](https://github.com/haotian-liu/LLaVA) | English | 7B-34B | chat model |
 | Llava-Next<br>Llava-Next-Video | [Llava-Next series models](https://github.com/LLaVA-VL/LLaVA-NeXT) | Chinese<br>English | 7B-110B | chat model |
 | mPLUG-Owl | [mPLUG-Owl series models](https://github.com/X-PLUG/mPLUG-Owl) | English | 11B | chat model |
```
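The ReFT news item above also implies a Python-level equivalent of the `--sft_type reft` flag. A minimal sketch, assuming `SftArguments` exposes `sft_type` under the same name as the CLI flag; the model and dataset names below are placeholders:

```python
# Hedged sketch of ReFT fine-tuning via swift's Python entry point.
# Assumption: SftArguments mirrors the CLI, so `--sft_type reft` maps to
# sft_type='reft'. model_type and dataset are placeholders.
from swift.llm import SftArguments, sft_main

sft_args = SftArguments(
    model_type='qwen-7b-chat',  # placeholder model
    dataset=['alpaca-en'],      # placeholder dataset
    sft_type='reft')
sft_main(sft_args)
```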

README_CN.md

Lines changed: 8 additions & 2 deletions
```diff
@@ -56,6 +56,11 @@ SWIFT has rich and comprehensive documentation; please check our documentation site:


 ## 🎉 News
+- 🔥2024.08.22: Support for [ReFT](https://github.com/stanfordnlp/pyreft); this tuner can match or beat LoRA with 1/15 to 1/65 of LoRA's parameter count. Use `--sft_type reft` to start training!
+- 2024.08.21: Support for phi3_5-mini-instruct, phi3_5-moe-instruct, phi3_5-vision-instruct.
+- 2024.08.21: Support for idefics3-8b-llama3, llava-onevision-qwen2-0_5b-ov, llava-onevision-qwen2-7b-ov, llava-onevision-qwen2-72b-ov.
+- 🔥2024.08.20: Support for fine-tuning multimodal large models with deepspeed-zero3.
+- 2024.08.20: Supported models: longwriter-glm4-9b, longwriter-llama3_1-8b. Supported dataset: longwriter-6k.
 - 🔥2024.08.12: 🎉 The SWIFT paper has been published on arXiv; click [this link](https://arxiv.org/abs/2408.05517) to read it.
 - 🔥2024.08.12: Support for packing with flash-attention without contaminating the attention_mask; use `--packing` to enable. See the [PR](https://github.com/huggingface/transformers/pull/31629/files) for details.
 - 🔥2024.08.09: Support for inference and fine-tuning of the qwen2-audio model. Best practice can be found [here](https://github.com/modelscope/ms-swift/issues/1653).
@@ -69,6 +74,8 @@ SWIFT has rich and comprehensive documentation; please check our documentation site:
 - 🔥2024.07.24: The human preference alignment algorithms now support vision multimodal large models, including DPO/ORPO/SimPO/CPO; for training, see the [document](docs/source/Multi-Modal/人类偏好对齐训练文档.md). Support for the RLAIF-V dataset.
 - 🔥2024.07.24: Support for CPT and SFT on the qwen2 series using megatron. See the [megatron training documentation](docs/source/LLM/Megatron训练文档.md).
 - 🔥2024.07.24: Support for the llama3.1 series models, including 8b, 70b, and 405b. Support for openbuddy-llama3_1-8b-chat.
+<details><summary>More</summary>
+
 - 2024.07.20: Support for the mistral-nemo series models. Use `--model_type mistral-nemo-base-2407` and `--model_type mistral-nemo-instruct-2407` to start training and inference.
 - 🔥2024.07.19: Support for the [Q-Galore](https://arxiv.org/abs/2407.08296) algorithm, which can reduce GPU memory usage by about 60% (qwen-7b-chat, full, 80G -> 35G); use `swift sft --model_type xxx --use_galore true --galore_quantization true` to start training!
 - 2024.07.17: Support for the new InternVL2 models; `model_type`s are internvl2-1b, internvl2-40b, internvl2-llama3-76b. Best practices can be found [here](docs/source/Multi-Modal/internvl最佳实践.md).
@@ -82,8 +89,6 @@ SWIFT has rich and comprehensive documentation; please check our documentation site:
 - 2024.07.04: Support for the internlm2_5-7b series: internlm2_5-7b, internlm2_5-7b-chat, internlm2_5-7b-chat-1m.
 - 2024.07.02: Support for `llava1_6-vicuna-7b-instruct`, `llava1_6-vicuna-13b-instruct`, and other llava-hf models. Best practices can be found [here](docs/source/Multi-Modal/llava最佳实践.md).
 - 🔥2024.06.29: Support for [eval-scope](https://github.com/modelscope/eval-scope) & [open-compass](https://github.com/open-compass/opencompass) evaluation! We support an evaluation pipeline covering 50+ standard datasets including `BoolQ, ocnli, humaneval, math, ceval, mmlu, gsm8k, ARC_e`; see our [evaluation documentation](https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM评测文档.md) to get started. In the next iteration we will support multimodal and Agent evaluation, so stay tuned : )
-<details><summary>More</summary>
-
 - 🔥2024.06.28: Support for the **Florence** series models; see the [Florence best practice](docs/source/Multi-Modal/florence最佳实践.md).
 - 🔥2024.06.28: Support for the **Gemma2** series models: gemma2-9b, gemma2-9b-instruct, gemma2-27b, gemma2-27b-instruct.
 - 🔥2024.06.18: Support for the **DeepSeek-Coder-v2** series models! Use model_type `deepseek-coder-v2-instruct` and `deepseek-coder-v2-lite-instruct` to start training and inference.
@@ -612,6 +617,7 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
 | DeepSeek-VL | [DeepSeek series vision models](https://github.com/deepseek-ai) | Chinese<br>English | 1.3B-7B | chat model |
 | MiniCPM-V<br>MiniCPM-V-2<br>MiniCPM-V-2.5<br>MiniCPM-V-2.6 | [OpenBmB MiniCPM vision model](https://github.com/OpenBMB/MiniCPM) | Chinese<br>English | 3B-9B | chat model |
 | CogVLM<br>CogAgent<br>CogVLM2<br>CogVLM2-Video<br>GLM4V | [Zhipu ChatGLM visual QA and Agent model](https://github.com/THUDM/) | Chinese<br>English | 9B-19B | chat model |
+| Llava-HF | [Llava-HF series models](https://huggingface.co/llava-hf) | English | 0.5B-110B | chat model |
 | Llava1.5<br>Llava1.6 | [Llava series models](https://github.com/haotian-liu/LLaVA) | English | 7B-34B | chat model |
 | Llava-Next<br>Llava-Next-Video | [Llava-Next series models](https://github.com/LLaVA-VL/LLaVA-NeXT) | Chinese<br>English | 7B-110B | chat model |
 | mPLUG-Owl | [mPLUG-Owl series models](https://github.com/X-PLUG/mPLUG-Owl) | English | 11B | chat model |
```
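The Q-Galore item quotes only the CLI form. As a rough Python counterpart, a sketch assuming the `--use_galore` and `--galore_quantization` flags correspond to `SftArguments` fields of the same names:

```python
# Hedged sketch of the Q-Galore invocation from the news item.
# Assumption: the CLI flags map one-to-one onto SftArguments fields;
# model_type and dataset are placeholders (the news item writes `xxx`).
from swift.llm import SftArguments, sft_main

sft_args = SftArguments(
    model_type='qwen-7b-chat',  # placeholder model
    dataset=['alpaca-en'],      # placeholder dataset
    use_galore=True,
    galore_quantization=True)   # claimed ~60% training-memory reduction
sft_main(sft_args)
```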

docs/source/LLM/LLM微调文档.md

Lines changed: 2 additions & 2 deletions
```diff
@@ -211,8 +211,8 @@ from swift.tuners import Swift
 ckpt_dir = 'vx-xxx/checkpoint-100'
 model_type = ModelType.qwen_7b_chat
 template_type = get_default_template_type(model_type)
-
-model, tokenizer = get_model_tokenizer(model_type, model_kwargs={'device_map': 'auto'})
+model_id_or_path = None
+model, tokenizer = get_model_tokenizer(model_type, model_id_or_path=model_id_or_path, model_kwargs={'device_map': 'auto'})

 model = Swift.from_pretrained(model, ckpt_dir, inference_mode=True)
 template = get_template(template_type, tokenizer)
```
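For context, the changed snippet assembles into the following flow; a sketch assuming `swift.llm.inference` is the same helper used in the inference docs, with `ckpt_dir` left as the doc's placeholder path:

```python
# Load base weights (model_id_or_path=None keeps the default lookup for
# model_type), then attach the fine-tuned checkpoint and run one query.
from swift.llm import (ModelType, get_default_template_type,
                       get_model_tokenizer, get_template, inference)
from swift.tuners import Swift

ckpt_dir = 'vx-xxx/checkpoint-100'  # placeholder path from the doc
model_type = ModelType.qwen_7b_chat
template_type = get_default_template_type(model_type)

model_id_or_path = None  # or a local directory / hub model id
model, tokenizer = get_model_tokenizer(model_type, model_id_or_path=model_id_or_path,
                                       model_kwargs={'device_map': 'auto'})

model = Swift.from_pretrained(model, ckpt_dir, inference_mode=True)
template = get_template(template_type, tokenizer)
response, history = inference(model, template, 'Hello!')
print(response)
```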

docs/source/LLM/LLM推理文档.md

Lines changed: 5 additions & 4 deletions
```diff
@@ -42,8 +42,9 @@ print(f'template_type: {template_type}')  # template_type: qwen

 kwargs = {}
 # kwargs['use_flash_attn'] = True  # to use flash_attn
-
-model, tokenizer = get_model_tokenizer(model_type, model_kwargs={'device_map': 'auto'}, **kwargs)
+model_id_or_path = None
+model, tokenizer = get_model_tokenizer(model_type, model_id_or_path=model_id_or_path,
+                                       model_kwargs={'device_map': 'auto'}, **kwargs)
 # modify max_new_tokens
 model.generation_config.max_new_tokens = 128

@@ -178,8 +179,8 @@ from swift.utils import seed_everything
 model_type = ModelType.qwen_7b_chat
 template_type = get_default_template_type(model_type)
 print(f'template_type: {template_type}')  # template_type: qwen
-
-model, tokenizer = get_model_tokenizer(model_type, model_kwargs={'device_map': 'auto'})
+model_id_or_path = None
+model, tokenizer = get_model_tokenizer(model_type, model_id_or_path=model_id_or_path, model_kwargs={'device_map': 'auto'})

 template = get_template(template_type, tokenizer)
 seed_everything(42)
```
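The point of the new argument is to load weights from an explicit location instead of the `model_type` default. A sketch, assuming `model_id_or_path` accepts a local checkpoint directory or a hub model id (the path below is hypothetical):

```python
# Same flow as the doc, but pointing get_model_tokenizer at explicit weights.
from swift.llm import (ModelType, get_default_template_type,
                       get_model_tokenizer, get_template, inference)
from swift.utils import seed_everything

model_type = ModelType.qwen_7b_chat
template_type = get_default_template_type(model_type)

model_id_or_path = '/path/to/qwen-7b-chat'  # hypothetical local dir; None = default
model, tokenizer = get_model_tokenizer(model_type, model_id_or_path=model_id_or_path,
                                       model_kwargs={'device_map': 'auto'})
model.generation_config.max_new_tokens = 128

template = get_template(template_type, tokenizer)
seed_everything(42)
response, history = inference(model, template, 'Hello!')
print(response)
```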

docs/source/LLM/LmDeploy推理加速与部署.md

Lines changed: 4 additions & 2 deletions
```diff
@@ -37,7 +37,8 @@ from swift.llm import (
 )

 model_type = ModelType.qwen_7b_chat
-lmdeploy_engine = get_lmdeploy_engine(model_type)
+model_id_or_path = None
+lmdeploy_engine = get_lmdeploy_engine(model_type, model_id_or_path=model_id_or_path)
 template_type = get_default_template_type(model_type)
 template = get_template(template_type, lmdeploy_engine.hf_tokenizer)
 # an interface similar to `transformers.GenerationConfig`
@@ -95,7 +96,8 @@ from swift.llm import (

 if __name__ == '__main__':
     model_type = ModelType.qwen2_7b_instruct
-    lmdeploy_engine = get_lmdeploy_engine(model_type, tp=2)
+    model_id_or_path = None
+    lmdeploy_engine = get_lmdeploy_engine(model_type, model_id_or_path=model_id_or_path, tp=2)
     template_type = get_default_template_type(model_type)
     template = get_template(template_type, lmdeploy_engine.hf_tokenizer)
     # an interface similar to `transformers.GenerationConfig`
```
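A hedged end-to-end version of the first hunk; `inference_lmdeploy` and the request-list shape are assumptions based on swift's other LmDeploy examples:

```python
# Build the engine with the new model_id_or_path argument, then run a batch.
# inference_lmdeploy and the request/response dict shapes are assumptions.
from swift.llm import (ModelType, get_default_template_type,
                       get_lmdeploy_engine, get_template, inference_lmdeploy)

model_type = ModelType.qwen_7b_chat
model_id_or_path = None  # or a local checkpoint directory
lmdeploy_engine = get_lmdeploy_engine(model_type, model_id_or_path=model_id_or_path)
template_type = get_default_template_type(model_type)
template = get_template(template_type, lmdeploy_engine.hf_tokenizer)
lmdeploy_engine.generation_config.max_new_tokens = 256  # GenerationConfig-like

request_list = [{'query': 'Hello!'}]
resp_list = inference_lmdeploy(lmdeploy_engine, template, request_list)
print(resp_list[0]['response'])
```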

docs/source/LLM/Qwen1.5全流程最佳实践.md

Lines changed: 6 additions & 5 deletions
```diff
@@ -59,8 +59,8 @@ print(f'template_type: {template_type}')  # template_type: qwen

 kwargs = {}
 # kwargs['use_flash_attn'] = True  # to use flash_attn
-
-model, tokenizer = get_model_tokenizer(model_type, torch.float16,
+model_id_or_path = None
+model, tokenizer = get_model_tokenizer(model_type, torch.float16, model_id_or_path=model_id_or_path,
                                        model_kwargs={'device_map': 'auto'}, **kwargs)
 # modify max_new_tokens
 model.generation_config.max_new_tokens = 128
@@ -108,7 +108,8 @@ from swift.llm import (
 import torch

 model_type = ModelType.qwen1half_7b_chat_awq
-llm_engine = get_vllm_engine(model_type, torch.float16, max_model_len=4096)
+model_id_or_path = None
+llm_engine = get_vllm_engine(model_type, torch.float16, model_id_or_path=model_id_or_path, max_model_len=4096)
 template_type = get_default_template_type(model_type)
 template = get_template(template_type, llm_engine.hf_tokenizer)
 # an interface similar to `transformers.GenerationConfig`
@@ -264,8 +265,8 @@ seed_everything(42)
 ckpt_dir = 'output/qwen1half-7b-chat/vx-xxx/checkpoint-xxx'
 model_type = ModelType.qwen1half_7b_chat
 template_type = get_default_template_type(model_type)
-
-model, tokenizer = get_model_tokenizer(model_type, model_kwargs={'device_map': 'auto'})
+model_id_or_path = None
+model, tokenizer = get_model_tokenizer(model_type, model_id_or_path=model_id_or_path, model_kwargs={'device_map': 'auto'})
 model.generation_config.max_new_tokens = 128

 model = Swift.from_pretrained(model, ckpt_dir, inference_mode=True)
```
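The second hunk's vLLM engine can be exercised like this; a sketch assuming `inference_vllm` and the request-list convention from swift's vLLM doc:

```python
# AWQ-quantized qwen1.5 engine from the hunk above, driven over a batch.
# inference_vllm and the dict shapes are assumptions from the vLLM doc.
import torch
from swift.llm import (ModelType, get_default_template_type, get_template,
                       get_vllm_engine, inference_vllm)

model_type = ModelType.qwen1half_7b_chat_awq
model_id_or_path = None  # or a local AWQ checkpoint directory
llm_engine = get_vllm_engine(model_type, torch.float16,
                             model_id_or_path=model_id_or_path, max_model_len=4096)
template_type = get_default_template_type(model_type)
template = get_template(template_type, llm_engine.hf_tokenizer)
llm_engine.generation_config.max_new_tokens = 256  # GenerationConfig-like

request_list = [{'query': 'Hello!'}, {'query': 'Who are you?'}]
resp_list = inference_vllm(llm_engine, template, request_list)
for request, resp in zip(request_list, resp_list):
    print(f"query: {request['query']}\nresponse: {resp['response']}")
```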

docs/source/LLM/VLLM推理加速与部署.md

Lines changed: 4 additions & 2 deletions
```diff
@@ -42,7 +42,8 @@ from swift.llm import (
 )

 model_type = ModelType.qwen_7b_chat
-llm_engine = get_vllm_engine(model_type)
+model_id_or_path = None
+llm_engine = get_vllm_engine(model_type, model_id_or_path=model_id_or_path)
 template_type = get_default_template_type(model_type)
 template = get_template(template_type, llm_engine.hf_tokenizer)
 # an interface similar to `transformers.GenerationConfig`
@@ -98,7 +99,8 @@ from swift.llm import (
 )
 if __name__ == '__main__':
     model_type = ModelType.qwen_7b_chat
-    llm_engine = get_vllm_engine(model_type, tensor_parallel_size=2)
+    model_id_or_path = None
+    llm_engine = get_vllm_engine(model_type, model_id_or_path=model_id_or_path, tensor_parallel_size=2)
     template_type = get_default_template_type(model_type)
     template = get_template(template_type, llm_engine.hf_tokenizer)
     # an interface similar to `transformers.GenerationConfig`
```
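The `if __name__ == '__main__':` guard in the second hunk is load-bearing: with `tensor_parallel_size=2`, vLLM spawns worker processes that re-import the module, so engine construction must not run at import time. A sketch under the same `inference_vllm` assumption as above:

```python
# Tensor-parallel engine; the main-guard keeps worker re-imports from
# rebuilding the engine. inference_vllm usage is an assumption from this doc.
from swift.llm import (ModelType, get_default_template_type, get_template,
                       get_vllm_engine, inference_vllm)

if __name__ == '__main__':
    model_type = ModelType.qwen_7b_chat
    model_id_or_path = None  # or a local checkpoint directory
    llm_engine = get_vllm_engine(model_type, model_id_or_path=model_id_or_path,
                                 tensor_parallel_size=2)
    template_type = get_default_template_type(model_type)
    template = get_template(template_type, llm_engine.hf_tokenizer)

    resp_list = inference_vllm(llm_engine, template, [{'query': 'Hello!'}])
    print(resp_list[0]['response'])
```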

docs/source/LLM/index.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -1,5 +1,7 @@
 ## LLM Documentation

+[English Documentation](https://swift.readthedocs.io/en/latest/)
+
 ### 📚 Tutorials

 1. [LLM Inference Documentation](LLM推理文档.md)
```
