Commit a5e7f8d — Support qwen2 (#1017)
1 parent 434990f
16 files changed: +498 additions, −132 deletions

README.md

Lines changed: 2 additions & 1 deletion
@@ -47,6 +47,7 @@ SWIFT has rich documentation for users, please check [here](https://github.com/
 SWIFT web-ui is available both on [Huggingface space](https://huggingface.co/spaces/tastelikefeet/swift) and [ModelScope studio](https://www.modelscope.cn/studios/iic/Scalable-lightWeight-Infrastructure-for-Fine-Tuning/summary), please feel free to try!

 ## 🎉 News
+- 🔥2024.06.07: Support for the Qwen2 series LLMs, including Base and Instruct models at 0.5B, 1.5B, 7B, and 72B, as well as the corresponding gptq-int4, gptq-int8, and awq-int4 quantized versions.
 - 🔥2024.06.05: Support for the **glm4** series LLMs and the glm4v-9b-chat MLLM. You can refer to the [glm4v best practice](docs/source_en/Multi-Modal/glm4v-best-practice.md).
 - 🔥2024.06.01: Supports **SimPO** training! See the [document](https://github.com/modelscope/swift/blob/main/docs/source_en/LLM/SimPO.md) to start training!
 - 🔥2024.06.01: Support for deploying large multimodal models, please refer to the [Multimodal Deployment Documentation](docs/source_en/Multi-Modal/mutlimodal-deployment.md) for more information.

@@ -486,7 +487,7 @@ The complete list of supported models and datasets can be found at [Supported Mo

 | Model Type | Model Introduction | Language | Model Size | Model Type |
 |------------------------------------------------|------------------------------------------------------------------------|--------------------|----------------------------------------|------------------------------------------- |
-| Qwen<br>Qwen1.5 | [Tongyi Qwen 1.0 and 1.5 series models](https://github.com/QwenLM) | Chinese<br>English | 0.5B-110B<br>including quantized versions | base model<br>chat model<br>MoE model<br>code model |
+| Qwen<br>Qwen1.5<br>Qwen2 | [Tongyi Qwen series models](https://github.com/QwenLM) | Chinese<br>English | 0.5B-110B<br>including quantized versions | base model<br>chat model<br>MoE model<br>code model |
 | ChatGLM2<br>ChatGLM3<br>Codegeex2<br>GLM4 | [Zhipu ChatGLM series models](https://github.com/THUDM) | Chinese<br>English | 6B-9B | base model<br>chat model<br>code model<br>long text model |
 | Baichuan/Baichuan2 | [Baichuan 1 and Baichuan 2](https://github.com/baichuan-inc) | Chinese<br>English | 7B-13B<br>including quantized versions | base model<br>chat model |
 | Yuan2 | [Langchao Yuan series models](https://github.com/IEIT-Yuan) | Chinese<br>English | 2B-102B | instruct model |
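Since this commit adds a Qwen2 row to the model table, a quick smoke test of the new entry might look like the sketch below. Note that the `qwen2-7b-instruct` model_type name is an assumption extrapolated from SWIFT's existing `qwen1half-*` naming convention, not something confirmed by this diff; check it against the supported-models list before use.

```shell
# Hypothetical smoke test of the newly added Qwen2 support.
# The model_type value is an assumption, not taken from this diff.
CUDA_VISIBLE_DEVICES=0 swift infer \
    --model_type qwen2-7b-instruct
```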

README_CN.md

Lines changed: 2 additions & 1 deletion
@@ -48,6 +48,7 @@ SWIFT has a rich documentation system; if you have any usage questions please check [here](https:
 You can try the SWIFT web-ui on [Huggingface space](https://huggingface.co/spaces/tastelikefeet/swift) and [ModelScope studio](https://www.modelscope.cn/studios/iic/Scalable-lightWeight-Infrastructure-for-Fine-Tuning/summary).

 ## 🎉 News
+- 🔥2024.06.07: Support for the Qwen2 series LLMs, including Base and Instruct models at 0.5B, 1.5B, 7B, and 72B, as well as the corresponding gptq-int4, gptq-int8, and awq-int4 quantized versions.
 - 🔥2024.06.05: Support for the glm4 series LLMs and the glm4v-9b-chat multimodal LLM; see the [glm4v best practice](docs/source/Multi-Modal/glm4v最佳实践.md).
 - 🔥2024.06.01: Supports **SimPO** training; use `swift simpo` to start training. The best practice can be found [here](https://github.com/modelscope/swift/tree/main/docs/source/LLM/SimPO算法最佳实践.md).
 - 🔥2024.06.01: Support for deploying large multimodal models; see the [multimodal deployment documentation](docs/source/Multi-Modal/MLLM部署文档.md).

@@ -482,7 +483,7 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \

 | Model Type | Model Introduction | Language | Model Size | Model Type |
 | --------------------------------------------------- | ------------------------------------------------------------ |----------| ------------------------- |-------------------------------------------|
-| Qwen<br>Qwen1.5 | [Tongyi Qwen 1.0 and 1.5 series models](https://github.com/QwenLM) | Chinese<br>English | 0.5B-110B<br>including quantized versions | base model<br>chat model<br>MoE model<br>code model |
+| Qwen<br>Qwen1.5<br>Qwen2 | [Tongyi Qwen series models](https://github.com/QwenLM) | Chinese<br>English | 0.5B-110B<br>including quantized versions | base model<br>chat model<br>MoE model<br>code model |
 | ChatGLM2<br>ChatGLM3<br>Codegeex2<br>GLM4 | [Zhipu ChatGLM series models](https://github.com/THUDM/) | Chinese<br>English | 6B-9B | base model<br>chat model<br>code model<br>long text model |
 | Baichuan<br>Baichuan2 | [Baichuan 1 and Baichuan 2](https://github.com/baichuan-inc) | Chinese<br>English | 7B-13B<br>including quantized versions | base model<br>chat model |
 | Yuan2 | [Langchao Yuan series models](https://github.com/IEIT-Yuan) | Chinese<br>English | 2B-102B | instruct model |

docs/source/LLM/LLM量化文档.md

Lines changed: 6 additions & 6 deletions
@@ -68,16 +68,16 @@ pip install -r requirements/llm.txt -U
 # If OOM occurs during quantization, moderately lower `--quant_n_samples` (default 256) and `--quant_seqlen` (default 2048).
 # gptq-int4 quantization (takes about 20 minutes on an A100; GPU memory usage: 7GB)

-# awq: use `alpaca-zh alpaca-en sharegpt-gpt4-mini` as the quantization dataset
+# awq: use `alpaca-zh alpaca-en sharegpt-gpt4:default` as the quantization dataset
 CUDA_VISIBLE_DEVICES=0 swift export \
     --model_type qwen1half-7b-chat --quant_bits 4 \
-    --dataset alpaca-zh alpaca-en sharegpt-gpt4-mini --quant_method awq
+    --dataset alpaca-zh alpaca-en sharegpt-gpt4:default --quant_method awq

-# gptq: use `alpaca-zh alpaca-en sharegpt-gpt4-mini` as the quantization dataset
+# gptq: use `alpaca-zh alpaca-en sharegpt-gpt4:default` as the quantization dataset
 # For gptq quantization, please check this issue first: https://github.com/AutoGPTQ/AutoGPTQ/issues/439
 OMP_NUM_THREADS=14 CUDA_VISIBLE_DEVICES=0 swift export \
     --model_type qwen1half-7b-chat --quant_bits 4 \
-    --dataset alpaca-zh alpaca-en sharegpt-gpt4-mini --quant_method gptq
+    --dataset alpaca-zh alpaca-en sharegpt-gpt4:default --quant_method gptq

 # awq: use a custom quantization dataset
 # the same applies to gptq

@@ -216,11 +216,11 @@ CUDA_VISIBLE_DEVICES=0 swift infer \

 **Merge-LoRA & quantization**
 ```shell
-# Use `alpaca-zh alpaca-en sharegpt-gpt4-mini` as the quantization dataset
+# Use `alpaca-zh alpaca-en sharegpt-gpt4:default` as the quantization dataset
 CUDA_VISIBLE_DEVICES=0 swift export \
     --ckpt_dir 'output/qwen1half-4b-chat/vx-xxx/checkpoint-xxx' \
     --merge_lora true --quant_bits 4 \
-    --dataset alpaca-zh alpaca-en sharegpt-gpt4-mini --quant_method awq
+    --dataset alpaca-zh alpaca-en sharegpt-gpt4:default --quant_method awq

 # Use the dataset from fine-tuning as the quantization dataset
 CUDA_VISIBLE_DEVICES=0 swift export \
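Given the commit's theme, a natural follow-up to the quantization commands above is applying them to a Qwen2 model using the new `name:subset` dataset syntax. The sketch below assumes a `qwen2-7b-instruct` model_type (extrapolated from the `qwen1half-*` naming pattern; not confirmed by this diff):

```shell
# Hypothetical: awq-int4 quantization of a Qwen2 model with the
# `sharegpt-gpt4:default` dataset syntax introduced by this commit.
# The model_type value is an assumption, not taken from this diff.
CUDA_VISIBLE_DEVICES=0 swift export \
    --model_type qwen2-7b-instruct --quant_bits 4 \
    --dataset alpaca-zh alpaca-en sharegpt-gpt4:default \
    --quant_method awq
```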

docs/source/LLM/VLLM推理加速与部署.md

Lines changed: 1 addition & 1 deletion
@@ -527,7 +527,7 @@ CUDA_VISIBLE_DEVICES=0,1,2,3 \
 NPROC_PER_NODE=4 \
 swift sft \
     --model_type llama2-7b-chat \
-    --dataset self-cognition#500 sharegpt-gpt4-mini#1000 \
+    --dataset self-cognition#500 sharegpt-gpt4:default#1000 \
     --logging_steps 5 \
     --max_length 4096 \
     --learning_rate 5e-5 \
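The fine-tuning command above comes from a document about vLLM-accelerated inference and deployment, so the resulting checkpoint would typically be served with the vLLM backend. A hedged sketch (the checkpoint path is a placeholder in the same `vx-xxx/checkpoint-xxx` style this repository uses):

```shell
# Sketch: serve the fine-tuned checkpoint with the vLLM backend.
# The ckpt_dir value is a placeholder.
CUDA_VISIBLE_DEVICES=0 swift deploy \
    --ckpt_dir 'output/llama2-7b-chat/vx-xxx/checkpoint-xxx' \
    --infer_backend vllm
```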

docs/source/LLM/命令行参数.md

Lines changed: 1 addition & 1 deletion
@@ -90,7 +90,7 @@
 - `--save_only_model`: Whether to save only the model parameters, without the intermediate state needed to resume training from a checkpoint. Defaults to `None`: set to False if `sft_type` is 'lora' and deepspeed is not used (`deepspeed` is `None`), otherwise set to True (e.g. when using full-parameter fine-tuning or deepspeed).
 - `--save_total_limit`: The number of checkpoints to keep. Defaults to `2`, i.e. the best and the last checkpoint are saved. If set to -1, all checkpoints are kept.
 - `--logging_steps`: Print training information (e.g. loss, learning_rate) every this many training steps. Defaults to `5`.
-- `--dataloader_num_workers`: Defaults to `1`.
+- `--dataloader_num_workers`: Defaults to `None`: set to `0` on Windows machines, otherwise set to `1`.
 - `--push_to_hub`: Whether to synchronously push the training checkpoints to the ModelScope Hub. Defaults to `False`.
 - `--hub_model_id`: The model_id of the ModelScope Hub repository to push to. Defaults to `None`, i.e. `f'{model_type}-{sft_type}'`. It can be set to a model_id or a repo_name; the user_name is inferred from the hub_token. If the remote repository does not exist, a new one is created; otherwise the existing repository is reused. This parameter only takes effect when `push_to_hub` is set to True.
 - `--hub_token`: The SDK token required for pushing. It can be obtained from [https://modelscope.cn/my/myaccesstoken](https://modelscope.cn/my/myaccesstoken). Defaults to `None`, i.e. read from the environment variable `MODELSCOPE_API_TOKEN`. This parameter only takes effect when `push_to_hub` is set to True.
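The hub-related flags documented above combine as in the following sketch. The dataset and `hub_model_id` values are illustrative placeholders, and `--hub_token` is omitted so the token is read from `MODELSCOPE_API_TOKEN`:

```shell
# Sketch: push training checkpoints to the ModelScope Hub during SFT.
# hub_model_id is a placeholder; the token falls back to $MODELSCOPE_API_TOKEN.
CUDA_VISIBLE_DEVICES=0 swift sft \
    --model_type qwen1half-7b-chat \
    --dataset alpaca-zh \
    --push_to_hub true \
    --hub_model_id qwen1half-7b-chat-lora
```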
