Commit 7519e10

Merge branch 'main' into release/2.0
2 parents 480b4eb + 6d40dd1

13 files changed (+70, -110 lines)

README.md

Lines changed: 4 additions & 1 deletion
@@ -292,6 +292,7 @@ swift sft \
 ```
 
 #### Deepspeed Training
+Deepspeed supports training of quantized GPTQ and AWQ models.
 
 ZeRO2:
 ```shell
@@ -432,6 +433,7 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
 ```
 
 ### Supported Models
+The complete list of supported models and datasets can be found at [Supported Models and Datasets List](https://idealab.alibaba-inc.com/docs/source/LLM/Supported-Models-and-Datasets.md).
 
 #### LLMs
 
@@ -470,7 +472,8 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
 | c4ai-command-r | [c4ai](https://cohere.com/command) | Multilingual | 35B-104B | chat model |
 | WizardLM2 | [WizardLM2 series models](https://github.com/nlpxucan/WizardLM) | English | 7B-8x22B<br>including quantized versions | chat model<br>MoE model |
 | Atom | [Atom](https://github.com/LlamaFamily/Llama-Chinese) | Chinese | 7B| base model<br>chat model|
-| Chinese-LLaMA-Alpaca-2 | [Chinese-LLaMA-Alpaca-2](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | Chinese | 1.3B-13B| base model<br>chat model<br>long text model|
+| Chinese-LLaMA-Alpaca-2 | [Chinese-LLaMA-Alpaca-2](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | Chinese | 1.3B-13B| base model<br>chat model<br>long text model |
+| ModelScope-Agent | [ModelScope Agent series models](https://github.com/modelscope/modelscope-agent) | Chinese | 7B-14B| agent model |
 
 #### MLLMs

README_CN.md

Lines changed: 4 additions & 1 deletion
@@ -290,6 +290,7 @@ swift sft \
 ```
 
 #### Deepspeed Training
+Deepspeed supports training of quantized GPTQ and AWQ models.
 
 ZeRO2:
 ```shell
@@ -429,6 +430,7 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
 ```
 
 ### Supported Models
+The complete list of supported models and datasets can be found in [Supported Models and Datasets List](docs/source/LLM/支持的模型和数据集.md).
 
 #### LLMs
 
@@ -467,7 +469,8 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
 | c4ai-command-r | [c4ai](https://cohere.com/command) | Multilingual | 35B-104B | chat model |
 | WizardLM2 | [WizardLM2 series models](https://github.com/nlpxucan/WizardLM) | Multilingual | 7B-8x22B<br>including quantized versions | chat model<br>MoE model |
 | Atom | [Atom](https://github.com/LlamaFamily/Llama-Chinese) | Chinese | 7B| base model<br>chat model|
-| Chinese-LLaMA-Alpaca-2 | [Chinese-LLaMA-Alpaca-2](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | Chinese | 1.3B-13B| base model<br>chat model<br>long text model|
+| Chinese-LLaMA-Alpaca-2 | [Chinese-LLaMA-Alpaca-2](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | Chinese | 1.3B-13B| base model<br>chat model<br>long text model |
+| ModelScope-Agent | [ModelScope Agent series](https://github.com/modelscope/modelscope-agent) | Chinese | 7B-14B| agent model |
 
 
 #### MLLMs

ROADMAP.md

Lines changed: 0 additions & 56 deletions
This file was deleted.

docs/source/LLM/自定义与拓展.md

Lines changed: 4 additions & 4 deletions
@@ -26,11 +26,11 @@
 
 2. `--custom_val_dataset_path`: The default value is `[]`, meaning no custom validation dataset is used. If you specify `custom_train_dataset_path`, the validation split of the custom dataset will be cut according to the command line argument `dataset_test_ratio`.
 
-The script supports files in `csv`, `json`, and `jsonl` format. You need to make the passed-in files conform to the following dataset formats. Files in csv format only support instruction tuning, i.e. the case without history. Files in json and jsonl format support system and history.
+The script supports files in `csv`, `json`, and `jsonl` format. You need to make the passed-in files conform to the following dataset formats. All of the following formats support system. Files in `json` and `jsonl` format support multi-turn dialogue (`csv` does not).
 
 **Format 1:**
 
-Pre-Training
+Pre-training:
 
 ```csv
 response
@@ -45,7 +45,7 @@ AAAAA
 {"response": "AAAAA"}
 ```
 
-Single-Round Dialogue
+Single-round dialogue:
 
 ```csv
 query,response
@@ -60,7 +60,7 @@ AAAAA,BBBBB
 {"query": "AAAAA", "response": "BBBBB"}
 ```
 
-Multi-Round Dialogue
+Multi-round dialogue:
 
 ```jsonl
 {"query": "55555", "response": "66666"}

docs/source_en/LLM/Customization.md

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ The corresponding example sh script can be found [here](https://github.com/model
 
 2. `--custom_val_dataset_path`: The default value is `[]`, indicating not to use a custom validation dataset. If you specify `custom_train_dataset_path`, then the validation set of the custom dataset will be split according to the command line argument `dataset_test_ratio`.
 
-The script supports file formats including `csv`, `json`, and `jsonl`. You need to ensure the passed in files conform to the following dataset formats. csv files only support instruction tuning, i.e. the case without history. json and jsonl files support system and history.
+The supported file formats for the script include `csv`, `json`, and `jsonl`. You need to ensure that the incoming files conform to the following dataset formats. Both `json` and `jsonl` formats support multi-turn dialogues (`csv` does not support this).
 
 **Format 1:**
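
For illustration only (not part of this diff), here is a minimal Python sketch of reading a custom `jsonl` dataset of the kind described above. It assumes each row carries a required `response` plus optional `query`, `system`, and `history` keys, and the file name `custom_train.jsonl` is hypothetical:

```python
import json
from typing import Any, Dict, List


def load_custom_jsonl(path: str) -> List[Dict[str, Any]]:
    """Read a jsonl file and normalize rows to system/history/query/response.

    The key names follow the formats sketched in the docs above; adjust them
    to match your actual dataset.
    """
    rows = []
    with open(path, encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            d = json.loads(line)
            rows.append({
                'system': d.get('system'),        # optional system prompt
                'history': d.get('history', []),  # optional multi-turn history
                'query': d.get('query', ''),      # empty in the pre-training format
                'response': d['response'],        # required in every format
            })
    return rows


if __name__ == '__main__':
    # Hypothetical file name; replace with your own custom dataset path.
    for row in load_custom_jsonl('custom_train.jsonl')[:3]:
        print(row)
```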

examples/pytorch/llm/scripts/openbuddy_mistral_7b_chat/lora_ddp_ds/sft.sh

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ torchrun \
     --nproc_per_node=$nproc_per_node \
     --master_port 29500 \
     llm_sft.py \
-    --model_id_or_path OpenBuddy/openbuddy-mistral-7b-v13.1 \
+    --model_id_or_path OpenBuddy/openbuddy-mistral-7b-v17.1-32k \
     --model_revision master \
     --sft_type lora \
     --tuner_backend peft \

examples/pytorch/llm/scripts/openbuddy_mistral_7b_chat/lora_mp_ddp/sft.sh

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ torchrun \
     --nproc_per_node=$nproc_per_node \
     --master_port 29500 \
     llm_sft.py \
-    --model_id_or_path OpenBuddy/openbuddy-mistral-7b-v13.1 \
+    --model_id_or_path OpenBuddy/openbuddy-mistral-7b-v17.1-32k \
     --model_revision master \
     --sft_type lora \
     --tuner_backend peft \

swift/llm/utils/argument.py

Lines changed: 3 additions & 1 deletion
@@ -309,7 +309,7 @@ class SftArguments(ArgumentsBase):
     })
     output_dir: str = 'output'
     add_output_dir_suffix: Optional[bool] = None
-    ddp_backend: Literal['nccl', 'gloo', 'mpi', 'ccl'] = None
+    ddp_backend: Optional[Literal['nccl', 'gloo', 'mpi', 'ccl']] = None
     ddp_find_unused_parameters: Optional[bool] = None
     ddp_broadcast_buffers: Optional[bool] = None
 
@@ -658,6 +658,8 @@ def __post_init__(self) -> None:
         else:
             torch.cuda.set_device(local_rank)
         self.seed += rank  # Avoid the same dropout
+        if self.ddp_backend is None:
+            self.ddp_backend = 'nccl'
         if self.ddp_backend == 'gloo' and self.quantization_bit != 0:
             raise ValueError('not supported, please use `nccl`')
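
Aside from the diff itself, a minimal self-contained sketch of the pattern this change adopts: an `Optional[Literal[...]]` field that defaults to `None` and receives its concrete value in `__post_init__`. The `TrainArgs` class below is hypothetical and only mirrors the `ddp_backend` handling:

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class TrainArgs:
    # None means "not set"; the concrete default is chosen in __post_init__,
    # so the annotation must be Optional[Literal[...]] rather than Literal[...].
    ddp_backend: Optional[Literal['nccl', 'gloo', 'mpi', 'ccl']] = None

    def __post_init__(self) -> None:
        if self.ddp_backend is None:
            self.ddp_backend = 'nccl'  # fall back to the usual CUDA backend


args = TrainArgs()
assert args.ddp_backend == 'nccl'
```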

swift/llm/utils/dataset.py

Lines changed: 28 additions & 6 deletions
@@ -67,6 +67,8 @@ class DatasetName:
     open_orca_gpt4 = 'open-orca-gpt4'
     sharegpt_gpt4 = 'sharegpt-gpt4'
     sharegpt_gpt4_mini = 'sharegpt-gpt4-mini'
+    deepctrl_sft_zh = 'deepctrl-sft-zh'
+    deepctrl_sft_en = 'deepctrl-sft-en'
     # agent
     ms_agent = 'ms-agent'
     ms_agent_for_agentfabric_default = 'ms-agent-for-agentfabric-default'
@@ -549,7 +551,7 @@ def _preprocess_aishell1_dataset(dataset: HfDataset) -> HfDataset:
 
 
 def _repair_agent_conversations(conversations: str,
-                                use_mini: bool) -> Dict[str, str]:
+                                use_mini: bool) -> List[Dict[str, str]]:
     if use_mini:
         pattern = r'\d\. {"plugin_name": "(.+?)"'
     else:
@@ -562,13 +564,14 @@ def _repair_agent_conversations(conversations: str,
     find_list = re.findall(pattern, conversations[:idx])
     if len(set(find_list)) <= 1:
         return
-    conversations = ast.literal_eval(conversations)
+    if isinstance(conversations, str):
+        conversations = ast.literal_eval(conversations)
     if len(conversations) == 1:
         return
     return conversations
 
 
-def _repair_ms_bench(conversations: str) -> Dict[str, str]:
+def _repair_ms_bench(conversations: str) -> List[Dict[str, str]]:
     if isinstance(conversations, str):
         conversations = ast.literal_eval(conversations)
     default_system = 'You are a helpful assistant.'
@@ -684,6 +687,22 @@ def map_row(row):
     get_dataset_from_repo,
     tags=['chat', 'agent', 'multi-round'])
 
+register_dataset(
+    DatasetName.deepctrl_sft_zh,
+    'AI-ModelScope/deepctrl-sft-data', [['default', 'train']],
+    None,
+    SmartPreprocessor(),
+    get_dataset_from_repo,
+    tags=['chat', 'general', 'sft', 'multi-round'])
+
+register_dataset(
+    DatasetName.deepctrl_sft_en,
+    'AI-ModelScope/deepctrl-sft-data', [['en', 'train']],
+    None,
+    SmartPreprocessor(),
+    get_dataset_from_repo,
+    tags=['chat', 'general', 'sft', 'multi-round'])
+
 advertise_gen_prompt = """Task: Generating advertisements based on keywords.
 Keywords: {query}
 Advertisements:"""
@@ -1066,7 +1085,8 @@ def _preprocess_sharegpt(dataset: HfDataset) -> HfDataset:
     response = []
     history: List[History] = []
     for d in tqdm(dataset):
-        conversation = ast.literal_eval(d['conversation'])
+        if isinstance(d['conversation'], str):
+            conversation = ast.literal_eval(d['conversation'])
         query.append(conversation[-1]['human'])
         response.append(conversation[-1]['assistant'])
         h = []
@@ -1316,9 +1336,11 @@ def _preprocess_leetcode_python(dataset: HfDataset) -> HfDataset:
 ]
 
 
-def _repair_conversations_agent_instruct(s: str) -> str:
+def _repair_conversations_agent_instruct(s: str) -> List[Dict[str, Any]]:
     s = s.replace('}\n {', '},\n {')
-    return ast.literal_eval(s)
+    if isinstance(s, str):
+        s = ast.literal_eval(s)
+    return s
 
 
 register_dataset(
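
The recurring pattern in this file's changes is to call `ast.literal_eval` only when the conversations field is still a string, so the repair helpers also accept already-parsed lists. A standalone sketch of that guard (the `ensure_conversation_list` helper and the `from`/`value` keys are illustrative, not part of the repo):

```python
import ast
from typing import Any, Dict, List, Union


def ensure_conversation_list(
        conversations: Union[str, List[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Return the conversations as a list, parsing the string repr only if needed."""
    if isinstance(conversations, str):
        # Safely evaluate a repr such as "[{'from': 'user', 'value': 'hi'}]".
        conversations = ast.literal_eval(conversations)
    return conversations


# Both call styles yield the same result:
assert ensure_conversation_list("[{'from': 'user', 'value': 'hi'}]") == \
    ensure_conversation_list([{'from': 'user', 'value': 'hi'}])
```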

swift/llm/utils/model.py

Lines changed: 1 addition & 0 deletions
@@ -2377,6 +2377,7 @@ def _git_clone_github(github_url: str,
         command = f'git -C {git_cache_dir} clone {github_url} {local_repo_name}'
         logger.info(f'Run the command: `{command}`')
         os.system(command)
+        logger.info(f'local_repo_path: {local_repo_path}')
     return local_repo_path
