Commit 84826bd

update doc (#934)
1 parent cef448b commit 84826bd

4 files changed: +35 -2 lines changed

README.md

Lines changed: 16 additions & 0 deletions
@@ -38,6 +38,8 @@ To facilitate use by users unfamiliar with deep learning, we provide a Gradio we

 Additionally, we are expanding capabilities for other modalities. Currently, we support full-parameter training and LoRA training for AnimateDiff.

+SWIFT has extensive documentation; see [here](https://github.com/modelscope/swift/tree/main/docs/source_en/LLM).
+
 ## 🎉 News
 - 🔥2024.05.13: Support Yi-1.5 series models; use `--model_type yi-1_5-9b-chat` to begin!
 - 2024.05.11: Support for qlora training and quantized inference using [hqq](https://github.com/mobiusml/hqq) and [eetq](https://github.com/NetEase-FuXi/EETQ). For more information, see the [LLM Quantization Documentation](https://github.com/modelscope/swift/tree/main/docs/source_en/LLM/LLM-quantization.md).
@@ -382,6 +384,20 @@ swift sft \
     --deepspeed default-zero3 \
```

+##### AliYun-DLC multi-node training
+In the DLC product, `WORLD_SIZE` is the number of nodes and `RANK` is the node index, which differs from torchrun's definitions of these variables.
+
```shell
# Map DLC's node-level WORLD_SIZE and RANK to NNODES and NODE_RANK.
NNODES=$WORLD_SIZE \
NODE_RANK=$RANK \
swift sft \
    --model_id_or_path qwen1half-32b-chat \
    --sft_type full \
    --dataset blossom-math-zh \
    --output_dir output \
    --deepspeed default-zero3
```
+

 ### Inference
Original model:
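
As a rough illustration of the DLC note in the hunk above: DLC's `WORLD_SIZE` and `RANK` are node-level values, whereas torchrun uses the same names for the total process count and the global process rank. The sketch below derives the torchrun-style quantities from the node-level ones. `describe_layout` and the per-node process count are hypothetical and not part of swift or DLC, and the sketch assumes global ranks are assigned contiguously per node.

```shell
# Hypothetical helper: describe_layout NODES NODE_INDEX PROCS_PER_NODE
# NODES and NODE_INDEX correspond to DLC's WORLD_SIZE and RANK.
describe_layout() {
    nnodes=$1
    node_rank=$2
    per_node=$3
    # torchrun-style world size: every process on every node
    total=$((nnodes * per_node))
    # assumed contiguous global ranks owned by this node
    first=$((node_rank * per_node))
    last=$((first + per_node - 1))
    echo "world_size=${total} ranks=${first}-${last}"
}

# Example with made-up values: 2 nodes, this is node 1, 8 processes per node.
describe_layout 2 1 8
```

With two nodes of eight processes each, the node-level pair (2, 1) corresponds to a torchrun-style world size of 16, which is why the DLC values are passed through as `NNODES` and `NODE_RANK` rather than used directly.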

README_CN.md

Lines changed: 17 additions & 0 deletions
@@ -39,6 +39,8 @@ SWIFT支持近**200种LLM和MLLM**(多模态大模型)的训练、推理、

 In addition, we are expanding to other modalities; we currently support full-parameter training and LoRA training for AnimateDiff.

+SWIFT has extensive documentation; if you run into usage problems, see [here](https://github.com/modelscope/swift/tree/main/docs/source/LLM).
+
 ## 🎉 News
 - 🔥2024.05.13: Support for the Yi-1.5 series models; use `--model_type yi-1_5-9b-chat` and similar to get started.
 - 2024.05.11: Support for qlora training and quantized inference using [hqq](https://github.com/mobiusml/hqq) and [eetq](https://github.com/NetEase-FuXi/EETQ); see the [LLM Quantization Documentation](https://github.com/modelscope/swift/tree/main/docs/source/LLM/LLM量化文档.md).
@@ -380,6 +382,21 @@ swift sft \
     --deepspeed default-zero3 \
```

+##### AliYun-DLC multi-node training
+In the DLC environment variables, `WORLD_SIZE` is the number of nodes and `RANK` is the node index. Note that this differs from torchrun's definitions.
```shell
NNODES=$WORLD_SIZE \
NODE_RANK=$RANK \
swift sft \
    --model_id_or_path qwen1half-32b-chat \
    --sft_type full \
    --dataset blossom-math-zh \
    --output_dir output \
    --deepspeed default-zero3
```
+
+
+

 ### Inference
 Original model:

docs/source/LLM/命令行参数.md

Lines changed: 1 addition & 1 deletion
@@ -122,7 +122,7 @@
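 - `--train_dataset_mix_ds`: Default is `['ms-bench']`. A general-knowledge dataset used to prevent knowledge forgetting. This parameter is deprecated; use `--dataset` to mix datasets.
 - `--use_loss_scale`: Default is `False`. When enabled, increases the loss weight of some Agent fields (the Action/Action Input part) to strengthen CoT; it has no effect in ordinary SFT scenarios.
 - `--custom_register_path`: Default is `None`. Pass in a `.py` file used to register templates, models, and datasets.
-- `--custom_dataset_info`: Default is `None`. Pass in the path to an external dataset_info.json, a JSON string, or a dict. Used to extend datasets.
+- `--custom_dataset_info`: Default is `None`. Pass in the path to an external dataset_info.json, a JSON string, or a dict. Used to extend datasets. Format reference: https://github.com/modelscope/swift/blob/main/swift/llm/data/dataset_info.json


 ### FSDP Parameters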

docs/source_en/LLM/Command-line-parameters.md

Lines changed: 1 addition & 1 deletion
@@ -122,7 +122,7 @@
 - `--train_dataset_mix_ds`: Default is `['ms-bench']`. A general-knowledge dataset used to prevent knowledge forgetting. This parameter is deprecated; use `--dataset {dataset_name}#{dataset_sample}` to mix datasets.
 - `--use_loss_scale`: Default is `False`. When enabled, increases the loss weight of some Agent fields (the Action/Action Input part) to strengthen CoT; it has no effect in regular SFT scenarios.
 - `--custom_register_path`: Default is `None`. Pass in a `.py` file used to register templates, models, and datasets.
-- `--custom_dataset_info`: Default is `None`. Pass in the path to an external `dataset_info.json`, a JSON string, or a dictionary. Used for expanding datasets.
+- `--custom_dataset_info`: Default is `None`. Pass in the path to an external `dataset_info.json`, a JSON string, or a dictionary. Used to register custom datasets. For a format example, see https://github.com/modelscope/swift/blob/main/swift/llm/data/dataset_info.json


### FSDP Parameters
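
The `{dataset_name}#{dataset_sample}` form mentioned for `--dataset` in the hunk above could be composed as in the sketch below. `dataset_spec` is a hypothetical helper, the sample counts are made up, and passing several specs separated by spaces is an assumption rather than something this commit states. The command is only printed, not executed.

```shell
# Hypothetical helper: build a "name#sample" spec, or just "name" when no count is given.
dataset_spec() {
    if [ -n "${2:-}" ]; then
        printf '%s#%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Print (rather than run) a command mixing two datasets named in this commit's docs.
echo swift sft --dataset "$(dataset_spec blossom-math-zh 2000)" "$(dataset_spec ms-bench 1000)"
```

Keeping the spec construction in one small function makes it easy to drop the sample count for a dataset that should be used in full.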
