Skip to content

Commit 81688d5

Browse files
change the default value of ddp_backend (#667)
1 parent 242086b commit 81688d5

File tree

3 files changed

+3
-3
lines changed

3 files changed

+3
-3
lines changed

docs/source/LLM/命令行参数.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@
2121
- `--template_type`: 表示使用的对话模板的类型, 默认是`'AUTO'`, 即根据`model_type`查找`MODEL_MAPPING`中的`template`. 可以选择的`template_type`可以查看`TEMPLATE_MAPPING.keys()`.
2222
- `--output_dir`: 表示ckpt存储的目录, 默认是`'output'`. 我们会在该目录后拼接`model_type`和微调版本号. 方便用户对不同模型进行多次对比实验, 而不需要改变`output_dir`命令行参数. 如果不需要拼接这些内容, 你需要额外指定参数`--add_output_dir_suffix false`.
2323
- `--add_output_dir_suffix`: 默认为`True`, 表示会在`output_dir`的目录后拼接上`model_type`和微调版本号的后缀. 如果要避免此行为, 你可以设置为`False`.
24-
- `--ddp_backend`: 表示分布式的后端支持, 默认是`'nccl'`. 你可以选择的值包括: 'nccl', 'gloo', 'mpi', 'ccl'.
24+
- `--ddp_backend`: 表示分布式的后端支持, 默认是`None`. 你可以选择的值包括: 'nccl', 'gloo', 'mpi', 'ccl'.
2525
- `--seed`: 全局的seed, 默认使用`42`. 用于复现训练效果.
2626
- `--resume_from_checkpoint`: 用于断点续训, 默认为`None`. 你可以将其设置为checkpoint的路径, 例如: `'output/qwen-7b-chat/vx-xxx/checkpoint-xxx'`, 来进行断点续训.
2727
- `--dtype`: 基模型载入时的torch_dtype, 默认为`'AUTO'`, 即智能选择dtype: 如果机器不支持bf16, 则使用fp16, 如果`MODEL_MAPPING`中对应模型有指定torch_dtype, 则使用其对应dtype, 否则使用bf16. 你可以选择的值包括: 'bf16', 'fp16', 'fp32'.

docs/source_en/LLM/Command-line-parameters.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@
2121
- `--template_type`: Type of dialogue template used, default is `'AUTO'`, i.e. look up `template` in `MODEL_MAPPING` based on `model_type`. Available `template_type` options can be found in `TEMPLATE_MAPPING.keys()`.
2222
- `--output_dir`: Directory to store ckpt, default is `'output'`. We will append `model_type` and fine-tuning version number to this directory, allowing users to do multiple comparative experiments on different models without changing the `output_dir` command line argument. If you don't want to append this content, specify `--add_output_dir_suffix false`.
2323
- `--add_output_dir_suffix`: Default is `True`, indicating that a suffix of `model_type` and fine-tuning version number will be appended to the `output_dir` directory. Set to `False` to avoid this behavior.
24-
- `--ddp_backend`: Backend support for distributed training, default is `'nccl'`. Options include: 'nccl', 'gloo', 'mpi', 'ccl'.
24+
- `--ddp_backend`: Backend support for distributed training, default is `None`. Options include: 'nccl', 'gloo', 'mpi', 'ccl'.
2525
- `--seed`: Global seed, default is `42`. Used to reproduce training results.
2626
- `--resume_from_checkpoint`: For resuming training from checkpoint, default is `None`. You can set this to the path of a checkpoint, e.g. `'output/qwen-7b-chat/vx-xxx/checkpoint-xxx'`, to resume training.
2727
- `--dtype`: torch_dtype when loading base model, default is `'AUTO'`, i.e. intelligently select dtype: if machine does not support bf16, use fp16; if `MODEL_MAPPING` specifies torch_dtype for corresponding model, use its dtype; otherwise use bf16. Options include: 'bf16', 'fp16', 'fp32'.

swift/llm/utils/argument.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -63,7 +63,7 @@ class SftArguments:
6363
})
6464
output_dir: str = 'output'
6565
add_output_dir_suffix: Optional[bool] = None
66-
ddp_backend: Literal['nccl', 'gloo', 'mpi', 'ccl'] = 'nccl'
66+
ddp_backend: Literal['nccl', 'gloo', 'mpi', 'ccl'] = None
6767
ddp_find_unused_parameters: Optional[bool] = None
6868
ddp_broadcast_buffers: Optional[bool] = None
6969

0 commit comments

Comments
 (0)