
Commit 2ec66b7

Fix readme & update docs (#3018)
1 parent 2802944 commit 2ec66b7

7 files changed: +47 additions, -8 deletions


README.md

Lines changed: 1 addition & 1 deletion
@@ -79,7 +79,7 @@ You can contact us and communicate with us by adding our group:
 
 ## 🎉 News
 
-- 🎁 2024.01.23: SWIFT support the `sample` command, this is a very important feature for complex CoT and RFT. Meanwhile, we support an [Reinforced Fine-tuning script](docs/source_en/Instruction/Reinforced_Fine_tuning.md).
+- 🎁 2025.01.23: SWIFT support the `sample` command, this is a very important feature for complex CoT and RFT. Meanwhile, we support an [Reinforced Fine-tuning script](docs/source_en/Instruction/Reinforced_Fine_tuning.md).
 - 🎁 2024.12.04: **SWIFT3.0** major version update. Please check the [Release Notes and Changes](https://swift.readthedocs.io/en/latest/Instruction/ReleaseNote3.0.html).
 - 🎉 2024.08.12: The SWIFT paper has been published on arXiv, and you can read it [here](https://arxiv.org/abs/2408.05517).
 - 🔥 2024.08.05: Support for using [evalscope](https://github.com/modelscope/evalscope/) as a backend for evaluating large models and multimodal models.

README_CN.md

Lines changed: 1 addition & 1 deletion
@@ -74,7 +74,7 @@
 - **Model quantization**: supports quantized export with AWQ, GPTQ, and BNB; the exported models can use vLLM/LmDeploy for accelerated inference and support continued training.
 
 ## 🎉 News
-- 🎁 2024.01.23: SWIFT supports the `sample` command, which is very important for CoT and RFT. We also provide a [Reinforced Fine-tuning script](docs/source/Instruction/强化微调.md).
+- 🎁 2025.01.23: SWIFT supports the `sample` command, which is very important for CoT and RFT. We also provide a [Reinforced Fine-tuning script](docs/source/Instruction/强化微调.md).
 - 🎁 2024.12.04: **SWIFT3.0** major version update. See the [release notes and changes](https://swift.readthedocs.io/zh-cn/latest/Instruction/ReleaseNote3.0.html).
 - 🎉 2024.08.12: The SWIFT paper has been published on arXiv; you can read it [here](https://arxiv.org/abs/2408.05517).
 - 🔥 2024.08.05: Support for using [evalscope](https://github.com/modelscope/evalscope/) as a backend for evaluating large models and multimodal models.

docs/source/Customization/自定义数据集.md

Lines changed: 1 addition & 1 deletion
@@ -86,7 +86,7 @@ query-response format:
 
 ### Multimodal
 
-For multimodal datasets, the format is the same as for the tasks above. The difference is the addition of the `images`, `videos`, and `audios` keys, which hold the URLs or paths (absolute paths are recommended) of the multimodal resources; the `<image>`, `<video>`, and `<audio>` tags mark where images/videos/audio are inserted, and ms-swift supports multiple images, videos, and audios. The four examples below show the data format for plain text as well as for data containing images, video, and audio.
+For multimodal datasets, the format is the same as for the tasks above. The difference is the addition of the `images`, `videos`, and `audios` keys, which hold the URLs or paths (absolute paths are recommended) of the multimodal resources; the `<image>`, `<video>`, and `<audio>` tags mark where images/videos/audio are inserted, and ms-swift supports multiple images, videos, and audios. These special tokens are replaced during preprocessing; see [here](https://github.com/modelscope/ms-swift/blob/main/swift/llm/template/template/qwen.py#L198). The four examples below show the data format for plain text as well as for data containing images, video, and audio.
 
 Pre-training:
 ```

docs/source/GetStarted/快速开始.md

Lines changed: 19 additions & 1 deletion
@@ -56,7 +56,14 @@ swift sft \
     --model_name swift-robot
 ```
 
-After training finishes, use the following command to run inference with the trained weights; replace `--adapters` here with the last checkpoint folder produced by training. Since the adapters folder contains the trained parameter files, there is no need to additionally specify `--model` or `--system`.
+Tips:
+- If you want to train with a custom dataset, see [here](../Customization/自定义数据集.md) for how to organize the dataset format, and specify `--dataset <dataset_path>`.
+- The `--model_author` and `--model_name` arguments only take effect when the dataset contains `swift/self-cognition`.
+- To train a different model, simply change `--model <model_id/model_path>`.
+- ModelScope is used by default to download models and datasets. To use HuggingFace instead, specify `--use_hf true`.
+
+After training finishes, use the following command to run inference with the trained weights:
+- Replace `--adapters` here with the last checkpoint folder produced by training. Since the adapters folder contains the training parameter file `args.json`, there is no need to additionally specify `--model` or `--system`; swift reads these parameters automatically. To disable this behavior, set `--load_args false`.
 
 ```shell
 # Inference with the interactive command line
@@ -79,6 +86,17 @@ swift infer \
     --max_new_tokens 2048
 ```
 
+Finally, use the following command to push the model to ModelScope:
+```shell
+CUDA_VISIBLE_DEVICES=0 \
+swift export \
+    --adapters output/vx-xxx/checkpoint-xxx \
+    --push_to_hub true \
+    --hub_model_id '<your-model-id>' \
+    --hub_token '<your-sdk-token>' \
+    --use_hf false
+```
+
 ## Learn More
 
 - More shell scripts: [https://github.com/modelscope/ms-swift/tree/main/examples](https://github.com/modelscope/ms-swift/tree/main/examples)
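
As a hedged illustration of the tips added in this hunk (not part of the commit itself), a LoRA fine-tuning run on a custom dataset pulled from HuggingFace might look like the sketch below; the model id and dataset path are placeholders.

```shell
# Hedged sketch: fine-tune on a custom JSONL dataset, downloading the model from HuggingFace.
# The model id and dataset path are placeholders; --train_type lora mirrors the quick-start defaults.
CUDA_VISIBLE_DEVICES=0 \
swift sft \
    --model Qwen/Qwen2.5-7B-Instruct \
    --train_type lora \
    --dataset /path/to/custom_dataset.jsonl \
    --use_hf true
```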

docs/source_en/Customization/Custom-dataset.md

Lines changed: 1 addition & 1 deletion
@@ -87,7 +87,7 @@ The following outlines the standard dataset format for ms-swift, where the "syst
 
 ### Multimodal
 
-For multimodal datasets, the format is the same as the aforementioned tasks. The difference lies in the addition of several keys: `images`, `videos`, and `audios`, which respectively represent the URLs or paths (absolute paths are recommended) of multimodal resources. The tags `<image>`, `<video>`, and `<audio>` indicate the positions where images, videos, and audio should be inserted. MS-Swift supports the inclusion of multiple images, videos, and audio. The four examples provided below respectively demonstrate data formats for plain text and those containing image, video, and audio data.
+For multimodal datasets, the format is the same as the aforementioned tasks. The difference lies in the addition of several keys: `images`, `videos`, and `audios`, which represent the URLs or paths (preferably absolute paths) of multimodal resources. The tags `<image>`, `<video>`, and `<audio>` indicate where to insert images, videos, or audio. MS-Swift supports multiple images, videos, and audio files. These special tokens will be replaced during preprocessing, as referenced [here](https://github.com/modelscope/ms-swift/blob/main/swift/llm/template/template/qwen.py#L198). The four examples below respectively demonstrate the data format for plain text, as well as formats containing image, video, and audio data.
 
 
 Pre-training:
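
As a hedged sketch of the keys and tags described in this hunk (not part of the commit; the image path and message texts are placeholders), a single multimodal entry might look like:

```
{"messages": [{"role": "user", "content": "<image>What is in the picture?"}, {"role": "assistant", "content": "A kitten sitting on a windowsill."}], "images": ["/absolute/path/to/cat.png"]}
```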

docs/source_en/GetStarted/Quick-start.md

Lines changed: 22 additions & 1 deletion
@@ -56,7 +56,16 @@ swift sft \
     --model_name swift-robot
 ```
 
-After training is complete, use the following command to perform inference with the trained weights. The `--adapters` option should be replaced with the last checkpoint folder generated from the training. Since the adapters folder contains the parameter files from the training, there is no need to specify `--model` or `--system` separately.
+Tips:
+
+- If you want to train with a custom dataset, you can refer to [this guide](https://idealab.alibaba-inc.com/Customization/Custom_Dataset.md) to organize your dataset format and specify `--dataset <dataset_path>`.
+- The `--model_author` and `--model_name` parameters are only effective when the dataset includes `swift/self-cognition`.
+- To train with a different model, simply modify `--model <model_id/model_path>`.
+- By default, ModelScope is used for downloading models and datasets. If you want to use HuggingFace, simply specify `--use_hf true`.
+
+After training is complete, use the following command to infer with the trained weights:
+
+- Here, `--adapters` should be replaced with the last checkpoint folder generated during training. Since the adapters folder contains the training parameter file `args.json`, there is no need to specify `--model` or `--system` separately; Swift will automatically read these parameters. To disable this behavior, you can set `--load_args false`.
 
 ```shell
 # Using an interactive command line for inference.
@@ -79,6 +88,18 @@ swift infer \
     --max_new_tokens 2048
 ```
 
+Finally, use the following command to push the model to ModelScope:
+
+```shell
+CUDA_VISIBLE_DEVICES=0 \
+swift export \
+    --adapters output/vx-xxx/checkpoint-xxx \
+    --push_to_hub true \
+    --hub_model_id '<your-model-id>' \
+    --hub_token '<your-sdk-token>' \
+    --use_hf false
+```
+
 ## Learn More
 - More Shell scripts: [https://github.com/modelscope/ms-swift/tree/main/examples](https://github.com/modelscope/ms-swift/tree/main/examples)
 - Using Python: [https://github.com/modelscope/ms-swift/blob/main/examples/notebook/qwen2_5-self-cognition/self-cognition-sft.ipynb](https://github.com/modelscope/ms-swift/blob/main/examples/notebook/qwen2_5-self-cognition/self-cognition-sft.ipynb)
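
As a hedged illustration of the inference step described above (not part of this commit), an interactive run from the last checkpoint might look like the sketch below; the checkpoint path is a placeholder and `--stream true` is an assumption not shown in this diff.

```shell
# Hedged sketch: interactive inference from the last LoRA checkpoint.
# The checkpoint path is a placeholder; --stream true is an assumption, not shown in this diff.
CUDA_VISIBLE_DEVICES=0 \
swift infer \
    --adapters output/vx-xxx/checkpoint-xxx \
    --stream true \
    --max_new_tokens 2048
```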

examples/train/pretrain/train.sh

Lines changed: 2 additions & 2 deletions
@@ -13,8 +13,8 @@ swift pt \
     --learning_rate 1e-5 \
     --gradient_accumulation_steps $(expr 256 / $nproc_per_node) \
     --warmup_ratio 0.03 \
-    --eval_steps 100 \
-    --save_steps 100 \
+    --eval_steps 500 \
+    --save_steps 500 \
     --save_total_limit 2 \
     --logging_steps 5 \
     --deepspeed zero3 \
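
One detail worth noting in the unchanged context: the script derives gradient accumulation from the GPU count so that the effective global batch size stays fixed at 256. A hedged sketch of that arithmetic is below; the per-device batch size of 1 and the GPU count of 4 are assumptions, as neither is visible in this hunk.

```shell
# Hedged sketch: how the accumulation steps in this script scale with GPU count.
# Assumes per-device train batch size 1 and 4 GPUs, neither of which is shown in this hunk.
nproc_per_node=4
grad_accum=$(expr 256 / $nproc_per_node)                   # 256 / 4 = 64
echo "global batch = 1 * $nproc_per_node * $grad_accum"    # 1 * 4 * 64 = 256
```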
