examples/language_model/gpt: 1 file changed, +4 -2 lines changed
@@ -31,8 +31,11 @@ GPT-[2](https://cdn.openai.com/better-language-models/language_models_are_unsupe
- tqdm
- visualdl
- paddlepaddle-gpu >= 2.2rc
+ - pybind11
+ - lac (optional)
+ - zstandard (optional)
- Install command: `pip install regex sentencepiece tqdm visualdl`.
+ Install command: `pip install regex sentencepiece tqdm visualdl pybind11 lac zstandard`.
Note: PaddlePaddle 2.2rc or later is required, or use the latest develop build. For installation instructions, see the Paddle [official website](https://www.paddlepaddle.org.cn).
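After running the install command, it can help to check which dependencies are actually importable. The snippet below is a minimal sketch, not part of this change: `missing_modules` is a hypothetical helper, and the import names are assumptions, since a pip package name does not always match the module it installs (for example, `paddlepaddle-gpu` is imported as `paddle`, and `lac` is assumed here to import as `LAC`).

```python
import importlib.util

# Import names below are assumptions; they may differ from the pip package names.
REQUIRED = ["regex", "sentencepiece", "tqdm", "visualdl", "paddle", "pybind11"]
OPTIONAL = ["LAC", "zstandard"]  # assumed import names for lac / zstandard


def missing_modules(names):
    """Return the entries of `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing required:", missing_modules(REQUIRED))
    print("missing optional:", missing_modules(OPTIONAL))
```

An empty list for the required modules suggests the environment is ready. Missing optional modules only matter for the steps that use them.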
### Data preparation
@@ -50,7 +53,6 @@ tar -xvf openwebtext2.json.zst.tar -C /path/to/openwebtext
```
Then build the dataset with the `create_pretraining_data.py` script from the [data_tools](../data_tools) directory:
-
```
python -u create_pretraining_data.py \
--model_name gpt2-en \