1 file changed: 4 additions, 4 deletions

@@ -93,7 +93,7 @@ An example script to prepare data for GPT training is:
 python tools/preprocess_data.py \
     --input my-corpus.json \
     --output-prefix my-gpt2 \
-    --vocab gpt2-vocab.json \
+    --vocab-file gpt2-vocab.json \
     --dataset-impl mmap \
     --tokenizer-type GPT2BPETokenizer \
     --merge-file gpt2-merges.txt \
@@ -132,7 +132,7 @@ xz -d oscar-1GB.jsonl.xz
 python tools/preprocess_data.py \
     --input oscar-1GB.jsonl \
     --output-prefix my-gpt2 \
-    --vocab gpt2-vocab.json \
+    --vocab-file gpt2-vocab.json \
     --dataset-impl mmap \
     --tokenizer-type GPT2BPETokenizer \
     --merge-file gpt2-merges.txt \
@@ -192,13 +192,13 @@ DATA_ARGS=" \
     --data-path $DATA_PATH \
     "
 
-CMD="pretrain_gpt.py $GPT_ARGS $ OUTPUT_ARGS $DATA_ARGS"
+CMD="pretrain_gpt.py $GPT_ARGS $OUTPUT_ARGS $DATA_ARGS"
 
 N_GPUS=1
 
 LAUNCHER="deepspeed --num_gpus $N_GPUS"
 
-$LAUNCHER $ CMD
+$LAUNCHER $CMD
 ```
 
 Note, we replaced `python` with `deepspeed --num_gpus 1`. For multi-gpu training update `--num_gpus` to the number of GPUs you have.
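
As a rough illustration of the note above, the sketch below shows how the launch command changes when more than one GPU is used. The value `N_GPUS=4` is a hypothetical example, and the training arguments are omitted for brevity; the command is only echoed, so the snippet runs without DeepSpeed installed.

```shell
#!/bin/sh
# Hypothetical GPU count: set this to the number of GPUs on your machine.
N_GPUS=4

# Same launcher pattern as in the script above, with the larger count.
LAUNCHER="deepspeed --num_gpus $N_GPUS"

# Training arguments ($GPT_ARGS, $OUTPUT_ARGS, $DATA_ARGS) are left out of this sketch.
CMD="pretrain_gpt.py"

# Print the command instead of running it, to inspect what would be launched.
echo "$LAUNCHER $CMD"
```

Replacing the `echo` line with a bare `$LAUNCHER $CMD` would launch the run, as in the final line of the script in the diff.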