Commit de4fc9d

[GPT-3] Fix bug of CE config (PaddlePaddle#1024)
1 parent 1d52537 commit de4fc9d

File tree

2 files changed: +6 -4 lines changed

examples/language_model/gpt-3/dygraph/modeling.py

Lines changed: 1 addition & 1 deletion
@@ -616,7 +616,7 @@ class GPTPretrainedModel(PretrainedModel):
         "gpt2-small-en": { # config for CE
             "vocab_size": 50304,
             "hidden_size": 1024,
-            "num_hidden_layers": 2, #4
+            "num_hidden_layers": 4,
             "num_attention_heads": 4,
             "intermediate_size": 4096,
             "hidden_act": "gelu",

examples/language_model/gpt-3/dygraph/run.sh

Lines changed: 5 additions & 3 deletions
@@ -14,10 +14,12 @@ python -m paddle.distributed.launch --log_dir $log_dir --gpus "0,1,2,3,4,5,6,7"
     --device gpu\
     --eval_freq 1000\
     --warmup_rate 0.01\
+    --scale_loss 32768\
     --global_batch_size 16\
     --micro_batch_size 2\
     --dp_degree 2\
-    --mp_degree 4\
-    --pp_degree 1\
+    --mp_degree 2\
+    --pp_degree 2\
     --use_amp True\
-    --scale_loss 32768
+    --use_recompute False
+
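A quick consistency check on the parallelism flags, sketched in Python. The assumption here (not stated in the diff itself) is that the product `dp_degree * mp_degree * pp_degree` must equal the number of GPUs passed to `paddle.distributed.launch`, and that `global_batch_size = micro_batch_size * dp_degree * accumulate_steps` under the usual gradient-accumulation convention:

```python
# Hedged sketch: verify the run.sh parallelism layout before and after the fix.
gpus = "0,1,2,3,4,5,6,7".split(",")  # from --gpus in the launch command

old = {"dp_degree": 2, "mp_degree": 4, "pp_degree": 1}  # before this commit
new = {"dp_degree": 2, "mp_degree": 2, "pp_degree": 2}  # after this commit

def world_size(cfg):
    """Total ranks implied by data/model/pipeline parallel degrees."""
    return cfg["dp_degree"] * cfg["mp_degree"] * cfg["pp_degree"]

# Both layouts consume all 8 GPUs; the fix trades model-parallel width
# for a 2-stage pipeline without changing the world size.
assert world_size(old) == len(gpus) == 8
assert world_size(new) == len(gpus) == 8

# Implied gradient-accumulation steps (assumed convention, not in the diff):
# global_batch_size = micro_batch_size * dp_degree * accumulate_steps
accumulate_steps = 16 // (2 * new["dp_degree"])
print(accumulate_steps)  # → 4
```

Moving `--scale_loss 32768` before `--global_batch_size` also removes the trailing-backslash hazard: previously `--scale_loss 32768` was the last argument, so appending new flags after it silently required editing that line.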