You may first run the following command in [datasets/machine_translation](../datasets/machine_translation):

``` bash
bash wmt2014_ende.sh yttm
```

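If you want to sanity-check the prepared data, the optional sketch below lists the files that the training and evaluation commands in this document refer to; the file names are inferred from those commands and the folder may contain additional files.

``` bash
# List the subword model/vocab and the tokenized/raw corpora referenced by the
# commands below (file names inferred from those commands; the actual folder
# contents may include more files).
ls ../datasets/machine_translation/wmt2014_ende/
# Expected, among others:
#   yttm.model  yttm.vocab
#   train.tok.yttm.en  train.tok.yttm.de
#   dev.tok.yttm.en    dev.tok.yttm.de
#   test.raw.en        test.raw.de
```
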
Then, you can run the experiment. For the "transformer_base" configuration, run:

``` bash
SUBWORD_ALGO=yttm
SRC=en
TGT=de
datapath=../datasets/machine_translation
python train_transformer.py \
    --train_src_corpus ${datapath}/wmt2014_ende/train.tok.${SUBWORD_ALGO}.${SRC} \
    --train_tgt_corpus ${datapath}/wmt2014_ende/train.tok.${SUBWORD_ALGO}.${TGT} \
    --dev_src_corpus ${datapath}/wmt2014_ende/dev.tok.${SUBWORD_ALGO}.${SRC} \
    --dev_tgt_corpus ${datapath}/wmt2014_ende/dev.tok.${SUBWORD_ALGO}.${TGT} \
    --src_subword_model_path ${datapath}/wmt2014_ende/${SUBWORD_ALGO}.model \
    --src_vocab_path ${datapath}/wmt2014_ende/${SUBWORD_ALGO}.vocab \
    --tgt_subword_model_path ${datapath}/wmt2014_ende/${SUBWORD_ALGO}.model \
    --tgt_vocab_path ${datapath}/wmt2014_ende/${SUBWORD_ALGO}.vocab \
    --save_dir transformer_base_wmt2014_en_de_${SUBWORD_ALGO} \
    --cfg transformer_base \
    --lr 0.002 \
    --batch_size 2700 \
    --num_averages 5 \
    --warmup_steps 4000 \
    --warmup_init_lr 0.0 \
    --seed 123 \
    --gpus 0,1,2,3
```

Use the `gluon_average_checkpoint` CLI to average the last 10 checkpoints:

``` bash
gluon_average_checkpoint --checkpoints transformer_base_wmt2014_en_de_${SUBWORD_ALGO}/epoch*.params \
    --begin 21 \
    --end 30 \
    --save-path transformer_base_wmt2014_en_de_${SUBWORD_ALGO}/avg_21_30.params
```

Use the following command to run inference and evaluate the Transformer model:

``` bash
SUBWORD_ALGO=yttm
python evaluate_transformer.py \
    --param_path transformer_base_wmt2014_en_de_${SUBWORD_ALGO}/avg_21_30.params \
    --src_lang en \
    --tgt_lang de \
    --cfg transformer_base_wmt2014_en_de_${SUBWORD_ALGO}/config.yml \
    --src_tokenizer ${SUBWORD_ALGO} \
    --tgt_tokenizer ${SUBWORD_ALGO} \
    --src_subword_model_path ../datasets/machine_translation/wmt2014_ende/${SUBWORD_ALGO}.model \
    --tgt_subword_model_path ../datasets/machine_translation/wmt2014_ende/${SUBWORD_ALGO}.model \
    --src_vocab_path ../datasets/machine_translation/wmt2014_ende/${SUBWORD_ALGO}.vocab \
    --tgt_vocab_path ../datasets/machine_translation/wmt2014_ende/${SUBWORD_ALGO}.vocab \
    --src_corpus ../datasets/machine_translation/wmt2014_ende/test.raw.en \
    --tgt_corpus ../datasets/machine_translation/wmt2014_ende/test.raw.de
```

For the "transformer_wmt_en_de_big" configuration, run:

``` bash
SUBWORD_ALGO=yttm
SRC=en
TGT=de
datapath=../datasets/machine_translation
python train_transformer.py \
    --train_src_corpus ${datapath}/wmt2014_ende/train.tok.${SUBWORD_ALGO}.${SRC} \
    --train_tgt_corpus ${datapath}/wmt2014_ende/train.tok.${SUBWORD_ALGO}.${TGT} \
    --dev_src_corpus ${datapath}/wmt2014_ende/dev.tok.${SUBWORD_ALGO}.${SRC} \
    --dev_tgt_corpus ${datapath}/wmt2014_ende/dev.tok.${SUBWORD_ALGO}.${TGT} \
    --src_subword_model_path ${datapath}/wmt2014_ende/${SUBWORD_ALGO}.model \
    --src_vocab_path ${datapath}/wmt2014_ende/${SUBWORD_ALGO}.vocab \
    --tgt_subword_model_path ${datapath}/wmt2014_ende/${SUBWORD_ALGO}.model \
    --tgt_vocab_path ${datapath}/wmt2014_ende/${SUBWORD_ALGO}.vocab \
    --save_dir transformer_big_wmt2014_en_de_${SUBWORD_ALGO} \
    --cfg transformer_wmt_en_de_big \
    --lr 0.001 \
    --sampler BoundedBudgetSampler \
    --max_num_tokens 3584 \
    --max_update 15000 \
    --warmup_steps 4000 \
    --warmup_init_lr 0.0 \
    --seed 123 \
    --gpus 0,1,2,3
```

Use the `gluon_average_checkpoint` CLI to average the last 10 checkpoints:

``` bash
gluon_average_checkpoint --checkpoints transformer_big_wmt2014_en_de_${SUBWORD_ALGO}/update*.params \
    --begin 21 \
    --end 30 \
    --save-path transformer_big_wmt2014_en_de_${SUBWORD_ALGO}/avg_21_30.params
```

Use the following command to run inference and evaluate the Transformer model:

``` bash
SUBWORD_ALGO=yttm
python evaluate_transformer.py \
    --param_path transformer_big_wmt2014_en_de_${SUBWORD_ALGO}/avg_21_30.params \
    --src_lang en \
    --tgt_lang de \
    --cfg transformer_big_wmt2014_en_de_${SUBWORD_ALGO}/config.yml \
    --src_tokenizer ${SUBWORD_ALGO} \
    --tgt_tokenizer ${SUBWORD_ALGO} \
    --src_subword_model_path ../datasets/machine_translation/wmt2014_ende/${SUBWORD_ALGO}.model \
    --tgt_subword_model_path ../datasets/machine_translation/wmt2014_ende/${SUBWORD_ALGO}.model \
    --src_vocab_path ../datasets/machine_translation/wmt2014_ende/${SUBWORD_ALGO}.vocab \
    --tgt_vocab_path ../datasets/machine_translation/wmt2014_ende/${SUBWORD_ALGO}.vocab \
    --src_corpus ../datasets/machine_translation/wmt2014_ende/test.raw.en \
    --tgt_corpus ../datasets/machine_translation/wmt2014_ende/test.raw.de
```

Test BLEU score with 3 seeds (evaluated via sacreBLEU):

- transformer_base

| Subword Model | #Params | Seed = 123 | Seed = 1234 | Seed = 12345 | Mean±std |
|---------------|---------|------------|-------------|--------------|----------|
| yttm          |         | -          | -           | -            | -        |
| hf_bpe        |         | -          | -           | -            | -        |
| spm           |         | -          | -           | -            | -        |

- transformer_wmt_en_de_big

| Subword Model | #Params | Seed = 123 | Seed = 1234 | Seed = 12345 | Mean±std |
|---------------|---------|------------|-------------|--------------|----------|
| yttm          |         | 27.99      | -           | -            | -        |
| hf_bpe        |         | -          | -           | -            | -        |
| spm           |         | -          | -           | -            | -        |
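
The scores above are corpus-level BLEU values evaluated via sacreBLEU. If you want to re-score a set of detokenized translations yourself, the sketch below shows one way to do it with the `sacrebleu` command-line tool; the hypothesis file name `pred.de` is hypothetical and stands in for wherever you saved the model's detokenized output, and the exact sacreBLEU settings may differ from those used for the table.

``` bash
# Score a detokenized system output (hypothetical file pred.de) against the raw
# German references shipped with the prepared dataset.
sacrebleu ../datasets/machine_translation/wmt2014_ende/test.raw.de < pred.de
```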