Commit 1bf59a1

Update configs, use RectifiedAdam for PWGAN discriminator like in native config.
1 parent 3bc5eb0 commit 1bf59a1

3 files changed: +33 -18 lines changed

examples/multiband_pwgan/README.md

Lines changed: 16 additions & 3 deletions
@@ -61,11 +61,21 @@ CUDA_VISIBLE_DEVICES=0 python examples/multiband_pwgan/decode_mb_melgan.py \
 --use-norm 1
 ```
 
-## Finetune MelGAN STFT with ljspeech pretrained on other languages
-Just load pretrained model and training from scratch with other languages. **DO NOT FORGET** re-preprocessing on your dataset if needed. A hop_size should be 256 if you want to use our pretrained.
+## Finetune Multi-Band MelGAN + PWGAN Disc with ljspeech pretrained on other languages
+Download generator weights
 
+```bash
+CUDA_VISIBLE_DEVICES=0 python examples/multiband_pwgan/train_multiband_pwgan.py \
+--train-dir ./dump/train/ \
+--dev-dir ./dump/valid/ \
+--outdir ./examples/multiband_pwgan/exp/train.multiband_melgan.v1/ \
+--config ./examples/multiband_pwgan/conf/multiband_pwgan.v1.yaml \
+--use-norm 1 \
+--generator_mixed_precision 1 \
+--pretrained "ptgen.h5"
+```
 ## Learning Curves
-Here is a learning curves of melgan based on this config [`multiband_pwgan.v1.yaml`](https://github.com/dathudeptrai/TensorflowTTS/tree/master/examples/multiband_pwgan/conf/multiband_pwgan.v1.yaml)
+Here is a learning curves of melgan based on this config [`multiband_melgan.v1.yaml`](https://github.com/dathudeptrai/TensorflowTTS/tree/master/examples/multiband_pwgan/conf/multiband_pwgan.v1.yaml)
 
 <img src="fig/eval.png" height="300" width="850">
 
@@ -77,6 +87,9 @@ Here is a learning curves of melgan based on this config [`multiband_pwgan.v1.ya
 | [multiband_melgan.v1](https://drive.google.com/drive/folders/1Hg82YnPbX6dfF7DxVs4c96RBaiFbh-cT?usp=sharing) | [link](https://github.com/tensorspeech/TensorFlowTTS/tree/master/examples/multiband_pwgan/conf/multiband_pwgan.v1.yaml) | EN | 22.05k | 80-7600 | 1024 / 256 / None | 940K |
 | [multiband_melgan.v1](https://drive.google.com/drive/folders/199XCXER51PWf_VzUpOwxfY_8XDfeXuZl?usp=sharing) | [link](https://github.com/dathudeptrai/TensorflowTTS/tree/master/examples/multiband_pwgan/conf/multiband_pwgan.v1.yaml) | KO | 22.05k | 80-7600 | 1024 / 256 / None | 1000K |
 
+## Notes
+1. Using RAdam for discriminator
+
 ## Reference
 
 1. https://github.com/kan-bayashi/ParallelWaveGAN

examples/multiband_pwgan/conf/multiband_pwgan.v1.yaml

Lines changed: 5 additions & 4 deletions
@@ -79,11 +79,12 @@ generator_optimizer_params:
 amsgrad: false
 
 discriminator_optimizer_params:
-lr_fn: "PiecewiseConstantDecay"
+lr_fn: "ExponentialDecay"
 lr_params:
-boundaries: [100000, 200000, 300000, 400000, 500000]
-values: [0.00025, 0.000125, 0.0000625, 0.00003125, 0.000015625, 0.000001]
-amsgrad: false
+initial_learning_rate: 0.0005
+decay_steps: 200000
+decay_rate: 0.5
+
 
 ###########################################################
 # INTERVAL SETTING #
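
For reference, the new discriminator schedule in this file can be sketched in plain Python. This is an illustration only: it assumes the `lr_fn` / `lr_params` pair maps onto `tf.keras.optimizers.schedules.ExponentialDecay` with its default `staircase=False`, which this diff does not show. The helper name `exponential_decay` is hypothetical.

```python
def exponential_decay(step, initial_learning_rate=0.0005,
                      decay_steps=200000, decay_rate=0.5):
    """Continuous exponential decay: lr = initial * rate ** (step / decay_steps).

    Defaults are the values added to multiband_pwgan.v1.yaml in this commit.
    Assumption: no staircase rounding is applied to step / decay_steps.
    """
    return initial_learning_rate * decay_rate ** (step / decay_steps)


if __name__ == "__main__":
    # The rate halves once every `decay_steps` steps.
    for step in (0, 100000, 200000, 400000):
        print(step, exponential_decay(step))
```

Compared with the removed `PiecewiseConstantDecay` settings, which halved the rate every 100000 steps starting from 0.00025, this sketch starts at twice that value and halves it every 200000 steps.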

examples/multiband_pwgan/conf/multiband_pwgan.v1ft.yaml

Lines changed: 12 additions & 11 deletions
@@ -1,6 +1,6 @@
 
 # This is the hyperparameter configuration file for Multi-Band MelGAN with PWGAN discriminator.
-# This one is adjusted for finetuning
+# This one is adjusted for finetuning, used to finetune the LJSpeech pretrained on
 
 ###########################################################
 # FEATURE EXTRACTION SETTING #
@@ -72,25 +72,26 @@ is_shuffle: true # shuffle dataset after each epoch.
 generator_optimizer_params:
 lr_fn: "PiecewiseConstantDecay"
 lr_params:
-boundaries: [100000, 200000, 300000, 400000, 500000, 600000, 700000]
-values: [0.0005, 0.0005, 0.00025, 0.000125, 0.0000625, 0.00003125, 0.000015625, 0.000001]
+boundaries: [1000, 5000, 10000, 20000]
+values: [0.00000000001, 0.000000000005, 0.000000000002, 0.0000000000005, 0.0000000000002]
 amsgrad: false
 
+
 discriminator_optimizer_params:
-lr_fn: "PiecewiseConstantDecay"
+lr_fn: "ExponentialDecay"
 lr_params:
-boundaries: [100000, 200000, 300000, 400000, 500000]
-values: [0.00025, 0.000125, 0.0000625, 0.00003125, 0.000015625, 0.000001]
-amsgrad: false
+initial_learning_rate: 0.0000000005
+decay_steps: 70000
+decay_rate: 0.5
 
 ###########################################################
 # INTERVAL SETTING #
 ###########################################################
 discriminator_train_start_steps: 0 # steps begin training discriminator
-train_max_steps: 200000 # Number of training steps.
-save_interval_steps: 5000 # Interval steps to save checkpoint.
-eval_interval_steps: 1000 # Interval steps to evaluate the network.
-log_interval_steps: 200 # Interval steps to record the training log.
+train_max_steps: 10000 # Number of training steps.
+save_interval_steps: 1500 # Interval steps to save checkpoint.
+eval_interval_steps: 500 # Interval steps to evaluate the network.
+log_interval_steps: 100 # Interval steps to record the training log.
 
 ###########################################################
 # OTHER SETTING #
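
The generator schedule in the finetuning config can be sketched the same way. This is a minimal illustration, assuming the values are consumed with the semantics of `tf.keras.optimizers.schedules.PiecewiseConstantDecay` (`values[0]` while `step <= boundaries[0]`, and so on) and that the step counter starts at 0 when finetuning. Neither assumption is confirmed by this diff, and the helper name `piecewise_constant` is hypothetical.

```python
from bisect import bisect_left

# Values added to multiband_pwgan.v1ft.yaml in this commit.
BOUNDARIES = [1000, 5000, 10000, 20000]
VALUES = [0.00000000001, 0.000000000005, 0.000000000002,
          0.0000000000005, 0.0000000000002]


def piecewise_constant(step, boundaries=BOUNDARIES, values=VALUES):
    """Return the learning rate for `step`.

    A boundary step still uses the earlier value, because bisect_left
    counts only the boundaries strictly below `step`.
    """
    return values[bisect_left(boundaries, step)]


if __name__ == "__main__":
    for step in (0, 1000, 1001, 5000, 10000):
        print(step, piecewise_constant(step))
```

Under these assumptions, a run capped at `train_max_steps: 10000` would only ever reach the first three entries of `values`; the last two apply after steps 10000 and 20000.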
