Commit a8675e8

Update README to fix references
1 parent 291c1a9 commit a8675e8

1 file changed: +13 -24 lines changed

examples/multiband_pwgan/README.md

Lines changed: 13 additions & 24 deletions
@@ -1,22 +1,22 @@
 # Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech (With ParallelWaveGAN discriminator)
-Based on the script [`train_multiband_pwgan.py`](https://github.com/dathudeptrai/TensorflowTTS/tree/master/examples/multiband_pwgan/train_multiband_pwgan.py).
+Based on the script [`train_multiband_pwgan.py`](https://github.com/tensorspeech/TensorflowTTS/tree/master/examples/multiband_pwgan/train_multiband_pwgan.py).
 
-## Training Multi-band MelGAN from scratch with LJSpeech dataset.
+## Training Multi-band MelGAN with PWGAN generator from scratch with LJSpeech dataset.
 This example code show you how to train MelGAN from scratch with Tensorflow 2 based on custom training loop and tf.function. The data used for this example is LJSpeech, you can download the dataset at [link](https://keithito.com/LJ-Speech-Dataset/).
 
 ### Step 1: Create Tensorflow based Dataloader (tf.dataset)
-Please see detail at [examples/melgan/](https://github.com/dathudeptrai/TensorflowTTS/tree/master/examples/melgan#step-1-create-tensorflow-based-dataloader-tfdataset)
+Please see detail at [examples/melgan/](https://github.com/tensorspeech/TensorflowTTS/tree/master/examples/melgan#step-1-create-tensorflow-based-dataloader-tfdataset)
 
 ### Step 2: Training from scratch
-After you re-define your dataloader, pls modify an input arguments, train_dataset and valid_dataset from [`train_multiband_pwgan.py`](https://github.com/dathudeptrai/TensorflowTTS/tree/master/examples/multiband_pwgan/train_multiband_pwgan.py). Here is an example command line to training melgan-stft from scratch:
+After you re-define your dataloader, pls modify an input arguments, train_dataset and valid_dataset from [`train_multiband_pwgan.py`](https://github.com/tensorspeech/TensorflowTTS/tree/master/examples/multiband_pwgan/train_multiband_pwgan.py). Here is an example command line to training melgan-stft from scratch:
 
 First, you need training generator with only stft loss:
 
 ```bash
 CUDA_VISIBLE_DEVICES=0 python examples/multiband_pwgan/train_multiband_pwgan.py \
 --train-dir ./dump/train/ \
 --dev-dir ./dump/valid/ \
---outdir ./examples/multiband_pwgan/exp/train.multiband_melgan.v1/ \
+--outdir ./examples/multiband_pwgan/exp/train.multiband_pwgan.v1/ \
 --config ./examples/multiband_pwgan/conf/multiband_pwgan.v1.yaml \
 --use-norm 1 \
 --generator_mixed_precision 1 \
@@ -29,18 +29,18 @@ Then resume and start training generator + discriminator:
 CUDA_VISIBLE_DEVICES=0 python examples/multiband_pwgan/train_multiband_pwgan.py \
 --train-dir ./dump/train/ \
 --dev-dir ./dump/valid/ \
---outdir ./examples/multiband_pwgan/exp/train.multiband_melgan.v1/ \
+--outdir ./examples/multiband_pwgan/exp/train.multiband_pwgan.v1/ \
 --config ./examples/multiband_pwgan/conf/multiband_pwgan.v1.yaml \
 --use-norm 1 \
---resume ./examples/multiband_pwgan/exp/train.multiband_melgan.v1/checkpoints/ckpt-200000
+--resume ./examples/multiband_pwgan/exp/train.multiband_pwgan.v1/checkpoints/ckpt-200000
 ```
 
 IF you want to use MultiGPU to training you can replace `CUDA_VISIBLE_DEVICES=0` by `CUDA_VISIBLE_DEVICES=0,1,2,3` for example. You also need to tune the `batch_size` for each GPU (in config file) by yourself to maximize the performance. Note that MultiGPU now support for Training but not yet support for Decode.
 
 In case you want to resume the training progress, please following below example command line:
 
 ```bash
---resume ./examples/multiband_pwgan/exp/train.multiband_melgan.v1/checkpoints/ckpt-100000
+--resume ./examples/multiband_pwgan/exp/train.multiband_pwgan.v1/checkpoints/ckpt-100000
 ```
 
 **IMPORTANT NOTES**:
@@ -55,37 +55,26 @@ To running inference on folder mel-spectrogram (eg valid folder), run below comm
 CUDA_VISIBLE_DEVICES=0 python examples/multiband_pwgan/decode_mb_melgan.py \
 --rootdir ./dump/valid/ \
 --outdir ./prediction/multiband_melgan.v1/ \
---checkpoint ./examples/multiband_pwgan/exp/train.multiband_melgan.v1/checkpoints/generator-940000.h5 \
+--checkpoint ./examples/multiband_pwgan/exp/train.multiband_pwgan.v1/checkpoints/generator-940000.h5 \
 --config ./examples/multiband_pwgan/conf/multiband_pwgan.v1.yaml \
 --batch-size 32 \
 --use-norm 1
 ```
 
 ## Finetune Multi-Band MelGAN + PWGAN Disc with ljspeech pretrained on other languages
-Download generator weights of Multi-Band MelGAN model, pass to `--pretrained` argument
+Download generator weights of (any) Multi-Band MelGAN model, pass to `--pretrained` argument.
+It's recommended to use (and tune if necessary), the dedicated finetuning config `train.multiband_pwgan.v1ft.yaml`
 
 ```bash
 CUDA_VISIBLE_DEVICES=0 python examples/multiband_pwgan/train_multiband_pwgan.py \
 --train-dir ./dump/train/ \
 --dev-dir ./dump/valid/ \
---outdir ./examples/multiband_pwgan/exp/train.multiband_melgan.v1ft/ \
---config ./examples/multiband_pwgan/conf/multiband_pwgan.v1.yaml \
+--outdir ./examples/multiband_pwgan/exp/train.multiband_pwgan.v1/ \
+--config ./examples/multiband_pwgan/conf/train.multiband_pwgan.v1ft.yaml \
 --use-norm 1 \
 --generator_mixed_precision 1 \
 --pretrained "ptgen.h5"
 ```
-## Learning Curves
-Here is a learning curves of melgan based on this config [`multiband_melgan.v1.yaml`](https://github.com/dathudeptrai/TensorflowTTS/tree/master/examples/multiband_pwgan/conf/multiband_pwgan.v1.yaml)
-
-<img src="fig/eval.png" height="300" width="850">
-
-<img src="fig/train.png" height="300" width="850">
-
-## Pretrained Models and Audio samples
-| Model | Conf | Lang | Fs [Hz] | Mel range [Hz] | FFT / Hop / Win [pt] | # iters |
-| :------ | :---: | :---: | :----: | :--------: | :---------------: | :-----: |
-| [multiband_melgan.v1](https://drive.google.com/drive/folders/1Hg82YnPbX6dfF7DxVs4c96RBaiFbh-cT?usp=sharing) | [link](https://github.com/tensorspeech/TensorFlowTTS/tree/master/examples/multiband_pwgan/conf/multiband_pwgan.v1.yaml) | EN | 22.05k | 80-7600 | 1024 / 256 / None | 940K |
-| [multiband_melgan.v1](https://drive.google.com/drive/folders/199XCXER51PWf_VzUpOwxfY_8XDfeXuZl?usp=sharing) | [link](https://github.com/dathudeptrai/TensorflowTTS/tree/master/examples/multiband_pwgan/conf/multiband_pwgan.v1.yaml) | KO | 22.05k | 80-7600 | 1024 / 256 / None | 1000K |
 
 ## Notes
 1. Using RAdam for discriminator
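
The changed lines in this commit swap `dathudeptrai` for `tensorspeech` in URLs and `train.multiband_melgan` for `train.multiband_pwgan` in experiment paths. As a minimal sketch, assuming those two strings are the only ones of interest, a reviewer could list any lines that still carry the old forms. The helper name `find_stale_refs` is hypothetical and not part of the repository:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): print every line of a
# file that still contains one of the old strings this commit replaces.
find_stale_refs() {
  local file="$1"
  # -n prefixes each match with its line number; -E enables the | alternation.
  grep -nE 'dathudeptrai|train\.multiband_melgan' "$file"
}

# Example run against the README touched by this commit, if it is present.
readme="examples/multiband_pwgan/README.md"
if [ -f "$readme" ]; then
  # grep exits non-zero when nothing matches, so report that case explicitly.
  find_stale_refs "$readme" || echo "no stale references found"
fi
```

Any output from `find_stale_refs` after the commit is applied would point at a reference that may still need updating. Whether a given hit is actually stale remains a judgment for the reviewer.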
