README.md: 8 additions & 7 deletions
@@ -34,7 +34,7 @@ or pip install -r requirements_3090.txt (GPU 3090, CUDA 11.4)
a) Download and extract the [LJ Speech dataset](https://keithito.com/LJ-Speech-Dataset/), then create a link to the dataset folder: `ln -s /xxx/LJSpeech-1.1/ data/raw/`
-b) Download and Unzip the [ground-truth duration](https://drive.google.com/file/d/1SqwIISwaBZDiCW1MHTHx-MKX6_NQJ_f4/view?usp=sharing) extracted by [MFA](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/releases/download/v1.0.1/montreal-forced-aligner_linux.tar.gz): `tar -xvf mfa_outputs.tar; mv mfa_outputs data/processed/ljspeech/`
+b) Download and Unzip the [ground-truth duration](https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/mfa_outputs.tar) extracted by [MFA](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/releases/download/v1.0.1/montreal-forced-aligner_linux.tar.gz): `tar -xvf mfa_outputs.tar; mv mfa_outputs data/processed/ljspeech/`
c) Run the following scripts to pack the dataset for training/inference.
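The packing scripts themselves fall outside this hunk. A minimal sketch of the step, assuming the repository's `data_gen/tts/bin/binarize.py` entry point and an LJSpeech config such as `configs/tts/lj/fs2.yaml` (both paths are assumptions, not shown in this diff):

```sh
# Pack (binarize) LJSpeech for training/inference.
# Script and config paths are assumptions about the repo layout, not part of this diff.
export PYTHONPATH=.
CUDA_VISIBLE_DEVICES=0 python data_gen/tts/bin/binarize.py --config configs/tts/lj/fs2.yaml
# The binarized dataset is expected under data/binary/ (assumed output location).
```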
-- the pre-trained model of [DiffSpeech](https://drive.google.com/file/d/1AHRuNS379v2_lNuz4-Mjlpii7TZsfs3f/view?usp=sharing);
-- the pre-trained model of [HifiGAN](https://drive.google.com/file/d/1Z3DJ9fvvzIci9DAf8jwchQs-Ulgpx6l8/view?usp=sharing) vocoder;
-- the individual pre-trained model of [FastSpeech 2](https://drive.google.com/file/d/1Zp45YjKkkv5vQSA7woHIqEggfyLqQdqs/view?usp=sharing) for the shallow diffusion mechanism in DiffSpeech;
+- the pre-trained model of [DiffSpeech](https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/lj_ds_beta6_1213.zip);
+- the pre-trained model of [HifiGAN](https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/0414_hifi_lj_1.zip) vocoder;
+- the individual pre-trained model of [FastSpeech 2](https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/fs2_lj_1.zip) for the shallow diffusion mechanism in DiffSpeech;
Remember to put the pre-trained models in `checkpoints` directory.
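A minimal sketch of fetching and placing the three DiffSpeech checkpoints linked above, assuming each release zip unpacks into its own folder under `checkpoints/` (the extracted folder layout is an assumption):

```sh
# Download the release assets listed above and unpack them into checkpoints/.
# Assumption: each zip contains a top-level checkpoint folder.
mkdir -p checkpoints
for f in lj_ds_beta6_1213.zip 0414_hifi_lj_1.zip fs2_lj_1.zip; do
  wget "https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/$f"
  unzip -o "$f" -d checkpoints/
done
```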
@@ -72,6 +72,7 @@ About the determination of 'k' in shallow diffusion: We recommend the trick intr
### 0. Data Acquirement
- See in [apply_form](https://github.com/MoonInTheRiver/DiffSinger/blob/master/resources/apply_form.md).
-- the pre-trained model of [DiffSinger](https://drive.google.com/file/d/1QEXcvhhiUiHEK2ItXZ8EDHwv8bawiaIX/view?usp=sharing);
-- the pre-trained model of [FFT-Singer](https://drive.google.com/file/d/1XRCdkI8B-DkRe8NfUJqgSjM-9c0gXQvJ/view?usp=sharing) for the shallow diffusion mechanism in DiffSinger;
-- the pre-trained model of [HifiGAN-Singing](https://drive.google.com/file/d/1Z9bH3vorM34gBbjBlGGWWGVl4PwYy3YY/view?usp=sharing) which is specially designed for SVS with NSF mechanism.
+- the pre-trained model of [DiffSinger](https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/popcs_ds_beta6_offline_pmf0_1230.zip);
+- the pre-trained model of [FFT-Singer](https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/popcs_fs2_pmf0_1230.zip) for the shallow diffusion mechanism in DiffSinger;
+- the pre-trained model of [HifiGAN-Singing](https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/0109_hifigan_bigpopcs_hop128.zip) which is specially designed for SVS with NSF mechanism.
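The SVS checkpoints above can presumably be fetched and placed the same way; a short sketch, again assuming each zip unpacks into its own folder under `checkpoints/`:

```sh
# Same pattern as for DiffSpeech; the folder layout inside the zips is an assumption.
for f in popcs_ds_beta6_offline_pmf0_1230.zip popcs_fs2_pmf0_1230.zip 0109_hifigan_bigpopcs_hop128.zip; do
  wget "https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/$f"
  unzip -o "$f" -d checkpoints/
done
```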