update docs

r9y9 · r9y9 · commit 6873cfbe462f · 2021-08-21T16:34:11.000+09:00
diff --git a/docs/changelog.rst b/docs/changelog.rst
@@ -1,9 +1,12 @@
 Change log
 ==========
 
-v0.2.1 <2021-xx-xx>
+v0.2.1 <2021-08-21>
 -------------------
 
+- pretrained: add PWG TTS models for Common Voice (ja)
+- pretrained: add HiFi-GAN based TTS models using JVS and JSUT corpora
+- Add HiFi-GAN configs for JVS and JSUT extra recipes
 - `#7`_: Add script to generate ground-truth aligned (GTA) features
 - `#5`_: [docker] Push docker image to Docker Hub
 - `#4`_: [docker] fix docker build fail because no 'gcc' command
diff --git a/docs/pretrained.rst b/docs/pretrained.rst
@@ -30,17 +30,25 @@ Extra pretrained models
 Note that the following models are not explained in our book.
 Those were trained using extra recipes found in our GitHub repository.
 
-+----------------------------------+------------------------------------------------------------+-----------------------------------------------------------------------------------------------------+
-| Model ID                         | Class                                                      | Details of the model                                                                                |
-+----------------------------------+------------------------------------------------------------+-----------------------------------------------------------------------------------------------------+
-| ``tacotron2_pwg_jsut16k``        | :py:class:`ttslearn.contrib.tacotron2_pwg.Tacotron2PWGTTS` | Tacotron 2 with Parallel WaveGAN (PWG). Trained on JSUT corpus. Sampling rate: 16 kHz.              |
-+----------------------------------+------------------------------------------------------------+-----------------------------------------------------------------------------------------------------+
-| ``tacotron2_pwg_jsut24k``        | :py:class:`ttslearn.contrib.tacotron2_pwg.Tacotron2PWGTTS` | Tacotron 2 with Parallel WaveGAN (PWG). Trained on JSUT corpus. Sampling rate: 24 kHz.              |
-+----------------------------------+------------------------------------------------------------+-----------------------------------------------------------------------------------------------------+
-| ``multspk_tacotron2_pwg_jvs16k`` | :py:class:`ttslearn.contrib.tacotron2_pwg.Tacotron2PWGTTS` | Multi-speaker Tacotron 2 with Parallel WaveGAN (PWG). Trained on JVS corpus. Sampling rate: 16 kHz. |
-+----------------------------------+------------------------------------------------------------+-----------------------------------------------------------------------------------------------------+
-| ``multspk_tacotron2_pwg_jvs24k`` | :py:class:`ttslearn.contrib.tacotron2_pwg.Tacotron2PWGTTS` | Multi-speaker Tacotron 2 with Parallel WaveGAN (PWG). Trained on JVS corpus. Sampling rate: 24 kHz. |
-+----------------------------------+------------------------------------------------------------+-----------------------------------------------------------------------------------------------------+
++--------------------------------------+--------------+------------------------------------------------------------+-----------------------------------------------------------------------------------------------------+
+| Model ID                             | Corpus       | Class                                                      | Details of the model                                                                                |
++--------------------------------------+--------------+------------------------------------------------------------+-----------------------------------------------------------------------------------------------------+
+| ``tacotron2_pwg_jsut16k``            | JSUT         | :py:class:`ttslearn.contrib.tacotron2_pwg.Tacotron2PWGTTS` | Tacotron 2 with Parallel WaveGAN (PWG). Trained on JSUT corpus. Sampling rate: 16 kHz.              |
++--------------------------------------+--------------+------------------------------------------------------------+-----------------------------------------------------------------------------------------------------+
+| ``tacotron2_pwg_jsut24k``            | JSUT         | :py:class:`ttslearn.contrib.tacotron2_pwg.Tacotron2PWGTTS` | Tacotron 2 with PWG. Trained on JSUT corpus. Sampling rate: 24 kHz.                                 |
++--------------------------------------+--------------+------------------------------------------------------------+-----------------------------------------------------------------------------------------------------+
+| ``tacotron2_hifipwg_jsut24k``        | JSUT         | :py:class:`ttslearn.contrib.tacotron2_pwg.Tacotron2PWGTTS` | Tacotron 2 with HiFi-GAN. Trained on JSUT corpus. Sampling rate: 24 kHz.                            |
++--------------------------------------+--------------+------------------------------------------------------------+-----------------------------------------------------------------------------------------------------+
+| ``multspk_tacotron2_pwg_jvs16k``     | JVS          | :py:class:`ttslearn.contrib.tacotron2_pwg.Tacotron2PWGTTS` | Multi-speaker Tacotron 2 with PWG. Trained on JVS corpus. Sampling rate: 16 kHz.                    |
++--------------------------------------+--------------+------------------------------------------------------------+-----------------------------------------------------------------------------------------------------+
+| ``multspk_tacotron2_pwg_jvs24k``     | JVS          | :py:class:`ttslearn.contrib.tacotron2_pwg.Tacotron2PWGTTS` | Multi-speaker Tacotron 2 with PWG. Trained on JVS corpus. Sampling rate: 24 kHz.                    |
++--------------------------------------+--------------+------------------------------------------------------------+-----------------------------------------------------------------------------------------------------+
+| ``multspk_tacotron2_hifipwg_jvs24k`` | JVS          | :py:class:`ttslearn.contrib.tacotron2_pwg.Tacotron2PWGTTS` | Multi-speaker Tacotron 2 with HiFi-GAN. Trained on JVS corpus. Sampling rate: 24 kHz.               |
++--------------------------------------+--------------+------------------------------------------------------------+-----------------------------------------------------------------------------------------------------+
+| ``multspk_tacotron2_pwg_cv16k``      | Common Voice | :py:class:`ttslearn.contrib.tacotron2_pwg.Tacotron2PWGTTS` | Multi-speaker Tacotron 2 with PWG. Trained on Common Voice (ja) corpus. Sampling rate: 16 kHz.      |
++--------------------------------------+--------------+------------------------------------------------------------+-----------------------------------------------------------------------------------------------------+
+| ``multspk_tacotron2_pwg_cv24k``      | Common Voice | :py:class:`ttslearn.contrib.tacotron2_pwg.Tacotron2PWGTTS` | Multi-speaker Tacotron 2 with PWG. Trained on Common Voice (ja) corpus. Sampling rate: 24 kHz.      |
++--------------------------------------+--------------+------------------------------------------------------------+-----------------------------------------------------------------------------------------------------+
 
 Helpers
 --------