You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: README.md
+4-4Lines changed: 4 additions & 4 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -7,7 +7,7 @@ Contrarily to other TTS models, Parler-TTS is a **fully open-source** release. A
7
7
This repository contains the inference and training code for Parler-TTS. It is designed to accompany the [Data-Speech](https://github.com/huggingface/dataspeech) repository for dataset annotation.
8
8
9
9
> [!IMPORTANT]
10
-
> We're proud to release Parler-TTS v0.1, our first 300M parameter model, trained on 10.5K hours of audio data.
10
+
> We're proud to release [Parler-TTS Mini v0.1](https://huggingface.co/parler-tts/parler_tts_mini_v0.1), our first 600M parameter model, trained on 10.5K hours of audio data.
11
11
> In the coming weeks, we'll be working on scaling up to 50k hours of data, in preparation for the v1 model.
description ="A female speaker with a slightly low-pitched voice delivers her words quite expressively, in a very confined sounding environment with clear audio quality. She speaks very fast."
@@ -63,7 +63,7 @@ The [training folder](/training/) contains all the information to train or fine-
63
63
-[3. A training guide](/training/README.md#3-training)
64
64
65
65
> [!IMPORTANT]
66
-
> **TL;DR:** After having followed the [installation steps](/training/README.md#requirements), you can reproduce the Parler-TTS v0.1 training recipe with the following command line:
66
+
> **TL;DR:** After having followed the [installation steps](/training/README.md#requirements), you can reproduce the Parler-TTS Mini v0.1 training recipe with the following command line:
Copy file name to clipboardExpand all lines: training/README.md
+8-8Lines changed: 8 additions & 8 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -1,6 +1,6 @@
1
1
# Training Parler-TTS
2
2
3
-
**TL;DR:** After having followed the [installation steps](#requirements), you can reproduce the Parler-TTS v0.1 training recipe with the following command line:
3
+
**TL;DR:** After having followed the [installation steps](#requirements), you can reproduce the [Parler-TTS Mini v0.1](https://huggingface.co/parler-tts/parler_tts_mini_v0.1) training recipe with the following command line:
@@ -71,18 +71,18 @@ And then enter an authentication token from https://huggingface.co/settings/toke
71
71
72
72
Depending on your compute resources and your dataset, you need to choose between fine-tuning a pre-trained model and training a new model from scratch.
73
73
74
-
In that sense, we released a 300M checkpoint trained on 10.5K hours of annotated data under the repository id: [`parler-tts/parler_tts_300M_v0.1`](https://huggingface.co/parler-tts/parler_tts_300M_v0.1), that you can fine-tune for your own use-case.
74
+
In that sense, we released a 600M checkpoint trained on 10.5K hours of annotated data under the repository id: [`parler-tts/parler_tts_mini_v0.1`](https://huggingface.co/parler-tts/parler_tts_mini_v0.1), that you can fine-tune for your own use-case.
75
75
76
76
You can also train you own model from scratch. You can find [here](/helpers/model_init_scripts/) examples on how to initialize a model from scratch. For example, you can initialize a dummy model with:
@@ -95,7 +95,7 @@ To train your own Parler-TTS, you need datasets with 3 main features:
95
95
96
96
Note that we made the choice to use description of the main speech characteristics (speaker pitch, speaking rate, level of noise, etc.) but that you are free to use any handmade or generated text description that makes sense.
97
97
98
-
To train Parler-TTS v0.1, we used:
98
+
To train Parler-TTS Mini v0.1, we used:
99
99
* The full [LibriTTS-R dataset](https://huggingface.co/datasets/blabble-io/libritts_r), a 1K hours high-quality speech dataset.
100
100
* A [10K hours subset](https://huggingface.co/datasets/parler-tts/mls_eng_10k) of [Multilingual LibriSpeech](https://huggingface.co/datasets/facebook/multilingual_librispeech).
101
101
@@ -109,11 +109,11 @@ The script [`run_parler_tts_training.py`](/training/run_parler_tts_training.py)
0 commit comments