Speech Dataset Maker

Speech Dataset Maker is a Windows application designed to simplify the creation of high-quality speech datasets for training Text-to-Speech (TTS) models. It is especially convenient for directly preparing datasets in Piper format.

Features

- User-Friendly Interface: Intuitive GUI for easy dataset creation.
- Direct Export for TTS: Export recordings directly in Piper format without a separate export step.
- Multiple Dataset Support: Define different datasets with different sentences and technical settings.
- Automatic Silence Trimming: Automatically removes silence at the beginning and end of recordings.
- Unicode & RTL/LTR Support: Works with any language, including right-to-left scripts.
- Editable Metadata: Edit the text before saving to the metadata file.
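
To illustrate what trimming silence at the beginning and end of a recording means, here is a minimal Python sketch. It is not the application's actual implementation, which is not described here; the function name `trim_silence` and the `threshold` value are assumptions chosen for illustration, and the samples are assumed to be floats in the range -1.0 to 1.0.

```python
def trim_silence(samples, threshold=0.01):
    """Return samples without leading and trailing low-amplitude values.

    threshold is a hypothetical amplitude below which a sample
    is treated as silence.
    """
    start = 0
    end = len(samples)
    # Advance past leading samples that are below the threshold.
    while start < end and abs(samples[start]) < threshold:
        start += 1
    # Walk back past trailing samples that are below the threshold.
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]


recording = [0.0, 0.002, 0.4, -0.3, 0.001, 0.25, 0.0, 0.0]
trimmed = trim_silence(recording)
print(trimmed)
```

Note that the quiet sample in the middle of the recording is kept: only the edges are trimmed, so pauses inside a sentence survive.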

Installation

1. Download the setup file from the Releases page.
2. Run the setup file to install the application.
3. Ensure the .NET 8.0 Desktop Runtime is installed on your system.

Requirements

- Windows operating system
- .NET 8.0 Desktop Runtime (57 MB)

Usage

Before you can record a dataset, it needs two files:

- a .json file with the dataset's configuration
- a .tsv file with ID–sentence pairs

You can add or edit these files in the dataset folder of the application. A link to that folder is provided in the interface.
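
If you want to check a sentence file before loading it, the following Python sketch parses ID–sentence pairs from tab-separated text. It assumes two columns per line (ID, then sentence), no header row, and UTF-8 text; the exact layout the application expects may differ, and the function name `read_sentences` and the sample rows are made up for illustration.

```python
import csv
import io


def read_sentences(tsv_text):
    """Parse tab-separated ID/sentence rows into a list of (id, sentence) tuples."""
    # QUOTE_NONE keeps quotation marks inside sentences as literal characters.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    pairs = []
    for row in reader:
        # Skip blank lines and rows that lack an ID or a sentence column.
        if len(row) < 2 or not row[0].strip():
            continue
        pairs.append((row[0].strip(), row[1].strip()))
    return pairs


# Hypothetical file content: one left-to-right and one right-to-left sentence.
sample = "0001\tThe quick brown fox.\n0002\tمرحبا بالعالم\n\n"
pairs = read_sentences(sample)
for sentence_id, sentence in pairs:
    print(sentence_id, sentence)
```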

1. Select the desired microphone from the list of available devices.
2. Select the dataset in the interface.
3. Unrecorded sentences will appear in the text box; you can edit them before recording.
4. When ready, click Record and read the sentence aloud.
5. Use the Play button to check your recording.
6. If satisfied, click Save. The next sentence will appear.
   - The app automatically trims silence and updates metadata.
   - Recordings are directly stored in Piper format, ready for TTS training.
7. Click Output Folder to view the dataset at any stage.
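
To inspect the metadata the app maintains, it helps to know what a row might look like. The Python sketch below assumes the LJSpeech-style layout that Piper's training documentation describes for single-speaker datasets, one `id|text` row per recording; whether this application writes exactly that layout is an assumption, and the helper names `metadata_line` and `recorded_ids` are hypothetical.

```python
def metadata_line(utterance_id, text):
    """Format one 'id|text' row; '|' is the column delimiter, so it is rejected in both fields."""
    clean = " ".join(text.split())  # collapse newlines and repeated whitespace
    if "|" in clean or "|" in utterance_id:
        raise ValueError("'|' is reserved as the column delimiter")
    return f"{utterance_id}|{clean}"


def recorded_ids(metadata_text):
    """Collect the IDs that already have a row in the metadata text."""
    return {line.split("|", 1)[0] for line in metadata_text.splitlines() if line.strip()}


rows = [
    metadata_line("0001", "The quick  brown fox."),
    metadata_line("0002", "Hello\nworld"),
]
metadata_text = "\n".join(rows) + "\n"
print(metadata_text)

# Sentences whose IDs are missing from the metadata are still unrecorded.
all_ids = ["0001", "0002", "0003"]
done = recorded_ids(metadata_text)
print([i for i in all_ids if i not in done])
```

Comparing the IDs in the sentence file against the IDs in the metadata is one simple way to see how much of a dataset is left to record.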

Acknowledgments

Ideas and inspiration for this project were adapted from Piper Recording Studio.

License

This project is licensed under the MIT License. See the LICENSE file for details.

aso-mehmudi/Speech-Dataset-Maker