Applio-Website/apps/applio-docs/src/content/docs/getting-started/training.mdx at 318147c841f0f8c7df0b0147824352e95ab92baa · IAHispano/Applio-Website

---
title: Training a Voice Model
description: A step-by-step guide to training a high-quality voice model in Applio.
sidebar:
  order: 6
---

import { Aside, Steps, FileTree } from '@astrojs/starlight/components';

Training is the process by which Applio learns to replicate a voice from a dataset of audio files. This guide walks you through each step, from preparing your dataset to exporting your final model.

If you don't have a powerful local GPU, you can use cloud alternatives to train your model. Check out our [Cloud Guide](/getting-started/cloud-guides/).

Step 1: Prepare Your Dataset

The first and most important step is to prepare a high-quality audio dataset.

- **Duration:** Aim for 10-30 minutes of clean audio.
- **Format:** Your audio files must be in a lossless format, such as `.wav` or `.flac`.
- **Quality:** The audio should be free of background noise, reverb, and other artifacts.
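As a quick sanity check on the duration guideline above, the following minimal sketch sums the length of the `.wav` files in a folder using Python's standard `wave` module. It is not part of Applio: the function names are hypothetical, it only reads PCM `.wav` files (not `.flac`), and it says nothing about audio quality.

```python
import wave
from pathlib import Path

MIN_MINUTES = 10  # lower bound suggested in this guide
MAX_MINUTES = 30  # upper bound suggested in this guide


def total_wav_minutes(dataset_dir):
    """Sum the duration of every .wav file under dataset_dir, in minutes."""
    total_seconds = 0.0
    for path in sorted(Path(dataset_dir).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav_file:
            # duration = number of frames / frames per second
            total_seconds += wav_file.getnframes() / wav_file.getframerate()
    return total_seconds / 60.0


def duration_in_range(minutes):
    """True when the total falls inside the suggested 10-30 minute window."""
    return MIN_MINUTES <= minutes <= MAX_MINUTES
```

For example, `duration_in_range(total_wav_minutes("applio/assets/datasets/your-model-name"))` reports whether the folder falls inside the suggested window.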

For a detailed guide on creating a high-quality dataset, please see our Dataset Creation Guide.

Once your dataset is ready, create a new folder for your model inside the `applio/assets/datasets` directory and place your audio files in it.

Multi-Speaker Models (Optional)

If you want to train a model with multiple speakers, create a subfolder for each speaker inside your model's dataset folder. The speaker folders must be named numerically, starting from 0.

- applio/assets/datasets/your-model-name/
  - 0/
    - speaker0-audio1.wav
    - speaker0-audio2.wav
  - 1/
    - speaker1-audio1.wav
    - speaker1-audio2.wav
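The folder naming rule above can be checked before training with a short script. This is a sketch with a hypothetical helper name, not an Applio function. It also assumes the speaker numbers should be consecutive (0, 1, 2, ...), which the guide implies but does not state outright.

```python
from pathlib import Path


def speaker_folders_ok(model_dataset_dir):
    """Check that the subfolders are named 0, 1, 2, ... (assumed consecutive)."""
    names = [p.name for p in Path(model_dataset_dir).iterdir() if p.is_dir()]
    # Every speaker folder must be a plain decimal number such as "0" or "12".
    if not names or not all(name.isdecimal() for name in names):
        return False
    ids = sorted(int(name) for name in names)
    # Numbering must start at 0 and leave no gaps.
    return ids == list(range(len(ids)))
```

Calling `speaker_folders_ok("applio/assets/datasets/your-model-name")` returns `True` for the layout shown above and `False` if a folder is misnamed or the numbering does not start at 0.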

Step 2: Pre-process the Dataset

Now it's time to pre-process your dataset.

1. In the **Train** tab of Applio, enter a name for your model.
2. Select the correct sample rate for your audio files (`32k`, `40k`, or `48k`).
3. Click the **Pre-process Dataset** button.
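If you are unsure which sample rate your files use, a small script can report it. The sketch below is an illustration, not Applio code: the helper names are hypothetical, it reads only PCM `.wav` files, and the mapping from a rate in Hz to the `32k`, `40k`, and `48k` labels is an assumption. It returns `None` when the files disagree or use a rate outside those three options, leaving the choice to you.

```python
import wave
from collections import Counter
from pathlib import Path

# Assumed mapping from a sample rate in Hz to the option labels in this guide.
RATE_LABELS = {32000: "32k", 40000: "40k", 48000: "48k"}


def sample_rate_counts(dataset_dir):
    """Count how many .wav files use each sample rate (in Hz)."""
    counts = Counter()
    for path in Path(dataset_dir).rglob("*.wav"):
        with wave.open(str(path), "rb") as wav_file:
            counts[wav_file.getframerate()] += 1
    return counts


def matching_label(counts):
    """Return the option label if every file shares one listed rate, else None."""
    if len(counts) != 1:
        return None
    (rate,) = counts  # the single sample rate shared by all files
    return RATE_LABELS.get(rate)
```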

Step 3: Extract Features

Next, you need to extract the features from your pre-processed dataset.

1. **Choose a Pitch Extraction Algorithm:** We recommend using **RMVPE** for the best results.
2. **Select an Embedder Model:** Make sure to choose the correct embedder for your model.
3. Click the **Extract Features** button.

This process will take some time. You can monitor the progress in the command line window.

Step 4: Train the Model and Index

This is the final and most time-consuming step.

1. **Set the "Save Every Epoch" Value:** This determines how often the model is saved. A value between 10 and 50 is recommended.
2. **Set the "Total Epochs":** This is the total number of times the model will train on the entire dataset. A good starting point is 200-400 epochs, but you should use [TensorBoard](/getting-started/tensorboard) to monitor your model's progress and decide when to stop.
3. **Set the "Batch Size":** This depends on your GPU's VRAM. For an 8GB GPU, a batch size of 6-8 is a good starting point.
4. Click the **Train Model** button.
5. Once the model training is complete, click the **Train Index** button.
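To see how these three settings interact, the arithmetic can be sketched in a few lines. This is a rough illustration under stated assumptions, not a description of Applio's internals: it assumes a checkpoint is written at every multiple of the "Save Every Epoch" value, that the final partial batch is kept, and `num_segments` is a hypothetical count of pre-processed audio clips.

```python
import math


def checkpoint_epochs(total_epochs, save_every):
    """Epochs at which a save would occur, assuming one save every N epochs."""
    return list(range(save_every, total_epochs + 1, save_every))


def steps_per_epoch(num_segments, batch_size):
    """Batches needed to cover the dataset once (the last batch may be partial)."""
    return math.ceil(num_segments / batch_size)
```

With 200 total epochs and a save interval of 50, `checkpoint_epochs(200, 50)` gives `[50, 100, 150, 200]`. With 1000 clips and a batch size of 8, `steps_per_epoch(1000, 8)` gives `125`. A smaller save interval means more saves and therefore more disk usage.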

Step 5: Export Your Model

Your trained models are saved in the `logs` folder. You can also export them directly from the Applio interface.

1. Go to the **Export Model** section in the **Train** tab.
2. Click the **Refresh** button.
3. Select the `.pth` file and the corresponding `.index` file for your model.
4. Click the **Export Model** button.

You can often get better results by using a [pre-trained model](/getting-started/pretrained) as a starting point for your training.
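If you prefer to locate the `.pth` and `.index` files on disk rather than through the interface, a short script can list them. This is a sketch with a hypothetical helper name. The exact layout inside `logs` is not described in this guide, so the function simply searches whichever folder you pass in, and some of the `.pth` files it finds may be intermediate training checkpoints rather than the final model.

```python
from pathlib import Path


def find_export_candidates(model_log_dir):
    """Return (.pth files, .index files) found under model_log_dir, newest first."""
    root = Path(model_log_dir)

    def modified_time(path):
        return path.stat().st_mtime

    pth_files = sorted(root.rglob("*.pth"), key=modified_time, reverse=True)
    index_files = sorted(root.rglob("*.index"), key=modified_time, reverse=True)
    return pth_files, index_files
```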

Resume Training (Optional)

If you want to continue training a model you've already started, follow these steps to resume from where you left off:

1. **Select Your Model** from the dropdown menu.
2. Make sure to **select the same original sample rate** that you used when you started training (e.g., `32k`, `40k`, or `48k`).
3. Scroll down to the **Training** section.
4. Choose the **same batch size** you used previously.
5. Set a **new max epoch value** that is higher than your current one. For example, if your last completed epoch was 200, you can set this to 400 to continue training up to that point.
6. Click the **Start Training** button to resume training from the latest saved checkpoint.

Make sure your previously saved model checkpoints are still in the `logs` directory. Applio will automatically load the latest checkpoint and continue from there.
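The rule for the new max epoch value can be written as a one-line check. The function below is a hypothetical helper for illustration only. It computes how many additional epochs a resumed run would cover and rejects a target that is not higher than the last completed epoch.

```python
def epochs_remaining(last_completed_epoch, new_max_epoch):
    """Epochs still to run after resuming; the new target must be higher."""
    if new_max_epoch <= last_completed_epoch:
        raise ValueError("new max epoch must be higher than the last completed epoch")
    return new_max_epoch - last_completed_epoch
```

Using the example from the steps above, `epochs_remaining(200, 400)` returns `200`.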
