PyTorch Implementation of DistillW2N: A Lightweight One-Shot Whisper to Normal Voice Conversion Model Using Distillation of Self-Supervised Features

- Create a Python environment, e.g. with conda:
  conda create --name distillw2n python=3.10.12 --yes
- Activate the new environment:
  conda activate distillw2n
- Install torch and torchaudio:
  pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu121
- Install the required system packages:
  sudo apt-get update && sudo apt-get install -y libsndfile1 ffmpeg
- Install the Python requirements:
  pip install -r requirements.txt
- Download the pretrained models using the links given in the txt file.
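After completing the steps above, you can sanity-check the environment before running anything else. The snippet below is a minimal sketch (the helper name `missing_packages` and the exact package list are assumptions, not part of this repo):

```python
import importlib.util

def missing_packages(names):
    """Return the packages from `names` that are not importable here."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Check the core dependencies installed in the steps above.
# `soundfile` is assumed here as the Python binding for libsndfile1.
print(missing_packages(["torch", "torchaudio", "soundfile"]))
```

An empty list means the installation succeeded; otherwise, rerun the corresponding install step.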
- For QuickVC and WESPER, run:
  python compare_infer.py
- For our models, run:
  python infer.py
- To train, run:
  python u2ss2u.py
- You only need to download the datasets under YOURPATH.
- Dataset Download
- For the libritts, ljspeech, and timit datasets, datahelper will automatically download them if they are not found at YOURPATH.
- For the wtimit dataset, you will need to request it via email. Follow the appropriate procedures to obtain access and download the dataset to YOURPATH.
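Before training, it can help to verify that the expected corpus folders exist under YOURPATH. This is a sketch only; the folder names and the helper `missing_datasets` are assumptions, not part of datahelper:

```python
from pathlib import Path

# Assumed per-corpus folder names under YOURPATH.
EXPECTED = ["libritts", "ljspeech", "timit", "wtimit"]

def missing_datasets(root):
    """Return the expected corpus folders not present under `root`."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).is_dir()]

# Anything reported here would either be auto-downloaded by datahelper
# (libritts, ljspeech, timit) or must be requested by email (wtimit).
print(missing_datasets("YOURPATH"))
```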
- Dataset Preparation (Optional)
- datapreper offers options for ppw (Pseudo-whisper) and vad (Voice Activity Detection) versions. You can choose to apply these processing steps according to your project's requirements.
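datapreper's exact VAD method is not specified here; the following is a minimal energy-based sketch of the idea (the function `simple_vad` and its thresholds are hypothetical, not the repo's implementation):

```python
def simple_vad(samples, frame_len=400, hop=160, threshold=1e-3):
    """Flag each frame as speech when its mean energy exceeds a threshold.

    Frame sizes assume 16 kHz audio (25 ms frames, 10 ms hop).
    """
    flags = []
    for start in range(0, max(len(samples) - frame_len, 0) + 1, hop):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        flags.append(energy > threshold)
    return flags

print(simple_vad([0.0] * 800))  # silence -> [False, False, False]
print(simple_vad([0.5] * 800))  # loud signal -> [True, True, True]
```

A real pipeline would use a trained VAD; this illustrates only what the vad option conceptually does to the data.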
This implementation builds on:
- SoundStream for the training pipeline.

TODO:
- Add Seed-VC inference samples for comparison.
- Train the SoundStream decoder on a larger dataset of high-quality audio (training in progress with limited resources).