Text-to-Speech Model based on RWKV7 Architecture

Introduction

This repository explores using Focal Codec to convert between speech signals and discrete tokens, and uses an RNN model based on RWKV7 to predict those tokens for speech generation. The main features are as follows:

Multi-Stage Training: During the pre-training phase, the model performs token prediction on speech tokens only, which lets it learn speech features from large amounts of unlabeled speech data. A subsequent stage then aligns speech with text using text tokens.
Lightweight Character-Level Text Tokenizer: A lightweight character-level text tokenizer keeps the vocabulary size small, which is intended to help the model learn the relationship between text tokens and speech tokens.
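
To make the character-level tokenizer idea concrete, here is a minimal sketch. It is not the repository's implementation: the class name `CharTokenizer`, the special tokens `<pad>` and `<unk>`, and the example alphabet are all assumptions chosen for illustration. It only shows why a character-level vocabulary stays small (one ID per character plus a few specials).

```python
import string


class CharTokenizer:
    """Hypothetical character-level tokenizer (illustrative, not the repo's code)."""

    def __init__(self, alphabet, specials=("<pad>", "<unk>")):
        # Special tokens take the lowest IDs, followed by one ID per distinct character.
        self.n_special = len(specials)
        self.itos = list(specials) + sorted(set(alphabet))
        self.stoi = {tok: i for i, tok in enumerate(self.itos)}
        self.unk_id = self.stoi["<unk>"]

    @property
    def vocab_size(self):
        return len(self.itos)

    def encode(self, text):
        # Characters outside the alphabet map to the <unk> ID.
        return [self.stoi.get(ch, self.unk_id) for ch in text]

    def decode(self, ids):
        # Special tokens are dropped when converting back to text.
        return "".join(self.itos[i] for i in ids if i >= self.n_special)


# Assumed alphabet: lowercase letters plus a little punctuation.
tokenizer = CharTokenizer(string.ascii_lowercase + " .,'?!")
ids = tokenizer.encode("hello, world!")
text = tokenizer.decode(ids)
```

With this assumed alphabet the vocabulary has only a few dozen entries, compared with the tens of thousands typical of subword tokenizers, so each text token corresponds to a short, fairly regular span of speech.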

Project Progress

Ran experiments on the public LibriTTS dataset, achieving good results with a 103M-parameter model.
Achieved efficient voice cloning for speech generation by inserting prompt audio tokens after the text instructions.
Training guide, inference guide, and trained weights are coming soon.
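
The voice-cloning approach described above (prompt audio tokens placed after the text instructions) can be sketched as a sequence-building step. This is only an illustration under stated assumptions: the shared ID space with an offset for speech tokens, the `START_OF_SPEECH` marker, and both vocabulary sizes are hypothetical placeholders, not values taken from this repository or from the codec.

```python
TEXT_VOCAB_SIZE = 64       # assumed size of the character-level text vocabulary
SPEECH_VOCAB_SIZE = 1024   # assumed codec codebook size (placeholder value)

# Hypothetical marker placed between the text and the speech tokens.
START_OF_SPEECH = TEXT_VOCAB_SIZE + SPEECH_VOCAB_SIZE


def build_cloning_context(text_ids, prompt_speech_codes):
    """Return model input: text tokens, a marker, then prompt speech tokens."""
    if any(not 0 <= t < TEXT_VOCAB_SIZE for t in text_ids):
        raise ValueError("text token ID out of range")
    if any(not 0 <= c < SPEECH_VOCAB_SIZE for c in prompt_speech_codes):
        raise ValueError("speech code out of range")
    # Speech codes are shifted so they do not collide with text IDs.
    speech_ids = [TEXT_VOCAB_SIZE + c for c in prompt_speech_codes]
    return list(text_ids) + [START_OF_SPEECH] + speech_ids


def to_codec_codes(speech_ids):
    """Map shifted speech token IDs back to raw codec codes."""
    return [i - TEXT_VOCAB_SIZE for i in speech_ids]


context = build_cloning_context([3, 5, 7], [0, 10, 1023])
prompt_codes = to_codec_codes(context[4:])
```

Under this layout, the model would continue the sequence autoregressively after the prompt speech tokens, and the generated IDs would be mapped back with `to_codec_codes` before being decoded to audio by the codec.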

Demonstration

Below are some generated samples, each produced with a speech prompt provided:

ccce7a8e5b8b5cef80380e77aeb208c1.mp4
