SparkAudio/SparkVox

SparkVox

SparkVox is a training framework focused on speech generation. It also supports related speech tasks, including speaker attribute recognition, emotion recognition, audio codecs, and speech synthesis.

Supported Tasks

  • Speaker Attribute Recognition
    • Age prediction
    • Gender prediction
  • Codec
    • BiCodec
    • BigCodec
  • Speech Synthesis
    • SparkTTS

Project Structure

  • bins:
    • train_pl: The main training entry point for all tasks.
  • egs:
    • task (e.g. codec, speech_synthesis): Example training scripts for each task.
  • sparkvox:
    • models: Model implementations for different tasks.
  • tools: Utilities for data processing, model inference, and feature extraction.
  • utils: Common utilities for tasks such as reading and processing audio files, as well as general training tools.

Examples
