Prodigal.AI <> KodeKurrent
AI Voice Cloning Model Development Challenge
<a href="https://www.python.org/downloads/release/python-3120/"><img src="https://img.shields.io/badge/Python-3.12+-orange" alt="Python"></a>
Our model is a text-to-speech system that uses a large language model (LLM) for accurate, natural-sounding voice synthesis. It is designed to be efficient and flexible for both research and production use.
- Simplicity and Efficiency: Built entirely on Qwen2.5, the Syndicate Smasher model needs no additional generation model such as flow matching. Rather than using a separate model to generate acoustic features, it reconstructs audio directly from the codes predicted by the LLM, which improves efficiency and reduces complexity.
- High-Quality Voice Cloning: Supports zero-shot voice cloning, replicating a speaker's voice without training data specific to that speaker. This suits cross-lingual and code-switching scenarios, with seamless transitions between languages and voices.
- Controllable Speech Generation: Allows customization of gender, pitch, and speaking rate, making it easier to create virtual speakers.
- User Authentication & Security: Sign-up/login system for secure access.
Figure: Inference Overview of Voice Cloning

Figure: Inference Overview of Controlled Generation
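
Controlled generation takes three attributes: gender, pitch, and speaking rate. As a minimal sketch of how such a request might be validated before it reaches the model, the snippet below uses a hypothetical `VoiceControls` dataclass. The class name, field names, and allowed values (`male`/`female`, and five levels from `very_low` to `very_high`) are assumptions for illustration, not this project's confirmed API.

```python
from dataclasses import dataclass

# Assumed value sets for illustration only; the project's real options may differ.
GENDERS = ("male", "female")
LEVELS = ("very_low", "low", "moderate", "high", "very_high")


@dataclass(frozen=True)
class VoiceControls:
    """Hypothetical container for the three controllable attributes."""

    gender: str
    pitch: str = "moderate"
    speed: str = "moderate"

    def __post_init__(self):
        # Reject unknown values early, before any model call is made.
        if self.gender not in GENDERS:
            raise ValueError(f"gender must be one of {GENDERS}, got {self.gender!r}")
        for name in ("pitch", "speed"):
            value = getattr(self, name)
            if value not in LEVELS:
                raise ValueError(f"{name} must be one of {LEVELS}, got {value!r}")


controls = VoiceControls(gender="female", pitch="high")
print(controls)
```

Validating up front means a typo such as `pitch="hi"` fails immediately with a readable message instead of surfacing later during synthesis.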
Clone and Install
- Clone the repo

      git clone https://github.com/bharathgaddam1712/SyndicateSmashers.git
      cd SyndicateSmashers

- Install Conda: please see https://docs.conda.io/en/latest/miniconda.html
- Create Conda env:

      conda create -n venv -y python=3.12
      conda activate venv
      pip install -r requirements.txt

Model Download
Download via python:

    from huggingface_hub import snapshot_download

    snapshot_download("SparkAudio/Spark-TTS-0.5B", local_dir="pretrained_models/Spark-TTS-0.5B")

Download via git clone:

    mkdir -p pretrained_models
    # Make sure you have git-lfs installed (https://git-lfs.com)
    git lfs install
    git clone https://huggingface.co/SparkAudio/Spark-TTS-0.5B pretrained_models/Spark-TTS-0.5B

Basic Usage
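
Before running the demo, it can help to confirm that the download step actually populated `pretrained_models/Spark-TTS-0.5B`. The helper below is a rough sketch of such a check; `model_dir_ready` is a hypothetical name, and it only tests that the folder exists and holds at least one file, not that the files are complete or correct.

```python
from pathlib import Path


def model_dir_ready(model_dir: str) -> bool:
    """Return True if model_dir exists and contains at least one file.

    This only checks that something was downloaded; it does not verify
    file names or checksums.
    """
    path = Path(model_dir)
    if not path.is_dir():
        return False
    # rglob("*") walks subfolders too, since the files may be nested.
    return any(p.is_file() for p in path.rglob("*"))


if __name__ == "__main__":
    # Path taken from the download commands in this README.
    print(model_dir_ready("pretrained_models/Spark-TTS-0.5B"))
```

If this prints `False`, re-run one of the two download methods before moving on.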
You can run the demo with the following commands:

    cd example
    bash infer.sh

Web UI Usage
You can start the web UI by running `python webui2.py --device 0`, which lets you perform Voice Cloning and Voice Creation. Voice Cloning accepts either an uploaded reference audio file or a direct recording.
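
If the web UI needs to be started from another Python script, the command above can be wrapped as in the sketch below. The function names `build_webui_command` and `launch_webui` are hypothetical, and `--device` is the only flag used because it is the only one this README mentions.

```python
import subprocess
import sys


def build_webui_command(device: int = 0, script: str = "webui2.py") -> list[str]:
    """Build the argument list for launching the web UI with --device."""
    # sys.executable keeps the launch inside the currently active environment.
    return [sys.executable, script, "--device", str(device)]


def launch_webui(device: int = 0) -> int:
    """Run the web UI and return its exit code (blocks until it exits)."""
    return subprocess.run(build_webui_command(device), check=False).returncode


if __name__ == "__main__":
    # Only prints the command; call launch_webui(0) to actually start the UI.
    print(" ".join(build_webui_command(0)))
```

Passing the arguments as a list rather than a single string avoids shell quoting issues.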
Figure: Web UI panels for Voice Cloning and Voice Creation
| Name | Role |
|---|---|
| Utkarsh Raj | Deep Learning |
| Bharath Gaddam | AI Engineer |
| Sunny Kumar | Machine Learning |
| Shivam Jogdand | Machine Learning |



