Starlento/Qwen-TTS-Demo

title: Qwen3-TTS Demo
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
license: apache-2.0
suggested_hardware: zero-a10g

Qwen3-TTS

Text-to-Speech model based on Qwen3 architecture with voice cloning and voice design capabilities.

🚀 Quick Start

Option 1: Docker Setup (Recommended)

Using Pre-built Docker Image

A pre-built Docker image is available on DockerHub at starlento/qwen3-tts:latest.

Prerequisites:

  • Docker and Docker Compose installed
  • NVIDIA GPU with CUDA support
  • NVIDIA Container Toolkit installed
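
The prerequisites above can be checked before starting the container. The following is a minimal sketch, not part of this repository: it only looks for executables on PATH, and the executable names (`docker-compose`, `nvidia-smi`, `nvidia-ctk`) are assumptions about a typical installation that may differ on your system.

```python
import shutil

# Executable names are assumptions about a typical install; adjust as needed.
REQUIRED_TOOLS = {
    "Docker": "docker",
    "Docker Compose": "docker-compose",
    "NVIDIA driver": "nvidia-smi",
    "NVIDIA Container Toolkit": "nvidia-ctk",
}


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the labels of tools whose executable is not found on PATH."""
    return [label for label, exe in tools.items() if shutil.which(exe) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All prerequisites found.")
```

Finding an executable on PATH does not prove that the GPU is usable inside containers; it only catches the most common setup gaps early.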

Steps:

  1. Start the container:
docker-compose up -d

The application will be available at http://localhost:7860
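
The first start can take a while, since models may need to be downloaded before the web UI responds. A small polling helper such as the sketch below can tell you when the address above is reachable. The function name `wait_until_up` and its default timeout values are illustrative and not part of this repository.

```python
import time
import urllib.error
import urllib.request


def wait_until_up(url="http://localhost:7860", timeout=120.0, interval=2.0):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Server not listening yet (or returned an error); retry.
            pass
        time.sleep(interval)
    return False


if __name__ == "__main__":
    print("ready" if wait_until_up() else "not reachable")
```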

Available Apps:

  • app_custom_voice.py - Custom voice synthesis (default)
  • app_voice_clone.py - Voice cloning
  • app_voice_design.py - Voice design

To use a different app, modify the command line in docker-compose.yaml:

command: python app_voice_clone.py --server-name 0.0.0.0 --server-port 7860

Option 2: Host Setup with UV

Prerequisites

  • Python 3.10
  • NVIDIA GPU with CUDA 12.9 support
  • UV package manager (will be installed automatically if not present)

Installation Steps

  1. Run the UV setup script:
chmod +x setup_uv_env.sh
./setup_uv_env.sh

The script will:

  • Install UV if not already installed
  • Create a virtual environment with Python 3.10
  • Install PyTorch 2.8.0 with CUDA 12.9 support
  • Install all required dependencies
  2. Activate the environment:
source .venv/bin/activate
  3. Run the application:
# Default app
python app.py

# Or choose a specific app
python app_custom_voice.py
python app_voice_clone.py
python app_voice_design.py
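
If an app fails to start, it can help to confirm that the activated environment matches what the setup script is described as installing (Python 3.10, PyTorch 2.8.0 with CUDA). The sketch below is a hypothetical helper, not a script shipped with this repository; it reports what it finds and does not enforce any version.

```python
import importlib.util
import sys


def check_environment():
    """Report the Python version, the PyTorch version, and CUDA availability."""
    report = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this interpreter's environment.
        report["torch"] = None
        report["cuda"] = False
    else:
        import torch

        report["torch"] = torch.__version__
        report["cuda"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```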

⚠️ Important Notes

Flash Attention

Note: Flash Attention is NOT installed by default by either setup method. If you need flash-attn for optimized attention, install it manually.
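
Because flash-attn is optional, code that loads the model can fall back to PyTorch's built-in attention when the package is absent. The sketch below shows one way to make that choice. The strings `"flash_attention_2"` and `"sdpa"` are values accepted by the `attn_implementation` argument in Hugging Face Transformers; whether the apps in this repository expose that argument is an assumption, and the function name is hypothetical.

```python
import importlib.util


def pick_attn_implementation():
    """Prefer FlashAttention 2 when flash-attn is importable, else use SDPA."""
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "sdpa"


if __name__ == "__main__":
    print(pick_attn_implementation())
```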

Model Storage

Models are downloaded to:

  • Docker: /models directory (mapped to ~/models on host)
  • Host: Default HuggingFace cache (~/.cache/huggingface)
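
To see where models will land on a given machine, the cache location can be resolved from the environment. The sketch below is a simplified reading of the lookup order documented for `huggingface_hub` (`HF_HUB_CACHE`, then `HF_HOME`, then the default under `~/.cache`). How the Docker image points downloads at /models is not stated here, so the `HF_HOME=/models` value in the example is an assumption.

```python
import os
from pathlib import Path


def hf_hub_cache(env=None):
    """Resolve the Hugging Face hub cache directory from environment variables."""
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    cache_root = env.get("XDG_CACHE_HOME") or "~/.cache"
    return (Path(cache_root) / "huggingface" / "hub").expanduser()


if __name__ == "__main__":
    print(hf_hub_cache())                        # location for this machine
    print(hf_hub_cache({"HF_HOME": "/models"}))  # assumed Docker-style setting
```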

📦 Docker Image

  • DockerHub: starlento/qwen3-tts:latest
  • Base: Python 3.10-slim
  • Includes: PyTorch 2.8.0 with CUDA 12.9 support
  • Size: ~8GB

About

An easier-to-use fork of https://huggingface.co/spaces/Qwen/Qwen3-TTS
