- Voice Generation: Generate speech from text using Qwen-TTS models.
- Voice Cloning: Clone voices from reference audio.
- Voice Design: Create voices based on text descriptions (prompts).
- Audio Browser: Manage and play generated audio files.
- Modern UI: Fluent Design interface with theme support and animations.
The GUI provides two options for setting up the environment for Qwen-TTS models:
- GPU: For devices with CUDA support (Recommended).
- CPU: For devices without CUDA or if you prefer CPU execution.
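The choice between the two options can be checked before setup. The sketch below is not the GUI's own detection logic, which this README does not show. It assumes the Qwen-TTS environment is PyTorch-based, and `pick_device` is a hypothetical helper name used only for illustration.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable CUDA device, else "cpu"."""
    try:
        # Assumption: the Qwen-TTS environment is built on PyTorch.
        import torch
    except ImportError:
        # Without PyTorch installed we cannot probe CUDA, so default to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Suggested environment: {pick_device()}")
```

If this prints `cpu` on a machine with an NVIDIA card, the installed PyTorch build may lack CUDA support, and the CPU option is the safer choice.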
Download the latest version from Releases and place the requirements.txt file in the same folder.
Requirements: Python 3.10-3.12
- Install dependencies: pip install -r requirements.txt
- Run the application: python package.py
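Before installing dependencies, it may help to confirm that the interpreter falls within the Python 3.10-3.12 range stated above. The following is a minimal sketch. `is_supported` and the two constants are hypothetical names, not part of the application.

```python
import sys

# Supported range taken from the requirement stated in this README.
MIN_VERSION = (3, 10)
MAX_VERSION = (3, 12)


def is_supported(version=None) -> bool:
    """Check whether a (major, minor, ...) version lies in the supported range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported():
        print(f"Python {current} is within the supported range.")
    else:
        print(f"Python {current} is outside the supported range 3.10-3.12.")
```

Run the script with the same interpreter that will later be used for `pip install` and `python package.py`, so the check reflects the environment the application actually runs in.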
- Generation: Select model and speaker, enter text, and generate.
- Design: Describe the desired voice (e.g., "A young female voice, happy tone") and generate speech.
- Clone: Upload a reference audio file and enter text to clone the voice.
- Browser: View and play generated audio files.
- Settings: Change language, font scale, and theme.
This project uses a layered licensing scheme. The core rules are in the root LICENSE file:
- Core code (app/core/, etc.): Licensed under the GPLv3 open source license;
- UI design scheme: All rights reserved (non-commercial use only with code display);
- Third-party materials: Non-commercial fair use only, copyright belongs to original rights holders.
Full GPLv3 license text: app/core/LICENSE
Developed by Cyrene2008.