An efficient desktop speech transcription application based on Faster-Whisper and PySide6.
- One-click Transcription: Easily convert video and audio content into editable text and subtitles using the Faster-Whisper model.
- GPU Acceleration Support: Automatically detects the system environment. Transcription runs on the CPU by default, and GPU acceleration can be enabled on CUDA-capable devices.
- Clean Interface, Simple Operation: A concise and intuitive operating interface that supports light and dark theme switching.
- Multiple Output Options: Supports word-level timestamps, and transcription results can be saved in common subtitle formats (SRT, VTT), plain text (TXT), and other popular formats.
- Interface Language: Supports Chinese and English interface languages.
- Batch Processing: Allows adding multiple audio files at once to queue transcription tasks.
- Real-time Monitoring: Displays the progress percentage and a live preview of the transcribed text while a task is running.
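
As a rough illustration of the SRT output option listed above, the following minimal sketch turns timed segments into SRT text. It assumes each segment carries a start time, an end time (both in seconds), and a text string, similar to the segments Faster-Whisper yields. The names `Segment`, `srt_timestamp`, and `to_srt` are hypothetical and are not taken from the Faster-Vox code base.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    """Hypothetical stand-in for one transcribed segment (times in seconds)."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [Segment(0.0, 1.5, " Hello"), Segment(1.5, 3.0, " world")]
    print(to_srt(demo))
```

A VTT writer would follow the same pattern, with a `WEBVTT` header line and a period instead of a comma before the milliseconds.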
git clone https://github.com/JorkeyLiu/faster-vox
cd faster-vox

If you use Conda, you can create a virtual environment with a specified Python version:
conda create -n faster-vox python=3.11.12
conda activate faster-vox

pip install -r requirements.txt

Execute the following command in the project root directory to start the Faster-Vox application:
python main.py

You can use tools like PyInstaller to package Faster-Vox as a standalone executable for easy distribution and use.
pip install pyinstaller
pyinstaller FasterVox.spec

This project is licensed under the MIT License. See the LICENSE file for details.
This project builds on the following excellent open-source projects, and we are grateful to their authors:
- Faster-Whisper: A faster and more efficient implementation of the Whisper model based on CTranslate2.
- PySide6-Fluent-Widgets: Provides Windows-style UI components.
- FFmpeg: Audio and video processing tool.
