You can find more information about Quick Whisper here: Scorchsoft Quick Whisper - Free Speech-to-Copy-Edited-Text AI App for Desktop
QuickWhisper is a user-friendly, voice-to-text transcription app that leverages OpenAI's Whisper model for accurate audio transcription. With QuickWhisper, users can start recording their voice, automatically transcribe it to text, copy it to the clipboard, and optionally paste it into other applications. Additionally, the transcription text can be processed through OpenAI ChatGPT for a polished, copy-edited output.
While QuickWhisper was originally developed for Windows, the codebase has been updated to support Linux and macOS as well. Windows and Linux binaries are currently available via the GitHub release log
- Simple Recording & Transcription: Quickly record audio and transcribe it to text with a single click or a hotkey (
Ctrl+Alt+Jfor edit,Ctrl+Alt+Shift+Jfor transcription). - Auto Copy & Paste: Automatically copy transcriptions to the clipboard and paste them into other applications if desired.
- Optional OpenAI ChatGPT Editing: Enhance your transcriptions using OpenAI ChatGPT for a polished, copy-edited text output.
- Customizable Settings: Enable or disable auto-copy and auto-paste, choose from available input devices, and toggle OpenAI ChatGPT editing.
-
Clone the repository or download the code to your local machine.
-
Ensure you have Python 3.x installed.
-
Set up a virtual environment and install dependencies:
Windows:
python -m venv venv venv\Scripts\activate pip install -r requirements.txt
Mac:
python -m venv venv source venv/bin/activate pip install -r requirements.txtLinux:
python3 -m venv venv --system-site-packages source venv/bin/activate pip install -r requirements.txt -
The app will prompt you for your OpenAI API key on first run. Alternatively, you can pre-configure it in
config/credentials.json:{ "openai_api_key": "your_openai_api_key_here" }
- Install required system packages (before creating the virtual environment):
sudo apt install portaudio19-dev python3-tk python3-gi gir1.2-gstreamer-1.0 gir1.2-gtk-3.0 gir1.2-ayatanaappindicator3-0.1 gstreamer1.0-plugins-base espeak
-
Install portaudio first:
brew install portaudio
-
You may need to grant accessibility permissions to the app for keyboard shortcuts:
- Go to System Preferences > Security & Privacy > Privacy > Accessibility
- Add QuickWhisper to the list of allowed apps
-
Run the application:
python quick_whisper.py
-
Select an input device, then press one of the recording buttons or use the hotkeys:
Windows/Linux:
Ctrl+Alt+Jfor Record + AI EditCtrl+Alt+Shift+Jfor Record + TranscriptWin+Xto Cancel Recording
Mac:
⌘+Alt+Jfor Record + AI Edit⌘+Alt+Shift+Jfor Record + Transcript⌘+Xto Cancel Recording
-
After recording, the app will transcribe the audio and display the text in the transcription area. The text can be automatically copied to the clipboard or pasted into other applications, depending on the settings.
-
Enable "Auto Copy-edit with OpenAI ChatGPT" for advanced text processing, allowing OpenAI ChatGPT to edit the transcription for improved readability and structure.
QuickWhisper includes a configuration system that allows you to customize recording behavior. Access it via Settings > Config.
Choose where audio recording files are saved:
- Alongside application (recommended): Saves recordings in a
tmpfolder next to the application - In AppData folder: Uses the OS-appropriate application data directory:
- Windows:
%APPDATA%\QuickWhisper\recordings - macOS:
~/Library/Application Support/QuickWhisper/recordings - Linux:
~/.config/QuickWhisper/recordings
- Windows:
- Custom folder: Specify any folder of your choice
Control how recording files are managed:
- Overwrite the same file each time (default): Saves disk space by reusing the same filename
- Save each recording with date/time in filename: Creates unique files like
recording_20240101_143052.wav⚠️ Warning: This option can consume significant disk space over time
All configuration settings are saved to JSON files in the config/ folder:
settings.json: User preferences, model settings, shortcuts, recording optionscredentials.json: API key (to be encrypted in a future release)prompts.json: Custom AI prompts
Settings will persist between application restarts. If you're upgrading from an older version that used .env files, your settings will be automatically migrated to the new JSON format.
QuickWhisper supports multiple languages for the user interface:
- English (default)
- French (Français)
- German (Deutsch)
- Spanish (Español)
- Chinese Simplified (简体中文)
- Arabic (العربية)
- Japanese (日本語)
- Korean (한국어)
- Portuguese (Português)
- Russian (Русский)
- Go to Settings > Configuration
- Select the Language category
- Choose between:
- Auto-detect from system: Uses your operating system's language setting
- Manual selection: Choose a specific language from the dropdown
The interface will update immediately when you save the settings - no restart required.
If Chinese characters appear as boxes (□□), install CJK fonts:
sudo apt install fonts-noto-cjkIf you want to contribute translations or add a new language:
-
Extract translatable strings to update the template:
python3 tools/i18n_tools.py extract
-
Create a new language (e.g., Italian):
python3 tools/i18n_tools.py init it
-
Edit the .po file at
locale/it/LC_MESSAGES/quickwhisper.powith your translations -
Compile translations to .mo files:
python3 tools/compile_mo.py
-
Add the language to
SUPPORTED_LANGUAGESinutils/i18n.py
To create a standalone executable, first ensure you have your virtual environment activated with dependencies installed, then install PyInstaller:
pip install --no-cache-dir pyinstallerUsing spec file (recommended for all platforms):
python -m PyInstaller quick_whisper.specThe spec file automatically detects your platform and includes the appropriate hidden imports.
Platform-specific manual builds:
Windows (no console window):
python -m PyInstaller --onefile --windowed --add-data "assets;assets" --icon="assets/icon.ico" --hidden-import pystray._win32 --hidden-import PIL._tkinter_finder --hidden-import pyttsx3.drivers --hidden-import pyttsx3.drivers.sapi5 --hidden-import pynput.keyboard._win32 quick_whisper.pymacOS:
pyinstaller --onefile --windowed --add-data "assets:assets" --hidden-import pystray._darwin --hidden-import PIL._tkinter_finder --hidden-import pyttsx3.drivers --hidden-import pyttsx3.drivers.nsss --hidden-import pynput.keyboard._darwin quick_whisper.pyLinux:
pyinstaller --onefile --add-data "assets:assets" --hidden-import pystray._xorg --hidden-import PIL._tkinter_finder --hidden-import pyttsx3.drivers --hidden-import pyttsx3.drivers.espeak --hidden-import pynput.keyboard._xorg quick_whisper.pyLinux prerequisites:
- Install espeak for TTS:
sudo apt install espeak - Install tkinter:
sudo apt install python3-tk - For best hotkey support, run under X11 (Wayland has limited global hotkey support)
This project is licensed under the terms specified in the LICENSE.md file.
QuickWhisper logs resource usage to the console every 60 seconds. To use this, run the app from a terminal rather than double-clicking the executable:
python quick_whisper.pyEvery 60 seconds you will see output like:
============================================================
[MEMORY DIAG] uptime=5.0min RSS=142.3MB delta=+2.1MB threads=8
[MEMORY DIAG] gc_objects=45231 gc_counts=(47, 3, 1)
[MEMORY DIAG] audio: sounds=6 streams_opened=2 streams_closed=2 frames_peak=4800 recordings=2/2
[MEMORY DIAG] threads: ['MainThread', 'sound_0', 'sound_1', 'pynput-listener', ...]
============================================================
| Field | Meaning |
|---|---|
RSS |
Total physical memory used by the process (in MB) |
delta |
Change in RSS since the last log entry. Consistently positive = likely leak |
threads |
Number of active threads. Should stay roughly constant |
gc_objects |
Total Python objects tracked by the garbage collector. Steady growth = object leak |
gc_counts |
Pending GC work per generation (gen0, gen1, gen2) |
sounds |
Total sound effects played since startup |
streams_opened / streams_closed |
PyAudio recording streams. These two numbers should always match |
frames_peak |
Largest audio buffer recorded (in frames). High values expected for long recordings |
recordings |
started/stopped count. Should always match |
- RSS delta is consistently positive (e.g.
+2MB,+3MB,+5MBevery minute) even when idle — something is leaking. - Thread count keeps climbing — threads are being created but not finishing. Look at the thread names list to see which ones are accumulating.
streams_opened>streams_closed— a PyAudio stream was not properly closed after recording.recordingsstarted > stopped — a recording was started but never completed or cancelled.gc_objectssteadily increasing — Python objects are being created and never freed (circular references or growing collections).
You may also see these in the console output:
[MEMORY] WARNING: pynput listener thread did not terminate within 5s— The keyboard hook listener did not shut down cleanly during a hotkey re-registration. If this appears repeatedly, it indicates pynput threads are leaking.
If the memory leak reoccurs, copy the full console output and include it in a bug report.
We can deliver your innovative, technically complex project, using the latest web and mobile application development technologies. Scorchsoft develops online portals, applications, web and mobile apps, and AI projects. With over fourteen years experience working with hundreds of small, medium, and large enterprises, in a diverse range of sectors, we'd love to discover how we can apply our expertise to your project.
