Scorchsoft QuickWhisper

You can find more information about QuickWhisper here: Scorchsoft Quick Whisper - Free Speech-to-Copy-Edited-Text AI App for Desktop

QuickWhisper is a user-friendly, voice-to-text transcription app that leverages OpenAI's Whisper model for accurate audio transcription. With QuickWhisper, users can start recording their voice, automatically transcribe it to text, copy it to the clipboard, and optionally paste it into other applications. Additionally, the transcription text can be processed through OpenAI ChatGPT for a polished, copy-edited output.

Cross-Platform Support

QuickWhisper was originally developed for Windows, and the codebase has since been updated to support Linux and macOS as well. Windows and Linux binaries are currently available from the GitHub releases page.

Features

Simple Recording & Transcription: Quickly record audio and transcribe it to text with a single click or a hotkey (Ctrl+Alt+J to record with AI editing, Ctrl+Alt+Shift+J to record a plain transcript; on macOS, use ⌘ in place of Ctrl).
Auto Copy & Paste: Automatically copy transcriptions to the clipboard and paste them into other applications if desired.
Optional OpenAI ChatGPT Editing: Enhance your transcriptions using OpenAI ChatGPT for a polished, copy-edited text output.
Customizable Settings: Enable or disable auto-copy and auto-paste, choose from available input devices, and toggle OpenAI ChatGPT editing.

Screenshot

Installation

Clone the repository or download the code to your local machine.
Ensure you have Python 3.x installed.

Set up a virtual environment and install dependencies:

Windows:
```
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
```
Mac:
```
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
Linux:
```
python3 -m venv venv --system-site-packages
source venv/bin/activate
pip install -r requirements.txt
```

The app will prompt you for your OpenAI API key on first run. Alternatively, you can pre-configure it in config/credentials.json:
```
{
  "openai_api_key": "your_openai_api_key_here"
}
```
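If you would rather script the pre-configuration step, the following is a minimal Python sketch. It assumes only the file layout shown above (`config/credentials.json` with a single `openai_api_key` field). The helper names `save_api_key` and `load_api_key` are illustrative and are not part of QuickWhisper.

```python
import json
from pathlib import Path
from typing import Optional


def save_api_key(api_key: str, config_dir: str = "config") -> Path:
    """Write the key to <config_dir>/credentials.json, creating the folder if needed."""
    path = Path(config_dir) / "credentials.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"openai_api_key": api_key}, indent=2), encoding="utf-8")
    return path


def load_api_key(config_dir: str = "config") -> Optional[str]:
    """Return the stored key, or None if the file or the field is missing."""
    path = Path(config_dir) / "credentials.json"
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8")).get("openai_api_key")
```

Because the key is stored in plain text, keep `config/credentials.json` out of version control.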

Linux-Specific Setup

Install required system packages (before creating the virtual environment):

```
sudo apt install portaudio19-dev python3-tk python3-gi gir1.2-gstreamer-1.0 gir1.2-gtk-3.0 gir1.2-ayatanaappindicator3-0.1 gstreamer1.0-plugins-base espeak
```

Mac-Specific Setup

Install portaudio first:
```
brew install portaudio
```
You may need to grant accessibility permissions to the app for keyboard shortcuts:
- Go to System Preferences > Security & Privacy > Privacy > Accessibility
- Add QuickWhisper to the list of allowed apps

Usage

Run the application:
```
python quick_whisper.py
```
Select an input device, then press one of the recording buttons or use the hotkeys:

Windows/Linux:
- Ctrl+Alt+J for Record + AI Edit
- Ctrl+Alt+Shift+J for Record + Transcript
- Win+X to Cancel Recording
Mac:
- ⌘+Alt+J for Record + AI Edit
- ⌘+Alt+Shift+J for Record + Transcript
- ⌘+X to Cancel Recording
After recording, the app will transcribe the audio and display the text in the transcription area. The text can be automatically copied to the clipboard or pasted into other applications, depending on the settings.
Enable "Auto Copy-edit with OpenAI ChatGPT" for advanced text processing, allowing OpenAI ChatGPT to edit the transcription for improved readability and structure.

Configuration

QuickWhisper includes a configuration system that allows you to customize recording behavior. Access it via Settings > Config.

Recording Location

Choose where audio recording files are saved:

Alongside application (recommended): Saves recordings in a tmp folder next to the application
In AppData folder: Uses the OS-appropriate application data directory:
- Windows: %APPDATA%\QuickWhisper\recordings
- macOS: ~/Library/Application Support/QuickWhisper/recordings
- Linux: ~/.config/QuickWhisper/recordings
Custom folder: Specify any folder of your choice
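To make the AppData option concrete, here is a rough Python sketch of how the three per-OS locations listed above could be resolved. The function name `appdata_recordings_dir` is hypothetical, and the fallback used when `%APPDATA%` is unset is an assumption. QuickWhisper's actual logic may differ.

```python
import os
import platform
from pathlib import Path


def appdata_recordings_dir(system=None, env=None, home=None) -> Path:
    """Return the recordings folder for the given OS name (as from platform.system())."""
    system = system or platform.system()
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    if system == "Windows":
        # Assumption: fall back to the usual roaming profile path if APPDATA is unset.
        base = Path(env.get("APPDATA", home / "AppData" / "Roaming"))
    elif system == "Darwin":  # macOS
        base = home / "Library" / "Application Support"
    else:  # Linux and other Unix-like systems
        base = home / ".config"
    return base / "QuickWhisper" / "recordings"
```

Calling `appdata_recordings_dir()` with no arguments resolves the folder for the machine the script is running on.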

File Handling

Control how recording files are managed:

Overwrite the same file each time (default): Saves disk space by reusing the same filename
Save each recording with date/time in filename: Creates unique files like recording_20240101_143052.wav
- ⚠️ Warning: This option can consume significant disk space over time
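The date/time filename pattern above can be reproduced with Python's `datetime` formatting. This is only a sketch: the function name `recording_filename` and the fixed name `recording.wav` used for the overwrite mode are assumptions, while the timestamped pattern follows the `recording_20240101_143052.wav` example.

```python
from datetime import datetime
from typing import Optional


def recording_filename(timestamped: bool, now: Optional[datetime] = None) -> str:
    """Build the recording filename for either file-handling mode."""
    if not timestamped:
        # Assumed fixed name for the "overwrite the same file" mode.
        return "recording.wav"
    now = now or datetime.now()
    # YYYYMMDD_HHMMSS, matching the example in the text.
    return f"recording_{now:%Y%m%d_%H%M%S}.wav"
```

Because the timestamp has one-second resolution, two recordings started within the same second would share a name under this scheme.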

Config Files

All configuration settings are saved to JSON files in the config/ folder:

settings.json: User preferences, model settings, shortcuts, recording options
credentials.json: API key (to be encrypted in a future release)
prompts.json: Custom AI prompts

Settings will persist between application restarts. If you're upgrading from an older version that used .env files, your settings will be automatically migrated to the new JSON format.
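The README does not describe how the migration works internally. As a rough illustration only, a one-time `.env` to JSON conversion could look like the sketch below. The function names are hypothetical, keys are copied through unchanged, and the real migration may rename or split settings across the three JSON files.

```python
import json
from pathlib import Path


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop optional surrounding quotes from the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def migrate_env(env_path, json_path) -> bool:
    """Convert env_path to JSON once; return True only if a migration happened."""
    env_path, json_path = Path(env_path), Path(json_path)
    if json_path.exists() or not env_path.exists():
        return False  # already migrated, or nothing to migrate
    json_path.parent.mkdir(parents=True, exist_ok=True)
    data = parse_env(env_path.read_text(encoding="utf-8"))
    json_path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return True
```

Skipping the conversion when the JSON file already exists keeps newer settings from being overwritten by a stale `.env` file.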

Language Support (i18n)

QuickWhisper supports multiple languages for the user interface:

English (default)
French (Français)
German (Deutsch)
Spanish (Español)
Chinese Simplified (简体中文)
Arabic (العربية)
Japanese (日本語)
Korean (한국어)
Portuguese (Português)
Russian (Русский)

Changing the Language

Go to Settings > Configuration
Select the Language category
Choose between:
- Auto-detect from system: Uses your operating system's language setting
- Manual selection: Choose a specific language from the dropdown

The interface will update immediately when you save the settings - no restart required.
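The choice between auto-detection and manual selection can be sketched as a small fallback chain. The language codes in `SUPPORTED` are assumed from the list of supported languages (the README does not give the actual codes), and `pick_language` is a hypothetical name.

```python
import locale
from typing import Optional

# Assumed codes for the ten languages listed above.
SUPPORTED = ["en", "fr", "de", "es", "zh_CN", "ar", "ja", "ko", "pt", "ru"]


def pick_language(system_locale: Optional[str], manual: Optional[str] = None) -> str:
    """Prefer a valid manual choice, then the system locale, then English."""
    if manual in SUPPORTED:
        return manual
    if system_locale:
        code = system_locale.split(".")[0]  # "fr_FR.UTF-8" -> "fr_FR"
        if code in SUPPORTED:
            return code
        short = code.split("_")[0]          # "fr_FR" -> "fr"
        if short in SUPPORTED:
            return short
    return "en"


if __name__ == "__main__":
    # locale.getlocale() returns a (language, encoding) pair; either may be None.
    print(pick_language(locale.getlocale()[0]))
```

Unsupported or unrecognised locales fall through to English, which matches the stated default.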

Linux Users: Chinese Font Support

If Chinese characters appear as boxes (□□), install CJK fonts:

```
sudo apt install fonts-noto-cjk
```

For Developers: Adding New Translations

If you want to contribute translations or add a new language:

Extract translatable strings to update the template:
```
python3 tools/i18n_tools.py extract
```
Create a new language (e.g., Italian):
```
python3 tools/i18n_tools.py init it
```
Edit the .po file at locale/it/LC_MESSAGES/quickwhisper.po with your translations
Compile translations to .mo files:
```
python3 tools/compile_mo.py
```
Add the language to SUPPORTED_LANGUAGES in utils/i18n.py
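Once a `.mo` file has been compiled, you can check that it loads with Python's standard `gettext` module. The sketch below assumes the `quickwhisper` domain and `locale/` directory implied by the `.po` path above. The helper name `load_translator` is illustrative, and `utils/i18n.py` may load translations differently.

```python
import gettext


def load_translator(lang: str, localedir: str = "locale"):
    """Return a gettext function for lang, falling back to the source strings."""
    translation = gettext.translation(
        "quickwhisper",       # domain, matching quickwhisper.po / quickwhisper.mo
        localedir=localedir,
        languages=[lang],
        fallback=True,        # use untranslated strings instead of raising if no .mo exists
    )
    return translation.gettext


if __name__ == "__main__":
    _ = load_translator("it")
    print(_("Settings"))  # prints the Italian string if one was translated and compiled
```

With `fallback=True`, a missing or uncompiled language simply shows the original English text, which makes it easy to spot strings that still need translating.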

Building an Executable

To create a standalone executable, first ensure you have your virtual environment activated with dependencies installed, then install PyInstaller:

```
pip install --no-cache-dir pyinstaller
```
Using spec file (recommended for all platforms):
```
python -m PyInstaller quick_whisper.spec
```

The spec file automatically detects your platform and includes the appropriate hidden imports.
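The contents of `quick_whisper.spec` are not reproduced here, but platform detection of this kind is typically a lookup keyed on `sys.platform`. The sketch below is an assumption about the approach, not a copy of the spec file. The module names are the ones passed via `--hidden-import` in the manual build commands in this section.

```python
import sys

# Hidden imports needed on every platform.
COMMON = ["PIL._tkinter_finder", "pyttsx3.drivers"]

# Backend modules per platform, keyed by sys.platform prefix.
PLATFORM_SPECIFIC = {
    "win32": ["pystray._win32", "pyttsx3.drivers.sapi5", "pynput.keyboard._win32"],
    "darwin": ["pystray._darwin", "pyttsx3.drivers.nsss", "pynput.keyboard._darwin"],
    "linux": ["pystray._xorg", "pyttsx3.drivers.espeak", "pynput.keyboard._xorg"],
}


def hidden_imports(platform: str = sys.platform) -> list:
    """Return the hidden-import list for the given sys.platform value."""
    for prefix, modules in PLATFORM_SPECIFIC.items():
        if platform.startswith(prefix):
            return COMMON + modules
    return list(COMMON)  # unknown platform: only the shared modules
```

In a spec file, a list like this would be passed as the `hiddenimports` argument of PyInstaller's `Analysis`.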

Platform-specific manual builds:

Windows (no console window):
```
python -m PyInstaller --onefile --windowed --add-data "assets;assets" --icon="assets/icon.ico" --hidden-import pystray._win32 --hidden-import PIL._tkinter_finder --hidden-import pyttsx3.drivers --hidden-import pyttsx3.drivers.sapi5 --hidden-import pynput.keyboard._win32 quick_whisper.py
```
macOS:
```
pyinstaller --onefile --windowed --add-data "assets:assets" --hidden-import pystray._darwin --hidden-import PIL._tkinter_finder --hidden-import pyttsx3.drivers --hidden-import pyttsx3.drivers.nsss --hidden-import pynput.keyboard._darwin quick_whisper.py
```
Linux:
```
pyinstaller --onefile --add-data "assets:assets" --hidden-import pystray._xorg --hidden-import PIL._tkinter_finder --hidden-import pyttsx3.drivers --hidden-import pyttsx3.drivers.espeak --hidden-import pynput.keyboard._xorg quick_whisper.py
```

Linux prerequisites:

Install espeak for TTS: sudo apt install espeak
Install tkinter: sudo apt install python3-tk
For best hotkey support, run under X11 (Wayland has limited global hotkey support)
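To check which display server your session is using before troubleshooting hotkeys, you can read the `XDG_SESSION_TYPE` environment variable, which is typically set by the login manager on systemd-based desktops. The helper below is a hypothetical sketch and is not part of QuickWhisper.

```python
import os


def hotkey_session_warning(env=None):
    """Return a warning string under Wayland, or None for X11 or unknown sessions."""
    env = os.environ if env is None else env
    session = env.get("XDG_SESSION_TYPE", "").lower()
    if session == "wayland":
        return "Running under Wayland: global hotkeys may be limited; consider an X11 session."
    return None


if __name__ == "__main__":
    print(hotkey_session_warning() or "No Wayland session detected.")
```

An unset variable is treated as unknown rather than as X11, so the absence of a warning is not a guarantee that hotkeys will work.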

License

This project is licensed under the terms specified in the LICENSE.md file.

Memory Diagnostics

QuickWhisper logs resource usage to the console every 60 seconds. To use this, run the app from a terminal rather than double-clicking the executable:

```
python quick_whisper.py
```

Reading the Logs

Every 60 seconds you will see output like:

```
============================================================
[MEMORY DIAG] uptime=5.0min  RSS=142.3MB  delta=+2.1MB  threads=8
[MEMORY DIAG] gc_objects=45231  gc_counts=(47, 3, 1)
[MEMORY DIAG] audio: sounds=6  streams_opened=2  streams_closed=2  frames_peak=4800  recordings=2/2
[MEMORY DIAG] threads: ['MainThread', 'sound_0', 'sound_1', 'pynput-listener', ...]
============================================================
```

What Each Field Means

| Field | Meaning |
|---|---|
| `RSS` | Total physical memory used by the process (in MB) |
| `delta` | Change in RSS since the last log entry. Consistently positive = likely leak |
| `threads` | Number of active threads. Should stay roughly constant |
| `gc_objects` | Total Python objects tracked by the garbage collector. Steady growth = object leak |
| `gc_counts` | Pending GC work per generation `(gen0, gen1, gen2)` |
| `sounds` | Total sound effects played since startup |
| `streams_opened` / `streams_closed` | PyAudio recording streams. These two numbers should always match |
| `frames_peak` | Largest audio buffer recorded (in frames). High values expected for long recordings |
| `recordings` | `started/stopped` count. Should always match |
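Several of these fields correspond to values that Python's standard library exposes directly. The sketch below shows how the thread and garbage-collector figures could be gathered. It is an illustration rather than QuickWhisper's own diagnostics code, and it omits `RSS` because reading process memory portably usually requires a third-party package.

```python
import gc
import threading


def diag_snapshot() -> dict:
    """Collect the thread and GC figures described in the table above."""
    return {
        "threads": threading.active_count(),      # number of live threads
        "gc_objects": len(gc.get_objects()),      # objects tracked by the collector
        "gc_counts": gc.get_count(),              # (gen0, gen1, gen2) counters
        "thread_names": [t.name for t in threading.enumerate()],
    }


if __name__ == "__main__":
    snap = diag_snapshot()
    print(f"[MEMORY DIAG] threads={snap['threads']}")
    print(f"[MEMORY DIAG] gc_objects={snap['gc_objects']}  gc_counts={snap['gc_counts']}")
```

Comparing two snapshots taken a minute apart gives the same kind of trend the log lines are meant to reveal.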

Signs of a Memory Leak

RSS delta is consistently positive (e.g. +2MB, +3MB, +5MB every minute) even when idle — something is leaking.
Thread count keeps climbing — threads are being created but not finishing. Look at the thread names list to see which ones are accumulating.
streams_opened > streams_closed — a PyAudio stream was not properly closed after recording.
recordings started > stopped — a recording was started but never completed or cancelled.
gc_objects steadily increasing — Python objects are being created and never freed (circular references or growing collections).
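If you have saved the console output to a file, the checks above can be automated. The following sketch parses `[MEMORY DIAG]` lines in the format shown earlier and flags three of the listed signs. The function names and the three-sample threshold for RSS growth are assumptions chosen for illustration.

```python
import re

# Matches key=value pairs; the value is either a parenthesised tuple or a run of non-spaces.
FIELD = re.compile(r"(\w+)=(\([^)]*\)|\S+)")


def parse_diag(line: str) -> dict:
    """Extract key=value pairs from one [MEMORY DIAG] line, as strings."""
    return dict(FIELD.findall(line))


def leak_signs(lines) -> list:
    """Return human-readable warnings for the leak indicators found in the log."""
    signs, deltas = [], []
    for line in lines:
        if "[MEMORY DIAG]" not in line:
            continue
        fields = parse_diag(line)
        if "delta" in fields:
            deltas.append(float(fields["delta"].rstrip("MB")))  # "+2.1MB" -> 2.1
        if "streams_opened" in fields and "streams_closed" in fields:
            if int(fields["streams_opened"]) > int(fields["streams_closed"]):
                signs.append("a PyAudio stream was opened but not closed")
        if "recordings" in fields:
            started, stopped = map(int, fields["recordings"].split("/"))
            if started > stopped:
                signs.append("a recording was started but never stopped")
    # Assumed threshold: three consecutive positive deltas count as sustained growth.
    if len(deltas) >= 3 and all(d > 0 for d in deltas[-3:]):
        signs.append("RSS grew in each of the last 3 samples")
    return signs
```

A single positive delta is normal, for example right after a recording, so the sketch only reports RSS growth when it persists across several samples.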

Warning Messages

You may also see these in the console output:

[MEMORY] WARNING: pynput listener thread did not terminate within 5s — The keyboard hook listener did not shut down cleanly during a hotkey re-registration. If this appears repeatedly, it indicates pynput threads are leaking.

If the memory leak reoccurs, copy the full console output and include it in a bug report.

About Scorchsoft

We can deliver your innovative, technically complex project using the latest web and mobile application development technologies. Scorchsoft develops online portals, applications, web and mobile apps, and AI projects. With over fourteen years of experience working with hundreds of small, medium, and large enterprises across a diverse range of sectors, we'd love to discover how we can apply our expertise to your project.

Scorchsoft App Developers
