Instantly transcribe your voice to text with a single keypress using OpenAI's Whisper running locally on your machine. Press F9 to start recording, press F9 again to stop and get the transcription copied to your clipboard!
- One-key operation: Press F9 to start/stop recording
- Instant transcription: Audio is processed locally using whisper.cpp
- Clipboard integration: Transcribed text is automatically copied to clipboard
- Visual feedback: Desktop notifications show recording status
- CPU optimized: Runs efficiently without GPU (perfect for laptops)
- Privacy first: Everything runs locally, no internet required
This setup was tested on:
- OS: Ubuntu 22.04.5 LTS
- CPU: Intel Core Ultra 9 185H (22 threads)
- RAM: 64GB
- Desktop: GNOME 42.9
- Ubuntu/Debian-based Linux distribution
- GNOME desktop environment (for keybinding setup)
- Basic development tools (git, make, gcc)
- Audio recording tools (arecord from alsa-utils)
- Clipboard tool (xclip or xsel)
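Before installing, it can help to confirm which of these tools are already present. The following is a minimal sketch, not part of the repository: `missing_tools` is a hypothetical helper name, and the tool list assumes `xclip` as the clipboard tool and `notify-send` for notifications.

```shell
# Print every given command that is not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools named in the prerequisites, plus notify-send for desktop notifications.
missing="$(missing_tools git make gcc arecord xclip notify-send)"
if [ -n "$missing" ]; then
    printf 'Missing tools:\n%s\n' "$missing"
else
    echo "All prerequisites found."
fi
```

Anything reported as missing should be covered by the apt packages listed in the installation steps.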
# Clone this repository
git clone https://github.com/atkvishnu/whisper-hotkey-transcribe.git
cd whisper-hotkey-transcribe
# Run the installation script
chmod +x install.sh
./install.sh
- Install dependencies:
sudo apt update
sudo apt install build-essential git alsa-utils xclip libnotify-bin
- Clone and build whisper.cpp:
# Create the directory first in case it does not exist yet
mkdir -p ~/Projects
cd ~/Projects
git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp
make -j$(nproc)
- Download a Whisper model:
# Download the base model (good balance of speed and accuracy)
bash ./models/download-ggml-model.sh base
- Set up the transcription script:
# Copy the script from this repo (run this from the whisper-hotkey-transcribe directory)
cp scripts/whisper-toggle.sh ~/Projects/whisper.cpp/
chmod +x ~/Projects/whisper.cpp/whisper-toggle.sh
- Configure the F9 hotkey:
Method 1: Manual Configuration (Recommended)
- Open Settings → Keyboard → View and Customize Shortcuts → Custom Shortcuts
- Click the + button to add a new shortcut
- Fill in the following:
  - Name: Whisper Transcribe
  - Command: /home/atkvishnu/Projects/whisper.cpp/whisper-toggle.sh (replace atkvishnu with your own username)
  - Shortcut: Click "Set Shortcut" and press F9
Method 2: Command Line (May require logout)
# The install script attempts this automatically
gsettings set org.gnome.settings-daemon.plugins.media-keys custom-keybindings "['/org/gnome/settings-daemon/plugins/media-keys/custom-keybindings/custom0/']"
gsettings set org.gnome.settings-daemon.plugins.media-keys.custom-keybinding:/org/gnome/settings-daemon/plugins/media-keys/custom-keybindings/custom0/ name 'Whisper Transcribe'
gsettings set org.gnome.settings-daemon.plugins.media-keys.custom-keybinding:/org/gnome/settings-daemon/plugins/media-keys/custom-keybindings/custom0/ command '/home/atkvishnu/Projects/whisper.cpp/whisper-toggle.sh'
gsettings set org.gnome.settings-daemon.plugins.media-keys.custom-keybinding:/org/gnome/settings-daemon/plugins/media-keys/custom-keybindings/custom0/ binding 'F9'
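The command-line method can also be wrapped so that the script path follows the current user's home directory rather than a fixed username. This is a rough sketch under assumptions: `set_hotkey`, `show_hotkey`, and `TOGGLE_SCRIPT` are illustrative names, and the script is assumed to live under `~/Projects/whisper.cpp`. Be aware that setting `custom-keybindings` to a single-entry list replaces any custom shortcuts already registered.

```shell
SCHEMA="org.gnome.settings-daemon.plugins.media-keys"
KEY_PATH="/org/gnome/settings-daemon/plugins/media-keys/custom-keybindings/custom0/"
# Assumed install location; adjust if whisper.cpp lives elsewhere.
TOGGLE_SCRIPT="$HOME/Projects/whisper.cpp/whisper-toggle.sh"

# Register the shortcut. This overwrites the existing custom-keybindings list.
set_hotkey() {
    gsettings set "$SCHEMA" custom-keybindings "['$KEY_PATH']"
    gsettings set "$SCHEMA.custom-keybinding:$KEY_PATH" name 'Whisper Transcribe'
    gsettings set "$SCHEMA.custom-keybinding:$KEY_PATH" command "$TOGGLE_SCRIPT"
    gsettings set "$SCHEMA.custom-keybinding:$KEY_PATH" binding 'F9'
}

# Print the stored command and key so the result can be checked.
show_hotkey() {
    gsettings get "$SCHEMA.custom-keybinding:$KEY_PATH" command
    gsettings get "$SCHEMA.custom-keybinding:$KEY_PATH" binding
}
```

Running `set_hotkey` followed by `show_hotkey` should echo back the script path and `'F9'` if the settings were stored.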
- Press F9 - You'll see a notification "Recording started... Press F9 again to stop"
- Speak clearly into your microphone
- Press F9 again - Recording stops, audio is transcribed
- Check your clipboard - The transcribed text is automatically copied!
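The start/stop behavior described above can be pictured roughly as follows. This is an illustrative sketch, not the repository's actual whisper-toggle.sh: the function names are hypothetical, the `whisper-cli` location assumes a CMake-style build (older builds name the binary `main`), and `xclip` is assumed as the clipboard tool.

```shell
# Hypothetical sketch of a record/transcribe toggle; NOT the real whisper-toggle.sh.
PID_FILE="${PID_FILE:-/tmp/whisper-recording.pid}"
REC_DIR="${REC_DIR:-/tmp/whisper-recordings}"
WHISPER_DIR="${WHISPER_DIR:-$HOME/Projects/whisper.cpp}"
# Assumption: the CLI binary is built here; adjust to match your build.
WHISPER_BIN="${WHISPER_BIN:-$WHISPER_DIR/build/bin/whisper-cli}"
MODEL_PATH="${MODEL_PATH:-$WHISPER_DIR/models/ggml-base.bin}"

# Succeeds when the PID file names a process that is still alive.
is_recording() {
    [ -f "$PID_FILE" ] && kill -0 "$(cat "$PID_FILE")" 2>/dev/null
}

start_recording() {
    mkdir -p "$REC_DIR"
    wav="$REC_DIR/recording_$(date +%Y%m%d_%H%M%S).wav"
    # 16-bit, 16 kHz, mono: the WAV format whisper.cpp expects.
    arecord -q -f S16_LE -r 16000 -c 1 "$wav" &
    echo "$!" > "$PID_FILE"
    notify-send "Whisper" "Recording started... Press F9 again to stop"
}

stop_and_transcribe() {
    kill "$(cat "$PID_FILE")" 2>/dev/null
    rm -f "$PID_FILE"
    sleep 1   # give arecord a moment to finish writing the file
    wav="$(ls -1t "$REC_DIR"/recording_*.wav 2>/dev/null | head -n 1)"
    # -nt drops timestamps; whisper logs go to stderr, the text to stdout.
    "$WHISPER_BIN" -m "$MODEL_PATH" -f "$wav" -nt 2>/dev/null | xclip -selection clipboard
    notify-send "Whisper" "Transcription copied to clipboard"
}

# First call starts a recording; the next call stops it and transcribes.
toggle() {
    if is_recording; then stop_and_transcribe; else start_recording; fi
}
```

Saved as a script with a final `toggle` line, something like this could be bound to F9. The PID file is what lets two separate invocations of the same script act as one start/stop pair.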
The script can be customized by editing whisper-toggle.sh:
- Model selection: Change MODEL_PATH to use different Whisper models
- Audio quality: Modify arecord parameters for different sample rates
- File retention: Adjust how many recordings to keep (default: 10)
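As an illustration of the file-retention setting, the sketch below keeps only the newest recordings in a directory. It is an assumption about how such pruning could work, not the script's actual code; `prune_recordings` is a hypothetical name, and it relies on recording filenames containing no spaces or newlines.

```shell
# Delete all but the newest N recordings in a directory (default N: 10).
prune_recordings() {
    dir="$1"
    keep="${2:-10}"
    # ls -t lists newest first; tail skips the first $keep entries.
    ls -1t "$dir"/recording_*.wav 2>/dev/null |
        tail -n "+$((keep + 1))" |
        while IFS= read -r old; do
            rm -f -- "$old"
        done
}
```

For example, `prune_recordings /tmp/whisper-recordings 10` would leave the ten most recent WAV files and remove the rest.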
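- tiny - Fastest, lowest accuracy (39M parameters, roughly 75 MiB on disk)
- base - Good balance (74M parameters, roughly 142 MiB on disk) - Recommended
- small - Better accuracy (244M parameters, roughly 466 MiB on disk)
- medium - High accuracy (769M parameters, roughly 1.5 GiB on disk)
- large - Best accuracy (1550M parameters, roughly 2.9 GiB on disk)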
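Switching models comes down to downloading the file and pointing MODEL_PATH at it. The sketch below assumes whisper.cpp is checked out under `~/Projects/whisper.cpp` and uses a hypothetical helper, `model_path`, to build the filename that the download script produces (`models/ggml-<name>.bin`).

```shell
WHISPER_DIR="${WHISPER_DIR:-$HOME/Projects/whisper.cpp}"

# Map a model name (tiny, base, small, ...) to the downloaded model file.
model_path() {
    printf '%s/models/ggml-%s.bin\n' "$WHISPER_DIR" "$1"
}

# Example: after downloading the small model with
#   (cd "$WHISPER_DIR" && bash ./models/download-ggml-model.sh small)
# this is the value to use for MODEL_PATH in whisper-toggle.sh.
MODEL_PATH="$(model_path small)"
echo "MODEL_PATH=$MODEL_PATH"
```

Larger models are slower on CPU, so it may be worth trying `small` before jumping to `medium` or `large`.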
- Recordings: /tmp/whisper-recordings/recording_*.wav
- Transcripts: /tmp/whisper-recordings/transcript_*.txt
- PID file: /tmp/whisper-recording.pid
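If the clipboard has been overwritten, the most recent transcript can still be read back from disk. A small sketch, assuming the transcript naming shown above; `latest_transcript` is a hypothetical helper, not something the script provides.

```shell
REC_DIR="${REC_DIR:-/tmp/whisper-recordings}"

# Print the contents of the newest transcript, or fail if none exist.
latest_transcript() {
    newest="$(ls -1t "$REC_DIR"/transcript_*.txt 2>/dev/null | head -n 1)"
    [ -n "$newest" ] && cat "$newest"
}
```

Files under /tmp are typically cleared on reboot, so copy anything worth keeping to a permanent location.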
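If the F9 hotkey does not respond:
- Log out and log back in after setting up the keybinding
- Or, on an X11 session, restart GNOME Shell: press Alt+F2, type 'r', and press Enter (this restart is not available on Wayland, so log out and back in instead)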
The script includes the necessary library paths. If you still get errors:
export LD_LIBRARY_PATH=/path/to/whisper.cpp/build/src:/path/to/whisper.cpp/build/ggml/src:$LD_LIBRARY_PATH
Check your microphone:
# List recording devices
arecord -l
# Test recording
arecord -d 5 test.wav
aplay test.wav
To improve transcription accuracy:
- Speak clearly and avoid background noise
- Try a larger model for better accuracy
- Ensure audio levels are appropriate
Feel free to open issues or submit pull requests! Some ideas for improvements:
- Support for other desktop environments
- Additional hotkey configurations
- Integration with other applications
- Support for multiple languages
This project is licensed under the MIT License - see the LICENSE file for details.
- whisper.cpp - Georgi Gerganov's excellent C++ port of Whisper
- OpenAI Whisper - The original Whisper model
- Tested and developed on an Intel Core Ultra 9 185H system with 64GB RAM
Made with ❤️ for the open-source community. If you find this useful, please star the repository!