A fast, secure, and elegant voice typing solution with push-to-talk functionality. Perfect for hands-free text input with real-time transcription.
- Hotkey: Hold Left Shift + Left Alt to record
- Instant Transcription: Fast, local processing with whisper.cpp
- Smart Filtering: Automatic removal of false positives and noise
- Real-time Feedback: Beautiful status indicators
- Beautiful Icons: Microphone icons that change color by status
- One-Click Control: Simple start/pause toggle
- Auto-startup: Launches automatically on login
- KDE Compatible: Optimized for KDE Plasma desktop
- Local Processing: All transcription happens locally
- Input Sanitization: Secure text filtering and validation
- Configurable Endpoints: No hardcoded server addresses
- Minimal Permissions: Runs with standard user privileges
- Lightning Fast: Types almost immediately after speaking
- High Quality Audio: 44.1kHz WAV recording for best accuracy
- Resource Efficient: Minimal CPU and memory usage
- Reliable: Comprehensive error handling and recovery
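The Smart Filtering item above can be pictured as a small gate applied to each transcript before it is typed. The following is a minimal sketch, not the project's actual filter: the function names are hypothetical, the thresholds mirror the `MIN_WORD_COUNT` and `MIN_CHAR_COUNT` settings from config.env, and treating bracketed annotations such as `[BLANK_AUDIO]` as noise is an assumption.

```python
import re

# Hypothetical defaults mirroring MIN_WORD_COUNT and MIN_CHAR_COUNT in config.env.
MIN_WORD_COUNT = 2
MIN_CHAR_COUNT = 6

# Bracketed or parenthesised annotations such as "[BLANK_AUDIO]" or "(music)".
# Treating these as noise is an assumption, not the project's confirmed rule set.
_ANNOTATION = re.compile(r"\[[^\]]*\]|\([^)]*\)")


def clean_transcript(text: str) -> str:
    """Remove annotations and collapse runs of whitespace."""
    without_notes = _ANNOTATION.sub(" ", text)
    return " ".join(without_notes.split())


def should_type(text: str,
                min_words: int = MIN_WORD_COUNT,
                min_chars: int = MIN_CHAR_COUNT) -> bool:
    """Return True if the cleaned transcript is long enough to be typed."""
    cleaned = clean_transcript(text)
    return len(cleaned.split()) >= min_words and len(cleaned) >= min_chars
```

With the defaults shown, a one-word result such as "you" or an empty `[BLANK_AUDIO]` marker would be dropped, while "hello world" would pass.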
Tested on Debian 12 (Bookworm); it should work on most modern Linux distributions.
# Install system dependencies
sudo apt install sox xinput python3-venv
# Install ydotool for typing simulation
sudo apt install ydotool
# Start ydotool daemon
sudo systemctl enable --now ydotoold
- Clone the repository
git clone <repository-url>
cd voice_typing
- Set up Python environment
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
- Configure your setup
cp config.env.example config.env
# Edit config.env with your whisper server details
- Install system tray (optional)
cp voice-typing-tray.desktop ~/.config/autostart/
cp voice-typing-tray.desktop ~/.local/share/applications/
You need a running whisper.cpp server. Quick setup:
# Clone and build whisper.cpp
git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp
make server
# Download a model
./models/download-ggml-model.sh base.en
# Start the server
./server -m models/ggml-base.en.bin --port 8080
# Run the push-to-talk script from the command line
./voice_client_ptt
# Or run the system tray application
./voice_tray_qt.py
- Right-click the tray icon to start/pause
- Green: Ready for input
- Red: Recording in progress
- Blue: Processing transcription
- Gray: Service stopped
- Hold Left Shift + Left Alt
- Speak clearly
- Release keys to transcribe
- Text appears instantly where your cursor is
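The tray colors and the hold/release steps above describe a small state machine. The sketch below is one way to model it; the event names (`hotkey_down`, `hotkey_up`, `typed`, `pause`, `start`) are hypothetical labels chosen for illustration, not identifiers taken from the project.

```python
from enum import Enum


class Status(Enum):
    """Tray states, with the icon color described above as the value."""
    READY = "green"
    RECORDING = "red"
    PROCESSING = "blue"
    STOPPED = "gray"


# Allowed transitions, keyed by (current status, event).
# Event names are illustrative assumptions.
TRANSITIONS = {
    (Status.READY, "hotkey_down"): Status.RECORDING,     # keys held
    (Status.RECORDING, "hotkey_up"): Status.PROCESSING,  # keys released
    (Status.PROCESSING, "typed"): Status.READY,          # text was typed
    (Status.READY, "pause"): Status.STOPPED,             # tray toggle
    (Status.STOPPED, "start"): Status.READY,             # tray toggle
}


def next_status(current: Status, event: str) -> Status:
    """Return the new status, or the current one if the event does not apply."""
    return TRANSITIONS.get((current, event), current)
```

Keeping the transitions in a table means an event that does not apply, such as pressing the hotkey while the service is stopped, simply leaves the status unchanged.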
Edit config.env to customize:
# Server Configuration
WHISPER_SERVER=http://localhost:8080
# Audio Settings
AUDIO_SAMPLE_RATE=44100
AUDIO_FORMAT=wav
# Keyboard Settings
KEYBOARD_NAME="Dell KB216 Wired Keyboard"
HOTKEY_1=50 # Left Shift
HOTKEY_2=64 # Left Alt
# Filtering
MIN_WORD_COUNT=2
MIN_CHAR_COUNT=6
voice_typing/
├── voice_client_ptt              # Main push-to-talk script
├── voice_tray_qt.py              # System tray application
├── voice-typing-tray.desktop     # Desktop entry for auto-start
├── config.env.example            # Configuration template
├── icons/                        # System tray icons
│   ├── tray_icon_ready.png
│   ├── tray_icon_recording.png
│   ├── tray_icon_processing.png
│   └── tray_icon_stopped.png
├── utils/                        # Utility scripts
└── Old/                          # Legacy versions
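The `config.env.example` template in the tree above uses plain `KEY=value` lines with optional quotes and trailing `#` comments. If a Python component needs those settings, a reader along the following lines might be enough. This is a sketch under assumptions: `parse_env` is a hypothetical helper, and the project may load its configuration differently (for example by sourcing the file from the shell).

```python
import shlex


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, honoring quotes and '#' comments."""
    settings: dict[str, str] = {}
    for raw_line in text.splitlines():
        # shlex strips quotes and, with comments=True, drops "# ..." comments.
        tokens = shlex.split(raw_line, comments=True)
        if not tokens:
            continue  # blank line or comment-only line
        key, sep, value = tokens[0].partition("=")
        if sep and key:
            settings[key] = value
    return settings


def load_env(path: str = "config.env") -> dict[str, str]:
    """Read and parse a config file; the default path is an assumption."""
    with open(path, encoding="utf-8") as handle:
        return parse_env(handle.read())
```

All values come back as strings, so numeric settings such as `HOTKEY_1` or `AUDIO_SAMPLE_RATE` would need an explicit `int(...)` at the point of use.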
"No keyboard device found"
- Update KEYBOARD_NAME in config.env
- List available keyboards:
xinput list | grep -i keyboard
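To check whether a given `KEYBOARD_NAME` would match anything, the `xinput list` output can also be inspected from Python. The sketch below parses the listing text and returns the device id for a keyboard whose name contains the given string. The helper name and the sample listing are illustrative assumptions; real output varies by machine, and the project's own lookup may work differently.

```python
import re
from typing import Optional

# One device line looks roughly like:
#   "    ↳ Dell KB216 Wired Keyboard    \tid=10\t[slave  keyboard (3)]"
_DEVICE_LINE = re.compile(
    r"^\W*(?P<name>.+?)\s+id=(?P<id>\d+)\s+\[(?P<role>[^\]]+)\]"
)


def find_keyboard_id(listing: str, name: str) -> Optional[int]:
    """Return the id of the first keyboard whose name contains `name`."""
    wanted = name.lower()
    for line in listing.splitlines():
        match = _DEVICE_LINE.match(line)
        if not match:
            continue
        is_keyboard = "keyboard" in match.group("role")
        if is_keyboard and wanted in match.group("name").lower():
            return int(match.group("id"))
    return None


# Illustrative listing only; not captured from a real system.
SAMPLE_LISTING = (
    "⎡ Virtual core pointer          \tid=2\t[master pointer  (3)]\n"
    "⎜   ↳ Logitech USB Optical Mouse\tid=8\t[slave  pointer  (2)]\n"
    "⎣ Virtual core keyboard         \tid=3\t[master keyboard (2)]\n"
    "    ↳ Dell KB216 Wired Keyboard \tid=10\t[slave  keyboard (3)]\n"
)
```

On a live system the listing could be obtained with `subprocess.run(["xinput", "list"], capture_output=True, text=True).stdout`. A `None` result corresponds to the "No keyboard device found" case.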
"Connection refused"
- Ensure whisper.cpp server is running
- Check the WHISPER_SERVER URL in config.env
- Test with:
curl http://localhost:8080/health
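The same reachability check can be done from Python, for instance before starting a recording. This is a minimal sketch using only the standard library; `server_reachable` is a hypothetical helper, and the URL passed to it should be whatever `WHISPER_SERVER` points at plus the path used in the curl test above.

```python
import urllib.error
import urllib.request


def server_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with an HTTP 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # URLError covers refused connections and HTTP error statuses,
        # OSError covers timeouts, ValueError covers malformed URLs.
        return False
```

For example, `server_reachable("http://localhost:8080/health")` returning `False` corresponds to the "Connection refused" symptom described here.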
"ydotool not working"
- Start the daemon:
sudo systemctl start ydotoold
- Add user to input group:
sudo usermod -a -G input $USER
"System tray not showing"
- KDE: Enable system tray in panel settings
- Install Qt5:
sudo apt install python3-pyqt5
We welcome contributions! Please see our contributing guidelines:
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Submit a pull request
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
- whisper.cpp - Fast local speech recognition
- ydotool - Wayland-compatible input simulation
- PyQt5 - Cross-platform GUI toolkit
- sox - Audio processing utilities
Made with ❤️ for the open source community
Fast • Secure • Private • Open Source