Rust implementation of hyprwhspr | Blazing fast, native speech-to-text voice dictation for Hyprland and Omarchy | Waybar, Walker, and Elephant integrations (optional)
cargo install hyprwhspr-rs
- whisper.cpp (GitHub, AUR)
  - Ensure whisper-cli is available on your PATH (or use the managed build locations).
- libudev + pkg-config (required for hotplug detection; libudev-dev on Debian/Ubuntu)
- GNU-only binaries (no musl releases)
- Groq or Gemini API key (optional)
  - Use GROQ_API_KEY for provider groq
  - Use GEMINI_API_KEY for provider gemini
  - Groq with whisper is cheap (~$0.10 USD/month) and fast as hell. [Data Controls]
  - Comparatively, Gemini is very slow but offers better output formatting.
- Parakeet TDT (optional) - NVIDIA's local ASR model via ONNX
  - Run ./scripts/download-parakeet-tdt.sh to download model files (~1.2GB)
  - Very fast, but not as accurate as whisper or Gemini
- Fast speech-to-text
- Intuitive configuration
  - word overrides (many are already baked in)
  - multi provider support
  - hot reloading during runtime
- Optional fast VAD (fast_vad.enabled) trims audio files, reducing inference costs while increasing output speed
- Detects Hyprland via HYPRLAND_INSTANCE_SIGNATURE and opens the IPC socket at $XDG_RUNTIME_DIR/hypr/<signature>/.socket.sock.
- Execs dispatch sendshortcut commands against the active window to paste dictated text, inspecting activewindow to decide when Shift is required for a hardcoded list of programs.
- Falls back to a Wayland virtual keyboard client or a simulated keypress paste if IPC communication fails.
- See the example docs for additional integration paths outside of Waybar and Walker/Elephant.
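The socket lookup described in the list above can be sketched in a few lines of Rust. This is a minimal illustration, not the project's actual code: the function names `hyprland_socket_path` and `hyprland_socket_from_env` are hypothetical, and the only facts taken from this README are the two environment variables and the path layout.

```rust
use std::env;
use std::path::PathBuf;

/// Builds `<runtime_dir>/hypr/<signature>/.socket.sock`.
/// Returns `None` when either value is missing or empty, which is the case
/// where a caller would move on to one of the fallback paste paths.
/// (Hypothetical helper name; not taken from the hyprwhspr-rs source.)
fn hyprland_socket_path(runtime_dir: Option<&str>, signature: Option<&str>) -> Option<PathBuf> {
    let runtime_dir = runtime_dir.filter(|s| !s.is_empty())?;
    let signature = signature.filter(|s| !s.is_empty())?;
    Some(
        PathBuf::from(runtime_dir)
            .join("hypr")
            .join(signature)
            .join(".socket.sock"),
    )
}

/// Same lookup, reading the two variables from the process environment.
fn hyprland_socket_from_env() -> Option<PathBuf> {
    let runtime_dir = env::var("XDG_RUNTIME_DIR").ok();
    let signature = env::var("HYPRLAND_INSTANCE_SIGNATURE").ok();
    hyprland_socket_path(runtime_dir.as_deref(), signature.as_deref())
}
```

Keeping the path construction separate from the environment read makes the detection step easy to test without a running Hyprland session.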
- Install the latest release from crates.io:
    cargo install hyprwhspr-rs
  Omit the parakeet backend:
    cargo install hyprwhspr-rs --no-default-features
- Install the systemd service and Waybar module (optionally, with a WIP elephant/walker menu using the --with-elephant flag):
    # Interactive install
    hyprwhspr-rs install
    # Optionally, install specific components (systemd, waybar, elephant)
    hyprwhspr-rs install {--all| --service | --waybar | --elephant} {--force | -f}
Notes:
- The installer writes the systemd unit with an absolute ExecStart= pointing at the hyprwhspr-rs binary you ran hyprwhspr-rs install with. If you copy the unit template manually, ensure hyprwhspr-rs is resolvable by systemd (PATH / drop-in override).
- If audio start/stop sounds are missing in your packaging setup, you can point the app at an installed assets directory with HYPRWHSPR_ASSETS_DIR=/path/to/assets.
You can install the hyprwhspr-rs package from nixpkgs.
With NixOS:
{
  # required to listen for keyboard shortcuts
  users.users.<username>.extraGroups = [ "input" ];

  # have it auto start as a systemd unit with
  services.hyprwhspr-rs.enable = true;

  # or just add it to your systemPackages
  environment.systemPackages = [ pkgs.hyprwhspr-rs ];

  # optional: to enable cuda (for AMD do `rocmSupport` instead of `cudaSupport`)
  # cuda is unfree so not in the default nixos build caches
  # I highly recommend adding the cuda build cache to your nixconfig https://discourse.nixos.org/t/cuda-cache-for-nix-community/56038
  services.hyprwhspr-rs = {
    enable = true;
    package = pkgs.hyprwhspr-rs.override {
      # to optimize build time you can skip enabling cudaSupport for one of these two
      # for whisper do whisper-cpp, for NVIDIA Parakeet do onnxruntime
      whispercpp = pkgs.whisper-cpp.override { cudaSupport = true; };
      onnxruntime = pkgs.onnxruntime.override { cudaSupport = true; };
    };
  };

  # you can also enable cuda/rocm globally, but this will increase the build time for your entire system if you don't add the cuda build cache
  nixpkgs.config.cudaSupport = true;

  # if you use groq or gemini for transcription, you can autoload their keys with
  services.hyprwhspr-rs = {
    enable = true;
    # put `GROQ_API_KEY=...` or `GEMINI_API_KEY=...` in the file you put at this path
    environmentFile = "/path/to/hyprwhspr_secret_file";
  };
}
git clone https://github.com/better-slop/hyprwhispr-rs.git
cd hyprwhspr-rs
cargo build --release
sudo cp target/release/hyprwhspr-rs /usr/local/bin/
./scripts/install-waybar.sh
Example config
Configure in ~/.config/hyprwhspr-rs/config.jsonc
Environment Variables
Configuring providers and other overrides.
Use transcription.provider in ~/.config/hyprwhspr-rs/config.jsonc to pick the backend.
- groq provider requires: GROQ_API_KEY
- gemini provider requires: GEMINI_API_KEY
- whisper_cpp (whisper-cli) does not require an API key; the binary is discovered via PATH and managed locations under $XDG_DATA_HOME / $HOME
hyprwhspr-rs install installs a user unit with:
EnvironmentFile=-%h/.config/hyprwhspr-rs/env
Otherwise, set the env vars in your shell.
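The provider-to-key mapping above is small enough to express as a lookup. The sketch below is illustrative only: `required_api_key_var` and `check_provider_key` are hypothetical names, and treating parakeet as key-free is an assumption based on it being described as a local model.

```rust
/// Returns the environment variable a provider needs.
/// Any other name (whisper_cpp, parakeet, or an unknown value) maps to
/// `None` in this sketch; parakeet being key-free is an assumption.
fn required_api_key_var(provider: &str) -> Option<&'static str> {
    match provider {
        "groq" => Some("GROQ_API_KEY"),
        "gemini" => Some("GEMINI_API_KEY"),
        _ => None,
    }
}

/// Checks that the key for `provider` is present and non-blank.
/// `lookup` stands in for the environment so the check can be tested
/// without touching real variables.
fn check_provider_key<F>(provider: &str, lookup: F) -> Result<(), String>
where
    F: Fn(&str) -> Option<String>,
{
    match required_api_key_var(provider) {
        None => Ok(()),
        Some(var) => match lookup(var) {
            Some(value) if !value.trim().is_empty() => Ok(()),
            _ => Err(format!("provider `{}` requires {} to be set", provider, var)),
        },
    }
}
```

In a real program the lookup would typically be `|name| std::env::var(name).ok()`, which picks up values from either the shell or the systemd EnvironmentFile.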
- For whisper_cpp / whisper-cli discovery, the app also consults:
  - PATH (searches for whisper-cli, optional fallback names)
  - XDG_DATA_HOME / HOME (managed whisper.cpp locations)
- For asset overrides (start/stop sounds): HYPRWHSPR_ASSETS_DIR
- Resolution logic lives in: src/config.rs (discover_whisper_binary_candidates, find_binaries_on_path, discover_assets_dir)
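For readers unfamiliar with PATH-based discovery, here is a rough sketch of the general technique using only the standard library. It is not the code in src/config.rs: the helper names `find_on_path` and `first_binary` are made up for this example, and the sketch only checks that a candidate is a file, not that it is executable.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Returns every `<dir>/<name>` that exists as a file, in PATH order.
/// `path_var` is a PATH-style string (directories joined by the platform
/// separator). Hypothetical helper; does not check the executable bit.
fn find_on_path(name: &str, path_var: &OsStr) -> Vec<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(name))
        .filter(|candidate| candidate.is_file())
        .collect()
}

/// Tries each candidate name in order and returns the first match,
/// so earlier names take priority over fallback names.
fn first_binary(names: &[&str], path_var: &OsStr) -> Option<PathBuf> {
    names
        .iter()
        .flat_map(|name| find_on_path(name, path_var))
        .next()
}
```

A caller would pass the value of `env::var_os("PATH")` and a list starting with whisper-cli; any fallback names would come from the application's own configuration.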
Earshot VAD trimming (recommended)
The default build ships with the impressive and lightweight earshot VoiceActivityDetector baked in. Toggle fast_vad.enabled in your config to trim silence before any provider (whisper.cpp, Groq, Gemini) sees the audio. Extremely useful for lowering costs and increasing speed.
"fast_vad": {
  "enabled": false,
  "profile": "aggressive", // quality | low_bitrate | aggressive | very_aggressive
  "min_speech_ms": 120, // minimum speech chunk to keep
  "silence_timeout_ms": 500, // silence length that ends a segment
  "pre_roll_ms": 120, // speech-leading padding
  "post_roll_ms": 150, // speech-trailing padding
  "volatility_window": 24, // decision history window
  "volatility_increase_threshold": 0.35, // become more aggressive above this
  "volatility_decrease_threshold": 0.12 // relax aggressiveness below this
}
About earshot
- Detects silence well, but is less accurate at detecting speech than other models.
- Operates on the 16 kHz PCM emitted by the capture layer and shares the trimmed buffer across all providers.
- Drops silent stretches longer than the configured timeout while keeping configurable pre-roll and post-roll padding so word edges remain intact.
- Adapts Earshot’s aggressiveness based on recent speech/silence volatility—fewer uploads when the room is noisy.
- If an entire recording is silent, the app attempts to short-circuit the upload path instead of dispatching an empty request.
All other fields in the fast_vad block map directly to the trimmer’s behaviour, so you can tune aggressiveness without
recompiling.
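To make the interplay of min_speech_ms, silence_timeout_ms, pre_roll_ms, and post_roll_ms concrete, here is a simplified sketch of silence trimming over per-frame speech decisions. It is an illustration under stated assumptions, not the project's trimmer: the names `TrimConfig`, `keep_ranges`, and `trim_samples` are hypothetical, samples are assumed to be 16-bit, frames are assumed to be 30 ms, and the adaptive volatility logic is left out entirely.

```rust
/// Assumed frame length in milliseconds.
const FRAME_MS: usize = 30;
/// 16 kHz audio: 480 samples per 30 ms frame.
const SAMPLES_PER_FRAME: usize = 16_000 * FRAME_MS / 1000;

#[derive(Debug, Clone, Copy)]
struct TrimConfig {
    min_speech_ms: usize,
    silence_timeout_ms: usize,
    pre_roll_ms: usize,
    post_roll_ms: usize,
}

/// Converts milliseconds to a frame count, rounding up.
fn frames(ms: usize) -> usize {
    (ms + FRAME_MS - 1) / FRAME_MS
}

/// Returns half-open frame ranges `[start, end)` worth keeping.
/// `speech[i]` is the detector's decision for frame `i`.
fn keep_ranges(speech: &[bool], cfg: TrimConfig) -> Vec<(usize, usize)> {
    let min_speech = frames(cfg.min_speech_ms);
    let timeout = frames(cfg.silence_timeout_ms);
    let pre_roll = frames(cfg.pre_roll_ms);
    let post_roll = frames(cfg.post_roll_ms);

    // 1. Collect runs of consecutive speech frames, skipping short blips.
    let mut runs = Vec::new();
    let mut i = 0;
    while i < speech.len() {
        if speech[i] {
            let start = i;
            while i < speech.len() && speech[i] {
                i += 1;
            }
            if i - start >= min_speech {
                runs.push((start, i));
            }
        } else {
            i += 1;
        }
    }

    // 2. Pad each run; merge it into the previous range when the silence
    //    between the two runs is within the timeout or the padding overlaps.
    let mut kept: Vec<(usize, usize)> = Vec::new();
    let mut prev_speech_end = 0;
    for (start, end) in runs {
        let padded_start = start.saturating_sub(pre_roll);
        let padded_end = (end + post_roll).min(speech.len());
        let merge = match kept.last() {
            Some(&(_, last_end)) => start - prev_speech_end <= timeout || padded_start <= last_end,
            None => false,
        };
        if merge {
            let last = kept.len() - 1;
            kept[last].1 = padded_end;
        } else {
            kept.push((padded_start, padded_end));
        }
        prev_speech_end = end;
    }
    kept
}

/// Copies only the kept frames out of the original sample buffer.
fn trim_samples(samples: &[i16], ranges: &[(usize, usize)]) -> Vec<i16> {
    let mut out = Vec::new();
    for &(start, end) in ranges {
        let from = (start * SAMPLES_PER_FRAME).min(samples.len());
        let to = (end * SAMPLES_PER_FRAME).min(samples.len());
        out.extend_from_slice(&samples[from..to]);
    }
    out
}
```

An empty result from `keep_ranges` corresponds to the all-silent case, where the upload can be skipped. Short pauses stay in the output because runs separated by less than the timeout are merged rather than cut.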
git clone https://github.com/better-slop/hyprwhispr-rs.git
cd hyprwhspr-rs
cargo build --release
- Faster build (skips Parakeet backend): cargo build --release --no-default-features
- Run using:
  - pretty logs: RUST_LOG=debug ./target/release/hyprwhspr-rs
  - production release: ./target/release/hyprwhspr-rs
Release process
Runtime builds rely on a local whisper.cpp installation, so validate that dependency before shipping a tagged version.
- Use Conventional Commits: a fix: commit bumps patch, a feat: commit bumps minor, and type!: indicates a breaking change (major bump).
- On every push to main, the release-plz workflow runs release-pr to open or refresh a release-plz-* pull request. Review the proposed version and changelog there.
- When the release PR looks good, merge it. The same workflow runs release-plz release, tagging (vX.Y.Z) and publishing the crate to crates.io if it's a stable tag.
- The tag triggers the release workflow, which builds the Linux GNU binary, uploads the tarball + checksum, and publishes the GitHub release with the full commit list (plus PR links when available).
Define CARGO_REGISTRY_TOKEN in the repository secrets with publish-only permissions so the workflow can push stable releases to crates.io.
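The commit-prefix rule from the first step can be written down as a small function. This is only a sketch of the rule as stated in this README, not of how release-plz computes versions internally; the names `Bump`, `bump_for`, and `release_bump` are hypothetical, and commit types the rule does not mention yield `None` here.

```rust
/// Version component to bump, ordered from smallest to largest change.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Bump {
    Patch,
    Minor,
    Major,
}

/// Classifies one commit header such as `feat(vad): add profile`.
/// Returns `None` for headers the rule above does not cover.
fn bump_for(header: &str) -> Option<Bump> {
    // Everything before the first ": " is the type, optional scope, and optional `!`.
    let (prefix, _description) = header.split_once(": ")?;
    if prefix.ends_with('!') {
        return Some(Bump::Major);
    }
    // Drop an optional "(scope)" suffix to get the bare type.
    match prefix.split('(').next()? {
        "feat" => Some(Bump::Minor),
        "fix" => Some(Bump::Patch),
        _ => None,
    }
}

/// The largest bump requested by any commit in the release.
fn release_bump(headers: &[&str]) -> Option<Bump> {
    headers.iter().filter_map(|h| bump_for(h)).max()
}
```

Deriving `Ord` on the enum lets `max()` pick the most significant change, so a single breaking commit outweighs any number of fixes.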
- Slop review/clean up
- Ship waybar integration
- Release on Cargo
- Release on AUR
- Add support for other operating systems/setups
- Refine paste layer
- Investigate formatting model
{
  "shortcuts": {
    "press": "SUPER+ALT+D",
    "hold": "SUPER+ALT+CTRL",
  },
  "word_overrides": {
    "under score": "_",
    "em dash": "—",
    "equal": "=",
    "at sign": "@",
    "pound": "#",
    "hashtag": "#",
    "hash tag": "#",
    "newline": "\n",
    "Omarkey": "Omarchy",
    "dot": ".",
    "Hyperland": "hyprland",
    "hyperland": "hyprland",
  },
  "audio_feedback": true, // Play start/stop sounds while recording
  "start_sound_volume": 0.1, // 0.1 - 1.0
  "stop_sound_volume": 0.1, // 0.1 - 1.0
  "start_sound_path": null, // Optional custom audio asset overrides
  "stop_sound_path": null, // Optional custom audio asset overrides
  "auto_copy_clipboard": true, // Automatically copy the final transcription to the clipboard
  "shift_paste": false, // Whether to force shift paste
  "global_paste_shortcut": false, // Enable compositor-level paste; uses Hyprland sendshortcut with Shift+Insert for all pastes
  "paste_hints": {
    "shift": [
      // List of window classes that will always paste with Ctrl+Shift+V
    ],
    "shift_insert": [
      // List of window classes that will always paste with Shift+Insert
    ],
  },
  "audio_device": null, // Force a specific input device index (null uses system default)
  "fast_vad": {
    "enabled": false, // Enable Earshot fast VAD trimming
    "profile": "aggressive", // quality | low_bitrate | aggressive | very_aggressive (lowercase only, serde-enforced; default aggressive)
    "min_speech_ms": 120, // Minimum detected speech before keeping a segment
    "silence_timeout_ms": 500, // Drop silence longer than this (ms)
    "pre_roll_ms": 120, // Audio to keep before speech to avoid clipping words
    "post_roll_ms": 150, // Audio to keep after speech before trimming
    "volatility_window": 24, // Frames observed for adaptive aggressiveness (30 ms per frame, matches FRAME_MS in src/audio/vad.rs)
    "volatility_increase_threshold": 0.35, // Bump profile when toggles exceed this ratio
    "volatility_decrease_threshold": 0.12, // Relax profile when toggles stay below this ratio
  },
  "transcription": {
    "provider": "whisper_cpp", // whisper_cpp | groq | gemini | parakeet
    "request_timeout_secs": 45,
    "max_retries": 2,
    "whisper_cpp": {
      "prompt": "Transcribe as technical documentation with proper capitalization, acronyms, and technical terminology. Do not add punctuation.",
      "model": "large-v3-turbo-q8_0", // Whisper model to use (must exist in specified directories)
      "threads": 4, // CPU threads dedicated to whisper.cpp
      "gpu_layers": 999, // Number of layers to keep on GPU (999 = auto/GPU preferred)
      "fallback_cli": false, // Fallback to whisper-cli (uses CPU)
      "no_speech_threshold": 0.6, // Whisper's "no speech" confidence gate
      "models_dirs": ["~/.local/share/hyprwhspr-rs/models"], // Directories to search for models
      "vad": {
        "enabled": false, // Toggle whisper-cli's native Silero VAD
        "model": "ggml-silero-v5.1.2.bin", // Path or filename for the ggml Silero VAD model
        // Probability threshold for deciding a frame is speech. Higher = fewer false positives, but may miss quiet speech.
        "threshold": 0.5,
        // Minimum contiguous speech duration (ms) to accept. Increase to ignore quick clicks/taps.
        "min_speech_ms": 250,
        // Minimum silence gap (ms) required to end a speech segment. Raise if mid-sentence pauses are being split.
        "min_silence_ms": 120,
        // Maximum speech duration (seconds) before forcing a cut. Use null (or omit) to leave unlimited.
        "max_speech_s": 15.0,
        // Extra padding (ms) added before/after detected speech so words aren't clipped.
        "speech_pad_ms": 80,
        // Overlap ratio between segments. Higher overlap helps smooth transitions at the cost of a little extra decode time.
        "samples_overlap": 0.1,
      },
    },
    "groq": {
      "model": "whisper-large-v3-turbo",
      "endpoint": "https://api.groq.com/openai/v1/audio/transcriptions",
      "prompt": "Transcribe as technical documentation with proper capitalization, acronyms, and technical terminology. Do not add punctuation.",
    },
    "gemini": {
      "model": "gemini-2.5-flash-preview-09-2025",
      "endpoint": "https://generativelanguage.googleapis.com/v1beta/models",
      "temperature": 0.0,
      "max_output_tokens": 1024,
      "prompt": "Transcribe as technical documentation with proper capitalization, acronyms, and technical terminology. Do not add punctuation.",
    },
    "parakeet": {
      "model_dir": "models/parakeet/parakeet-tdt-0.6b-v3-onnx", // Relative to $XDG_DATA_HOME/hyprwhspr-rs (or ~/.local/share/hyprwhspr-rs)
      "prompt": "Transcribe as technical documentation with proper capitalization, acronyms, and technical terminology. Do not add punctuation.",
    },
  },
}
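As a rough illustration of what a word_overrides table does to a transcript, the sketch below replaces whole spoken phrases with their configured values. The matching rules are assumptions, not documented behaviour: this version matches exact, case-sensitive, whitespace-delimited phrases and prefers the longest match, and `apply_word_overrides` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Replaces whole-word phrases found in `overrides`, longest phrase first.
/// Assumption: matching is exact and case-sensitive, and words with
/// attached punctuation (e.g. "dot,") are left untouched.
fn apply_word_overrides(text: &str, overrides: &HashMap<String, String>) -> String {
    let words: Vec<&str> = text.split_whitespace().collect();
    // Longest override key, measured in words.
    let max_len = overrides
        .keys()
        .map(|key| key.split_whitespace().count())
        .max()
        .unwrap_or(0);

    let mut out: Vec<String> = Vec::new();
    let mut i = 0;
    while i < words.len() {
        let mut matched = false;
        let longest = max_len.min(words.len() - i);
        // Try the longest candidate phrase first so "hash tag" is
        // considered before any single-word key.
        for len in (1..=longest).rev() {
            let phrase = words[i..i + len].join(" ");
            if let Some(replacement) = overrides.get(&phrase) {
                out.push(replacement.clone());
                i += len;
                matched = true;
                break;
            }
        }
        if !matched {
            out.push(words[i].to_string());
            i += 1;
        }
    }
    out.join(" ")
}
```

Matching on whole words rather than substrings avoids rewriting the "dot" inside an unrelated word such as "anecdote".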