openwispr-gnome-extension

openwispr-gnome-extension is an AI-powered voice-to-text dictation extension for GNOME Shell. It can run local AI models (via whisper-cli) for private speech recognition or call a remote STT endpoint, and it inserts the transcribed text into the active application.

Features

Local or Remote Processing: Use local whisper-cli or remote STT endpoints (OpenAI/Groq), configurable per setup.
Optional Keyboard Shortcuts: Configure your own shortcut to toggle recording in preferences.
Hold to Speak: Optional mode to hold Ctrl+Alt+T to record and release to transcribe.
Silence Cutting: Uses ffmpeg to cut silent parts before transcription.
Configurable STT Backends: Choose local whisper-cli, OpenAI Whisper endpoint, or Groq endpoint.
LLM Transcript Cleanup: Optionally post-process transcript text with OpenAI or Groq models.
Companion Engine + Daemon: Uses openwispr user services over D-Bus for recording/transcription and optional hold daemon shortcuts.
System Integration: Seamless integration with the GNOME top bar.
Clipboard Injection: Automatically pastes transcribed text into the active text field.
Clipboard Restore: Optionally restore original clipboard content after auto-paste (enabled by default).
Clipboard-Only Mode: Optionally copy transcription without auto-paste for apps where paste is unsafe.
Configurable Notifications: Optionally enable or disable extension notifications.

Prerequisites

Before installing, ensure you have the following dependencies:

GNOME Shell: Versions 45 through 49 are supported.
whisper-cli: The command-line interface for the Whisper model.
- Ensure whisper-cli is installed and available in your system PATH.
- Note: This extension expects the whisper-cli binary specifically.
ffmpeg: Required for silence trimming.
go (optional): Needed only to build the companion openwispr binary from source.
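
The prerequisites above can be checked from a terminal before installing. The following is a minimal sketch, not part of the project: the helper names `have` and `report_deps` are made up for illustration, and it only tests whether each tool resolves on PATH.

```shell
#!/usr/bin/env bash
# Sketch: report which prerequisite tools are on PATH.
# `have` and `report_deps` are illustrative helpers, not part of openwispr.

have() {
  # Succeeds if the named command resolves on PATH.
  command -v "$1" >/dev/null 2>&1
}

report_deps() {
  local missing=0 tool
  # Required tools named in the prerequisites.
  for tool in whisper-cli ffmpeg; do
    if have "$tool"; then
      echo "ok:      $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  # go is only needed to build the companion binary from source.
  if have go; then
    echo "ok:      go (optional)"
  else
    echo "absent:  go (optional)"
  fi
  # Print the GNOME Shell version so it can be compared with the supported range.
  if have gnome-shell; then
    gnome-shell --version
  fi
  return "$missing"
}

if report_deps; then
  echo "All required tools found."
else
  echo "Install the missing tools before continuing."
fi
```

The function returns non-zero only when a required tool is missing, so a missing `go` does not count as a failure.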

Installation

Clone the Repository

git clone https://github.com/tnfssc/openwispr-gnome-extension.git
cd openwispr-gnome-extension

Download the Model

Since AI models are large, they are not included in the git repository. You need to download a GGML compatible model (e.g., ggml-base.en.bin) and place it in the extension/models/ directory.

mkdir -p extension/models
# Example: Download base.en model (adjust URL as needed for your preferred model source)
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin -O extension/models/ggml-base.en.bin
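
A failed download can leave an empty file at the target path, so it may be worth confirming that the model file exists and is non-empty before continuing. This is a small sketch run from the repository root; `model_ok` is an illustrative helper, and the check does not validate the model format itself.

```shell
#!/usr/bin/env bash
# Sketch: confirm the model file exists and is non-empty.
# `model_ok` is an illustrative helper, not part of the project.

model_ok() {
  # -s is true only for an existing file with size greater than zero.
  [ -s "$1" ]
}

MODEL="extension/models/ggml-base.en.bin"

if model_ok "$MODEL"; then
  echo "Model present: $MODEL ($(du -h "$MODEL" | cut -f1))"
else
  echo "Model missing or empty: $MODEL"
fi
```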

Install the Extension

Run the included installation script to symlink the extension to your GNOME extensions directory.
```
./install.sh
```
Restart GNOME Shell
- Wayland: Log out and log back in.
- X11: Press Alt+F2, type r, and press Enter.

Enable the Extension

gnome-extensions enable openwispr-gnome-extension@tnfssc.github.com

Usage

Start Dictation: Click the microphone icon in the top bar, or press the toggle shortcut (Ctrl+Alt+R, or whichever shortcut you set in preferences).
- The icon will change to a recording indicator.
- If hold-to-speak is enabled, you can hold Ctrl+Alt+T instead.
Speak: Dictate your text clearly.
Stop & Transcribe: Press the toggle shortcut again to stop.
- If using hold-to-speak, just release the hold key/chord.
- The extension trims silence with ffmpeg (if enabled), transcribes, then optionally runs LLM cleanup.
- Once complete, the text is copied to your clipboard and, unless clipboard-only mode is on, pasted into your active window.

Companion CLI

The installer builds ~/.local/bin/openwispr (when go is installed).

Install Companion From GitHub Releases

If you installed the GNOME extension from extensions.gnome.org, you can install the companion without building from source:

ARCH="$(uname -m)"
case "$ARCH" in
  x86_64) BIN="openwispr-linux-amd64" ;;
  aarch64|arm64) BIN="openwispr-linux-arm64" ;;
  *) echo "Unsupported architecture: $ARCH"; exit 1 ;;
esac

REPO="https://github.com/tnfssc/openwispr-gnome-extension/releases/latest/download"
TMP="$(mktemp -d)"

mkdir -p ~/.local/bin ~/.config/systemd/user ~/.local/share/applications ~/.local/share/icons/hicolor/256x256/apps
curl -fsSL "$REPO/${BIN}.tar.gz" -o "$TMP/${BIN}.tar.gz"
tar -xzf "$TMP/${BIN}.tar.gz" -C "$TMP"
install -Dm755 "$TMP/$BIN" ~/.local/bin/openwispr

curl -fsSL "$REPO/openwispr-engine.service" -o ~/.config/systemd/user/openwispr-engine.service
curl -fsSL "$REPO/openwispr-hotkeyd.service" -o ~/.config/systemd/user/openwispr-hotkeyd.service
curl -fsSL "$REPO/io.github.tnfssc.openwispr.desktop" -o ~/.local/share/applications/io.github.tnfssc.openwispr.desktop
curl -fsSL "$REPO/logo.png" -o ~/.local/share/icons/hicolor/256x256/apps/io.github.tnfssc.openwispr.png

systemctl --user daemon-reload
systemctl --user enable --now openwispr-engine.service

Optional hold-to-talk daemon:

systemctl --user enable --now openwispr-hotkeyd.service

If you need to build manually:

mkdir -p ~/.local/bin
go build -o ~/.local/bin/openwispr ./cmd/openwispr

The companion CLI provides these subcommands:

openwispr toggle
openwispr start
openwispr stop
openwispr status
openwispr doctor
openwispr restart
openwispr engine

Quick D-Bus check (extension must be enabled):

gdbus call --session --dest org.gnome.Shell.Extensions.OpenWispr --object-path /org/gnome/Shell/Extensions/OpenWispr --method org.gnome.Shell.Extensions.OpenWispr.Status

If you just changed extension code and the D-Bus name is still missing, log out and back in once to fully restart GNOME Shell.

The extension now talks to the companion engine over D-Bus for recording/transcription. Ensure this service is active:

systemctl --user enable --now openwispr-engine.service
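
To confirm the engine service actually came up, its state can be queried with `systemctl --user is-active`. The sketch below wraps that call; `service_state` is an illustrative helper, and it prints `unknown` when systemctl or the user session bus is not reachable.

```shell
#!/usr/bin/env bash
# Sketch: print the state of a systemd user unit.
# `service_state` is an illustrative helper, not part of openwispr.

service_state() {
  local state=""
  if command -v systemctl >/dev/null 2>&1; then
    # is-active prints the state (active, inactive, failed, ...) and
    # exits non-zero when the unit is not active, hence the `|| true`.
    state="$(systemctl --user is-active "$1" 2>/dev/null || true)"
  fi
  echo "${state:-unknown}"
}

echo "openwispr-engine.service: $(service_state openwispr-engine.service)"
```

If the state is not `active`, `journalctl --user -u openwispr-engine.service -n 20` shows the most recent log lines for the unit.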

Hold Daemon (Super + Z)

For more reliable hold-to-talk behavior, use the companion daemon:

systemctl --user enable --now openwispr-hotkeyd.service

The default service runs:

openwispr daemon --backend auto --trigger <Super>z --evdev-key z

auto tries the portal backend first, then falls back to evdev if needed.

Modifier-only triggers (like Alt_R) can make release detection unreliable, because GNOME may emit Activated without a matching Deactivated.

To install/update the service manually:

mkdir -p ~/.config/systemd/user
cp companion/openwispr-engine.service ~/.config/systemd/user/
cp companion/openwispr-hotkeyd.service ~/.config/systemd/user/
mkdir -p ~/.local/share/applications
cp companion/io.github.tnfssc.openwispr.desktop ~/.local/share/applications/
mkdir -p ~/.local/share/icons/hicolor/256x256/apps
cp logo.png ~/.local/share/icons/hicolor/256x256/apps/io.github.tnfssc.openwispr.png
systemctl --user daemon-reload
systemctl --user enable --now openwispr-engine.service
systemctl --user enable --now openwispr-hotkeyd.service

The desktop entry is required so GNOME GlobalShortcuts can associate a valid app ID (io.github.tnfssc.openwispr) with the daemon.

Portal-only test run:

# Quote the trigger so the shell does not treat < and > as redirections.
openwispr daemon --backend portal --trigger '<Super>z'

If openwispr doctor reports missing org.freedesktop.portal.GlobalShortcuts, start the GNOME portal backend and restart the daemon:

systemctl --user start xdg-desktop-portal-gnome.service
systemctl --user restart xdg-desktop-portal.service
systemctl --user restart openwispr-hotkeyd.service
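
You can also check for the GlobalShortcuts interface yourself, independently of `openwispr doctor`, by introspecting the portal's D-Bus object. This is a sketch under the assumption that `gdbus` is installed; the helper names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: check whether the desktop portal exposes GlobalShortcuts.
# Helper names are illustrative, not part of openwispr.

mentions_global_shortcuts() {
  # Reads introspection text on stdin and looks for the interface name.
  grep -q 'org\.freedesktop\.portal\.GlobalShortcuts'
}

portal_has_global_shortcuts() {
  # Returns 2 when gdbus is not available, otherwise grep's status.
  command -v gdbus >/dev/null 2>&1 || return 2
  gdbus introspect --session \
    --dest org.freedesktop.portal.Desktop \
    --object-path /org/freedesktop/portal/desktop 2>/dev/null \
    | mentions_global_shortcuts
}

if portal_has_global_shortcuts; then
  echo "GlobalShortcuts portal interface is available."
else
  echo "GlobalShortcuts not found (or gdbus/session bus unavailable)."
fi
```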

Quick self-heal command (recommended):

openwispr restart

If evdev fallback is needed, ensure input permissions (example for current user):

sudo usermod -aG input "$USER"
# then log out and back in
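
After logging back in, you can confirm that the new session actually carries the `input` group. The sketch below uses `id -nG` with no user argument, which reports the groups of the current process rather than the account database; `group_listed` is an illustrative helper.

```shell
#!/usr/bin/env bash
# Sketch: check whether the current session is in the input group.
# `group_listed` is an illustrative helper, not part of openwispr.

group_listed() {
  # $1 = group to look for, remaining arguments = group names to search.
  local want="$1" g
  shift
  for g in "$@"; do
    if [ "$g" = "$want" ]; then
      return 0
    fi
  done
  return 1
}

# id -nG prints space-separated group names; word splitting is intended here.
# shellcheck disable=SC2046
if group_listed input $(id -nG 2>/dev/null); then
  echo "Current session is in the input group."
else
  echo "Current session is not in the input group yet."
fi
```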

Note: the GitHub Release companion binaries are built with CGO_ENABLED=0 for portability, so evdev fallback is disabled there. Build from source on Linux with cgo enabled if you need evdev fallback.

Note: Some apps (especially terminals, password fields, or secure/sandboxed inputs) may block simulated paste events. In those cases, paste from the clipboard manually.

Configuration

The extension ships with a default toggle shortcut. You can change it in preferences, with dconf-editor, or by modifying the schema:

Schema: org.gnome.shell.extensions.openwispr
Key: toggle-recording

Hold-to-speak can be enabled/disabled in extension preferences or via:

Schema: org.gnome.shell.extensions.openwispr
Key: hold-to-speak-enabled

Hold-to-speak shortcut can be configured in extension preferences or via:

Schema: org.gnome.shell.extensions.openwispr
Key: hold-to-speak-keybinding

For GNOME Settings > Keyboard > Custom Shortcuts, you can set:

Name: openwispr-toggle
Command: openwispr toggle

For start/stop split bindings (optional):

Name: openwispr-start
Command: openwispr start
Name: openwispr-stop
Command: openwispr stop

Auto-paste behavior can be toggled in extension preferences or via:

Schema: org.gnome.shell.extensions.openwispr
Key: auto-paste-enabled

Clipboard restore behavior can be toggled in extension preferences or via:

Schema: org.gnome.shell.extensions.openwispr
Key: restore-clipboard-enabled

Notification behavior can be toggled in extension preferences or via:

Schema: org.gnome.shell.extensions.openwispr
Key: notifications-enabled
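
As a command-line alternative to dconf-editor, the keys above can be read with `gsettings`. The sketch below makes several assumptions that you should verify: that the extension is installed under `~/.local/share/gnome-shell/extensions/` with the UUID used in the enable step, that its compiled schema lives in a `schemas` subdirectory, and that the three keys shown are booleans. `ow_get` and `ow_set` are illustrative wrappers.

```shell
#!/usr/bin/env bash
# Sketch: read extension settings with gsettings.
# SCHEMA_DIR is an assumed install location; ow_get/ow_set are illustrative.

UUID="openwispr-gnome-extension@tnfssc.github.com"
SCHEMA="org.gnome.shell.extensions.openwispr"
SCHEMA_DIR="${HOME}/.local/share/gnome-shell/extensions/${UUID}/schemas"

ow_get() { gsettings --schemadir "$SCHEMA_DIR" get "$SCHEMA" "$1"; }
ow_set() { gsettings --schemadir "$SCHEMA_DIR" set "$SCHEMA" "$1" "$2"; }

if command -v gsettings >/dev/null 2>&1 && [ -d "$SCHEMA_DIR" ]; then
  for key in auto-paste-enabled restore-clipboard-enabled notifications-enabled; do
    echo "$key = $(ow_get "$key" || echo '?')"
  done
  # Example change, left commented out (assumes a boolean key):
  # ow_set notifications-enabled false
else
  echo "gsettings or $SCHEMA_DIR not found; adjust SCHEMA_DIR for your install."
fi
```

The `--schemadir` option is needed because extension schemas are usually not installed into the system-wide schema directory.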

Remote STT and LLM keys/endpoints are configurable in extension preferences. Relevant schema keys include:

STT: stt-provider, stt-openai-*, stt-groq-*
LLM: llm-filter-enabled, llm-provider, llm-openai-*, llm-groq-*, llm-cleanup-prompt
FFmpeg: silence-trim-enabled, silence-threshold, silence-duration
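
To get a feel for what the silence settings control, here is a sketch of silence trimming with ffmpeg's `silenceremove` filter. It is not necessarily the filter chain the extension uses: the mapping of silence-threshold to a dB value and silence-duration to seconds is an assumption, and `silence_filter`, `input.wav`, and `trimmed.wav` are placeholder names.

```shell
#!/usr/bin/env bash
# Sketch: build an ffmpeg silenceremove filter from a threshold and duration.
# Assumes threshold in dB and duration in seconds; not the extension's actual filter.

silence_filter() {
  local threshold_db="$1" duration_s="$2"
  # start_periods=1 trims leading silence; stop_periods=-1 also removes
  # silent stretches of at least duration_s in the middle and at the end.
  printf 'silenceremove=start_periods=1:start_threshold=%sdB:stop_periods=-1:stop_duration=%s:stop_threshold=%sdB' \
    "$threshold_db" "$duration_s" "$threshold_db"
}

FILTER="$(silence_filter -40 0.5)"

# Print the command instead of running it, since input.wav is a placeholder.
echo "ffmpeg -i input.wav -af \"$FILTER\" trimmed.wav"
```

A more negative threshold treats only quieter audio as silence, and a longer duration leaves short pauses between words intact.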

Testing

LLM cleanup unit tests (request payload + response parsing):

go test ./cmd/openwispr

Optional live LLM cleanup tests (requires explicit env vars and network access):

OPENWISPR_TEST_OPENAI_ENDPOINT="https://api.openai.com/v1/chat/completions" \
OPENWISPR_TEST_OPENAI_MODEL="gpt-4o-mini" \
OPENWISPR_TEST_OPENAI_API_KEY="..." \
OPENWISPR_TEST_GROQ_ENDPOINT="https://api.groq.com/openai/v1/chat/completions" \
OPENWISPR_TEST_GROQ_MODEL="llama-3.1-8b-instant" \
OPENWISPR_TEST_GROQ_API_KEY="..." \
go test -tags=integration ./cmd/openwispr -run TestLiveCleanupTranscript -v

Live prompt benchmark against Groq (compares multiple prompt candidates and prints outputs/scores):

OPENWISPR_TEST_GROQ_ENDPOINT="https://api.groq.com/openai/v1/chat/completions" \
OPENWISPR_TEST_GROQ_MODEL="openai/gpt-oss-120b" \
OPENWISPR_TEST_GROQ_API_KEY="..." \
go test -tags=integration ./cmd/openwispr -run TestLivePromptBenchmarkGroq -v

License

This project is licensed under the MIT License. See the LICENSE.md file for details.

