feat: add auto-stop on silence setting #488

KrE80r · 2025-12-24T00:11:11Z

Before Submitting This PR

Please confirm you have done the following:

I have searched existing issues and pull requests (including closed ones) to ensure this isn't a duplicate
I have read CONTRIBUTING.md

If this is a feature or change that was previously closed/rejected:

I have explained in the description below why this should be reconsidered
I have gathered community feedback (link to discussion below)

Human Written Description

I use another STT tool that auto-stops recording after silence. I wanted the same capability in Handy for hands-free operation without needing to press the shortcut again to stop recording.

Related Issues/Discussions

This feature addresses the need for hands-free dictation workflows where users want recording to automatically stop after they finish speaking, without requiring manual shortcut activation to end the recording.

Community Feedback

New feature - no prior discussion. This implements a commonly requested pattern for voice dictation apps.

Testing

- Tested on Fedora Linux with the built RPM package
- Cross-platform compatible: uses existing VAD pipeline with no platform-specific code
- Settings persist correctly across app restarts
- Timeout options: Disabled, 2s, 3s, 5s, 10s of silence after speech detected

Screenshots/Videos (if applicable)

Settings UI added under Advanced section:

Implementation Details

Changes:

Backend (Rust):
- Added AutoStopSilenceTimeout enum to settings
- Extended AudioRecorder with silence frame tracking
- Added callback mechanism to trigger transcription stop
- Uses existing VAD (Voice Activity Detection) to detect silence
Frontend (React/TypeScript):
- New AutoStopSilenceTimeoutSetting component with dropdown
- Added i18n translations for setting labels
- Integrated with settings store
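
As a rough illustration of the backend setting listed above, the enum could look something like the following. This is only a sketch: the variant names, the `as_duration` helper, and the choice of `Disabled` as the default are assumptions for illustration, not the code in this PR.

```rust
use std::time::Duration;

/// Hypothetical shape of the setting. Variant names are assumptions;
/// only the option values (disabled, 2s, 3s, 5s, 10s) come from the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum AutoStopSilenceTimeout {
    /// Assumed default: auto-stop is off unless the user opts in.
    #[default]
    Disabled,
    Sec2,
    Sec3,
    Sec5,
    Sec10,
}

impl AutoStopSilenceTimeout {
    /// Returns the configured silence duration, or `None` when disabled.
    pub fn as_duration(self) -> Option<Duration> {
        let secs = match self {
            Self::Disabled => return None,
            Self::Sec2 => 2,
            Self::Sec3 => 3,
            Self::Sec5 => 5,
            Self::Sec10 => 10,
        };
        Some(Duration::from_secs(secs))
    }
}
```

Returning `Option<Duration>` lets the recorder skip silence tracking entirely when the setting is disabled, rather than special-casing a zero timeout.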

How it works:

- After speech is first detected, the system starts counting consecutive silence frames (30ms each).
- When silence duration exceeds the configured timeout, it triggers the same stop action as pressing the shortcut key.
- Only activates AFTER speech is detected, preventing premature stops.
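
The counting logic described above can be sketched roughly as follows. `SilenceTracker`, `on_frame`, and `FRAME_MS` are hypothetical names chosen for illustration; the actual AudioRecorder changes in this PR may be structured differently. The sketch assumes the VAD yields one speech/silence decision per 30ms frame and that the tracker is only created when the setting is not disabled.

```rust
/// Length of one VAD frame in milliseconds (30ms, per the description above).
const FRAME_MS: u64 = 30;

/// Hypothetical tracker, not the PR's actual code.
pub struct SilenceTracker {
    /// Number of consecutive silent frames that triggers a stop.
    threshold_frames: u64,
    /// Set once the VAD has reported speech at least once.
    speech_seen: bool,
    /// Consecutive silent frames since the last speech frame.
    silent_frames: u64,
}

impl SilenceTracker {
    /// `timeout_secs` is the configured silence timeout (e.g. 2, 3, 5, 10).
    pub fn new(timeout_secs: u64) -> Self {
        let timeout_ms = timeout_secs * 1000;
        Self {
            // Round up so the stop never fires before the full timeout.
            threshold_frames: (timeout_ms + FRAME_MS - 1) / FRAME_MS,
            speech_seen: false,
            silent_frames: 0,
        }
    }

    /// Feed one VAD decision; returns `true` when recording should stop.
    pub fn on_frame(&mut self, is_speech: bool) -> bool {
        if is_speech {
            // Any speech frame arms the tracker and resets the silence run.
            self.speech_seen = true;
            self.silent_frames = 0;
            return false;
        }
        if !self.speech_seen {
            // Leading silence never triggers a stop.
            return false;
        }
        self.silent_frames += 1;
        self.silent_frames >= self.threshold_frames
    }
}
```

In this sketch the caller would invoke the same stop action as the shortcut key the first time `on_frame` returns `true`.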

Automatically stops recording after a configurable period of silence following speech detection. Similar to nerd-dictation's --timeout feature.

Changes:

- Add AutoStopSilenceTimeout enum with options: disabled, 2s, 3s, 5s, 10s
- Track consecutive silence frames in VAD processing pipeline
- Trigger transcription stop when silence threshold exceeded
- Add settings UI component in Advanced settings
- Wire up frontend settings store and translations

The feature only triggers after speech has been detected at least once, preventing premature stops during initial silence.

cjpais · 2025-12-24T11:48:28Z

Mostly writing here to say I have seen this. I've thought about it and I don't have a strong opinion at the moment. The main concern I have is generally adding more features and making the app more confusing. I do understand the reasoning behind this.

The screenshot definitely didn't upload properly, just for reference. I won't be able to test this for a little bit: I'm working on some major features and won't be reviewing pull requests while I work on that.

However, one thing that might make sense is reusing the existing "Push to Talk" setting. Instead of that being a boolean value, maybe it makes more sense for it to be a string or enum value, shown as a drop-down menu of the different kinds of shortcut triggers you want, or something like that. To me this feature also seems like it only really works in the mode where you press the key binding and then have to press it again to stop, so not push-to-talk mode. It's a bit unclear what the overall user experience is, and I want to make sure it's consistent and works properly. Have you tested this during push to talk, and does it really even make sense to have it there?

KrE80r · 2025-12-24T23:56:17Z

Added the screenshot.

I did test with and without push-to-talk, and I still opted for this. It might just be my muscle memory, since I was using other software that did this auto-disengage, so I thought of adding it here.

Currently running this on a Linux machine; it seems to be working fine so far.

joshribakoff · 2026-01-05T04:10:42Z

It would be good to land tests first, before we add more complexity. If we have auto-stop, it's only natural to also include auto-start. We could also consider adding trigger words such as "start dictation" and "stop dictation". I am generally in favor of building this direction out, as a user and new contributor. I just hope we can keep the UI simple and stable, with the above suggestions + tests.

joshribakoff · 2026-01-06T20:05:27Z

Sorry to double post, but I think it's worth noting: the dictation that is built into macOS by default also auto-stops, and as far as I'm aware there's no setting to disable it.

Most driver assistance also forcefully disengages, out of safety and conservative trade-offs, if user presence is no longer detected.

I’m actually convinced we could globally enable this. The main blocker(s) to doing so here could potentially be:

- Automated testing gaps.
- On Mac it plays a little beep when you start and stop dictation. On Tesla self-driving you also cannot disable the little beep when you enter or exit self-driving mode. This feels like a dependency.

There are potentially real privacy and usability concerns with accidentally leaving dictation on, and then accidentally triggering input into a different window.

But I do think there are arguments for having user settings here. There's no right or wrong answer. For what it's worth, it looks like Apple made a conservative decision to auto-stop and output the text after many seconds of silence.

I don't think the current user interface of Handy is 100% optimal, but I do agree with the sentiment that we should be moving it forward in a way where we don't increase the level of entropy in the settings.

cjpais · 2026-01-07T04:51:10Z

@joshribakoff please move comments/opinions to discussions. I don't think these comments are helpful in this PR and it clogs up something which is active development work. I'm not really in favor of expanding the scope of this PR as there are already some concerns I mentioned earlier. This PR will likely make it in at some point but it's going to take some time. I am quite busy, and there's a lot to maintain and address.

You're largely giving an opinion here which is a discussion which needs community support. I don't intend to change core and default application behavior without broad support from the community, or from myself directly. I value feedback and opinion. I like good defaults, but I personally don't know if what you are describing is a good default for my own usage of the application. Go collect support based on your belief/opinion.

> It would be good to land tests first, before we add more complexity.

Regarding this... I will land whatever I have time for in the order I feel like. If I see a PR adding tests I will certainly consider ordering, but if there's no PR for it, I am not going to block someone else's work for something that doesn't exist.

KrE80r added 2 commits December 24, 2025 10:05

refactor: clean up comments and simplify TSX handler

f0b502c
