Continuous dictation on Linux desktop with Whisper AI and xdotool. Remote control is possible too! #1282
Replies: 3 comments 1 reply
-
Looking forward to trying this out. Do you know of any projects that implement a vim-like modal interface for this, or have you considered one? E.g. say "insert octopus" and you are in insert mode; otherwise you issue commands by voice. (I know I have seen some in the past and could dig them up with some searching, I am sure; just curious to hear your take.)
-
Everybody has their favorite editor interface. I have several. And I'm often typing into online forms. I've found it's best to just use keyboard emulation to type into whatever is open at the time.
-
Update: voice_typing has had a major update! It now records in the background and runs whisper in the foreground for performance. It uses a FIFO to queue the background recordings, so text no longer appears out of order. All done in bash!
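For anyone curious how a FIFO keeps things in order, here is a minimal sketch of the idea. It is not the actual voice_typing code. `record_clips` and `transcribe` are hypothetical stand-ins that only echo strings, so it runs without a microphone or whisper installed; a real script would call `rec` and whisper in their place.

```shell
#!/usr/bin/env bash
# Sketch only: a background writer queues clip names through a FIFO and the
# foreground loop reads them back in the order they were written.
set -euo pipefail

workdir=$(mktemp -d)
queue="$workdir/queue"
mkfifo "$queue"

# Hypothetical stand-in for the recorder: announces three finished clips.
record_clips() {
    local i
    for i in 1 2 3; do
        echo "clip$i.wav"
    done
}

# Hypothetical stand-in for speech recognition.
transcribe() {
    echo "text from $1"
}

# The writer runs in the background. Its open() on the FIFO blocks until
# the reader below opens the other end.
record_clips > "$queue" &

# The foreground reader handles clips strictly in arrival order and stops
# at end-of-file, once the writer has closed the FIFO.
output=""
while read -r clip; do
    output+="$(transcribe "$clip")"$'\n'
done < "$queue"

wait
rm -rf "$workdir"
printf '%s' "$output"
```

Because a FIFO delivers bytes in the order they were written, the lines come out as clip1, clip2, clip3 no matter how long each transcription takes.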
-
Try it here https://github.com/themanyone/voice_typing
I have it spawning an instance of sox's `rec`, which detects silence and passes recordings to an instance of whisper running in the background. That way it recognizes speech WHILE recording the next sentence of speech. Then it passes the text to `xdotool` to type it out in whatever box the cursor happens to be in. The result is uninterrupted, hands-free dictation anywhere!

What would be cool is if there was some way to keep whisper loaded in memory, so it doesn't have to spool up each time to decode the next phrase. Edit: Turns out there is. The only problem is that it reserves a chunk of your video card memory until the app is finally closed. But if you want to try that, it's over here: https://github.com/themanyone/whisper_dictation
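The record, recognize, type chain described above could be sketched roughly like this. This is an illustration, not the code from either repository. The function names are made up, the silence thresholds are guesses you would tune for your microphone, and it assumes the `whisper` command from the openai-whisper Python package is on your PATH. It handles one phrase at a time and does not show the overlap between recording and recognition.

```shell
#!/usr/bin/env bash
# Sketch only: one pass of record -> transcribe -> type.
# All function names here are hypothetical.

# Record one phrase. The sox `silence` effect starts on the first sound
# above 2% and stops after 1.5 s below 2% (threshold values are guesses).
record_phrase() {
    rec -q -c 1 -r 16000 "$1" silence 1 0.1 2% 1 1.5 2%
}

# Transcribe a clip. Assumes the openai-whisper CLI; the transcript is
# written as a .txt file into a scratch directory and printed from there.
transcribe() {
    local out
    out=$(mktemp -d)
    whisper "$1" --model base --output_format txt --output_dir "$out" >/dev/null 2>&1
    cat "$out"/*.txt
    rm -rf "$out"
}

# Type the text into whatever window currently has keyboard focus.
type_out() {
    xdotool type "$1"
}

# Run the three stages once for a single phrase.
dictate_once() {
    local clip text
    clip=$(mktemp --suffix=.wav)
    record_phrase "$clip"
    text=$(transcribe "$clip")
    rm -f "$clip"
    # Skip typing when nothing was recognized.
    [[ -n "$text" ]] && type_out "$text"
    return 0
}

# To dictate continuously:
#   while true; do dictate_once; done
```

Keeping each stage in its own function makes it easy to swap in a different recognizer or a different way of typing the text.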