Offline, multi-lang, realtime STT translations w/ whisper-real-time + Argos-translate #1693
autotunafish started this conversation in Show and tell
As part of a recent project, I worked out a method for offline, realtime speech-to-text translation using whisper-real-time and Argos-translate. Whisper-real-time produces very fast and accurate transcriptions of live speech (given favorable environmental conditions and so forth) and also recognizes a variety of other languages, which allows for 2-way transcription/translation. Online tools like this exist, but this one runs entirely locally and offline* using open-source tools. The project is a WiP and requires some unusual manual pruning of the script to fit your system, but otherwise the requirements are exactly the same as for whisper. There are Mac and Windows tools similar to inotifywait, but for now the setup relies on Linux. I did a video on the setup. (*A handshake is required on the initial start, after which it can run offline. This handshake info does not seem to persist across reboots, however; todo.)
The transcription is output to the console as normal. For the translation service, the program is run with the appropriate '2>&1 | tee ...' command extension to take that output transcription and write it into a file. A script uses the Linux tool 'inotifywait' to watch this file for changes; when one occurs, it formats the appropriate 'argos-translate' command with the captured text and, through an executable var, echoes the output of that command (the translation) to another terminal.
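For anyone who wants to see the shape of the watch-and-translate step without reading the script, here is a rough Python sketch of the same idea. It is not the code from the repo: it polls the file instead of using inotifywait, the names `read_new_text`, `build_translate_command`, and `watch` are made up for illustration, and the `--from-lang` / `--to-lang` flags are my assumption about the argos-translate CLI (check `argos-translate --help` on your install). The transcript path and language codes are placeholders too.

```python
import os
import subprocess
import time


def read_new_text(path, offset):
    """Return (new_text, new_offset) for whatever was appended to path since offset."""
    size = os.path.getsize(path)
    if size < offset:
        # The file shrank (truncated or replaced), so start again from the top.
        offset = 0
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    return data.decode("utf-8", errors="replace"), offset + len(data)


def build_translate_command(text, from_code, to_code):
    """Build the argos-translate call as an argument list (no shell quoting needed)."""
    # Assumption: the installed CLI accepts --from-lang and --to-lang.
    return ["argos-translate", "--from-lang", from_code, "--to-lang", to_code, text]


def watch(path, from_code, to_code, interval=0.5):
    """Poll the transcript file and print a translation for each new non-empty line."""
    offset = 0
    while True:
        text, offset = read_new_text(path, offset)
        for line in text.splitlines():
            line = line.strip()
            if not line:
                continue
            result = subprocess.run(
                build_translate_command(line, from_code, to_code),
                capture_output=True,
                text=True,
            )
            print(result.stdout.strip())
        time.sleep(interval)


# Hypothetical usage, with the transcriber writing to transcript.txt via tee:
# watch("transcript.txt", "en", "es")
```

Polling is simpler to show and works on any OS, but it wakes up on a timer rather than on an actual file change, so inotifywait is still the lighter option on Linux.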
https://github.com/autotunafish/offline_sst
https://youtu.be/gLVprfZREkI?si=l_1NftjPV73ba3ka