[feat] Allow replacing the entire speech-to-text recording and transcription process #141

@csells

Right now, when you press the mic button, your voice is recorded and then transcribed to text by an LLM by default. You can already replace the transcription step and convert the recorded audio to text by your own means. What you can't do is replace the entire recording and transcription process, e.g. by using the speech_to_text package. This limits your ability to get at the underlying capabilities of the device, since some devices require you to use their recording facilities to do the transcription. It also prevents you from keeping the user's speech on the device for transcription, which increases latency and raises potential privacy concerns.

This feature would allow you to plug in something like the speech_to_text package, which works across the platforms supported by the AI Toolkit and taps into each platform's ability to provide real-time, offline speech-to-text transcription.
