Support for faster-whisper or OpenAI's API #6

@OliveiraHermogenes

Hi.

Thank you for this piece of software. It's very useful for transcribing into searchable text the ungodly number of WhatsApp voice messages (even work-related ones) one gets here in Brazil.

After living in text-based bliss for a couple of months, I noticed an issue with whispercpp. For some reason, long voice messages were not getting from ffmpeg to the model (brief messages were still getting through). The ffmpeg process would just sit there idle, presumably after completing its job, without passing the result down the pipe to whisper. After trying unsuccessfully to debug the issue for a while, I hacked the plugin to use faster-whisper instead. This seems to be working well.
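
For reference, here is a minimal sketch of what calling faster-whisper from Python can look like. It is not the actual patch: `transcribe_file` and `join_segments` are hypothetical helper names, and the model size, `device` and `compute_type` values are illustrative assumptions. faster-whisper decodes audio through PyAV, so a sketch like this needs no separate ffmpeg process or pipe.

```python
from typing import Iterable, Protocol


class Segment(Protocol):
    """Anything with a .text attribute, like the segments faster-whisper yields."""

    text: str


def join_segments(segments: Iterable[Segment]) -> str:
    """Concatenate segment texts into a single transcript string."""
    return "".join(seg.text for seg in segments).strip()


def transcribe_file(path: str, model_size: str = "small") -> str:
    """Transcribe one audio file with faster-whisper (hypothetical helper)."""
    # Imported lazily so this module still loads without faster-whisper installed.
    from faster_whisper import WhisperModel

    # Model size, device and compute_type are illustrative, not recommendations.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # transcribe() returns a lazy generator of segments plus an info object;
    # the actual decoding happens while the generator is consumed.
    segments, _info = model.transcribe(path)
    return join_segments(segments)
```

In a real plugin the model would presumably be loaded once and reused across messages, and the blocking `transcribe` call would be moved off the event loop.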

I am now starting to prepare a proper pull request to add support for faster-whisper alongside whispercpp. However, before I do that, I would like to ask whether this is the right approach and something the project would be interested in.

Perhaps, at least for whisper (in contrast with vosk), it would be preferable to support OpenAI's API so that people can expose their locally running model (with either whisper.cpp or faster-whisper) for use by other tools besides maubot. Furthermore, I find myself forced to add quite a bit of complexity in order to support two different flavors of whisper, which could be avoided by relying on the API.
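
To make the API idea concrete, here is a rough sketch of the client side, assuming some server in front of the local model exposes an OpenAI-compatible `POST /v1/audio/transcriptions` endpoint. The function names, the `voice.ogg` filename, the `whisper-1` model name and the timeout are assumptions for illustration, and only the standard library is used so the sketch stays self-contained.

```python
import json
import urllib.request
import uuid
from typing import Optional


def build_multipart(fields: dict[str, str], filename: str, audio: bytes) -> tuple[bytes, str]:
    """Encode text fields plus one file part as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    # The audio goes in a part named "file", as the transcription endpoint expects.
    file_header = (
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: application/octet-stream\r\n\r\n'
    ).encode()
    parts.append(file_header + audio + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(base_url: str, audio: bytes, filename: str = "voice.ogg",
               model: str = "whisper-1", api_key: Optional[str] = None) -> str:
    """Send audio to an OpenAI-compatible server and return the transcript text."""
    body, content_type = build_multipart({"model": model}, filename, audio)
    headers = {"Content-Type": content_type}
    if api_key:  # local servers may not require a key
        headers["Authorization"] = f"Bearer {api_key}"
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/audio/transcriptions",
        data=body, headers=headers, method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        # The default JSON response carries the transcript under "text".
        return json.load(response)["text"]
```

With something like this, the plugin would only need a base URL (and optionally a key) in its configuration, regardless of whether whisper.cpp, faster-whisper or the hosted API sits behind it.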
