[Whispering] Support OpenAI compatible API/DashScope compatible API for transcription #1581

@axmmisaka

Description
Feature Description

Problem / use case:

My use case is quite special:

  1. I transcribe dialect-accented Mandarin, so Parakeet and Whisper work very poorly. I tried a number of models, and Qwen3-ASR works relatively well.
  2. I host my own Qwen3-ASR on a separate machine behind an OpenAI-compatible API; when I get some trial credit I may also use Aliyun's endpoint, which uses the DashScope API.

I found this project from Handy because of these issues, but then I realised that while Whispering supports the OpenAI API, the available models are pre-baked.

Proposed solution:

Similar to what we did for the post-processing LLM, add an OpenAI-compatible API option and let people customise things like the model name and auth.

We already have an OpenAI API provider with a custom endpoint/auth key, but the model and other config seem to be fixed.
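To illustrate the idea, here is a minimal sketch of what a fully user-configurable OpenAI-compatible transcription request could look like. The `/v1/audio/transcriptions` path and the `file`/`model` form fields follow the OpenAI audio API convention; the base URL, API key, and model name (e.g. `qwen3-asr` for a self-hosted server) are placeholders the user would supply, not values taken from Whispering's current code.

```typescript
// Sketch: build a transcription request against any OpenAI-compatible server.
// All three string parameters are user-supplied config, not pre-baked values.
function buildTranscriptionRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  audio: Blob,
): {
  url: string;
  init: { method: string; headers: Record<string, string>; body: FormData };
} {
  const form = new FormData();
  form.append("file", audio, "audio.wav");
  // Free-form model name, e.g. "qwen3-asr" on a local server.
  form.append("model", model);
  return {
    // Strip a trailing slash so the path joins cleanly.
    url: `${baseUrl.replace(/\/$/, "")}/v1/audio/transcriptions`,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form,
    },
  };
}
```

The same shape would cover both a local vLLM-style server and a hosted DashScope-compatible endpoint, as long as each exposes the OpenAI-style route.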

PS

This issue is a bit premature; I'll first tweak my API endpoint, experiment with the auth and model name, and make sure it works.

Relevant Platforms

Linux

How important is this feature to you?

Important for my use case

Willing to Contribute?

Yes, I can implement this

Discord Link

No response

Checklist

  • I have searched existing issues and this feature hasn't been requested
