Hi team,
While working with the Azure OpenAI GPT-4o real-time transcription API, I noticed that the SDK does not yet support the `model`, `language`, and `prompt` fields in the `input_audio_transcription` configuration, despite these being documented here:
https://learn.microsoft.com/en-us/azure/ai-services/openai/realtime-audio-reference
These fields are critical for:
- Selecting newer models like `gpt-4o-transcribe` and `gpt-4o-mini-transcribe`
- Providing a `prompt` to guide transcription behavior
- Improving accuracy and latency by specifying `language` (e.g. `"en"`)
I've opened a PR to address this gap: #134
Tested and working with my application using the following config:
input_audio_transcription: {
  model: "gpt-4o-transcribe",
  language: "en",
  prompt: "Expect words related to a product design interview."
}
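For context, in the realtime protocol these fields travel inside a `session.update` event; below is a minimal sketch of that payload. The field names follow the Azure realtime audio reference linked above; how the event is actually sent (websocket client or SDK helper) is out of scope here and left to the PR.

```python
import json

# Sketch of a session.update event carrying the transcription options.
# Field names follow the Azure OpenAI realtime audio reference; the
# transport (websocket send, SDK wrapper) is intentionally omitted.
session_update = {
    "type": "session.update",
    "session": {
        "input_audio_transcription": {
            "model": "gpt-4o-transcribe",
            "language": "en",  # ISO-639-1 code; helps accuracy and latency
            "prompt": "Expect words related to a product design interview.",
        }
    },
}

payload = json.dumps(session_update)
```

Passing these through unchanged is all the SDK needs to do; the service validates the values server-side.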
Would love for the maintainers to review and merge this so it's available to others using the SDK.
Thanks!