Conversation

@HVbajoria commented May 22, 2025

Purpose

  • Added support for the new transcription models gpt-4o-transcribe and gpt-4o-mini-transcribe
  • Introduced optional language (ISO-639-1) and prompt parameters on the InputAudioTranscription interface (see the sketch after this list)
  • Applied the changes in both the JavaScript and Python SDKs
  • Updated the core library files and demonstrated usage in client_test under rtclient
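
For orientation, here is a minimal TypeScript sketch of what the extended interface could look like; the exact SDK definition may differ in naming and optionality:

    // Hypothetical sketch of the extended interface; field names follow
    // the test snippets in this PR, not necessarily the shipped SDK types.
    export interface InputAudioTranscription {
      model: "whisper-1" | "gpt-4o-transcribe" | "gpt-4o-mini-transcribe";
      language?: string; // optional ISO-639-1 code, e.g. "en"
      prompt?: string;   // optional hint; expected format depends on the model
    }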

Does this introduce a breaking change?

[ ] Yes
[x] No

Pull Request Type

What kind of change does this Pull Request introduce?

[ ] Bugfix
[x] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no api changes)
[x] Documentation content changes
[ ] Other... Please describe:

How to Test

For JavaScript

  • Get the code

    git clone https://github.com/Azure-Samples/aoai-realtime-audio-sdk.git
    cd aoai-realtime-audio-sdk
    cd javascript
    git checkout Added_Model_Support
    npm install
  • Run the tests (e.g., npm test)

  • Manually test usage of the updated inputTranscription settings in standalone/test/client.spec.ts (line 120):

    input_audio_transcription: {
        model: "gpt-4o-transcribe",
        language: "en",
        prompt: "expect words related to technology",
    },
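
To see the new fields in context, here is a self-contained sketch of a session configuration object; the enclosing shape is illustrative, only input_audio_transcription mirrors the snippet above:

    // Illustrative only: the outer object shape is an assumption for this
    // example; the inner settings match the test snippet above.
    const sessionConfig = {
      input_audio_transcription: {
        model: "gpt-4o-transcribe",
        language: "en",
        prompt: "expect words related to technology",
      },
    };
    console.log(JSON.stringify(sessionConfig, null, 2));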

For Python

  • Get the code

    git clone https://github.com/Azure-Samples/aoai-realtime-audio-sdk.git
    cd aoai-realtime-audio-sdk
    cd python
    cd samples
    git checkout Added_Model_Support
    pip3 install -r requirements.txt
    cd ..
  • Test using client_test under rtclient:

    • Update the test snippet:

      input_audio_transcription = InputAudioTranscription(
          model="gpt-4o-transcribe",
          language="en",
          prompt="expect words related to technology"
      )
    • Try with different model values:

      • "whisper-1"
      • "gpt-4o-mini-transcribe"
      • "gpt-4o-transcribe"

What to Check

Verify the following:

  • InputAudioTranscription in both Python and JavaScript accepts model, language, and prompt

  • All new values (gpt-4o-transcribe, gpt-4o-mini-transcribe) are allowed and passed correctly

  • Prompt format is respected based on the model (see the sketch after this list):

    • Whisper: comma-separated keywords
    • GPT-4o models: free text
  • No regressions in existing behavior when using only whisper-1

  • Functional parity between Python and JavaScript implementations
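
As a concrete illustration of the two prompt styles (the variable names and strings here are made-up examples, not values from this PR):

    // Assumed examples of the two prompt conventions:
    const whisperPrompt = "API, SDK, latency, Azure";          // whisper-1: comma-separated keywords
    const gpt4oPrompt = "Expect words related to technology."; // gpt-4o models: free text
    console.log(whisperPrompt, gpt4oPrompt);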

Other Information

  • The updates are backward compatible
  • Comments and type hints added where applicable
  • Example test cases show use with different model values and prompt formats
  • Enables developers to take advantage of OpenAI’s latest audio transcription capabilities across SDKs

@HVbajoria (Author) commented May 22, 2025

Hi @glecaros, @jpalvarezl, @trrwilson,

I have referred to this document: https://learn.microsoft.com/en-us/azure/ai-services/openai/realtime-audio-reference, under RealtimeAudioInputTranscriptionSettings.

Could you please take a look?

@juliannicolas90 commented
This would be great to have!

