Commit 2ea2840

Merge pull request #6852 from PatrickFarley/openai-audio
OpenAI audio
2 parents 227f748 + 74906aa commit 2ea2840

9 files changed

+94
-61
lines changed

articles/ai-foundry/openai/concepts/models.md

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -321,6 +321,7 @@ Details about maximum request tokens and training data are available in the foll
321321
|`gpt-4o-realtime-preview` (2025-06-03) <br> GPT-4o audio | Audio model for real-time audio processing. |Input: 128,000 <br> Output: 4,096 | October 2023 |
322322
|`gpt-4o-realtime-preview` (2024-12-17) <br> GPT-4o audio | Audio model for real-time audio processing. |Input: 128,000 <br> Output: 4,096 | October 2023 |
323323
|`gpt-4o-mini-realtime-preview` (2024-12-17) <br> GPT-4o audio | Audio model for real-time audio processing. |Input: 128,000 <br> Output: 4,096 | October 2023 |
324+
|`gpt-4o-realtime` (2025-08-28) <br> GPT-4o audio | Audio model for real-time audio processing. |Input: 28,672 <br> Output: 4,096 | October 2023 |
324325

325326
To compare the availability of GPT-4o audio models across all regions, refer to the [models table](#global-standard-model-availability).
326327

articles/ai-foundry/openai/includes/realtime-deploy-model.md

Lines changed: 3 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -7,12 +7,12 @@ ms.topic: include
77
ms.date: 1/21/2025
88
---
99

10-
To deploy the `gpt-4o-mini-realtime-preview` model in the Azure AI Foundry portal:
10+
To deploy the `gpt-4o-realtime` model in the Azure AI Foundry portal:
1111
1. Go to the [Azure AI Foundry portal](https://ai.azure.com/?cid=learnDocs) and create or select your project.
1212
1. Select **Models + endpoints** from under **My assets** in the left pane.
1313
1. Select **+ Deploy model** > **Deploy base model** to open the deployment window.
14-
1. Search for and select the `gpt-4o-mini-realtime-preview` model and then select **Confirm**.
14+
1. Search for and select the `gpt-4o-realtime` model and then select **Confirm**.
1515
1. Review the deployment details and select **Deploy**.
1616
1. Follow the wizard to finish deploying the model.
1717

18-
Now that you have a deployment of the `gpt-4o-mini-realtime-preview` model, you can interact with it in the Azure AI Foundry portal **Audio** playground or Realtime API.
18+
Now that you have a deployment of the `gpt-4o-realtime` model, you can interact with it in the Azure AI Foundry portal **Audio** playground or Realtime API.

articles/ai-foundry/openai/includes/realtime-javascript.md

Lines changed: 14 additions & 14 deletions
Original file line number | Diff line number | Diff line change
@@ -12,7 +12,7 @@ ms.date: 3/20/2025
1212
- An Azure subscription - <a href="https://azure.microsoft.com/free/cognitive-services" target="_blank">Create one for free</a>
1313
- <a href="https://nodejs.org/" target="_blank">Node.js LTS or ESM support.</a>
1414
- An Azure OpenAI resource created in one of the supported regions. For more information about region availability, see the [models and versions documentation](../concepts/models.md#global-standard-model-availability).
15-
- Then, you need to deploy a `gpt-4o-mini-realtime-preview` model with your Azure OpenAI resource. For more information, see [Create a resource and deploy a model with Azure OpenAI](../how-to/create-resource.md).
15+
- Then, you need to deploy a `gpt-4o-realtime` model with your Azure OpenAI resource. For more information, see [Create a resource and deploy a model with Azure OpenAI](../how-to/create-resource.md).
1616

1717
### Microsoft Entra ID prerequisites
1818

@@ -72,8 +72,8 @@ For the recommended keyless authentication with Microsoft Entra ID, you need to:
7272
// You will need to set these environment variables or edit the following values
7373
const endpoint = process.env.AZURE_OPENAI_ENDPOINT || "AZURE_OPENAI_ENDPOINT";
7474
// Required Azure OpenAI deployment name and API version
75-
const deploymentName = process.env.AZURE_OPENAI_DEPLOYMENT_NAME || "gpt-4o-mini-realtime-preview";
76-
const apiVersion = process.env.OPENAI_API_VERSION || "2025-04-01-preview";
75+
const deploymentName = process.env.AZURE_OPENAI_DEPLOYMENT_NAME || "gpt-4o-realtime";
76+
const apiVersion = process.env.OPENAI_API_VERSION || "2025-08-28";
7777
// Keyless authentication
7878
const credential = new DefaultAzureCredential();
7979
const scope = "https://cognitiveservices.azure.com/.default";
@@ -90,8 +90,8 @@ For the recommended keyless authentication with Microsoft Entra ID, you need to:
9090
realtimeClient.send({
9191
type: "session.update",
9292
session: {
93-
modalities: ["text", "audio"],
94-
model: "gpt-4o-mini-realtime-preview",
93+
output_modalities: ["text", "audio"],
94+
model: "gpt-4o-realtime",
9595
},
9696
});
9797
realtimeClient.send({
@@ -113,12 +113,12 @@ For the recommended keyless authentication with Microsoft Entra ID, you need to:
113113
console.log("session created!", event.session);
114114
console.log();
115115
});
116-
realtimeClient.on("response.text.delta", (event) => process.stdout.write(event.delta));
117-
realtimeClient.on("response.audio.delta", (event) => {
116+
realtimeClient.on("response.output_text.delta", (event) => process.stdout.write(event.delta));
117+
realtimeClient.on("response.output_audio.delta", (event) => {
118118
const buffer = Buffer.from(event.delta, "base64");
119119
console.log(`Received ${buffer.length} bytes of audio data.`);
120120
});
121-
realtimeClient.on("response.audio_transcript.delta", (event) => {
121+
realtimeClient.on("response.output_audio_transcript.delta", (event) => {
122122
console.log(`Received text delta:${event.delta}.`);
123123
});
124124
realtimeClient.on("response.text.done", () => console.log());
@@ -155,8 +155,8 @@ For the recommended keyless authentication with Microsoft Entra ID, you need to:
155155
const endpoint = process.env.AZURE_OPENAI_ENDPOINT || "AZURE_OPENAI_ENDPOINT";
156156
const apiKey = process.env.AZURE_OPENAI_API_KEY || "Your API key";
157157
// Required Azure OpenAI deployment name and API version
158-
const deploymentName = process.env.AZURE_OPENAI_DEPLOYMENT_NAME || "gpt-4o-mini-realtime-preview";
159-
const apiVersion = process.env.OPENAI_API_VERSION || "2025-04-01-preview";
158+
const deploymentName = process.env.AZURE_OPENAI_DEPLOYMENT_NAME || "gpt-4o-realtime";
159+
const apiVersion = process.env.OPENAI_API_VERSION || "2025-08-28";
160160
const azureOpenAIClient = new AzureOpenAI({
161161
apiKey: apiKey,
162162
apiVersion: apiVersion,
@@ -170,7 +170,7 @@ For the recommended keyless authentication with Microsoft Entra ID, you need to:
170170
type: "session.update",
171171
session: {
172172
modalities: ["text", "audio"],
173-
model: "gpt-4o-mini-realtime-preview",
173+
model: "gpt-4o-realtime",
174174
},
175175
});
176176
realtimeClient.send({
@@ -192,12 +192,12 @@ For the recommended keyless authentication with Microsoft Entra ID, you need to:
192192
console.log("session created!", event.session);
193193
console.log();
194194
});
195-
realtimeClient.on("response.text.delta", (event) => process.stdout.write(event.delta));
196-
realtimeClient.on("response.audio.delta", (event) => {
195+
realtimeClient.on("response.output_text.delta", (event) => process.stdout.write(event.delta));
196+
realtimeClient.on("response.output_audio.delta", (event) => {
197197
const buffer = Buffer.from(event.delta, "base64");
198198
console.log(`Received ${buffer.length} bytes of audio data.`);
199199
});
200-
realtimeClient.on("response.audio_transcript.delta", (event) => {
200+
realtimeClient.on("response.output_audio_transcript.delta", (event) => {
201201
console.log(`Received text delta:${event.delta}.`);
202202
});
203203
realtimeClient.on("response.text.done", () => console.log());

articles/ai-foundry/openai/includes/realtime-portal.md

Lines changed: 5 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -13,18 +13,18 @@ ms.date: 3/20/2025
1313

1414
## Use the GPT-4o real-time audio
1515

16-
To chat with your deployed `gpt-4o-mini-realtime-preview` model in the [Azure AI Foundry](https://ai.azure.com/?cid=learnDocs) **Real-time audio** playground, follow these steps:
16+
To chat with your deployed `gpt-4o-realtime` model in the [Azure AI Foundry](https://ai.azure.com/?cid=learnDocs) **Real-time audio** playground, follow these steps:
1717

18-
1. Go to the [Azure AI Foundry portal](https://ai.azure.com/?cid=learnDocs) and select your project that has your deployed `gpt-4o-mini-realtime-preview` model.
18+
1. Go to the [Azure AI Foundry portal](https://ai.azure.com/?cid=learnDocs) and select your project that has your deployed `gpt-4o-realtime` model.
1919
1. Select **Playgrounds** from the left pane.
2020
1. Select **Audio playground** > **Try the Audio playground**.
2121

2222
> [!NOTE]
23-
> The **Chat playground** doesn't support the `gpt-4o-mini-realtime-preview` model. Use the **Audio playground** as described in this section.
23+
> The **Chat playground** doesn't support the `gpt-4o-realtime` model. Use the **Audio playground** as described in this section.
2424
25-
1. Select your deployed `gpt-4o-mini-realtime-preview` model from the **Deployment** dropdown.
25+
1. Select your deployed `gpt-4o-realtime` model from the **Deployment** dropdown.
2626

27-
:::image type="content" source="../media/how-to/real-time/real-time-playground.png" alt-text="Screenshot of the audio playground with the deployed model selected." lightbox="../media/how-to/real-time/real-time-playground.png":::
27+
<!--:::image type="content" source="../media/how-to/real-time/real-time-playground.png" alt-text="Screenshot of the audio playground with the deployed model selected." lightbox="../media/how-to/real-time/real-time-playground.png":::-->
2828

2929
1. Optionally, you can edit contents in the **Give the model instructions and context** text box. Give the model instructions about how it should behave and any context it should reference when generating a response. You can describe the assistant's personality, tell it what it should and shouldn't answer, and tell it how to format responses.
3030
1. Optionally, change settings such as threshold, prefix padding, and silence duration.

articles/ai-foundry/openai/includes/realtime-python.md

Lines changed: 15 additions & 15 deletions
Original file line number | Diff line number | Diff line change
@@ -12,7 +12,7 @@ ms.date: 3/20/2025
1212
- An Azure subscription. <a href="https://azure.microsoft.com/free/ai-services" target="_blank">Create one for free</a>.
1313
- <a href="https://www.python.org/" target="_blank">Python 3.8 or later version</a>. We recommend using Python 3.10 or later, but having at least Python 3.8 is required. If you don't have a suitable version of Python installed, you can follow the instructions in the [VS Code Python Tutorial](https://code.visualstudio.com/docs/python/python-tutorial#_install-a-python-interpreter) for the easiest way of installing Python on your operating system.
1414
- An Azure OpenAI resource created in one of the supported regions. For more information about region availability, see the [models and versions documentation](../concepts/models.md#global-standard-model-availability).
15-
- Then, you need to deploy a `gpt-4o-mini-realtime-preview` model with your Azure OpenAI resource. For more information, see [Create a resource and deploy a model with Azure OpenAI](../how-to/create-resource.md).
15+
- Then, you need to deploy a `gpt-4o-realtime` model with your Azure OpenAI resource. For more information, see [Create a resource and deploy a model with Azure OpenAI](../how-to/create-resource.md).
1616

1717
## Microsoft Entra ID prerequisites
1818

@@ -109,12 +109,12 @@ For the recommended keyless authentication with Microsoft Entra ID, you need to:
109109
client = AsyncAzureOpenAI(
110110
azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
111111
azure_ad_token_provider=token_provider,
112-
api_version="2025-04-01-preview",
112+
api_version="2025-08-28",
113113
)
114114
async with client.beta.realtime.connect(
115-
model="gpt-4o-realtime-preview", # name of your deployment
115+
model="gpt-4o-realtime", # name of your deployment
116116
) as connection:
117-
await connection.session.update(session={"modalities": ["text", "audio"]})
117+
await connection.session.update(session={"output_modalities": ["text", "audio"]})
118118
while True:
119119
user_input = input("Enter a message: ")
120120
if user_input == "q":
@@ -129,15 +129,15 @@ For the recommended keyless authentication with Microsoft Entra ID, you need to:
129129
)
130130
await connection.response.create()
131131
async for event in connection:
132-
if event.type == "response.text.delta":
132+
if event.type == "response.output_text.delta":
133133
print(event.delta, flush=True, end="")
134-
elif event.type == "response.audio.delta":
134+
elif event.type == "response.output_audio.delta":
135135
136136
audio_data = base64.b64decode(event.delta)
137137
print(f"Received {len(audio_data)} bytes of audio data.")
138-
elif event.type == "response.audio_transcript.delta":
138+
elif event.type == "response.output_audio_transcript.delta":
139139
print(f"Received text delta: {event.delta}")
140-
elif event.type == "response.text.done":
140+
elif event.type == "response.output_text.done":
141141
print()
142142
elif event.type == "response.done":
143143
break
@@ -181,12 +181,12 @@ For the recommended keyless authentication with Microsoft Entra ID, you need to:
181181
client = AsyncAzureOpenAI(
182182
azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
183183
api_key=os.environ["AZURE_OPENAI_API_KEY"],
184-
api_version="2025-04-01-preview",
184+
api_version="2025-08-28",
185185
)
186186
async with client.beta.realtime.connect(
187-
model="gpt-4o-realtime-preview", # deployment name of your model
187+
model="gpt-4o-realtime", # deployment name of your model
188188
) as connection:
189-
await connection.session.update(session={"modalities": ["text", "audio"]})
189+
await connection.session.update(session={"output_modalities": ["text", "audio"]})
190190
while True:
191191
user_input = input("Enter a message: ")
192192
if user_input == "q":
@@ -201,15 +201,15 @@ For the recommended keyless authentication with Microsoft Entra ID, you need to:
201201
)
202202
await connection.response.create()
203203
async for event in connection:
204-
if event.type == "response.text.delta":
204+
if event.type == "response.output_text.delta":
205205
print(event.delta, flush=True, end="")
206-
elif event.type == "response.audio.delta":
206+
elif event.type == "response.output_audio.delta":
207207
208208
audio_data = base64.b64decode(event.delta)
209209
print(f"Received {len(audio_data)} bytes of audio data.")
210-
elif event.type == "response.audio_transcript.delta":
210+
elif event.type == "response.output_audio_transcript.delta":
211211
print(f"Received text delta: {event.delta}")
212-
elif event.type == "response.text.done":
212+
elif event.type == "response.output_text.done":
213213
print()
214214
elif event.type == "response.done":
215215
break
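
The event renames made across the JavaScript and Python samples in this commit can be summarized as a lookup table. The following is a minimal sketch and is not part of the commit: the names `LEGACY_TO_CURRENT_EVENT` and `normalize_event_type` are hypothetical, and the mapping covers only the renames visible in this diff, not the full Realtime API event set.

```python
# Hypothetical helper (not from the commit): maps the event names removed in
# this diff to the names that replace them.
LEGACY_TO_CURRENT_EVENT = {
    "response.text.delta": "response.output_text.delta",
    "response.text.done": "response.output_text.done",
    "response.audio.delta": "response.output_audio.delta",
    "response.audio_transcript.delta": "response.output_audio_transcript.delta",
}


def normalize_event_type(event_type: str) -> str:
    """Return the renamed event type, or the input unchanged if this diff did not rename it."""
    return LEGACY_TO_CURRENT_EVENT.get(event_type, event_type)
```

A handler written against the old names could pass each incoming `event.type` through `normalize_event_type` before comparing it, so that one set of branches covers both spellings. Event types that this diff leaves untouched, such as `response.done`, pass through unchanged.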
