articles/ai-services/openai/how-to/audio-real-time.md
Lines changed: 16 additions & 13 deletions
@@ -1,7 +1,7 @@
 ---
-title: 'How to use GPT-4o real-time audio with Azure OpenAI Service'
+title: 'How to use GPT-4o Realtime API for speech and audio with Azure OpenAI Service'
 titleSuffix: Azure OpenAI
-description: Learn how to use GPT-4o real-time audio with Azure OpenAI Service.
+description: Learn how to use GPT-4o Realtime API for speech and audio with Azure OpenAI Service.
 manager: nitinme
 ms.service: azure-ai-openai
 ms.topic: how-to
@@ -12,11 +12,11 @@ ms.custom: references_regions
 recommendations: false
 ---
 
-# GPT-4o real-time audio
+# GPT-4o Realtime API for speech and audio
 
-Azure OpenAI GPT-4o audio is part of the GPT-4o model family that supports low-latency, "speech in, speech out" conversational interactions. The GPT-4o audio `realtime` API is designed to handle real-time, low-latency conversational interactions, making it a great fit for use cases involving live interactions between a user and a model, such as customer support agents, voice assistants, and real-time translators.
+Azure OpenAI GPT-4o Realtime API for speech and audio is part of the GPT-4o model family that supports low-latency, "speech in, speech out" conversational interactions. The GPT-4o audio `realtime` API is designed to handle real-time, low-latency conversational interactions, making it a great fit for use cases involving live interactions between a user and a model, such as customer support agents, voice assistants, and real-time translators.
 
-Most users of this API need to deliver and receive audio from an end-user in real time, including applications that use WebRTC or a telephony system. The real-time API isn't designed to connect directly to end user devices and relies on client integrations to terminate end user audio streams.
+Most users of this API need to deliver and receive audio from an end-user in real time, including applications that use WebRTC or a telephony system. The Realtime API isn't designed to connect directly to end user devices and relies on client integrations to terminate end user audio streams.
 
 ## Supported models
 
@@ -29,7 +29,7 @@ The `gpt-4o-realtime-preview` model is available for global deployments in [East
 
 ## API support
 
-Support for real-time audio was first added in API version `2024-10-01-preview`.
+Support for the Realtime API was first added in API version `2024-10-01-preview`.
 
 > [!NOTE]
 > For more information about the API and architecture, see the [Azure OpenAI GPT-4o real-time audio repository on GitHub](https://github.com/azure-samples/aoai-realtime-audio-sdk).
@@ -56,15 +56,18 @@ You can deploy the model from the Azure OpenAI model catalog or from your projec
 
 Now that you have a deployment of the `gpt-4o-realtime-preview` model, you can use the playground to interact with the model in real time. Select **Early access playground** from the list of playgrounds in the left pane.
 
-## Use the GPT-4o real-time audio API
+## Use the GPT-4o Realtime API
 
 > [!TIP]
 > A playground for GPT-4o real-time audio is coming soon to [Azure AI Studio](https://ai.azure.com). You can already use the API directly in your application.
 
-Right now, the fastest way to get started with GPT-4o real-time audio is to download the sample code from the [Azure OpenAI GPT-4o real-time audio repository on GitHub](https://github.com/azure-samples/aoai-realtime-audio-sdk).
+Right now, the fastest way to get started with the GPT-4o Realtime API is to download the sample code from the [Azure OpenAI GPT-4o real-time audio repository on GitHub](https://github.com/azure-samples/aoai-realtime-audio-sdk).
+
+The JavaScript web sample demonstrates how to use the GPT-4o Realtime API to interact with the model in real time. The sample code includes a simple web interface that captures audio from the user's microphone and sends it to the model for processing. The model responds with text and audio, which the sample code renders in the web interface.
+
+You can run the sample code locally on your machine by following these steps. Refer to the [repository on GitHub](https://github.com/azure-samples/aoai-realtime-audio-sdk) for the most up-to-date instructions.
+1. If you don't have Node.js installed, download and install the [LTS version of Node.js](https://nodejs.org/).
 
-The JavaScript web sample demonstrates how to use the GPT-4o real-time audio API to interact with the model in real time. The sample code includes a simple web interface that captures audio from the user's microphone and sends it to the model for processing. The model responds with text and audio, which the sample code renders in the web interface.
-
 1. Clone the repository to your local machine:
 
 ```bash
@@ -74,12 +77,12 @@ The JavaScript web sample demonstrates how to use the GPT-4o real-time audio API
 1. Go to the `javascript/samples/web` folder in your preferred code editor.
 
 ```bash
-cd .\javascript\samples\web\
+cd ./javascript/samples
 ```
 
-1. If you don't have Node.js installed, download and install the [LTS version of Node.js](https://nodejs.org/).
+1. Run `download-pkg.ps1` or `download-pkg.sh` to download the required packages.
 
-1. Run `npm install` to download a few dependency packages. For more information, see the `package.json` file in the same `web` folder.
+1. Run `npm install` to install package dependencies.
 
 1. Run `npm run dev` to start the web server, navigating any firewall permissions prompts as needed.
 1. Go to any of the provided URIs from the console output (such as `http://localhost:5173/`) in a browser.
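
The updated steps rely on `git` (to clone the repository) and on Node.js with `npm` (for `npm install` and `npm run dev`). As a rough illustration only, the preflight sketch below checks that those tools are on the `PATH` before you start. The function names `have_cmd` and `check_prereqs` are hypothetical helpers invented for this example; they are not part of the sample repository or of this change.

```shell
#!/usr/bin/env bash
# Preflight sketch: confirm the tools the steps rely on are installed.
# have_cmd and check_prereqs are hypothetical names, not part of the
# aoai-realtime-audio-sdk repository.

# Return 0 if the named command is available on PATH, non-zero otherwise.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report every missing tool on stderr; return 1 if any tool is missing.
check_prereqs() {
  local missing=0
  local tool
  for tool in "$@"; do
    if ! have_cmd "$tool"; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

One possible usage, after sourcing the file, is `check_prereqs git node npm && echo "ready to run the sample"`. The check only confirms that the commands exist; it does not verify that the installed Node.js is an LTS version.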