Commit b60a7b0

realtime webrtc
1 parent 9a941dc commit b60a7b0

2 files changed: 14 additions and 12 deletions

articles/ai-services/openai/how-to/realtime-audio-webrtc.md

Lines changed: 9 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -21,13 +21,12 @@ Azure OpenAI GPT-4o Realtime API for speech and audio is part of the GPT-4o mode
2121
You can use the Realtime API via WebRTC or WebSocket to send audio input to the model and receive audio responses in real time. Follow the instructions in this article to get started with the Realtime API via WebRTC.
2222

2323
In most cases, we recommend using the WebRTC API for real-time audio streaming. The WebRTC API is a web standard that enables real-time communication (RTC) between browsers and mobile applications. Here are some reasons why WebRTC is preferred for real-time audio streaming:
24-
- **Lower Latency**: WebRTC is specifically designed to minimize delay, making it more suitable for audio and video communication where low latency is critical for maintaining quality and synchronization.
24+
- **Lower Latency**: WebRTC is designed to minimize delay, making it more suitable for audio and video communication where low latency is critical for maintaining quality and synchronization.
2525
- **Media Handling**: WebRTC has built-in support for audio and video codecs, providing optimized handling of media streams.
2626
- **Error Correction**: WebRTC includes mechanisms for handling packet loss and jitter, which are essential for maintaining the quality of audio streams over unpredictable networks.
2727
- **Peer-to-Peer Communication**: WebRTC allows direct communication between clients, reducing the need for a central server to relay audio data, which can further reduce latency.
28-
- **Network Traversal**: WebRTC includes built-in support for NAT traversal, which helps establish connections between clients behind firewalls or NATs.
2928

30-
Use the [Realtime API via WebSockets](./realtime-audio-websockets.md) if you need to stream audio data from a server to a client, or if you need to send and receive data in real time between a client and server. WebSockets are not recommended for real-time audio streaming because they have higher latency than WebRTC.
29+
Use the [Realtime API via WebSockets](./realtime-audio-websockets.md) if you need to stream audio data from a server to a client, or if you need to send and receive data in real time between a client and server. WebSockets aren't recommended for real-time audio streaming because they have higher latency than WebRTC.
3130

3231
## Supported models
3332

@@ -36,7 +35,7 @@ The GPT 4o real-time models are available for global deployments in [East US 2 a
3635
- `gpt-4o-realtime-preview` (2024-12-17)
3736
- `gpt-4o-realtime-preview` (2024-10-01)
3837

39-
See the [models and versions documentation](../concepts/models.md#audio-models) for more information.
38+
For more information about supported models, see the [models and versions documentation](../concepts/models.md#audio-models).
4039

4140
## Prerequisites
4241

@@ -74,7 +73,7 @@ The sessions URL includes the Azure OpenAI resource URL, deployment name, the `/
7473

7574
You can use the ephemeral API key to authenticate a WebRTC session with the Realtime API. The ephemeral key is valid for one minute and is used to establish a secure WebRTC connection between the client and the Realtime API.
7675

77-
The sequence diagram below illustrates the process of minting an ephemeral API key and using it to authenticate a WebRTC session with the Realtime API. The sequence diagram shows the following steps:
76+
Here's how the ephemeral API key is used in the Realtime API:
7877

7978
1. Your client requests an ephemeral API key from your server.
8079
1. Your server mints the ephemeral API key using the standard API key.
@@ -84,7 +83,9 @@ The sequence diagram below illustrates the process of minting an ephemeral API k
8483
8584
1. Your server returns the ephemeral API key to your client.
8685
1. Your client uses the ephemeral API key to authenticate a session with the Realtime API via WebRTC.
87-
1. Send and receive audio data in real time using the WebRTC peer connection.
86+
1. You send and receive audio data in real time using the WebRTC peer connection.
87+
88+
The following sequence diagram illustrates the process of minting an ephemeral API key and using it to authenticate a WebRTC session with the Realtime API.
8889

8990
:::image type="content" source="../media/how-to/real-time/ephemeral-key-webrtc.png" alt-text="Diagram of the ephemeral API key to WebRTC peer connection sequence." lightbox="../media/how-to/real-time/ephemeral-key-webrtc.png":::
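
As a rough illustration of the one-minute validity window described above, the following JavaScript sketch shows one way a client might decide when to ask its server for a new ephemeral key. The names `isEphemeralKeyUsable`, `createKeyProvider`, and `requestKeyFromServer`, and the five-second safety margin, are hypothetical and aren't part of the Realtime API. Treat this as a sketch under those assumptions, not as the article's implementation.

```javascript
// The article states that an ephemeral key is valid for one minute.
const EPHEMERAL_KEY_TTL_MS = 60000;

// Hypothetical helper: returns true while a key minted at `mintedAtMs`
// is still inside its validity window, minus an assumed safety margin.
function isEphemeralKeyUsable(mintedAtMs, nowMs, safetyMarginMs = 5000) {
  const ageMs = nowMs - mintedAtMs;
  return ageMs >= 0 && ageMs < EPHEMERAL_KEY_TTL_MS - safetyMarginMs;
}

// Hypothetical wrapper: `requestKeyFromServer` stands in for a call to your
// own backend, which mints the ephemeral key with the standard API key.
function createKeyProvider(requestKeyFromServer, now = Date.now) {
  let cached = null; // { key, mintedAtMs } once a key has been fetched

  return async function getKey() {
    if (!cached || !isEphemeralKeyUsable(cached.mintedAtMs, now())) {
      // Record the time before the request so the estimate is conservative.
      const mintedAtMs = now();
      const key = await requestKeyFromServer();
      cached = { key, mintedAtMs };
    }
    return cached.key;
  };
}
```

Keeping the freshness check as a pure function makes it easy to exercise with fixed timestamps, and the standard API key never leaves the server in this arrangement.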
9091

@@ -109,7 +110,7 @@ The following code sample demonstrates how to use the GPT-4o Realtime API via We
109110
The sample code is an HTML page that allows you to start a session with the GPT-4o Realtime API and send audio input to the model. The model's responses are played back in real-time.
110111

111112
> [!WARNING]
112-
> The sample code includes the API key hardcoded in the JavaScript. This is not recommended for production use. In a production environment, you should use a secure backend service to generate an ephemeral key and return it to the client.
113+
> The sample code includes the API key hardcoded in the JavaScript. This code isn't recommended for production use. In a production environment, you should use a secure backend service to generate an ephemeral key and return it to the client.
113114
114115
1. Copy the following code into an HTML file and open it in a web browser:
115116

@@ -294,7 +295,7 @@ The sample code is an HTML page that allows you to start a session with the GPT-
294295
295296
1. Select **Start Session** to start a session with the GPT-4o Realtime API. The session ID and ephemeral key are displayed in the log container.
296297
1. Allow the browser to access your microphone when prompted.
297-
1. Confirmation messages are displayed in the log container as the session progresses. The following is an example of the log messages:
298+
1. Confirmation messages are displayed in the log container as the session progresses. Here's an example of the log messages:
298299
299300
```text
300301
Ephemeral Key Received: ***

articles/ai-services/openai/how-to/realtime-audio-websockets.md

Lines changed: 5 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -20,18 +20,19 @@ Azure OpenAI GPT-4o Realtime API for speech and audio is part of the GPT-4o mode
2020

2121
You can use the Realtime API via WebRTC or WebSocket to send audio input to the model and receive audio responses in real time. Follow the instructions in this article to get started with the Realtime API via WebSockets.
2222

23-
In most cases, we recommend using the [Realtime API via WebRTC](./realtime-audio-webrtc.md) for real-time audio streaming. The WebRTC API is a web standard that enables real-time communication (RTC) between browsers and mobile applications.
23+
Use the Realtime API via WebSockets in server-to-server scenarios where low latency isn't a requirement.
2424

25-
WebSockets are not recommended for real-time audio streaming because they have higher latency than WebRTC. Use the Realtime API via WebSockets if you need to stream audio data from a server to a client, or if you need to send and receive data in real time between a client and server.
25+
> [!TIP]
26+
> In most cases, we recommend using the [Realtime API via WebRTC](./realtime-audio-webrtc.md) for real-time audio streaming in client-side applications such as a web application or mobile app. WebRTC is designed for low-latency, real-time audio streaming and is the best choice for most use cases.
2627
2728
## Supported models
2829

29-
The GPT 4o real-time models are available for global deployments in [East US 2 and Sweden Central regions](../concepts/models.md#global-standard-model-availability).
30+
The GPT-4o real-time models are available for global deployments in [East US 2 and Sweden Central regions](../concepts/models.md#global-standard-model-availability).
3031
- `gpt-4o-mini-realtime-preview` (2024-12-17)
3132
- `gpt-4o-realtime-preview` (2024-12-17)
3233
- `gpt-4o-realtime-preview` (2024-10-01)
3334

34-
See the [models and versions documentation](../concepts/models.md#audio-models) for more information.
35+
For more information about supported models, see the [models and versions documentation](../concepts/models.md#audio-models).
3536

3637
## Prerequisites
3738
