gpt-4o real-time audio preview

eric-urban · eric-urban · commit 8b6dfacc93a4 · 2024-10-01T13:50:07.000-07:00
diff --git a/articles/ai-services/openai/how-to/audio-real-time.md b/articles/ai-services/openai/how-to/audio-real-time.md
@@ -1,7 +1,7 @@
 ---
-title: 'How to use GPT-4o real-time audio with Azure OpenAI Service'
+title: 'How to use GPT-4o Realtime API for speech and audio with Azure OpenAI Service'
 titleSuffix: Azure OpenAI
-description: Learn how to use GPT-4o real-time audio with Azure OpenAI Service.
+description: Learn how to use GPT-4o Realtime API for speech and audio with Azure OpenAI Service.
 manager: nitinme
 ms.service: azure-ai-openai
 ms.topic: how-to
@@ -12,11 +12,11 @@ ms.custom: references_regions
 recommendations: false
 ---
 
-# GPT-4o real-time audio
+# GPT-4o Realtime API for speech and audio
 
-Azure OpenAI GPT-4o audio is part of the GPT-4o model family that supports low-latency, "speech in, speech out" conversational interactions. The GPT-4o audio `realtime` API is designed to handle real-time, low-latency conversational interactions, making it a great fit for use cases involving live interactions between a user and a model, such as customer support agents, voice assistants, and real-time translators.
+Azure OpenAI GPT-4o Realtime API for speech and audio is part of the GPT-4o model family that supports low-latency, "speech in, speech out" conversational interactions. The GPT-4o audio `realtime` API is designed to handle real-time, low-latency conversational interactions, making it a great fit for use cases involving live interactions between a user and a model, such as customer support agents, voice assistants, and real-time translators.
 
-Most users of this API need to deliver and receive audio from an end-user in real time, including applications that use WebRTC or a telephony system. The real-time API isn't designed to connect directly to end user devices and relies on client integrations to terminate end user audio streams. 
+Most users of this API need to deliver and receive audio from an end-user in real time, including applications that use WebRTC or a telephony system. The Realtime API isn't designed to connect directly to end user devices and relies on client integrations to terminate end user audio streams. 
 
 ## Supported models
 
@@ -29,7 +29,7 @@ The `gpt-4o-realtime-preview` model is available for global deployments in [East
 
 ## API support
 
-Support for real-time audio was first added in API version `2024-10-01-preview`. 
+Support for the Realtime API was first added in API version `2024-10-01-preview`. 
 
 > [!NOTE]
 > For more information about the API and architecture, see the [Azure OpenAI GPT-4o real-time audio repository on GitHub](https://github.com/azure-samples/aoai-realtime-audio-sdk).
@@ -56,15 +56,18 @@ You can deploy the model from the Azure OpenAI model catalog or from your projec
 
 Now that you have a deployment of the `gpt-4o-realtime-preview` model, you can use the playground to interact with the model in real time. Select **Early access playground** from the list of playgrounds in the left pane.
 
-## Use the GPT-4o real-time audio API
+## Use the GPT-4o Realtime API
 
 > [!TIP]
 > A playground for GPT-4o real-time audio is coming soon to [Azure AI Studio](https://ai.azure.com). You can already use the API directly in your application.
 
-Right now, the fastest way to get started with GPT-4o real-time audio is to download the sample code from the [Azure OpenAI GPT-4o real-time audio repository on GitHub](https://github.com/azure-samples/aoai-realtime-audio-sdk).
+Right now, the fastest way to get started with the GPT-4o Realtime API is to download the sample code from the [Azure OpenAI GPT-4o real-time audio repository on GitHub](https://github.com/azure-samples/aoai-realtime-audio-sdk).
+
+The JavaScript web sample demonstrates how to use the GPT-4o Realtime API to interact with the model in real time. The sample code includes a simple web interface that captures audio from the user's microphone and sends it to the model for processing. The model responds with text and audio, which the sample code renders in the web interface.
+
+You can run the sample code locally on your machine by following these steps. Refer to the [repository on GitHub](https://github.com/azure-samples/aoai-realtime-audio-sdk) for the most up-to-date instructions.
+1. If you don't have Node.js installed, download and install the [LTS version of Node.js](https://nodejs.org/).
 
-The JavaScript web sample demonstrates how to use the GPT-4o real-time audio API to interact with the model in real time. The sample code includes a simple web interface that captures audio from the user's microphone and sends it to the model for processing. The model responds with text and audio, which the sample code renders in the web interface.
- 
 1. Clone the repository to your local machine:
     
     ```bash
@@ -74,12 +77,12 @@ The JavaScript web sample demonstrates how to use the GPT-4o real-time audio API
 1. Go to the `javascript/samples/web` folder in your preferred code editor.
 
     ```bash
-    cd .\javascript\samples\web\
+    cd ./javascript/samples
     ```
 
-1. If you don't have Node.js installed, download and install the [LTS version of Node.js](https://nodejs.org/).
+1. Run `download-pkg.ps1` or `download-pkg.sh` to download the required packages. 
 
-1. Run `npm install` to download a few dependency packages. For more information, see the `package.json` file in the same `web` folder.
+1. Run `npm install` to install package dependencies.
 
 1. Run `npm run dev` to start the web server, navigating any firewall permissions prompts as needed.
 1. Go to any of the provided URIs from the console output (such as `http://localhost:5173/`) in a browser.