Merge pull request #6911 from PatrickFarley/openai-audio

JamesJBarnett · web-flow · commit 9615d22fab5a · 2025-09-03T10:26:22.000-07:00
fix names
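
For reference while reviewing the `realtime-audio-websockets.md` hunk: the preview-tab request URI is described there as the resource hostname, the `openai/realtime` path, an `api-version` query parameter, and a `deployment` query parameter. A minimal Python sketch of that concatenation follows. The helper names `build_realtime_uri` and `check_api_version` are hypothetical and not part of any SDK, and the date check is an added assumption meant only to catch malformed version strings before a connection attempt.

```python
from datetime import datetime
from urllib.parse import urlencode

PREVIEW_SUFFIX = "-preview"


def check_api_version(api_version: str) -> str:
    """Return api_version unchanged if it parses as YYYY-MM-DD, optionally with '-preview'.

    Hypothetical helper: it validates only the date shape, not whether the
    service actually supports that version.
    """
    date_part = api_version
    if api_version.endswith(PREVIEW_SUFFIX):
        date_part = api_version[: -len(PREVIEW_SUFFIX)]
    # strptime raises ValueError for impossible dates, such as a month of 28.
    datetime.strptime(date_part, "%Y-%m-%d")
    return api_version


def build_realtime_uri(host: str, api_version: str, deployment: str) -> str:
    """Concatenate the parts listed in the article into a wss:// request URI."""
    query = urlencode(
        {
            "api-version": check_api_version(api_version),
            "deployment": deployment,
        }
    )
    return f"wss://{host}/openai/realtime?{query}"


if __name__ == "__main__":
    print(
        build_realtime_uri(
            "my-eastus2-openai-resource.openai.azure.com",
            "2025-04-01-preview",
            "gpt-4o-mini-realtime-preview-deployment-name",
        )
    )
```

The sketch covers only the preview-tab form of the URI. The GA-tab example in the same hunk uses a different query parameter, so it is deliberately left out here.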
diff --git a/articles/ai-foundry/openai/concepts/models.md b/articles/ai-foundry/openai/concepts/models.md
@@ -321,7 +321,7 @@ Details about maximum request tokens and training data are available in the foll
 |`gpt-4o-realtime-preview` (2025-06-03) <br> GPT-4o audio | Audio model for real-time audio processing. |Input: 128,000  <br> Output: 4,096 | October 2023 |
 |`gpt-4o-realtime-preview` (2024-12-17) <br> GPT-4o audio | Audio model for real-time audio processing. |Input: 128,000  <br> Output: 4,096 | October 2023 |
 |`gpt-4o-mini-realtime-preview` (2024-12-17) <br> GPT-4o audio | Audio model for real-time audio processing. |Input: 128,000  <br> Output: 4,096 | October 2023 |
-|`gpt-4o-realtime` (2025-08-28) <br> GPT-4o audio | Audio model for real-time audio processing. |Input: 28,672  <br> Output: 4,096 | October 2023 |
+|`gpt-realtime` (2025-08-28) <br> GPT-4o audio | Audio model for real-time audio processing. |Input: 28,672  <br> Output: 4,096 | October 2023 |
 
 To compare the availability of GPT-4o audio models across all regions, refer to the [models table](#global-standard-model-availability).
 
diff --git a/articles/ai-foundry/openai/how-to/prompt-caching.md b/articles/ai-foundry/openai/how-to/prompt-caching.md
@@ -75,9 +75,9 @@ Prompt caching is supported for:
 
 |**Caching supported**|**Description**|**Supported models**|
 |--------|--------|--------|
-| **Messages** | The complete messages array: system, developer, user, and assistant content | `gpt-4o`<br/>`gpt-4o-mini`<br/>`gpt-4o-realtime-preview` (version 2024-12-17)<br/>`gpt-4o-mini-realtime-preview` (version 2024-12-17)<br> `o1` (version 2024-12-17) <br> `o3-mini` (version 2025-01-31) |
+| **Messages** | The complete messages array: system, developer, user, and assistant content | `gpt-4o`<br/>`gpt-4o-mini`<br/>`gpt-4o-realtime-preview` (version 2024-12-17)<br/>`gpt-4o-mini-realtime-preview` (version 2024-12-17)<br>`gpt-realtime` (version 2025-08-28)<br> `o1` (version 2024-12-17) <br> `o3-mini` (version 2025-01-31) |
 | **Images** | Images included in user messages, both as links or as base64-encoded data. The detail parameter must be set the same across requests. | `gpt-4o`<br/>`gpt-4o-mini` <br> `o1` (version 2024-12-17)  |
-| **Tool use** | Both the messages array and tool definitions. | `gpt-4o`<br/>`gpt-4o-mini`<br/>`gpt-4o-realtime-preview` (version 2024-12-17)<br/>`gpt-4o-mini-realtime-preview` (version 2024-12-17)<br> `o1` (version 2024-12-17) <br> `o3-mini` (version 2025-01-31) |
+| **Tool use** | Both the messages array and tool definitions. | `gpt-4o`<br/>`gpt-4o-mini`<br/>`gpt-4o-realtime-preview` (version 2024-12-17)<br/>`gpt-4o-mini-realtime-preview` (version 2024-12-17)<br>`gpt-realtime` (version 2025-08-28)<br> `o1` (version 2024-12-17) <br> `o3-mini` (version 2025-01-31) |
 | **Structured outputs** | Structured output schema is appended as a prefix to the system message. | `gpt-4o`<br/>`gpt-4o-mini` <br> `o1` (version 2024-12-17) <br> `o3-mini` (version 2025-01-31) |
 
 To improve the likelihood of cache hits occurring, you should structure your requests such that repetitive content occurs at the beginning of the messages array.
diff --git a/articles/ai-foundry/openai/how-to/realtime-audio-webrtc.md b/articles/ai-foundry/openai/how-to/realtime-audio-webrtc.md
@@ -1,5 +1,5 @@
 ---
-title: 'How to use the GPT-4o Realtime API via WebRTC (Preview)'
+title: 'How to use the GPT-4o Realtime API via WebRTC'
 titleSuffix: Azure OpenAI in Azure AI Foundry Models
 description: Learn how to use the GPT-4o Realtime API for speech and audio via WebRTC.
 manager: nitinme
@@ -12,9 +12,8 @@ ms.custom: references_regions
 recommendations: false
 ---
 
-# How to use the GPT-4o Realtime API via WebRTC (Preview)
+# How to use the GPT-4o Realtime API via WebRTC
 
-[!INCLUDE [Feature preview](../includes/preview-feature.md)]
 
 Azure OpenAI GPT-4o Realtime API for speech and audio is part of the GPT-4o model family that supports low-latency, "speech in, speech out" conversational interactions. 
 
@@ -33,6 +32,7 @@ Use the [Realtime API via WebSockets](./realtime-audio-websockets.md) if you nee
 The GPT-4o real-time models are available for global deployments in [East US 2 and Sweden Central regions](../concepts/models.md#global-standard-model-availability).
 - `gpt-4o-mini-realtime-preview` (2024-12-17)
 - `gpt-4o-realtime-preview` (2024-12-17)
+- `gpt-realtime` (version 2025-08-28)
 
 You should use API version `2025-04-01-preview` in the URL for the Realtime API. The API version is included in the sessions URL.
 
@@ -44,7 +44,7 @@ Before you can use GPT-4o real-time audio, you need:
 
 - An Azure subscription - <a href="https://azure.microsoft.com/free/cognitive-services" target="_blank">Create one for free</a>.
 - An Azure OpenAI resource created in a [supported region](#supported-models). For more information, see [Create a resource and deploy a model with Azure OpenAI](create-resource.md).
-- You need a deployment of the `gpt-4o-realtime-preview` or `gpt-4o-mini-realtime-preview` model in a supported region as described in the [supported models](#supported-models) section in this article. You can deploy the model from the [Azure AI Foundry model catalog](../../../ai-foundry/how-to/model-catalog-overview.md) or from your project in Azure AI Foundry portal. 
+- You need a deployment of the `gpt-4o-realtime-preview`, `gpt-4o-mini-realtime-preview`, or `gpt-realtime` model in a supported region as described in the [supported models](#supported-models) section in this article. You can deploy the model from the [Azure AI Foundry model catalog](../../../ai-foundry/how-to/model-catalog-overview.md) or from your project in Azure AI Foundry portal. 
 
 ## Connection and authentication
 
diff --git a/articles/ai-foundry/openai/how-to/realtime-audio-websockets.md b/articles/ai-foundry/openai/how-to/realtime-audio-websockets.md
@@ -1,5 +1,5 @@
 ---
-title: 'How to use the GPT-4o Realtime API via WebSockets (Preview)'
+title: 'How to use the GPT-4o Realtime API via WebSockets'
 titleSuffix: Azure OpenAI in Azure AI Foundry Models
 description: Learn how to use the GPT-4o Realtime API for speech and audio via WebSockets.
 manager: nitinme
@@ -12,9 +12,8 @@ ms.custom: references_regions
 recommendations: false
 ---
 
-# How to use the GPT-4o Realtime API via WebSockets (Preview)
+# How to use the GPT-4o Realtime API via WebSockets
 
-[!INCLUDE [Feature preview](../includes/preview-feature.md)]
 
 Azure OpenAI GPT-4o Realtime API for speech and audio is part of the GPT-4o model family that supports low-latency, "speech in, speech out" conversational interactions. 
 
@@ -30,6 +29,7 @@ Follow the instructions in this article to get started with the Realtime API via
 The GPT-4o real-time models are available for global deployments in [East US 2 and Sweden Central regions](../concepts/models.md#global-standard-model-availability).
 - `gpt-4o-mini-realtime-preview` (2024-12-17)
 - `gpt-4o-realtime-preview` (2024-12-17)
+- `gpt-realtime` (version 2025-08-28)
 
 You should use API version `2025-04-01-preview` in the URL for the Realtime API. 
 
@@ -41,7 +41,7 @@ Before you can use GPT-4o real-time audio, you need:
 
 - An Azure subscription - <a href="https://azure.microsoft.com/free/cognitive-services" target="_blank">Create one for free</a>.
 - An Azure OpenAI resource created in a [supported region](#supported-models). For more information, see [Create a resource and deploy a model with Azure OpenAI](create-resource.md).
-- You need a deployment of the `gpt-4o-realtime-preview` or `gpt-4o-mini-realtime-preview` model in a supported region as described in the [supported models](#supported-models) section. You can deploy the model from the [Azure AI Foundry portal model catalog](../../../ai-foundry/how-to/model-catalog-overview.md) or from your project in Azure AI Foundry portal. 
+- You need a deployment of the `gpt-4o-realtime-preview`, `gpt-4o-mini-realtime-preview`, or `gpt-realtime` model in a supported region as described in the [supported models](#supported-models) section. You can deploy the model from the [Azure AI Foundry portal model catalog](../../../ai-foundry/how-to/model-catalog-overview.md) or from your project in Azure AI Foundry portal. 
 
 ## Connection and authentication
 
@@ -55,13 +55,23 @@ You can construct a full request URI by concatenating:
 - Your Azure OpenAI resource endpoint hostname, for example, `my-aoai-resource.openai.azure.com`
 - The `openai/realtime` API path.
 - An `api-version` query string parameter for a supported API version such as `2025-04-01-preview`
-- A `deployment` query string parameter with the name of your `gpt-4o-realtime-preview` or `gpt-4o-mini-realtime-preview` model deployment.
+- A `deployment` query string parameter with the name of your `gpt-4o-realtime-preview`, `gpt-4o-mini-realtime-preview`, or `gpt-realtime` model deployment.
 
 The following example is a well-constructed `/realtime` request URI:
 
+#### [preview version](#tab/preview)
+
 ```http
 wss://my-eastus2-openai-resource.openai.azure.com/openai/realtime?api-version=2025-04-01-preview&deployment=gpt-4o-mini-realtime-preview-deployment-name
 ```
+#### [GA version](#tab/ga)
+
+```http
+wss://my-eastus2-openai-resource.openai.azure.com/openai/realtime?api-version=2025-08-28&model=gpt-realtime-deployment-name
+```
+
+---
+
 
 To authenticate:
 - **Microsoft Entra** (recommended): Use token-based authentication with the `/realtime` API for an Azure OpenAI resource with managed identity enabled. Apply a retrieved authentication token using a `Bearer` token with the `Authorization` header.
diff --git a/articles/ai-foundry/openai/how-to/realtime-audio.md b/articles/ai-foundry/openai/how-to/realtime-audio.md
@@ -12,9 +12,7 @@ ms.custom: references_regions
 recommendations: false
 ---
 
-# How to use the GPT-4o Realtime API for speech and audio (Preview)
-
-[!INCLUDE [Feature preview](../includes/preview-feature.md)]
+# How to use the GPT-4o Realtime API for speech and audio
 
 Azure OpenAI GPT-4o Realtime API for speech and audio is part of the GPT-4o model family that supports low-latency, "speech in, speech out" conversational interactions. The GPT-4o Realtime API is designed to handle real-time, low-latency conversational interactions. Realtime API is a great fit for use cases involving live interactions between a user and a model, such as customer support agents, voice assistants, and real-time translators.
 
@@ -29,6 +27,7 @@ You can use the Realtime API via WebRTC or WebSocket to send audio input to the
 The GPT-4o real-time models are available for global deployments in [East US 2 and Sweden Central regions](../concepts/models.md#global-standard-model-availability).
 - `gpt-4o-mini-realtime-preview` (2024-12-17)
 - `gpt-4o-realtime-preview` (2024-12-17)
+- `gpt-realtime` (version 2025-08-28)
 
 You should use API version `2025-04-01-preview` in the URL for the Realtime API. 
 
@@ -40,10 +39,10 @@ Before you can use GPT-4o real-time audio, you need:
 
 - An Azure subscription - <a href="https://azure.microsoft.com/free/cognitive-services" target="_blank">Create one for free</a>.
 - An Azure OpenAI resource created in a [supported region](#supported-models). For more information, see [Create a resource and deploy a model with Azure OpenAI](create-resource.md).
-- You need a deployment of the `gpt-4o-realtime-preview` or `gpt-4o-mini-realtime-preview` model in a supported region as described in the [supported models](#supported-models) section. You can deploy the model from the [Azure AI Foundry portal model catalog](../../../ai-foundry/how-to/model-catalog-overview.md) or from your project in Azure AI Foundry portal. 
+- You need a deployment of the `gpt-4o-realtime-preview`, `gpt-4o-mini-realtime-preview`, or `gpt-realtime` model in a supported region as described in the [supported models](#supported-models) section. You can deploy the model from the [Azure AI Foundry portal model catalog](../../../ai-foundry/how-to/model-catalog-overview.md) or from your project in Azure AI Foundry portal. 
 
 Here are some of the ways you can get started with the GPT-4o Realtime API for speech and audio:
-- For steps to deploy and use the `gpt-4o-realtime-preview` or `gpt-4o-mini-realtime-preview` model, see [the real-time audio quickstart](../realtime-audio-quickstart.md).
+- For steps to deploy and use the `gpt-4o-realtime-preview`, `gpt-4o-mini-realtime-preview`, or `gpt-realtime` model, see [the real-time audio quickstart](../realtime-audio-quickstart.md).
 - Try the [WebRTC via HTML and JavaScript example](./realtime-audio-webrtc.md#webrtc-example-via-html-and-javascript) to get started with the Realtime API via WebRTC.
 - [The Azure-Samples/aisearch-openai-rag-audio repo](https://github.com/Azure-Samples/aisearch-openai-rag-audio) contains an example of how to implement RAG support in applications that use voice as their user interface, powered by the GPT-4o realtime API for audio.
 
diff --git a/articles/ai-foundry/openai/includes/realtime-deploy-model.md b/articles/ai-foundry/openai/includes/realtime-deploy-model.md
@@ -7,12 +7,12 @@ ms.topic: include
 ms.date: 1/21/2025
 ---
 
-To deploy the `gpt-4o-realtime` model in the Azure AI Foundry portal:
+To deploy the `gpt-realtime` model in the Azure AI Foundry portal:
 1. Go to the [Azure AI Foundry portal](https://ai.azure.com/?cid=learnDocs) and create or select your project. 
 1. Select **Models + endpoints** from under **My assets** in the left pane.
 1. Select **+ Deploy model** > **Deploy base model** to open the deployment window. 
-1. Search for and select the `gpt-4o-realtime` model and then select **Confirm**.
+1. Search for and select the `gpt-realtime` model and then select **Confirm**.
 1. Review the deployment details and select **Deploy**.
 1. Follow the wizard to finish deploying the model.
 
-Now that you have a deployment of the `gpt-4o-realtime` model, you can interact with it in the Azure AI Foundry portal **Audio** playground or Realtime API.
+Now that you have a deployment of the `gpt-realtime` model, you can interact with it in the Azure AI Foundry portal **Audio** playground or Realtime API.
diff --git a/articles/ai-foundry/openai/includes/realtime-javascript.md b/articles/ai-foundry/openai/includes/realtime-javascript.md
@@ -12,7 +12,7 @@ ms.date: 3/20/2025
 - An Azure subscription - <a href="https://azure.microsoft.com/free/cognitive-services" target="_blank">Create one for free</a>
 - <a href="https://nodejs.org/" target="_blank">Node.js LTS or ESM support.</a>
 - An Azure OpenAI resource created in one of the supported regions. For more information about region availability, see the [models and versions documentation](../concepts/models.md#global-standard-model-availability).
-- Then, you need to deploy a `gpt-4o-realtime` model with your Azure OpenAI resource. For more information, see [Create a resource and deploy a model with Azure OpenAI](../how-to/create-resource.md). 
+- Then, you need to deploy a `gpt-realtime` model with your Azure OpenAI resource. For more information, see [Create a resource and deploy a model with Azure OpenAI](../how-to/create-resource.md). 
 
 ### Microsoft Entra ID prerequisites
 
@@ -72,7 +72,7 @@ For the recommended keyless authentication with Microsoft Entra ID, you need to:
         // You will need to set these environment variables or edit the following values
         const endpoint = process.env.AZURE_OPENAI_ENDPOINT || "AZURE_OPENAI_ENDPOINT";
         // Required Azure OpenAI deployment name and API version
-        const deploymentName = process.env.AZURE_OPENAI_DEPLOYMENT_NAME || "gpt-4o-realtime";
+        const deploymentName = process.env.AZURE_OPENAI_DEPLOYMENT_NAME || "gpt-realtime";
         const apiVersion = process.env.OPENAI_API_VERSION || "2025-08-28";
         // Keyless authentication 
         const credential = new DefaultAzureCredential();
@@ -91,7 +91,7 @@ For the recommended keyless authentication with Microsoft Entra ID, you need to:
                 type: "session.update",
                 session: {
                     output_modalities: ["text", "audio"],
-                    model: "gpt-4o-realtime",
+                    model: "gpt-realtime",
                 },
             });
             realtimeClient.send({
@@ -155,7 +155,7 @@ For the recommended keyless authentication with Microsoft Entra ID, you need to:
         const endpoint = process.env.AZURE_OPENAI_ENDPOINT || "AZURE_OPENAI_ENDPOINT";
         const apiKey = process.env.AZURE_OPENAI_API_KEY || "Your API key";
         // Required Azure OpenAI deployment name and API version
-        const deploymentName = process.env.AZURE_OPENAI_DEPLOYMENT_NAME || "gpt-4o-realtime";
+        const deploymentName = process.env.AZURE_OPENAI_DEPLOYMENT_NAME || "gpt-realtime";
         const apiVersion = process.env.OPENAI_API_VERSION || "2025-08-28";
         const azureOpenAIClient = new AzureOpenAI({
             apiKey: apiKey,
@@ -170,7 +170,7 @@ For the recommended keyless authentication with Microsoft Entra ID, you need to:
                 type: "session.update",
                 session: {
                     output_modalities: ["text", "audio"],
-                    model: "gpt-4o-realtime",
+                    model: "gpt-realtime",
                 },
             });
             realtimeClient.send({
diff --git a/articles/ai-foundry/openai/includes/realtime-portal.md b/articles/ai-foundry/openai/includes/realtime-portal.md
@@ -13,16 +13,16 @@ ms.date: 3/20/2025
 
 ## Use the GPT-4o real-time audio
 
-To chat with your deployed `gpt-4o-realtime` model in the [Azure AI Foundry](https://ai.azure.com/?cid=learnDocs) **Real-time audio** playground, follow these steps:
+To chat with your deployed `gpt-realtime` model in the [Azure AI Foundry](https://ai.azure.com/?cid=learnDocs) **Real-time audio** playground, follow these steps:
 
-1. Go to the [Azure AI Foundry portal](https://ai.azure.com/?cid=learnDocs) and select your project that has your deployed `gpt-4o-realtime` model.
+1. Go to the [Azure AI Foundry portal](https://ai.azure.com/?cid=learnDocs) and select your project that has your deployed `gpt-realtime` model.
 1. Select **Playgrounds** from the left pane.
 1. Select **Audio playground** > **Try the Audio playground**. 
 
     > [!NOTE]
-    > The **Chat playground** doesn't support the `gpt-4o-realtime` model. Use the **Audio playground** as described in this section.
+    > The **Chat playground** doesn't support the `gpt-realtime` model. Use the **Audio playground** as described in this section.
 
-1. Select your deployed `gpt-4o-realtime` model from the **Deployment** dropdown.
+1. Select your deployed `gpt-realtime` model from the **Deployment** dropdown.
 
     <!--:::image type="content" source="../media/how-to/real-time/real-time-playground.png" alt-text="Screenshot of the audio playground with the deployed model selected." lightbox="../media/how-to/real-time/real-time-playground.png":::-->
 
diff --git a/articles/ai-foundry/openai/includes/realtime-python.md b/articles/ai-foundry/openai/includes/realtime-python.md
@@ -12,7 +12,7 @@ ms.date: 3/20/2025
 - An Azure subscription. <a href="https://azure.microsoft.com/free/ai-services" target="_blank">Create one for free</a>.
 - <a href="https://www.python.org/" target="_blank">Python 3.8 or later version</a>. We recommend using Python 3.10 or later, but having at least Python 3.8 is required. If you don't have a suitable version of Python installed, you can follow the instructions in the [VS Code Python Tutorial](https://code.visualstudio.com/docs/python/python-tutorial#_install-a-python-interpreter) for the easiest way of installing Python on your operating system.
 - An Azure OpenAI resource created in one of the supported regions. For more information about region availability, see the [models and versions documentation](../concepts/models.md#global-standard-model-availability).
-- Then, you need to deploy a `gpt-4o-realtime` model with your Azure OpenAI resource. For more information, see [Create a resource and deploy a model with Azure OpenAI](../how-to/create-resource.md).
+- Then, you need to deploy a `gpt-realtime` model with your Azure OpenAI resource. For more information, see [Create a resource and deploy a model with Azure OpenAI](../how-to/create-resource.md).
 
 ## Microsoft Entra ID prerequisites
 
@@ -112,7 +112,7 @@ For the recommended keyless authentication with Microsoft Entra ID, you need to:
             api_version="2025-08-28",
         )
         async with client.beta.realtime.connect(
-            model="gpt-4o-realtime",  # name of your deployment
+            model="gpt-realtime",  # name of your deployment
         ) as connection:
             await connection.session.update(session={"output_modalities": ["text", "audio"]})  
             while True:
@@ -184,7 +184,7 @@ For the recommended keyless authentication with Microsoft Entra ID, you need to:
             api_version="2025-08-28",
         )
         async with client.beta.realtime.connect(
-            model="gpt-4o-realtime",  # deployment name of your model
+            model="gpt-realtime",  # deployment name of your model
         ) as connection:
             await connection.session.update(session={"output_modalities": ["text", "audio"]})  
             while True:
diff --git a/articles/ai-foundry/openai/includes/realtime-typescript.md b/articles/ai-foundry/openai/includes/realtime-typescript.md
diff --git a/articles/ai-foundry/openai/quotas-limits.md b/articles/ai-foundry/openai/quotas-limits.md
diff --git a/articles/ai-foundry/openai/realtime-audio-quickstart.md b/articles/ai-foundry/openai/realtime-audio-quickstart.md