Merge pull request #6975 from MicrosoftDocs/main

learn-build-service-prod[bot] · web-flow · commit 5695e618ef0e · 2025-09-08T17:11:47.000Z
Auto Publish – main to live - 2025-09-08 17:07 UTC
diff --git a/articles/ai-foundry/openai/concepts/model-retirements.md b/articles/ai-foundry/openai/concepts/model-retirements.md
@@ -4,7 +4,7 @@ titleSuffix: Azure OpenAI
 description: Learn about model deprecations and retirements in Azure OpenAI.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 08/14/2025
+ms.date: 09/08/2025
 ms.custom: 
 manager: nitinme
 author: mrbullwinkle
diff --git a/articles/ai-foundry/openai/how-to/reasoning.md b/articles/ai-foundry/openai/how-to/reasoning.md
@@ -5,7 +5,7 @@ description: Learn how to use Azure OpenAI's advanced GPT-5 series, o3-mini, o1,
 manager: nitinme
 ms.service: azure-ai-openai
 ms.topic: include
-ms.date: 08/27/2025
+ms.date: 09/08/2025
 author: mrbullwinkle    
 ms.author: mbullwin
 ---
@@ -158,17 +158,16 @@ pip install openai --upgrade
 If you're new to using Microsoft Entra ID for authentication see [How to configure Azure OpenAI in Azure AI Foundry Models with Microsoft Entra ID authentication](../how-to/managed-identity.md).
 
 ```python
-from openai import AzureOpenAI
+from openai import OpenAI
 from azure.identity import DefaultAzureCredential, get_bearer_token_provider
 
 token_provider = get_bearer_token_provider(
     DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
 )
 
-client = AzureOpenAI(
-  azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"), 
-  azure_ad_token_provider=token_provider,
-  api_version="2025-04-01-preview"
+client = OpenAI(  
+  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",  
+  api_key=token_provider,
 )
 
 response = client.chat.completions.create(
@@ -371,17 +370,16 @@ pip install openai --upgrade
 If you're new to using Microsoft Entra ID for authentication see [How to configure Azure OpenAI with Microsoft Entra ID authentication](../how-to/managed-identity.md).
 
 ```python
-from openai import AzureOpenAI
+from openai import OpenAI
 from azure.identity import DefaultAzureCredential, get_bearer_token_provider
 
 token_provider = get_bearer_token_provider(
     DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
 )
 
-client = AzureOpenAI(
-  azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"), 
-  azure_ad_token_provider=token_provider,
-  api_version="2025-04-01-preview"
+client = OpenAI(  
+  base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",  
+  api_key=token_provider,
 )
 
 response = client.chat.completions.create(
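Reviewer sketch (not part of the commit): the client change in the hunks above amounts to a keyword-argument mapping. The version pin is dropped, the token provider moves to `api_key`, and the resource endpoint gains the `/openai/v1/` path. The helper below is hypothetical and only manipulates plain dictionaries. It assumes the old `azure_endpoint` value has the form `https://YOUR-RESOURCE-NAME.openai.azure.com/`, which the diff does not state explicitly.

```python
from urllib.parse import urlsplit


def migrate_client_kwargs(old_kwargs):
    """Map AzureOpenAI(...) keyword arguments to the OpenAI(...) form used in this diff.

    Hypothetical helper for illustration; it does not import or call the openai package.
    """
    endpoint = old_kwargs["azure_endpoint"]
    parts = urlsplit(endpoint)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {endpoint!r}")
    return {
        # Assumption: the v1 base URL is the resource host plus /openai/v1/.
        "base_url": f"{parts.scheme}://{parts.netloc}/openai/v1/",
        # The token provider callable is passed where an API key would go.
        "api_key": old_kwargs["azure_ad_token_provider"],
        # api_version is intentionally not carried over.
    }


def fake_token_provider():
    """Stand-in for get_bearer_token_provider(...); returns a dummy token."""
    return "dummy-token"


old = {
    "azure_endpoint": "https://YOUR-RESOURCE-NAME.openai.azure.com/",
    "azure_ad_token_provider": fake_token_provider,
    "api_version": "2025-04-01-preview",
}
new = migrate_client_kwargs(old)
print(new["base_url"])
```

The resulting dictionary mirrors the arguments shown on the added lines; whether an installed `openai` release accepts a callable for `api_key` depends on its version, so check that before relying on it.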
diff --git a/articles/ai-foundry/openai/how-to/responses.md b/articles/ai-foundry/openai/how-to/responses.md
@@ -5,7 +5,7 @@ description: Learn how to use Azure OpenAI's new stateful Responses API.
 author: mrbullwinkle
 ms.author: mbullwin
 manager: nitinme
-ms.date: 08/27/2025
+ms.date: 09/08/2025
 ms.service: azure-ai-openai
 ms.topic: include
 ms.custom:
@@ -109,21 +109,17 @@ print(response.model_dump_json(indent=2))
 
 # [Python (Microsoft Entra ID)](#tab/python-secure)
 
-> [!NOTE]
-> Full v1 GA support for the OpenAI Python library with Microsoft Entra ID is coming soon. The example below will be replaced once support is added. To learn more, check out the [API lifecycle guide](../api-version-lifecycle.md#api-evolution).
-
 ```python
-from openai import AzureOpenAI
+from openai import OpenAI
 from azure.identity import DefaultAzureCredential, get_bearer_token_provider
 
 token_provider = get_bearer_token_provider(
     DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
 )
 
-client = AzureOpenAI(  
+client = OpenAI(  
   base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",  
-  azure_ad_token_provider=token_provider,
-  api_version="preview"
+  api_key=token_provider,
 )
 
 response = client.responses.create(
@@ -238,21 +234,17 @@ response = client.responses.retrieve("resp_67cb61fa3a448190bcf2c42d96f0d1a8")
 
 # [Python (Microsoft Entra ID)](#tab/python-secure)
 
-> [!NOTE]
-> Full v1 GA support for the OpenAI Python library with Microsoft Entra ID is coming soon. The older preview API example below will be replaced once support is added. To learn more, check out the [API lifecycle guide](../api-version-lifecycle.md#api-evolution).
-
 ```python
-from openai import AzureOpenAI
+from openai import OpenAI
 from azure.identity import DefaultAzureCredential, get_bearer_token_provider
 
 token_provider = get_bearer_token_provider(
     DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
 )
 
-client = AzureOpenAI(  
+client = OpenAI(  
   base_url = "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",  
-  azure_ad_token_provider=token_provider,
-  api_version="preview"
+  api_key=token_provider,
 )
 
 response = client.responses.retrieve("resp_67cb61fa3a448190bcf2c42d96f0d1a8")
diff --git a/articles/ai-foundry/openai/includes/retirement/models.md b/articles/ai-foundry/openai/includes/retirement/models.md
@@ -3,7 +3,7 @@ title: Model Retirement Table
 titleSuffix: Azure OpenAI in Azure AI Foundry Models
 description: Model retirement table for Azure OpenAI in Azure AI Foundry Models.
 manager: nitinme
-ms.date: 08/14/2025
+ms.date: 09/08/2025
 ms.service: azure-ai-openai
 ms.topic: include
 ms.custom: references_regions, build-2025
@@ -17,11 +17,11 @@ ms.custom: references_regions, build-2025
 | Model                     | Version			| Lifecycle<br>Status	| Retirement date                    | Replacement model                    |
 | --------------------------|-------------------|:----------------------|------------------------------------|--------------------------------------|
 | `computer-use-preview`    | 2025-03-11		| Preview               | No earlier than October 10, 2025   |                                      |
-| `gpt-35-turbo`            | 1106				| Generally Available   | No earlier than October  15, 2025  | `gpt-4.1-mini`                       |
-| `gpt-35-turbo`            | 0125				| Generally Available   | No earlier than October  15, 2025  | `gpt-4.1-mini`                       |
-| `gpt-3.5-turbo-instruct`  | 0914				| Generally Available   | No earlier than October  15, 2025  |                                      |
-| `gpt-4`                   | turbo-2024-04-09	| Generally Available   | No earlier than October  15, 2025  | `gpt-4.1`                            |
 | `o1-mini`                 | 2024-09-12		| Generally Available   | No earlier than October 27, 2025   | `o4-mini`                            |
+| `gpt-35-turbo`            | 1106				| Generally Available   | No earlier than November  11, 2025 | `gpt-4.1-mini`                       |
+| `gpt-35-turbo`            | 0125				| Generally Available   | No earlier than November  11, 2025 | `gpt-4.1-mini`                       |
+| `gpt-3.5-turbo-instruct`  | 0914				| Generally Available   | No earlier than November  11, 2025 |                                      |
+| `gpt-4`                   | turbo-2024-04-09	| Generally Available   | No earlier than November  11, 2025 | `gpt-4.1`                            |
 | `gpt-5-chat`              | 2025-08-07		| Preview               | No earlier than November 15, 2025  |                                      |
 | `model-router`            | 2025-05-19		| Preview               | No earlier than November 30, 2025  |                                      |
 | `model-router`            | 2025-08-07        | Preview               | No earlier than November 30, 2025  |                                      |
diff --git a/articles/search/search-agentic-retrieval-how-to-retrieve.md b/articles/search/search-agentic-retrieval-how-to-retrieve.md
@@ -67,7 +67,7 @@ POST https://{{search-url}}/agents/{{agent-name}}/retrieve?api-version=2025-08-0
                 "content" : [
                   { "type" : "text", "text" : "You can answer questions about the Earth at night.
                     Sources have a JSON format with a ref_id that must be cited in the answer.
-                    If you do not have the answer, respond with "I don't know"." }
+                    If you do not have the answer, respond with 'I don't know'." }
                 ]
             },
             {
@@ -77,39 +77,29 @@ POST https://{{search-url}}/agents/{{agent-name}}/retrieve?api-version=2025-08-0
                 ]
             }
         ],
-    "targetIndexParams" :  [
-        { 
-            "indexName" : "{{index-name}}",
-            "filterAddOn" : "page_number eq '105'",
-            "IncludeReferenceSourceData": true, 
-            "rerankerThreshold" : 2.5,
-            "maxDocsForReranker": 50
-        } 
-    ]
+  "knowledgeSourceParams": [
+    {
+      "filterAddOn": null,
+      "knowledgeSourceName": "earth-at-night-blob-ks",
+      "kind": "searchIndex"
+    }
+  ]
 }
 ```
 
 **Key points**:
 
++ The retrieve action targets a [knowledge agent](search-agentic-retrieval-how-to-create.md). The knowledge agent specifies one or more knowledge sources and a knowledge source configuration. Review your knowledge agent definition for output and semantic ranking configuration.
+
 + `messages` articulates the messages sent to the model. The message format is similar to Azure OpenAI APIs.
 
   + `role` defines where the message came from, for example either `assistant` or `user`. The model you use determines which roles are valid.
 
-  + `content` is the message sent to the LLM. It must be text in this preview.
-
-+ `targetIndexParams` provide instructions on the retrieval. Currently in this preview, you can only target a single index. 
-
-  + `filterAddOn` lets you set an [OData filter expression](search-filters.md) for keyword or hybrid search.
+  + `content` is the message or prompt sent to the LLM. It must be text in this preview.
 
-  + `IncludeReferenceSourceData` tells the retrieval engine to return the source content in the response. This value is initially set in the knowledge agent definition. You can override that setting in the retrieve action to return original source content in the [references section](#review-the-references-array) of the response.
++ [`knowledgeSourceParams`](/rest/api/searchservice/knowledge-retrieval/retrieve?view=rest-searchservice-2025-08-01-preview#searchindexknowledgesourceparams&preserve-view=true) is optional. Specify a knowledge source if the agent has more than one, and you want to focus the retrieve action on just one knowledge source. If the knowledge agent has just one knowledge source with the configuration you want, you can omit this section.
 
-  + `rerankerThreshold` and `maxDocsForReranker` are also initially set in the knowledge agent definition as defaults. You can override them in the retrieve action to configure [semantic reranker](semantic-how-to-configure.md), setting minimum thresholds and the maximum number of inputs sent to the reranker.
-
-    `rerankerThreshold` is the minimum semantic reranker score that's acceptable for inclusion in a response. [Reranker scores](semantic-search-overview.md#how-results-are-scored) range from 1 to 4. Plan on revising this value based on testing and what works for your content.
-
-    `maxDocsForReranker` dictates the maximum number of documents to consider for the final response string. Semantic reranker accepts 50 documents. If the maximum is 200, four more subqueries are added to the query plan to ensure all 200 documents are semantically ranked. for semantic ranking. If the number isn't evenly divisible by 50, the query plan rounds up to nearest whole number. 
-
-    The `content` portion of the response consists of the 200 chunks or less, excluding any results that fail to meet the minimum threshold of a 2.5 reranker score.
+  A knowledge source specification on the retrieve action describes the target search index on the search service. So even if the knowledge source "kind" is Azure blob, the valid value here is `searchIndex`. In this first public preview release, `knowledgeSourceParams.kind` is always `searchIndex`.
 
 ## Review the extracted response
 
@@ -133,9 +123,13 @@ The body of the response is also structured in the chat message style format. Cu
 
 **Key points**:
 
-+ `content` is a JSON array. It's a single string composed of the most relevant documents (or chunks) found in the search index, given the query and chat history inputs. This array is your grounding data that a chat completion model uses to formulate a response to the user's question.
++ `content.text` is a JSON array serialized as a single string. It's composed of the most relevant documents (or chunks) found in the search index, given the query and chat history inputs. This string is the grounding data that a chat completion model uses to formulate a response to the user's question.
+
+  This portion of the response consists of 200 chunks or fewer, excluding any results that fail to meet the minimum reranker score threshold of 2.5.
+
+  The string starts with the reference ID of the chunk (used for citation purposes), and any fields specified in the semantic configuration of the target index. In this example, you should assume the semantic configuration in the target index has a "title" field, a "terms" field, and a "content" field.
 
-+ "text" is the only valid value for `type`, and it consists of the reference ID of the chunk (used for citation purposes), and any fields specified in the semantic configuration of the target index. In this example, you should assume the semantic configuration in the target index has a "title" field, a "terms" field, and a "content" field. 
++ `content.type` has one valid value in this preview: `text`. 
 
 > [!NOTE]
 > The `maxOutputSize` property on the [knowledge agent](search-agentic-retrieval-how-to-create.md) determines the length of the string. We recommend 5,000 tokens.
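Reviewer sketch (not part of the commit): the revised retrieve request in `search-agentic-retrieval-how-to-retrieve.md` can be assembled as below. The function name `build_retrieve_body` and the sample question are hypothetical. The field names (`messages`, `knowledgeSourceParams`, `filterAddOn`, `knowledgeSourceName`, `kind`) and the knowledge source name are taken from the added lines. The sketch only builds the JSON body and sends nothing.

```python
import json


def build_retrieve_body(messages, knowledge_source_name=None, filter_add_on=None):
    """Build a retrieve request body from (role, text) pairs.

    knowledgeSourceParams is optional per the article, so it is added only
    when a knowledge source name is given.
    """
    body = {
        "messages": [
            {"role": role, "content": [{"type": "text", "text": text}]}
            for role, text in messages
        ]
    }
    if knowledge_source_name is not None:
        body["knowledgeSourceParams"] = [
            {
                "filterAddOn": filter_add_on,  # None serializes to JSON null
                "knowledgeSourceName": knowledge_source_name,
                # Per the article, kind is always searchIndex in this preview.
                "kind": "searchIndex",
            }
        ]
    return body


body = build_retrieve_body(
    [("user", "Placeholder question about the Earth at night.")],
    knowledge_source_name="earth-at-night-blob-ks",
)
payload = json.dumps(body, indent=2)
print(payload)
```

Calling the function without `knowledge_source_name` yields a body with `messages` only, which matches the case where the knowledge agent has a single knowledge source.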