Commit 8700987

Merge pull request #7369 from mrbullwinkle/mrb_09_30_2025_freshness_001
[Azure OpenAI] Freshness 001
2 parents 27b1e1e + 368547a commit 8700987

File tree

2 files changed: +4 additions, −23 deletions

articles/ai-foundry/openai/concepts/prompt-engineering.md

Lines changed: 2 additions & 19 deletions

```diff
@@ -4,7 +4,7 @@ titleSuffix: Azure OpenAI
 description: Learn how to use prompt engineering to optimize your work with Azure OpenAI.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 09/23/2025
+ms.date: 09/30/2025
 ms.custom: references_regions, build-2023, build-2023-dataai
 manager: nitinme
 author: mrbullwinkle
@@ -102,27 +102,10 @@ Supporting content is information that the model can utilize to influence the ou
 
 ## Scenario-specific guidance
 
-While the principles of prompt engineering can be generalized across many different model types, certain models expect a specialized prompt structure. For Azure OpenAI GPT models, there are currently two distinct APIs where prompt engineering comes into play:
-
-- Chat Completion API.
-- Completion API.
-
-Each API requires input data to be formatted differently, which in turn impacts overall prompt design. The **Chat Completion API** supports the GPT-35-Turbo and GPT-4 models. These models are designed to take input formatted in a [specific chat-like transcript](../how-to/chatgpt.md) stored inside an array of dictionaries.
-
-The **Completion API** supports the older GPT-3 models and has much more flexible input requirements in that it takes a string of text with no specific format rules.
-
 The techniques in this section will teach you strategies for increasing the accuracy and grounding of responses you generate with a Large Language Model (LLM). It is, however, important to remember that even when using prompt engineering effectively you still need to validate the responses the models generate. Just because a carefully crafted prompt worked well for a particular scenario doesn't necessarily mean it will generalize more broadly to certain use cases. Understanding the [limitations of LLMs](/azure/ai-foundry/responsible-ai/openai/transparency-note#limitations), is just as important as understanding how to leverage their strengths.
 
-#### [Chat completion APIs](#tab/chat)
-
 [!INCLUDE [Prompt Chat Completion](../includes/prompt-chat-completion.md)]
 
-#### [Completion APIs](#tab/completion)
-
-[!INCLUDE [Prompt Completion](../includes/prompt-completion.md)]
-
----
-
 ## Best practices
 
 - **Be Specific**. Leave as little to interpretation as possible. Restrict the operational space.
@@ -133,7 +116,7 @@ The techniques in this section will teach you strategies for increasing the accu
 
 ## Space efficiency
 
-While the input size increases with each new generation of GPT models, there will continue to be scenarios that provide more data than the model can handle. GPT models break words into "tokens." While common multi-syllable words are often a single token, less common words are broken in syllables. Tokens can sometimes be counter-intuitive, as shown by the example below which demonstrates token boundaries for different date formats. In this case, spelling out the entire month is more space efficient than a fully numeric date. The current range of token support goes from 2,000 tokens with earlier GPT-3 models to up to 32,768 tokens with the 32k version of the latest GPT-4 model.
+While the input size increases with each new generation of GPT models, there will continue to be scenarios that provide more data than the model can handle. GPT models break words into "tokens." While common multi-syllable words are often a single token, less common words are broken in syllables. Tokens can sometimes be counter-intuitive, as shown by the example below which demonstrates token boundaries for different date formats. In this case, spelling out the entire month is more space efficient than a fully numeric date.
 
 :::image type="content" source="../media/prompt-engineering/space-efficiency.png" alt-text="Screenshot of a string of text with highlighted colors delineating token boundaries." lightbox="../media/prompt-engineering/space-efficiency.png":::
 
```
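The Chat Completion references removed above describe input as a chat-like transcript stored in an array of dictionaries. As a minimal illustrative sketch (not part of the commit; the function name and prompt strings are hypothetical), that structure can be assembled like this:

```python
# Hypothetical helper showing the role/content dictionary format
# that chat-style APIs expect. Role names ("system", "user",
# "assistant") are the standard ones; all text below is illustrative.

def build_chat_prompt(
    system_instructions: str,
    history: list[tuple[str, str]],
    user_message: str,
) -> list[dict]:
    """Assemble a chat transcript as a list of role/content dictionaries."""
    messages = [{"role": "system", "content": system_instructions}]
    for user_turn, assistant_turn in history:
        messages.append({"role": "user", "content": user_turn})
        messages.append({"role": "assistant", "content": assistant_turn})
    messages.append({"role": "user", "content": user_message})
    return messages

messages = build_chat_prompt(
    "You are a helpful assistant. Answer using only the supporting content provided.",
    [("What is prompt engineering?",
      "It is the practice of designing model inputs to improve outputs.")],
    "Give one best practice.",
)
print([m["role"] for m in messages])
```

The list would then be passed as the `messages` payload of a chat completion request; the system message is where the supporting content and instructions discussed in the article typically go.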
articles/ai-foundry/openai/how-to/function-calling.md

Lines changed: 2 additions & 4 deletions

```diff
@@ -7,7 +7,7 @@ ms.author: mbullwin #delegenz
 ms.service: azure-ai-openai
 ms.custom: devx-track-python
 ms.topic: how-to
-ms.date: 09/15/2025
+ms.date: 09/30/2025
 manager: nitinme
 ---
 
@@ -31,9 +31,6 @@ At a high level you can break down working with functions into three steps:
 
 * `gpt-35-turbo` (`1106`)
 * `gpt-35-turbo` (`0125`)
-* `gpt-4` (`1106-Preview`)
-* `gpt-4` (`0125-Preview`)
-* `gpt-4` (`vision-preview`)
 * `gpt-4` (`2024-04-09`)
 * `gpt-4o` (`2024-05-13`)
 * `gpt-4o` (`2024-08-06`)
@@ -44,6 +41,7 @@ At a high level you can break down working with functions into three steps:
 * `gpt-5` (`2025-08-07`)
 * `gpt-5-mini` (`2025-08-07`)
 * `gpt-5-nano` (`2025-08-07`)
+* `gpt-5-codex` (`2025-09-11`)
 
 Support for parallel function was first added in API version [`2023-12-01-preview`](https://github.com/Azure/azure-rest-api-specs/blob/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview/2023-12-01-preview/inference.json)
 
```
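The diff context references the article's three-step function-calling flow. As a minimal sketch under stated assumptions (not part of the commit; `get_current_weather` and the simulated tool-call payload are hypothetical stand-ins, and the actual model round-trip via the SDK is omitted), the flow looks like:

```python
import json

# Step 1: declare a tool schema the model can choose to call.
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_current_weather(city: str) -> dict:
    # Stub implementation; a real app would query a weather service.
    return {"city": city, "temperature_c": 21}

AVAILABLE = {"get_current_weather": get_current_weather}

# Step 2 (simulated here): the model responds with a tool call whose
# arguments arrive as a JSON string.
tool_call = {"name": "get_current_weather",
             "arguments": json.dumps({"city": "Seattle"})}

# Step 3: dispatch the call locally and package the result as a
# "tool" message to send back to the model.
fn = AVAILABLE[tool_call["name"]]
result = fn(**json.loads(tool_call["arguments"]))
tool_message = {"role": "tool", "content": json.dumps(result)}
print(tool_message)
```

Dispatching through an explicit name-to-function table (rather than `eval`) keeps the model from invoking anything the application did not deliberately expose.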
0 commit comments