update

mrbullwinkle · mrbullwinkle · commit 49e543fdd9c5 · 2025-04-15T12:21:58.000-04:00
diff --git a/articles/ai-services/openai/concepts/models.md b/articles/ai-services/openai/concepts/models.md
@@ -38,12 +38,16 @@ Azure OpenAI Service is powered by a diverse set of models with different capabi
 | Model | Region |
 |---|---|
 | `gpt-4.1` (2025-04-14) | East US2 (Global Standard), Sweden Central (Global Standard) |
+| `gpt-4.1-nano` (2025-04-14) | East US2 (Global Standard), Sweden Central (Global Standard) |
 
 ### Capabilities
 
 |  Model ID  | Description | Context Window | Max Output Tokens | Training Data (up to)  |
 |  --- |  :--- |:--- |:---|:---: |
 | `gpt-4.1` (2025-04-14) <br> <br> **Latest model from Azure OpenAI**  | - Text & image input <br> - Text output <br> - Chat completions API <br>- Responses API <br> - Streaming <br> - Function calling <br> Structured outputs (chat completions)   | 1,047,576 | 32,768 | May 31, 2024 |
+| `gpt-4.1-nano` (2025-04-14) <br><br> **Fastest 4.1 model** | - Text & image input <br> - Text output <br> - Chat completions API <br>- Responses API <br> - Streaming <br> - Function calling <br> Structured outputs (chat completions)   | 128,000<sup>*</sup> | 32,768 | May 31, 2024 |
+
+<sup>*</sup> The context window for this model on Azure OpenAI is currently smaller than the context limit for OpenAI. This is expected behavior. We will update this page if this value changes.
 
 ## computer-use-preview
 
diff --git a/articles/ai-services/openai/how-to/function-calling.md b/articles/ai-services/openai/how-to/function-calling.md
@@ -41,6 +41,7 @@ At a high level you can break down working with functions into three steps:
 * `gpt-4o-mini` (`2024-07-18`)
 * `gpt-4.5-preview` (`2025-02-27`)
 * `gpt-4.1` (`2025-14-2025`)
+* `gpt-4.1-nano` (`2025-04-14`)
 
 Support for parallel function was first added in API version [`2023-12-01-preview`](https://github.com/Azure/azure-rest-api-specs/blob/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview/2023-12-01-preview/inference.json)
 
diff --git a/articles/ai-services/openai/how-to/prompt-caching.md b/articles/ai-services/openai/how-to/prompt-caching.md
@@ -32,6 +32,7 @@ Currently only the following models support prompt caching with Azure OpenAI:
 - `gpt-4o-realtime-preview` (version 2024-12-17)
 - `gpt-4o-mini-realtime-preview` (version 2024-12-17)
 - `gpt-4.1-2025-04-14`
+- `gpt-4.1-nano-2025-04-14`
 
 > [!NOTE]
 > Prompt caching is now also available as part of model fine-tuning for `gpt-4o` and `gpt-4o-mini`. Refer to the fine-tuning section of the [pricing page](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) for details.
diff --git a/articles/ai-services/openai/how-to/responses.md b/articles/ai-services/openai/how-to/responses.md
@@ -44,6 +44,7 @@ The responses API is currently available in the following regions:
 - `gpt-4o-mini` (Version: `2024-07-18`)
 - `computer-use-preview`
 - `gpt-4.1` (Version: `2025-04-14`)
+- `gpt-4.1-nano` (Version: `2025-04-14`)
 
 Not every model is available in the regions supported by the responses API. Check the [models page](../concepts/models.md) for model region availability.
 
diff --git a/articles/ai-services/openai/how-to/structured-outputs.md b/articles/ai-services/openai/how-to/structured-outputs.md
@@ -32,6 +32,7 @@ Structured outputs make a model follow a [JSON Schema](https://json-schema.org/o
 - `gpt-4o` version: `2024-08-06`
 - `gpt-4o` version: `2024-11-20`
 - `gpt-4.1` version `2025-04-14`
+- `gpt-4.1-nano` version `2025-04-14`
 
 ## API support
 
diff --git a/articles/ai-services/openai/whats-new.md b/articles/ai-services/openai/whats-new.md
@@ -23,7 +23,7 @@ This article provides a summary of the latest releases and major documentation u
 
 ### GPT-4.1 released
 
-The latest model from Azure OpenAI with a 1 million token context limit. For more information, see the [models page](./concepts/models.md#gpt-41-series).
+GPT-4.1 and GPT-4.1-nano are now available. These are the latest models from Azure OpenAI. GPT-4.1 has a 1 million token context limit. For more information, see the [models page](./concepts/models.md#gpt-41-series).
 
 ## March 2025