Merge pull request #6397 from mrbullwinkle/mrb_08_05_2025_oss

prmerger-automator[bot] · web-flow · commit 02b710a2ac2b · 2025-08-06T04:02:12.000Z
[Azure OpenAI] [gpt-oss] Models update
diff --git a/articles/ai-foundry/openai/concepts/models.md b/articles/ai-foundry/openai/concepts/models.md
@@ -22,6 +22,7 @@ Azure OpenAI is powered by a diverse set of models with different capabilities a
 
 | Models | Description |
 |--|--|
+| [`gpt-oss`](#gpt-oss) | **NEW** open-weight reasoning models. |
 | [`codex-mini`](#o-series-models) | Fine-tuned version of o4-mini. |  
 | [`GPT-4.1 series`](#gpt-41-series) | The latest model release from Azure OpenAI. |
 | [`model-router`](#model-router) | A model that intelligently selects from a set of underlying chat models to respond to a given prompt. |
@@ -34,6 +35,38 @@ Azure OpenAI is powered by a diverse set of models with different capabilities a
 | [`Image generation`](#image-generation-models) | A series of models that can generate original images from natural language. |
 | [`Audio`](#audio-models) | A series of models for speech to text, translation, and text to speech. GPT-4o audio models support either low latency *speech in, speech out* conversational interactions or audio generation. |
 
+## gpt-oss
+
+### Region availability
+
+| Model | Region |
+|---|---|
+| `gpt-oss-120b`  | All Azure OpenAI regions |
+
+### Capabilities
+
+|  Model ID  | Description | Context Window | Max Output Tokens | Training Data (up to)  |
+|  --- |  :--- |:--- |:---|:---: |
+| `gpt-oss-120b` (Preview)   | - Text in/text out only <br> - Chat Completions API <br> - Streaming <br> - Function calling <br> - Structured outputs <br> - Reasoning <br> - Available for deployment<sup>1</sup> and via [managed compute](../../how-to/deploy-models-managed.md)  | 131,072 | 131,072 | May 31, 2024 |
+| `gpt-oss-20b` (Preview) | - Text in/text out only <br> - Chat Completions API <br> - Streaming <br> - Function calling <br> - Structured outputs <br> - Reasoning <br> - Available via [managed compute only](../../how-to/deploy-models-managed.md) | 131,072 | 131,072 | May 31, 2024 |
+
+<sup>1</sup> Unlike other Azure OpenAI models, `gpt-oss-120b` requires an [Azure AI Foundry project](/azure/ai-foundry/quickstarts/get-started-code?tabs=azure-ai-foundry&pivots=fdp-project) for deployment.
+
+### Deploy with code
+
+```cli
+az cognitiveservices account deployment create \
+  --name "Foundry-project-resource" \
+  --resource-group "test-rg" \
+  --deployment-name "gpt-oss-120b" \
+  --model-name "gpt-oss-120b" \
+  --model-version "1" \
+  --model-format "OpenAI-OSS" \
+  --sku-capacity 10 \
+  --sku-name "GlobalStandard"
+```
+
+
 ## GPT-4.1 series
 
 ### Region availability
diff --git a/articles/ai-foundry/openai/quotas-limits.md b/articles/ai-foundry/openai/quotas-limits.md
@@ -73,6 +73,12 @@ The following section provides you with a quick guide to the default quotas and
 
 [!INCLUDE [Quota](./includes/global-batch-limits.md)]
 
+## gpt-oss
+
+| Model          | Tokens per minute (TPM) | Requests per minute (RPM) |
+|----------------|-------------------|---------------------------------|
+| `gpt-oss-120b` | 5 M               | 5 K                             |
+
 ## GPT-4 rate limits
 
 ### GPT-4.5 preview Global Standard
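
Note on the new `gpt-oss-120b` quota row: the two limits (5 M TPM, 5 K RPM) interact, and which one binds depends on how many tokens a typical request uses. The sketch below works through that arithmetic. It assumes both limits behave as simple per-minute averages, which may not match how the service actually meters or enforces them; `binding_limit` and `BREAK_EVEN_TOKENS` are hypothetical names used only for illustration.

```python
# Limits copied from the gpt-oss-120b row added to quotas-limits.md in this diff.
TPM_LIMIT = 5_000_000  # tokens per minute
RPM_LIMIT = 5_000      # requests per minute

# Average request size at which both limits are reached at the same time.
BREAK_EVEN_TOKENS = TPM_LIMIT // RPM_LIMIT


def binding_limit(avg_tokens_per_request: int) -> tuple[str, int]:
    """Return which limit is hit first and the sustainable requests per minute.

    Assumption: limits are treated as plain per-minute averages.
    """
    if avg_tokens_per_request <= 0:
        raise ValueError("avg_tokens_per_request must be positive")
    # Requests per minute that the token budget alone would allow.
    rpm_allowed_by_tokens = TPM_LIMIT // avg_tokens_per_request
    if rpm_allowed_by_tokens < RPM_LIMIT:
        return "TPM", rpm_allowed_by_tokens
    return "RPM", RPM_LIMIT
```

Under these assumptions, requests averaging more than `BREAK_EVEN_TOKENS` tokens are constrained by TPM, and smaller requests are constrained by RPM.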