
Commit 02b710a

Merge pull request #6397 from mrbullwinkle/mrb_08_05_2025_oss
[Azure OpenAI] [gpt-oss] Models update
2 parents 90779e2 + d6450fe commit 02b710a

2 files changed: 39 additions, 0 deletions

articles/ai-foundry/openai/concepts/models.md

Lines changed: 33 additions & 0 deletions
@@ -22,6 +22,7 @@ Azure OpenAI is powered by a diverse set of models with different capabilities a
 
 | Models | Description |
 |--|--|
+| [gpt-oss](#gpt-oss) | **NEW** open-weight reasoning models |
 | [`codex-mini`](#o-series-models) | Fine-tuned version of o4-mini. |
 | [`GPT-4.1 series`](#gpt-41-series) | The latest model release from Azure OpenAI. |
 | [`model-router`](#model-router) | A model that intelligently selects from a set of underlying chat models to respond to a given prompt. |
@@ -34,6 +35,38 @@ Azure OpenAI is powered by a diverse set of models with different capabilities a
 | [`Image generation`](#image-generation-models) | A series of models that can generate original images from natural language. |
 | [`Audio`](#audio-models) | A series of models for speech to text, translation, and text to speech. GPT-4o audio models support either low latency *speech in, speech out* conversational interactions or audio generation. |
 
+## gpt-oss
+
+### Region availability
+
+| Model | Region |
+|---|---|
+| `gpt-oss-120b` | All Azure OpenAI regions |
+
+### Capabilities
+
+| Model ID | Description | Context Window | Max Output Tokens | Training Data (up to) |
+| --- | :--- |:--- |:---|:---: |
+| `gpt-oss-120b` (Preview) | - Text in/text out only <br> - Chat Completions API <br> - Streaming <br> - Function calling <br> - Structured outputs <br> - Reasoning <br> - Available for deployment<sup>1</sup> and via [managed compute](../../how-to/deploy-models-managed.md) | 131,072 | 131,072 | May 31, 2024 |
+| `gpt-oss-20b` (Preview) | - Text in/text out only <br> - Chat Completions API <br> - Streaming <br> - Function calling <br> - Structured outputs <br> - Reasoning <br> - Available via [managed compute only](../../how-to/deploy-models-managed.md) | 131,072 | 131,072 | May 31, 2024 |
+
+<sup>1</sup> Unlike other Azure OpenAI models `gpt-oss-120b` requires an [Azure AI Foundry project](/azure/ai-foundry/quickstarts/get-started-code?tabs=azure-ai-foundry&pivots=fdp-project) to deploy the model.
+
+### Deploy with code
+
+```cli
+az cognitiveservices account deployment create \
+--name "Foundry-project-resource" \
+--resource-group "test-rg" \
+--deployment-name "gpt-oss-120b" \
+--model-name "gpt-oss-120b" \
+--model-version "1" \
+--model-format "OpenAI-OSS" \
+--sku-capacity 10 \
+--sku-name "GlobalStandard"
+```
+
+
 ## GPT-4.1 series
 
 ### Region availability
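
Review note (not part of the commit): the "Deploy with code" snippet added to `models.md` creates the deployment but does not show how to confirm it. A minimal sketch of a follow-up check is below. It reuses the placeholder resource, resource group, and deployment names from the snippet. The helper name `show_gpt_oss_deployment` is hypothetical, and the `properties.provisioningState` query path is an assumption about the shape of the response.

```shell
#!/usr/bin/env bash
# Placeholder names copied from the create command in the diff; replace with your own.
RESOURCE_NAME="Foundry-project-resource"
RESOURCE_GROUP="test-rg"
DEPLOYMENT_NAME="gpt-oss-120b"

# Hypothetical helper: prints the provisioning state of the deployment.
# Requires a signed-in Azure CLI session (az login).
# The --query path is an assumption about the response shape.
show_gpt_oss_deployment() {
  az cognitiveservices account deployment show \
    --name "$RESOURCE_NAME" \
    --resource-group "$RESOURCE_GROUP" \
    --deployment-name "$DEPLOYMENT_NAME" \
    --query "properties.provisioningState" \
    --output tsv
}
```

Run `show_gpt_oss_deployment` after `az login`. Nothing is executed until the function is called, so the block is safe to source into an existing script.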

articles/ai-foundry/openai/quotas-limits.md

Lines changed: 6 additions & 0 deletions
@@ -73,6 +73,12 @@ The following section provides you with a quick guide to the default quotas and
 
 [!INCLUDE [Quota](./includes/global-batch-limits.md)]
 
+## gpt-oss
+
+| Model | Tokens per minute (TPM) | Requests per minute (RPM) |
+|----------------|-------------------|---------------------------------|
+| `gpt-oss-120b` | 5 M | 5 K |
+
 ## GPT-4 rate limits
 
 ### GPT-4.5 preview Global Standard
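
Review note (not part of the commit): the new `gpt-oss-120b` row lists 5 M TPM and 5 K RPM, and a little arithmetic shows which limit a workload is likely to hit first. The sketch below uses those two values and the 131,072-token context window from `models.md`. It assumes each request counts its full token budget toward TPM, which is a simplification of how the service meters usage.

```shell
#!/usr/bin/env bash
# Default gpt-oss-120b limits from the table added in this diff.
tpm=5000000      # tokens per minute (5 M)
rpm=5000         # requests per minute (5 K)
context=131072   # context window from models.md

# Average request size at which both limits are reached together.
break_even_tokens=$(( tpm / rpm ))

# Requests per minute that fit under TPM if each one uses the full
# context window (integer division rounds down).
max_full_context_rpm=$(( tpm / context ))

echo "break-even tokens per request: ${break_even_tokens}"
echo "full-context requests per minute: ${max_full_context_rpm}"
```

Under these assumptions the break-even point is 1,000 tokens per request. Requests that average less than that reach the RPM limit first. Larger requests reach the TPM limit first, down to about 38 requests per minute at full context.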

0 commit comments
