
Commit 8a605fd

Merge pull request #4143 from MicrosoftDocs/main
4/15/2025 AM Publish
2 parents cdb903c + acdcb86 commit 8a605fd

18 files changed: +73 -56 lines changed


articles/ai-foundry/concepts/models-featured.md

Lines changed: 5 additions & 3 deletions
@@ -27,9 +27,9 @@ To perform inferencing with the models, some models such as [Nixtla's TimeGEN-1]
 
 The Jamba family models are AI21's production-grade Mamba-based large language models (LLMs), which use AI21's hybrid Mamba-Transformer architecture. They're instruction-tuned versions of AI21's hybrid structured state space model (SSM) transformer Jamba model and are built for reliable commercial use with respect to quality and performance.
 
-| Model | Type | Capabilities |
-| ------ | ---- | --- |
-| [AI21-Jamba-1.5-Mini](https://ai.azure.com/explore/models/AI21-Jamba-1.5-Mini/version/1/registry/azureml-ai21) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (262,144 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs | |
+| Model | Type | Capabilities |
+| ------ | ---- | --- |
+| [AI21-Jamba-1.5-Mini](https://ai.azure.com/explore/models/AI21-Jamba-1.5-Mini/version/1/registry/azureml-ai21) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (262,144 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
 | [AI21-Jamba-1.5-Large](https://ai.azure.com/explore/models/AI21-Jamba-1.5-Large/version/1/registry/azureml-ai21) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (262,144 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
 
@@ -68,10 +68,12 @@ The following table lists the Cohere models that you can inference via the Azur
 
 | Model | Type | Capabilities |
 | ------ | ---- | --- |
+| [Cohere-command-A](https://aka.ms/aistudio/landing/cohere-command-a) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (256,000 tokens) <br /> - **Output:** text (8,000 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text |
 | [Cohere-command-r-plus-08-2024](https://ai.azure.com/explore/models/Cohere-command-r-plus-08-2024/version/1/registry/azureml-cohere) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 | [Cohere-command-r-08-2024](https://ai.azure.com/explore/models/Cohere-command-r-08-2024/version/1/registry/azureml-cohere) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 | [Cohere-command-r-plus](https://ai.azure.com/explore/models/Cohere-command-r-plus/version/1/registry/azureml-cohere) <br> (deprecated) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 | [Cohere-command-r](https://ai.azure.com/explore/models/Cohere-command-r/version/1/registry/azureml-cohere) <br> (deprecated)| [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
+| [Cohere-embed-4](https://aka.ms/aistudio/landing/cohere-embed-4) | [embeddings](../model-inference/how-to/use-embeddings.md?context=/azure/ai-foundry/context/context) <br /> [image-embeddings](../model-inference/how-to/use-image-embeddings.md?context=/azure/ai-foundry/context/context) | - **Input:** image, text <br /> - **Output:** image, text (128,000 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** image, text |
 | [Cohere-embed-v3-english](https://ai.azure.com/explore/models/Cohere-embed-v3-english/version/1/registry/azureml-cohere) | [embeddings](../model-inference/how-to/use-embeddings.md?context=/azure/ai-foundry/context/context) <br /> [image-embeddings](../model-inference/how-to/use-image-embeddings.md?context=/azure/ai-foundry/context/context) | - **Input:** text (512 tokens) <br /> - **Output:** Vector (1,024 dim.) |
 | [Cohere-embed-v3-multilingual](https://ai.azure.com/explore/models/Cohere-embed-v3-multilingual/version/1/registry/azureml-cohere) | [embeddings](../model-inference/how-to/use-embeddings.md?context=/azure/ai-foundry/context/context) <br /> [image-embeddings](../model-inference/how-to/use-image-embeddings.md?context=/azure/ai-foundry/context/context) | - **Input:** text (512 tokens) <br /> - **Output:** Vector (1,024 dim.) |

articles/ai-foundry/includes/region-availability-maas.md

Lines changed: 2 additions & 0 deletions
@@ -27,13 +27,15 @@ Bria-2.3-Fast | [Microsoft Managed Countries/Regions](/partner-center/marketpl
 
 | Model | Offer Availability Region | Hub/Project Region for Deployment | Hub/Project Region for Fine tuning |
 |---------|---------|---------|---------|
+Cohere Command A | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Israel <br> Qatar | East US 2 <br> Sweden Central | Not available |
 Cohere Command R+ 08-2024 | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) |East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
 Cohere Command R 08-2024 | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) |East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
 Cohere Command R+ | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Qatar |East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
 Cohere Command R | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Qatar | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
 Cohere Rerank v3.5 | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Israel <br> Qatar | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
 Cohere Rerank v3 - English | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Qatar | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
 Cohere Rerank v3 - Multilingual | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Qatar | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
+Cohere Embed 4 | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Israel <br> Qatar | East US 2 <br> Sweden Central | Not available |
 Cohere Embed v3 - English | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Qatar |East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
 Cohere Embed v3 - Multilingual | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Qatar |East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |

articles/ai-foundry/model-inference/concepts/models.md

Lines changed: 4 additions & 3 deletions
@@ -108,10 +108,10 @@ See [this model collection in Azure AI Foundry portal](https://ai.azure.com/expl
 
 The DeepSeek family of models includes DeepSeek-R1, which uses a step-by-step training process to excel at reasoning tasks such as language, scientific reasoning, and coding.
 
 | Model | Type | Tier | Capabilities |
-| ------ | ---- | --- | ------------ |
+| ------ | ---- | ---- | ------------ |
+| [DeepSeek-V3-0324](https://ai.azure.com/explore/models/deepseek-v3-0324/version/1/registry/azureml-deepseek) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (131,072 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
 | [DeepSeek-R1](https://ai.azure.com/explore/models/deepseek-r1/version/1/registry/azureml-deepseek) | chat-completion <br /> [(with reasoning content)](../how-to/use-chat-reasoning.md) | Global standard | - **Input:** text (163,840 tokens) <br /> - **Output:** (163,840 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text. |
 | [DeepSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) <br />(Legacy) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (131,072 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
-| [DeepSeek-V3-0324](https://ai.azure.com/explore/models/deepseek-v3-0324/version/1/registry/azureml-deepseek) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (131,072 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
 
 For a tutorial on DeepSeek-R1, see [Tutorial: Get started with DeepSeek-R1 reasoning model in Azure AI model inference](../tutorials/get-started-deepseek-r1.md).

@@ -167,12 +167,13 @@ Mistral AI offers two categories of models: premium models including Mistral Lar
 
 | Model | Type | Tier | Capabilities |
 | ------ | ---- | --- | ------------ |
+| [Mistral-small-2503](https://ai.azure.com/explore/models/Mistral-small-2503/version/1/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (32,768 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 | [Mistral-Large-2411](https://ai.azure.com/explore/models/Mistral-Large-2411/version/2/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (128,000 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** `en`, `fr`, `de`, `es`, `it`, `zh`, `ja`, `ko`, `pt`, `nl`, and `pl` <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 | [Codestral-2501](https://ai.azure.com/explore/models/Codestral-2501/version/2/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (262,144 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** en <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
 | [Ministral-3B](https://ai.azure.com/explore/models/Ministral-3B/version/1/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 | [Mistral-Nemo](https://ai.azure.com/explore/models/Mistral-Nemo/version/1/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** `en`, `fr`, `de`, `es`, `it`, `zh`, `ja`, `ko`, `pt`, `nl`, and `pl` <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 | [Mistral-large-2407](https://ai.azure.com/explore/models/Mistral-large-2407/version/1/registry/azureml-mistral) <br> (deprecated) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** `en`, `fr`, `de`, `es`, `it`, `zh`, `ja`, `ko`, `pt`, `nl`, and `pl` <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
-| [Mistral-small](https://ai.azure.com/explore/models/Mistral-small/version/1/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (32,768 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
+| [Mistral-small](https://ai.azure.com/explore/models/Mistral-small/version/1/registry/azureml-mistral) <br> (deprecated) | chat-completion | Global standard | - **Input:** text (32,768 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 | [Mistral-large](https://ai.azure.com/explore/models/Mistral-large/version/1/registry/azureml-mistral) <br> (deprecated) | chat-completion | Global standard | - **Input:** text (32,768 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 
 See [this model collection in Azure AI Foundry portal](https://ai.azure.com/explore/models?&selectedCollection=mistral).

articles/ai-foundry/model-inference/includes/create-model-deployments/cli.md

Lines changed: 20 additions & 23 deletions
@@ -40,7 +40,7 @@ To add a model, you first need to identify the model that you want to deploy. Yo
 
 2. If you have more than 1 subscription, select the subscription where your resource is located:
 
 ```azurecli
-az account set --subscription $subscriptionId>
+az account set --subscription $subscriptionId
 ```
 
 3. Set the following environment variables with the name of the Azure AI Services resource you plan to use and its resource group.
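
As an editorial aside on step 3 above (the variable assignments themselves are truncated out of this hunk): the rest of this include relies on `$accountName` and `$resourceGroupName`. A minimal sketch of what those variables might look like follows; the values are placeholders, not taken from the article.

```azurecli
# Placeholder values; substitute your own Azure AI Services resource and resource group names.
accountName="my-ai-services-resource"
resourceGroupName="my-resource-group"
```
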
@@ -98,6 +98,24 @@ To add a model, you first need to identify the model that you want to deploy. Yo
 
 You can deploy the same model multiple times if needed as long as it's under a different deployment name. This capability might be useful in case you want to test different configurations for a given model, including content safety.
 
+## Use the model
+
+Deployed models in Azure AI model inference can be consumed using the [Azure AI model's inference endpoint](../../concepts/endpoints.md) for the resource. When constructing your request, indicate the parameter `model` and insert the model deployment name you have created. You can programmatically get the URI for the inference endpoint using the following code:
+
+__Inference endpoint__
+
+```azurecli
+az cognitiveservices account show -n $accountName -g $resourceGroupName | jq '.properties.endpoints["Azure AI Model Inference API"]'
+```
+
+To make requests to the Azure AI model inference endpoint, append the route `models`, for example `https://<resource>.services.ai.azure.com/models`. You can see the API reference for the endpoint at [Azure AI model inference API reference page](https://aka.ms/azureai/modelinference).
+
+__Inference keys__
+
+```azurecli
+az cognitiveservices account keys list -n $accountName -g $resourceGroupName
+```
+
 ## Manage deployments
 
 You can see all the deployments available using the CLI:
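
To illustrate the "Use the model" section added in the hunk above: once you have the endpoint URI and an inference key, a chat completions request to the `models` route carries the deployment name in the `model` field of the payload. The sketch below is not part of this commit; the `api-version` value, the key-based `api-key` header, and the payload shape are assumptions based on the Azure AI model inference API, so verify them against the API reference linked above.

```bash
# Illustrative only. Replace the endpoint and key with the values returned by the
# commands shown in the "Use the model" section above.
ENDPOINT="https://<resource>.services.ai.azure.com/models"
API_KEY="<your-inference-key>"

# "model" is the deployment name you created, for example "Phi-3.5-vision-instruct".
curl -sS -X POST "$ENDPOINT/chat/completions?api-version=2024-05-01-preview" \
  -H "Content-Type: application/json" \
  -H "api-key: $API_KEY" \
  -d '{
        "model": "Phi-3.5-vision-instruct",
        "messages": [{ "role": "user", "content": "Hello!" }]
      }'
```
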
@@ -124,26 +142,5 @@ You can see all the deployments available using the CLI:
 --deployment-name "Phi-3.5-vision-instruct" \
 -n $accountName \
 -g $resourceGroupName
-```
-
-## Use the model
-
-Deployed models in Azure AI model inference can be consumed using the [Azure AI model's inference endpoint](../../concepts/endpoints.md) for the resource. When constructing your request, indicate the parameter `model` and insert the model deployment name you have created. You can programmatically get the URI for the inference endpoint using the following code:
-
-__Inference endpoint__
-
-```azurecli
-az cognitiveservices account show -n $accountName -g $resourceGroupName | jq '.properties.endpoints["Azure AI Model Inference API"]'
-```
-
-To make requests to the Azure AI model inference endpoint, append the route `models`, for example `https://<resource>.services.ai.azure.com/models`. You can see the API reference for the endpoint at [Azure AI model inference API reference page](https://aka.ms/azureai/modelinference).
-
-__Inference keys__
-
-```azurecli
-az cognitiveservices account keys list -n $accountName -g $resourceGroupName
-```
-
+```
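
Related to the earlier note in this include about deploying the same model more than once under different deployment names, a sketch of what an additional deployment could look like follows. This example is not part of the commit; the model name, version, format, and SKU values are illustrative assumptions, so substitute the values for your own model.

```azurecli
# Hypothetical example: create a second deployment of a model you already deployed,
# under a new deployment name (for example, to test a different content safety configuration).
az cognitiveservices account deployment create \
    --deployment-name "Phi-3.5-vision-instruct-strict" \
    --model-name "Phi-3.5-vision-instruct" \
    --model-version "2" \
    --model-format "Microsoft" \
    --sku-name "GlobalStandard" \
    --sku-capacity 1 \
    -n $accountName \
    -g $resourceGroupName
```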

articles/ai-services/openai/concepts/models.md

Lines changed: 3 additions & 0 deletions
@@ -38,12 +38,15 @@ Azure OpenAI Service is powered by a diverse set of models with different capabi
 
 | Model | Region |
 |---|---|
 | `gpt-4.1` (2025-04-14) | East US2 (Global Standard), Sweden Central (Global Standard) |
+| `gpt-4.1-nano` (2025-04-14) | East US2 (Global Standard), Sweden Central (Global Standard)|
 
 ### Capabilities
 
 | Model ID | Description | Context Window | Max Output Tokens | Training Data (up to) |
 | --- | :--- |:--- |:---|:---: |
 | `gpt-4.1` (2025-04-14) <br> <br> **Latest model from Azure OpenAI** | - Text & image input <br> - Text output <br> - Chat completions API <br>- Responses API <br> - Streaming <br> - Function calling <br> Structured outputs (chat completions) | 1,047,576 | 32,768 | May 31, 2024 |
+| `gpt-4.1-nano` (2025-04-14) <br><br> **Fastest 4.1 model** | - Text & image input <br> - Text output <br> - Chat completions API <br>- Responses API <br> - Streaming <br> - Function calling <br> Structured outputs (chat completions) | 1,047,576 | 32,768 | May 31, 2024 |
+
 
 ## computer-use-preview

articles/ai-services/openai/how-to/embeddings.md

Lines changed: 1 addition & 1 deletion
@@ -57,7 +57,7 @@ string oaiKey = "YOUR_API_KEY";
 
 AzureKeyCredential credentials = new (oaiKey);
 
-OpenAIClient openAIClient = new (oaiEndpoint, credentials);
+AzureOpenAIClient openAIClient = new (oaiEndpoint, credentials);
 
 EmbeddingsOptions embeddingOptions = new()
 {

articles/ai-services/openai/how-to/function-calling.md

Lines changed: 3 additions & 1 deletion
@@ -7,7 +7,7 @@ ms.author: mbullwin #delegenz
 ms.service: azure-ai-openai
 ms.custom: devx-track-python
 ms.topic: how-to
-ms.date: 02/28/2025
+ms.date: 04/14/2025
 manager: nitinme
 ---

@@ -40,6 +40,8 @@ At a high level you can break down working with functions into three steps:
 * `gpt-4o` (`2024-11-20`)
 * `gpt-4o-mini` (`2024-07-18`)
 * `gpt-4.5-preview` (`2025-02-27`)
+* `gpt-4.1` (`2025-04-14`)
+* `gpt-4.1-nano` (`2025-04-14`)
 
 Support for parallel function calling was first added in API version [`2023-12-01-preview`](https://github.com/Azure/azure-rest-api-specs/blob/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview/2023-12-01-preview/inference.json)
