You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/ai-foundry/concepts/models-featured.md
+5-3Lines changed: 5 additions & 3 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -27,9 +27,9 @@ To perform inferencing with the models, some models such as [Nixtla's TimeGEN-1]
27
27
28
28
The Jamba family models are AI21's production-grade Mamba-based large language model (LLM) which uses AI21's hybrid Mamba-Transformer architecture. It's an instruction-tuned version of AI21's hybrid structured state space model (SSM) transformer Jamba model. The Jamba family models are built for reliable commercial use with respect to quality and performance.
| Model | Offer Availability Region | Hub/Project Region for Deployment | Hub/Project Region for Fine tuning |
29
29
|---------|---------|---------|---------|
30
+
Cohere Command A | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Israel <br> Qatar | East US 2 <br> Sweden Central | Not available |
30
31
Cohere Command R+ 08-2024 | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) |East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
31
32
Cohere Command R 08-2024 | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) |East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
32
33
Cohere Command R+ | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Qatar |East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
33
34
Cohere Command R | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Qatar | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
34
35
Cohere Rerank v3.5 | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Israel <br> Qatar | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
35
36
Cohere Rerank v3 - English | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Qatar | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
36
37
Cohere Rerank v3 - Multilingual | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Qatar | East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
38
+
Cohere Embed 4 | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Israel <br> Qatar | East US 2 <br> Sweden Central | Not available |
37
39
Cohere Embed v3 - English | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Qatar |East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
38
40
Cohere Embed v3 - Multilingual | [Microsoft Managed Countries/Regions](/partner-center/marketplace/tax-details-marketplace#microsoft-managed-countriesregions) <br> Japan <br> Qatar |East US <br> East US 2 <br> North Central US <br> South Central US <br> Sweden Central <br> West US <br> West US 3 | Not available |
Copy file name to clipboardExpand all lines: articles/ai-foundry/model-inference/concepts/models.md
+4-3Lines changed: 4 additions & 3 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -108,10 +108,10 @@ See [this model collection in Azure AI Foundry portal](https://ai.azure.com/expl
108
108
DeepSeek family of models includes DeepSeek-R1, which excels at reasoning tasks using a step-by-step training process, such as language, scientific reasoning, and coding tasks.
109
109
110
110
| Model | Type | Tier | Capabilities |
111
-
| ------ | ---- | --- | ------------ |
111
+
| ------ | ---- | ---- | ------------ |
112
+
|[DeekSeek-V3-0324](https://ai.azure.com/explore/models/deepseek-v3-0324/version/1/registry/azureml-deepseek)| chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (131,072 tokens) <br /> - **Languages:**`en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
112
113
|[DeekSeek-R1](https://ai.azure.com/explore/models/deepseek-r1/version/1/registry/azureml-deepseek)| chat-completion <br /> [(with reasoning content)](../how-to/use-chat-reasoning.md)| Global standard | - **Input:** text (163,840 tokens) <br /> - **Output:** (163,840 tokens) <br /> - **Languages:**`en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text. |
113
114
|[DeekSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) <br />(Legacy) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (131,072 tokens) <br /> - **Languages:**`en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
114
-
|[DeekSeek-V3-0324](https://ai.azure.com/explore/models/deepseek-v3-0324/version/1/registry/azureml-deepseek)| chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (131,072 tokens) <br /> - **Languages:**`en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
115
115
116
116
For a tutorial on DeepSeek-R1, see [Tutorial: Get started with DeepSeek-R1 reasoning model in Azure AI model inference](../tutorials/get-started-deepseek-r1.md).
117
117
@@ -167,12 +167,13 @@ Mistral AI offers two categories of models: premium models including Mistral Lar
167
167
168
168
| Model | Type | Tier | Capabilities |
169
169
| ------ | ---- | --- | ------------ |
170
+
|[Mistral-small-2503](https://ai.azure.com/explore/models/Mistral-small-2503/version/1/registry/azureml-mistral)| chat-completion | Global standard | - **Input:** text (32,768 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
170
171
|[Mistral-Large-2411](https://ai.azure.com/explore/models/Mistral-Large-2411/version/2/registry/azureml-mistral)| chat-completion | Global standard | - **Input:** text (128,000 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:**`en`, `fr`, `de`, `es`, `it`, `zh`, `ja`, `ko`, `pt`, `nl`, and `pl` <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
171
172
|[Codestral-2501](https://ai.azure.com/explore/models/Codestral-2501/version/2/registry/azureml-mistral)| chat-completion | Global standard | - **Input:** text (262,144 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** en <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
172
173
|[Ministral-3B](https://ai.azure.com/explore/models/Ministral-3B/version/1/registry/azureml-mistral)| chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
173
174
|[Mistral-Nemo](https://ai.azure.com/explore/models/Mistral-Nemo/version/1/registry/azureml-mistral)| chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:**`en`, `fr`, `de`, `es`, `it`, `zh`, `ja`, `ko`, `pt`, `nl`, and `pl` <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
Copy file name to clipboardExpand all lines: articles/ai-foundry/model-inference/includes/create-model-deployments/cli.md
+20-23Lines changed: 20 additions & 23 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -40,7 +40,7 @@ To add a model, you first need to identify the model that you want to deploy. Yo
40
40
2. If you have more than 1 subscription, select the subscription where your resource is located:
41
41
42
42
```azurecli
43
-
az account set --subscription $subscriptionId>
43
+
az account set --subscription $subscriptionId
44
44
```
45
45
46
46
3. Set the following environment variables with the name of the Azure AI Services resource you plan to use and resource group.
@@ -98,6 +98,24 @@ To add a model, you first need to identify the model that you want to deploy. Yo
98
98
99
99
You can deploy the same model multiple times if needed as long as it's under a different deployment name. This capability might be useful in case you want to test different configurations for a given model, including content safety.
100
100
101
+
## Use the model
102
+
103
+
Deployed models in Azure AI model inference can be consumed using the [Azure AI model's inference endpoint](../../concepts/endpoints.md) for the resource. When constructing your request, indicate the parameter `model` and insert the model deployment name you have created. You can programmatically get the URI for the inference endpoint using the following code:
104
+
105
+
__Inference endpoint__
106
+
107
+
```azurecli
108
+
az cognitiveservices account show -n $accountName -g $resourceGroupName | jq '.properties.endpoints["Azure AI Model Inference API"]'
109
+
```
110
+
111
+
To make requests to the Azure AI model inference endpoint, append the route `models`, for example `https://<resource>.services.ai.azure.com/models`. You can see the API reference for the endpoint at [Azure AI model inference API reference page](https://aka.ms/azureai/modelinference).
112
+
113
+
__Inference keys__
114
+
115
+
```azurecli
116
+
az cognitiveservices account keys list -n $accountName -g $resourceGroupName
117
+
```
118
+
101
119
## Manage deployments
102
120
103
121
You can see all the deployments available using the CLI:
@@ -124,26 +142,5 @@ You can see all the deployments available using the CLI:
124
142
--deployment-name "Phi-3.5-vision-instruct" \
125
143
-n $accountName \
126
144
-g $resourceGroupName
127
-
```
128
-
129
-
## Use the model
130
-
131
-
Deployed models in Azure AI model inference can be consumed using the [Azure AI model's inference endpoint](../../concepts/endpoints.md) for the resource. When constructing your request, indicate the parameter `model` and insert the model deployment name you have created. You can programmatically get the URI for the inference endpoint using the following code:
132
-
133
-
__Inference endpoint__
134
-
135
-
```azurecli
136
-
az cognitiveservices account show -n $accountName -g $resourceGroupName | jq '.properties.endpoints["Azure AI Model Inference API"]'
137
-
```
138
-
139
-
To make requests to the Azure AI model inference endpoint, append the route `models`, for example `https://<resource>.services.ai.azure.com/models`. You can see the API reference for the endpoint at [Azure AI model inference API reference page](https://aka.ms/azureai/modelinference).
140
-
141
-
__Inference keys__
142
-
143
-
```azurecli
144
-
az cognitiveservices account keys list -n $accountName -g $resourceGroupName
0 commit comments