In this article, you learn how to use Azure AI Studio to deploy the Mistral family of models as serverless APIs with pay-as-you-go token-based billing.
Mistral AI offers two categories of models in [Azure AI Studio](https://ai.azure.com). These models are available in the [model catalog](model-catalog-overview.md):

* __Premium models__: Mistral Large and Mistral Small. These models can be deployed as serverless APIs with pay-as-you-go token-based billing.
* __Open models__: Mixtral-8x7B-Instruct-v01, Mixtral-8x7B-v01, Mistral-7B-Instruct-v01, and Mistral-7B-v01. These models can be deployed to managed computes in your own Azure subscription.

You can browse the Mistral family of models in the model catalog by filtering on the Mistral collection.

## Mistral family of models
Certain models in the model catalog can be deployed as a serverless API with pay-as-you-go billing.

### Prerequisites
- An Azure subscription with a valid payment method. Free or trial Azure subscriptions won't work. If you don't have an Azure subscription, create a [paid Azure account](https://azure.microsoft.com/pricing/purchase-options/pay-as-you-go) to begin.
- An [AI Studio hub](../how-to/create-azure-ai-resource.md). The serverless API model deployment offering for eligible models in the Mistral family is only available with hubs created in these regions:
  - East US
  - East US 2
  - North Central US
  - South Central US
  - West US
  - West US 3
  - Sweden Central

For a list of regions that are available for each of the models supporting serverless API endpoint deployments, see [Region availability for models in serverless API endpoints](deploy-models-serverless-availability.md).
- An [Azure AI Studio project](../how-to/create-projects.md).
- Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Studio. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group. For more information on permissions, see [Role-based access control in Azure AI Studio](../concepts/rbac-ai-studio.md).
To create a deployment:

:::image type="content" source="../media/deploy-monitor/mistral/mistral-large-deploy-pay-as-you-go.png" alt-text="A screenshot showing how to deploy a model as a serverless API." lightbox="../media/deploy-monitor/mistral/mistral-large-deploy-pay-as-you-go.png":::
1. Select the project in which you want to deploy your model. To use the serverless API model deployment offering, your project must belong to one of the regions listed in the [prerequisites](#prerequisites).
1. In the deployment wizard, select the link to **Azure Marketplace Terms** to learn more about the terms of use.
1. Select the **Pricing and terms** tab to learn about pricing for the selected model.
1. Select the **Subscribe and Deploy** button. If this is your first time deploying the model in the project, you have to subscribe your project for the particular offering. This step requires that your account has the **Azure AI Developer role** permissions on the resource group, as listed in the prerequisites. Each project has its own subscription to the particular Azure Marketplace offering of the model, which allows you to control and monitor spending. Currently, you can have only one deployment for each model within a project.
You can consume Mistral family models by using the chat API.

For more information on using the APIs, see the [reference](#reference-for-mistral-family-of-models-deployed-as-a-service) section.
## Reference for Mistral family of models deployed as a service
Mistral models accept both the [Azure AI Model Inference API](../reference/reference-model-inference-api.md) on the route `/chat/completions` and the native [Mistral Chat API](#mistral-chat-api) on `/v1/chat/completions`.
### Azure AI Model Inference API
The [Azure AI Model Inference API](../reference/reference-model-inference-api.md) schema can be found in the [reference for Chat Completions](../reference/reference-model-inference-chat-completions.md) article and an [OpenAPI specification can be obtained from the endpoint itself](../reference/reference-model-inference-api.md?tabs=rest#getting-started).
### Mistral Chat API
Use the method `POST` to send the request to the `/v1/chat/completions` route:
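As an illustration, the request can be assembled in Python with only the standard library. This is a minimal sketch, not the article's own sample: the endpoint URL and API key below are placeholders, and the bearer-token authentication header is an assumption to verify against your deployment's details page.

```python
import json
import urllib.request

# Placeholder values -- substitute your deployment's endpoint URL and key.
ENDPOINT_URL = "https://<your-endpoint>.<region>.inference.ai.azure.com"
API_KEY = "<your-api-key>"

def build_chat_request(messages, temperature=0.7, max_tokens=256):
    """Assemble a POST request for the /v1/chat/completions route."""
    body = json.dumps(
        {"messages": messages, "temperature": temperature, "max_tokens": max_tokens}
    ).encode("utf-8")
    return urllib.request.Request(
        url=ENDPOINT_URL + "/v1/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme: bearer token. Confirm the exact header
            # your serverless API endpoint expects before using this.
            "Authorization": "Bearer " + API_KEY,
        },
        method="POST",
    )

req = build_chat_request([{"role": "user", "content": "Name the capital of France."}])
# urllib.request.urlopen(req) would send the request; it is not executed here.
```

Passing `urllib.request.urlopen(req)` then returns the JSON chat-completion response described later in this section.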
The `messages` object has the following fields:

| Key | Type | Description |
| --- | --- | --- |
|`role`|`string`| The role of the message's author. One of `system`, `user`, or `assistant`. |
#### Request example
__Body__
The `logprobs` object is a dictionary with the following fields:

| Key | Type | Description |
| --- | --- | --- |
|`tokens`|`array` of `string`| Selected tokens. |
|`top_logprobs`|`array` of `dictionary`| Array of dictionary. In each dictionary, the key is the token and the value is the probability. |
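To show how the `tokens` and `top_logprobs` fields fit together, here is a small Python sketch over a hand-made `logprobs` dictionary. The tokens and probability values are invented for illustration and are not real service output.

```python
# Hand-made stand-in shaped like the `logprobs` object described above;
# tokens and probability values are invented for this example.
logprobs = {
    "tokens": ["Paris", "."],
    "top_logprobs": [
        {"Paris": -0.1, "London": -2.9},
        {".": -0.3, "!": -1.8},
    ],
}

# Each entry of top_logprobs is keyed by candidate token, so the selected
# token's probability can be looked up position by position.
selected = [
    alternatives[token]
    for token, alternatives in zip(logprobs["tokens"], logprobs["top_logprobs"])
]
print(selected)  # [-0.1, -0.3]
```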
#### Response example
The following JSON is an example response:
240
248
@@ -261,15 +269,16 @@ The following JSON is an example response:
0 commit comments