articles/machine-learning/how-to-deploy-models-mistral.md
manager: scottpolly
ms.service: machine-learning
ms.subservice: inferencing
ms.topic: how-to
ms.date: 04/29/2024
ms.reviewer: kritifaujdar
reviewer: fkriti
ms.author: mopeakande
author: msakande
ms.custom: [references_regions]
#This functionality is also available in Azure AI Studio: /azure/ai-studio/how-to/deploy-models-mistral.md
---
# How to deploy Mistral models with Azure Machine Learning studio
In this article, you learn how to use Azure Machine Learning studio to deploy the Mistral family of models as a service with pay-as-you-go billing.
Mistral AI offers two categories of models in Azure Machine Learning studio:

- __Premium models__: Mistral Large and Mistral Small. These models are available with pay-as-you-go, token-based billing with Models as a Service in the studio model catalog.
- __Open models__: Mixtral-8x7B-Instruct-v01, Mixtral-8x7B-v01, Mistral-7B-Instruct-v01, and Mistral-7B-v01. These models are also available in the studio model catalog and can be deployed to dedicated VM instances in your own Azure subscription with managed online endpoints.

You can browse the Mistral family of models in the [model catalog](concept-model-catalog.md) by filtering on the Mistral collection.
## Mistral family of models
# [Mistral Large](#tab/mistral-large)
Mistral Large is Mistral AI's most advanced Large Language Model (LLM). It can be used on any language-based task, thanks to its state-of-the-art reasoning and knowledge capabilities.
Additionally, Mistral Large is:

- __Specialized in RAG.__ Crucial information isn't lost in the middle of long context windows (up to 32K tokens).
- __Strong in coding.__ Code generation, review, and comments. Supports all mainstream coding languages.
- __Multi-lingual by design.__ Best-in-class performance in French, German, Spanish, and Italian, in addition to English. Dozens of other languages are supported.
- __Responsible AI compliant.__ Efficient guardrails baked into the model, and an extra safety layer with the `safe_mode` option.

# [Mistral Small](#tab/mistral-small)

Mistral Small is Mistral AI's most efficient Large Language Model (LLM). It can be used on any language-based task that requires high efficiency and low latency.
Mistral Small is:
- **A small model optimized for low latency.** Very efficient for high-volume, low-latency workloads. Mistral Small is Mistral's smallest proprietary model; it outperforms Mixtral-8x7B and has lower latency.
- **Specialized in RAG.** Crucial information isn't lost in the middle of long context windows (up to 32K tokens).
- **Strong in coding.** Code generation, review, and comments. Supports all mainstream coding languages.
- **Multi-lingual by design.** Best-in-class performance in French, German, Spanish, Italian, and English. Dozens of other languages are supported.
- **Responsible AI compliant.** Efficient guardrails baked into the model, and an extra safety layer with the `safe_mode` option.

---

## Deploy Mistral family of models with pay-as-you-go
Certain models in the model catalog can be deployed as a service with pay-as-you-go. Pay-as-you-go deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.
**Mistral Large** and **Mistral Small** are eligible to be deployed as a service with pay-as-you-go and are offered by Mistral AI through the Microsoft Azure Marketplace. Mistral AI can change or update the terms of use and pricing of these models.
### Prerequisites
- An Azure subscription with a valid payment method. Free or trial Azure subscriptions won't work. If you don't have an Azure subscription, create a [paid Azure account](https://azure.microsoft.com/pricing/purchase-options/pay-as-you-go) to begin.
- An Azure Machine Learning workspace. If you don't have a workspace, use the steps in the [Quickstart: Create workspace resources](quickstart-create-resources.md) article to create one.
> [!IMPORTANT]
> The pay-as-you-go model deployment offering for eligible models in the Mistral family is only available in workspaces created in the **East US 2** and **Sweden Central** regions. For _Mistral Large_, the pay-as-you-go offering is also available in the **France Central** region.
- Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure Machine Learning. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group. For more information on permissions, see [Manage access to an Azure Machine Learning workspace](how-to-assign-roles.md).
### Create a new deployment
The following steps demonstrate the deployment of Mistral Large, but you can use the same steps to deploy Mistral Small by replacing the model name.
To create a deployment:
1. Go to [Azure Machine Learning studio](https://ml.azure.com/home).
1. Select the workspace in which you want to deploy your models. To use the pay-as-you-go model deployment offering, your workspace must belong to the **East US 2**, **Sweden Central**, or **France Central** region.
1. Choose the model (Mistral-large) that you want to deploy from the [model catalog](https://ml.azure.com/model/catalog).
Alternatively, you can initiate deployment by going to your workspace and selecting **Endpoints** > **Serverless endpoints** > **Create**.
:::image type="content" source="media/how-to-deploy-models-mistral/mistral-deploy-marketplace-terms.png" alt-text="A screenshot showing the terms and conditions of a given model." lightbox="media/how-to-deploy-models-mistral/mistral-deploy-marketplace-terms.png":::
1. Once you subscribe the workspace for the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ workspace don't require subscribing again. If this scenario applies to you, you'll see a **Continue to deploy** option to select.
:::image type="content" source="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png" alt-text="A screenshot showing a project that is already subscribed to the offering." lightbox="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png":::
1. You can always find the endpoint's details, URL, and access keys by navigating to **Workspace** > **Endpoints** > **Serverless endpoints**.
1. Take note of the **Target** URL and the **Secret Key** to call the deployment and generate chat completions using the [`<target_url>/v1/chat/completions`](#chat-api) API.
To learn about billing for Mistral models deployed with pay-as-you-go, see [Cost and quota considerations for Mistral family of models deployed as a service](#cost-and-quota-considerations-for-mistral-family-of-models-deployed-as-a-service).
### Consume the Mistral family of models as a service
You can consume Mistral Large by using the chat API.
1. In the **workspace**, select **Endpoints** > **Serverless endpoints**.
1. Find and select the deployment you created.
1. Copy the **Target** URL and the **Key** token values.
1. Make an API request using the [`<target_url>/v1/chat/completions`](#chat-api) API.
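
The request in the last step can be sketched in code. The following Python example uses only the standard library to build the call; the endpoint URL and key are placeholders for your deployment's **Target** URL and **Key** values, and the bearer-token authorization header is an assumption to verify against your endpoint's details in the studio.

```python
import json
import urllib.request

def build_chat_request(endpoint: str, api_key: str,
                       messages: list, **params) -> urllib.request.Request:
    """Build an HTTP request for the serverless chat completions API."""
    payload = {"messages": messages, **params}
    return urllib.request.Request(
        url=f"{endpoint}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; check the endpoint's consume tab.
            "Authorization": f"Bearer {api_key}",
        },
    )

# Placeholders: substitute your deployment's Target URL and Key, found
# under Endpoints > Serverless endpoints in the studio.
request = build_chat_request(
    "https://<your-endpoint>.<region>.inference.ml.azure.com",
    "<your-key>",
    [{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=50,
)

# Sending the request requires a live endpoint:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
#     print(result["choices"][0]["message"]["content"])
```

The network call is left commented out so you can inspect the request before sending it against a real deployment.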
For more information on using the APIs, see the [reference](#reference-for-mistral-family-of-models-deployed-as-a-service) section.
### Reference for Mistral family of models deployed as a service
#### Chat API

Payload is a JSON-formatted string containing the following parameters:

|`stream`|`boolean`|`False`| Streaming allows the generated tokens to be sent as data-only server-sent events whenever they become available. |
|`max_tokens`|`integer`|`8192`| The maximum number of tokens to generate in the completion. The token count of your prompt plus `max_tokens` can't exceed the model's context length. |
|`top_p`|`float`|`1`| An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with `top_p` probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering `top_p` or `temperature`, but not both. |
|`temperature`|`float`|`1`| The sampling temperature to use, between 0 and 2. Higher values mean the model samples more broadly the distribution of tokens. Zero means greedy sampling. We recommend altering this parameter or `top_p`, but not both. |
|`ignore_eos`|`boolean`|`False`| Whether to ignore the EOS token and continue generating tokens after the EOS token is generated. |
|`safe_prompt`|`boolean`|`False`| Whether to inject a safety prompt before all conversations. |
#### Example

The following JSON is an abbreviated example response; the field values are illustrative:

```json
{
    "id": "chatcmpl-7g9Js0vJv3mY",
    "object": "chat.completion",
    "created": 1713000000,
    "model": "mistral-large",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Paris is the capital of France."
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 8,
        "total_tokens": 20
    }
}
```

## Cost and quotas
### Cost and quota considerations for Mistral family of models deployed as a service
Mistral models deployed as a service are offered by Mistral AI through Azure Marketplace and integrated with Azure Machine Learning studio for use. You can find Azure Marketplace pricing when deploying the models.