articles/machine-learning/how-to-deploy-models-mistral.md (38 additions, 49 deletions)
manager: scottpolly
ms.service: machine-learning
ms.subservice: inferencing
ms.topic: how-to
ms.date: 02/23/2024
ms.reviewer: shubhiraj
reviewer: shubhirajMsft
ms.author: mopeakande
author: msakande
ms.custom: [references_regions]

#This functionality is also available in Azure AI Studio: /azure/ai-studio/how-to/deploy-models-mistral.md
---
# How to deploy Mistral models with Azure Machine Learning studio
Mistral AI offers two categories of models in Azure Machine Learning studio:

- Premium models: Mistral Large. These models are available with pay-as-you-go, token-based billing with Models as a Service in the studio model catalog.
- Open models: Mixtral-8x7B-Instruct-v01, Mixtral-8x7B-v01, Mistral-7B-Instruct-v01, and Mistral-7B-v01. These models are also available in the Azure Machine Learning studio model catalog and can be deployed to dedicated VM instances in your own Azure subscription with managed online endpoints.

You can browse the Mistral family of models in the [model catalog](concept-model-catalog.md) by filtering on the Mistral collection.

## Mistral Large

In this article, you learn how to use Azure Machine Learning studio to deploy the Mistral Large model as a service with pay-as-you-go billing.
Mistral Large is Mistral AI's most advanced Large Language Model (LLM). It can be used on any language-based task, thanks to its state-of-the-art reasoning and knowledge capabilities.

Additionally, Mistral Large is:

- Specialized in RAG. Crucial information isn't lost in the middle of long context windows (up to 32K tokens).
- Strong in coding. Code generation, review, and comments. Supports all mainstream coding languages.
- Multi-lingual by design. Best-in-class performance in French, German, Spanish, and Italian, in addition to English. Dozens of other languages are supported.
- Responsible AI. Efficient guardrails are baked into the model, and an extra safety layer is available through the `safe_mode` option.

## Deploy Mistral Large with pay-as-you-go

Certain models in the model catalog can be deployed as a service with pay-as-you-go, providing a way to consume them as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.

Mistral Large can be deployed as a service with pay-as-you-go and is offered by Mistral AI through the Microsoft Azure Marketplace. Mistral AI can change or update the terms of use and pricing of this model.

### Azure Marketplace model offerings

The following models are available in Azure Marketplace for Mistral AI when deployed as a service with pay-as-you-go:

- Mistral Large (preview)

### Prerequisites

- An Azure subscription with a valid payment method. Free or trial Azure subscriptions won't work. If you don't have an Azure subscription, create a [paid Azure account](https://azure.microsoft.com/pricing/purchase-options/pay-as-you-go) to begin.
- An Azure Machine Learning workspace. If you don't have a workspace, use the steps in the [Quickstart: Create workspace resources](quickstart-create-resources.md) article to create one.

> [!IMPORTANT]
> The pay-as-you-go model deployment offering is only available in workspaces created in the **East US 2** and **France Central** regions.

- Azure role-based access controls (Azure RBAC) are used to grant access to operations. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group.

For more information on permissions, see [Manage access to an Azure Machine Learning workspace](how-to-assign-roles.md).

### Create a new deployment

To create a deployment:

1. Go to [Azure Machine Learning studio](https://ml.azure.com/home).
1. Select the workspace in which you want to deploy your models. To use the pay-as-you-go model deployment offering, your workspace must belong to the **East US 2** or **France Central** region.
1. Choose the model (Mistral-large) that you want to deploy from the [model catalog](https://ml.azure.com/model/catalog).

Alternatively, you can initiate deployment by going to your workspace and selecting **Endpoints** > **Serverless endpoints** > **Create**.

    :::image type="content" source="media/how-to-deploy-models-mistral/mistral-deploy-marketplace-terms.png" alt-text="A screenshot showing the terms and conditions of a given model." lightbox="media/how-to-deploy-models-mistral/mistral-deploy-marketplace-terms.png":::

1. Once you subscribe the workspace for the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ workspace don't require subscribing again. If this scenario applies to you, you'll see a **Continue to deploy** option to select.

    :::image type="content" source="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png" alt-text="A screenshot showing a project that is already subscribed to the offering." lightbox="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png":::
1. You can always find the endpoint's details, URL, and access keys by navigating to **Workspace** > **Endpoints** > **Serverless endpoints**.
1. Take note of the **Target** URL and the **Secret Key** to call the deployment and generate chat completions using the [`<target_url>/v1/chat/completions`](#chat-api) API.

To learn about billing for Mistral models deployed with pay-as-you-go, see [Cost and quota considerations for Mistral Large deployed as a service](#cost-and-quota-considerations-for-mistral-large-deployed-as-a-service).

### Consume the Mistral Large model as a service

You can consume Mistral Large by using the chat API.

1. In the **workspace**, select **Endpoints** > **Serverless endpoints**.
1. Find and select the deployment you created.
1. Copy the **Target** URL and the **Key** token values.
1. Make an API request using the [`<target_url>/v1/chat/completions`](#chat-api) API.
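
As an illustrative sketch of these steps, the following Python snippet builds a request to the chat completions API using only the standard library. The endpoint URL and key shown are placeholders for your deployment's **Target** URL and **Key**, and the bearer-token authorization header is an assumption; check your endpoint's details page for the exact header it expects.

```python
import json
import urllib.request


def build_chat_request(target_url: str, api_key: str, messages: list) -> urllib.request.Request:
    """Build a POST request for the serverless chat completions API."""
    payload = {
        "messages": messages,
        "temperature": 0.7,
        "max_tokens": 256,
    }
    return urllib.request.Request(
        url=f"{target_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the key is sent as a bearer token; your endpoint's
            # details page shows the exact authorization header to use.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder values: replace with your endpoint's Target URL and Key.
request = build_chat_request(
    "https://example-endpoint.example-region.inference.ai.azure.com",
    "<your-key>",
    [{"role": "user", "content": "Summarize pay-as-you-go billing in one sentence."}],
)
# To actually call the deployment (requires real credentials):
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read())["choices"][0]["message"]["content"])
```

The request body uses the same parameter names documented in the [Chat API](#chat-api) reference below; any parameter you omit falls back to its default.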

For more information on using the APIs, see the [reference](#reference-for-mistral-large-deployed-as-a-service) section.

### Reference for Mistral Large deployed as a service

#### Chat API

Payload is a JSON-formatted string containing the following parameters:

| Key | Type | Default | Description |
|-----|------|---------|-------------|
|`stream`|`boolean`|`False`| Streaming allows the generated tokens to be sent as data-only server-sent events whenever they become available. |
|`max_tokens`|`integer`|`8192`| The maximum number of tokens to generate in the completion. The token count of your prompt plus `max_tokens` can't exceed the model's context length. |
|`top_p`|`float`|`1`| An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with `top_p` probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering `top_p` or `temperature`, but not both. |
|`temperature`|`float`|`1`| The sampling temperature to use, between 0 and 2. Higher values mean the model samples more broadly across the distribution of tokens. Zero means greedy sampling. We recommend altering this parameter or `top_p`, but not both. |
|`ignore_eos`|`boolean`|`False`| Whether to ignore the EOS token and continue generating tokens after the EOS token is generated. |
|`safe_prompt`|`boolean`|`False`| Whether to inject a safety prompt before all conversations. |
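
For example, a complete request payload combining these parameters might look like the following (the message content and parameter values are illustrative, not recommendations):

```json
{
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ],
  "temperature": 0.7,
  "top_p": 1,
  "max_tokens": 512,
  "stream": false,
  "safe_prompt": false
}
```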

#### Example

The following JSON is an example response:

```json
{
  ...
}
```

## Cost and quotas

### Cost and quota considerations for Mistral Large deployed as a service

Mistral models deployed as a service are offered by Mistral AI through Azure Marketplace and integrated with Azure Machine Learning studio for use. You can find Azure Marketplace pricing when deploying the models.