Commit 7bfc1e0
Merge pull request #274127 from fkriti/kriti/mistral-small-AzureML
mistral updates to AzureML
2 parents 940155b + 84d5e64

File tree: 2 files changed (+49 / -38 lines)


articles/machine-learning/how-to-deploy-models-mistral.md

Lines changed: 49 additions & 38 deletions
@@ -6,69 +6,80 @@ manager: scottpolly
 ms.service: machine-learning
 ms.subservice: inferencing
 ms.topic: how-to
-ms.date: 02/23/2024
-ms.reviewer: shubhiraj
-reviewer: shubhirajMsft
+ms.date: 04/29/2024
+ms.reviewer: kritifaujdar
+reviewer: fkriti
 ms.author: mopeakande
 author: msakande
 ms.custom: [references_regions]
 
 #This functionality is also available in Azure AI Studio: /azure/ai-studio/how-to/deploy-models-mistral.md
 ---
 # How to deploy Mistral models with Azure Machine Learning studio
+
+In this article, you learn how to use Azure Machine Learning studio to deploy the Mistral family of models as a service with pay-as-you-go billing.
+
 Mistral AI offers two categories of models in Azure Machine Learning studio:
 
-- Premium models: Mistral Large. These models are available with pay-as-you-go token based billing with Models as a Service in the studio model catalog.
-- Open models: Mixtral-8x7B-Instruct-v01, Mixtral-8x7B-v01, Mistral-7B-Instruct-v01, and Mistral-7B-v01. These models are also available in the Azure Machine Learning studio model catalog and can be deployed to dedicated VM instances in your own Azure subscription with managed online endpoints.
+- __Premium models__: Mistral Large and Mistral Small. These models are available with pay-as-you-go, token-based billing with Models as a Service in the studio model catalog.
+- __Open models__: Mixtral-8x7B-Instruct-v01, Mixtral-8x7B-v01, Mistral-7B-Instruct-v01, and Mistral-7B-v01. These models are also available in the studio model catalog and can be deployed to dedicated VM instances in your own Azure subscription with managed online endpoints.
 
-You can browse the Mistral family of models in the model catalog by filtering on the Mistral collection.
+You can browse the Mistral family of models in the [model catalog](concept-model-catalog.md) by filtering on the Mistral collection.
 
-## Mistral Large
+## Mistral family of models
 
-In this article, you learn how to use Azure Machine Learning studio to deploy the Mistral Large model as a service with pay-as you go billing.
+# [Mistral Large](#tab/mistral-large)
 
-Mistral Large is Mistral AI's most advanced Large Language Model (LLM). It can be used on any language-based task thanks to its state-of-the-art reasoning and knowledge capabilities.
+Mistral Large is Mistral AI's most advanced Large Language Model (LLM). It can be used on any language-based task, thanks to its state-of-the-art reasoning and knowledge capabilities.
 
-Additionally, mistral-large is:
+Additionally, Mistral Large is:
 
-- Specialized in RAG. Crucial information isn't lost in the middle of long context windows (up to 32 K tokens).
-- Strong in coding. Code generation, review, and comments. Supports all mainstream coding languages.
-- Multi-lingual by design. Best-in-class performance in French, German, Spanish, and Italian - in addition to English. Dozens of other languages are supported.
-- Responsible AI. Efficient guardrails baked in the model, with additional safety layer with safe_mode option.
+- __Specialized in RAG.__ Crucial information isn't lost in the middle of long context windows (up to 32K tokens).
+- __Strong in coding.__ Code generation, review, and comments. Supports all mainstream coding languages.
+- __Multi-lingual by design.__ Best-in-class performance in French, German, Spanish, and Italian, in addition to English. Dozens of other languages are supported.
+- __Responsible AI compliant.__ Efficient guardrails baked into the model, and an extra safety layer with the `safe_mode` option.
+
+# [Mistral Small](#tab/mistral-small)
 
-[!INCLUDE [machine-learning-preview-generic-disclaimer](includes/machine-learning-preview-generic-disclaimer.md)]
+Mistral Small is Mistral AI's most efficient Large Language Model (LLM). It can be used on any language-based task that requires high efficiency and low latency.
+
+Mistral Small is:
 
-## Deploy Mistral Large with pay-as-you-go
+- **A small model optimized for low latency.** Very efficient for high-volume, low-latency workloads. Mistral Small is Mistral's smallest proprietary model; it outperforms Mixtral-8x7B and has lower latency.
+- **Specialized in RAG.** Crucial information isn't lost in the middle of long context windows (up to 32K tokens).
+- **Strong in coding.** Code generation, review, and comments. Supports all mainstream coding languages.
+- **Multi-lingual by design.** Best-in-class performance in French, German, Spanish, Italian, and English. Dozens of other languages are supported.
+- **Responsible AI compliant.** Efficient guardrails baked into the model, and an extra safety layer with the `safe_mode` option.
 
-Certain models in the model catalog can be deployed as a service with pay-as-you-go, providing a way to consume them as an API without hosting them on your subscription, while keeping the enterprise security and compliance organizations need. This deployment option doesn't require quota from your subscription.
+---
 
-Mistral Large can be deployed as a service with pay-as-you-go, and is offered by Mistral AI through the Microsoft Azure Marketplace. Please note that Mistral AI can change or update the terms of use and pricing of this model.
+[!INCLUDE [machine-learning-preview-generic-disclaimer](includes/machine-learning-preview-generic-disclaimer.md)]
 
-### Azure Marketplace model offerings
+## Deploy Mistral family of models with pay-as-you-go
 
-The following models are available in Azure Marketplace for Mistral AI when deployed as a service with pay-as-you-go:
+Certain models in the model catalog can be deployed as a service with pay-as-you-go. Pay-as-you-go deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.
 
-* Mistral Large (preview)
+**Mistral Large** and **Mistral Small** are eligible to be deployed as a service with pay-as-you-go and are offered by Mistral AI through the Microsoft Azure Marketplace. Mistral AI can change or update the terms of use and pricing of these models.
 
 ### Prerequisites
 
 - An Azure subscription with a valid payment method. Free or trial Azure subscriptions won't work. If you don't have an Azure subscription, create a [paid Azure account](https://azure.microsoft.com/pricing/purchase-options/pay-as-you-go) to begin.
-- An Azure Machine Learning workspace. If you don't have these, use the steps in the [Quickstart: Create workspace resources](quickstart-create-resources.md) article to create them.
+- An Azure Machine Learning workspace. If you don't have a workspace, use the steps in the [Quickstart: Create workspace resources](quickstart-create-resources.md) article to create one.
 
 > [!IMPORTANT]
-> Pay-as-you-go model deployment offering is only available in workspaces created in **East US 2** and **France Central** regions.
+> The pay-as-you-go model deployment offering for eligible models in the Mistral family is only available in workspaces created in the **East US 2** and **Sweden Central** regions. For _Mistral Large_, the pay-as-you-go offering is also available in the **France Central** region.
 
-- Azure role-based access controls (Azure RBAC) are used to grant access to operations. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the Resouce Group.
-
-For more information on permissions, see [Manage access to an Azure Machine Learning workspace](how-to-assign-roles.md).
+- Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure Machine Learning. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group. For more information on permissions, see [Manage access to an Azure Machine Learning workspace](how-to-assign-roles.md).
 
### Create a new deployment
6675

76+
The following steps demonstrate the deployment of Mistral Large, but you can use the same steps to deploy Mistral Small by replacing the model name.
77+
6778
To create a deployment:
6879

6980
1. Go to [Azure Machine Learning studio](https://ml.azure.com/home).
70-
1. Select the workspace in which you want to deploy your models. To use the pay-as-you-go model deployment offering, your workspace must belong to the **East US 2** or **France Central** region.
71-
1. Choose the model (Mistral-large) you want to deploy from the [model catalog](https://ml.azure.com/model/catalog).
81+
1. Select the workspace in which you want to deploy your models. To use the pay-as-you-go model deployment offering, your workspace must belong to the **East US 2**, **Sweden Central**, or **France Central** region.
82+
1. Choose the model (Mistral-large) that you want to deploy from the [model catalog](https://ml.azure.com/model/catalog).
7283

7384
Alternatively, you can initiate deployment by going to your workspace and selecting **Endpoints** > **Serverless endpoints** > **Create**.
7485

@@ -82,7 +93,7 @@ To create a deployment:
 
 :::image type="content" source="media/how-to-deploy-models-mistral/mistral-deploy-marketplace-terms.png" alt-text="A screenshot showing the terms and conditions of a given model." lightbox="media/how-to-deploy-models-mistral/mistral-deploy-marketplace-terms.png":::
 
-1. Once you subscribe the workspace for the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ workspace don't require subscribing again. If this scenario applies to you, you will see a **Continue to deploy** option to select.
+1. Once you subscribe the workspace to the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ workspace don't require subscribing again. If this scenario applies to you, you'll see a **Continue to deploy** option to select.
 
 :::image type="content" source="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png" alt-text="A screenshot showing a project that is already subscribed to the offering." lightbox="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png":::
 
@@ -96,20 +107,20 @@ To create a deployment:
 1. You can always find the endpoint's details, URL, and access keys by navigating to **Workspace** > **Endpoints** > **Serverless endpoints**.
 1. Take note of the **Target** URL and the **Secret Key** to call the deployment and generate chat completions using the [`<target_url>/v1/chat/completions`](#chat-api) API.
 
-To learn about billing for Mistral models deployed with pay-as-you-go, see [Cost and quota considerations for Mistral models deployed as a service](#cost-and-quota-considerations-for-mistral-large-deployed-as-a-service).
+To learn about billing for Mistral models deployed with pay-as-you-go, see [Cost and quota considerations for Mistral family of models deployed as a service](#cost-and-quota-considerations-for-mistral-family-of-models-deployed-as-a-service).
 
-### Consume the Mistral Large model as a service
+### Consume the Mistral family of models as a service
 
-Mistral Large can be consumed using the chat API.
+You can consume Mistral models by using the chat API.
 
 1. In the **workspace**, select **Endpoints** > **Serverless endpoints**.
 1. Find and select the deployment you created.
 1. Copy the **Target** URL and the **Key** token values.
 1. Make an API request using the [`<target_url>/v1/chat/completions`](#chat-api) API.
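The consumption steps above can be sketched in Python. This is a minimal sketch, not a sample from the article or commit: the endpoint URL and key are placeholders, and the `messages` field and `Authorization: Bearer` header follow common chat-completions conventions rather than anything shown in this diff; only the `<target_url>/v1/chat/completions` route and parameters such as `max_tokens` and `temperature` come from the article.

```python
import json
import urllib.request

# Placeholder values -- use the Target URL and Key copied from
# Workspace > Endpoints > Serverless endpoints.
TARGET_URL = "https://example-endpoint.eastus2.inference.ai.azure.com"
API_KEY = "example-key"

def build_chat_request(messages, max_tokens=256, temperature=0.7):
    """Build an HTTP POST request for the <target_url>/v1/chat/completions API."""
    body = json.dumps({
        "messages": messages,        # assumed field name; see the Chat API reference
        "max_tokens": max_tokens,
        "temperature": temperature,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{TARGET_URL}/v1/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme for the Key
        },
        method="POST",
    )

req = build_chat_request([{"role": "user", "content": "Hello"}])
# With a real endpoint and key, send the request and read the completion:
# with urllib.request.urlopen(req) as response:
#     print(json.loads(response.read())["choices"][0]["message"]["content"])
```

The request is built separately from sending so that you can inspect the payload before calling a billed endpoint.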

-For more information on using the APIs, see the [reference](#reference-for-mistral-large-deployed-as-a-service) section.
+For more information on using the APIs, see the [reference](#reference-for-mistral-family-of-models-deployed-as-a-service) section.
 
-### Reference for Mistral large deployed as a service
+### Reference for Mistral family of models deployed as a service
 
 #### Chat API
 
@@ -134,7 +145,7 @@ Payload is a JSON formatted string containing the following parameters:
 | `stream` | `boolean` | `False` | Streaming allows the generated tokens to be sent as data-only server-sent events whenever they become available. |
 | `max_tokens` | `integer` | `8192` | The maximum number of tokens to generate in the completion. The token count of your prompt plus `max_tokens` can't exceed the model's context length. |
 | `top_p` | `float` | `1` | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with `top_p` probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering `top_p` or `temperature`, but not both. |
-| `temperature` | `float` | `1` | The sampling temperature to use, between 0 and 2. Higher values mean the model samples more broadly the distribution of tokens. Zero means greedy sampling. We recommend altering this or `top_p`, but not both. |
+| `temperature` | `float` | `1` | The sampling temperature to use, between 0 and 2. Higher values mean the model samples more broadly from the distribution of tokens. Zero means greedy sampling. We recommend altering this parameter or `top_p`, but not both. |
 | `ignore_eos` | `boolean` | `False` | Whether to ignore the EOS token and continue generating tokens after the EOS token is generated. |
 | `safe_prompt` | `boolean` | `False` | Whether to inject a safety prompt before all conversations. |
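To make the parameters above concrete, here is a sketch of a payload that sets a few of them. The `messages` field is an assumption not shown in this excerpt of the table; the other keys, defaults, and the "alter `temperature` or `top_p`, but not both" guidance come from the table itself.

```python
import json

# Sketch of a chat-completions payload; per the table above, alter
# `temperature` or `top_p`, but not both (only `temperature` is set here).
payload = {
    "messages": [{"role": "user", "content": "Write a haiku about the sea."}],  # assumed field
    "max_tokens": 128,    # completion budget; prompt tokens + max_tokens <= context length
    "temperature": 0.2,   # 0 means greedy sampling; higher samples more broadly
    "stream": False,      # True sends tokens as data-only server-sent events
    "safe_prompt": True,  # inject a safety prompt before the conversation
}
print(json.dumps(payload, indent=2))
```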

@@ -214,7 +225,7 @@ The `logprobs` object is a dictionary with the following fields:
 
 #### Example
 
-The following is an example response:
+The following JSON is an example response:
 
 ```json
 {
@@ -240,7 +251,7 @@ The following is an example response:
 }
 ```
 
-#### Additional inference examples
+#### More inference examples
 
 | **Sample Type** | **Sample Notebook** |
 |----------------|----------------------------------------|
@@ -252,7 +263,7 @@ The following is an example response:
 
 ## Cost and quotas
 
-### Cost and quota considerations for Mistral Large deployed as a service
+### Cost and quota considerations for Mistral family of models deployed as a service
 
 Mistral models deployed as a service are offered by Mistral AI through Azure Marketplace and integrated with Azure Machine Learning studio for use. You can find Azure Marketplace pricing when deploying the models.
 