Skip to content

Commit 1babc40

Browse files
Merge pull request #273019 from fkriti/kriti/mistral-small
making updates to add Mistral Small
2 parents 7bfc1e0 + 5b4c52d commit 1babc40

File tree

5 files changed

+44
-30
lines changed

5 files changed

+44
-30
lines changed

articles/ai-studio/how-to/deploy-models-mistral.md

Lines changed: 44 additions & 30 deletions
Original file line numberDiff line numberDiff line change
@@ -5,8 +5,9 @@ description: Learn how to deploy Mistral Large with Azure AI Studio.
55
manager: scottpolly
66
ms.service: azure-ai-studio
77
ms.topic: how-to
8-
ms.date: 3/6/2024
9-
ms.reviewer: shubhirajMsft
8+
ms.date: 04/29/2024
9+
ms.reviewer: kritifaujdar
10+
reviewer: fkriti
1011
ms.author: mopeakande
1112
author: msakande
1213
ms.custom: [references_regions]
@@ -16,49 +17,62 @@ ms.custom: [references_regions]
1617

1718
[!INCLUDE [Azure AI Studio preview](../includes/preview-ai-studio.md)]
1819

19-
In this article, you learn how to use Azure AI Studio to deploy the Mistral Large model as a service with pay-as you go billing.
20+
In this article, you learn how to use Azure AI Studio to deploy the Mistral family of models as a service with pay-as-you-go billing.
2021

2122
Mistral AI offers two categories of models in [Azure AI Studio](https://ai.azure.com):
22-
* Premium models: Mistral Large. These models are available with pay-as-you-go token based billing with Models as a Service in the AI Studio model catalog.
23-
* Open models: Mixtral-8x7B-Instruct-v01, Mixtral-8x7B-v01, Mistral-7B-Instruct-v01, and Mistral-7B-v01. These models are also available in the AI Studio model catalog and can be deployed to dedicated VM instances in your own Azure subscription with Managed Online Endpoints.
23+
* __Premium models__: Mistral Large and Mistral Small. These models are available with pay-as-you-go token-based billing with Models as a Service in the AI Studio model catalog.
24+
* __Open models__: Mixtral-8x7B-Instruct-v01, Mixtral-8x7B-v01, Mistral-7B-Instruct-v01, and Mistral-7B-v01. These models are also available in the AI Studio model catalog and can be deployed to dedicated VM instances in your own Azure subscription with managed online endpoints.
2425

2526
You can browse the Mistral family of models in the [Model Catalog](model-catalog.md) by filtering on the Mistral collection.
2627

27-
## Mistral Large
28+
## Mistral family of models
2829

29-
In this article, you learn how to use Azure AI Studio to deploy the Mistral Large model as a service with pay-as-you-go billing.
30+
# [Mistral Large](#tab/mistral-large)
3031

31-
Mistral Large is Mistral AI's most advanced Large Language Model (LLM). It can be used on any language-based task thanks to its state-of-the-art reasoning and knowledge capabilities.
32+
Mistral Large is Mistral AI's most advanced Large Language Model (LLM). It can be used on any language-based task, thanks to its state-of-the-art reasoning and knowledge capabilities.
3233

33-
Additionally, mistral-large is:
34+
Additionally, Mistral Large is:
3435

35-
* Specialized in RAG. Crucial information isn't lost in the middle of long context windows (up to 32-K tokens).
36-
* Strong in coding. Code generation, review, and comments. Supports all mainstream coding languages.
37-
* Multi-lingual by design. Best-in-class performance in French, German, Spanish, and Italian - in addition to English. Dozens of other languages are supported.
38-
* Responsible AI. Efficient guardrails baked in the model and another safety layer with the `safe_mode` option.
36+
* __Specialized in RAG.__ Crucial information isn't lost in the middle of long context windows (up to 32-K tokens).
37+
* __Strong in coding.__ Code generation, review, and comments. Supports all mainstream coding languages.
38+
* __Multi-lingual by design.__ Best-in-class performance in French, German, Spanish, Italian, and English. Dozens of other languages are supported.
39+
* __Responsible AI compliant.__ Efficient guardrails baked in the model and extra safety layer with the `safe_mode` option.
3940

40-
## Deploy Mistral Large with pay-as-you-go
41+
# [Mistral Small](#tab/mistral-small)
4142

42-
Certain models in the model catalog can be deployed as a service with pay-as-you-go, providing a way to consume them as an API without hosting them on your subscription, while keeping the enterprise security and compliance organizations need. This deployment option doesn't require quota from your subscription.
43+
Mistral Small is Mistral AI's most efficient Large Language Model (LLM). It can be used on any language-based task that requires high efficiency and low latency.
4344

44-
Mistral Large can be deployed as a service with pay-as-you-go, and is offered by Mistral AI through the Microsoft Azure Marketplace. Mistral AI can change or update the terms of use and pricing of this model.
45+
Mistral Small is:
46+
47+
- **A small model optimized for low latency.** Very efficient for high volume and low latency workloads. Mistral Small is Mistral's smallest proprietary model, it outperforms Mixtral-8x7B and has lower latency.
48+
- **Specialized in RAG.** Crucial information isn't lost in the middle of long context windows (up to 32K tokens).
49+
- **Strong in coding.** Code generation, review, and comments. Supports all mainstream coding languages.
50+
- **Multi-lingual by design.** Best-in-class performance in French, German, Spanish, Italian, and English. Dozens of other languages are supported.
51+
- **Responsible AI compliant.** Efficient guardrails baked in the model, and extra safety layer with the `safe_mode` option.
52+
53+
---
54+
## Deploy Mistral family of models with pay-as-you-go
55+
56+
Certain models in the model catalog can be deployed as a service with pay-as-you-go. Pay-as-you-go deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.
57+
58+
**Mistral Large** and **Mistral Small** are eligible to be deployed as a service with pay-as-you-go and are offered by Mistral AI through the Microsoft Azure Marketplace. Mistral AI can change or update the terms of use and pricing of these models.
4559

4660
### Prerequisites
4761

4862
- An Azure subscription with a valid payment method. Free or trial Azure subscriptions won't work. If you don't have an Azure subscription, create a [paid Azure account](https://azure.microsoft.com/pricing/purchase-options/pay-as-you-go) to begin.
4963
- An [Azure AI hub resource](../how-to/create-azure-ai-resource.md).
5064

5165
> [!IMPORTANT]
52-
> For Mistral family models, the pay-as-you-go model deployment offering is only available with AI hubs created in **East US 2** and **France Central** regions.
66+
> The pay-as-you-go model deployment offering for eligible models in the Mistral family is only available in AI hubs created in the **East US 2** and **Sweden Central** regions. For _Mistral Large_, the pay-as-you-go offering is also available in the **France Central** region.
5367
5468
- An [Azure AI project](../how-to/create-projects.md) in Azure AI Studio.
55-
- Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Studio. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group.
56-
57-
For more information on permissions, see [Role-based access control in Azure AI Studio](../concepts/rbac-ai-studio.md).
69+
- Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Studio. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group. For more information on permissions, see [Role-based access control in Azure AI Studio](../concepts/rbac-ai-studio.md).
5870

5971

6072
### Create a new deployment
6173

74+
The following steps demonstrate the deployment of Mistral Large, but you can use the same steps to deploy Mistral Small by replacing the model name.
75+
6276
To create a deployment:
6377

6478
1. Sign in to [Azure AI Studio](https://ai.azure.com).
@@ -70,7 +84,7 @@ To create a deployment:
7084

7185
:::image type="content" source="../media/deploy-monitor/mistral/mistral-deploy-pay-as-you-go.png" alt-text="A screenshot showing how to deploy a model with the pay-as-you-go option." lightbox="../media/deploy-monitor/mistral/mistral-deploy-pay-as-you-go.png":::
7286

73-
1. Select the project in which you want to deploy your model. To deploy the Mistral-large model your project must be in the **East US 2** or **France Central** regions.
87+
1. Select the project in which you want to deploy your model. To deploy the Mistral-large model, your project must be in the **East US 2**, **Sweden Central**, or **France Central** region.
7488
1. In the deployment wizard, select the link to **Azure Marketplace Terms** to learn more about the terms of use.
7589
1. You can also select the **Marketplace offer details** tab to learn about pricing for the selected model.
7690
1. If this is your first time deploying the model in the project, you have to subscribe your project for the particular offering. This step requires that your account has the **Azure AI Developer role** permissions on the Resource Group, as listed in the prerequisites. Each project has its own subscription to the particular Azure Marketplace offering of the model, which allows you to control and monitor spending. Select **Subscribe and Deploy**. Currently you can have only one deployment for each model within a project.
@@ -90,11 +104,11 @@ To create a deployment:
90104
1. You can return to the Deployments page, select the deployment, and note the endpoint's **Target** URL and the Secret **Key**, which you can use to call the deployment for chat completions using the [`<target_url>/v1/chat/completions`](#chat-api) API.
91105
1. You can always find the endpoint's details, URL, and access keys by navigating to the **Build** tab and selecting **Deployments** from the Components section.
92106

93-
To learn about billing for the Mistral AI model deployed with pay-as-you-go, see [Cost and quota considerations for Mistral Large deployed as a service](#cost-and-quota-considerations-for-mistral-large-deployed-as-a-service).
107+
To learn about billing for the Mistral AI model deployed with pay-as-you-go, see [Cost and quota considerations for Mistral family of models deployed as a service](#cost-and-quota-considerations-for-mistral-family-of-models-deployed-as-a-service).
94108

95-
### Consume the Mistral Large model as a service
109+
### Consume the Mistral family of models as a service
96110

97-
Mistral Large can be consumed using the chat API.
111+
You can consume Mistral Large by using the chat API.
98112

99113
1. On the **Build** page, select **Deployments**.
100114

@@ -104,9 +118,9 @@ Mistral Large can be consumed using the chat API.
104118

105119
1. Make an API request using the [`/v1/chat/completions`](#chat-api) API using [`<target_url>/v1/chat/completions`](#chat-api).
106120

107-
For more information on using the APIs, see the [reference](#reference-for-mistral-large-deployed-as-a-service) section.
121+
For more information on using the APIs, see the [reference](#reference-for-mistral-family-of-models-deployed-as-a-service) section.
108122

109-
### Reference for Mistral Large deployed as a service
123+
### Reference for Mistral family of models deployed as a service
110124

111125
#### Chat API
112126

@@ -131,7 +145,7 @@ Payload is a JSON formatted string containing the following parameters:
131145
| `stream` | `boolean` | `False` | Streaming allows the generated tokens to be sent as data-only server-sent events whenever they become available. |
132146
| `max_tokens` | `integer` | `8192` | The maximum number of tokens to generate in the completion. The token count of your prompt plus `max_tokens` can't exceed the model's context length. |
133147
| `top_p` | `float` | `1` | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with `top_p` probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering `top_p` or `temperature`, but not both. |
134-
| `temperature` | `float` | `1` | The sampling temperature to use, between 0 and 2. Higher values mean the model samples more broadly the distribution of tokens. Zero means greedy sampling. We recommend altering this or `top_p`, but not both. |
148+
| `temperature` | `float` | `1` | The sampling temperature to use, between 0 and 2. Higher values mean the model samples more broadly the distribution of tokens. Zero means greedy sampling. We recommend altering this parameter or `top_p`, but not both. |
135149
| `ignore_eos` | `boolean` | `False` | Whether to ignore the EOS token and continue generating tokens after the EOS token is generated. |
136150
| `safe_prompt` | `boolean` | `False` | Whether to inject a safety prompt before all conversations. |
137151

@@ -211,7 +225,7 @@ The `logprobs` object is a dictionary with the following fields:
211225

212226
#### Example
213227

214-
The following is an example response:
228+
The following JSON is an example response:
215229

216230
```json
217231
{
@@ -248,7 +262,7 @@ The following is an example response:
248262

249263
## Cost and quotas
250264

251-
### Cost and quota considerations for Mistral Large deployed as a service
265+
### Cost and quota considerations for Mistral family of models deployed as a service
252266

253267
Mistral models deployed as a service are offered by Mistral AI through the Azure Marketplace and integrated with Azure AI Studio for use. You can find the Azure Marketplace pricing when deploying the model.
254268

@@ -262,7 +276,7 @@ Quota is managed per deployment. Each deployment has a rate limit of 200,000 tok
262276

263277
Models deployed as a service with pay-as-you-go are protected by [Azure AI Content Safety](../../ai-services/content-safety/overview.md). With Azure AI content safety, both the prompt and completion pass through an ensemble of classification models aimed at detecting and preventing the output of harmful content. The content filtering system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions. Learn more about [content filtering here](../concepts/content-filtering.md).
264278

265-
## Next steps
279+
## Related content
266280

267281
- [What is Azure AI Studio?](../what-is-ai-studio.md)
268282
- [Azure AI FAQ article](../faq.yml)
-64.4 KB
Loading
-45.1 KB
Loading
66 Bytes
Loading
-34.5 KB
Loading

0 commit comments

Comments
 (0)