articles/machine-learning/how-to-deploy-models-mistral.md
manager: scottpolly
ms.service: machine-learning
ms.subservice: inferencing
ms.topic: how-to
ms.date: 04/29/2024
ms.reviewer: kritifaujdar
reviewer: fkriti
ms.author: mopeakande
author: msakande
ms.custom: [references_regions]
#This functionality is also available in Azure AI Studio: /azure/ai-studio/how-to/deploy-models-mistral.md
---
# How to deploy Mistral models with Azure Machine Learning studio
In this article, you learn how to use Azure Machine Learning studio to deploy the Mistral family of models as a service with pay-as-you-go billing.
Mistral AI offers two categories of models in Azure Machine Learning studio:

- __Premium models__: Mistral Large and Mistral Small. These models are available with pay-as-you-go, token-based billing with Models as a Service in the studio model catalog.
- __Open models__: Mixtral-8x7B-Instruct-v01, Mixtral-8x7B-v01, Mistral-7B-Instruct-v01, and Mistral-7B-v01. These models are also available in the studio model catalog and can be deployed to dedicated VM instances in your own Azure subscription with managed online endpoints.

You can browse the Mistral family of models in the [model catalog](concept-model-catalog.md) by filtering on the Mistral collection.
## Mistral family of models
# [Mistral Large](#tab/mistral-large)
Mistral Large is Mistral AI's most advanced Large Language Model (LLM). It can be used on any language-based task, thanks to its state-of-the-art reasoning and knowledge capabilities.
Additionally, Mistral Large is:

- __Specialized in RAG.__ Crucial information isn't lost in the middle of long context windows (up to 32K tokens).
- __Strong in coding.__ Code generation, review, and comments. Supports all mainstream coding languages.
- __Multi-lingual by design.__ Best-in-class performance in French, German, Spanish, and Italian, in addition to English. Dozens of other languages are supported.
- __Responsible AI compliant.__ Efficient guardrails baked into the model, and an extra safety layer with the `safe_mode` option.

# [Mistral Small](#tab/mistral-small)

Mistral Small is Mistral AI's most efficient Large Language Model (LLM). It can be used on any language-based task that requires high efficiency and low latency.
Mistral Small is:
- **A small model optimized for low latency.** Very efficient for high-volume, low-latency workloads. Mistral Small is Mistral's smallest proprietary model; it outperforms Mixtral-8x7B and has lower latency.
- **Specialized in RAG.** Crucial information isn't lost in the middle of long context windows (up to 32K tokens).
- **Strong in coding.** Code generation, review, and comments. Supports all mainstream coding languages.
- **Multi-lingual by design.** Best-in-class performance in French, German, Spanish, Italian, and English. Dozens of other languages are supported.
- **Responsible AI compliant.** Efficient guardrails baked into the model, and an extra safety layer with the `safe_mode` option.

---

## Deploy Mistral family of models with pay-as-you-go
Certain models in the model catalog can be deployed as a service with pay-as-you-go. Pay-as-you-go deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.
**Mistral Large** and **Mistral Small** are eligible to be deployed as a service with pay-as-you-go and are offered by Mistral AI through the Microsoft Azure Marketplace. Mistral AI can change or update the terms of use and pricing of these models.
### Prerequisites
- An Azure subscription with a valid payment method. Free or trial Azure subscriptions won't work. If you don't have an Azure subscription, create a [paid Azure account](https://azure.microsoft.com/pricing/purchase-options/pay-as-you-go) to begin.
- An Azure Machine Learning workspace. If you don't have a workspace, use the steps in the [Quickstart: Create workspace resources](quickstart-create-resources.md) article to create one.
> [!IMPORTANT]
> The pay-as-you-go model deployment offering for eligible models in the Mistral family is only available in workspaces created in the **East US 2** and **Sweden Central** regions. For _Mistral Large_, the pay-as-you-go offering is also available in the **France Central** region.
- Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure Machine Learning. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group. For more information on permissions, see [Manage access to an Azure Machine Learning workspace](how-to-assign-roles.md).
### Create a new deployment
The following steps demonstrate the deployment of Mistral Large, but you can use the same steps to deploy Mistral Small by replacing the model name.
To create a deployment:
1. Go to [Azure Machine Learning studio](https://ml.azure.com/home).
1. Select the workspace in which you want to deploy your models. To use the pay-as-you-go model deployment offering, your workspace must belong to the **East US 2**, **Sweden Central**, or **France Central** region.
1. Choose the model (Mistral-large) that you want to deploy from the [model catalog](https://ml.azure.com/model/catalog).
Alternatively, you can initiate deployment by going to your workspace and selecting **Endpoints** > **Serverless endpoints** > **Create**.
:::image type="content" source="media/how-to-deploy-models-mistral/mistral-deploy-marketplace-terms.png" alt-text="A screenshot showing the terms and conditions of a given model." lightbox="media/how-to-deploy-models-mistral/mistral-deploy-marketplace-terms.png":::
1. Once you subscribe the workspace for the particular Azure Marketplace offering, subsequent deployments of the _same_ offering in the _same_ workspace don't require subscribing again. If this scenario applies to you, you'll see a **Continue to deploy** option to select.
:::image type="content" source="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png" alt-text="A screenshot showing a project that is already subscribed to the offering." lightbox="media/how-to-deploy-models-mistral/mistral-deploy-pay-as-you-go-project.png":::
1. You can always find the endpoint's details, URL, and access keys by navigating to **Workspace** > **Endpoints** > **Serverless endpoints**.
1. Take note of the **Target** URL and the **Secret Key** to call the deployment and generate chat completions using the [`<target_url>/v1/chat/completions`](#chat-api) API.
To learn about billing for Mistral models deployed with pay-as-you-go, see [Cost and quota considerations for Mistral family of models deployed as a service](#cost-and-quota-considerations-for-mistral-family-of-models-deployed-as-a-service).
### Consume the Mistral family of models as a service
You can consume Mistral Large by using the chat API.
1. In the **workspace**, select **Endpoints** > **Serverless endpoints**.
1. Find and select the deployment you created.
1. Copy the **Target** URL and the **Key** token values.
1. Make an API request using the [`<target_url>/v1/chat/completions`](#chat-api) API.
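
The request in the last step can be sketched in code. The following Python example uses only the standard library to build the call; the endpoint URL and key are placeholders for your deployment's **Target** URL and **Key** values, and the bearer-token authorization header is an assumption to verify against your endpoint's details in the studio.

```python
import json
import urllib.request

def build_chat_request(endpoint: str, api_key: str,
                       messages: list, **params) -> urllib.request.Request:
    """Build an HTTP request for the serverless chat completions API."""
    payload = {"messages": messages, **params}
    return urllib.request.Request(
        url=f"{endpoint}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; check the endpoint's consume tab.
            "Authorization": f"Bearer {api_key}",
        },
    )

# Placeholders: substitute your deployment's Target URL and Key, found
# under Endpoints > Serverless endpoints in the studio.
request = build_chat_request(
    "https://<your-endpoint>.<region>.inference.ml.azure.com",
    "<your-key>",
    [{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=50,
)

# Sending the request requires a live endpoint:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
#     print(result["choices"][0]["message"]["content"])
```

The network call is left commented out so you can inspect the request before sending it against a real deployment.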
For more information on using the APIs, see the [reference](#reference-for-mistral-family-of-models-deployed-as-a-service) section.
### Reference for Mistral family of models deployed as a service
#### Chat API

Payload is a JSON-formatted string containing the following parameters:

|`stream`|`boolean`|`False`| Streaming allows the generated tokens to be sent as data-only server-sent events whenever they become available. |
|`max_tokens`|`integer`|`8192`| The maximum number of tokens to generate in the completion. The token count of your prompt plus `max_tokens` can't exceed the model's context length. |
|`top_p`|`float`|`1`| An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with `top_p` probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering `top_p` or `temperature`, but not both. |
|`temperature`|`float`|`1`| The sampling temperature to use, between 0 and 2. Higher values mean the model samples more broadly the distribution of tokens. Zero means greedy sampling. We recommend altering this parameter or `top_p`, but not both. |
|`ignore_eos`|`boolean`|`False`| Whether to ignore the EOS token and continue generating tokens after the EOS token is generated. |
|`safe_prompt`|`boolean`|`False`| Whether to inject a safety prompt before all conversations. |
#### Example

The following JSON is an abbreviated example response; the field values are illustrative:

```json
{
    "id": "chatcmpl-7g9Js0vJv3mY",
    "object": "chat.completion",
    "created": 1713000000,
    "model": "mistral-large",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Paris is the capital of France."
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 8,
        "total_tokens": 20
    }
}
```

## Cost and quotas
### Cost and quota considerations for Mistral family of models deployed as a service
Mistral models deployed as a service are offered by Mistral AI through Azure Marketplace and integrated with Azure Machine Learning studio for use. You can find Azure Marketplace pricing when deploying the models.