Commit 17fb63b

Merge pull request #176 from ssalgadodev/patch-3
Update how-to-deploy-models-jamba.md
2 parents 13fb5dd + 0fe7606 commit 17fb63b

3 files changed

+40
-29
lines changed

articles/ai-studio/toc.yml

Lines changed: 1 addition & 1 deletion
@@ -128,7 +128,7 @@ items:
128128
displayName: maas
129129
- name: JAIS model
130130
href: how-to/deploy-models-jais.md
131-
- name: Jamba instruct model
131+
- name: AI21 Jamba models
132132
href: how-to/deploy-models-jamba.md
133133
- name: TimeGEN-1 model
134134
href: how-to/deploy-models-timegen-1.md

articles/machine-learning/how-to-deploy-models-jamba.md

Lines changed: 38 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -1,42 +1,53 @@
11
---
2-
title: How to deploy Jamba models with Azure Machine Learning studio
2+
title: How to deploy AI21's Jamba family models with Azure Machine Learning studio
33
titleSuffix: Azure Machine Learning studio
4-
description: How to deploy Jamba models with Azure Machine Learning studio
4+
description: How to deploy AI21's Jamba family models with Azure Machine Learning studio
55
manager: scottpolly
66
ms.service: azure-machine-learning
77
ms.topic: how-to
8-
ms.date: 06/19/2024
8+
ms.date: 09/06/2024
99
ms.author: ssalgado
1010
ms.reviewer: tgokal
1111
author: ssalgadodev
1212
ms.custom: references_regions
1313
---
1414

15-
# How to deploy AI21's Jamba-Instruct model with Azure Machine Learning studio
15+
# How to deploy AI21's Jamba family models with Azure Machine Learning studio
1616

17-
In this article, you learn how to use Azure Machine Learning studio to deploy AI21's Jamba-Instruct model as a serverless API with pay-as-you-go billing.
17+
[!INCLUDE [machine-learning-preview-generic-disclaimer](includes/machine-learning-preview-generic-disclaimer.md)]
1818

19-
The Jamba Instruct model is AI21's production-grade Mamba-based large language model (LLM) which leverages AI21's hybrid Mamba-Transformer architecture. It's an instruction-tuned version of AI21's hybrid structured state space model (SSM) transformer Jamba model. The Jamba Instruct model is built for reliable commercial use with respect to quality and performance.
19+
In this article, you learn how to use Azure Machine Learning studio to deploy AI21's Jamba family models as a serverless API with pay-as-you-go billing.
2020

21-
[!INCLUDE [machine-learning-preview-generic-disclaimer](includes/machine-learning-preview-generic-disclaimer.md)]
21+
The Jamba family models are AI21's production-grade Mamba-based large language models (LLMs), which leverage AI21's hybrid Mamba-Transformer architecture. They are instruction-tuned versions of AI21's hybrid structured state space model (SSM) transformer Jamba model. The Jamba family models are built for reliable commercial use with respect to quality and performance.
22+
23+
> [!TIP]
24+
> See our announcements of AI21's Jamba family models available now on Azure AI Model Catalog through [AI21's blog](https://aka.ms/ai21-jamba-1.5-large-announcement) and [Microsoft Tech Community Blog](https://aka.ms/ai21-jamba-1.5-large-microsoft-annnouncement).
2225
2326

24-
## Deploy the Jamba Instruct model as a serverless API
27+
## Deploy the Jamba family models as a serverless API
2528

26-
Certain models in the model catalog can be deployed as a serverless API with pay-as-you-go billing, providing a way to consume them as an API without hosting them on your subscription, while keeping the enterprise security and compliance organizations need. This deployment option doesn't require quota from your subscription.
29+
Certain models in the model catalog can be deployed as a serverless API with pay-as-you-go billing, providing a way to consume them as an API without hosting them on your subscription, while keeping the enterprise security and compliance organizations need. This deployment option doesn't require quota from your subscription.
2730

28-
The AI21-Jamba-Instruct model deployed as a serverless API with pay-as-you-go billing is [offered by AI21 through Microsoft Azure Marketplace](https://aka.ms/azure-marketplace-offer-ai21-jamba-instruct). AI21 can change or update the terms of use and pricing of this model.
31+
# [AI21 Jamba 1.5 Large](#tab/ai21-jamba-1-5-large)
2932

30-
To get started with Jamba Instruct deployed as a serverless API, explore our integrations with [LangChain](https://aka.ms/ai21-jamba-instruct-langchain-sample), [LiteLLM](https://aka.ms/ai21-jamba-instruct-litellm-sample), [OpenAI](https://aka.ms/ai21-jamba-instruct-openai-sample) and the [Azure API](https://aka.ms/ai21-jamba-instruct-azure-api-sample).
33+
The [AI21 Jamba 1.5 Large model](https://aka.ms/aistudio/landing/ai21-labs-jamba-1.5-large) deployed as a serverless API with pay-as-you-go billing is [offered by AI21 through Microsoft Azure Marketplace](https://aka.ms/azure-marketplace-offer-ai21-jamba-1.5-large). AI21 can change or update the terms of use and pricing of this model.
3134

32-
> [!TIP]
33-
> See our announcements of AI21's Jamba-Instruct model available now on Azure AI Model Catalog through [AI21's blog](https://aka.ms/ai21-jamba-instruct-blog) and [Microsoft Tech Community Blog](https://aka.ms/ai21-jamba-instruct-announcement).
35+
To get started with Jamba 1.5 Large deployed as a serverless API, explore our integrations with [LangChain](https://aka.ms/ai21-jamba-1.5-large-langchain-sample), [LiteLLM](https://aka.ms/ai21-jamba-1.5-large-litellm-sample), [OpenAI](https://aka.ms/ai21-jamba-1.5-large-openai-sample), and the [Azure API](https://aka.ms/ai21-jamba-1.5-large-azure-api-sample).
36+
37+
38+
# [AI21 Jamba 1.5 Mini](#tab/ai21-jamba-1-5)
39+
40+
The [AI21 Jamba 1.5 Mini model](https://aka.ms/aistudio/landing/ai21-labs-jamba-1.5-mini) deployed as a serverless API with pay-as-you-go billing is [offered by AI21 through Microsoft Azure Marketplace](https://aka.ms/azure-marketplace-offer-ai21-jamba-1.5-mini). AI21 can change or update the terms of use and pricing of this model.
41+
42+
To get started with Jamba 1.5 Mini deployed as a serverless API, explore our integrations with [LangChain](https://aka.ms/ai21-jamba-1.5-mini-langchain-sample), [LiteLLM](https://aka.ms/ai21-jamba-1.5-mini-litellm-sample), [OpenAI](https://aka.ms/ai21-jamba-1.5-mini-openai-sample), and the [Azure API](https://aka.ms/ai21-jamba-1.5-mini-azure-api-sample).
43+
44+
---
3445

3546

3647
### Prerequisites
3748

3849
- An Azure subscription with a valid payment method. Free or trial Azure subscriptions won't work. If you don't have an Azure subscription, create a [paid Azure account](https://azure.microsoft.com/pricing/purchase-options/pay-as-you-go) to begin.
39-
- An Azure Machine Learning workspace and a compute instance. If you don't have these, use the steps in the [Quickstart: Create workspace resources](quickstart-create-resources.md) article to create them. The serverless API model deployment offering for Jamba Instruct is only available with workspaces created in these regions:
50+
- An Azure Machine Learning workspace and a compute instance. If you don't have these, use the steps in the [Quickstart: Create workspace resources](quickstart-create-resources.md) article to create them. The serverless API model deployment offering for the Jamba family of models is only available with workspaces created in these regions:
4051

4152
* East US
4253
* East US 2
@@ -70,11 +81,11 @@ To get started with Jamba Instruct deployed as a serverless API, explore our int
7081

7182
### Create a new deployment
7283

73-
These steps demonstrate the deployment of AI21-Jamba-Instruct. To create a deployment:
84+
These steps demonstrate the deployment of `AI21 Jamba 1.5 Large` or `AI21 Jamba 1.5 Mini` models. To create a deployment:
7485

7586
1. Go to [Azure Machine Learning studio](https://ml.azure.com/home).
76-
1. Select the workspace in which you want to deploy your models. To use the Serverless API model deployment offering, your workspace must belong to the **East US 2** or **Sweden Central** region.
77-
1. Choose the model you want to deploy from the [model catalog](https://ml.azure.com/model/catalog).
87+
1. Select the workspace in which you want to deploy your models. To use the serverless API model deployment offering, your workspace must belong to one of the supported regions listed in the prerequisites.
88+
1. Search for and select an AI21 model, such as `AI21 Jamba 1.5 Large`, `AI21 Jamba 1.5 Mini`, or `AI21 Jamba Instruct`, from the [model catalog](https://ml.azure.com/model/catalog).
7889

7990
Alternatively, you can initiate deployment by going to your workspace and selecting **Endpoints** > **Serverless endpoints** > **Create**.
8091

@@ -97,26 +108,26 @@ These steps demonstrate the deployment of AI21-Jamba-Instruct. To create a deplo
97108
1. You can always find the endpoint's details, URL, and access keys by navigating to **Workspace** > **Endpoints** > **Serverless endpoints**.
98109

99110

100-
To learn about billing for Jamba models deployed as a serverless API, see [Cost and quota considerations for Jamba Instruct deployed as a serverless API](#cost-and-quota-considerations-for-jamba-instruct-deployed-as-a-serverless-api).
111+
To learn about billing for the AI21 Jamba family models deployed as a serverless API with pay-as-you-go token-based billing, see [Cost and quota considerations for Jamba family models deployed as a serverless API](#cost-and-quota-considerations-for-jamba-family-models-deployed-as-a-serverless-api).
101112

102-
### Consume Jamba Instruct as a service
113+
### Consume Jamba family models as a serverless API
103114

104-
You can consume Jamba Instruct models as follows:
115+
You can consume Jamba family models as follows:
105116

106117
1. In the **workspace**, select **Endpoints** > **Serverless endpoints**.
107118
1. Find and select the deployment you created.
108119
1. Copy the **Target** URL and the **Key** token values.
109120
1. Make an API request using either the [Azure AI Model Inference API](reference-model-inference-api.md) on the route `/chat/completions` or the [AI21's Azure Client](https://docs.ai21.com/reference/jamba-instruct-api) on `/v1/chat/completions`.
110121

111-
For more information on using the APIs, see the [reference](#reference-for-jamba-instruct-deployed-as-a-serverless-api) section.
122+
For more information on using the APIs, see the [reference](#reference-for-jamba-family-models-deployed-as-a-serverless-api) section.
112123

113124

114125

115-
## Reference for Jamba Instruct deployed as a serverless API
126+
## Reference for Jamba family models deployed as a serverless API
116127

117-
Jamba Instruct models accept both of these APIs:
128+
Jamba family models accept both of these APIs:
118129

119-
- The [Azure AI model inference API](reference-model-inference-api.md) [Azure AI Model Inference API] on the route `/chat/completions` for multi-turn chat or single-turn question-answering. This API is supported because Jamba Instruct is fine-tuned for chat completion.
130+
- The [Azure AI model inference API](reference-model-inference-api.md) on the route `/chat/completions` for multi-turn chat or single-turn question-answering. This API is supported because Jamba family models are fine-tuned for chat completion.
120131
- [AI21's Azure Client](https://docs.ai21.com/reference/jamba-instruct-api). For more information about the REST endpoint being called, visit [AI21's REST documentation](https://docs.ai21.com/reference/jamba-instruct-api).
121132

122133
### Azure AI model inference API
@@ -165,7 +176,7 @@ Payload is a JSON formatted string containing the following parameters:
165176

166177
| Key | Type | Required/Default | Allowed values | Description |
167178
| ------------- | -------------- | :-----------------:| ----------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
168-
| `model` | `string` | Y | Must be `jamba-instruct` |
179+
| `model` | `string` | Y | `jamba-instruct` or `AI21 Jamba 1.5 Large` or `AI21 Jamba 1.5 Mini` |
169180
| `messages` | `list[object]` | Y | A list of objects, one per message, from oldest to newest. The oldest message can be role `system`. All later messages must alternate between user and assistant roles. See the message object definition below. |
170181
| `max_tokens` | `integer` | N <br>`4096` | 0 – 4096 | The maximum number of tokens to allow for each generated response message. Typically the best way to limit output length is by providing a length limit in the system prompt (for example, "limit your answers to three sentences") |
171182
| `temperature` | `float` | N <br>`1` | 0.0 – 2.0 | How much variation to provide in each answer. Setting this value to 0 guarantees the same response to the same question every time. Setting a higher value encourages more variation. Modifies the distribution from which tokens are sampled. We recommend altering this or `top_p`, but not both. |
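
The payload constraints in the table rows above can be sketched in Python as follows. This is a minimal illustration, not the documented client: the helper names `build_chat_payload` and `build_request` are hypothetical, the `role`/`content` message keys and the `Authorization: Bearer` header are assumptions not shown in this diff, and the `/v1/chat/completions` route is the one the article gives for AI21's Azure Client. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

ALLOWED_ROLES = {"system", "user", "assistant"}


def build_chat_payload(messages, model="jamba-instruct", max_tokens=4096, temperature=1.0):
    """Check arguments against the table's ranges and return a JSON string."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    if not 0 <= max_tokens <= 4096:
        raise ValueError("max_tokens must be between 0 and 4096")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    for index, message in enumerate(messages):
        if message["role"] not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {message['role']}")
        # Only the oldest message may use the system role.
        if message["role"] == "system" and index > 0:
            raise ValueError("only the first message can have role 'system'")
    # After an optional system message, user and assistant must alternate.
    turns = [m["role"] for m in messages if m["role"] != "system"]
    for previous, current in zip(turns, turns[1:]):
        if previous == current:
            raise ValueError("user and assistant messages must alternate")
    return json.dumps({
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
    })


def build_request(target_url, key, payload, route="/v1/chat/completions"):
    """Build (but do not send) a POST request for the deployment's Target URL."""
    return urllib.request.Request(
        url=target_url.rstrip("/") + route,
        data=payload.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the endpoint key is passed as a bearer token.
            "Authorization": f"Bearer {key}",
        },
        method="POST",
    )
```

To use it, pass the **Target** URL and **Key** values copied from the serverless endpoint to `build_request`, then hand the result to `urllib.request.urlopen`. Validating ranges locally is a design choice for faster feedback; the service remains the authority on what it accepts.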
@@ -302,9 +313,9 @@ data: [DONE]
302313

303314
## Cost and quotas
304315

305-
### Cost and quota considerations for Jamba Instruct deployed as a serverless API
316+
### Cost and quota considerations for Jamba family models deployed as a serverless API
306317

307-
Jamba models deployed as a serverless API are offered by AI21 through Azure Marketplace and integrated with Azure Machine Learning studio for use. You can find Azure Marketplace pricing when deploying or fine-tuning models.
318+
The Jamba family models deployed as a serverless API are offered by AI21 through Azure Marketplace and integrated with Azure AI studio for use. You can find Azure Marketplace pricing when deploying or fine-tuning models.
308319

309320
Each time a workspace subscribes to a given model offering from Azure Marketplace, a new resource is created to track the costs associated with its consumption. The same resource is used to track costs associated with inference and fine-tuning; however, multiple meters are available to track each scenario independently.
310321

articles/machine-learning/toc.yml

Lines changed: 1 addition & 1 deletion
@@ -619,7 +619,7 @@
619619
href: how-to-deploy-models-llama.md
620620
- name: How to deploy JAIS models
621621
href: deploy-jais-models.md
622-
- name: How to deploy Jamba Instruct model
622+
- name: AI21 Jamba models
623623
href: how-to-deploy-models-jamba.md
624624
- name: Regulate deployments using policy
625625
href: how-to-regulate-registry-deployments.md
