You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In this article, you learn how to use Azure AI Studio to deploy AI21's Jamba-Instruct model as a serverless API with pay-as-you-go billing.
19
+
In this article, you learn how to use Azure AI Studio to deploy AI21's Jamba family models as a serverless API with pay-as-you-go billing.
20
20
21
-
The Jamba Instruct model is AI21's production-grade Mamba-based large language model (LLM) which leverages AI21's hybrid Mamba-Transformer architecture. It's an instruction-tuned version of AI21's hybrid structured state space model (SSM) transformer Jamba model. The Jamba Instruct model is built for reliable commercial use with respect to quality and performance.
21
+
The Jamba family models are AI21's production-grade Mamba-based large language model (LLM) which leverages AI21's hybrid Mamba-Transformer architecture. It's an instruction-tuned version of AI21's hybrid structured state space model (SSM) transformer Jamba model. The Jamba family models are built for reliable commercial use with respect to quality and performance.
22
22
23
-
## Deploy the Jamba Instruct model as a serverless API
23
+
## Deploy the Jamba family models as a serverless API
24
24
25
-
Certain models in the model catalog can be deployed as a serverless API with pay-as-you-go billing, providing a way to consume them as an API without hosting them on your subscription, while keeping the enterprise security and compliance organizations need. This deployment option doesn't require quota from your subscription.
25
+
Certain models in the model catalog can be deployed as a serverless API with pay-as-you-go billing, providing a way to consume them as an API without hosting them on your subscription, while keeping the enterprise security and compliance organizations need. This deployment option doesn't require quota from your subscription.
26
26
27
+
### AI21 Jamba Instruct
27
28
The [AI21-Jamba-Instruct model](https://aka.ms/aistudio/landing/ai21-labs-jamba-instruct) deployed as a serverless API with pay-as-you-go billing is [offered by AI21 through Microsoft Azure Marketplace](https://aka.ms/azure-marketplace-offer-ai21-jamba-instruct). AI21 can change or update the terms of use and pricing of this model.
28
29
29
30
To get started with Jamba Instruct deployed as a serverless API, explore our integrations with [LangChain](https://aka.ms/ai21-jamba-instruct-langchain-sample), [LiteLLM](https://aka.ms/ai21-jamba-instruct-litellm-sample), [OpenAI](https://aka.ms/ai21-jamba-instruct-openai-sample) and the [Azure API](https://aka.ms/ai21-jamba-instruct-azure-api-sample).
30
31
32
+
### AI21 Jamba 1.5
33
+
The [AI21-Jamba-1.5 model](https://aka.ms/aistudio/landing/ai21-labs-jamba-1.5) deployed as a serverless API with pay-as-you-go billing is [offered by AI21 through Microsoft Azure Marketplace](https://aka.ms/azure-marketplace-offer-ai21-jamba-1.5). AI21 can change or update the terms of use and pricing of this model.
34
+
35
+
To get started with Jamba 1.5 deployed as a serverless API, explore our integrations with [LangChain](https://aka.ms/ai21-jamba-1.5-langchain-sample), [LiteLLM](https://aka.ms/ai21-jamba-1.5-litellm-sample), [OpenAI](https://aka.ms/ai21-jamba-1.5-openai-sample) and the [Azure API](https://aka.ms/ai21-jamba-1.5-azure-api-sample).
36
+
37
+
### AI21 Jamba 1.5 Large
38
+
The [AI21-Jamba-1.5-large model](https://aka.ms/aistudio/landing/ai21-labs-jamba-1.5-large) deployed as a serverless API with pay-as-you-go billing is [offered by AI21 through Microsoft Azure Marketplace](https://aka.ms/azure-marketplace-offer-ai21-jamba-1.5-large). AI21 can change or update the terms of use and pricing of this model.
39
+
40
+
To get started with Jamba 1.5 large deployed as a serverless API, explore our integrations with [LangChain](https://aka.ms/ai21-jamba-1.5-large-langchain-sample), [LiteLLM](https://aka.ms/ai21-jamba-1.5-large-litellm-sample), [OpenAI](https://aka.ms/ai21-jamba-1.5-large-openai-sample) and the [Azure API](https://aka.ms/ai21-jamba-1.5-large-azure-api-sample).
41
+
31
42
> [!TIP]
32
-
> See our announcements of AI21's Jamba-Instruct model available now on Azure AI Model Catalog through [AI21's blog](https://aka.ms/ai21-jamba-instruct-blog) and [Microsoft Tech Community Blog](https://aka.ms/ai21-jamba-instruct-announcement).
43
+
> See our announcements of AI21's Jamba family models available now on Azure AI Model Catalog through [AI21's blog](https://aka.ms/ai21-jamba-instruct-blog) and [Microsoft Tech Community Blog](https://aka.ms/ai21-jamba-instruct-announcement).
33
44
45
+
---
34
46
35
47
### Prerequisites
36
48
37
-
# [AI21 Jamba Instruct](#tab/jamba-instruct)
38
-
39
49
- An Azure subscription with a valid payment method. Free or trial Azure subscriptions won't work. If you don't have an Azure subscription, create a [paid Azure account](https://azure.microsoft.com/pricing/purchase-options/pay-as-you-go) to begin.
40
-
- An [AI Studio hub](../how-to/create-azure-ai-resource.md). The serverless API model deployment offering for Jamba Instruct is only available with hubs created in these regions:
50
+
- An [AI Studio hub](../how-to/create-azure-ai-resource.md). The serverless API model deployment offering for Jamba family models is only available with hubs created in these regions:
41
51
42
52
* East US
43
53
* East US 2
@@ -68,35 +78,22 @@ To get started with Jamba Instruct deployed as a serverless API, explore our int
68
78
69
79
For more information on permissions, see [Role-based access control in Azure AI Studio](../concepts/rbac-ai-studio.md).
70
80
71
-
72
-
# [AI21 Jamba](#tab/jamba-1.5)
73
-
74
-
75
-
76
-
77
-
# [AI21 Jamba](#tab/jamba-1.5-instruct)
78
-
79
-
80
-
81
-
82
81
---
83
82
84
83
### Create a new deployment
85
84
86
-
# [AI21 Jamba Instruct](#tab/jamba-instruct)
87
-
88
-
These steps demonstrate the deployment of AI21-Jamba-Instruct. To create a deployment:
85
+
These steps demonstrate the deployment of AI21-Jamba family models. To create a deployment:
89
86
90
87
1. Sign in to [Azure AI Studio](https://ai.azure.com).
91
88
1. Select **Model catalog** from the left sidebar.
92
-
1. Search for and select **AI21-Jamba-Instruct** to open its Details page.
89
+
1. Search for and select a AI21 model like **AI21-Jamba-Instruct** or **AI21-Jamba-1.5** or **AI21-Jamba-1.5-large** to open its Details page.
93
90
1. Select **Deploy** to open a serverless API deployment window for the model.
94
91
1. Alternatively, you can initiate a deployment by starting from your project in AI Studio.
95
92
1. From the left sidebar of your project, select **Components** > **Deployments**.
96
93
1. Select **+ Create deployment**.
97
-
1. Search for and select **AI21-Jamba-Instruct**. to open the Model's Details page.
94
+
1. Search for and select a AI21 model like **AI21-Jamba-Instruct** or **AI21-Jamba-1.5** or **AI21-Jamba-1.5-large** to open the Model's Details page.
98
95
1. Select **Confirm** to open a serverless API deployment window for the model.
99
-
1. Select the project in which you want to deploy your model. To deploy the AI21-Jamba-Instruct model, your project must be in one of the regions listed in the [Prerequisites](#prerequisites) section.
96
+
1. Select the project in which you want to deploy your model. To deploy the AI21-Jamba family models, your project must be in one of the regions listed in the [Prerequisites](#prerequisites) section.
100
97
1. In the deployment wizard, select the link to **Azure Marketplace Terms**, to learn more about the terms of use.
101
98
1. Select the **Pricing and terms** tab to learn about pricing for the selected model.
102
99
1. Select the **Subscribe and Deploy** button. If this is your first time deploying the model in the project, you have to subscribe your project for the particular offering. This step requires that your account has the Azure subscription permissions and resource group permissions listed in the [Prerequisites](#prerequisites). Each project has its own subscription to the particular Azure Marketplace offering of the model, which allows you to control and monitor spending. Currently, you can have only one deployment for each model within a project.
@@ -106,27 +103,13 @@ These steps demonstrate the deployment of AI21-Jamba-Instruct. To create a deplo
106
103
1. Return to the Deployments page, select the deployment, and note the endpoint's **Target** URL and the Secret **Key**. For more information on using the APIs, see the [Reference](#reference-for-jamba-instruct-deployed-as-a-serverless-api) section.
107
104
1. You can always find the endpoint's details, URL, and access keys by navigating to your **Project overview** page. Then, from the left sidebar of your project, select **Components** > **Deployments**.
108
105
109
-
To learn about billing for the AI21-Jamba-Instruct model deployed as a serverless API with pay-as-you-go token-based billing, see [Cost and quota considerations for Jamba Instruct deployed as a serverless API](#cost-and-quota-considerations-for-jamba-instruct-deployed-as-a-serverless-api).
110
-
111
-
# [AI21 Jamba](#tab/jamba-1.5)
112
-
113
-
114
-
115
-
116
-
117
-
# [AI21 Jamba](#tab/jamba-1.5-instruct)
118
-
119
-
120
-
121
-
106
+
To learn about billing for the AI21-Jamba family models deployed as a serverless API with pay-as-you-go token-based billing, see [Cost and quota considerations for Jamba Instruct deployed as a serverless API](#cost-and-quota-considerations-for-jamba-instruct-deployed-as-a-serverless-api).
122
107
123
108
---
124
109
125
-
### Consume Jamba Instruct as a serverless API
126
-
127
-
# [AI21 Jamba Instruct](#tab/jamba-instruct)
110
+
### Consume Jamba family models as a serverless API
128
111
129
-
You can consume Jamba Instruct models as follows:
112
+
You can consume Jamba family models as follows:
130
113
131
114
1. From your **Project overview** page, go to the left sidebar and select **Components** > **Deployments**.
132
115
@@ -138,24 +121,13 @@ You can consume Jamba Instruct models as follows:
138
121
139
122
For more information on using the APIs, see the [reference](#reference-for-jamba-instruct-deployed-as-a-serverless-api) section.
140
123
141
-
# [AI21 Jamba Instruct](#tab/jamba-1.5)
142
-
143
-
144
-
145
-
146
-
147
-
# [AI21 Jamba Instruct](#tab/jamba-1.5-instruct)
148
-
149
-
150
-
151
-
152
124
---
153
125
154
-
## Reference for Jamba Instruct deployed as a serverless API
126
+
## Reference for Jamba family models deployed as a serverless API
155
127
156
-
Jamba Instruct models accept both of these APIs:
128
+
Jamba family models accept both of these APIs:
157
129
158
-
- The [Azure AI Model Inference API](../reference/reference-model-inference-api.md) on the route `/chat/completions` for multi-turn chat or single-turn question-answering. This API is supported because Jamba Instruct is fine-tuned for chat completion.
130
+
- The [Azure AI Model Inference API](../reference/reference-model-inference-api.md) on the route `/chat/completions` for multi-turn chat or single-turn question-answering. This API is supported because Jamba family models are fine-tuned for chat completion.
159
131
-[AI21's Azure Client](https://docs.ai21.com/reference/jamba-instruct-api). For more information about the REST endpoint being called, visit [AI21's REST documentation](https://docs.ai21.com/reference/jamba-instruct-api).
160
132
161
133
### Azure AI model inference API
@@ -204,7 +176,7 @@ Payload is a JSON formatted string containing the following parameters:
|`model`|`string`| Y | Must be `jamba-1.5` or `jamba-1.5-large` or `jamba-instruct`|
208
180
|`messages`|`list[object]`| Y | A list of objects, one per message, from oldest to newest. The oldest message can be role `system`. All later messages must alternate between user and assistant roles. See the message object definition below. |
209
181
|`max_tokens`|`integer`| N <br>`4096`| 0 – 4096 | The maximum number of tokens to allow for each generated response message. Typically the best way to limit output length is by providing a length limit in the system prompt (for example, "limit your answers to three sentences") |
210
182
|`temperature`|`float`| N <br>`1`| 0.0 – 2.0 | How much variation to provide in each answer. Setting this value to 0 guarantees the same response to the same question every time. Setting a higher value encourages more variation. Modifies the distribution from which tokens are sampled. We recommend altering this or `top_p`, but not both. |
"content": "You are a helpful genie just released from a bottle. You start the conversation with 'Thank you for freeing me! I grant you one wish.'"},
@@ -341,9 +313,9 @@ data: [DONE]
341
313
342
314
## Cost and quotas
343
315
344
-
### Cost and quota considerations for Jamba Instruct deployed as a serverless API
316
+
### Cost and quota considerations for Jamba family models deployed as a serverless API
345
317
346
-
The Jamba Instruct model is deployed as a serverless API and is offered by AI21 through Azure Marketplace and integrated with Azure AI studio for use. You can find Azure Marketplace pricing when deploying or fine-tuning models.
318
+
The Jamba family models are deployed as a serverless API and is offered by AI21 through Azure Marketplace and integrated with Azure AI studio for use. You can find Azure Marketplace pricing when deploying or fine-tuning models.
347
319
348
320
Each time a workspace subscribes to a given model offering from Azure Marketplace, a new resource is created to track the costs associated with its consumption. The same resource is used to track costs associated with inference and fine-tuning; however, multiple meters are available to track each scenario independently.
0 commit comments