Deployment of a large language model (LLM) makes it available for use in a website, an application, or another production environment. Deployment typically involves hosting the model on a server or in the cloud and creating an API or other interface for users to interact with the model. You can invoke the deployment for real-time inference in generative AI applications such as chat and copilot.
In this article, you learn how to deploy large language models in Azure AI Studio by using the Azure Machine Learning SDK, and how to perform inference on the deployed models. You can deploy models from the model catalog or from your project.
## Deploy and inference a Serverless API model with code
### Deploying a model
Serverless API models are the models you can deploy with pay-as-you-go billing. Examples include Phi-3, Llama-2, Command R, Mistral Large, and more. For serverless API models, you're only charged for inferencing, unless you choose to fine-tune the model.
#### Get the model ID
You can deploy Serverless API models using the Azure Machine Learning SDK, but first, let's browse the model catalog and get the model ID you need for deployment.
1. Sign in to [AI Studio](https://ai.azure.com) and go to the **Home** page.
1. Select **Model catalog** from the left sidebar.
1. In the **Deployment options** filter, select **Serverless API**.
:::image type="content" source="../media/deploy-monitor/catalog-filter-serverless-api.png" alt-text="A screenshot showing how to filter by serverless API models in the catalog." lightbox="../media/deploy-monitor/catalog-filter-serverless-api.png":::
1. Select a model.
1. Copy the model ID from the details page of the model you selected. It looks something like this: `azureml://registries/azureml-cohere/models/Cohere-command-r-plus/versions/3`
#### Install the Azure Machine Learning SDK
Next, you need to install the Azure Machine Learning SDK. Run the following commands in your terminal:
```bash
pip install azure-ai-ml
pip install azure-identity
```
#### Deploy the serverless API model
First, you need to authenticate to Azure AI. Replace the placeholders in the following code with your subscription ID, resource group name, and AI Studio project name.
```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
from azure.ai.ml.entities import MarketplaceSubscription, ServerlessEndpoint

# You can find your credential information in your project settings.
client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="your subscription ID goes here",
    resource_group_name="your resource group name goes here",
    workspace_name="your project name goes here",
)
```
Second, let's reference the model ID you found earlier.
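For example, using the Cohere Command R+ ID shown earlier (substitute the ID of the model you selected):

```python
# The model ID copied from the model catalog.
model_id = "azureml://registries/azureml-cohere/models/Cohere-command-r-plus/versions/3"
```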
Serverless API models from third-party model providers require an Azure Marketplace subscription before you can use the model. Let's create a marketplace subscription.
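Here's a minimal sketch of that step, using the `MarketplaceSubscription` entity imported earlier; the subscription name is a hypothetical example, and `begin_create_or_update` returns a poller that you wait on:

```python
marketplace_subscription = MarketplaceSubscription(
    model_id=model_id,
    name="cohere-command-r-plus-subscription",  # hypothetical name; choose your own
)

# Start the subscription and wait for provisioning to complete.
marketplace_subscription = client.marketplace_subscriptions.begin_create_or_update(
    marketplace_subscription
).result()
```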
> [!NOTE]
> You can skip this step if you're deploying a Serverless API model from Microsoft, such as Phi-3; those models don't require an Azure Marketplace subscription.
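With the subscription in place, create the serverless endpoint itself. This sketch uses the `ServerlessEndpoint` entity imported earlier; the endpoint name is a hypothetical example and needs to be unique in your region:

```python
serverless_endpoint = ServerlessEndpoint(
    name="command-r-plus-endpoint",  # hypothetical name; must be unique in the region
    model_id=model_id,
)

# Create the endpoint and wait for it to be ready.
created_endpoint = client.serverless_endpoints.begin_create_or_update(
    serverless_endpoint
).result()

print(created_endpoint.scoring_uri)
```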
### Inferencing the deployment

To run inference, use code that matches the model type and the SDK you're using. You can find code samples in the [Azure/azureml-examples sample repository](https://github.com/Azure/azureml-examples/tree/main/sdk/python/foundation-models).
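As one illustration (not an official sample), here's a minimal sketch that calls a chat model deployed as a serverless API. It assumes the hypothetical endpoint created above, key-based authentication, and that the endpoint exposes the common `/chat/completions` route relative to its scoring URI:

```python
import requests

# Retrieve the endpoint's API key; the name is the hypothetical endpoint from above.
keys = client.serverless_endpoints.get_keys("command-r-plus-endpoint")

response = requests.post(
    f"{created_endpoint.scoring_uri}/chat/completions",
    headers={
        "Authorization": f"Bearer {keys.primary_key}",
        "Content-Type": "application/json",
    },
    json={
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
        "max_tokens": 128,
    },
)
print(response.json()["choices"][0]["message"]["content"])
```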
## Deploy and inference a managed compute deployment with code
### Deploying a model
The AI Studio [model catalog](../how-to/model-catalog-overview.md) offers over 1,600 models, and the most common way to deploy these models is to use the managed compute deployment option, which is also sometimes referred to as a managed online deployment.
#### Get the model ID
You can deploy managed compute models using the Azure Machine Learning SDK, but first, let's browse the model catalog and get the model ID you need for deployment.
1. Sign in to [AI Studio](https://ai.azure.com) and go to the **Home** page.
1. Select **Model catalog** from the left sidebar.
1. In the **Deployment options** filter, select **Managed compute**.
1. Select a model.
1. Copy the model ID from the details page of the model you selected. It looks something like this: `azureml://registries/azureml/models/deepset-roberta-base-squad2/versions/16`
#### Install the Azure Machine Learning SDK
For this step, you need to install the Azure Machine Learning SDK.
```bash
pip install azure-ai-ml
pip install azure-identity
```
#### Deploy the model
Use this code to authenticate with Azure Machine Learning and create a client object. Replace the placeholders with your subscription ID, resource group name, and AI Studio project name.
```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# You can find your credential information in your project settings.
client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="your subscription ID goes here",
    resource_group_name="your resource group name goes here",
    workspace_name="your project name goes here",
)
```
Let's deploy the model.
For the managed compute deployment option, you need to create an endpoint before a model deployment. Think of an endpoint as a container that can house multiple model deployments. The endpoint names need to be unique in a region, so in this example we're using the timestamp to create a unique endpoint name.
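Here's a sketch of that flow. The deployment name and instance type are assumptions; pick a SKU that's available in your subscription and supported by the model:

```python
import time
from azure.ai.ml.entities import ManagedOnlineEndpoint, ManagedOnlineDeployment

# The model ID copied from the model catalog.
model_id = "azureml://registries/azureml/models/deepset-roberta-base-squad2/versions/16"

# A timestamp keeps the endpoint name unique within the region.
endpoint_name = f"my-endpoint-{int(time.time())}"

# Create the endpoint first; it can house multiple deployments.
endpoint = ManagedOnlineEndpoint(name=endpoint_name, auth_mode="key")
client.online_endpoints.begin_create_or_update(endpoint).result()

# Then host the model on the endpoint as a deployment.
deployment = ManagedOnlineDeployment(
    name="demo",  # hypothetical deployment name
    endpoint_name=endpoint_name,
    model=model_id,
    instance_type="Standard_DS3_v2",  # assumption: choose an available SKU
    instance_count=1,
)
client.online_deployments.begin_create_or_update(deployment).result()
```

Once the deployment is healthy, you can send a test request with `client.online_endpoints.invoke(endpoint_name=endpoint_name, request_file="sample-request.json")`, where the request file matches the model's scoring schema.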