|
1 | 1 | ---
|
2 |
| -title: Deploy ML models to Azure Kubernetes Service with CLI and SDK v1 |
| 2 | +title: Deploy ML models to Azure Kubernetes Service - CLI/SDK v1 |
3 | 3 | titleSuffix: Azure Machine Learning
|
4 | 4 | description: 'Use CLI (v1) and SDK (v1) to deploy your Azure Machine Learning models as a web service using Azure Kubernetes Service.'
|
5 | 5 | services: machine-learning
|
6 | 6 | ms.service: azure-machine-learning
|
7 | 7 | ms.subservice: inferencing
|
8 | 8 | ms.topic: how-to
|
9 |
| -ms.custom: UpdateFrequency5, deploy, cliv1, sdkv1 |
| 9 | +ms.custom: UpdateFrequency5, deploy, cliv1, sdkv1, FY25Q1-Linter |
10 | 10 | author: Blackmist
|
11 | 11 | ms.author: larryfr
|
12 | 12 | ms.reviewer: bozhlin
|
13 |
| -ms.date: 03/08/2024 |
| 13 | +ms.date: 09/09/2024 |
| 14 | +# Customer Intent: As a data scientist, I want to deploy my machine learning model to Azure Kubernetes Service so that I can scale my model to meet production demands. |
14 | 15 | ---
|
15 | 16 |
|
16 | 17 | # Deploy a model to an Azure Kubernetes Service cluster with v1
|
@@ -43,7 +44,7 @@ When deploying to AKS, you deploy to an AKS cluster that's *connected to your wo
|
43 | 44 |
|
44 | 45 | - The [Azure CLI extension (v1) for Machine Learning service](reference-azure-machine-learning-cli.md), [Azure Machine Learning Python SDK](/python/api/overview/azure/ml/intro), or the [Azure Machine Learning Visual Studio Code extension](../how-to-setup-vs-code.md).
|
45 | 46 |
|
46 |
| - [!INCLUDE [cli v1 deprecation](../includes/machine-learning-cli-v1-deprecation.md)] |
| 47 | + [!INCLUDE [CLI v1 deprecation](../includes/machine-learning-cli-v1-deprecation.md)] |
47 | 48 |
|
48 | 49 | - The Python code snippets in this article assume that the following variables are set:
|
49 | 50 |
|
@@ -196,7 +197,7 @@ For more information on the classes, methods, and parameters used in this exampl
|
196 | 197 |
|
197 | 198 | # [Azure CLI](#tab/azure-cli)
|
198 | 199 |
|
199 |
| -[!INCLUDE [cli v1](../includes/machine-learning-cli-v1.md)] |
| 200 | +[!INCLUDE [CLI v1](../includes/machine-learning-cli-v1.md)] |
200 | 201 |
|
201 | 202 | To deploy using the CLI, use the following command. Replace `myaks` with the name of the AKS compute target. Replace `mymodel:1` with the name and version of the registered model. Replace `myservice` with the name to give this service:
|
202 | 203 |
|
@@ -283,9 +284,9 @@ For information on using VS Code, see [deploy to AKS via the VS Code extension](
|
283 | 284 | The component that handles autoscaling for Azure Machine Learning model deployments is azureml-fe, which is a smart request router. Since all inference requests go through it, it has the necessary data to automatically scale the deployed models.
|
284 | 285 |
|
285 | 286 | > [!IMPORTANT]
|
286 |
| -> * **Don't enable Kubernetes Horizontal Pod Autoscaler (HPA) for model deployments**. Doing so causes the two auto-scaling components to compete with each other. Azureml-fe is designed to auto-scale models deployed by Azure Machine Learning, where HPA would have to guess or approximate model utilization from a generic metric like CPU usage or a custom metric configuration. |
| 287 | +> * __Don't enable Kubernetes Horizontal Pod Autoscaler (HPA) for model deployments__. Doing so causes the two auto-scaling components to compete with each other. Azureml-fe is designed to auto-scale models deployed by Azure Machine Learning, where HPA would have to guess or approximate model utilization from a generic metric like CPU usage or a custom metric configuration. |
287 | 288 | >
|
288 |
| -> * **Azureml-fe does not scale the number of nodes in an AKS cluster**, because this could lead to unexpected cost increases. Instead, **it scales the number of replicas for the model** within the physical cluster boundaries. If you need to scale the number of nodes within the cluster, you can manually scale the cluster or [configure the AKS cluster autoscaler](/azure/aks/cluster-autoscaler). |
| 289 | +> * __Azureml-fe does not scale the number of nodes in an AKS cluster__, because this could lead to unexpected cost increases. Instead, __it scales the number of replicas for the model__ within the physical cluster boundaries. If you need to scale the number of nodes within the cluster, you can manually scale the cluster or [configure the AKS cluster autoscaler](/azure/aks/cluster-autoscaler). |
289 | 290 |
|
290 | 291 | Autoscaling can be controlled by setting `autoscale_target_utilization`, `autoscale_min_replicas`, and `autoscale_max_replicas` for the AKS web service. The following example demonstrates how to enable autoscaling:
|
291 | 292 |
|
|
0 commit comments