---
ms.author: sgilley
manager: cgronlund
ms.custom: "include file"
ms.topic: "include"
ms.date: 09/30/2022
---

The compute target you use to host your model will affect the cost and availability of your deployed endpoint. Use this table to choose an appropriate compute target.
| Compute target | Used for | GPU support | Description |
| ----- | ----- | ----- | ----- |
|[Local web service](../articles/machine-learning/v1/how-to-deploy-local-container-notebook-vm.md)| Testing/debugging | | Use for limited testing and troubleshooting. Hardware acceleration depends on use of libraries in the local system. |
|[Azure Machine Learning endpoints](../articles/machine-learning/concept-endpoints.md)| Real-time inference <br/><br/>Batch inference | Yes | Fully managed computes for real-time (managed online endpoints) and batch scoring (batch endpoints) on serverless compute. |
|[Azure Machine Learning Kubernetes](../articles/machine-learning/how-to-attach-kubernetes-anywhere.md)| Real-time inference <br/><br/> Batch inference | Yes | Run inferencing workloads on on-premises, cloud, and edge Kubernetes clusters. |
|[Azure Container Instances](../articles/machine-learning/v1/how-to-deploy-azure-container-instance.md) (SDK/CLI v1 only) | Real-time inference <br/><br/> Recommended for dev/test purposes only.| | Use for low-scale CPU-based workloads that require less than 48 GB of RAM. Doesn't require you to manage a cluster. <br/><br/> Supported in the designer. |
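To make the table's guidance concrete, here is a toy lookup that maps a deployment scenario to one of the targets above. The scenario keys and the default fallback are assumptions made for this sketch, not an official API or official guidance.

```python
# Toy lookup mirroring the table above; the scenario keys are invented
# for this sketch and are not part of any Azure Machine Learning API.
def suggest_target(scenario):
    """Map a deployment scenario to a compute target from the table."""
    table = {
        "local-debug": "Local web service",              # testing/debugging
        "real-time": "Azure Machine Learning endpoints",
        "batch": "Azure Machine Learning endpoints",
        "edge-or-on-prem": "Azure Machine Learning Kubernetes",
        "dev-test-cpu": "Azure Container Instances",     # SDK/CLI v1 only
    }
    # Assumed default: managed endpoints cover most production cases.
    return table.get(scenario, "Azure Machine Learning endpoints")

print(suggest_target("dev-test-cpu"))  # Azure Container Instances
```
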
> [!NOTE]
> When choosing a cluster SKU, first scale up, then scale out. Start with a machine that has 150% of the RAM your model requires, profile the result, and find a machine whose performance meets your needs. Once you've found it, increase the number of machines to fit your need for concurrent inference.
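The scale-up-then-scale-out rule in the note above can be sketched as follows. The SKU catalog, the 150% headroom factor expressed as `1.5`, and the requests-per-instance figure are illustrative assumptions, not real Azure SKUs or measured numbers.

```python
# Sketch of the sizing heuristic: pick the smallest SKU with 150% of the
# model's RAM footprint (scale up), then add instances for concurrency
# (scale out). The SKU list below is hypothetical.

def pick_sku(model_ram_gb, skus):
    """Return the smallest SKU offering at least 1.5x the model's RAM."""
    needed = model_ram_gb * 1.5
    for name, ram_gb in sorted(skus.items(), key=lambda kv: kv[1]):
        if ram_gb >= needed:
            return name
    raise ValueError("no SKU is large enough")

def instance_count(concurrent_requests, requests_per_instance):
    """Scale out: enough instances to cover the expected concurrency."""
    return -(-concurrent_requests // requests_per_instance)  # ceiling division

skus = {"small": 4, "medium": 8, "large": 16}  # GB of RAM, hypothetical SKUs
print(pick_sku(5, skus))       # 5 GB model needs 7.5 GB -> "medium"
print(instance_count(25, 10))  # 25 concurrent / 10 per instance -> 3
```

In practice, the dictionary would be replaced by the SKUs available in your region, and the per-instance throughput would come from profiling the deployed model.
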
> [!NOTE]
> Container instances require the SDK or CLI v1 and are suitable only for small models less than 1 GB in size.