
Commit 3f6bec9

Update articles/machine-learning/how-to-deploy-models-from-huggingface.md

1 parent 846a7bf, commit 3f6bec9

1 file changed: 0 additions, 1 deletion


articles/machine-learning/how-to-deploy-models-from-huggingface.md

Lines changed: 0 additions & 1 deletion
@@ -21,7 +21,6 @@ Microsoft has partnered with Hugging Face to bring open-source models from Huggi
 > [!NOTE]
 > Models from Hugging Face are subject to third party license terms available on the Hugging Face model details page. It is your responsibility to comply with the model's license terms.
 
-
 ## Benefits of using online endpoints for real-time inference
 
Managed online endpoints in Azure Machine Learning help you deploy models to powerful CPU and GPU machines in Azure in a turnkey manner. Managed online endpoints take care of serving, scaling, securing, and monitoring your models, freeing you from the overhead of setting up and managing the underlying infrastructure. The virtual machines are provisioned on your behalf when you deploy models. You can have multiple deployments behind an endpoint and [split traffic or mirror traffic](./how-to-safely-rollout-online-endpoints.md) to those deployments. Mirror traffic helps you test new versions of models on production traffic without releasing them to production environments. Splitting traffic lets you gradually increase production traffic to new model versions while observing performance. [Autoscale](./how-to-autoscale-endpoints.md) lets you dynamically ramp resources up or down based on workloads. You can configure scaling based on utilization metrics, a specific schedule, or a combination of both. An example of scaling based on utilization metrics is adding nodes when CPU utilization goes higher than 70%. An example of schedule-based scaling is adding nodes during peak business hours.
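The traffic-splitting and utilization-based autoscale rules described above can be sketched as plain logic, independent of any Azure service (the 70% threshold comes from the article's example; the `blue`/`green` deployment names and the function names are illustrative assumptions, not Azure Machine Learning APIs):

```python
import random


def pick_deployment(traffic: dict) -> str:
    """Choose a deployment for one request according to traffic-split
    percentages, e.g. {"blue": 90, "green": 10}."""
    assert sum(traffic.values()) == 100, "percentages must sum to 100"
    roll = random.uniform(0, 100)
    cumulative = 0
    for name, pct in traffic.items():
        cumulative += pct
        if roll < cumulative:
            return name
    return name  # fall back to the last deployment at the boundary


def should_add_node(cpu_utilization: float, threshold: float = 70.0) -> bool:
    """Utilization-based scale-out rule: add a node when average CPU
    utilization exceeds the threshold (70% in the article's example)."""
    return cpu_utilization > threshold
```

In Azure Machine Learning itself you don't write this logic by hand: the split is set declaratively (for example, `endpoint.traffic = {"blue": 90, "green": 10}` in the v2 Python SDK) and autoscale rules are configured through Azure Monitor.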
