Update articles/machine-learning/how-to-deploy-models-from-huggingface.md

ssalgadodev · web-flow · commit 3f6bec97c4cf · 2024-05-31T03:25:50.000-04:00
diff --git a/articles/machine-learning/how-to-deploy-models-from-huggingface.md b/articles/machine-learning/how-to-deploy-models-from-huggingface.md
@@ -21,7 +21,6 @@ Microsoft has partnered with Hugging Face to bring open-source models from Huggi
 > [!NOTE] 
 > Models from Hugging Face are subject to third-party license terms available on the Hugging Face model details page. It is your responsibility to comply with the model's license terms.
 
-
 ## Benefits of using online endpoints for real-time inference
 
 Managed online endpoints in Azure Machine Learning help you deploy models to powerful CPU and GPU machines in Azure in a turnkey manner. Managed online endpoints take care of serving, scaling, securing, and monitoring your models, freeing you from the overhead of setting up and managing the underlying infrastructure. The virtual machines are provisioned on your behalf when you deploy models. You can have multiple deployments behind an endpoint and [split traffic or mirror traffic](./how-to-safely-rollout-online-endpoints.md) to those deployments. Mirroring traffic lets you test new versions of models on production traffic without releasing them to production environments. Splitting traffic lets you gradually increase production traffic to new model versions while observing performance. [Autoscale](./how-to-autoscale-endpoints.md) lets you dynamically ramp resources up or down based on workload. You can configure scaling based on utilization metrics, a specific schedule, or a combination of both. An example of scaling based on utilization metrics is adding nodes when CPU utilization goes higher than 70%. An example of schedule-based scaling is adding nodes during peak business hours.
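
The two scaling styles described above can be illustrated with a small, self-contained sketch. This is not the Azure Machine Learning autoscale API: the `ScalingPolicy` class, the `desired_nodes` function, the node limits, and the 09:00 to 17:00 peak window are all hypothetical names and values chosen for illustration. Only the 70% CPU threshold comes from the text.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class ScalingPolicy:
    """Hypothetical policy combining metric-based and schedule-based rules."""
    min_nodes: int = 1
    max_nodes: int = 5
    cpu_threshold_pct: float = 70.0  # from the text: add nodes above 70% CPU
    peak_start: time = time(9, 0)    # assumed peak business hours
    peak_end: time = time(17, 0)
    peak_min_nodes: int = 3          # assumed floor during peak hours


def desired_nodes(policy: ScalingPolicy, current_nodes: int,
                  cpu_pct: float, now: time) -> int:
    """Return the node count the policy asks for at this moment."""
    nodes = current_nodes
    # Metric-based rule: add one node when CPU exceeds the threshold.
    if cpu_pct > policy.cpu_threshold_pct:
        nodes += 1
    # Schedule-based rule: keep at least peak_min_nodes during peak hours.
    if policy.peak_start <= now < policy.peak_end:
        nodes = max(nodes, policy.peak_min_nodes)
    # Always stay within the configured bounds.
    return max(policy.min_nodes, min(nodes, policy.max_nodes))


if __name__ == "__main__":
    policy = ScalingPolicy()
    print(desired_nodes(policy, current_nodes=2, cpu_pct=85.0, now=time(22, 0)))
    print(desired_nodes(policy, current_nodes=1, cpu_pct=40.0, now=time(10, 0)))
```

In a real deployment these rules would be expressed as autoscale settings on the endpoint rather than in application code; the sketch only shows how a metric rule and a schedule rule can combine into a single node count.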