Commit 9eb6fea

Merge pull request #188329 from zhiyong-gayang/master
Update azureml aks self-scaler
2 parents 326e735 + ae9f5cf commit 9eb6fea

1 file changed: +6 -1 lines changed

articles/machine-learning/how-to-deploy-azure-kubernetes-service.md

Lines changed: 6 additions & 1 deletion
@@ -83,12 +83,14 @@ In Azure Machine Learning, "deployment" is used in the more general sense of mak
The front-end component (azureml-fe) that routes incoming inference requests to deployed services automatically scales as needed. Scaling of azureml-fe is based on the AKS cluster purpose and size (number of nodes). The cluster purpose and nodes are configured when you [create or attach an AKS cluster](how-to-create-attach-kubernetes.md). There is one azureml-fe service per cluster, which may be running on multiple pods.

> [!IMPORTANT]
-> When using a cluster configured as __dev-test__, the self-scaler is **disabled**.
+> When using a cluster configured as __dev-test__, the self-scaler is **disabled**. Even for FastProd/DenseProd clusters, Self-Scaler is only enabled when telemetry shows that it's needed.

Azureml-fe scales both up (vertically) to use more cores, and out (horizontally) to use more pods. When making the decision to scale up, the time that it takes to route incoming inference requests is used. If this time exceeds the threshold, a scale-up occurs. If the time to route incoming requests continues to exceed the threshold, a scale-out occurs.

When scaling down and in, CPU usage is used. If the CPU usage threshold is met, the front end will first be scaled down. If the CPU usage drops to the scale-in threshold, a scale-in operation happens. Scaling up and out will only occur if there are enough cluster resources available.
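
Reviewer note: the scaling rules described in the two paragraphs above can be sketched as a small decision function. This is only an illustration of the documented ordering (up before out, down before in), not the azureml-fe implementation. Every threshold value, the `MAX_CORES_PER_POD` limit, and the names `FrontEndState` and `decide` are hypothetical. The sketch also assumes that "continues to exceed the threshold" means the pod has already been scaled up to its core limit.

```python
from dataclasses import dataclass

# Hypothetical placeholder values: the source does not publish the real
# thresholds or limits used by azureml-fe.
ROUTING_TIME_THRESHOLD_MS = 50.0
SCALE_DOWN_CPU_THRESHOLD = 0.30
SCALE_IN_CPU_THRESHOLD = 0.10
MAX_CORES_PER_POD = 4


@dataclass
class FrontEndState:
    routing_time_ms: float      # time taken to route incoming requests
    cpu_usage: float            # fraction of allocated CPU in use, 0.0 to 1.0
    cores_per_pod: int
    pods: int
    cluster_has_capacity: bool  # whether spare cluster resources exist


def decide(state: FrontEndState) -> str:
    """Return one of: scale-up, scale-out, scale-down, scale-in, none."""
    if state.routing_time_ms > ROUTING_TIME_THRESHOLD_MS:
        # Scaling up and out only occurs when the cluster has spare resources.
        if not state.cluster_has_capacity:
            return "none"
        # Assumption: try more cores first, then more pods once cores are maxed.
        if state.cores_per_pod < MAX_CORES_PER_POD:
            return "scale-up"
        return "scale-out"
    # CPU usage at or below the lower threshold removes a pod (scale-in).
    if state.cpu_usage <= SCALE_IN_CPU_THRESHOLD and state.pods > 1:
        return "scale-in"
    # CPU usage at or below the higher threshold removes cores first (scale-down).
    if state.cpu_usage <= SCALE_DOWN_CPU_THRESHOLD and state.cores_per_pod > 1:
        return "scale-down"
    return "none"
```

For example, `decide(FrontEndState(80.0, 0.5, 1, 1, True))` yields a scale-up, while the same load on a cluster without spare capacity yields no action.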

+When scale-up or scale-down, azureml-fe pods will be restarted to apply the cpu/memory changes. Inferencing requests are not affected by the restarts.
+
<a id="connectivity"></a>

## Understand connectivity requirements for AKS inferencing cluster
@@ -105,6 +107,8 @@ The following diagram shows the connectivity requirements for AKS inferencing. B

For general AKS connectivity requirements, see [Control egress traffic for cluster nodes in Azure Kubernetes Service](../aks/limit-egress-traffic.md).

+For accessing Azure ML services behind a firewall, see [How to access azureml behind firewall](https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/machine-learning/how-to-access-azureml-behind-firewall.md).
+
### Overall DNS resolution requirements

DNS resolution within an existing VNet is under your control. For example, a firewall or custom DNS server. The following hosts must be reachable:
@@ -117,6 +121,7 @@ DNS resolution within an existing VNet is under your control. For example, a fir
| `<account>.table.core.windows.net` | Azure Storage Account (table storage) |
| `<account>.blob.core.windows.net` | Azure Storage Account (blob storage) |
| `api.azureml.ms` | Azure Active Directory (AAD) authentication |
+| `ingest-vienna<region>.kusto.windows.net` | Kusto endpoint for uploading telemetry |
| `<leaf-domain-label + auto-generated suffix>.<region>.cloudapp.azure.com` | Endpoint domain name, if you autogenerated by Azure Machine Learning. If you used a custom domain name, you do not need this entry. |
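
Reviewer note: a quick way to confirm that the hosts in this table resolve from inside the VNet is a short script like the one below. It is a minimal sketch that uses only Python's standard `socket` module. The function name `check_dns` and the injectable `resolve` parameter are assumptions made for illustration. The script checks DNS resolution only, not firewall reachability, and placeholders such as `<account>` and `<region>` must be replaced with real values before running it.

```python
import socket
from typing import Callable, Dict, Iterable


def check_dns(hosts: Iterable[str],
              resolve: Callable[[str], object] = socket.gethostbyname) -> Dict[str, bool]:
    """Map each host name to True if it resolves, False otherwise."""
    results: Dict[str, bool] = {}
    for host in hosts:
        # Refuse template entries copied straight from the table.
        if "<" in host or ">" in host:
            raise ValueError(f"fill in the placeholder in {host!r} first")
        try:
            resolve(host)
            results[host] = True
        except OSError:  # socket.gaierror is a subclass of OSError
            results[host] = False
    return results


if __name__ == "__main__":
    # Only the fixed host from the table is listed here; add the storage,
    # Kusto, and endpoint hosts with your own account and region filled in.
    for name, ok in check_dns(["api.azureml.ms"]).items():
        print(f"{name}: {'resolves' if ok else 'DOES NOT RESOLVE'}")
```

Passing a custom `resolve` callable makes the function usable in tests without network access.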

### Connectivity requirements in chronological order: from cluster creation to model deployment
