Commit acbd021

update new user error and update role require for TA-cluster
1 parent 192dfdf commit acbd021

2 files changed: +31 -0 lines changed

articles/machine-learning/how-to-attach-kubernetes-to-workspace.md

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ Otherwise, if a [user-assigned managed identity is specified in Azure Machine Le
|--|--|--|
|Azure Relay|Azure Relay Owner|Only applicable for Arc-enabled Kubernetes cluster. Azure Relay isn't created for AKS cluster without Arc connected.|
|Kubernetes - Azure Arc or Azure Kubernetes Service|Reader <br> Kubernetes Extension Contributor <br> Azure Kubernetes Service Cluster Admin |Applicable for both Arc-enabled Kubernetes cluster and AKS cluster.|
+|Azure Kubernetes Service|Contributor|Applicable for a private AKS cluster in a region where Trusted Access is enabled, when the cluster can't be attached to the workspace. For details, see [this doc](https://github.com/Azure/AML-Kubernetes/blob/master/docs/azureml-aks-ta-support.md).|


> [!TIP]

articles/machine-learning/how-to-troubleshoot-kubernetes-compute.md

Lines changed: 30 additions & 0 deletions
@@ -110,6 +110,8 @@ Below is a list of error types in **cluster scope** that you might encounter whe
* [ERROR: GenericClusterError](#error-genericclustererror)
* [ERROR: ClusterNotReachable](#error-clusternotreachable)
* [ERROR: ClusterNotFound](#error-clusternotfound)
+* [ERROR: ClusterServiceNotFound](#error-clusterservicenotfound)
+* [ERROR: ClusterUnauthorized](#error-clusterunauthorized)

#### ERROR: GenericClusterError

@@ -163,6 +165,34 @@ You can check the following items to troubleshoot the issue:
* First, check the cluster resource ID in the Azure portal to verify that the Kubernetes cluster resource still exists and is running normally.
* If the cluster exists and is running, try to detach and reattach the compute to the workspace. For more notes, see [reattach](#error-genericcomputeerror).

+#### ERROR: ClusterServiceNotFound
+
+The error message is as follows:
+
+````
+AzureML extension service not found in cluster.
+````
+
+This error occurs when the ingress service owned by the extension doesn't have enough backend pods.
+
+You can:
+
+* Access the cluster and check the status of the `azureml-ingress-nginx-controller` service and its backend pods in the `azureml` namespace.
+* If the cluster doesn't have any running backend pods, describe the pod to find the reason. For example, if the pod doesn't have enough resources to run, delete some deployments to free resources for the ingress pod.
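
The checks in the list above map to standard `kubectl` commands: `kubectl get service azureml-ingress-nginx-controller -n azureml`, `kubectl get pods -n azureml`, and `kubectl describe pod <pod-name> -n azureml`. As a minimal sketch (an illustration, not part of this commit), the Python helper below takes the parsed JSON from `kubectl get pods -n azureml -o json` and lists pods that aren't running and ready. The function name `unready_pods`, the pod names, and the sample payload are hypothetical; only the service name and namespace come from the text above.

```python
import json


def unready_pods(pod_list: dict) -> list[tuple[str, str]]:
    """Return (pod name, reason) for pods that are not Running and Ready.

    `pod_list` is the parsed output of `kubectl get pods -n azureml -o json`.
    """
    problems = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        status = pod.get("status", {})
        conditions = status.get("conditions", [])
        phase = status.get("phase", "Unknown")
        if phase != "Running":
            # A pod that isn't running often has a failed condition with a
            # reason, for example "Unschedulable" when nodes lack resources.
            reason = next(
                (c.get("reason", phase) for c in conditions
                 if c.get("status") == "False"),
                phase,
            )
            problems.append((name, reason))
            continue
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            problems.append((name, "NotReady"))
    return problems


# Hypothetical sample, shaped like `kubectl get pods -o json` output.
sample = json.loads("""
{"items": [
  {"metadata": {"name": "azureml-ingress-nginx-controller-abc"},
   "status": {"phase": "Pending",
              "conditions": [{"type": "PodScheduled", "status": "False",
                              "reason": "Unschedulable"}]}},
  {"metadata": {"name": "healthy-pod"},
   "status": {"phase": "Running",
              "conditions": [{"type": "Ready", "status": "True"}]}}
]}
""")

if __name__ == "__main__":
    for pod_name, reason in unready_pods(sample):
        print(f"{pod_name}: {reason}")
```

A reason such as `Unschedulable` suggests the resource shortage described above; `kubectl describe pod` shows the full scheduling events for that pod.
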
+
+#### ERROR: ClusterUnauthorized
+
+The error message is as follows:
+
+````
+AMLArc failed to connect to the cluster, reason: Unauthorized.
+````
+
+This error should only occur in a Trusted Access (TA)-enabled cluster. It means the token expired during the deployment.
+
+You can:
+* Try again after several minutes.
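
As a rough sketch of "try again after several minutes" (an illustration, not part of this commit), the helper below retries an operation when it fails with a message containing `Unauthorized`, waiting between attempts. The name `retry_on_unauthorized`, the `operation` callable, and the default three-minute wait are all assumptions; substitute whatever call you use to deploy.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_unauthorized(
    operation: Callable[[], T],
    attempts: int = 3,
    wait_seconds: float = 180.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying only when the error mentions 'Unauthorized'."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as err:
            # Re-raise unrelated errors, and give up after the last attempt.
            if "Unauthorized" not in str(err) or attempt == attempts:
                raise
            sleep(wait_seconds)
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter exists only so the wait can be replaced in tests. If the error persists after a few retries, the expired-token explanation above probably doesn't apply.
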
+
> [!TIP]
> For more troubleshooting guidance on common errors when creating or updating Kubernetes online endpoints and deployments, see [How to troubleshoot online endpoints](how-to-troubleshoot-online-endpoints.md).
