Commit 4d12268

Add common errors to AKS troubleshooting doc
1 parent 6dddd3b commit 4d12268

1 file changed

+58
-6
lines changed

articles/service-connector/how-to-use-service-connector-in-aks.md

Lines changed: 58 additions & 6 deletions
@@ -125,19 +125,19 @@ Service Connector kubernetes extension is built on top of [Azure Arc-enabled Kub
125125
126126
1. Install the `k8s-extension` Azure CLI extension.
127127

128-
```azurecli
128+
```azurecli
129129
az extension add --name k8s-extension
130-
```
130+
```
131131

132132
1. Get the Service Connector extension status. Check the `statuses` property in the command output to see if there are any errors.
133133

134-
```azurecli
134+
```azurecli
135135
az k8s-extension show \
136136
--resource-group MyClusterResourceGroup \
137137
--cluster-name MyCluster \
138138
--cluster-type managedClusters \
139139
--name sc-extension
140-
```
140+
```
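
To read the `statuses` property without scanning the full output, the same command can be narrowed with a JMESPath query. This is a minimal sketch, not part of the original article: the function name `show_sc_extension_statuses` is hypothetical, and the resource group and cluster name are the placeholders used above.

```bash
# Hypothetical helper: print only the provisioning state and the `statuses`
# property of the Service Connector extension.
# $1 = resource group (placeholder), $2 = cluster name (placeholder)
show_sc_extension_statuses() {
  local resource_group="$1" cluster_name="$2"
  az k8s-extension show \
    --resource-group "$resource_group" \
    --cluster-name "$cluster_name" \
    --cluster-type managedClusters \
    --name sc-extension \
    --query "{provisioningState: provisioningState, statuses: statuses}" \
    --output json
}
```

For example, `show_sc_extension_statuses MyClusterResourceGroup MyCluster` prints a small JSON object instead of the full extension resource.
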
141141

142142
### Check kubernetes cluster logs
143143

@@ -150,7 +150,7 @@ If there's an error during the extension installation, and the error message in
150150
--resource-group MyClusterResourceGroup \
151151
--name MyCluster
152152
```
153-
1. Service Connector extension is installed in the namespace `sc-system` through helm chart, check the namespace and the helm release by following commands.
153+
2. The Service Connector extension is installed in the `sc-system` namespace through a Helm chart. Check the namespace and the Helm release with the following commands.
154154

155155
- Check the namespace exists.
156156

@@ -163,7 +163,7 @@ If there's an error during the extension installation, and the error message in
163163
```Bash
164164
helm list -n sc-system
165165
```
166-
1. During the extension installation or updating, a kubernetes job called `sc-job` creates the kubernetes resources for the service connection. The job execution failure usually causes the extension failure. Check the job status by running the following commands. If `sc-job` doesn't exist in `sc-system` namespace, it should have been executed successfully. This job is designed to be automatically deleted after successful execution.
166+
3. During extension installation or update, a Kubernetes job called `sc-job` creates the Kubernetes resources for the service connection. A failure of this job usually causes the extension to fail. Check the job status by running the following commands. The job is designed to be deleted automatically after it succeeds, so if `sc-job` doesn't exist in the `sc-system` namespace, it has most likely completed successfully.
167167

168168
- Check the job exists.
169169

@@ -183,6 +183,58 @@ If there's an error during the extension installation, and the error message in
183183
kubectl logs job/sc-job -n sc-system
184184
```
185185

186+
### Common Errors and Mitigations
187+
188+
#### 1. Conflict
189+
190+
**Error Message:**
191+
`Operation returned an invalid status code: Conflict`.
192+
193+
**Reason:**
194+
This error usually occurs when you attempt to create a service connection while the AKS (Azure Kubernetes Service) cluster is in an updating state. Service Connector's update to the cluster conflicts with the ongoing update.
195+
196+
**Mitigation:**
197+
Ensure your cluster is in a "Succeeded" state before retrying the creation. Doing so resolves most conflict-related errors.
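
One way to confirm the cluster state before retrying is to poll `az aks show` until `provisioningState` reports `Succeeded`. The sketch below is an illustration only: the function name `wait_for_cluster_succeeded`, the attempt count, and the delay are assumptions, not values from Service Connector.

```bash
# Hypothetical helper: poll the AKS cluster until its provisioning state is
# "Succeeded". Returns 0 on success, 1 if the state is not reached in time.
# $1 = resource group, $2 = cluster name,
# $3 = max attempts (assumed default 30), $4 = delay in seconds (assumed default 20)
wait_for_cluster_succeeded() {
  local resource_group="$1" cluster_name="$2"
  local max_attempts="${3:-30}" delay_seconds="${4:-20}"
  local attempt=1 state
  while [ "$attempt" -le "$max_attempts" ]; do
    state="$(az aks show \
      --resource-group "$resource_group" \
      --name "$cluster_name" \
      --query provisioningState \
      --output tsv)"
    if [ "$state" = "Succeeded" ]; then
      return 0
    fi
    echo "Cluster state is '$state' (attempt $attempt/$max_attempts)" >&2
    sleep "$delay_seconds"
    attempt=$((attempt + 1))
  done
  return 1
}
```

A call such as `wait_for_cluster_succeeded MyClusterResourceGroup MyCluster && echo "Safe to retry"` can then gate the retry. The loop does not treat a `Failed` state specially, so a failed cluster update still needs to be investigated separately.
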
198+
199+
#### 2. Timeout
200+
201+
**Error Message:**
202+
- `Long running operation failed with status 'Failed'. Unable to get a response from the Agent in time`.
203+
- `Timed out waiting for the resource to come to a ready/completed state`
204+
205+
**Reason:**
206+
This error often happens when the Kubernetes job used to create or update the Service Connector's cluster extension fails to be scheduled due to resource limitations or other issues.
207+
208+
**Mitigation:**
209+
Refer to [Check Kubernetes cluster logs](#check-kubernetes-cluster-logs) to identify and resolve the underlying cause. A common issue is that no nodes are available due to preemption. In this case, consider adding more nodes or enabling auto-scaling for your nodes.
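
To check whether scheduling is the cause, it can help to look at node availability, pending pods, and recent events in the `sc-system` namespace. The following is a rough sketch using standard `kubectl` commands; the function name `diagnose_sc_job_scheduling` is hypothetical.

```bash
# Hypothetical helper: collect scheduling hints for the Service Connector job.
# $1 = namespace (defaults to sc-system, the namespace used by the extension)
diagnose_sc_job_scheduling() {
  local namespace="${1:-sc-system}"
  echo "== Nodes =="
  kubectl get nodes
  echo "== Pending pods in $namespace =="
  kubectl get pods -n "$namespace" --field-selector=status.phase=Pending
  echo "== Recent events in $namespace =="
  kubectl get events -n "$namespace" --sort-by=.lastTimestamp
}
```

Events with the reason `FailedScheduling` typically indicate that no node could accept the pod, which matches the node-availability case described above.
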
210+
211+
#### 3. Unauthorized Resource Access
212+
213+
**Error Message:**
214+
`You do not have permission to perform ... If access was recently granted, please refresh your credentials`.
215+
216+
**Reason:**
217+
Service Connector needs permissions on the Azure resources you want to connect to so that it can perform connection operations on your behalf. This error indicates that the required permissions are missing on one or more of those resources.
218+
219+
**Mitigation:**
220+
Check the permissions on the Azure resources specified in the error message. Obtain the required permissions and retry the creation.
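
As one possible way to inspect the current permissions, `az role assignment list` can show the roles an identity holds on a specific resource. This is a sketch under assumptions: the function name `list_role_assignments_on_resource` is hypothetical, and the assignee and resource ID must be supplied by you, for example from the resource named in the error message.

```bash
# Hypothetical helper: list role assignments for an identity on one resource,
# including assignments inherited from parent scopes.
# $1 = assignee (sign-in name or object ID), $2 = full Azure resource ID
list_role_assignments_on_resource() {
  local assignee="$1" resource_id="$2"
  az role assignment list \
    --assignee "$assignee" \
    --scope "$resource_id" \
    --include-inherited \
    --output table
}
```

If the expected role is missing from the output, request it from the resource owner before retrying the creation.
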
221+
222+
#### Other Issues
223+
224+
If the above mitigations don't resolve your issue, try resetting the Service Connector cluster extension by removing it and then retrying the creation. This approach is expected to resolve most issues related to the Service Connector cluster extension.
225+
226+
Use the following CLI commands to reset the extension:
227+
228+
```azurecli
229+
az extension add --name k8s-extension
230+
231+
az k8s-extension delete \
232+
--resource-group <MyClusterResourceGroup> \
233+
--cluster-name <MyCluster> \
234+
--cluster-type managedClusters \
235+
--name sc-extension
236+
```
237+
186238
## Next steps
187239

188240
Learn how to integrate different target services and read about their configuration settings and authentication methods.
