articles/backup/azure-kubernetes-service-backup-troubleshoot.md
44 additions & 15 deletions
@@ -14,9 +14,14 @@ This article provides troubleshooting steps that help you resolve Azure Kubernet
14
14
15
15
## AKS Backup Extension installation error resolutions
16
16
17
-
### Error 1
17
+
### Scenario 1
18
+
19
+
**Error message**:
20
+
21
+
```Error
22
+
{Helm installation from path [] for release [azure-aks-backup] failed with the following error: err [release azure-aks-backup failed, and has been uninstalled due to atomic being set: failed post-install: timed out waiting for the condition]} occurred while doing the operation: {Installing the extension} on the config"
23
+
```
18
24
19
-
**Error message**: `{Helm installation from path [] for release [azure-aks-backup] failed with the following error: err [release azure-aks-backup failed, and has been uninstalled due to atomic being set: failed post-install: timed out waiting for the condition]} occurred while doing the operation: {Installing the extension} on the config"`
20
25
21
26
**Cause**: The extension has been installed successfully, but the pods aren't spawning. This happens because the required compute and memory aren't available for the pods.
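To confirm this cause before scaling, you can look for extension pods stuck in the `Pending` phase. The following is a minimal sketch, not an official procedure: the helper name `list_pending_backup_pods` is hypothetical, and the default namespace `dataprotection-microsoft` is an assumption about where the extension pods run, so pass a different namespace if your cluster differs.

```shell
# List pods that are stuck in Pending in the extension namespace.
# The default namespace is an assumption; pass another one as the first argument.
list_pending_backup_pods() {
  namespace="${1:-dataprotection-microsoft}"
  kubectl get pods --namespace "$namespace" --field-selector=status.phase=Pending
}
```

Call `list_pending_backup_pods`, then run `kubectl describe pod` on any pod it lists. Scheduling events that mention insufficient CPU or memory point to the capacity shortage described above.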
22
27
@@ -28,11 +33,15 @@ To scale node pool on Azure portal, follow these steps:
28
33
1. Select **Scale node pool**, and then update the *minimum* and *maximum* values on the **Node count range**.
29
34
1. Select **Apply**.
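The portal steps above have a rough Azure CLI counterpart. This sketch assumes the cluster autoscaler is already enabled on the node pool (otherwise `--enable-cluster-autoscaler` would be needed instead of `--update-cluster-autoscaler`). The function name `update_nodepool_range` and its argument order are hypothetical.

```shell
# Update the autoscaler range of one node pool (CLI counterpart of the portal steps).
# Arguments: resource group, cluster name, node pool name, minimum count, maximum count.
# Assumes the cluster autoscaler is already enabled on the node pool.
update_nodepool_range() {
  az aks nodepool update \
    --resource-group "$1" \
    --cluster-name "$2" \
    --name "$3" \
    --update-cluster-autoscaler \
    --min-count "$4" \
    --max-count "$5"
}
```

For example, `update_nodepool_range aksclusterresourcegroup aksclustername nodepool1 2 5` would set the range to 2 to 5 nodes. The node pool name `nodepool1` is a placeholder.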
30
35
31
-
### Error 2
36
+
### Scenario 2
32
37
33
-
Error message: `BackupStorageLocation "default" is unavailable: rpc error: code = Unknown desc = azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/e30af180-aa96-4d81-981a-b67570b0d615/resourceGroups/AzureBackupRG_westeurope_1/providers/Microsoft.Storage/storageAccounts/devhayyabackup/listKeys?%24expand=kerb&api-version=2019-06-01: StatusCode=404 -- Original Error: adal: Refresh request failed. Status Code = '404'. Response body: no azure identity found for request clientID 4e95##### REDACTED #####0777`
BackupStorageLocation "default" is unavailable: rpc error: code = Unknown desc = azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/e30af180-aa96-4d81-981a-b67570b0d615/resourceGroups/AzureBackupRG_westeurope_1/providers/Microsoft.Storage/storageAccounts/devhayyabackup/listKeys?%24expand=kerb&api-version=2019-06-01: StatusCode=404 -- Original Error: adal: Refresh request failed. Status Code = '404'. Response body: no azure identity found for request clientID 4e95##### REDACTED #####0777`
**Cause**: When you enable pod-managed identity on your AKS cluster, an *AzurePodIdentityException* named *aks-addon-exception* is added to the *kube-system* namespace. An *AzurePodIdentityException* allows pods with certain labels to access the Azure Instance Metadata Service (IMDS) endpoint without being intercepted by the NMI server.
38
47
@@ -42,20 +51,30 @@ The extension pods aren't exempt, and require the Azure Active Directory (Azure
az role assignment create --assignee-object-id $(az k8s-extension show --name azure-aks-backup --cluster-name aksclustername --resource-group aksclusterresourcegroup --cluster-type managedClusters --query aksAssignedIdentity.principalId --output tsv) --role 'Storage Account Contributor' --scope /subscriptions/subscriptionid/resourceGroups/storageaccountresourcegroup/providers/Microsoft.Storage/storageAccounts/storageaccountname
69
+
```
54
70
55
-
### Error 3
71
+
### Scenario 3
56
72
57
-
**Error message**:`{"Message":"Error in the getting the Configurations: error {Post \https://centralus.dp.kubernetesconfiguration.azure.com/subscriptions/ subscriptionid /resourceGroups/ aksclusterresourcegroup /provider/managedclusters/clusters/ aksclustername /configurations/getPendingConfigs?api-version=2021-11-01\: dial tcp: lookup centralus.dp.kubernetesconfiguration.azure.com on 10.63.136.10:53: no such host}","LogType":"ConfigAgentTrace","LogLevel":"Error","Environment":"prod","Role":"ClusterConfigAgent","Location":"centralus","ArmId":"/subscriptions/ subscriptionid /resourceGroups/ aksclusterresourcegroup /providers/Microsoft.ContainerService/managedclusters/ aksclustername ","CorrelationId":"","AgentName":"ConfigAgent","AgentVersion":"1.8.14","AgentTimestamp":"2023/01/19 20:24:16"}`
73
+
**Error message**:
58
74
75
+
```Error
76
+
{"Message":"Error in the getting the Configurations: error {Post \https://centralus.dp.kubernetesconfiguration.azure.com/subscriptions/ subscriptionid /resourceGroups/ aksclusterresourcegroup /provider/managedclusters/clusters/ aksclustername /configurations/getPendingConfigs?api-version=2021-11-01\: dial tcp: lookup centralus.dp.kubernetesconfiguration.azure.com on 10.63.136.10:53: no such host}","LogType":"ConfigAgentTrace","LogLevel":"Error","Environment":"prod","Role":"ClusterConfigAgent","Location":"centralus","ArmId":"/subscriptions/ subscriptionid /resourceGroups/ aksclusterresourcegroup /providers/Microsoft.ContainerService/managedclusters/ aksclustername ","CorrelationId":"","AgentName":"ConfigAgent","AgentVersion":"1.8.14","AgentTimestamp":"2023/01/19 20:24:16"}
77
+
```
59
78
**Cause**: Specific FQDN/application rules are required to use cluster extensions in the AKS clusters. [Learn more](/azure/aks/limit-egress-traffic#cluster-extensions).
60
79
61
80
This error appears when these FQDN rules are missing, which leaves the configuration information from the Cluster Extensions service unavailable.
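A quick way to reproduce the `no such host` symptom is to resolve the regional data plane endpoint yourself. The sketch below builds the host name from a region, following the `centralus.dp.kubernetesconfiguration.azure.com` pattern in the error above. The helper names `dp_endpoint` and `check_dp_dns` are hypothetical, and the check is only meaningful when run from a machine or pod that uses the same DNS servers as the cluster.

```shell
# Build the regional data plane host name, e.g. for "centralus".
dp_endpoint() {
  printf '%s.dp.kubernetesconfiguration.azure.com\n' "$1"
}

# Report whether the endpoint resolves with the DNS servers of the current host.
check_dp_dns() {
  host="$(dp_endpoint "$1")"
  if nslookup "$host" >/dev/null 2>&1; then
    echo "resolved: $host"
  else
    echo "NOT resolved: $host"
    return 1
  fi
}
```

Running `check_dp_dns centralus` returns a non-zero status when the lookup fails, so it can gate later steps in a script.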
@@ -64,23 +83,33 @@ This error appears due to absence of these FQDN rules because of which configura
64
83
65
84
1. To fetch the existing *coredns-custom* YAML from your cluster (save it locally for later reference), run the following command:
66
85
67
-
`kubectl get configmap coredns-custom -n kube-system -o yaml`
86
+
```azurepowershell-interactive
87
+
kubectl get configmap coredns-custom -n kube-system -o yaml
88
+
```
68
89
69
90
2. To override the mapping of the *Central US DP* endpoint to the public IP (download the attached YAML file), run the following command:
70
91
71
-
`kubectl apply -f corednsms.yaml`
92
+
```azurepowershell-interactive
93
+
kubectl apply -f corednsms.yaml
94
+
```
72
95
73
96
3. To force reload `coredns` pods, run the following command:
74
97
75
-
`kubectl delete pod --namespace kube-system -l k8s-app=kube-dns`
98
+
```azurepowershell-interactive
99
+
kubectl delete pod --namespace kube-system -l k8s-app=kube-dns
100
+
```
76
101
77
102
4. To check whether *coredns-custom* is working, perform an `nslookup` from the *ExtensionAgent* pod by running the following command:
78
103
79
-
`kubectl exec -i -t pod/extension-agent-<pod guid that's there in your cluster> -n kube-system -- nslookup centralus.dp.kubernetesconfiguration.azure.com`
104
+
```azurepowershell-interactive
105
+
kubectl exec -i -t pod/extension-agent-<pod guid that's there in your cluster> -n kube-system -- nslookup centralus.dp.kubernetesconfiguration.azure.com
106
+
```
80
107
81
108
5. To check logs of the *ExtensionAgent* pod, run the following command:
82
109
83
-
`kubectl logs pod/extension-agent-<pod guid that’s there in your cluster> -n kube-system --tail=200`
110
+
```azurepowershell-interactive
111
+
kubectl logs pod/extension-agent-<pod guid that's there in your cluster> -n kube-system --tail=200
112
+
```
84
113
85
114
6. Delete and reinstall the Backup Extension to initiate backup.
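Steps 4 and 5 require substituting the pod GUID by hand. As a convenience, the lookup can be scripted. This is a sketch under assumptions: the function names `find_extension_agent_pod` and `verify_extension_agent` are hypothetical, and it relies on the pod name starting with `extension-agent-` in the `kube-system` namespace, as in the commands above.

```shell
# Print the name of the first extension-agent pod in kube-system
# (in the form pod/extension-agent-<guid>).
find_extension_agent_pod() {
  kubectl get pods --namespace kube-system --output name \
    | grep '^pod/extension-agent-' \
    | head -n 1
}

# Run the DNS lookup from that pod, then print its most recent log lines.
# Argument: the data plane host name to resolve.
verify_extension_agent() {
  pod="$(find_extension_agent_pod)"
  if [ -z "$pod" ]; then
    echo "no extension-agent pod found in kube-system" >&2
    return 1
  fi
  kubectl exec "$pod" --namespace kube-system -- nslookup "$1"
  kubectl logs "$pod" --namespace kube-system --tail=200
}
```

For example, `verify_extension_agent centralus.dp.kubernetesconfiguration.azure.com` runs both checks against whichever extension agent pod is listed first.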