articles/backup/azure-kubernetes-service-backup-troubleshoot.md
Lines changed: 68 additions & 5 deletions
@@ -14,12 +14,75 @@ This article provides troubleshooting steps that help you resolve Azure Kubernet
14
14
15
15
## AKS Backup Extension installation error resolutions
16
16
17
-
| Error message | Cause | Resolution |
18
-
| --- | --- | --- |
19
-
| `{Helm installation from path [] for release [azure-aks-backup] failed with the following error: err [release azure-aks-backup failed, and has been uninstalled due to atomic being set: failed post-install: timed out waiting for the condition]} occurred while doing the operation: {Installing the extension} on the config"` | The extension has been installed successfully, but the pods aren't spawning. This happens because the required compute and memory aren't available for the pods. | To resolve the issue, increase the number of nodes in the cluster. This allows sufficient compute and memory to be available for the pods to spawn. <br><br> To scale node pool on Azure portal, follow these steps: <br><br> 1. On the Azure portal, open the *AKS cluster*. <br><br> 2. Go to **Node pools** under **Settings**. <br><br> 3. Select **Scale node pool**, and then update the *minimum* and *maximum* values on the **Node count range**. <br><br> 4. Select **Apply**. |
20
-
| `BackupStorageLocation "default" is unavailable: rpc error: code = Unknown desc = azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/e30af180-aa96-4d81-981a-b67570b0d615/resourceGroups/AzureBackupRG_westeurope_1/providers/Microsoft.Storage/storageAccounts/devhayyabackup/listKeys?%24expand=kerb&api-version=2019-06-01: StatusCode=404 -- Original Error: adal: Refresh request failed. Status Code = '404'. Response body: no azure identity found for request clientID 4e95##### REDACTED #####0777` <br> `Endpoint http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&client_id=4e95dcc5-a769-4745-b2d9-` | When you enable pod-managed identity on your AKS cluster, an *AzurePodIdentityException* named *aks-addon-exception* is added to the *kube-system* namespace. An *AzurePodIdentityException* allows pods with certain labels to access the Azure Instance Metadata Service (IMDS) endpoint without being intercepted by the NMI server. <br><br> The extension pods aren't exempt, and require the Azure Active Directory (Azure AD) pod identity to be enabled manually. | Create *pod-identity* exception in AKS cluster (that works only for *dataprotection-microsoft* namespace and for *not kube-system*). [Learn more](/cli/azure/aks/pod-identity/exception?view=azure-cli-latest&preserve-view=true#az-aks-pod-identity-exception-add). 
<br><br> **Step 1**: Run the following command: <br> `az aks pod-identity exception add --resource-group shracrg --cluster-name shractestcluster --namespace dataprotection-microsoft --pod-labels app.kubernetes.io/name=dataprotection-microsoft-kubernetes` <br><br> **Step 2**: To verify *Azurepodidentityexceptions* in cluster, run the following command: <br> `kubectl get Azurepodidentityexceptions --all-namespaces` <br><br> **Step 3**: To assign the *Storage Account Contributor* role to the extension identity, run the following command: <br> `az role assignment create --assignee-object-id $(az k8s-extension show --name azure-aks-backup --cluster-name aksclustername --resource-group aksclusterresourcegroup --cluster-type managedClusters --query aksAssignedIdentity.principalId --output tsv) --role 'Storage Account Contributor' --scope /subscriptions/subscriptionid/resourceGroups/storageaccountresourcegroup/providers/Microsoft.Storage/storageAccounts/storageaccountname` |
21
-
| `{"Message":"Error in the getting the Configurations: error {Post \https://centralus.dp.kubernetesconfiguration.azure.com/subscriptions/ subscriptionid /resourceGroups/ aksclusterresourcegroup /provider/managedclusters/clusters/ aksclustername /configurations/getPendingConfigs?api-version=2021-11-01\: dial tcp: lookup centralus.dp.kubernetesconfiguration.azure.com on 10.63.136.10:53: no such host}","LogType":"ConfigAgentTrace","LogLevel":"Error","Environment":"prod","Role":"ClusterConfigAgent","Location":"centralus","ArmId":"/subscriptions/ subscriptionid /resourceGroups/ aksclusterresourcegroup /providers/Microsoft.ContainerService/managedclusters/ aksclustername ","CorrelationId":"","AgentName":"ConfigAgent","AgentVersion":"1.8.14","AgentTimestamp":"2023/01/19 20:24:16"}` | Specific FQDN/application rules are required to use cluster extensions in the AKS clusters. [Learn more](/azure/aks/limit-egress-traffic#cluster-extensions). <br><br> This error appears due to absence of these FQDN rules because of which configuration information from the Cluster Extensions service wasn't available. | To resolve the issue, you need to create a *CoreDNS-custom override* for the *DP* endpoint to pass through the public network. 
<br><br> **Step 1**: To fetch *Existing CoreDNS-custom* YAML in your cluster (save it on your local for reference later), run the following command: <br> `kubectl get configmap coredns-custom -n kube-system -o yaml` <br><br> **Step 2**: To override mapping for *Central US DP* endpoint to public IP (download the YAML file attached), run the following command: <br> `kubectl apply -f corednsms.yaml` <br><br> **Step 3**: To force reload `coredns` pods, run the following command: <br> `kubectl delete pod --namespace kube-system -l k8s-app=kube-dns` <br><br> **Step 4**: To perform `NSlookup` from the *ExtensionAgent* pod to check if *coreDNS-custom* is working, run the following command: <br> `kubectl exec -i -t pod/extension-agent-<pod guid that's there in your cluster> -n kube-system -- nslookup centralus.dp.kubernetesconfiguration.azure.com` <br><br> **Step 5**: To check logs of the *ExtensionAgent* pod, run the following command: <br> `kubectl logs pod/extension-agent-<pod guid that’s there in your cluster> -n kube-system --tail=200` <br><br> **Step 6**: Delete and reinstall Backup Extension to initiate backup. |
17
+
### Error 1
22
18
19
+
**Error message**: `{Helm installation from path [] for release [azure-aks-backup] failed with the following error: err [release azure-aks-backup failed, and has been uninstalled due to atomic being set: failed post-install: timed out waiting for the condition]} occurred while doing the operation: {Installing the extension} on the config"`
20
+
21
+
**Cause**: The extension was deployed, but its pods didn't start before the post-install timeout. This happens when the cluster nodes don't have enough compute and memory available to schedule the pods.
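
To confirm that scheduling is the problem, it can help to look for pods stuck in the `Pending` phase while the installation is still in progress (after the timeout, the release is uninstalled and the pods are gone). The following is a minimal sketch, not an official procedure. It assumes that the extension pods run in the *dataprotection-microsoft* namespace mentioned later in this article. The `run` helper and the `DRY_RUN` variable are hypothetical conveniences: by default the commands are only printed so that you can review them first.

```shell
#!/usr/bin/env bash
# Assumption: the extension pods run in the dataprotection-microsoft namespace.
NAMESPACE="${NAMESPACE:-dataprotection-microsoft}"

# Hypothetical helper: print the command by default, execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

check_pending_pods() {
  # Pods that the scheduler can't place stay in the Pending phase.
  run kubectl get pods --namespace "$NAMESPACE" --field-selector=status.phase=Pending
  # FailedScheduling events usually state the reason, for example insufficient CPU or memory.
  run kubectl get events --namespace "$NAMESPACE" --field-selector=reason=FailedScheduling
}

check_pending_pods
```

Run the script with `DRY_RUN=0` to execute the commands against the current `kubectl` context.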
22
+
23
+
**Resolution**: To resolve the issue, increase the number of nodes in the cluster so that enough compute and memory are available for the pods to start.
24
+
To scale the node pool in the Azure portal, follow these steps:
25
+
26
+
1. In the Azure portal, open the *AKS cluster*.
27
+
1. Go to **Node pools** under **Settings**.
28
+
1. Select **Scale node pool**, and then update the *minimum* and *maximum* values in **Node count range**.
29
+
1. Select **Apply**.
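
The same change can also be made from the Azure CLI. The following sketch only prints the commands so that you can review them before you run them. The resource group, cluster, and node pool names are hypothetical placeholders, and the first command assumes that the cluster autoscaler is already enabled on the node pool.

```shell
#!/usr/bin/env bash
# Hypothetical placeholder values; replace them with your own.
RESOURCE_GROUP="${RESOURCE_GROUP:-aksclusterresourcegroup}"
CLUSTER_NAME="${CLUSTER_NAME:-aksclustername}"
NODE_POOL="${NODE_POOL:-nodepool1}"

# Print the command that changes the autoscaler range (minimum and maximum node count).
# Assumes the cluster autoscaler is already enabled on this node pool.
print_autoscale_update() {
  local min_count="$1" max_count="$2"
  echo "az aks nodepool update --resource-group $RESOURCE_GROUP --cluster-name $CLUSTER_NAME --name $NODE_POOL --update-cluster-autoscaler --min-count $min_count --max-count $max_count"
}

# Print the command that sets a fixed node count on a manually scaled node pool.
print_manual_scale() {
  local node_count="$1"
  echo "az aks nodepool scale --resource-group $RESOURCE_GROUP --cluster-name $CLUSTER_NAME --name $NODE_POOL --node-count $node_count"
}

print_autoscale_update 2 5
print_manual_scale 3
```

Choose node counts that leave enough free CPU and memory for the extension pods; the values `2`, `5`, and `3` are examples only.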
30
+
31
+
### Error 2
32
+
33
+
**Error message**: `BackupStorageLocation "default" is unavailable: rpc error: code = Unknown desc = azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/e30af180-aa96-4d81-981a-b67570b0d615/resourceGroups/AzureBackupRG_westeurope_1/providers/Microsoft.Storage/storageAccounts/devhayyabackup/listKeys?%24expand=kerb&api-version=2019-06-01: StatusCode=404 -- Original Error: adal: Refresh request failed. Status Code = '404'. Response body: no azure identity found for request clientID 4e95##### REDACTED #####0777`

**Cause**: When you enable pod-managed identity on your AKS cluster, an *AzurePodIdentityException* named *aks-addon-exception* is added to the *kube-system* namespace. An *AzurePodIdentityException* allows pods with certain labels to access the Azure Instance Metadata Service (IMDS) endpoint without being intercepted by the NMI server.
38
+
39
+
The extension pods aren't exempt by default, so the Azure Active Directory (Azure AD) pod identity exception must be enabled for them manually.
40
+
41
+
**Resolution**: Create a *pod-identity* exception in the AKS cluster that applies only to the *dataprotection-microsoft* namespace (not to *kube-system*). [Learn more](/cli/azure/aks/pod-identity/exception?view=azure-cli-latest&preserve-view=true#az-aks-pod-identity-exception-add).
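
The removed table row for this error lists three commands for this resolution: add the exception, verify it, and assign the *Storage Account Contributor* role to the extension identity. The following sketch collects them in one place. It prints the commands instead of running them, and the resource group, cluster, and storage account values are the placeholders used in that row, not real names.

```shell
#!/usr/bin/env bash
# Placeholder values taken from the article's examples; replace them with your own.
RESOURCE_GROUP="${RESOURCE_GROUP:-aksclusterresourcegroup}"
CLUSTER_NAME="${CLUSTER_NAME:-aksclustername}"
STORAGE_SCOPE="${STORAGE_SCOPE:-/subscriptions/subscriptionid/resourceGroups/storageaccountresourcegroup/providers/Microsoft.Storage/storageAccounts/storageaccountname}"

# Print the three commands for review; nothing is executed here.
print_pod_identity_fix() {
  cat <<EOF
# Step 1: add a pod identity exception for the extension namespace.
az aks pod-identity exception add --resource-group $RESOURCE_GROUP --cluster-name $CLUSTER_NAME --namespace dataprotection-microsoft --pod-labels app.kubernetes.io/name=dataprotection-microsoft-kubernetes

# Step 2: verify that the exception exists.
kubectl get Azurepodidentityexceptions --all-namespaces

# Step 3: assign the Storage Account Contributor role to the extension identity.
az role assignment create --assignee-object-id \$(az k8s-extension show --name azure-aks-backup --cluster-name $CLUSTER_NAME --resource-group $RESOURCE_GROUP --cluster-type managedClusters --query aksAssignedIdentity.principalId --output tsv) --role 'Storage Account Contributor' --scope $STORAGE_SCOPE
EOF
}

print_pod_identity_fix
```

The printed output is itself a small script: the `$(...)` in step 3 is left unexpanded so that the principal ID is looked up only when you run the printed command.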
### Error 3

**Error message**: `{"Message":"Error in the getting the Configurations: error {Post \https://centralus.dp.kubernetesconfiguration.azure.com/subscriptions/ subscriptionid /resourceGroups/ aksclusterresourcegroup /provider/managedclusters/clusters/ aksclustername /configurations/getPendingConfigs?api-version=2021-11-01\: dial tcp: lookup centralus.dp.kubernetesconfiguration.azure.com on 10.63.136.10:53: no such host}","LogType":"ConfigAgentTrace","LogLevel":"Error","Environment":"prod","Role":"ClusterConfigAgent","Location":"centralus","ArmId":"/subscriptions/ subscriptionid /resourceGroups/ aksclusterresourcegroup /providers/Microsoft.ContainerService/managedclusters/ aksclustername ","CorrelationId":"","AgentName":"ConfigAgent","AgentVersion":"1.8.14","AgentTimestamp":"2023/01/19 20:24:16"}`
58
+
59
+
**Cause**: Specific FQDN/application rules are required to use cluster extensions in AKS clusters. [Learn more](/azure/aks/limit-egress-traffic#cluster-extensions).
60
+
61
+
This error appears when these FQDN rules are missing, so the cluster can't retrieve configuration information from the Cluster Extensions service.
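
To see which endpoint failed to resolve, you can pull the host name out of the log message. The following sketch relies only on the `lookup <host> on <dns-server>: no such host` phrase that appears in the error above; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Read log text on stdin and print the host name that could not be resolved.
extract_unresolved_host() {
  sed -n 's/.*lookup \([^ ]*\) on [^ ]*: no such host.*/\1/p'
}

# Sample fragment copied from the error message in this section.
message='dial tcp: lookup centralus.dp.kubernetesconfiguration.azure.com on 10.63.136.10:53: no such host'
printf '%s\n' "$message" | extract_unresolved_host
```

The extracted host name is the one to allow in your FQDN rules or to override in CoreDNS.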
62
+
63
+
**Resolution**: To resolve the issue, create a *CoreDNS-custom override* so that requests to the *DP* endpoint pass through the public network.
64
+
65
+
1. To fetch the existing *CoreDNS-custom* YAML from your cluster (save it locally for later reference), run the following command:
66
+
67
+
`kubectl get configmap coredns-custom -n kube-system -o yaml`
68
+
69
+
2. To override the mapping of the *Central US DP* endpoint to its public IP (download the attached YAML file first), run the following command:
70
+
71
+
`kubectl apply -f corednsms.yaml`
72
+
73
+
3. To force the `coredns` pods to reload, run the following command:
74
+
75
+
`kubectl delete pod --namespace kube-system -l k8s-app=kube-dns`
76
+
77
+
4. To run `nslookup` from the *ExtensionAgent* pod and check whether the *CoreDNS-custom* override is working, run the following command:
78
+
79
+
`kubectl exec -i -t pod/extension-agent-<pod guid that's there in your cluster> -n kube-system -- nslookup centralus.dp.kubernetesconfiguration.azure.com`
80
+
81
+
5. To check the logs of the *ExtensionAgent* pod, run the following command:
82
+
83
+
`kubectl logs pod/extension-agent-<pod guid that's there in your cluster> -n kube-system --tail=200`
84
+
85
+
6. Delete and reinstall the Backup Extension to initiate backup.
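
The attached `corednsms.yaml` file isn't reproduced in this article. As a rough illustration only, an override of this kind could look like the output of the following sketch, which assumes the AKS `coredns-custom` ConfigMap convention of `*.override` keys and the CoreDNS `hosts` plugin. The key name `dp-endpoint.override`, the function name, and the IP address `203.0.113.10` (a documentation-only address) are hypothetical; use the file and the IP address provided for your region.

```shell
#!/usr/bin/env bash
# Render a coredns-custom ConfigMap that maps one host name to one IP address.
# Assumption: this shape is illustrative and is not the attached corednsms.yaml.
render_coredns_override() {
  local fqdn="$1" ip="$2"
  cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-custom
  namespace: kube-system
data:
  dp-endpoint.override: |
    hosts {
      $ip $fqdn
      fallthrough
    }
EOF
}

# Print the manifest for review; redirect it to a file before using kubectl apply.
render_coredns_override "centralus.dp.kubernetesconfiguration.azure.com" "203.0.113.10"
```

Because applying a ConfigMap with the same name can change existing custom DNS settings, compare the rendered manifest with the YAML saved in step 1 before you apply it.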