Commit 0d68cc3

author
AbhishekMallick01
committed
more updates
1 parent bfe5d2e commit 0d68cc3

File tree

1 file changed

+68
-5
lines changed

articles/backup/azure-kubernetes-service-backup-troubleshoot.md

Lines changed: 68 additions & 5 deletions
@@ -14,12 +14,75 @@ This article provides troubleshooting steps that help you resolve Azure Kubernet
1414

1515
## AKS Backup Extension installation error resolutions
1616

17-
| Error message | Cause | Resolution |
18-
| --- | --- | --- |
19-
| `{Helm installation from path [] for release [azure-aks-backup] failed with the following error: err [release azure-aks-backup failed, and has been uninstalled due to atomic being set: failed post-install: timed out waiting for the condition]} occurred while doing the operation: {Installing the extension} on the config"` | The extension has been installed successfully, but the pods aren't spawning. This happens because the required compute and memory aren't available for the pods. | To resolve the issue, increase the number of nodes in the cluster. This allows sufficient compute and memory to be available for the pods to spawn. <br><br> To scale node pool on Azure portal, follow these steps: <br><br> 1. On the Azure portal, open the *AKS cluster*. <br><br> 2. Go to **Node pools** under **Settings**. <br><br> 3. Select **Scale node pool**, and then update the *minimum* and *maximum* values on the **Node count range**. <br><br> 4. Select **Apply**. |
20-
| `BackupStorageLocation "default" is unavailable: rpc error: code = Unknown desc = azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/e30af180-aa96-4d81-981a-b67570b0d615/resourceGroups/AzureBackupRG_westeurope_1/providers/Microsoft.Storage/storageAccounts/devhayyabackup/listKeys?%24expand=kerb&api-version=2019-06-01: StatusCode=404 -- Original Error: adal: Refresh request failed. Status Code = '404'. Response body: no azure identity found for request clientID 4e95##### REDACTED #####0777` <br> `Endpoint http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&client_id=4e95dcc5-a769-4745-b2d9-` | When you enable pod-managed identity on your AKS cluster, an *AzurePodIdentityException* named *aks-addon-exception* is added to the *kube-system* namespace. An *AzurePodIdentityException* allows pods with certain labels to access the Azure Instance Metadata Service (IMDS) endpoint without being intercepted by the NMI server. <br><br> The extension pods aren't exempt, and require the Azure Active Directory (Azure AD) pod identity to be enabled manually. | Create *pod-identity* exception in AKS cluster (that works only for *dataprotection-microsoft* namespace and for *not kube-system*). [Learn more](/cli/azure/aks/pod-identity/exception?view=azure-cli-latest&preserve-view=true#az-aks-pod-identity-exception-add). 
<br><br> **Step 1**: Run the following command: <br> `az aks pod-identity exception add --resource-group shracrg --cluster-name shractestcluster --namespace dataprotection-microsoft --pod-labels app.kubernetes.io/name=dataprotection-microsoft-kubernetes` <br><br> **Step 2**: To verify *Azurepodidentityexceptions* in cluster, run the following command: <br> `kubectl get Azurepodidentityexceptions --all-namespaces` <br><br> **Step 3**: To assign the *Storage Account Contributor* role to the extension identity, run the following command: <br> `az role assignment create --assignee-object-id $(az k8s-extension show --name azure-aks-backup --cluster-name aksclustername --resource-group aksclusterresourcegroup --cluster-type managedClusters --query aksAssignedIdentity.principalId --output tsv) --role 'Storage Account Contributor' --scope /subscriptions/subscriptionid/resourceGroups/storageaccountresourcegroup/providers/Microsoft.Storage/storageAccounts/storageaccountname` |
21-
| `{"Message":"Error in the getting the Configurations: error {Post \https://centralus.dp.kubernetesconfiguration.azure.com/subscriptions/ subscriptionid /resourceGroups/ aksclusterresourcegroup /provider/managedclusters/clusters/ aksclustername /configurations/getPendingConfigs?api-version=2021-11-01\: dial tcp: lookup centralus.dp.kubernetesconfiguration.azure.com on 10.63.136.10:53: no such host}","LogType":"ConfigAgentTrace","LogLevel":"Error","Environment":"prod","Role":"ClusterConfigAgent","Location":"centralus","ArmId":"/subscriptions/ subscriptionid /resourceGroups/ aksclusterresourcegroup /providers/Microsoft.ContainerService/managedclusters/ aksclustername ","CorrelationId":"","AgentName":"ConfigAgent","AgentVersion":"1.8.14","AgentTimestamp":"2023/01/19 20:24:16"}` | Specific FQDN/application rules are required to use cluster extensions in the AKS clusters. [Learn more](/azure/aks/limit-egress-traffic#cluster-extensions). <br><br> This error appears due to absence of these FQDN rules because of which configuration information from the Cluster Extensions service wasn't available. | To resolve the issue, you need to create a *CoreDNS-custom override* for the *DP* endpoint to pass through the public network. 
<br><br> **Step 1**: To fetch *Existing CoreDNS-custom* YAML in your cluster (save it on your local for reference later), run the following command: <br> `kubectl get configmap coredns-custom -n kube-system -o yaml` <br><br> **Step 2**: To override mapping for *Central US DP* endpoint to public IP (download the YAML file attached), run the following command: <br> `kubectl apply -f corednsms.yaml` <br><br> **Step 3**: To force reload `coredns` pods, run the following command: <br> `kubectl delete pod --namespace kube-system -l k8s-app=kube-dns` <br><br> **Step 4**: To perform `NSlookup` from the *ExtensionAgent* pod to check if *coreDNS-custom* is working, run the following command: <br> `kubectl exec -i -t pod/extension-agent-<pod guid that's there in your cluster> -n kube-system -- nslookup centralus.dp.kubernetesconfiguration.azure.com` <br><br> **Step 5**: To check logs of the *ExtensionAgent* pod, run the following command: <br> `kubectl logs pod/extension-agent-<pod guid that’s there in your cluster> -n kube-system --tail=200` <br><br> **Step 6**: Delete and reinstall Backup Extension to initiate backup. |
17+
### Error 1
2218

19+
**Error message**: `{Helm installation from path [] for release [azure-aks-backup] failed with the following error: err [release azure-aks-backup failed, and has been uninstalled due to atomic being set: failed post-install: timed out waiting for the condition]} occurred while doing the operation: {Installing the extension} on the config"`
20+
21+
**Cause**: The extension is installed, but its pods don't start because the cluster doesn't have enough compute and memory available for them.
22+
23+
**Resolution**: Increase the number of nodes in the cluster so that enough compute and memory is available for the pods to start.
24+
To scale the node pool in the Azure portal, follow these steps:
25+
26+
1. In the Azure portal, open the *AKS cluster*.
27+
1. Go to **Node pools** under **Settings**.
28+
1. Select **Scale node pool**, and then update the *minimum* and *maximum* values in the **Node count range**.
29+
1. Select **Apply**.
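
The portal steps above can also be approximated with the Azure CLI. The following is a minimal sketch, not part of the original article: it assumes the cluster autoscaler is already enabled on the node pool, and the resource group, cluster, and node pool names are placeholders. The `run` helper and the `DRY_RUN` variable are hypothetical conveniences that print each command instead of running it unless you set `DRY_RUN=0`.

```shell
# Placeholder values (assumptions); replace them with your own names.
RESOURCE_GROUP="${RESOURCE_GROUP:-aksclusterresourcegroup}"
CLUSTER_NAME="${CLUSTER_NAME:-aksclustername}"
NODEPOOL_NAME="${NODEPOOL_NAME:-nodepool1}"

# Print the command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Update the autoscaler's minimum and maximum node count for one node pool.
scale_node_pool() {
  min_count="$1"
  max_count="$2"
  run az aks nodepool update \
    --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" \
    --name "$NODEPOOL_NAME" \
    --update-cluster-autoscaler \
    --min-count "$min_count" \
    --max-count "$max_count"
}

# List the extension pods; pods that stay in Pending can indicate
# that the cluster still lacks compute or memory.
show_backup_pods() {
  run kubectl get pods --namespace dataprotection-microsoft
}
```

For example, `scale_node_pool 2 5` prints the command that would set the range to 2 to 5 nodes. If the autoscaler isn't enabled on the node pool, `az aks nodepool scale` with `--node-count` is the manual alternative.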
30+
31+
### Error 2
32+
33+
**Error message**: `BackupStorageLocation "default" is unavailable: rpc error: code = Unknown desc = azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/e30af180-aa96-4d81-981a-b67570b0d615/resourceGroups/AzureBackupRG_westeurope_1/providers/Microsoft.Storage/storageAccounts/devhayyabackup/listKeys?%24expand=kerb&api-version=2019-06-01: StatusCode=404 -- Original Error: adal: Refresh request failed. Status Code = '404'. Response body: no azure identity found for request clientID 4e95##### REDACTED #####0777`
34+
35+
`Endpoint http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&client_id=4e95dcc5-a769-4745-b2d9-`
36+
37+
**Cause**: When you enable pod-managed identity on your AKS cluster, an *AzurePodIdentityException* named *aks-addon-exception* is added to the *kube-system* namespace. An *AzurePodIdentityException* allows pods with certain labels to access the Azure Instance Metadata Service (IMDS) endpoint without being intercepted by the NMI server.
38+
39+
The extension pods aren't exempt, and require the Azure Active Directory (Azure AD) pod identity to be enabled manually.
40+
41+
**Resolution**: Create a *pod-identity* exception in the AKS cluster that applies only to the *dataprotection-microsoft* namespace, not to *kube-system*. [Learn more](/cli/azure/aks/pod-identity/exception?view=azure-cli-latest&preserve-view=true#az-aks-pod-identity-exception-add).
42+
43+
1. Run the following command:
44+
45+
`az aks pod-identity exception add --resource-group shracrg --cluster-name shractestcluster --namespace dataprotection-microsoft --pod-labels app.kubernetes.io/name=dataprotection-microsoft-kubernetes`
46+
47+
2. To verify the *Azurepodidentityexceptions* in the cluster, run the following command:
48+
49+
`kubectl get Azurepodidentityexceptions --all-namespaces`
50+
51+
3. To assign the *Storage Account Contributor* role to the extension identity, run the following command:
52+
53+
`az role assignment create --assignee-object-id $(az k8s-extension show --name azure-aks-backup --cluster-name aksclustername --resource-group aksclusterresourcegroup --cluster-type managedClusters --query aksAssignedIdentity.principalId --output tsv) --role 'Storage Account Contributor' --scope /subscriptions/subscriptionid/resourceGroups/storageaccountresourcegroup/providers/Microsoft.Storage/storageAccounts/storageaccountname`
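
The role assignment in step 3 can be easier to check when it's split into smaller pieces. The following is a minimal sketch under stated assumptions: the variable names, their placeholder defaults, and the three helper functions are hypothetical, while the `az` commands and their parameters are the ones used in step 3.

```shell
# Placeholder values (assumptions); replace them with your own.
SUBSCRIPTION_ID="${SUBSCRIPTION_ID:-subscriptionid}"
CLUSTER_NAME="${CLUSTER_NAME:-aksclustername}"
CLUSTER_RG="${CLUSTER_RG:-aksclusterresourcegroup}"
STORAGE_RG="${STORAGE_RG:-storageaccountresourcegroup}"
STORAGE_ACCOUNT="${STORAGE_ACCOUNT:-storageaccountname}"

# Build the resource ID of the storage account used as the role scope.
storage_scope() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Storage/storageAccounts/%s\n' \
    "$SUBSCRIPTION_ID" "$STORAGE_RG" "$STORAGE_ACCOUNT"
}

# Look up the principal ID of the identity assigned to the Backup Extension.
extension_principal_id() {
  az k8s-extension show \
    --name azure-aks-backup \
    --cluster-name "$CLUSTER_NAME" \
    --resource-group "$CLUSTER_RG" \
    --cluster-type managedClusters \
    --query aksAssignedIdentity.principalId \
    --output tsv
}

# Assign Storage Account Contributor to the extension identity,
# stopping early if the identity can't be resolved.
assign_storage_role() {
  principal_id="$(extension_principal_id)"
  if [ -z "$principal_id" ]; then
    echo "Could not resolve the extension identity." >&2
    return 1
  fi
  az role assignment create \
    --assignee-object-id "$principal_id" \
    --role 'Storage Account Contributor' \
    --scope "$(storage_scope)"
}
```

Calling `assign_storage_role` runs the same two commands as step 3, but it fails with a clear message when the extension lookup returns nothing instead of passing an empty ID to `az role assignment create`.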
54+
55+
### Error 3
56+
57+
**Error message**: `{"Message":"Error in the getting the Configurations: error {Post \https://centralus.dp.kubernetesconfiguration.azure.com/subscriptions/ subscriptionid /resourceGroups/ aksclusterresourcegroup /provider/managedclusters/clusters/ aksclustername /configurations/getPendingConfigs?api-version=2021-11-01\: dial tcp: lookup centralus.dp.kubernetesconfiguration.azure.com on 10.63.136.10:53: no such host}","LogType":"ConfigAgentTrace","LogLevel":"Error","Environment":"prod","Role":"ClusterConfigAgent","Location":"centralus","ArmId":"/subscriptions/ subscriptionid /resourceGroups/ aksclusterresourcegroup /providers/Microsoft.ContainerService/managedclusters/ aksclustername ","CorrelationId":"","AgentName":"ConfigAgent","AgentVersion":"1.8.14","AgentTimestamp":"2023/01/19 20:24:16"}`
58+
59+
**Cause**: Specific FQDN/application rules are required to use cluster extensions in the AKS clusters. [Learn more](/azure/aks/limit-egress-traffic#cluster-extensions).
60+
61+
This error appears when these FQDN rules are missing, which prevents the cluster from retrieving configuration information from the Cluster Extensions service.
62+
63+
**Resolution**: Create a *CoreDNS-custom override* for the *DP* endpoint so that it passes through the public network.
64+
65+
1. To fetch the existing *CoreDNS-custom* YAML from your cluster (save it locally for later reference), run the following command:
66+
67+
`kubectl get configmap coredns-custom -n kube-system -o yaml`
68+
69+
2. To override the mapping for the *Central US DP* endpoint to a public IP (download the YAML file attached), run the following command:
70+
71+
`kubectl apply -f corednsms.yaml`
72+
73+
3. To force the `coredns` pods to reload, run the following command:
74+
75+
`kubectl delete pod --namespace kube-system -l k8s-app=kube-dns`
76+
77+
4. To run `nslookup` from the *ExtensionAgent* pod and check whether *coreDNS-custom* is working, run the following command:
78+
79+
`kubectl exec -i -t pod/extension-agent-<pod guid that's there in your cluster> -n kube-system -- nslookup centralus.dp.kubernetesconfiguration.azure.com`
80+
81+
5. To check the logs of the *ExtensionAgent* pod, run the following command:
82+
83+
`kubectl logs pod/extension-agent-<pod guid that's there in your cluster> -n kube-system --tail=200`
84+
85+
6. Delete and reinstall the Backup Extension to initiate backup.
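
Steps 4 and 5 above need the full name of the *ExtensionAgent* pod, which differs in every cluster. The following is a minimal sketch, not part of the original article, that looks up the pod name instead of typing it by hand. The function names and the `DP_ENDPOINT` variable are hypothetical, and the sketch assumes that the pod name starts with `extension-agent-` as in the commands above. The `-i -t` flags are dropped because the sketch is meant to run non-interactively.

```shell
# Endpoint from the error message above; change the region prefix for other regions.
DP_ENDPOINT="${DP_ENDPOINT:-centralus.dp.kubernetesconfiguration.azure.com}"

# Print the name of the first extension-agent pod in kube-system,
# in the form pod/extension-agent-<suffix>.
find_extension_agent_pod() {
  kubectl get pods --namespace kube-system --output name \
    | grep '^pod/extension-agent-' \
    | head -n 1
}

# Run nslookup for the DP endpoint from inside the extension-agent pod.
check_dp_dns() {
  pod="$(find_extension_agent_pod)"
  if [ -z "$pod" ]; then
    echo "No extension-agent pod found in kube-system." >&2
    return 1
  fi
  kubectl exec "$pod" --namespace kube-system -- nslookup "$DP_ENDPOINT"
}

# Show the last 200 log lines of the extension-agent pod.
show_agent_logs() {
  pod="$(find_extension_agent_pod)"
  if [ -z "$pod" ]; then
    echo "No extension-agent pod found in kube-system." >&2
    return 1
  fi
  kubectl logs "$pod" --namespace kube-system --tail=200
}
```

Run `check_dp_dns` after the `coredns` pods restart. If the lookup still fails with `no such host`, the override hasn't taken effect yet.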
2386

2487
## Next steps
2588
