|
1 | 1 | ---
|
2 | 2 | title: Troubleshoot Azure Kubernetes Service backup
|
3 |
| -description: Symptoms, causes, and resolutions of Azure Kubernetes Service backup and restore. |
| 3 | +description: Symptoms, causes, and resolutions of the Azure Kubernetes Service backup and restore operations. |
4 | 4 | ms.topic: troubleshooting
|
5 |
| -ms.date: 03/15/2023 |
| 5 | +ms.date: 12/28/2023 |
6 | 6 | ms.service: backup
|
7 | 7 | ms.custom:
|
8 | 8 | - ignite-2023
|
@@ -114,6 +114,114 @@ This error appears due to absence of these FQDN rules because of which configura
|
114 | 114 |
|
115 | 115 | 6. Delete and reinstall Backup Extension to initiate backup.
|
116 | 116 |
|
| 117 | +## Backup Extension post installation related errors |
| 118 | + |
| 119 | +These error codes appear due to issues with the Backup extension installed in the AKS cluster. |
| 120 | + |
| 121 | + |
| 122 | + |
| 123 | +### KubernetesBackupListExtensionsError |
| 124 | + |
| 125 | +**Cause**: As part of validation, the Backup vault checks whether the cluster has the Backup extension installed. To do this, the vault MSI needs the Reader permission on the AKS cluster, which allows it to list all the extensions installed in the cluster. |
| 126 | + |
| 127 | +**Recommended action**: Remove the existing Reader role assignment for the vault MSI and assign the Reader role again, because the current assignment is missing the *list-extension* permission. If reassignment fails, use a different Backup vault to configure backup. |
| 128 | + |
| 129 | +### UserErrorKubernetesBackupExtensionNotFoundError |
| 130 | + |
| 131 | +**Cause**: As part of validation, the Backup vault checks whether the cluster has the Backup extension installed by listing the extensions installed in the cluster. If the Backup extension is absent from the list, this error appears. |
| 132 | + |
| 133 | +**Recommended action**: Use the CLI or the Azure portal client to delete the extension, and then install the extension again. |
| 134 | + |
| 135 | +### UserErrorKubernetesBackupExtensionHasErrors |
| 136 | + |
| 137 | +**Cause**: The Backup extension installed in the cluster has some internal errors. |
| 138 | + |
| 139 | +**Recommended action**: Use the CLI or the Azure portal client to delete the extension, and then install the extension again. |
| 140 | + |
| 141 | +### UserErrorKubernetesBackupExtensionIdentityNotFound |
| 142 | + |
| 143 | +**Cause**: AKS backup requires a Backup extension installed in the cluster. When the extension is installed, a user identity called the extension MSI is created in the resource group that contains the node pools for the AKS cluster. This MSI is assigned the roles required to access the Backup Storage Location. The error code indicates that the extension identity is missing. |
| 144 | + |
| 145 | +**Recommended action**: Use the CLI or the Azure portal client to delete the extension, and then install the extension again. A new identity is created along with the extension. |
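The delete-and-reinstall action that recurs in this section can be sketched with the Azure CLI. The Python helper below only builds the two `az k8s-extension` command lines so they can be reviewed before running. It's a minimal sketch under stated assumptions: the function name and all sample values are hypothetical, and the extension name `azure-aks-backup`, the extension type `microsoft.dataprotection.kubernetes`, and the configuration setting keys are taken from the Azure CLI documentation for the AKS Backup extension as best recalled. Verify them against your installed CLI version.

```python
import shlex


def extension_reinstall_commands(cluster, resource_group, storage_account,
                                 blob_container, storage_rg, storage_subscription,
                                 extension_name="azure-aks-backup"):
    """Build (but don't run) the CLI commands to delete and reinstall the extension.

    All parameter values are placeholders supplied by the caller.
    """
    # Arguments shared by both commands: which extension on which AKS cluster.
    target = [
        "--name", extension_name,
        "--cluster-name", cluster,
        "--resource-group", resource_group,
        "--cluster-type", "managedClusters",
    ]
    delete_cmd = ["az", "k8s-extension", "delete", *target, "--yes"]
    create_cmd = [
        "az", "k8s-extension", "create", *target,
        "--extension-type", "microsoft.dataprotection.kubernetes",
        "--scope", "cluster",
        "--release-train", "stable",
        # Backup Storage Location: storage account and blob container (assumed keys).
        "--configuration-settings",
        f"blobContainer={blob_container}",
        f"storageAccount={storage_account}",
        f"storageAccountResourceGroup={storage_rg}",
        f"storageAccountSubscriptionId={storage_subscription}",
    ]
    return [delete_cmd, create_cmd]


# Hypothetical sample values, for illustration only.
commands = extension_reinstall_commands(
    cluster="my-aks", resource_group="my-aks-rg",
    storage_account="mybackupsa", blob_container="backups",
    storage_rg="my-storage-rg",
    storage_subscription="00000000-0000-0000-0000-000000000000",
)
for cmd in commands:
    print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch free of side effects. Run the delete command first and wait for it to complete before running the create command.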
| 146 | + |
| 147 | +### KubernetesBackupCustomResourcesTrackingTimeOutError |
| 148 | + |
| 149 | +**Cause**: Azure Backup for AKS requires a Backup extension to be installed in the cluster. To perform the backup and restore operations, custom resources (CRs) are created in the cluster. The extension spawns pods that perform backup-related operations via these CRs. This error occurs when the extension isn't able to update the status of these CRs. |
| 150 | + |
| 151 | +**Recommended action**: Verify the health of the extension by running the command `kubectl get pods -n dataprotection.microsoft`. If the pods aren't in the *running* state, increase the number of nodes in the cluster by *1* or increase the compute limits. Then wait a few minutes and run the command again; the pods should now be in the *running* state. If the issue persists, delete and reinstall the extension. |
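As a rough illustration of the health check, the following Python sketch filters the JSON form of the `kubectl get pods -n dataprotection.microsoft` output for pods that aren't in the *Running* phase. The function name and the sample pod names are hypothetical. The sketch assumes only the standard `items[].metadata.name` and `items[].status.phase` fields of a Kubernetes pod list.

```python
import json


def pods_not_running(kubectl_json: str) -> list[tuple[str, str]]:
    """Return (pod name, phase) for every pod whose phase isn't Running."""
    pod_list = json.loads(kubectl_json)
    unhealthy = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        # A pod without a reported phase is treated as Unknown.
        phase = pod.get("status", {}).get("phase", "Unknown")
        if phase != "Running":
            unhealthy.append((name, phase))
    return unhealthy


# Hypothetical, heavily trimmed sample of what
# `kubectl get pods -n dataprotection.microsoft -o json` might return.
sample = json.dumps({
    "items": [
        {"metadata": {"name": "backup-controller-0"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "backup-worker-1"}, "status": {"phase": "Pending"}},
    ]
})

print(pods_not_running(sample))
```

An empty result suggests that all extension pods are running. Note that this simple filter also reports pods in the `Succeeded` phase, so treat its output as a prompt for closer inspection rather than a definitive health verdict.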
| 152 | + |
| 153 | +### BackupPluginDeleteBackupOperationFailed |
| 154 | + |
| 155 | +**Cause**: The Backup extension should be running to delete the backups. |
| 156 | + |
| 157 | +**Recommended action**: If the cluster is running, verify that the extension is running in a healthy state. Check whether the extension pods are spawning; if they aren't, increase the number of nodes. If that fails, try deleting and reinstalling the extension. If the backed-up cluster is deleted, manually delete the snapshots and metadata. |
| 158 | + |
| 159 | +### ExtensionTimedOutWaitingForBackupItemSync |
| 160 | + |
| 161 | +**Cause**: The Backup extension waits for the backup items to be synced with the storage account. This error appears when that wait times out. |
| 162 | + |
| 163 | +**Recommended action**: If this error code appears, then either retry the backup operation or reinstall the extension. |
| 164 | + |
| 165 | +## Backup storage location based errors |
| 166 | + |
| 167 | +These error codes appear due to issues with the Backup Storage Location provided to the Backup extension installed in the AKS cluster. |
| 168 | + |
| 169 | +### UserErrorDeleteBackupFailedBackupStorageLocationReadOnly |
| 170 | + |
| 171 | +**Cause**: The storage account provided as input during Backup extension installation is in the *read only* state, which prevents the backup data from being deleted from the blob container. |
| 172 | + |
| 173 | +**Recommended action**: Change the storage account state from *read only* to *write*. |
| 174 | + |
| 175 | +### UserErrorDeleteBackupFailedBackupStorageLocationNotFound |
| 176 | + |
| 177 | +**Cause**: During the extension installation, a Backup Storage Location is to be provided as input that includes a storage account and blob container. This error appears if the location is deleted or incorrectly added during extension installation. |
| 178 | + |
| 179 | +**Recommended action**: Delete the Backup extension, and then reinstall it with correct storage account and blob container as input. |
| 180 | + |
| 181 | +### UserErrorBackupFailedBackupStorageLocationReadOnly |
| 182 | + |
| 183 | +**Cause**: The storage account provided as input during Backup extension installation is in the *read only* state, which prevents backup data from being written to the blob container. |
| 184 | + |
| 185 | +**Recommended action**: Change the storage account state from *read only* to *write*. |
| 186 | + |
| 187 | +### UserErrorNoDefaultBackupStorageLocationFound |
| 188 | + |
| 189 | +**Cause**: During extension installation, a Backup Storage Location is to be provided as input, which includes a storage account and blob container. The error appears if the location is deleted or incorrectly entered during extension installation. |
| 190 | + |
| 191 | +**Recommended action**: Delete the Backup extension, and then reinstall it with correct storage account and blob container as input. |
| 192 | + |
| 193 | +### UserErrorExtensionMSIMissingPermissionsOnBackupStorageLocation |
| 194 | + |
| 195 | +**Cause**: The Backup extension should have the *Storage Account Contributor* role on the Backup Storage Location (storage account). The Extension Identity gets this role assigned. |
| 196 | + |
| 197 | +**Recommended action**: If this role is missing, then use Azure portal or CLI to reassign this missing permission on the storage account. |
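For the CLI route, the role assignment can be sketched as follows. This Python helper only assembles the `az role assignment create` command line for review. It's a minimal sketch: the function name and all sample values are hypothetical, and it assumes you already know the principal (object) ID of the extension identity.

```python
import shlex


def storage_role_assignment_command(extension_principal_id, subscription_id,
                                    resource_group, storage_account):
    """Build (but don't run) the CLI command that grants the extension identity
    the Storage Account Contributor role on the storage account."""
    # Resource ID of the storage account that backs the Backup Storage Location.
    scope = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{storage_account}"
    )
    return [
        "az", "role", "assignment", "create",
        "--assignee-object-id", extension_principal_id,
        "--assignee-principal-type", "ServicePrincipal",
        "--role", "Storage Account Contributor",
        "--scope", scope,
    ]


# Hypothetical sample values, for illustration only.
command = storage_role_assignment_command(
    extension_principal_id="11111111-1111-1111-1111-111111111111",
    subscription_id="00000000-0000-0000-0000-000000000000",
    resource_group="my-storage-rg",
    storage_account="mybackupsa",
)
print(shlex.join(command))
```

Role assignments can take some time to propagate, so a retry shortly after the assignment might still fail.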
| 198 | + |
| 199 | +### UserErrorBackupStorageLocationNotReady |
| 200 | + |
| 201 | +**Cause**: During extension installation, a Backup Storage Location is to be provided as input that includes a storage account and blob container. The Backup extension should have *Storage Account Contributor* role on the Backup Storage Location (storage account). The Extension Identity gets this role assigned. |
| 202 | + |
| 203 | +**Recommended action**: The error appears if the Extension Identity doesn't have the right permissions to access the storage account. It can occur when the AKS backup extension is installed for the first time during the configure protection operation, because the granted permissions take time to propagate to the extension. As a workaround, wait an hour and retry the protection configuration. Otherwise, use the Azure portal or CLI to reassign the missing permission on the storage account. |
| 204 | + |
| 205 | +## Vaulted backup based errors |
| 206 | + |
| 207 | +This error code can appear when you enable AKS backup to store backups in a vault-standard datastore. |
| 208 | + |
| 209 | +### DppUserErrorVaultTierPolicyNotSupported |
| 210 | + |
| 211 | +**Cause**: This error code appears when a backup policy is created with a retention rule defined for the vault-standard datastore, for a Backup vault in a region where this datastore isn't supported. |
| 212 | + |
| 213 | +**Recommended action**: In the Azure portal, update the retention rule that has the vault-standard duration defined: |
| 214 | + |
| 215 | +1. Select the **Edit** icon next to the rule. |
| 216 | + |
| 217 | + :::image type="content" source="./media/azure-kubernetes-service-backup-troubleshoot/edit-backup-policy-for-vaulted-backup.png" alt-text="Screenshot shows how to edit the retention duration of the AKS backups." lightbox="./media/azure-kubernetes-service-backup-troubleshoot/edit-backup-policy-for-vaulted-backup.png"::: |
| 218 | + |
| 219 | +2. Clear the checkbox next to **Vault-standard**, and then select **Update**. |
| 220 | + |
| 221 | + :::image type="content" source="./media/azure-kubernetes-service-backup-troubleshoot/clear-vault-standard-checkbox.png" alt-text="Screenshot shows clearing the vault-standard checkbox." lightbox="./media/azure-kubernetes-service-backup-troubleshoot/clear-vault-standard-checkbox.png"::: |
| 222 | + |
| 223 | +3. Create a backup policy for operational tier backup (only snapshots for the AKS cluster). |
| 224 | + |
117 | 225 | ## Next steps
|
118 | 226 |
|
119 | 227 | - [About Azure Kubernetes Service (AKS) backup](azure-kubernetes-service-backup-overview.md)