articles/backup/aks-backup-faq.yml
Lines changed: 18 additions & 2 deletions
@@ -29,7 +29,7 @@ sections:
29
29
30
30
- The storage account must be of `Standard general-purpose v2` type.
31
31
- The blob container must be created in the storage account before installing the AKS Backup Extension.
32
-
- The blob container should preferably be empty before installation or atleast shouldn't haven't nonbackup related data in it, as the extension will create its own folder structure within the container to store backup data and metadata.
32
+
- The blob container should preferably be empty before installation, or at least shouldn't contain data unrelated to backup, because the extension creates its own folder structure within the container to store backup data and metadata.
33
33
- In case the AKS cluster is within a Private Network, the storage account must be accessible from the AKS cluster. This can be achieved by using a Private Endpoint for the storage account or by configuring the necessary network rules to allow access from the AKS cluster to the storage account.
Will snapshots for all Persistent Volumes (PVs) in a backup configuration be taken at the exact same time, or is there a delay?
240
+
answer: |
241
+
Azure Backup for AKS does not currently support taking snapshots of all PVs at the exact same millisecond. While the snapshot operations are initiated in parallel, there may be slight delays between individual PV snapshots due to infrastructure and API timing. To help achieve consistency across multiple PVs, Azure Backup supports application-consistent backups using hooks. Hooks allow users to pause application writes before snapshotting and resume them afterward. This approach improves consistency and mimics crash consistency, though it may not be equivalent to true simultaneous snapshots or coordinated database-level consistency.
242
+
243
+
- question: |
244
+
What happens if I select the "Skip" option for Kubernetes resources including PVCs during an AKS restore?
245
+
answer: |
246
+
Selecting "Skip" means the restore process won't overwrite or recreate Kubernetes resources that already exist in the target cluster; matching resources are reused as-is. Resources that don't exist in the target cluster are recreated dynamically by Azure Backup. For PVs, ensure that compatible StorageClass definitions and permissions exist in the target environment.
247
+
248
+
- question: |
249
+
Why is my restored cluster trying to mount PVCs from the original source cluster?
250
+
answer: |
251
+
This typically happens when the restored cluster references Persistent Volumes (PVs) that still point to the original source resource group. AKS separates cluster resources into two resource groups: one for the control plane and another for infrastructure (like disks). If the PVC-to-PV mapping wasn’t correctly updated during restore, the restored workloads may attempt to attach source PVs, resulting in errors. Ensure that the restore process correctly remaps PVCs to new or existing PVs in the target cluster's resource group.
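
To check which resource group a PV's disk lives in, one option is to read the disk's resource ID and extract its resource group segment. The following is a minimal sketch, not an official tool: `resource_group_of` is a hypothetical helper name, and it assumes the PVs are Azure Disk CSI volumes, where `spec.csi.volumeHandle` holds the disk's Azure resource ID.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Azure Backup or kubectl): print the
# resource group segment of an Azure resource ID passed as the first argument.
resource_group_of() {
  printf '%s\n' "$1" | awk -F/ '{
    # Walk the path segments; the one after "resourceGroups" is the group name.
    for (i = 1; i < NF; i++)
      if (tolower($i) == "resourcegroups") { print $(i + 1); exit }
  }'
}

# Assumed way to list the disk IDs behind each PV (Azure Disk CSI volumes only):
#   kubectl get pv -o custom-columns=NAME:.metadata.name,HANDLE:.spec.csi.volumeHandle
```

If the printed resource group belongs to the source cluster rather than the target, the PV still points at a source disk.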
252
+
253
+
238
254
additionalContent: |
239
255
240
256
## Next steps
241
257
242
-
- [Azure Backup for AKS support matrix](azure-kubernetes-service-cluster-backup-support-matrix.md)
258
+
- [Azure Backup for AKS support matrix](azure-kubernetes-service-cluster-backup-support-matrix.md)
articles/backup/azure-kubernetes-service-backup-troubleshoot.md
Lines changed: 85 additions & 0 deletions
@@ -163,6 +163,46 @@ Example log message:
163
163
In this case, there is a Network/Calico policy or NSG that didn't allow dataprotection-microsoft pods to communicate with the API server.
164
164
You should allow the dataprotection-microsoft namespace, and then reinstall the extension.
165
165
166
+
### Scenario 5
167
+
168
+
The extension agent fails to communicate with data plane endpoints, so the backup extension pods aren't deployed.
169
+
170
+
**Error message**:
171
+
The extension agent in your AKS cluster is failing to connect to the Azure Kubernetes Configuration service data plane endpoints `*.dp.kubernetesconfiguration.azure.com` in your region. To confirm this failure, review the logs of the `extension-agent` pod. You will likely see repeated 403 errors for requests to data plane endpoints:
172
+
173
+
```
174
+
Error code: 403
175
+
Message: This traffic is not authorized
176
+
```
177
+
This typically means that traffic from the extension agent is being blocked or lacks the necessary authorization to reach the Azure service. The extension agent is required to install and run the backup extension in the AKS cluster.
178
+
179
+
**Cause**
180
+
This error occurs due to a conflict in private DNS resolution when both Azure Arc-enabled Kubernetes and an AKS managed cluster share the same virtual network (VNet) or private DNS server:
181
+
182
+
The shared VNet (or private DNS zone) contains a preexisting private endpoint for Azure Arc-enabled Kubernetes.
183
+
184
+
As a result, the data plane endpoint used by the AKS extension agent (e.g., *.dp.kubernetesconfiguration.azure.com) resolves to a private IP address (e.g., 10.x.x.x) instead of the intended public IP.
185
+
186
+
This misrouting causes the AKS extension agent to send traffic to an unintended private endpoint, which results in 403 errors. You can verify the resolved IP address of the data plane endpoint from within your AKS cluster using the following command:
Replace `region` with your specific Azure region (e.g., eastus, westeurope).
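
Once you have the resolved address, a quick way to tell whether it falls in a private (RFC 1918) range is a small shell check. This is a minimal sketch, not part of the Azure tooling: `is_private_ipv4` is a hypothetical helper name, and it only covers the IPv4 ranges 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16.

```bash
#!/usr/bin/env bash
# Hypothetical helper: classify an IPv4 address as "private" (RFC 1918)
# or "public" based on its leading octets.
is_private_ipv4() {
  case "$1" in
    10.*) echo private ;;                                  # 10.0.0.0/8
    192.168.*) echo private ;;                             # 192.168.0.0/16
    172.1[6-9].*|172.2[0-9].*|172.3[0-1].*) echo private ;; # 172.16.0.0/12
    *) echo public ;;
  esac
}

# Example usage with an illustrative address:
#   is_private_ipv4 10.0.4.7
```

A `private` result for the data plane endpoint suggests that DNS is resolving to a private endpoint, which matches the cause described above.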
193
+
194
+
**Resolution**
195
+
To resolve this issue, consider the following approaches:
196
+
197
+
- **Use Separate VNets:** If you use both Azure Arc-enabled Kubernetes and AKS clusters, deploy them in separate virtual networks to avoid DNS resolution conflicts caused by shared private endpoints.
198
+
199
+
- **Configure CoreDNS Override:** Override the CoreDNS settings in your AKS cluster to explicitly resolve the extension data plane endpoint to its public IP address. Refer to Scenario 3 in the documentation for detailed steps on configuring a CoreDNS override for the extension.
200
+
201
+
- **Verify Public IP Resolution:** Identify the correct public IP address of the extension data plane endpoint by using the nslookup command. Replace the region with your AKS cluster’s region:
## Backup Extension post installation related errors
168
208
@@ -332,6 +372,51 @@ These error codes appear due to issues based on the Backup extension installed i
332
372
333
373
**Recommended action**: If you're configuring a new backup instance, use a resource group without a delete or read lock. If the backup instance is already configured, remove the lock from the snapshot resource group.
334
374
375
+
### KubernetesBackupGenericWarning
376
+
377
+
**Cause**: This error code indicates that a Kubernetes resource could not be backed up or restored, typically due to validation or dependency issues within the cluster.
378
+
379
+
One commonly observed scenario is a failure during the restoration of Ingress resources due to issues with validating webhooks. Possible causes include:

- A required service (for example, `fabp-ingress-nginx-controller-admission`) is missing, which prevents the webhook `validate.nginx.ingress.kubernetes.io` from executing properly.
- The validating webhook configuration exists but references a nonexistent or misconfigured service.
- DNS resolution issues prevent the webhook from reaching the intended endpoint.
- The cluster uses custom admission webhooks that weren't backed up or recreated before the restore.
- The webhook configuration is obsolete or unnecessary for the restored cluster state.
380
+
381
+
**Recommended action**:
382
+
383
+
- Verify whether the service `fabp-ingress-nginx-controller-admission` exists by using:
384
+
385
+
```bash
386
+
kubectl get svc -n ingress-basic
387
+
```
388
+
389
+
- If the service is missing, check deployment configurations and recreate it if necessary.
390
+
391
+
- Investigate potential DNS resolution issues by running:
- If the webhook validation is unnecessary, consider removing it using:
400
+
401
+
```bash
402
+
kubectl delete validatingwebhookconfiguration
403
+
```
404
+
405
+
- List all webhook configurations with:
406
+
407
+
```bash
408
+
kubectl get validatingwebhookconfigurations
409
+
```
410
+
411
+
- If the issue is resolved, manually restore the ingress by applying its YAML backup:
412
+
413
+
```bash
414
+
kubectl apply -f
415
+
```
416
+
417
+
>[!Note]
418
+
>This warning can arise from multiple causes. If the above steps do not resolve your issue, consult the Kubernetes controller logs and webhook configuration for more specific error messages.
419
+
335
420
## Vaulted backup based errors
336
421
337
422
These error codes can appear while you enable AKS backup to store backups in a vault standard datastore.
articles/backup/azure-kubernetes-service-cluster-backup-concept.md
Lines changed: 1 addition & 1 deletion
@@ -72,7 +72,7 @@ To enable backup for an AKS cluster, see the following prerequisites: .
72
72
73
73
- The Backup Extension during installation fetches Container Images stored in Microsoft Container Registry (MCR). If you enable a firewall on the AKS cluster, the extension installation process might fail due to access issues on the Registry. Learn [how to allow MCR access from the firewall](/azure/container-registry/container-registry-firewall-access-rules#configure-client-firewall-rules-for-mcr).
74
74
75
-
- In case you have the cluster in a Private Virtual Network and Firewall, apply the following FQDN/application rules: `*.microsoft.com`, `mcr.microsoft.com`, `data.mcr.microsoft.com`, `crl.microsoft.com`, `mscrl.microsoft.com`, `oneocsp.microsoft.com` , `*.azure.com`, `management.azure.com`, `gcs.prod.monitoring.core.windows.net`, `*.prod.warm.ingest.monitor.core.windows.net`, `*.blob.core.windows.net`, `*.azmk8s.io`, `ocsp.digicert.com`, `cacerts.digicert.com`, `crl3.digicert.com`, `crl4.digicert.com`, `ocsp.digicert.cn`, `cacerts.digicert.cn`, `cacerts.geotrust.com`, `cdp.geotrust.com`, `status.geotrust.com`, `ocsp.msocsp.com`, `*.azurecr.io`, `docker.io`, `*.dp.kubernetesconfiguration.azure.com`. Learn [how to apply FQDN rules](../firewall/dns-settings.md).
75
+
- During installation of the backup extension in Azure Kubernetes Service (AKS), the cluster must communicate with several fully qualified domain names (FQDNs) to support essential operations. In addition to Azure Backup and Storage Accounts, AKS must also access external endpoints to download container images for running backup pods and to emit service logs to Microsoft Defender for Endpoint via MDM. Therefore, if your cluster is deployed in a private virtual network with firewall restrictions, ensure the following FQDNs or application rules are allowed: `*.microsoft.com`, `mcr.microsoft.com`, `data.mcr.microsoft.com`, `crl.microsoft.com`, `mscrl.microsoft.com`, `oneocsp.microsoft.com`, `*.azure.com`, `management.azure.com`, `gcs.prod.monitoring.core.windows.net`, `*.prod.warm.ingest.monitor.core.windows.net`, `*.blob.core.windows.net`, `*.azmk8s.io`, `ocsp.digicert.com`, `cacerts.digicert.com`, `crl3.digicert.com`, `crl4.digicert.com`, `ocsp.digicert.cn`, `cacerts.digicert.cn`, `cacerts.geotrust.com`, `cdp.geotrust.com`, `status.geotrust.com`, `ocsp.msocsp.com`, `*.azurecr.io`, `docker.io`, `*.dp.kubernetesconfiguration.azure.com`. Learn [how to apply FQDN rules](../firewall/dns-settings.md).
76
76
77
77
- If you have any previous installation of *Velero* in the AKS cluster, you need to delete it before installing Backup Extension.