Commit e948a0a

Merge branch 'MicrosoftDocs:main' into main
2 parents fe1bfed + 85acbb4 commit e948a0a

30 files changed

+292
-191
lines changed

articles/backup/aks-backup-faq.yml

Lines changed: 18 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -29,7 +29,7 @@ sections:
2929
3030
- The storage account must be of `Standard general-purpose v2` type.
3131
- The blob container must be created in the storage account before installing the AKS Backup Extension.
32-
- The blob container should preferably be empty before installation or atleast shouldn't haven't nonbackup related data in it, as the extension will create its own folder structure within the container to store backup data and metadata.
32+
- The blob container should preferably be empty before installation or at least shouldn't have nonbackup related data in it, as the extension will create its own folder structure within the container to store backup data and metadata.
3333
- In case the AKS cluster is within a Private Network, the storage account must be accessible from the AKS cluster. This can be achieved by using a Private Endpoint for the storage account or by configuring the necessary network rules to allow access from the AKS cluster to the storage account.
3434
3535
- question: |
@@ -235,8 +235,24 @@ sections:
235235

236236
1. Microsoft.Storage/storageAccounts/blobServices/containers/blobs/*
237237

238+
- question: |
239+
Will snapshots for all Persistent Volumes (PVs) in a backup configuration be taken at the exact same time, or is there a delay?
240+
answer: |
241+
Azure Backup for AKS does not currently support taking snapshots of all PVs at the exact same millisecond. While the snapshot operations are initiated in parallel, there may be slight delays between individual PV snapshots due to infrastructure and API timing. To help achieve consistency across multiple PVs, Azure Backup supports application-consistent backups using hooks. Hooks allow users to pause application writes before snapshotting and resume them afterward. This approach improves consistency and mimics crash consistency, though it may not be equivalent to true simultaneous snapshots or coordinated database-level consistency.
242+
243+
- question: |
244+
What happens if I select the "Skip" option for Kubernetes resources including PVCs during an AKS restore?
245+
answer: |
246+
Selecting "Skip" means the restore process will not overwrite or recreate Kubernetes resources that already exist in the target cluster; matching resources are reused as-is. If they do not exist, Azure Backup will attempt to dynamically recreate them. For PVs, ensure that compatible StorageClass definitions and permissions exist in the target environment.
247+
248+
- question: |
249+
Why is my restored cluster trying to mount PVCs from the original source cluster?
250+
answer: |
251+
This typically happens when the restored cluster references Persistent Volumes (PVs) that still point to the original source resource group. AKS separates cluster resources into two resource groups: one for the control plane and another for infrastructure (like disks). If the PVC-to-PV mapping wasn’t correctly updated during restore, the restored workloads may attempt to attach source PVs, resulting in errors. Ensure that the restore process correctly remaps PVCs to new or existing PVs in the target cluster's resource group.
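One way to check where a restored PV points is to look at the resource group inside its disk resource ID. The following is a minimal sketch, assuming the PVs are provisioned by the Azure Disk CSI driver so that `spec.csi.volumeHandle` holds the full disk resource ID. The helper name `resource_group_of` and the sample ID are hypothetical.

```shell
# Hypothetical helper: print the resource group segment of an Azure resource ID.
# Resource IDs have the form /subscriptions/<sub>/resourceGroups/<rg>/providers/...
resource_group_of() {
  printf '%s\n' "$1" | awk -F/ '{
    # Match the segment name case-insensitively, then print the segment after it.
    for (i = 1; i < NF; i++)
      if (tolower($i) == "resourcegroups") { print $(i + 1); exit }
  }'
}

# Example with a made-up disk ID; prints MC_demo_rg
resource_group_of "/subscriptions/0000/resourceGroups/MC_demo_rg/providers/Microsoft.Compute/disks/pvc-1234"

# To list the real volume handles in a cluster (requires kubectl access):
# kubectl get pv -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.csi.volumeHandle}{"\n"}{end}'
```

If the printed resource group is the source cluster's infrastructure resource group, that PV still references a source disk.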
252+
253+
238254
additionalContent: |
239255
240256
## Next steps
241257
242-
- [Azure Backup for AKS support matrix](azure-kubernetes-service-cluster-backup-support-matrix.md)
258+
- [Azure Backup for AKS support matrix](azure-kubernetes-service-cluster-backup-support-matrix.md)

articles/backup/azure-kubernetes-service-backup-troubleshoot.md

Lines changed: 85 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -163,6 +163,46 @@ Example log message:
163163
In this case, there is a Network/Calico policy or NSG that didn't allow dataprotection-microsoft pods to communicate with the API server.
164164
You should allow the dataprotection-microsoft namespace, and then reinstall the extension.
165165

166+
### Scenario 5
167+
168+
The extension agent fails to communicate with the data plane endpoints, so the backup extension pods aren't deployed.
169+
170+
**Error message**:
171+
The extension agent in your AKS cluster fails to connect to the Azure Kubernetes Configuration service data plane endpoints (`*.dp.kubernetesconfiguration.azure.com`) in your region. The logs of the `extension-agent` pod show the failure, typically as repeated 403 errors for requests to the data plane endpoints:
172+
173+
```
Error code: 403
Message: This traffic is not authorized
```
177+
This typically means that traffic from the extension agent is blocked or lacks the necessary authorization to reach the Azure service. The extension agent is required to install and run the backup extension in the AKS cluster.
178+
179+
**Cause**
180+
This error occurs due to a conflict in private DNS resolution when both Azure Arc-enabled Kubernetes and an AKS managed cluster share the same virtual network (VNet) or private DNS server:
181+
182+
The shared VNet (or private DNS zone) contains a preexisting private endpoint for Azure Arc-enabled Kubernetes.
183+
184+
As a result, the data plane endpoint used by the AKS extension agent (for example, `*.dp.kubernetesconfiguration.azure.com`) resolves to a private IP address (for example, `10.x.x.x`) instead of the intended public IP.
185+
186+
This misrouting causes the AKS extension agent to send traffic to an unintended private endpoint, which results in the 403 errors. You can verify the resolved IP address of the data plane endpoint from within your AKS cluster using the following command:
187+
188+
```
kubectl exec -it -n kube-system extension-agent-<podGuid> -- nslookup <region>.dp.kubernetesconfiguration.azure.com
```
191+
192+
Replace `<region>` with your specific Azure region (for example, `eastus` or `westeurope`) and `<podGuid>` with the suffix of your `extension-agent` pod name.
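To interpret the address that `nslookup` returns, it can help to classify it as private or public. The following is a minimal sketch, assuming IPv4 and treating only the RFC 1918 ranges (`10.0.0.0/8`, `172.16.0.0/12`, `192.168.0.0/16`) as private. The function name `is_private_ip` is hypothetical.

```shell
# Hypothetical helper: print "private" for RFC 1918 IPv4 addresses, "public" otherwise.
is_private_ip() {
  case "$1" in
    10.*|192.168.*) echo private ;;
    # 172.16.0.0/12 covers 172.16.x.x through 172.31.x.x
    172.1[6-9].*|172.2[0-9].*|172.3[0-1].*) echo private ;;
    *) echo public ;;
  esac
}

is_private_ip 10.0.12.5      # prints private
is_private_ip 203.0.113.10   # prints public (documentation-range address)
```

A `private` result for the data plane endpoint matches the DNS conflict described in the cause above.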
193+
194+
**Resolution**
195+
To resolve this issue, consider the following approaches:
196+
197+
- **Use Separate VNets:** If you use both Azure Arc-enabled Kubernetes and AKS clusters, deploy them in separate virtual networks to avoid DNS resolution conflicts caused by shared private endpoints.
198+
199+
- **Configure CoreDNS Override:** Override the CoreDNS settings in your AKS cluster to explicitly resolve the extension data plane endpoint to its public IP address. Refer to Scenario 3 in the documentation for detailed steps on configuring a CoreDNS override for the extension.
200+
201+
- **Verify Public IP Resolution:** Identify the correct public IP address of the extension data plane endpoint by using the nslookup command. Replace the region with your AKS cluster’s region:
202+
203+
```
nslookup eastus2euap.dp.kubernetesconfiguration.azure.com
```
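The CoreDNS override can be sketched as a small script that renders a `coredns-custom` ConfigMap using the CoreDNS `hosts` plugin. This is an illustration under assumptions, not the exact steps from Scenario 3: the function name `render_coredns_override`, the data key `aksdp.override`, and the IP address `203.0.113.10` (a documentation-range placeholder) are all hypothetical. Substitute the public IP you resolved for your region.

```shell
# Hypothetical helper: print a coredns-custom ConfigMap that pins the regional
# data plane hostname to one IP address.
# Usage: render_coredns_override <region> <public-ip>
render_coredns_override() {
  region="$1"
  public_ip="$2"
  cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-custom
  namespace: kube-system
data:
  aksdp.override: |
    hosts {
      ${public_ip} ${region}.dp.kubernetesconfiguration.azure.com
      fallthrough
    }
EOF
}

# Render the manifest for review (placeholder values):
render_coredns_override eastus 203.0.113.10
```

After reviewing the output, it could be piped to `kubectl apply -f -`. Applying it replaces any existing `coredns-custom` ConfigMap, so merge existing entries first, and restart the CoreDNS pods afterward if the change isn't picked up.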
166206

167207
## Backup Extension post installation related errors
168208

@@ -332,6 +372,51 @@ These error codes appear due to issues based on the Backup extension installed i
332372

333373
**Recommended action**: If you're configuring a new backup instance, use a resource group without a delete or read lock. If the backup instance is already configured, remove the lock from the snapshot resource group.
334374

375+
### KubernetesBackupGenericWarning
376+
377+
**Cause**: This error code indicates that a Kubernetes resource could not be backed up or restored, typically due to validation or dependency issues within the cluster.
378+
379+
One commonly observed scenario is a failure during the restoration of Ingress resources because a validating webhook can't run. Possible reasons include:

- A required service (for example, `fabp-ingress-nginx-controller-admission`) is missing, which prevents the webhook `validate.nginx.ingress.kubernetes.io` from executing properly.
- The validating webhook configuration exists but references a nonexistent or misconfigured service.
- DNS resolution issues prevent the webhook from reaching the intended endpoint.
- The cluster uses custom admission webhooks that weren't backed up or recreated before the restore.
- The webhook configuration is obsolete or unnecessary for the restored cluster state.
380+
381+
**Recommended action**:
382+
383+
- Verify whether the service `fabp-ingress-nginx-controller-admission` exists by using:
384+
385+
```bash
kubectl get svc -n ingress-basic
```
388+
389+
- If the service is missing, check deployment configurations and recreate it if necessary.
390+
391+
- Investigate potential DNS resolution issues by running:
392+
393+
```bash
kubectl get endpoints -n ingress-basic

nslookup fabp-ingress-nginx-controller-admission.ingress-basic.svc.cluster.local
```
398+
399+
- If the webhook validation is unnecessary, consider removing it using:
400+
401+
```bash
kubectl delete validatingwebhookconfiguration <webhook-configuration-name>
```
404+
405+
- List all webhook configurations with:
406+
407+
```bash
kubectl get validatingwebhookconfigurations
```
410+
411+
- If the issue is resolved, manually restore the ingress by applying its YAML backup:
412+
413+
```bash
kubectl apply -f <ingress-backup-file>.yaml
```
416+
417+
>[!Note]
418+
>This warning can arise from multiple causes. If the above steps do not resolve your issue, consult the Kubernetes controller logs and webhook configuration for more specific error messages.
419+
335420
## Vaulted backup based errors
336421

337422
These error codes can appear while you enable AKS backup to store backups in a vault standard datastore.

articles/backup/azure-kubernetes-service-cluster-backup-concept.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -72,7 +72,7 @@ To enable backup for an AKS cluster, see the following prerequisites: .
7272

7373
- The Backup Extension during installation fetches Container Images stored in Microsoft Container Registry (MCR). If you enable a firewall on the AKS cluster, the extension installation process might fail due to access issues on the Registry. Learn [how to allow MCR access from the firewall](/azure/container-registry/container-registry-firewall-access-rules#configure-client-firewall-rules-for-mcr).
7474

75-
- In case you have the cluster in a Private Virtual Network and Firewall, apply the following FQDN/application rules: `*.microsoft.com`, `mcr.microsoft.com`, `data.mcr.microsoft.com`, `crl.microsoft.com`, `mscrl.microsoft.com`, `oneocsp.microsoft.com` , `*.azure.com`, `management.azure.com`, `gcs.prod.monitoring.core.windows.net`, `*.prod.warm.ingest.monitor.core.windows.net`, `*.blob.core.windows.net`, `*.azmk8s.io`, `ocsp.digicert.com`, `cacerts.digicert.com`, `crl3.digicert.com`, `crl4.digicert.com`, `ocsp.digicert.cn`, `cacerts.digicert.cn`, `cacerts.geotrust.com`, `cdp.geotrust.com`, `status.geotrust.com`, `ocsp.msocsp.com`, `*.azurecr.io`, `docker.io`, `*.dp.kubernetesconfiguration.azure.com`. Learn [how to apply FQDN rules](../firewall/dns-settings.md).
75+
- During the installation of the backup extension in Azure Kubernetes Service (AKS), communication with several Fully Qualified Domain Names (FQDNs) is required to support essential operations. In addition to Azure Backup and Storage Accounts, AKS must also access external endpoints to download container images for running backup pods and to emit service logs to Microsoft Defender for Endpoint via MDM. Therefore, if your cluster is deployed in a private virtual network with firewall restrictions, ensure the following FQDNs or application rules are allowed: `*.microsoft.com`, `mcr.microsoft.com`, `data.mcr.microsoft.com`, `crl.microsoft.com`, `mscrl.microsoft.com`, `oneocsp.microsoft.com`, `*.azure.com`, `management.azure.com`, `gcs.prod.monitoring.core.windows.net`, `*.prod.warm.ingest.monitor.core.windows.net`, `*.blob.core.windows.net`, `*.azmk8s.io`, `ocsp.digicert.com`, `cacerts.digicert.com`, `crl3.digicert.com`, `crl4.digicert.com`, `ocsp.digicert.cn`, `cacerts.digicert.cn`, `cacerts.geotrust.com`, `cdp.geotrust.com`, `status.geotrust.com`, `ocsp.msocsp.com`, `*.azurecr.io`, `docker.io`, `*.dp.kubernetesconfiguration.azure.com`. Learn [how to apply FQDN rules](../firewall/dns-settings.md).
7676

7777
- If you have any previous installation of *Velero* in the AKS cluster, you need to delete it before installing Backup Extension.
7878
