Commit 1f2f089

Merge pull request #110233 from davidsmatlak/ds-docsfresh05
Copyedits, fixes links
2 parents d9ddc47 + 3b60df8 commit 1f2f089

File tree

1 file changed

+52
-41
lines changed

articles/site-recovery/azure-to-azure-troubleshoot-replication.md

@@ -4,30 +4,33 @@ description: Troubleshoot replication in Azure VM disaster recovery with Azure S
44
author: sideeksh
55
manager: rochakm
66
ms.topic: troubleshooting
7-
ms.date: 8/2/2019
8-
7+
ms.date: 04/03/2020
98
---
9+
1010
# Troubleshoot replication in Azure VM disaster recovery
1111

12-
This article describes common problems in Azure Site Recovery when you're replicating and recovering Azure virtual machines from one region to another region. It also explains how to troubleshoot the common problems. For more information about supported configurations, see the [support matrix for replicating Azure VMs](site-recovery-support-matrix-azure-to-azure.md).
12+
This article describes common problems in Azure Site Recovery when you're replicating and recovering Azure virtual machines (VMs) from one region to another region. It also explains how to troubleshoot the common problems. For more information about supported configurations, see the [support matrix for replicating Azure VMs](site-recovery-support-matrix-azure-to-azure.md).
1313

1414
Azure Site Recovery consistently replicates data from the source region to the disaster recovery region. It also creates a crash-consistent recovery point every 5 minutes. If Site Recovery can't create recovery points for 60 minutes, it alerts you with this information:
1515

16-
Error message: "No crash consistent recovery point available for the VM in the last 60 minutes."</br>
16+
```plaintext
17+
Error message: "No crash consistent recovery point available for the VM in the last 60 minutes."
18+
1719
Error ID: 153007
20+
```
1821

1922
The following sections describe causes and solutions.
2023

21-
## <a name="high-data-change-rate-on-the-source-virtal-machine"></a>High data change rate on the source virtual machine
24+
## High data change rate on the source virtual machine
2225

2326
Azure Site Recovery creates an event if the data change rate on the source virtual machine is higher than the supported limits. To see whether the problem is because of high churn, go to **Replicated items** > **VM** > **Events - last 72 hours**.
24-
You should see the event "Data change rate beyond supported limits":
27+
You should see the event **Data change rate beyond supported limits**:
2528

26-
![Azure Site Recovery page that shows a high data change rate that is too high](./media/site-recovery-azure-to-azure-troubleshoot/data_change_event.png)
29+
:::image type="content" source="./media/site-recovery-azure-to-azure-troubleshoot/data_change_event.png" alt-text="Azure Site Recovery page that shows a data change rate that is too high.":::
2730

2831
If you select the event, you should see the exact disk information:
2932

30-
![Page that shows the data change rate event details](./media/site-recovery-azure-to-azure-troubleshoot/data_change_event2.png)
33+
:::image type="content" source="./media/site-recovery-azure-to-azure-troubleshoot/data_change_event2.png" alt-text="Page that shows the data change rate event details.":::
3134

3235
### Azure Site Recovery limits
3336

@@ -48,96 +51,104 @@ Premium P20 or P30 or P40 or P50 disk | 16 KB or greater |20 MB/s | 1684 GB per
4851

4952
Azure Site Recovery has limits on data change rates, depending on the type of disk. To see if this problem is recurring or temporary, find the data change rate of the affected virtual machine. Go to the source virtual machine, find the metrics under **Monitoring**, and add the metrics as shown in this screenshot:
5053

51-
![Page that shows the three-step process for finding the data change rate](./media/site-recovery-azure-to-azure-troubleshoot/churn.png)
54+
:::image type="content" source="./media/site-recovery-azure-to-azure-troubleshoot/churn.png" alt-text="Page that shows the three-step process for finding the data change rate.":::
5255

5356
1. Select **Add metric**, and add **OS Disk Write Bytes/Sec** and **Data Disk Write Bytes/Sec**.
5457
1. Monitor the spike as shown in the screenshot.
5558
1. View the total write operations happening across OS disks and all data disks combined. These metrics might not give you information at the per-disk level, but they indicate the total pattern of data churn.
5659

5760
A spike in data change rate might come from an occasional data burst. If the data change rate is greater than 10 MB/s (for Premium) or 2 MB/s (for Standard) and comes down, replication will catch up. If the churn is consistently well beyond the supported limit, consider one of these options:
5861

59-
- Exclude the disk that's causing a high data-change rate: First, disable the replication. Then you can exclude the disk by using [PowerShell](./azure-to-azure-exclude-disks.md).
60-
- Change the tier of the disaster recovery storage disk: This option is possible only if the disk data churn is less than 20 MB/s. Let's say a VM with a P10 disk has a data churn of greater than 8 MB/s but less than 10 MB/s. If the customer can use a P30 disk for target storage during protection, the problem can be solved. This solution is only possible for machines that are using Premium-Managed Disks. Follow these steps:
62+
- Exclude the disk that's causing a high data-change rate: First, disable the replication. Then you can exclude the disk by using [PowerShell](azure-to-azure-exclude-disks.md).
63+
- Change the tier of the disaster recovery storage disk: This option is possible only if the disk data churn is less than 20 MB/s. For example, a VM with a P10 disk has a data churn of greater than 8 MB/s but less than 10 MB/s. If the customer can use a P30 disk for target storage during protection, the problem can be solved. This solution is only possible for machines that are using Premium-Managed Disks. Follow these steps:
6164

62-
1. Go to **Disks** of the affected replicated machine and copy the replica disk name.
63-
1. Go to this replica of the managed disk.
64-
1. You might see a banner in **Overview** that says an SAS URL has been generated. Select this banner and cancel the export. Ignore this step if you don't see the banner.
65-
1. As soon as the SAS URL is revoked, go to **Configuration** for the managed disk. Increase the size so that Site Recovery supports the observed churn rate on the source disk.
65+
1. Go to **Disks** of the affected replicated machine and copy the replica disk name.
66+
1. Go to this replica of the managed disk.
67+
1. You might see a banner in **Overview** that says an SAS URL has been generated. Select this banner and cancel the export. Ignore this step if you don't see the banner.
68+
1. As soon as the SAS URL is revoked, go to **Configuration** for the managed disk. Increase the size so that Site Recovery supports the observed churn rate on the source disk.
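
The distinction above between an occasional burst and consistently high churn can be sketched in Python. This is a minimal illustration, not part of Site Recovery: the function name, the sample list, and the three result labels are hypothetical. Only the 10 MB/s (Premium) and 2 MB/s (Standard) figures come from this article.

```python
# Thresholds quoted in this article, in MB/s.
LIMITS_MB_PER_S = {"premium": 10.0, "standard": 2.0}


def classify_churn(samples_mb_per_s, disk_type):
    """Label a series of write-rate samples (MB/s) for one VM.

    The samples are assumed to be values you read from the
    "OS Disk Write Bytes/Sec" and "Data Disk Write Bytes/Sec" metrics
    and converted to MB/s yourself.
    """
    limit = LIMITS_MB_PER_S[disk_type]
    over = [rate > limit for rate in samples_mb_per_s]
    if not any(over):
        return "within limit"
    if all(over):
        return "consistently above limit"
    return "occasional burst"


if __name__ == "__main__":
    print(classify_churn([1.5, 12.0, 3.0], "premium"))
```

An "occasional burst" result corresponds to the case where replication is expected to catch up. A "consistently above limit" result corresponds to the case where the options listed above apply.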
6669

67-
## <a name="Network-connectivity-problem"></a>Network connectivity problems
70+
## Network connectivity problems
6871

6972
### Network latency to a cache storage account
7073

7174
Site Recovery sends replicated data to the cache storage account. You might experience network latency if uploading the data from a virtual machine to the cache storage account is slower than 4 MB in 3 seconds.
7275

73-
To check for a problem related to latency, use [AzCopy](https://docs.microsoft.com/azure/storage/common/storage-use-azcopy). You can use this command-line utility to upload data from the virtual machine to the cache storage account. If the latency is high, check whether you're using a network virtual appliance (NVA) to control outbound network traffic from VMs. The appliance might get throttled if all the replication traffic passes through the NVA.
76+
To check for a problem related to latency, use [AzCopy](/azure/storage/common/storage-use-azcopy). You can use this command-line utility to upload data from the virtual machine to the cache storage account. If the latency is high, check whether you're using a network virtual appliance (NVA) to control outbound network traffic from VMs. The appliance might get throttled if all the replication traffic passes through the NVA.
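
As a rough check against the 4 MB in 3 seconds figure mentioned earlier, the following Python sketch compares a timed upload with that rate. The function name and inputs are hypothetical. It assumes 1 MB = 1024 × 1024 bytes and that you measured the byte count and duration yourself, for example from a test upload with AzCopy.

```python
MB = 1024 * 1024          # assumption: binary megabytes
REFERENCE_BYTES = 4 * MB  # 4 MB ...
REFERENCE_SECONDS = 3     # ... in 3 seconds, as stated in this article


def is_upload_slow(bytes_uploaded, elapsed_seconds):
    """Return True if the measured upload is slower than 4 MB in 3 seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    # Cross-multiply instead of dividing, so a measurement at exactly
    # the reference rate compares equal and is not flagged as slow.
    return bytes_uploaded * REFERENCE_SECONDS < REFERENCE_BYTES * elapsed_seconds


if __name__ == "__main__":
    # 4 MB uploaded in 6 seconds is half the reference rate.
    print(is_upload_slow(4 * MB, 6.0))
```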
7477

75-
We recommend creating a network service endpoint in your virtual network for "Storage" so that the replication traffic doesn't go to the NVA. For more information, see [Network virtual appliance configuration](https://docs.microsoft.com/azure/site-recovery/azure-to-azure-about-networking#network-virtual-appliance-configuration).
78+
We recommend creating a network service endpoint in your virtual network for "Storage" so that the replication traffic doesn't go to the NVA. For more information, see [Network virtual appliance configuration](azure-to-azure-about-networking.md#network-virtual-appliance-configuration).
7679

7780
### Network connectivity
7881

79-
For Site Recovery replication to work, it needs the VM to provide outbound connectivity to specific URLs or IP ranges. You might have your VM behind a firewall or use network security group (NSG) rules to control outbound connectivity. If so, you might experience issues. To make sure all the URLs are connected, see [Outbound connectivity for Site Recovery URLs](https://docs.microsoft.com/azure/site-recovery/azure-to-azure-about-networking#outbound-connectivity-for-urls).
82+
For Site Recovery replication to work, it needs the VM to provide outbound connectivity to specific URLs or IP ranges. You might have your VM behind a firewall or use network security group (NSG) rules to control outbound connectivity. If so, you might experience issues. To make sure all the URLs are connected, see [Outbound connectivity for URLs](azure-to-azure-about-networking.md#outbound-connectivity-for-urls).
8083

8184
## Error ID 153006 - No app-consistent recovery point available for the VM in the past "X" minutes
8285

8386
Following are some of the most common issues.
8487

85-
#### Known issue in SQL server 2008/2008 R2
88+
### Known issue in SQL Server 2008/2008 R2
8689

87-
**How to fix**: There's a known issue with SQL server 2008/2008 R2. Refer to the article [Azure Site Recovery Agent or other non-component VSS backup fails for a server hosting SQL Server 2008 R2](https://support.microsoft.com/help/4504103/non-component-vss-backup-fails-for-server-hosting-sql-server-2008-r2).
90+
**How to fix:** There's a known issue with SQL Server 2008/2008 R2. Refer to the article [Azure Site Recovery Agent or other non-component VSS backup fails for a server hosting SQL Server 2008 R2](https://support.microsoft.com/help/4504103/non-component-vss-backup-fails-for-server-hosting-sql-server-2008-r2).
8891

89-
#### Azure Site Recovery jobs fail on servers hosting any version of SQL Server instances with AUTO_CLOSE DBs
92+
### Azure Site Recovery jobs fail on servers hosting any version of SQL Server instances with AUTO_CLOSE DBs
9093

91-
**How to fix**: Refer to the article [Non-component VSS backups such as Azure Site Recovery jobs fail on servers hosting SQL Server instances with AUTO_CLOSE DBs](https://support.microsoft.com/help/4504104/non-component-vss-backups-such-as-azure-site-recovery-jobs-fail-on-ser).
94+
**How to fix:** Refer to the article [Non-component VSS backups such as Azure Site Recovery jobs fail on servers hosting SQL Server instances with AUTO_CLOSE DBs](https://support.microsoft.com/help/4504104/non-component-vss-backups-such-as-azure-site-recovery-jobs-fail-on-ser).
9295

96+
### Known issue in SQL Server 2016 and 2017
9397

94-
#### Known issue in SQL Server 2016 and 2017
98+
**How to fix**: Refer to the article [Cumulative Update 16 for SQL Server 2017](https://support.microsoft.com/help/4508218/cumulative-update-16-for-sql-server-2017).
9599

96-
**How to fix**: Refer to the article [Error occurs when you back up a virtual machine with non-component based backup in SQL Server 2016 and 2017](https://support.microsoft.com/en-us/help/4508218/cumulative-update-16-for-sql-server-2017).
100+
### You're using Azure Storage Spaces Direct Configuration
97101

98-
#### You're using Azure Storage Spaces Direct Configuration
99-
100-
**How to fix**: Azure Site Recovery can't create application consistent recovery point for Storage Spaces Direct Configuration. [Configure the replication policy](https://docs.microsoft.com/azure/site-recovery/azure-to-azure-how-to-enable-replication-s2d-vms).
102+
**How to fix**: Azure Site Recovery can't create an application-consistent recovery point for a Storage Spaces Direct configuration. [Configure the replication policy](azure-to-azure-how-to-enable-replication-s2d-vms.md).
101103

102104
### More causes because of VSS-related issues
103105

104106
To troubleshoot further, check the files on the source machine to get the exact error code for failure:
105107

106-
C:\Program Files (x86)\Microsoft Azure Site Recovery\agent\Application Data\ApplicationPolicyLogs\vacp.log
108+
`C:\Program Files (x86)\Microsoft Azure Site Recovery\agent\Application Data\ApplicationPolicyLogs\vacp.log`
107109

108-
To locate the errors in the file, search for the string "vacpError" by opening the vacp.log file in an editor.
110+
To locate the errors, open the _vacp.log_ file in a text editor and search for the string **vacpError**.
109111

110-
Ex: vacpError:220#Following disks are in FilteringStopped state [\\.\PHYSICALDRIVE1=5, ]#220|^|224#FAILED: CheckWriterStatus().#2147754994|^|226#FAILED to revoke tags.FAILED: CheckWriterStatus().#2147754994|^|
112+
```plaintext
113+
Ex: vacpError:220#Following disks are in FilteringStopped state [\\.\PHYSICALDRIVE1=5, ]#220|^|224#FAILED: CheckWriterStatus().#2147754994|^|226#FAILED to revoke tags.FAILED: CheckWriterStatus().#2147754994|^|
114+
```
111115

112116
In the preceding example, **2147754994** is the error code for the failure. The following sections cover specific error codes.
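
To pull the error codes out of a `vacpError` line without reading it by eye, a small Python sketch like the following might help. It assumes, based only on the sample line above, that entries are separated by `|^|` and that each entry has the layout `<id>#<message>#<code>`. The function names are hypothetical, and the real log format may differ.

```python
SAMPLE = (
    r"Ex: vacpError:220#Following disks are in FilteringStopped state "
    r"[\\.\PHYSICALDRIVE1=5, ]#220|^|224#FAILED: CheckWriterStatus()."
    r"#2147754994|^|226#FAILED to revoke tags.FAILED: CheckWriterStatus()."
    r"#2147754994|^|"
)


def parse_vacp_error(line):
    """Return (entry_id, message, code) tuples from one vacpError line."""
    _, _, payload = line.partition("vacpError:")
    entries = []
    for chunk in payload.split("|^|"):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Assumed layout: <id>#<message>#<code>; the message may contain '#'.
        entry_id, _, rest = chunk.partition("#")
        message, _, code = rest.rpartition("#")
        if entry_id.isdigit() and code.isdigit():
            entries.append((int(entry_id), message, int(code)))
    return entries


def to_hex(code):
    """Format a decimal error code as 8-digit hexadecimal."""
    return f"0x{code:08X}"


if __name__ == "__main__":
    for entry_id, message, code in parse_vacp_error(SAMPLE):
        print(entry_id, to_hex(code), message)
```

The hexadecimal form can be easier to match against error text. For example, the decimal code 2147943458 used later in this article formats as 0x80070422, the value shown in the corresponding error message.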
113117

114118
#### VSS writer is not installed - Error 2147221164
115119

116-
**How to fix**: To generate application consistency tag, Azure Site Recovery uses Volume Shadow Copy Service (VSS). Site Recovery installs a VSS Provider for its operation to take app consistency snapshots. Azure Site Recovery installs this VSS Provider as a service. If VSS Provider isn't installed, the application consistency snapshot creation fails. It shows the error ID 0x80040154 "Class not registered." Refer to the article for [VSS writer installation troubleshooting](https://docs.microsoft.com/azure/site-recovery/vmware-azure-troubleshoot-push-install#vss-installation-failures).
120+
**How to fix**: To generate the application consistency tag, Azure Site Recovery uses Volume Shadow Copy Service (VSS). Site Recovery installs a VSS Provider for its operation to take app consistency snapshots. Azure Site Recovery installs this VSS Provider as a service. If the VSS Provider isn't installed, the application consistency snapshot creation fails. It shows the **error ID 0x80040154 Class not registered**. Refer to the article for [VSS writer installation troubleshooting](vmware-azure-troubleshoot-push-install.md#vss-installation-failures).
117121

118122
#### VSS writer is disabled - Error 2147943458
119123

120-
**How to fix**: To generate the application consistency tag, Azure Site Recovery uses VSS. Site Recovery installs a VSS Provider for its operation to take app consistency snapshots. This VSS Provider is installed as a service. If you don't have the VSS Provider service enabled, the application consistency snapshot creation fails. It shows the error "The specified service is disabled and cannot be started (0x80070422)."
124+
**How to fix**: To generate the application consistency tag, Azure Site Recovery uses VSS. Site Recovery installs a VSS Provider for its operation to take app consistency snapshots. This VSS Provider is installed as a service. If you don't have the VSS Provider service enabled, the application consistency snapshot creation fails. It shows the error: **The specified service is disabled and cannot be started (0x80070422)**.
121125

122126
If VSS is disabled:
123127

124128
- Verify that the startup type of the VSS Provider service is set to **Automatic**.
125129
- Restart the following services:
126-
- VSS service
127-
- Azure Site Recovery VSS Provider
128-
- VDS service
130+
- VSS service.
131+
- Azure Site Recovery VSS Provider.
132+
- VDS service.
129133

130134
#### VSS PROVIDER NOT_REGISTERED - Error 2147754756
131135

132136
**How to fix**: To generate the application consistency tag, Azure Site Recovery uses VSS. Check whether the Azure Site Recovery VSS Provider service is installed.
133137

134138
Use the following commands to reinstall VSS Provider:
135-
1. Uninstall existing provider: C:\Program Files (x86)\Microsoft Azure Site Recovery\agent\InMageVSSProvider_Uninstall.cmd
136-
1. Reinstall VSS Provider: C:\Program Files (x86)\Microsoft Azure Site Recovery\agent\InMageVSSProvider_Install.cmd
139+
140+
1. Uninstall existing provider:
141+
142+
`"C:\Program Files (x86)\Microsoft Azure Site Recovery\agent\InMageVSSProvider_Uninstall.cmd"`
143+
144+
1. Reinstall VSS Provider:
145+
146+
`"C:\Program Files (x86)\Microsoft Azure Site Recovery\agent\InMageVSSProvider_Install.cmd"`
137147

138148
Verify that the startup type of the VSS Provider service is set to **Automatic**.
139149

140150
Restart the following services:
141-
- VSS service
142-
- Azure Site Recovery VSS Provider
143-
- VDS service
151+
152+
- VSS service.
153+
- Azure Site Recovery VSS Provider.
154+
- VDS service.
