Commit c6f1139

Staging DR drills
1 parent 329fb3c commit c6f1139

1 file changed: +46 -5 lines changed

azure-sql/managed-instance/disaster-recovery-drills.md

Lines changed: 46 additions & 5 deletions
@@ -4,7 +4,7 @@ description: Learn guidance and best practices for using Azure SQL Managed Insta
 author: Stralle
 ms.author: strrodic
 ms.reviewer: wiassaf, mathoma
-ms.date: 06/25/2024
+ms.date: 05/05/2025
 ms.service: azure-sql-managed-instance
 ms.subservice: high-availability
 ms.topic: conceptual
@@ -16,13 +16,14 @@ ms.topic: conceptual
 > * [Azure SQL Database](../database/disaster-recovery-drills.md?view=azuresql-db&preserve-view=true)
 > * [Azure SQL Managed Instance](disaster-recovery-drills.md?view=azuresql-mi&preserve-view=true)
 
-It's recommended to periodically test and validate that applications are ready for a recovery workflow. Verifying the application behavior and implications of data loss and/or the disruption that failover involves is good engineering practice. It is also a requirement by most industry standards as part of business continuity certification.
+You should periodically test and validate that applications are ready for a recovery workflow. Verifying the application behavior and implications of data loss and/or the disruption that failover involves is good engineering practice. It's also a requirement by most industry standards as part of business continuity certification.
 
 Performing a disaster recovery drill consists of:
 
 * Simulating data tier outage
 * Recovering
-* Validate application integrity post recovery
+* Validate application integrity post recovery
+* Fail back to the original instance (optional)
 
 Depending on how you [designed your application for business continuity](business-continuity-high-availability-disaster-recover-hadr-overview.md), the workflow to execute the drill can vary. This article describes the best practices for conducting a disaster recovery drill in the context of Azure SQL Managed Instance.
 
@@ -41,11 +42,15 @@ To simulate the outage, you can rename the source database. This name change cau
 
 ### Validation
 
-Complete the drill by verifying the application integrity post recovery (including connection strings, logins, basic functionality testing, or other validations part of standard application signoffs procedures).
+Complete the drill by verifying the application integrity post recovery (including connection strings, logins, basic functionality testing, or other validations part of standard application sign off procedures).
+
+### Fail back
+
+With geo-restore, fail back consists of repointing the application to the original instance.
 
 ## Failover groups
 
-For an instance protected by failover groups, the drill exercise involves planned failover to the secondary instance. The planned failover ensures that the primary and the secondary instances in the failover group remain in sync when the roles are switched. Unlike the unplanned failover, this operation does not result in data loss, so the drill can be performed in a production environment.
+For an instance protected by failover groups, the drill exercise involves planned failover to the secondary instance. The planned failover ensures that the primary and the secondary instances in the failover group remain in sync when the roles are switched. Unlike the unplanned failover, this operation doesn't result in data loss, so the drill can be performed in a production environment.
 
 Configure your failover group with the [failover policy](failover-group-sql-mi.md#failover-policy) that suits your business need, and test failover regardless of how your failover policy is configured. For more information, review [test failover](failover-group-configure-sql-mi.md#test-failover). A customer-managed failover policy is recommended to give you control over the failover process.
 
@@ -67,6 +72,42 @@ To simulate the outage, you can disable the web application or virtual machine c
 
 Complete the drill by verifying the application integrity post recovery (including connectivity, basic functionality testing, or other validations required for the drill signoffs).
 
+### Fail back
+
+To fail back, perform a planned failover of the failover group back to the original primary instance. Since the application is already configured to point to the failover group endpoint, no further changes are needed. The failover group endpoint automatically routes traffic to the new primary instance after the failover.
+
+Fail back is optional. If you don't need to fail back, you can keep the secondary instance as the new primary instance.
+
+## Managed Instance link
+
+It's possible to use a [Managed Instance link](managed-instance-link-disaster-recovery.md) for disaster recovery. Two-way failover is only supported with SQL Server 2022, and instances configured with the [SQL Server 2022 update policy](update-policy.md#sql-server-2022-update-policy). SQL Server 2019 and earlier versions support one-way failover only and fail back to SQL Server isn't supported.
+
+This section describes how to perform a disaster recovery drill with SQL Server 2022. When using a [Managed Instance link](managed-instance-link-disaster-recovery.md) for disaster recovery, the drill exercise involves planned failover to the secondary instance. The planned failover ensures that the primary and the secondary instances in the Managed Instance link remain in sync when the roles are switched. Unlike the unplanned failover, this operation doesn't result in data loss, so the drill can be performed in a production environment.
+
+### Outage simulation
+
+To simulate the outage, disable client connections to the primary replica of the database replicated via link. This outage simulation results in the connectivity failures for the database clients (applications).
+
+### Recovery
+
+For recovery, do the following:
+
+1. Initiate a [planned link failover](managed-instance-link-failover-how-to.md) to the secondary instance.
+1. Repoint the impacted applications to the new primary instance.
+
+### Validation
+
+For validation, do the following:
+
+1. Perform application connectivity and read/write tests on the new primary instance.
+1. Optionally, validate that the test data written during the drill is replicated to the secondary instance.
+
+### Fail back
+
+To fail back, perform a planned failover of the Managed Instance link back to the original primary instance. After failover, the application must be repointed to the original primary instance.
+
+Fail back is optional. If you don't need to fail back, you can keep the secondary instance as the new primary instance.
+
 ## Related content
 
 To learn more, review:
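
The Managed Instance link section in this diff describes a drill as an ordered set of phases (outage simulation, recovery, validation, and an optional fail back). A minimal Python sketch of that ordering might look like the following. It is only an illustration: `DrillStep`, `run_drill`, and `stub_success` are hypothetical names, the actions are stubs, and nothing here calls an Azure API or performs a real failover.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class DrillStep:
    name: str                    # phase name, taken from the article's headings
    action: Callable[[], bool]   # hypothetical callable; returns True on success
    optional: bool = False       # the article marks fail back as optional


def run_drill(steps: List[DrillStep],
              include_optional: bool = False) -> List[Tuple[str, str]]:
    """Run drill phases in order and stop at the first failure."""
    results = []
    for step in steps:
        if step.optional and not include_optional:
            results.append((step.name, "skipped"))
            continue
        ok = step.action()
        results.append((step.name, "ok" if ok else "failed"))
        if not ok:
            break  # later phases depend on the earlier ones succeeding
    return results


# Stub action so the sketch runs on its own. In a real drill, each step
# would wrap your own tooling (scripts, test suites, manual sign-off).
def stub_success() -> bool:
    return True


link_drill = [
    DrillStep("Outage simulation", stub_success),
    DrillStep("Recovery: planned link failover", stub_success),
    DrillStep("Recovery: repoint applications", stub_success),
    DrillStep("Validation: connectivity and read/write tests", stub_success),
    DrillStep("Fail back", stub_success, optional=True),
]

for name, status in run_drill(link_drill):
    print(f"{name}: {status}")
```

Stopping at the first failed phase mirrors the dependency between phases: there is little value in validating an application before recovery has completed. Passing `include_optional=True` also runs the fail back step.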
