Commit 4feb5a4

Merge pull request #233984 from MikeRayMSFT/dnethi-patch-44
Update managed-instance-disaster-recovery
2 parents 9855acd + ffbf05b commit 4feb5a4

File tree

3 files changed

+143
-26
lines changed

articles/azure-arc/data/managed-instance-disaster-recovery.md

Lines changed: 143 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@ ms.custom: event-tier1-build-2022
88
author: dnethi
99
ms.author: dinethi
1010
ms.reviewer: mikeray
11-
ms.date: 06/13/2022
11+
ms.date: 04/04/2023
1212
ms.topic: conceptual
1313
---
1414

@@ -22,20 +22,69 @@ Azure failover groups use the same distributed availability groups technology th
2222

2323
> [!NOTE]
2424
> - The Azure Arc-enabled SQL Managed Instance in both geo-primary and geo-secondary sites need to be identical in terms of their compute & capacity, as well as service tiers they are deployed in.
25-
> - Distributed availability groups can be setup for either General Purpose or Business Critical service tiers.
25+
> - Distributed availability groups can be set up for either General Purpose or Business Critical service tiers.
2626
27-
To configure an Azure failover group:
27+
## Prerequisites
28+
29+
The following prerequisites must be met before setting up failover groups between two Azure Arc-enabled SQL managed instances:
30+
31+
- An Azure Arc data controller and an Azure Arc-enabled SQL managed instance provisioned at the primary site with `--license-type` as one of `BasePrice` or `LicenseIncluded`.
32+
- An Azure Arc data controller and an Azure Arc-enabled SQL managed instance provisioned at the secondary site with identical configuration as the primary in terms of:
33+
- CPU
34+
- Memory
35+
- Storage
36+
- Service tier
37+
- Collation
38+
- Other instance settings
39+
- The instance at the secondary site requires `--license-type` as `DisasterRecovery`
40+
41+
> [!NOTE]
42+
> - It is important to specify the `--license-type` **during** the Azure Arc-enabled SQL MI creation. This will allow the DR instance to be seeded from the primary instance in the primary data center. Updating this property post deployment will not have the same effect.
43+
44+
## Deployment process
45+
46+
To set up an Azure failover group between two Azure Arc-enabled SQL managed instances, complete the following steps:
2847

2948
1. Create custom resource for distributed availability group at the primary site
3049
1. Create custom resource for distributed availability group at the secondary site
31-
1. Copy the binary data from the mirroring certificates
50+
1. Copy the binary data from the mirroring certificates
3251
1. Set up the distributed availability group between the primary and secondary sites
52+
either in `sync` mode or `async` mode
3353

3454
The following image shows a properly configured distributed availability group:
3555

36-
![A properly configured distributed availability group](.\media\business-continuity\dag.png)
56+
![Diagram showing a properly configured distributed availability group](.\media\business-continuity\distributed-availability-group.png)
57+
58+
## Synchronization modes
59+
60+
Failover groups in Azure Arc data services support two synchronization modes - `sync` and `async`. The synchronization mode directly impacts how the data is synchronized between the Azure Arc-enabled SQL managed instances, and potentially the performance on the primary managed instance.
61+
62+
If the primary and secondary sites are within a few miles of each other, use `sync` mode. Otherwise, use `async` mode to avoid any performance impact on the primary site.
3763

38-
### Configure Azure failover group
64+
## Configure Azure failover group - direct mode
65+
66+
Follow the steps below if the Azure Arc data services are deployed in `directly` connected mode.
67+
68+
Once the prerequisites are met, run the following command to set up an Azure failover group between the two Azure Arc-enabled SQL managed instances:
69+
70+
```azurecli
71+
az sql instance-failover-group-arc create --name <name of failover group> --mi <primary SQL MI> --partner-mi <Partner MI> --resource-group <name of RG> --partner-resource-group <name of partner MI RG>
72+
```
73+
74+
Example:
75+
76+
```azurecli
77+
az sql instance-failover-group-arc create --name sql-fog --mi sql1 --partner-mi sql2 --resource-group rg-name --partner-resource-group rg-name
78+
```
79+
80+
The above command:
81+
82+
1. Creates the required custom resources on both primary and secondary sites
83+
1. Copies the mirroring certificates and configures the failover group between the instances
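
The direct-mode command can also carry the synchronization mode described earlier. The following is a minimal dry-run sketch, not part of the original article: the helper names `validate_sync_mode` and `print_create_fog` are hypothetical, and it assumes `create` accepts `--partner-sync-mode` in the same way the `update` commands for forced failover do. It only prints the command so it can be reviewed before being run.

```shell
# Dry-run sketch: prints an az command instead of executing it.
# Assumption: `create` accepts --partner-sync-mode, as `update` does.

# validate_sync_mode MODE
# Succeeds only for the two supported modes, "sync" and "async".
validate_sync_mode() {
  case "$1" in
    sync|async) return 0 ;;
    *) echo "unsupported sync mode: $1" >&2; return 1 ;;
  esac
}

# print_create_fog NAME PRIMARY_MI PARTNER_MI RG PARTNER_RG MODE
# Prints the direct-mode create command with the chosen sync mode.
print_create_fog() {
  validate_sync_mode "$6" || return 1
  echo "az sql instance-failover-group-arc create --name $1 --mi $2 --partner-mi $3 --resource-group $4 --partner-resource-group $5 --partner-sync-mode $6"
}
```

For example, `print_create_fog sql-fog sql1 sql2 rg-name rg-name async` prints the example command above with `--partner-sync-mode async` appended.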
84+
85+
## Configure Azure failover group - indirect mode
86+
87+
Follow the steps below if Azure Arc data services are deployed in `indirectly` connected mode.
3988

4089
1. Provision the managed instance in the primary site.
4190

@@ -45,22 +94,20 @@ The following image shows a properly configured distributed availability group:
4594

4695
2. Switch context to the secondary cluster by running ```kubectl config use-context <secondarycluster>``` and provision the managed instance in the secondary site that will be the disaster recovery instance. At this point, the system databases are not part of the contained availability group.
4796

48-
> [!NOTE]
49-
> - It is important to specify `--license-type DisasterRecovery` **during** the Azure Arc SQL MI creation. This will allow the DR instance to be seeded from the primary instance in the primary data center. Updating this property post deployment will not have the same effect.
50-
51-
97+
> [!NOTE]
98+
> It is important to specify `--license-type DisasterRecovery` **during** the Azure Arc-enabled SQL MI creation. This will allow the DR instance to be seeded from the primary instance in the primary data center. Updating this property post deployment will not have the same effect.
5299
53100
```azurecli
54101
az sql mi-arc create --name <secondaryinstance> --tier bc --replicas 3 --license-type DisasterRecovery --k8s-namespace <namespace> --use-k8s
55102
```
56103

57-
3. Mirroring certificates - The binary data inside the Mirroring Certificate property of the Arc SQL MI is needed for the Instance Failover Group CR (Custom Resource) creation.
104+
3. Mirroring certificates - The binary data inside the Mirroring Certificate property of the Azure Arc-enabled SQL MI is needed for the Instance Failover Group CR (Custom Resource) creation.
58105

59106
This can be achieved in a few ways:
60107

61-
(a) If using ```az``` CLI, generate the mirroring certificate file first, and then point to that file while configuring the Instance Failover Group so the binary data is read from the file and copied over into the CR. The cert files are not needed post FOG creation.
108+
(a) If using `az` CLI, generate the mirroring certificate file first, and then point to that file while configuring the Instance Failover Group so the binary data is read from the file and copied over into the CR. The cert files are not needed after failover group creation.
62109

63-
(b) If using ```kubectl```, directly copy and paste the binary data from the Arc SQL MI CR into the yaml file that will be used to create the Instance Failover Group.
110+
(b) If using `kubectl`, directly copy and paste the binary data from the Azure Arc-enabled SQL MI CR into the yaml file that will be used to create the Instance Failover Group.
64111

65112

66113
Using (a) above:
@@ -96,31 +143,77 @@ The following image shows a properly configured distributed availability group:
96143
> Ensure the SQL instances have different names for both primary and secondary sites, and the `shared-name` value should be identical on both sites.
97144
98145
```azurecli
99-
az sql instance-failover-group-arc create --shared-name <name of failover group> --name <name for primary DAG resource> --mi <local SQL managed instance name> --role primary --partner-mi <partner SQL managed instance name> --partner-mirroring-url tcp://<secondary IP> --partner-mirroring-cert-file <secondary.pem> --k8s-namespace <namespace> --use-k8s
146+
az sql instance-failover-group-arc create --shared-name <name of failover group> --name <name for primary failover group resource> --mi <local SQL managed instance name> --role primary --partner-mi <partner SQL managed instance name> --partner-mirroring-url tcp://<secondary IP> --partner-mirroring-cert-file <secondary.pem> --k8s-namespace <namespace> --use-k8s
100147
```
101148
102149
Example:
103150
```azurecli
104151
az sql instance-failover-group-arc create --shared-name myfog --name primarycr --mi sqlinstance1 --role primary --partner-mi sqlinstance2 --partner-mirroring-url tcp://10.20.5.20:970 --partner-mirroring-cert-file $HOME/sqlcerts/sqlinstance2.pem --k8s-namespace my-namespace --use-k8s
105152
```
106153
107-
On the secondary instance, run the following command to setup the FOG CR. The ```--partner-mirroring-cert-file``` in this case should point to a path that has the mirroring certificate file generated from the primary instance as described in 3(a) above.
154+
On the secondary instance, run the following command to set up the failover group custom resource. The `--partner-mirroring-cert-file` in this case should point to a path that has the mirroring certificate file generated from the primary instance as described in 3(a) above.
108155
109156
```azurecli
110-
az sql instance-failover-group-arc create --shared-name <name of failover group> --name <name for secondary DAG resource> --mi <local SQL managed instance name> --role secondary --partner-mi <partner SQL managed instance name> --partner-mirroring-url tcp://<primary IP> --partner-mirroring-cert-file <primary.pem> --k8s-namespace <namespace> --use-k8s
157+
az sql instance-failover-group-arc create --shared-name <name of failover group> --name <name for secondary failover group resource> --mi <local SQL managed instance name> --role secondary --partner-mi <partner SQL managed instance name> --partner-mirroring-url tcp://<primary IP> --partner-mirroring-cert-file <primary.pem> --k8s-namespace <namespace> --use-k8s
111158
```
112159
113160
Example:
114161
```azurecli
115162
az sql instance-failover-group-arc create --shared-name myfog --name secondarycr --mi sqlinstance2 --role secondary --partner-mi sqlinstance1 --partner-mirroring-url tcp://10.10.5.20:970 --partner-mirroring-cert-file $HOME/sqlcerts/sqlinstance1.pem --k8s-namespace my-namespace --use-k8s
116163
```
117164
118-
## Manual failover from primary to secondary instance
165+
## Retrieve Azure failover group health state
166+
167+
Information about the failover group such as primary role, secondary role, and the current health status can be viewed on the custom resource on either primary or secondary site.
168+
169+
Run the following command on the primary and/or the secondary site to list the failover group custom resources:
170+
171+
```azurecli
172+
kubectl get fog -n <namespace>
173+
```
174+
175+
Describe the custom resource to retrieve the failover group status, as follows:
176+
177+
```azurecli
178+
kubectl describe fog <failover group cr name> -n <namespace>
179+
```
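
Because the status can be read from either site, it may be convenient to query both clusters in one pass. The sketch below is illustrative only: `show_fog_status` is a hypothetical helper, and the kubeconfig context names and namespace are placeholders supplied by the caller. It relies on kubectl's global `--context` flag so the current context is left unchanged.

```shell
# Sketch: list failover group custom resources on each site.
# Context names and the namespace are caller-supplied placeholders.

# show_fog_status NAMESPACE CONTEXT...
show_fog_status() {
  ns="$1"
  shift
  for ctx in "$@"; do
    echo "== context: $ctx =="
    # --context targets a cluster without switching the current context
    kubectl --context "$ctx" get fog -n "$ns" || return 1
  done
}
```

A call such as `show_fog_status my-namespace <primarycluster> <secondarycluster>` lists the resources on both sites; swap `get` for `describe` in the function body to see the full status.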
180+
181+
## Failover group operations
182+
183+
Once the failover group is set up between the managed instances, different failover operations can be performed depending on the circumstances.
184+
185+
Possible failover scenarios are:
186+
187+
- The Azure Arc-enabled SQL managed instances at both sites are in healthy state and a failover needs to be performed:
188+
+ perform a manual failover from primary to secondary without data loss by setting `role=secondary` on the primary SQL MI.
189+
190+
- Primary site is unhealthy/unreachable and a failover needs to be performed:
191+
192+
+ the primary Azure Arc-enabled SQL managed instance is down/unhealthy/unreachable
193+
+ the secondary Azure Arc-enabled SQL managed instance needs to be force-promoted to primary with potential data loss
194+
+ when the original primary Azure Arc-enabled SQL managed instance comes back online, it will report as `Primary` role and unhealthy state and needs to be forced into a `secondary` role so it can join the failover group and data can be synchronized.
195+
196+
197+
## Manual failover (without data loss)
119198

120-
Use `az sql instance-failover-group-arc ...` to initiate a failover from primary to secondary. The following command initiates a failover from the primary instance to the secondary instance. Any pending transactions on the geo-primary instance are replicated over to the geo-secondary instance before the failover.
199+
Use the `az sql instance-failover-group-arc update ...` command to initiate a failover from primary to secondary. Any pending transactions on the geo-primary instance are replicated over to the geo-secondary instance before the failover.
200+
201+
### Directly connected mode
202+
Run the following command to initiate a manual failover, in `direct` connected mode using ARM APIs:
203+
204+
```azurecli
205+
az sql instance-failover-group-arc update --name <shared name of failover group> --mi <primary Azure Arc-enabled SQL MI> --role secondary --resource-group <resource group>
206+
```
207+
Example:
208+
209+
```azurecli
210+
az sql instance-failover-group-arc update --name myfog --mi sqlmi1 --role secondary --resource-group myresourcegroup
211+
```
212+
### Indirectly connected mode
213+
Run the following command to initiate a manual failover, in `indirect` connected mode using Kubernetes APIs:
121214

122215
```azurecli
123-
az sql instance-failover-group-arc update --name <name of DAG resource> --role secondary --k8s-namespace <namespace> --use-k8s
216+
az sql instance-failover-group-arc update --name <name of failover group resource> --role secondary --k8s-namespace <namespace> --use-k8s
124217
```
125218

126219
Example:
@@ -129,21 +222,45 @@ Example:
129222
az sql instance-failover-group-arc update --name myfog --role secondary --k8s-namespace my-namespace --use-k8s
130223
```
131224

132-
## Forced failover
225+
## Forced failover with data loss
133226

134227
In the circumstance when the geo-primary instance becomes unavailable, the following commands can be run on the geo-secondary DR instance to promote to primary with a forced failover incurring potential data loss.
135228

136-
Run the below command on geo-primary, if available:
229+
On the geo-secondary DR instance, run the following command to promote it to primary role, with data loss.
230+
231+
> [!NOTE]
232+
> If the `--partner-sync-mode` was configured as `sync`, it needs to be reset to `async` when the secondary is promoted to primary.
233+
234+
### Directly connected mode
235+
```azurecli
236+
az sql instance-failover-group-arc update --name <shared name of failover group> --mi <secondary Azure Arc-enabled SQL MI> --role force-primary-allow-data-loss --resource-group <resource group> --partner-sync-mode async
237+
```
238+
Example:
137239

138240
```azurecli
139-
az sql instance-failover-group-arc update --k8s-namespace my-namespace --name primarycr --use-k8s --role force-secondary
241+
az sql instance-failover-group-arc update --name myfog --mi sqlmi2 --role force-primary-allow-data-loss --resource-group myresourcegroup --partner-sync-mode async
140242
```
141243

142-
On the geo-secondary DR instance, run the following command to promote it to primary role, with data loss.
244+
### Indirectly connected mode
245+
```azurecli
246+
az sql instance-failover-group-arc update --k8s-namespace my-namespace --name secondarycr --use-k8s --role force-primary-allow-data-loss --partner-sync-mode async
247+
```
248+
249+
When the geo-primary Azure Arc-enabled SQL MI instance becomes available, run the below command to bring it into the failover group and synchronize the data:
250+
251+
### Directly connected mode
252+
```azurecli
253+
az sql instance-failover-group-arc update --name <shared name of failover group> --mi <old primary Azure Arc-enabled SQL MI> --role force-secondary --resource-group <resource group>
254+
```
143255

256+
### Indirectly connected mode
144257
```azurecli
145-
az sql instance-failover-group-arc update --k8s-namespace my-namespace --name secondarycr --use-k8s --role force-primary-allow-data-loss
258+
az sql instance-failover-group-arc update --k8s-namespace my-namespace --name primarycr --use-k8s --role force-secondary
146259
```
147-
## Limitation
260+
Optionally, the `--partner-sync-mode` can be configured back to `sync` mode if desired.
261+
262+
At this point, if you plan to continue running the production workload off of the secondary site, the `--license-type` needs to be updated to either `BasePrice` or `LicenseIncluded` to initiate billing for the vCores consumed.
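
The indirect-mode forced failover described in this section is a two-step sequence run against two different sites. As a rough dry-run sketch (the helper `print_forced_failover` is hypothetical and only prints the commands), the order can be laid out as follows. The second and third arguments are the failover group custom resource names on the geo-secondary site and on the original primary site, respectively.

```shell
# Dry-run sketch: prints the two update commands for a forced failover
# in indirectly connected mode; nothing is executed.

# print_forced_failover NAMESPACE SECONDARY_CR OLD_PRIMARY_CR
print_forced_failover() {
  base="az sql instance-failover-group-arc update --k8s-namespace $1 --use-k8s"
  # Step 1: run against the geo-secondary site to promote it to primary
  echo "$base --name $2 --role force-primary-allow-data-loss --partner-sync-mode async"
  # Step 2: run against the original primary once it is reachable again
  echo "$base --name $3 --role force-secondary"
}
```

With the resource names from the create examples, `print_forced_failover my-namespace secondarycr primarycr` prints both commands in the order they would be run, each against the cluster context of its own site.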
263+
264+
## Next steps
148265

149-
When you use [SQL Server Management Studio Object Explorer to create a database](/sql/relational-databases/databases/create-a-database#SSMSProcedure), the application returns an error. You can [create new databases with T-SQL](/sql/relational-databases/databases/create-a-database#TsqlProcedure).
266+
[Overview: Azure Arc-enabled SQL Managed Instance business continuity](managed-instance-business-continuity-overview.md)
Binary file not shown.
81.8 KB