-title: Cloud business continuity - disaster recovery
+title: Cloud Business Continuity - Disaster Recovery
 titleSuffix: Azure SQL Database
 description: Learn how Azure SQL Database supports cloud business continuity and disaster recovery to help keep mission-critical cloud applications running.
This article provides an overview of the business continuity and disaster recovery capabilities of Azure SQL Database, describing the options and recommendations to recover from disruptive events that could lead to data loss or cause your database and application to become unavailable. Learn what to do when a user or application error affects data integrity, an Azure availability zone or region has an outage, or your application requires maintenance.
**Business continuity** in Azure SQL Database refers to the mechanisms, policies, and procedures that enable your business to continue operating in the face of disruption by providing availability, high availability, and disaster recovery.
@@ -38,19 +41,26 @@ In most cases, SQL Database handles disruptive events that might happen in a clo
 - User accidentally deletes or updates a row in a table.
 - Malicious attacker successfully deletes data or drops a database.
 - Catastrophic natural disaster event takes down a datacenter or availability zone or region.
-- Rare datacenter, availability zone or region-wide outage caused by a configuration change, software bug or hardware component failure.
+- Rare datacenter, availability zone, or region-wide outage caused by a configuration change, software bug, or hardware failure.
 Azure SQL Database comes with a core resiliency and reliability promise that protects it against software or hardware failures. Database backups are automated to protect your data from corruption or accidental deletion. As a Platform-as-a-service (PaaS), the Azure SQL Database service provides availability as an off-the-shelf feature with an industry-leading availability SLA of 99.99%.

-### High Availability
+To achieve high availability in the Azure cloud environment, enable [zone redundancy](high-availability-sla-local-zone-redundancy.md#zone-redundant-availability). With zone redundancy, the database or elastic pool uses [Azure availability zones](/azure/reliability/availability-zones-overview) to ensure resilience to zonal failures.
+
+- Many Azure regions provide availability zones, which are separated groups of data centers within a region that have independent power, cooling, and networking infrastructure.
+- Availability zones are designed to provide regional services, capacity, and high availability in the remaining zones if one zone experiences an outage.

-To achieve high availability in the Azure cloud environment, enable [zone redundancy](high-availability-sla-local-zone-redundancy.md#zone-redundant-availability) so the database, or elastic pool, uses [availability zones](/azure/reliability/availability-zones-overview) to ensure the database, or elastic pool, is resilient to zonal failures. Many Azure regions provide availability zones, which are separated groups of data centers within a region that have independent power, cooling, and networking infrastructure. Availability zones are designed to provide regional services, capacity, and high availability in the remaining zones if one zone experiences an outage. By enabling zone redundancy, the database or elastic pool is resilient to zonal hardware and software failures and the recovery is transparent to applications. When high availability is enabled, the Azure SQL Database service is able to provide a higher availability SLA of 99.995%.
+By enabling zone redundancy, the database or elastic pool is resilient to zonal hardware and software failures and the recovery is transparent to applications. When high availability is enabled, the Azure SQL Database service is able to provide a higher availability SLA of 99.995%.
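
To put the 99.99% and 99.995% figures in perspective, the following sketch converts an availability percentage into the downtime it would permit over a period. The function name `allowed_downtime_minutes` and the 365-day period are assumptions made for illustration only; the actual SLA terms define how availability is measured and over which window.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: float = 365.0) -> float:
    """Return the downtime, in minutes, permitted by an availability percentage.

    Illustrative arithmetic only: period_days defaults to a 365-day year,
    which is an assumption and not part of any SLA definition.
    """
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    # The unavailable fraction is whatever remains after the guaranteed share.
    return total_minutes * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    for sla in (99.99, 99.995):
        minutes = allowed_downtime_minutes(sla)
        print(f"{sla}% availability allows about {minutes:.2f} minutes of downtime per 365 days")
```

Under these assumptions, 99.99% corresponds to roughly 52.56 minutes per year and 99.995% to roughly 26.28 minutes, so moving to the higher figure halves the permitted downtime.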

-### Disaster recovery
+## Disaster recovery

 To achieve higher availability and redundancy across regions, you can enable disaster recovery capabilities to quickly recover the database from a catastrophic regional failure. Options for disaster recovery with Azure SQL Database are:

@@ -69,58 +79,61 @@ The following table compares active geo-replication and failover groups, two dis
 |**Multiple replicas**| Yes | No |
 |**Can be in same region as primary**| Yes | No |

-## Features that provide business continuity
-
-From a database perspective, there are four major potential disruption scenarios. The following table lists SQL Database business continuity features you can use to mitigate a potential business disruption scenario:
-
-| Business disruption scenario | Business continuity feature |
-|:--|:--|
-| Local hardware or software failures affecting the database node. | To mitigate local hardware and software failures, SQL Database includes an [availability architecture](high-availability-sla-local-zone-redundancy.md), which guarantees automatic recovery from these failures with up to 99.99% availability SLA. |
-| Data corruption or deletion typically caused by an application bug or human error. Such failures are application-specific and typically can't be detected by the database service. | To protect your business from data loss, SQL Database automatically creates full database backups weekly, differential database backups every 12 or 24 hours, and transaction log backups every 5 - 10 minutes. By default, backups are stored in [geo-redundant storage](automated-backups-overview.md#backup-storage-redundancy) for seven days for all service tiers. All service tiers except Basic support a configurable backup retention period for [point-in-time restore (PITR)](recovery-using-backups.md#point-in-time-restore) of up to 35 days. You can [restore a deleted database](recovery-using-backups.md#restore-deleted-database) to the point at which it was deleted if the server hasn't been deleted, or if you've configured [long-term retention (LTR)](long-term-retention-overview.md). |
-| Rare datacenter or availability zone outage, possibly caused by a natural disaster event, configuration change, software bug or hardware component failure. | To mitigate datacenter or availability zone level outage, enable [zone redundancy](high-availability-sla-local-zone-redundancy.md#zone-redundant-availability) for the database or elastic pool to use [Azure Availability Zones](/azure/reliability/availability-zones-overview) and provide redundancy across multiple physical zones within an Azure region. Enabling zone redundancy ensures the database or elastic pool is resilient to zonal failures with up to 99.995% high availability SLA. |
-| Rare _regional outage_ impacting all availability zones and the datacenters comprising it, possibly caused by catastrophic natural disaster event. | To mitigate a region-wide outage, enable disaster recovery using one of the options: </br> - Continuous data synchronization options like [failover groups (recommended)](failover-group-sql-db.md) or [active geo-replication](active-geo-replication-overview.md) that allow you to create replicas in a secondary region for failover. </br> - Setting backup storage redundancy to geo-redundant backup storage to use [geo-restore](recovery-using-backups.md#geo-restore). |
-
 ## RTO and RPO

-As you develop your business continuity plan, understand the maximum acceptable time before the application fully recovers after the disruptive event. The time required for an application to fully recover is known as the Recovery Time Objective (RTO). Also understand the maximum period of recent data updates (time interval) the application can tolerate losing when recovering from an unplanned disruptive event. The potential data loss is known as Recovery Point Objective (RPO).
+As you develop your business continuity plan, understand the maximum acceptable time before the application fully recovers after the disruptive event. Two common ways to quantify business requirements around disaster recovery are:
+
+- **Recovery Time Objective (RTO)**: The time required for an application to fully recover after an unplanned disruptive event.
+- **Recovery Point Objective (RPO)**: The amount of data loss, measured as a time interval, that can be tolerated from an unplanned disruptive event.
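
The two definitions above can be made concrete by measuring them for a single incident. The sketch below is only an illustration: the `Incident` type, its field names, and the sample timestamps are hypothetical and do not come from any Azure API. It treats the achieved RTO as the time from the disruption to full recovery, and the achieved RPO as the window of changes made after the last change that survived recovery.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Incident:
    """Hypothetical record of one disruptive event (all names are illustrative)."""
    last_recoverable_change: datetime  # newest change that survived the recovery
    disruption_start: datetime         # when the application became unavailable
    service_restored: datetime         # when the application was fully recovered


def achieved_rto(incident: Incident) -> timedelta:
    """Downtime: from the start of the disruption until full recovery."""
    return incident.service_restored - incident.disruption_start


def achieved_rpo(incident: Incident) -> timedelta:
    """Data loss window: changes made after the last recoverable one are lost."""
    return incident.disruption_start - incident.last_recoverable_change


def meets_objectives(incident: Incident, rto_target: timedelta, rpo_target: timedelta) -> bool:
    """True when both the downtime and the data loss window are within target."""
    return achieved_rto(incident) <= rto_target and achieved_rpo(incident) <= rpo_target


if __name__ == "__main__":
    incident = Incident(
        last_recoverable_change=datetime(2024, 1, 1, 11, 59, 55),
        disruption_start=datetime(2024, 1, 1, 12, 0, 0),
        service_restored=datetime(2024, 1, 1, 12, 0, 45),
    )
    print("Achieved RTO:", achieved_rto(incident))
    print("Achieved RPO:", achieved_rpo(incident))
    print("Within targets:", meets_objectives(incident, timedelta(seconds=60), timedelta(seconds=10)))
```

In this made-up incident the application was down for 45 seconds and lost 5 seconds of changes, so it would satisfy a 60-second RTO target and a 10-second RPO target.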

 The following table compares RPO and RTO of each business continuity option:
 | High Availability </br>(Using zone redundancy) | Typically less than 30 seconds | 0 |
-| Disaster Recovery </br>(Using failover groups with [customer managed failover policy](failover-group-sql-db.md#failover-policy) or active geo-replication) | Typically less than 60 seconds | Equal to or greater than 0 </br> (Depends on data changes before the disruptive event that haven't been replicated) |
+| Disaster Recovery </br>(Using failover groups with [customer managed failover policy](failover-group-sql-db.md#failover-policy) or active geo-replication) | Typically less than 60 seconds | Equal to or greater than 0 </br> (Depends on data changes before the disruptive event that haven't been replicated) |
 | Disaster Recovery </br>(Using geo-restore) | Typically minutes or hours, dependent on Azure storage replication | Typically minutes or hours, dependent on size of database backup |

-## Business continuity checklists
+## Features that provide business continuity
+
+From a database perspective, there are four major potential disruption scenarios. The following table lists SQL Database business continuity features you can use to mitigate a potential business disruption scenario:
+
+| Business disruption scenario | Business continuity feature |
+|:--|:--|
+| Local hardware or software failures affecting the database node. | To mitigate local hardware and software failures, Azure SQL Database includes an [availability architecture](high-availability-sla-local-zone-redundancy.md), which guarantees automatic recovery from these failures with up to 99.99% availability SLA.|
+| Data corruption or deletion typically caused by an application bug or human error. Such failures are application-specific and typically can't be detected by the database service. | To protect your business from data loss, SQL Database automatically creates full database backups weekly, differential database backups every 12 or 24 hours, and transaction log backups every 5 - 10 minutes. By default, backups are stored in [geo-redundant storage](automated-backups-overview.md#backup-storage-redundancy) for seven days for all service tiers. All service tiers except Basic support a configurable backup retention period for [point-in-time restore (PITR)](recovery-using-backups.md#point-in-time-restore) of up to 35 days. You can [restore a deleted database](recovery-using-backups.md#restore-deleted-database) to the point at which it was deleted if the server hasn't been deleted, or if you've configured [long-term retention (LTR)](long-term-retention-overview.md).|
+| Rare datacenter or availability zone outage, possibly caused by a natural disaster event, configuration change, software bug, or hardware failure. | To mitigate datacenter or availability zone level outage, enable [zone redundancy](high-availability-sla-local-zone-redundancy.md#zone-redundant-availability) for the database or elastic pool to use [Azure Availability Zones](/azure/reliability/availability-zones-overview) and provide redundancy across multiple physical zones within an Azure region. Enabling zone redundancy ensures the database or elastic pool is resilient to zonal failures with up to 99.995% high availability SLA.|
+| Rare _regional outage_ affecting all availability zones and the datacenters comprising it, possibly caused by catastrophic natural disaster event. | To mitigate a region-wide outage, enable disaster recovery using one of the options: </br> - Continuous data synchronization options like [failover groups (recommended)](failover-group-sql-db.md) or [active geo-replication](active-geo-replication-overview.md) that allow you to create replicas in a secondary region for failover. </br> - Setting backup storage redundancy to geo-redundant backup storage to use [geo-restore](recovery-using-backups.md#geo-restore).|
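
The backup retention figures in the table (seven days by default, configurable up to 35 days) determine how far back a point-in-time restore can reach. The sketch below shows that window check in isolation. It is a simplification under stated assumptions: the function and constant names are hypothetical, and it ignores factors such as the database creation time and the delay before the newest changes appear in a transaction log backup.

```python
from datetime import datetime, timedelta

# Values taken from the table above; the constant names are illustrative.
DEFAULT_PITR_RETENTION_DAYS = 7
MAX_PITR_RETENTION_DAYS = 35


def can_point_in_time_restore(
    restore_point: datetime,
    now: datetime,
    retention_days: int = DEFAULT_PITR_RETENTION_DAYS,
) -> bool:
    """Return True if restore_point falls inside the configured retention window.

    Hypothetical helper: it only compares timestamps and does not call any service.
    """
    if not 0 < retention_days <= MAX_PITR_RETENTION_DAYS:
        raise ValueError(f"retention_days must be between 1 and {MAX_PITR_RETENTION_DAYS}")
    earliest_restorable = now - timedelta(days=retention_days)
    return earliest_restorable <= restore_point <= now


if __name__ == "__main__":
    now = datetime(2024, 6, 30, 12, 0, 0)
    ten_days_ago = now - timedelta(days=10)
    print(can_point_in_time_restore(ten_days_ago, now))                     # default 7-day window
    print(can_point_in_time_restore(ten_days_ago, now, retention_days=14))  # extended window
```

A restore point older than the configured window would need a long-term retention (LTR) backup instead, as the table notes.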

-For prescriptive recommendations to maximize availability and achieve higher business continuity, refer to the:
 Regardless of which business continuity features you use, you must prepare the secondary database in another region. If you don't prepare properly, bringing your applications online after a failover or recovery takes additional time and likely also requires troubleshooting, which can delay RTO. Follow the [checklist for preparing secondary for a region outage](high-availability-disaster-recovery-checklist.md#prepare-secondary-for-an-outage).

 ## Restore a database within the same Azure region

-You can use automatic database backups to restore a database to a point in time in the past. This way you can recover from data corruptions caused by human errors. Point-in-time restore (PITR) allows you to create a new database on the same server that represents the state of data prior to the corrupting event. For recovery times, see [RTO and RPO](#rto-and-rpo).
+You can use automatic database backups to restore a database to a point in time in the past. This way you can recover from data corruptions caused by human errors. Point-in-time restore (PITR) allows you to create a new database on the same server that represents the state of data before the corrupting event. For recovery times, see [RTO and RPO](#rto-and-rpo).

-If the maximum supported backup retention period for point-in-time restore isn't sufficient for your application, you can extend it by configuring a long-term retention (LTR) policy for the database(s). For more information, see [Long-term backup retention](long-term-retention-overview.md).
+If the maximum supported backup retention period for point-in-time restore isn't sufficient for your application, you can extend it by configuring a long-term retention (LTR) policy. For more information, see [Long-term retention](long-term-retention-overview.md).

 ## Upgrade an application with minimal downtime

-Sometimes an application must be taken offline because of maintenance such as an application upgrade. [Manage application upgrades](manage-application-rolling-upgrade.md) describes how to use active geo-replication to enable rolling upgrades of your cloud application to minimize downtime during upgrades and provide a recovery path if something goes wrong.
+Sometimes an application must be taken offline because of maintenance such as an application upgrade. You can [manage rolling upgrades of cloud applications by using SQL Database active geo-replication](manage-application-rolling-upgrade.md). Geo-replication can also provide a recovery path if something goes wrong.

-## Save on costs with a standby replica
+## Save on costs with a standby replica

 If your secondary replica is used _only_ for disaster recovery (DR) and doesn't have any read or write workloads, you can save on licensing costs by designating the database for standby when you configure a new active geo-replication relationship.

 Review [license-free standby replica](standby-replica-how-to-configure.md) to learn more.

-## Next steps
+## Next step
+
+> [!div class="nextstepaction"]
+> [High availability and disaster recovery checklist](high-availability-disaster-recovery-checklist.md)

-For application design considerations, see [Design an application for cloud disaster recovery](designing-cloud-solutions-for-disaster-recovery.md) and [Elastic pool disaster recovery strategies](disaster-recovery-strategies-for-applications-with-elastic-pool.md).
+## Related content

-Review the [Azure SQL Database disaster recovery guidance](disaster-recovery-guidance.md) and [Azure SQL Database high availability and disaster recovery checklist](high-availability-disaster-recovery-checklist.md).
+- [Designing globally available services using Azure SQL Database](designing-cloud-solutions-for-disaster-recovery.md)
+- [Disaster recovery strategies for applications using Azure SQL Database elastic pools](disaster-recovery-strategies-for-applications-with-elastic-pool.md)