You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/sql-database/sql-database-business-continuity.md
+31-43Lines changed: 31 additions & 43 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -12,82 +12,68 @@ author: anosov1960
12
12
ms.author: sashan
13
13
ms.reviewer: mathoma, carlrab
14
14
manager: craigg
15
-
ms.date: 04/04/2019
15
+
ms.date: 06/25/2019
16
16
---
17
17
# Overview of business continuity with Azure SQL Database
18
18
19
-
**Business continuity** in Azure SQL Database refers to the mechanisms, policies, and procedures that enable your business to continue operating in the face of disruption, particularly to its computing infrastructure. In the most of the cases, Azure SQL Database will handle the disruptive events that might happen in the cloud environment and keep your applications and business processes running. However, there are some disruptive events that cannot be handled by SQL Database such as:
19
+
**Business continuity** in Azure SQL Database refers to the mechanisms, policies, and procedures that enable your business to continue operating in the face of disruption, particularly to its computing infrastructure. In the most of the cases, Azure SQL Database will handle the disruptive events that might happen in the cloud environment and keep your applications and business processes running. However, there are some disruptive events that cannot be handled by SQL Database automatically such as:
20
20
21
21
- User accidentally deleted or updated a row in a table.
22
22
- Malicious attacker succeeded to delete data or drop a database.
23
23
- Earthquake caused a power outage and temporary disabled data-center.
24
24
25
-
These cases cannot be controlled by Azure SQL Database, so you would need to use the business continuity features in SQL Database that enables you to recover your data and keep your applications running.
26
-
27
25
This overview describes the capabilities that Azure SQL Database provides for business continuity and disaster recovery. Learn about options, recommendations, and tutorials for recovering from disruptive events that could cause data loss or cause your database and application to become unavailable. Learn what to do when a user or application error affects data integrity, an Azure region has an outage, or your application requires maintenance.
28
26
29
27
## SQL Database features that you can use to provide business continuity
30
28
31
29
From a database perspective, there are four major potential disruption scenarios:
32
30
33
31
- Local hardware or software failures affecting the database node such as a disk-drive failure.
34
-
- Data corruption or deletion typically caused by an application bug or human error. Such failures are intrinsically application-specific and cannot as a rule be detected or mitigated automatically by the infrastructure.
32
+
- Data corruption or deletion typically caused by an application bug or human error. Such failures are application-specific and typically cannot be detected by the database service.
35
33
- Datacenter outage, possibly caused by a natural disaster. This scenario requires some level of geo-redundancy with application failover to an alternate datacenter.
36
-
- Upgrade or maintenance errors, unanticipated issues that occur during planned upgrades or maintenance to an application or database may require rapid rollback to a prior database state.
34
+
- Upgrade or maintenance errors, unanticipated issues that occur during planned infrastructure maintenance or upgrades may require rapid rollback to a prior database state.
35
+
36
+
To mitigate the local hardware and software failures, SQL Database includes a [high availability architecture](sql-database-high-availability.md), which guarantees automatic recovery from these failures with up to 99.995% availability SLA.
37
+
38
+
To protect your business from data loss, SQL Database automatically creates full database backups weekly, differential database backups every 12 hours, and transaction log backups every 5 - 10 minutes . The backups are stored in RA-GRS storage for at least 7 days for all service tiers. All service tiers except Basic support configurable backup retention period for point-in-time restore, up to 35 days.
37
39
38
-
SQL Database provides several business continuity features, including automated backups and optional database replication that can mitigate these scenarios. First, you need to understand how SQL Database [high availability architecture](sql-database-high-availability.md) provides 99.99% availability and resiliency to some disruptive events that might affect your business process.
39
-
Then, you can learn about the additional mechanisms that you can use to recover from the disruptive events that cannot be handled by SQL Database high availability architecture, such as:
40
+
SQL Database also provides several business continuity features, that you can use to mitigate various unplanned scenarios.
40
41
41
42
-[Temporal tables](sql-database-temporal-tables.md) enable you to restore row versions from any point in time.
42
-
-[Built-in automated backups](sql-database-automated-backups.md) and [Point in Time Restore](sql-database-recovery-using-backups.md#point-in-time-restore) enables you to restore complete database to some point in time within the last 35 days.
43
+
-[Built-in automated backups](sql-database-automated-backups.md) and [Point in Time Restore](sql-database-recovery-using-backups.md#point-in-time-restore) enables you to restore complete database to some point in time within the configured retention period up to 35 days.
43
44
- You can [restore a deleted database](sql-database-recovery-using-backups.md#deleted-database-restore) to the point at which it was deleted if the **SQL Database server has not been deleted**.
44
45
-[Long-term backup retention](sql-database-long-term-retention.md) enables you to keep the backups up to 10 years.
45
46
-[Active geo-replication](sql-database-active-geo-replication.md) enables you to create readable replicas and manually failover to any replica in case of a data center outage or application upgrade.
46
47
-[Auto-failover group](sql-database-auto-failover-group.md#auto-failover-group-terminology-and-capabilities) allows the application to automatically recovery in case of a data center outage.
47
48
48
-
Each has different characteristics for estimated recovery time (ERT) and potential data loss for recent transactions. Once you understand these options, you can choose among them - and, in most scenarios, use them together for different scenarios. As you develop your business continuity plan, you need to understand the maximum acceptable time before the application fully recovers after the disruptive event. The time required for application to fully recover is known as recovery time objective (RTO). You also need to understand the maximum period of recent data updates (time interval) the application can tolerate losing when recovering after the disruptive event. The time period of updates that you might afford to lose is known as recovery point objective (RPO).
49
+
## Recover a database within the same Azure region
49
50
50
-
The following table compares the ERT and RPO for each service tier for the most common scenarios.
51
+
You can use automatic database backups to restore a database to a point in time in the past. This way you can recover from data corruptions caused by human errors. The poin-in-time restore allows you to create a new database in the same server that represents the state of data prior to the corrupting event. For most databases the restore operations takes less than 12 hours. It may take longer to recover a very large or very active database. For more information about recovery time, see [database recovery time](sql-database-recovery-using-backups.md#recovery-time).
51
52
52
-
| Capability | Basic | Standard | Premium | General Purpose | Business Critical
53
-
| --- | --- | --- | --- |--- |--- |
54
-
| Point in Time Restore from backup |Any restore point within seven days |Any restore point within 35 days |Any restore point within 35 days |Any restore point within configured period (up to 35 days)|Any restore point within configured period (up to 35 days)|
55
-
| Geo-restore from geo-replicated backups |ERT < 12 h<br> RPO < 1 h |ERT < 12 h<br>RPO < 1 h |ERT < 12 h<br>RPO < 1 h |ERT < 12 h<br>RPO < 1 h|ERT < 12 h<br>RPO < 1 h|
> *Manual database failover* refers to failover of a single database to its geo-replicated secondary using the [unplanned mode](sql-database-active-geo-replication.md#active-geo-replication-terminology-and-capabilities).
53
+
If the maximum supported backup retention period for point-in-time restore (PITR) is not sufficient for your application, you can extend it by configuring a long-term retention (LTR) policy for the database(s). For more information, see [Long-term backup retention](sql-database-long-term-retention.md).
61
54
62
-
## Recover a database to the existing server
55
+
## Recover a database to another Azure region
63
56
64
-
SQL Database automatically performs a combination of full database backups weekly, differential database backups generally taken every 12 hours, and transaction log backups every 5 - 10 minutes to protect your business from data loss. The backups are stored in RA-GRS storage for 35 days for all service tiers except Basic DTU service tiers where the backups are stored for 7 days. For more information, see [automatic database backups](sql-database-automated-backups.md). You can restore an existing database form the automated backups to an earlier point in time as a new database on the same SQL Database server using the Azure portal, PowerShell, or the REST API. For more information, see [Point-in-time restore](sql-database-recovery-using-backups.md#point-in-time-restore).
65
-
66
-
If the maximum supported point-in-time restore (PITR) retention period is not sufficient for your application, you can extend it by configuring a long-term retention (LTR) policy for the database(s). For more information, see [Long-term backup retention](sql-database-long-term-retention.md).
67
-
68
-
You can use these automatic database backups to recover a database from various disruptive events, both within your data center and to another data center. The recovery time is usually less than 12 hours. It may take longer to recover a very large or active database. Using automatic database backups, the estimated time of recovery depends on several factors including the total number of databases recovering in the same region at the same time, the database size, the transaction log size, and network bandwidth. For more information about recovery time, see [database recovery time](sql-database-recovery-using-backups.md#recovery-time). When recovering to another data region, the potential data loss is limited to 1 hour with use of geo-redundant backups.
69
-
70
-
Use automated backups and [point-in-time restore](sql-database-recovery-using-backups.md#point-in-time-restore) as your business continuity and recovery mechanism if your application:
57
+
Although rare, an Azure data center can have an outage. When an outage occurs, it causes a business disruption that might only last a few minutes or might last for hours.
71
58
72
-
- Is not considered mission critical.
73
-
- Doesn't have a binding SLA - a downtime of 24 hours or longer does not result in financial liability.
74
-
- Has a low rate of data change (low transactions per hour) and losing up to an hour of change is an acceptable data loss.
75
-
- Is cost sensitive.
59
+
- One option is to wait for your database to come back online when the data center outage is over. This works for applications that can afford to have the database offline. For example, a development project or free trial you don't need to work on constantly. When a data center has an outage, you do not know how long the outage might last, so this option only works if you don't need your database for a while.
60
+
- Another option is to restore a database on any server in any Azure region using [geo-redundant database backups](sql-database-recovery-using-backups.md#geo-restore) (geo-restore). Geo-restore uses a geo-redundant backup as its source and can be used to recover a database even if the database or datacenter is inaccessible due to an outage.
61
+
- Finally, you can quickly recover from an outage if you have configured either geo-secondary using [active geo-replication](sql-database-active-geo-replication.md) or an [auto-failover group](sql-database-auto-failover-group.md) for your database or databases. Depending on your choice of these technologies, you can use either manual or automatic failover. While failover itself takes only a few seconds, the service will take at least 1 hour to activate it. This is necessary to ensure that the failover is justified by the scale of the outage. Also, the failover may result in small data loss due to the nature of asynchronous replication.
76
62
77
-
If you need faster recovery, use [active geo-replication](sql-database-active-geo-replication.md) or [auto-failover groups](sql-database-auto-failover-group.md). If you need to be able to recover data from a period older than 35 days, use [Long-term retention](sql-database-long-term-retention.md).
63
+
As you develop your business continuity plan, you need to understand the maximum acceptable time before the application fully recovers after the disruptive event. The time required for application to fully recover is known as Recovery time objective (RTO). You also need to understand the maximum period of recent data updates (time interval) the application can tolerate losing when recovering from an unplanned disruptive event. The potential data loss is known as Recovery point objective (RPO).
78
64
79
-
## Recover a database to another region
65
+
Different recovery methods offer different levels of RPO and RTO. You can choose a specific recovery method, or use a combination of methods to achicethe the full application recovery. The following table compares RPO and RTO of each recovery option.
80
66
81
-
Although rare, an Azure data center can have an outage. When an outage occurs, it causes a business disruption that might only last a few minutes or might last for hours.
67
+
| Recovery method | RTO | RPO |
68
+
| --- | --- | --- |
69
+
| Geo-restore from geo-replicated backups | 12 h | 1 h |
70
+
| Auto-failover groups | 1 h | 5 s |
71
+
| Manual database failover | 30 s | 5 s |
82
72
83
-
- One option is to wait for your database to come back online when the data center outage is over. This works for applications that can afford to have the database offline. For example, a development project or free trial you don't need to work on constantly. When a data center has an outage, you do not know how long the outage might last, so this option only works if you don't need your database for a while.
84
-
- Another option is to restore a database on any server in any Azure region using [geo-redundant database backups](sql-database-recovery-using-backups.md#geo-restore) (geo-restore). Geo-restore uses a geo-redundant backup as its source and can be used to recover a database even if the database or datacenter is inaccessible due to an outage.
85
-
- Finally, you can quickly recover from an outage if you have configured either geo-secondary using [active geo-replication](sql-database-active-geo-replication.md) or an [auto-failover group](sql-database-auto-failover-group.md) for your database or databases. Depending on your choice of these technologies, you can use either manual or automatic failover. While failover itself takes only a few seconds, the service will take at least 1 hour to activate it. This is necessary to ensure that the failover is justified by the scale of the outage. Also, the failover may result in small data loss due to the nature of asynchronous replication. See the table earlier in this article for details of the auto-failover RTO and RPO.
73
+
> [!NOTE]
74
+
> *Manual database failover* refers to failover of a single database to its geo-replicated secondary using the [unplanned mode](sql-database-active-geo-replication.md#active-geo-replication-terminology-and-capabilities).
75
+
See the table earlier in this article for details of the auto-failover RTO and RPO.
> To use active geo-replication and auto-failover groups, you must either be the subscription owner or have administrative permissions in SQL Server. You can configure and fail over using the Azure portal, PowerShell, or the REST API using Azure subscription permissions or using Transact-SQL with SQL Server permissions.
91
77
92
78
Use auto-failover groups if your application meets any of these criteria:
93
79
@@ -97,8 +83,10 @@ Use auto-failover groups if your application meets any of these criteria:
97
83
- Has a high rate of data change and 1 hour of data loss is not acceptable.
98
84
- The additional cost of active geo-replication is lower than the potential financial liability and associated loss of business.
99
85
100
-
When you take action, how long it takes you to recover, and how much data loss you incur depends upon how you decide to use these business continuity features in your application. Indeed, you may choose to use a combination of database backups and active geo-replication depending upon your application requirements. For a discussion of application design considerations for stand-alone databases and for elastic pools using these business continuity features, see [Design an application for cloud disaster recovery](sql-database-designing-cloud-solutions-for-disaster-recovery.md) and [Elastic pool disaster recovery strategies](sql-database-disaster-recovery-strategies-for-applications-with-elastic-pool.md).
You may choose to use a combination of database backups and active geo-replication depending upon your application requirements. For a discussion of design considerations for stand-alone databases and for elastic pools using these business continuity features, see [Design an application for cloud disaster recovery](sql-database-designing-cloud-solutions-for-disaster-recovery.md) and [Elastic pool disaster recovery strategies](sql-database-disaster-recovery-strategies-for-applications-with-elastic-pool.md).
102
90
103
91
The following sections provide an overview of the steps to recover using either database backups or active geo-replication. For detailed steps including planning requirements, post recovery steps, and information about how to simulate an outage to perform a disaster recovery drill, see [Recover a SQL Database from an outage](sql-database-disaster-recovery.md).
0 commit comments