You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/azure-sql/database/designing-cloud-solutions-for-disaster-recovery.md
+11-11Lines changed: 11 additions & 11 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -30,23 +30,23 @@ In this scenario, the applications have the following characteristics:
30
30
* Web tier and data tier must be collocated to reduce latency and traffic cost
31
31
* Fundamentally, downtime is a higher business risk for these applications than data loss
32
32
33
-
In this case, the application deployment topology is optimized for handling regional disasters when all application components need to failover together. The diagram below shows this topology. For geographic redundancy, the application’s resources are deployed to Region A and B. However, the resources in Region B are not utilized until Region A fails. A failover group is configured between the two regions to manage database connectivity, replication and failover. The web service in both regions is configured to access the database via the read-write listener **<failover-group-name>.database.windows.net** (1). Traffic manager is set up to use [priority routing method](../../traffic-manager/traffic-manager-configure-priority-routing-method.md) (2).
33
+
In this case, the application deployment topology is optimized for handling regional disasters when all application components need to fail over together. The diagram below shows this topology. For geographic redundancy, the application’s resources are deployed to Region A and B. However, the resources in Region B are not utilized until Region A fails. A failover group is configured between the two regions to manage database connectivity, replication and failover. The web service in both regions is configured to access the database via the read-write listener **<failover-group-name>.database.windows.net** (1). Azure Traffic Manager is set up to use [priority routing method](../../traffic-manager/traffic-manager-configure-priority-routing-method.md) (2).
34
34
35
35
> [!NOTE]
36
-
> [Azure traffic manager](../../traffic-manager/traffic-manager-overview.md) is used throughout this article for illustration purposes only. You can use any load-balancing solution that supports priority routing method.
36
+
> [Azure Traffic Manager](../../traffic-manager/traffic-manager-overview.md) is used throughout this article for illustration purposes only. You can use any load-balancing solution that supports priority routing method.
37
37
38
38
The following diagram shows this configuration before an outage:
39
39
40
40

41
41
42
-
After an outage in the primary region, the SQL Database service detects that the primary database is not accessible and triggers failover to the secondary region based on the parameters of the automatic failover policy (1). Depending on your application SLA, you can configure a grace period that controls the time between the detection of the outage and the failover itself. It is possible that traffic manager initiates the endpoint failover before the failover group triggers the failover of the database. In that case the web application cannot immediately reconnect to the database. But the reconnections will automatically succeed as soon as the database failover completes. When the failed region is restored and back online, the old primary automatically reconnects as a new secondary. The diagram below illustrates the configuration after failover.
42
+
After an outage in the primary region, SQL Database detects that the primary database is not accessible and triggers failover to the secondary region based on the parameters of the automatic failover policy (1). Depending on your application SLA, you can configure a grace period that controls the time between the detection of the outage and the failover itself. It is possible that Traffic Manager initiates the endpoint failover before the failover group triggers the failover of the database. In that case the web application cannot immediately reconnect to the database. But the reconnections will automatically succeed as soon as the database failover completes. When the failed region is restored and back online, the old primary automatically reconnects as a new secondary. The diagram below illustrates the configuration after failover.
43
43
44
44
> [!NOTE]
45
45
> All transactions committed after the failover are lost during the reconnection. After the failover is completed, the application in region B is able to reconnect and restart processing the user requests. Both the web application and the primary database are now in region B and remain co-located.
46
46
47
47

48
48
49
-
If an outage happens in region B, the replication process between the primary and the secondary database gets suspended but the link between the two remains intact (1). Traffic managed detects that connectivity to Region B is broken and marks the endpoint web app 2 as Degraded (2). The application's performance is not impacted in this case, but the database becomes exposed and therefore at higher risk of data loss in case region A fails in succession.
49
+
If an outage happens in region B, the replication process between the primary and the secondary database gets suspended but the link between the two remains intact (1). Traffic Manager detects that connectivity to Region B is broken and marks the endpoint web app 2 as Degraded (2). The application's performance is not impacted in this case, but the database becomes exposed and therefore at higher risk of data loss in case region A fails in succession.
50
50
51
51
> [!NOTE]
52
52
> For disaster recovery, we recommend the configuration with application deployment limited to two regions. This is because most of the Azure geographies have only two regions. This configuration does not protect your application from a simultaneous catastrophic failure of both regions. In an unlikely event of such a failure, you can recover your databases in a third region using [geo-restore operation](disaster-recovery-guidance.md#recover-using-geo-restore).
@@ -70,20 +70,20 @@ This option is best suited for applications with the following characteristics:
70
70
* Any data loss is high business risk. The database failover can only be used as a last resort if the outage is caused by a catastrophic failure.
71
71
* The application supports read-only and read-write modes of operations and can operate in "read-only mode" for a period of time.
72
72
73
-
In this pattern, the application switches to read-only mode when the read-write connections start getting time-out errors. The Web Application is deployed to both regions and include a connection to the read-write listener endpoint and different connection to the read-only listener endpoint (1). The Traffic manager profile should use [priority routing](../../traffic-manager/traffic-manager-configure-priority-routing-method.md). [End point monitoring](../../traffic-manager/traffic-manager-monitoring.md) should be enabled for the application endpoint in each region (2).
73
+
In this pattern, the application switches to read-only mode when the read-write connections start getting time-out errors. The web application is deployed to both regions and includes a connection to the read-write listener endpoint and different connection to the read-only listener endpoint (1). The Traffic Manager profile should use [priority routing](../../traffic-manager/traffic-manager-configure-priority-routing-method.md). [End point monitoring](../../traffic-manager/traffic-manager-monitoring.md) should be enabled for the application endpoint in each region (2).
74
74
75
75
The following diagram illustrates this configuration before an outage:
76
76
77
77

78
78
79
-
When the traffic manager detects a connectivity failure to region A, it automatically switches user traffic to the application instance in region B. With this pattern, it is important that you set the grace period with data loss to a sufficiently high value, for example 24 hours. It ensures that data loss is prevented if the outage is mitigated within that time. When the Web application in region B is activated the read-write operations start failing. At that point, it should switch to the read-only mode (1). In this mode the requests are automatically routed to the secondary database. If the outage is caused by a catastrophic failure, most likely it cannot be mitigated within the grace period. When it expires the failover group triggers the failover. After that the read-write listener becomes available and the connections to it stop failing (2). The following diagram illustrates the two stages of the recovery process.
79
+
When Traffic Manager detects a connectivity failure to region A, it automatically switches user traffic to the application instance in region B. With this pattern, it is important that you set the grace period with data loss to a sufficiently high value, for example 24 hours. It ensures that data loss is prevented if the outage is mitigated within that time. When the web application in region B is activated the read-write operations start failing. At that point, it should switch to the read-only mode (1). In this mode the requests are automatically routed to the secondary database. If the outage is caused by a catastrophic failure, most likely it cannot be mitigated within the grace period. When it expires the failover group triggers the failover. After that the read-write listener becomes available and the connections to it stop failing (2). The following diagram illustrates the two stages of the recovery process.
80
80
81
81
> [!NOTE]
82
-
> If the outage in the primary region is mitigated within the grace period, traffic manager detects the restoration of connectivity in the primary region and switches user traffic back to the application instance in region A. That application instance resumes and operates in read-write mode using the primary database in region A as illustrated by the previous diagram.
82
+
> If the outage in the primary region is mitigated within the grace period, Traffic Manager detects the restoration of connectivity in the primary region and switches user traffic back to the application instance in region A. That application instance resumes and operates in read-write mode using the primary database in region A as illustrated by the previous diagram.
If an outage happens in region B, the traffic manager detects the failure of the end point web-app-2 in region B and marks it degraded (1). In the meantime, the failover group switches the read-only listener to region A (2). This outage does not impact the end user experience but the primary database is exposed during the outage. The following diagram illustrates a failure in the secondary region:
86
+
If an outage happens in region B, Traffic Manager detects the failure of the end point web-app-2 in region B and marks it degraded (1). In the meantime, the failover group switches the read-only listener to region A (2). This outage does not impact the end user experience but the primary database is exposed during the outage. The following diagram illustrates a failure in the secondary region:
87
87
88
88

89
89
@@ -92,7 +92,7 @@ Once the outage is mitigated, the secondary database is immediately synchronized
92
92
This design pattern has several **advantages**:
93
93
94
94
* It avoids data loss during the temporary outages.
95
-
* Downtime depends only on how quickly traffic manager detects the connectivity failure, which is configurable.
95
+
* Downtime depends only on how quickly Traffic Manager detects the connectivity failure, which is configurable.
96
96
97
97
The **tradeoff** is that the application must be able to operate in read-only mode.
98
98
@@ -107,7 +107,7 @@ In this scenario the application has the following characteristics:
107
107
108
108
In order to meet these requirements you need to guarantee that the user device **always** connects to the application deployed in the same geography for the read-only operations, such as browsing data, analytics etc. Whereas, the OLTP operations are processed in the same geography **most of the time**. For example, during the day time OLTP operations are processed in the same geography, but during the off hours they could be processed in a different geography. If the end user activity mostly happens during the working hours, you can guarantee the optimal performance for most of the users most of the time. The following diagram shows this topology.
109
109
110
-
The application’s resources should be deployed in each geography where you have substantial usage demand. For example, if your application is actively used in the United States, European Union and South East Asia the application should be deployed to all of these geographies. The primary database should be dynamically switched from one geography to the next at the end of the working hours. This method is called “follow the sun”. The OLTP workload always connects to the database via the read-write listener **<failover-group-name>.database.windows.net** (1). The read-only workload connects to the local database directly using the databases server endpoint **<server-name>.database.windows.net** (2). Traffic manager is configured with the [performance routing method](../../traffic-manager/traffic-manager-configure-performance-routing-method.md). It ensures that the end user’s device is connected to the web service in the closest region. Traffic manager should be set up with end point monitoring enabled for each web service end point (3).
110
+
The application’s resources should be deployed in each geography where you have substantial usage demand. For example, if your application is actively used in the United States, European Union and South East Asia the application should be deployed to all of these geographies. The primary database should be dynamically switched from one geography to the next at the end of the working hours. This method is called “follow the sun”. The OLTP workload always connects to the database via the read-write listener **<failover-group-name>.database.windows.net** (1). The read-only workload connects to the local database directly using the databases server endpoint **<server-name>.database.windows.net** (2). Traffic Manager is configured with the [performance routing method](../../traffic-manager/traffic-manager-configure-performance-routing-method.md). It ensures that the end user’s device is connected to the web service in the closest region. Traffic Manager should be set up with end point monitoring enabled for each web service end point (3).
111
111
112
112
> [!NOTE]
113
113
> The failover group configuration defines which region is used for failover. Because the new primary is in a different geography the failover results in longer latency for both OLTP and read-only workloads until the impacted region is back online.
@@ -140,7 +140,7 @@ The key **benefits** of this design are:
140
140
141
141
But there are some **tradeoffs**:
142
142
143
-
* A regional outage results in the geography to be impacted by longer latency. Both read-write and read-only workloads is served by the application in a different geography.
143
+
* A regional outage results in the geography to be impacted by longer latency. Both read-write and read-only workloads are served by the application in a different geography.
144
144
* The read-only workloads must connect to a different end point in each region.
145
145
146
146
## Business continuity planning: Choose an application design for cloud disaster recovery
Copy file name to clipboardExpand all lines: articles/azure-sql/database/service-tiers-general-purpose-business-critical.md
+4-4Lines changed: 4 additions & 4 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -19,7 +19,7 @@ ms.date: 01/30/2020
19
19
Azure SQL Database and Azure SQL Managed Instance are based on SQL Server database engine architecture that's adjusted for the cloud environment to ensure 99.99 percent availability, even if there is an infrastructure failure. Two service tiers are used by Azure SQL Database and Azure SQL Managed Instance, each with a different architectural model. These service tiers are:
20
20
21
21
-[General purpose](service-tier-general-purpose.md), which is designed for budget-oriented workloads.
22
-
-[Business critical](service-tier-business-critical.md),which is designed for low-latency workloads with high resiliency to failures and fast failovers.
22
+
-[Business critical](service-tier-business-critical.md),which is designed for low-latency workloads with high resiliency to failures and fast failovers.
23
23
24
24
Azure SQL Database has an additional service tier:
25
25
@@ -42,7 +42,7 @@ The following table describes the key differences between service tiers for the
|**TempDB size**| SQL Database |[32 GB per vCore](resource-limits-vcore-single-databases.md#general-purpose---provisioned-compute---gen4)|[32 GB per vCore](resource-limits-vcore-single-databases.md#hyperscale---provisioned-compute---gen5)|[32 GB per vCore](resource-limits-vcore-single-databases.md#business-critical---provisioned-compute---gen4)|
47
47
|| SQL Managed Instance |[24 GB per vCore](../managed-instance/resource-limits.md#service-tier-characteristics)| N/A | Up to 4 TB - [limited by storage size](../managed-instance/resource-limits.md#service-tier-characteristics)|
@@ -62,7 +62,7 @@ For more information, see the detailed differences between the service tiers in
62
62
63
63
## Data and log storage
64
64
65
-
The following factors affect the amount of storage used for data and log files, and applies to General Purpose and Business Critical. For details on data and log storage in Hyperscale, see [Hyperscale service tier](service-tier-hyperscale.md).
65
+
The following factors affect the amount of storage used for data and log files, and apply to General Purpose and Business Critical. For details on data and log storage in Hyperscale, see [Hyperscale service tier](service-tier-hyperscale.md).
66
66
67
67
- The allocated storage is used by data files (MDF) and log files (LDF).
68
68
- Each single database compute size supports a maximum database size, with a default maximum size of 32 GB.
@@ -95,6 +95,6 @@ Storage for database backups is allocated to support the point-in-time restore (
95
95
For details about the specific compute and storage sizes available in the general purpose and business critical service tiers, see:
96
96
97
97
-[vCore-based resource limits for Azure SQL Database](resource-limits-vcore-single-databases.md).
98
-
-[vCore-based resource limits for pooled Azure SQL Databases](resource-limits-vcore-elastic-pools.md).
98
+
-[vCore-based resource limits for pooled databases in Azure SQL Database](resource-limits-vcore-elastic-pools.md).
99
99
-[vCore-based resource limits for Azure SQL Managed Instance](../managed-instance/resource-limits.md).
0 commit comments