articles/reliability/concept-reliability.md
Lines changed: 15 additions & 24 deletions
@@ -1,6 +1,6 @@
1
1
---
2
2
title: Overview of reliability
3
-
description: Get an overvview of reliability concepts such as availability zones, regions.
3
+
description: Get an overview of reliability in Azure, including platform capabilities, the shared responsibility model, and how each Azure service supports reliability.
CustomerIntent: As a cloud architect/engineer, I want to learn about Azure Reliability.
11
11
---
12
12
13
-
# What is Reliability?
13
+
# What is reliability?
14
14
15
-
Reliability is a key concept in cloud computing, and refers to the ability of a workload to perform at expectation and in accordance with business continuity requirements. In Azure, reliability is achieved through a combination of factors, including the the design of the platform itself, its services, the architecture of applications, and the implementation of best practices.
16
-
17
-
Primarily, the reliability of a workload is defined by its *resiliency*, which is a workload's ability to recover from possible faults or outages and still "just work". Azure offers a number of resiliency features such as availability zones, multi-region support, data replication, and backup and restore capabilities. These features must be considered when designing a workload to meet its business continuity requirements.
18
-
19
-
While resiliency is the primary way you can ensure a reliable workload, you also can consider other aspects of workflow design such as:
20
-
21
-
- Operational Excellence,
22
-
23
-
- Security,
24
-
25
-
- Performance Efficiency,
26
-
27
-
- Cost Optimization,
15
+
Reliability refers to the ability of a workload to perform consistently at the expected level, and in accordance with business continuity requirements. Reliability is a key concept in cloud computing. In Azure, reliability is achieved through a combination of factors, including the design of the platform itself, its services, the architecture of your applications, and the implementation of best practices.
28
16
17
+
A key approach to achieving reliability in a workload is *resiliency*, which is a workload's ability to withstand and recover from faults and outages. Azure offers a number of resiliency features such as availability zones, multi-region support, data replication, and backup and restore capabilities. These features must be considered when designing a workload to meet its business continuity requirements. The Azure reliability documentation provides detailed information about these platform capabilities and how Azure services can be used to implement your resiliency needs.
29
18
19
+
> [!TIP]
20
+
> Reliability also incorporates other elements of your solution design, including how you deploy changes safely, how you manage your performance to avoid downtime due to high load, and how you test and validate each part of your solution. To learn more, see the [Azure Well-Architected Framework](/azure/well-architected).
30
21
31
22
## Business continuity, high availability, and disaster recovery
32
23
@@ -36,15 +27,15 @@ When considering business continuity, it's important to understand the following
36
27
37
28
-*Business continuity* is the state in which a business can continue operations during failures, outages, or disasters. Business continuity requires proactive planning, preparation, and the implementation of resilient systems and processes.
38
29
39
-
-*High availability* is about designing a solution to be resilient to day-to-day issues and to meet the business needs for availability.
30
+
-*High availability* is about designing a solution to meet the business needs for availability, and being resilient to day-to-day issues that might affect the uptime requirements.
40
31
41
32
-*Disaster recovery* is about planning how to deal with uncommon risks and the catastrophic outages that can result.
42
33
43
34
For more information on business continuity and business continuity planning through high availability and disaster recovery design, see [What are business continuity, high availability, and disaster recovery?](./concept-business-continuity-high-availability-disaster-recovery.md)
44
35
45
36
## Resiliency and shared responsibility
46
37
47
-
Resiliency defines a workload's ability to be highly available by being able to automatically self-correct and recover from various forms of failures or outages. Although Azure services are built to be resilient to common failures, the resiliency of your workload depends on how you have designed your business continuity plan to meet your business needs. Some plans may consider certain failure risks to be unimportant, while others may consider them critical.
38
+
Resiliency defines a workload's ability to automatically self-correct and recover from various forms of failures or outages. Azure services are built to be resilient to many common failures, and each product provides a service level agreement (SLA) that describes the uptime you can expect. However, the overall resiliency of your workload depends on how you have designed your solution to meet your business needs. Some business continuity plans may consider certain failure risks to be unimportant, while others may consider them critical.
48
39
49
40
In the Azure public cloud platform, resiliency is a shared responsibility between Microsoft and you. Because there are different levels of resiliency in each workload that you design and deploy, it's important that you understand who has primary responsibility for each one of those levels from a resiliency perspective. To better understand how shared responsibility works, especially when confronting an outage or disaster, see [Shared responsibility for resiliency](concept-shared-responsibility.md).
50
41
@@ -55,27 +46,27 @@ Azure provides over 60 regions globally, that are located across many different
55
46
56
47
- For more information on Azure regions, see [What are Azure regions](./regions-overview.md).
57
48
- To learn about paired and nonpaired regions, including lists of region pairs and nonpaired regions, see [Azure region pairs and nonpaired regions](./regions-paired.md).
58
-
- To see the list of services that are deployed to Azure regions, see [Product Availability by Region](/explore/global-infrastructure/products-by-region/table)
49
+
- To see the list of services that are deployed to Azure regions, see [Product Availability by Region](https://azure.microsoft.com/explore/global-infrastructure/products-by-region/table).
59
50
60
51
61
52
## Azure availability zones
62
53
63
54
Many Azure regions provide availability zones, which are separated groups of datacenters within a region. Availability zones are close enough to have low-latency connections to other availability zones, but are far enough apart to reduce the likelihood that more than one will be affected by local outages or weather. Availability zones have independent power, cooling, and networking infrastructure. They're designed so that if one zone experiences an outage, then regional services, capacity, and high availability are supported by the remaining zones.
64
55
65
56
- For more information on availability zones, see [What are availability zones?](./availability-zones-overview.md).
66
-
- To view which services support availability zones, see [Azure services with availability zone support](./availability-zones-service-support.md)
67
57
- To view which regions support availability zones, see [Azure regions with availability zone support](./availability-zones-region-support.md).
58
+
- To learn about how each Azure service supports availability zones, see [Azure services with availability zone support](./availability-zones-service-support.md).
68
59
- To learn how to approach a migration to availability zone support, see [Azure availability zone migration baseline](availability-zones-baseline.md).
69
60
70
61
## Azure reliability guides by service
71
62
72
-
Azure provides a set of servicespecific reliability guidance that can help you design and implement a reliable workload. Each service has its own unique characteristics, and the guidance can help you understand how to best use the service to meet your business needs. Each guide may contain the following sections, depending on which reliability features it supports:
63
+
Each Azure service has its own unique reliability characteristics. Azure provides a set of service-specific reliability guides that can help you design and implement a reliable workload, and the guidance can help you understand how to best use the service to meet your business needs. Each guide may contain the following sections, depending on which reliability features it supports:
73
64
74
-
Each reliability service guide generally contains information on how the service supports:
65
+
Each reliability service guide generally contains information on how the service supports a range of reliability capabilities, including:
75
66
76
-
-*Availability zones* such as zonal or zone-redundant options, traffic routing and data replication between zones, zone-down experience, capacity planning, failback, and how to configure for availability zone support.
77
-
-*Multi-region support* such as how to configure multi-region or geo-disaster support, traffic routing and data replication between regions, region-down experience, failover and failback support, alternative multi-region support.
78
-
-*Backup support* such as who controls backups, where they are stored,how they can be recovered, and whether they are accessible only within a region or across regions.
67
+
-*Availability zones* such as zonal or zone-redundant deployment options, traffic routing and data replication between zones, zone-down experience, capacity planning, failback, and how to configure for availability zone support.
68
+
-*Multi-region support* such as how to configure multi-region or geo-disaster recovery support, traffic routing and data replication between regions, region-down experience, and failover and failback support. For some services that don't have native multi-region support, the guides present alternative multi-region deployment approaches to consider.
69
+
-*Backup support* such as Microsoft-controlled and customer-controlled backup capabilities, where backups are stored, how they can be recovered, and whether they are accessible only within a region or across regions.
79
70
80
71
For more information and a list of reliability service guides, see [Reliability guides by service](./reliability-guidance-overview.md).
While Azure provides a set of reliability features, the resiliency of your workload is a [shared responsibility between you and Microsoft]((./concept-shared-responsibility.md)) and depends on how you have designed your business continuity plan to define your expectations for reliability. For this reason, it's important that you understand the reliability features of each service you use, and how to best implement them in your workload. This document provides links to the reliability guidance for each Azure service, detailing how each services supports or does not support specific reliability features.
15
+
While Azure provides a set of reliability features, the resiliency of your workload is a [shared responsibility between you and Microsoft](./concept-shared-responsibility.md) and depends on how you have designed your business continuity plan to define your expectations for reliability. For this reason, it's important that you understand the reliability features of each service you use, and how to best implement them in your workload. This document provides links to the reliability guidance for each Azure service, detailing how each service supports or does not support specific reliability features.
16
16
17
17
Each service guide generally contains information on how the service supports: