Commit 6582acb

Add notes and minor updates to Device Registry reliability guide
1 parent 941bb78 commit 6582acb

1 file changed: 49 additions and 17 deletions

@@ -1,38 +1,41 @@
11
---
22
title: Reliability in Azure Device Registry
33
description: Find out about reliability in Azure Device Registry, including availability zones and multi-region deployments.
4-
author: isabellaecr
4+
author: isabellaecr
55
ms.author: anaharris
66
ms.topic: reliability-article
77
ms.custom: subject-reliability, references_regions
88
ms.service: azure-device-registry
9-
ms.date: 11/19/2024
9+
ms.date: 11/19/2024
1010
---
1111

1212
# Reliability in Azure Device Registry
1313

14+
<!-- Can we verify the branding? I can't see "Azure Device Registry" in any other docs. It seems like it's part of IoT Operations - should it be branded as part of that instead? -->
15+
1416
This article describes reliability support in Azure Device Registry. It covers both intra-regional resiliency with [availability zones](#availability-zone-support) and information on [multi-region deployments](#multi-region-support).
1517

1618
Because resiliency is a shared responsibility between you and Microsoft, this article also covers ways for you to build a resilient solution that meets your needs.
1719

18-
1920
## Transient faults
2021

21-
[!INCLUDE [Transient fault description](includes/reliability-transient-fault-description-include.md)]
22+
<!-- Is there any other information we should give here? For example, do clients typically interact with this service through an SDK, and if so, does it handle transient faults automatically? We should state that if so. -->
2223

24+
[!INCLUDE [Transient fault description](includes/reliability-transient-fault-description-include.md)]
2325

2426
## Availability zone support
2527

2628
[!INCLUDE [AZ support description](includes/reliability-availability-zone-description-include.md)]
2729

28-
Azure Device Registry is zone-redundant, which means that it automatically replicates across multiple [availability zones](../reliability/availability-zones-overview.md). This setup enhances the resiliency of the service by providing high availability. If there's a failure in one zone, the service can continue to operate seamlessly from another zone.
30+
Azure Device Registry is zone redundant by default, which means that it automatically replicates your data across multiple [availability zones](../reliability/availability-zones-overview.md). This setup enhances the resiliency of the service by providing high availability. If there's a failure in one zone, the service can continue to operate seamlessly from another zone.
2931

30-
Microsoft manages setup and configuration for zone redundancy in Azure Device Registry. You don't need to perform any more configuration to enable this zone redundancy. Microsoft ensures that the service is configured to provide the highest level of availability and reliability.
32+
Microsoft manages setup and configuration for zone redundancy in Azure Device Registry. You don't need to perform any more configuration to enable this zone redundancy. Microsoft ensures that the service is configured to provide the highest level of availability and reliability.
3133

3234
### Regions supported
3335

3436
The following regions support availability zones in Azure Device Registry:
3537

38+
<!-- Anastasia - style question: should we remove the empty columns? -->
3639

3740
| Americas | Europe | Middle East | Africa | Asia Pacific |
3841
|------------------|----------------------|---------------|--------------------|----------------|
@@ -41,48 +44,77 @@ The following list of regions support availability zones in Azure Device Registr
4144
| West US 2 | | | | |
4245
| West US 3 | | | | |
4346

44-
4547
### Cost
4648

4749
There's no extra cost to use zone redundancy for Azure Device Registry.
4850

4951
### Configure availability zone support
5052

51-
**New resources:** When you create an Azure Device Registry resource in Azure IoT Operations, it automatically includes zone-redundancy by default. There's no need for you to perform any more configuration.
53+
**New resources:** When you create an Azure Device Registry resource in Azure IoT Operations, it automatically includes zone-redundancy by default. There's no need for you to perform any more configuration.
5254

55+
### Normal operations
56+
57+
The following information describes what happens when you have a zone-redundant device registry and all availability zones are operational:
58+
59+
- **Traffic routing between zones:** Requests are automatically spread across each availability zone. A request might go to a Device Registry instance in any availability zone. <!-- Need to verify -->
60+
61+
- **Data replication between zones:** Device data is replicated synchronously across availability zones.
5362

5463
### Zone-down experience
5564

56-
During a zone-wide outage, you don't need to take any action to fail over to a healthy zone. The service automatically self-heals and rebalances itself to take advantage of the healthy zone automatically.
65+
The following information describes what happens when you have a zone-redundant device registry and an availability zone experiences an outage.
66+
67+
- **Detection and response:** Because Azure Device Registry detects and responds automatically to failures in an availability zone, you don't need to do anything to initiate an availability zone failover.
68+
69+
- **Active requests:** Any active requests could be dropped and might need to be retried. Follow [transient fault handling guidance](#transient-faults) to ensure your application is resilient to any transient faults.
5770

58-
**Detection and response:** Because Azure Device Registry detects and responds automatically to failures in an availability zone, you don't need to do anything to initiate an availability zone failover.
71+
- **Expected data loss:** A zone failure isn't expected to cause any data loss.
5972

73+
- **Expected downtime:** A zone failure isn't expected to cause downtime to your resources.
74+
75+
### Failback
76+
77+
When the availability zone recovers, Azure Device Registry automatically restores operations in the availability zone.
78+
79+
### Testing for zone failures
80+
81+
The Azure Device Registry platform manages traffic routing, failover, and failback across availability zones. You don't need to initiate anything. Because this feature is fully managed, you don't need to validate availability zone failure processes.
6082

6183
## Multi-region support
6284

63-
Azure Device Registry is a regional service with automatic geographical data replication. In a region-wide outage, Microsoft initiates compute failover from one region to another. If Azure Device Registry fails over, it continues to support its primary region, and no more actions by you're required.
85+
<!--
86+
87+
This section is extremely vague.
6488
65-
When using Azure IoT Operations (Azure IoT Operations), Azure Device Registry projects assets as Azure resources in the cloud within a single registry. The single registry is a source of truth for asset metadata and asset management capabilities. However, Azure IoT Operations includes various other components beyond Azure Device Registry. For detailed information on the high availability and zero data loss features of Azure IoT Operations components, refer to [Azure IoT Operations frequently asked questions](/azure/iot-operations/troubleshoot/iot-operations-faq#does-azure-iot-operations-offer-high-availability-and-zero-data-loss-features-).
89+
1. Does this capability depend on paired regions? (All the regions listed above are paired, so I assume yes - but we need to be explicit about the secondary region being the pair.)
90+
2. What does this mean - "If Azure Device Registry fails over, it continues to support its primary region"? How can it continue to support its primary region if the primary region is unavailable?
6691
92+
-->
93+
94+
Azure Device Registry is a regional service with automatic geographical data replication. In a region-wide outage, Microsoft initiates compute failover from one region to another. If Azure Device Registry fails over, it continues to support its primary region, and you don't need to take any further action.
95+
96+
When using Azure IoT Operations (Azure IoT Operations), Azure Device Registry projects assets as Azure resources in the cloud within a single registry. The single registry is a source of truth for asset metadata and asset management capabilities. However, Azure IoT Operations includes various other components beyond Azure Device Registry. For detailed information on the high availability and zero data loss features of Azure IoT Operations components, refer to [Azure IoT Operations frequently asked questions](/azure/iot-operations/troubleshoot/iot-operations-faq#does-azure-iot-operations-offer-high-availability-and-zero-data-loss-features-).
6797

6898
### Region down experience
6999

70-
During a region outage, Microsoft adheres to the Recovery Time Objective (RTO) to recover the service. During this time, the customer can expect some service interruption until the service is fully recovered.
100+
<!-- Let's frame this in terms of "expected downtime" and "expected data loss" instead of "RTO" and "RPO". -->
71101

72-
In a complete region loss scenario, you can expect a manual recovery from Microsoft.
102+
During a region outage, Microsoft adheres to the Recovery Time Objective (RTO) to recover the service. During this time, the customer can expect some service interruption until the service is fully recovered.
73103

104+
In a complete region loss scenario, you can expect a manual recovery from Microsoft.
74105

75106
For Azure Device Registry, Recovery Time Objective (RTO) is approximately 24 hours. For Recovery Point Objective (RPO), you can expect less than 15 minutes.
76107

108+
<!-- Is there any guidance for what to do if this capability doesn't meet a customer's needs, for example if a customer has no tolerance for downtime or data loss? Are there approaches a customer could follow to geo-replicate the data themselves, by provisioning multiple registries? Or would this be too hard to keep in sync? -->
77109

78110
## Service-level agreement (SLA)
79111

80-
The service-level agreement (SLA) for Azure Device Registry describes the expected availability of the service, and the conditions that must be met to achieve that availability expectation. To understand those conditions, it's important that you review the [Service Level Agreements (SLA) for Online Services](https://www.microsoft.com/licensing/docs/view/Service-Level-Agreements-SLA-for-Online-Services).
112+
<!-- Does this service actually have an SLA? If so, where in the SLA document is this service? I can't see it, either as "IoT Operations" or "Device Registry". -->
81113

114+
The service-level agreement (SLA) for Azure Device Registry describes the expected availability of the service, and the conditions that must be met to achieve that availability expectation. To understand those conditions, it's important that you review the [Service Level Agreements (SLA) for Online Services](https://www.microsoft.com/licensing/docs/view/Service-Level-Agreements-SLA-for-Online-Services).
82115

83116
## Related content
84117

85-
86-
- [What is Azure IoT Operations? - Azure IoT Operations](/azure/iot-operations/overview-iot-operations)
118+
- [What is Azure IoT Operations? - Azure IoT Operations](/azure/iot-operations/overview-iot-operations)
87119

88120
- [Reliability in Azure](/azure/reliability/overview)
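
The zone-down guidance in this change says that active requests could be dropped and might need to be retried. The following is a minimal sketch of that retry pattern, using exponential backoff with jitter. It isn't taken from any Azure SDK: `TransientError`, `call_with_retries`, and all parameter names and default values are hypothetical, chosen only for illustration, and a real client might already handle retries for you.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for a retryable failure, such as a dropped request."""


def call_with_retries(operation, max_attempts=5, base_delay=0.5,
                      max_delay=8.0, sleep=time.sleep):
    """Call operation(), retrying on TransientError.

    Waits a random time between 0 and an exponentially growing cap
    (base_delay, 2*base_delay, ... up to max_delay) between attempts.
    Re-raises the last TransientError once max_attempts is reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise
            # Cap for this attempt: doubles each time, never above max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads retries out so clients don't retry in lockstep.
            sleep(random.uniform(0, cap))
```

To use the sketch, wrap whatever call your application makes to the registry in a function that raises `TransientError` for retryable failures, then pass that function to `call_with_retries`. Errors that aren't transient (for example, authorization failures) should be left to propagate without a retry.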
