Commit d92130f

Merge pull request #298386 from johndowns/reliability-device-registry-april-2025
Reliability guide - Azure Device Registry - Updates
2 parents 9588e7f + 7b84350 commit d92130f

1 file changed: 80 additions & 25 deletions

@@ -1,88 +1,143 @@
11
---
22
title: Reliability in Azure Device Registry
33
description: Find out about reliability in Azure Device Registry, including availability zones and multi-region deployments.
4-
author: isabellaecr
4+
author: isabellaecr
55
ms.author: anaharris
66
ms.topic: reliability-article
77
ms.custom: subject-reliability, references_regions
88
ms.service: azure-device-registry
9-
ms.date: 11/19/2024
9+
ms.date: 07/30/2025
1010
---
1111

1212
# Reliability in Azure Device Registry
1313

14+
Azure Device Registry stores information about assets and devices in the cloud. Azure Device Registry projects assets as Azure resources in the cloud within a single registry. The single registry is a source of truth for device and asset metadata, and asset management capabilities. Device Registry can be used in conjunction with [Azure IoT Operations](/azure/iot-operations/overview-iot-operations).
15+
1416
This article describes reliability support in Azure Device Registry. It covers both intra-regional resiliency with [availability zones](#availability-zone-support) and information on [multi-region deployments](#multi-region-support).
1517

1618
[!INCLUDE [Shared responsibility description](includes/reliability-shared-responsibility-include.md)]
1719

20+
> [!NOTE]
21+
> Azure IoT Operations includes various other components beyond Device Registry. For detailed information on the high availability and zero data loss features of Azure IoT Operations components, refer to [Azure IoT Operations frequently asked questions](/azure/iot-operations/troubleshoot/iot-operations-faq).
1822
1923
## Transient faults
2024

2125
[!INCLUDE [Transient fault description](includes/reliability-transient-fault-description-include.md)]
2226

27+
Clients interact with Device Registry by using Azure Resource Manager. Commonly, you use the Azure portal, Azure CLI, or Azure SDKs to interact with Device Registry resources, and these tools provide automatic handling of transient faults. If you use the Resource Manager APIs directly, make sure to handle transient faults.
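
If you do call the Resource Manager APIs directly, transient fault handling usually means retrying failed requests with increasing delays. The following is a minimal sketch of that idea, not an official client. The `RequestFailed` exception, the `call_with_retries` helper, and the set of status codes treated as transient are all assumptions made for illustration; substitute whatever your own HTTP wrapper raises.

```python
import random
import time

# Assumption: status codes commonly treated as transient. Adjust for your client.
TRANSIENT_STATUS_CODES = {408, 429, 500, 502, 503, 504}


class RequestFailed(Exception):
    """Hypothetical error raised by your own HTTP wrapper, carrying the status code."""

    def __init__(self, status_code):
        super().__init__(f"request failed with status {status_code}")
        self.status_code = status_code


def call_with_retries(operation, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call operation(), retrying transient failures with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except RequestFailed as err:
            is_last_attempt = attempt == max_attempts
            if err.status_code not in TRANSIENT_STATUS_CODES or is_last_attempt:
                # Non-transient errors and exhausted retries are surfaced to the caller.
                raise
            # Exponential backoff capped at max_delay, with full jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

You would wrap each request in a function and pass it in, for example `call_with_retries(lambda: send_request(url))`, where `send_request` is your own function. Jitter spreads retries out so that many clients don't retry at the same moment.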
2328

2429
## Availability zone support
2530

2631
[!INCLUDE [AZ support description](includes/reliability-availability-zone-description-include.md)]
2732

28-
Azure Device Registry is zone-redundant, which means that it automatically replicates across multiple [availability zones](../reliability/availability-zones-overview.md). This setup enhances the resiliency of the service by providing high availability. If there's a failure in one zone, the service can continue to operate seamlessly from another zone.
33+
Azure Device Registry is zone redundant by default, which means that it automatically replicates your data across multiple [availability zones](../reliability/availability-zones-overview.md). This setup enhances the resiliency of the service by providing high availability. If there's a failure in one zone, the service can continue to operate seamlessly from another zone.
2934

30-
Microsoft manages setup and configuration for zone redundancy in Azure Device Registry. You don't need to perform any more configuration to enable this zone redundancy. Microsoft ensures that the service is configured to provide the highest level of availability and reliability.
35+
Microsoft manages setup and configuration for zone redundancy in Azure Device Registry. You don't need to perform any additional configuration to enable zone redundancy. Microsoft ensures that the service is configured to provide the highest level of availability and reliability.
3136

3237
### Regions supported
3338

3439
The following regions support availability zones in Azure Device Registry:
3540

36-
37-
| Americas | Europe | Middle East | Africa | Asia Pacific |
38-
|------------------|----------------------|---------------|--------------------|----------------|
39-
| East US | North Europe | | | |
40-
| East US 2 | West Europe | | | |
41-
| West US 2 | | | | |
42-
| West US 3 | | | | |
43-
41+
| Americas | Europe |
42+
|------------------|----------------------|
43+
| East US | North Europe |
44+
| East US 2 | West Europe |
45+
| West US | |
46+
| West US 2 | |
47+
| West US 3 | |
4448

4549
### Cost
4650

4751
There's no extra cost to use zone redundancy for Azure Device Registry.
4852

4953
### Configure availability zone support
5054

51-
**New resources:** When you create an Azure Device Registry resource in Azure IoT Operations, it automatically includes zone-redundancy by default. There's no need for you to perform any more configuration.
55+
**New resources:** When you create an Azure Device Registry resource in Azure IoT Operations, it's zone redundant by default. You don't need to perform any additional configuration.
56+
57+
### Normal operations
5258

59+
The following information describes what happens when you have a zone-redundant device registry and all availability zones are operational:
60+
61+
- **Traffic routing between zones:** Requests are automatically spread across each availability zone. A request might go to a Device Registry instance in any availability zone.
62+
63+
- **Data replication between zones:** Device data is replicated synchronously across availability zones.
5364

5465
### Zone-down experience
5566

56-
During a zone-wide outage, you don't need to take any action to fail over to a healthy zone. The service automatically self-heals and rebalances itself to take advantage of the healthy zone automatically.
67+
The following information describes what happens when you have a zone-redundant device registry and an availability zone experiences an outage:
68+
69+
- **Detection and response:** Because Azure Device Registry detects and responds automatically to failures in an availability zone, you don't need to do anything to initiate an availability zone failover.
5770

58-
**Detection and response:** Because Azure Device Registry detects and responds automatically to failures in an availability zone, you don't need to do anything to initiate an availability zone failover.
71+
- **Notification:** Zone failure events can be monitored through Azure Service Health. Set up alerts to receive notifications of zone-level issues.
5972

73+
- **Active requests:** Some active requests might be dropped and need to be retried, in the same way as other transient faults. To make sure that your application is resilient to transient faults, see [transient fault handling guidance](#transient-faults).
74+
75+
- **Expected data loss:** A zone failure isn't expected to cause any data loss.
76+
77+
- **Expected downtime:** A zone failure isn't expected to cause downtime to your resources.
78+
79+
### Failback
80+
81+
When the availability zone recovers, Azure Device Registry automatically restores operations in the availability zone.
82+
83+
### Testing for zone failures
84+
85+
The Azure Device Registry platform manages traffic routing, failover, and failback across availability zones. You don't need to initiate anything. Because this feature is fully managed, you don't need to validate availability zone failure processes.
6086

6187
## Multi-region support
6288

63-
Azure Device Registry is a regional service with automatic geographical data replication. In a region-wide outage, Microsoft initiates compute failover from one region to another. If Azure Device Registry fails over, it continues to support its primary region, and no more actions by you're required.
89+
Device Registry is a single-region service. If the region becomes unavailable, your Device Registry resources are also unavailable.
90+
91+
However, your registry's data is replicated to the paired region. In the event of a prolonged region outage, Microsoft might elect to fail over to the paired region. If this happens, your registry continues to be available in the paired region.
6492

65-
When using Azure IoT Operations (Azure IoT Operations), Azure Device Registry projects assets as Azure resources in the cloud within a single registry. The single registry is a source of truth for asset metadata and asset management capabilities. However, Azure IoT Operations includes various other components beyond Azure Device Registry. For detailed information on the high availability and zero data loss features of Azure IoT Operations components, refer to [Azure IoT Operations frequently asked questions](/azure/iot-operations/troubleshoot/iot-operations-faq#does-azure-iot-operations-offer-high-availability-and-zero-data-loss-features-).
93+
### Region support
6694

95+
Default replication and failover are supported in all regions where Device Registry is available, because [all of these regions are paired](./regions-paired.md).
6796

68-
### Region down experience
97+
### Cost
6998

70-
During a region outage, Microsoft adheres to the Recovery Time Objective (RTO) to recover the service. During this time, the customer can expect some service interruption until the service is fully recovered.
99+
There's no extra cost for cross-region data replication or failover.
71100

72-
In a complete region loss scenario, you can expect a manual recovery from Microsoft.
101+
### Configure replication and prepare for failover
73102

103+
Cross-region data replication is configured automatically when you create Device Registry resources in a region that has a paired region. No intervention from you is required.
74104

75-
For Azure Device Registry, Recovery Time Objective (RTO) is approximately 24 hours. For Recovery Point Objective (RPO), you can expect less than 15 minutes.
105+
### Normal operations
76106

107+
This section describes what to expect when a device registry is configured for cross-region replication and failover, and the primary region is operational.
77108

78-
## Service-level agreement (SLA)
109+
- **Data replication between regions:** Data is replicated automatically to the paired region. Replication occurs asynchronously, which means that some data loss is expected if a failover occurs.
79110

80-
The service-level agreement (SLA) for Azure Device Registry describes the expected availability of the service, and the conditions that must be met to achieve that availability expectation. To understand those conditions, it's important that you review the [Service Level Agreements (SLA) for Online Services](https://www.microsoft.com/licensing/docs/view/Service-Level-Agreements-SLA-for-Online-Services).
111+
- **Traffic routing between regions:** In normal operations, traffic only flows to the primary region.
81112

113+
### Region-down experience
82114

83-
## Related content
115+
This section describes what to expect when a device registry is configured for cross-region replication and failover and there's an outage in the primary region.
116+
117+
- **Detection and response:** Microsoft can decide to perform a failover if the primary region is lost. This process can take several hours after the loss of the primary region, or even longer in some scenarios. Failover of Device Registry resources might not occur at the same time as other Azure services.
118+
119+
- **Notification:** Region failure events can be monitored through Azure Service Health. Set up alerts to receive notifications of region-level issues.
120+
121+
- **Active requests:** Any requests that the primary region is processing during a failover are likely to be lost. Clients should retry requests after failover completes.
122+
123+
- **Expected data loss:** Data is replicated asynchronously to the paired region. As a result, some data loss is expected after failover. You can expect less than 15 minutes of data loss following a region failover.
124+
125+
- **Expected downtime:** Expect approximately 24 hours of downtime from when the region is lost to when the resource is available in the paired region.
84126

127+
- **Traffic rerouting:** During the failover process, Device Registry updates DNS records to point to the paired region. All subsequent requests are sent to the paired region.
128+
129+
After the failover operation for the registry completes, all operations from the device and back-end applications are expected to continue working without requiring manual intervention.
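
Because replication is asynchronous, changes made shortly before the outage might not be present in the paired region. One way to reason about this is to keep your own record of recent changes and, after failover, check the ones that fall inside the expected data loss window. The following sketch assumes a hypothetical change log of `(timestamp, description)` tuples maintained by your own tooling; Device Registry doesn't provide this log, and the `changes_to_verify` helper is illustrative only.

```python
from datetime import datetime, timedelta, timezone

# From the guidance above: expect less than 15 minutes of data loss after failover.
RPO = timedelta(minutes=15)


def changes_to_verify(changes, outage_start, rpo=RPO):
    """Return the changes made inside the RPO window that ends at outage_start.

    `changes` is a hypothetical list of (timestamp, description) tuples that
    your own tooling records whenever it modifies a Device Registry resource.
    """
    window_start = outage_start - rpo
    return [change for change in changes if window_start <= change[0] <= outage_start]


if __name__ == "__main__":
    outage = datetime(2025, 7, 30, 12, 0, tzinfo=timezone.utc)
    log = [
        (outage - timedelta(hours=2), "create asset pump-01"),
        (outage - timedelta(minutes=10), "update asset pump-01 metadata"),
    ]
    # Only the change made 10 minutes before the outage falls inside the window.
    for timestamp, description in changes_to_verify(log, outage):
        print(timestamp.isoformat(), description)
```

Changes returned by this helper are candidates to verify and, if needed, reapply against the registry in the paired region.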
130+
131+
### Failback
132+
133+
When the primary region recovers, Azure Device Registry automatically restores operations in the region.
134+
135+
### Testing for region failures
136+
137+
The Azure Device Registry platform manages traffic routing, failover, and failback across paired regions. You don't need to initiate anything. Because this feature is fully managed, you don't need to validate paired region failure processes.
138+
139+
## Related content
85140

86-
- [What is Azure IoT Operations? - Azure IoT Operations](/azure/iot-operations/overview-iot-operations)
141+
- [What is Azure IoT Operations? - Azure IoT Operations](/azure/iot-operations/overview-iot-operations)
87142

88143
- [Reliability in Azure](/azure/reliability/overview)
