Commit 51126f9

Updates
1 parent d3fa5c9 commit 51126f9

1 file changed

+46
-24
lines changed

articles/reliability/reliability-device-registry.md

Lines changed: 46 additions & 24 deletions
@@ -6,23 +6,26 @@ ms.author: anaharris
66
ms.topic: reliability-article
77
ms.custom: subject-reliability, references_regions
88
ms.service: azure-device-registry
9-
ms.date: 11/19/2024
9+
ms.date: 07/16/2025
1010
---
1111

1212
# Reliability in Azure Device Registry
1313

14-
<!-- Can we verify the branding? I can't see "Azure Device Registry" in any other docs. It seems like it's part of IoT Operations - should it be branded as part of that instead? -->
14+
Azure Device Registry stores information about IoT devices and other assets in the cloud. Azure Device Registry projects assets as Azure resources in the cloud within a single registry. The single registry is a source of truth for asset metadata and asset management capabilities. Device Registry is a component of [Azure IoT Operations](/azure/iot-operations/overview-iot-operations).
1515

1616
This article describes reliability support in Azure Device Registry. It covers both intra-regional resiliency with [availability zones](#availability-zone-support) and information on [multi-region deployments](#multi-region-support).
1717

1818
[!INCLUDE [Shared responsibility description](includes/reliability-shared-responsibility-include.md)]
1919

20-
## Transient faults
20+
> [!NOTE]
21+
> Azure IoT Operations includes various other components beyond Device Registry. For detailed information on the high availability and zero data loss features of Azure IoT Operations components, refer to [Azure IoT Operations frequently asked questions](/azure/iot-operations/troubleshoot/iot-operations-faq).
2122
22-
<!-- Is there any other information we should give here? For example, do clients typically interact with this service through an SDK - and if so does it handle transient faults autoatically? We should state that if so. -->
23+
## Transient faults
2324

2425
[!INCLUDE [Transient fault description](includes/reliability-transient-fault-description-include.md)]
2526

27+
Clients interact with Device Registry by using Azure Resource Manager. Commonly, you use the Azure portal, Azure CLI, or Azure SDKs to interact with Device Registry resources, and these tools provide automatic handling of transient faults. If you use the Resource Manager APIs directly, ensure you handle transient faults.
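
If you call the Resource Manager APIs directly, one common way to handle transient faults is to retry with exponential backoff and jitter. The following Python code is a minimal sketch of that pattern, not the retry logic that the Azure tools use. The `TransientError` class and the `call_with_retries` function are hypothetical names introduced for illustration, and the default attempt count and delays are assumptions that you should tune for your workload.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for a retryable failure, such as an HTTP 429 or 503 response."""


def call_with_retries(operation, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call operation() and retry on TransientError with exponential backoff and jitter.

    The delay ceiling doubles after each failed attempt (base_delay, 2x, 4x, ...)
    up to max_delay. The actual wait is a random fraction of that ceiling.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                # Out of attempts: surface the last error to the caller.
                raise
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(ceiling * rng())
```

In this sketch, `operation` is any zero-argument callable that wraps your request and raises `TransientError` when the response indicates a retryable condition. Errors that aren't transient, such as authorization failures, propagate immediately without a retry.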
28+
2629
## Availability zone support
2730

2831
[!INCLUDE [AZ support description](includes/reliability-availability-zone-description-include.md)]
@@ -35,8 +38,6 @@ Microsoft manages setup and configuration for zone redundancy in Azure Device Re
3538

3639
The following regions support availability zones in Azure Device Registry:
3740

38-
<!-- Anastasia - style question: should we remove the empty columns? -->
39-
4041
| Americas | Europe | Middle East | Africa | Asia Pacific |
4142
|------------------|----------------------|---------------|--------------------|----------------|
4243
| East US | North Europe | | | |
@@ -56,7 +57,7 @@ There's no extra cost to use zone redundancy for Azure Device Registry.
5657

5758
The following information describes what happens when you have a zone-redundant device registry and all availability zones are operational:
5859

59-
- **Traffic routing between zones:** Requests are automatically spread across each availability zone. A request might go to a Device Registry instance in any availability zone. <!-- Need to verify -->
60+
- **Traffic routing between zones:** Requests are automatically spread across each availability zone. A request might go to a Device Registry instance in any availability zone.
6061

6162
- **Data replication between zones:** Device data is replicated synchronously across availability zones.
6263

@@ -66,6 +67,8 @@ The following information describes what happens when you have a zone-redundant
6667

6768
- **Detection and response:** Because Azure Device Registry detects and responds automatically to failures in an availability zone, you don't need to do anything to initiate an availability zone failover.
6869

70+
- **Notification:** Zone failure events can be monitored through Azure Service Health. Set up alerts to receive notifications of zone-level issues.
71+
6972
- **Active requests:** Any active requests could be dropped and might need to be retried. Follow [transient fault handling guidance](#transient-faults) to ensure your application is resilient to any transient faults.
7073

7174
- **Expected data loss:** A zone failure isn't expected to cause any data loss.
@@ -82,38 +85,57 @@ The Azure Device Registry platform manages traffic routing, failover, and failba
8285

8386
## Multi-region support
8487

85-
<!--
88+
Device Registry is a single-region service. If the region becomes unavailable, your Device Registry resources are also unavailable.
89+
90+
However, if your resources are in a [region that's paired](./regions-paired.md), your registry's data is replicated to the paired region.
91+
92+
In the event of a prolonged region outage, Microsoft might elect to fail over to the paired region. If this happens, your registry continues to be available in the paired region.
93+
94+
### Region support
95+
96+
Default replication and failover are only supported in regions that are paired.
8697

87-
This section is extremely vague.
98+
### Cost
99+
100+
For registries in regions that are paired, there's no extra cost for cross-region data replication or failover.
101+
102+
### Configure replication and prepare for failover
88103

89-
1. Does this capability depend on paired regions? (All the regions listed above are paired, so I assume yes - but we need to be explicit about the secondary region being the pair.)
90-
2. What does this mean - "If Azure Device Registry fails over, it continues to support its primary region"? How can it continue to support its primary region if the primary region is unavailable?
104+
By default, cross-region data replication is automatically configured when you create Device Registry resources in a paired region. This process is a default option and requires no intervention from you.
105+
106+
### Normal operations
91107

92-
-->
108+
This section describes what to expect when a device registry is configured for cross-region replication and failover, and the primary region is operational.
93109

94-
Azure Device Registry is a regional service with automatic geographical data replication. In a region-wide outage, Microsoft initiates compute failover from one region to another. If Azure Device Registry fails over, it continues to support its primary region, and no more actions by you are required.
110+
- **Data replication between regions:** Data is replicated automatically to the paired region. Replication occurs asynchronously, which means that some data loss is expected if a failover occurs. There's no data replication between regions for device registries in nonpaired regions.
95111

96-
When using Azure IoT Operations (Azure IoT Operations), Azure Device Registry projects assets as Azure resources in the cloud within a single registry. The single registry is a source of truth for asset metadata and asset management capabilities. However, Azure IoT Operations includes various other components beyond Azure Device Registry. For detailed information on the high availability and zero data loss features of Azure IoT Operations components, refer to [Azure IoT Operations frequently asked questions](/azure/iot-operations/troubleshoot/iot-operations-faq#does-azure-iot-operations-offer-high-availability-and-zero-data-loss-features-).
112+
- **Traffic routing between regions:** In normal operations, traffic only flows to the primary region.
97113

98-
### Region down experience
114+
### Region-down experience
99115

100-
<!-- Let's frame this in terms of "expected downtime" and "expected data loss" instead of "RTO" and "RPO". -->
116+
This section describes what to expect when a device registry is configured for cross-region replication and failover and there's an outage in the primary region.
101117

102-
During a region outage, Microsoft adheres to the Recovery Time Objective (RTO) to recover the service. During this time, the customer can expect some service interruption until the service is fully recovered.
118+
- **Detection and response:** Microsoft can decide to perform a failover if the primary region is lost. This process can take several hours after the loss of the primary region, or even longer in some scenarios. Failover of Device Registry resources might not occur at the same time as other Azure services.
103119

104-
In a complete region loss scenario, you can expect a manual recovery from Microsoft.
120+
- **Notification:** Region failure events can be monitored through Azure Service Health. Set up alerts to receive notifications of region-level issues.
105121

106-
For Azure Device Registry, Recovery Time Objective (RTO) is approximately 24 hours. For Recovery Point Objective (RPO), you can expect less than 15 minutes.
122+
- **Active requests:** Any requests that the primary region is processing during a failover are likely to be lost. Clients should retry requests after failover completes.
107123

108-
<!-- Is there any guidance for what to do if this capability doesn't meet a customer's needs - e.g. if you a customer has no tolerance for downtime or data loss? Are there approaches a customer could follow to geo-replicate the data themselves, by provisioning multiple registries? Or would this be too hard to keep in sync? -->
124+
- **Expected data loss:** For regions that are paired, data is replicated asynchronously to the paired region. As a result, some data loss is expected after failover. You can expect less than 15 minutes of data loss following a region failover.
109125

110-
<!-- Is there any way to back up/restore device registry data? -->
126+
- **Expected downtime:** Expect approximately 24 hours of downtime from when the region is lost to when the resource is available in the paired region.
127+
128+
- **Traffic rerouting:** During the failover process, Device Registry updates DNS records to point to the paired region. All subsequent requests are sent to the paired region.
129+
130+
After the failover operation for the registry completes, all operations from the device and back-end applications are expected to continue working without requiring manual intervention.
131+
132+
### Failback
111133

112-
## Service-level agreement (SLA)
134+
When the primary region recovers, Azure Device Registry automatically restores operations in the region.
113135

114-
<!-- Does this service actually have an SLA? If so, where in the SLA document is this service? I can't see it, either as "IoT Operations" or "Device Registry". -->
136+
### Testing for region failures
115137

116-
The service-level agreement (SLA) for Azure Device Registry describes the expected availability of the service, and the conditions that must be met to achieve that availability expectation. To understand those conditions, it's important that you review the [Service Level Agreements (SLA) for Online Services](https://www.microsoft.com/licensing/docs/view/Service-Level-Agreements-SLA-for-Online-Services).
138+
The Azure Device Registry platform manages traffic routing, failover, and failback across paired regions. You don't need to initiate anything. Because this feature is fully managed, you don't need to validate paired region failure processes.
117139

118140
## Related content
119141
