articles/reliability/reliability-container-registry.md
26 additions & 12 deletions
@@ -40,6 +40,7 @@ The service provides built-in resiliency through zone redundancy within supported
40
40
41
41
### Regional storage
42
42
43
+
<!-- Chase: Please clarify how this relates to reliability. If the data in a paired region isn't there to support reliability, we should remove this section. -->
43
44
Azure Container Registry stores data in the region where the registry is created, to help customers meet data residency and compliance requirements. In all regions except Brazil South and Southeast Asia, Azure may also store registry data in a paired region in the same geography. In the Brazil South and Southeast Asia regions, registry data is always confined to the region, to accommodate data residency requirements for those regions.
44
45
45
46
## Transient faults
@@ -54,6 +55,8 @@ Monitor registry operations through Azure Monitor metrics and logs to identify p
54
55
55
56
When using geo-replicated registries, implement failover logic in your applications to automatically switch to alternative registry endpoints if the primary endpoint becomes temporarily unavailable. This provides additional resilience against transient faults that might affect a specific regional endpoint.
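As a rough illustration of the failover logic described above, the following Python sketch tries a list of registry endpoints in order and returns the first one that responds. The function name `first_available`, the `probe` callback, and the endpoint names are hypothetical and aren't part of any Azure SDK. How you address individual regional endpoints, and what a suitable probe looks like, depends on your environment.

```python
from typing import Callable, Iterable, Optional


def first_available(
    endpoints: Iterable[str],
    probe: Callable[[str], bool],
) -> Optional[str]:
    """Return the first endpoint for which probe() reports success, else None.

    `probe` is a caller-supplied health check (hypothetical here), for example
    a function that sends a lightweight request to the endpoint.
    """
    for endpoint in endpoints:
        try:
            if probe(endpoint):
                return endpoint
        except OSError:
            # Treat network-level errors as "endpoint unavailable" and
            # continue with the next candidate.
            continue
    return None
```

Passing the probe in as a parameter keeps the selection logic independent of any particular client library, so it can be exercised in tests without network access.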
56
57
58
+
<!-- Chase: Do ACR tasks have transient fault handling behavior built in? For example, would it be reasonable to expect them to retry if there's a temporary blip downloading an image layer as part of a container build process? -->
59
+
57
60
## Availability zone support
58
61
59
62
[!INCLUDE [AZ support description](includes/reliability-availability-zone-description-include.md)]
@@ -66,7 +69,7 @@ Zone redundancy in Azure Container Registry protects your container images and a
66
69
67
70
Zone-redundant Premium registries can be deployed into [any region that supports availability zones](./regions-list.md).
68
71
69
-
If availability zones are added to an existing region, any previously created registries aren't automatically made zone-redundant. You need to create a new Premium registry to make it zone-redundant. <!-- Chase: Please verify this is accurate.>
72
+
If availability zones are added to an existing region, any previously created registries aren't automatically made zone-redundant. You need to create a new Premium registry to make it zone-redundant. <!-- Chase: Please verify this is accurate.-->
70
73
71
74
### Requirements
72
75
@@ -82,14 +85,16 @@ Zone redundancy is included with Premium tier registries at no additional cost.
82
85
83
86
### Configure availability zone support
84
87
85
-
-**Create zone-redundant registry**. Use the Azure portal, Azure CLI, Azure PowerShell, Bicep, or ARM templates to create Premium registries. Zone redundancy is automatically enabled when you create a Premium registry in a region that supports availability zones. For configuration details, see [Create a container registry using the Azure portal](/azure/container-registry/container-registry-get-started-portal).
88
+
-**Create zone-redundant registry**. Use the Azure portal, Azure CLI, Azure PowerShell, Bicep, or ARM templates to create Premium registries. Zone redundancy is automatically enabled when you create a Premium registry in a region that supports availability zones. For configuration details, see [Create a container registry using the Azure portal](/azure/container-registry/container-registry-get-started-portal).<!-- Chase: Please confirm this. [This document](https://learn.microsoft.com/azure/container-registry/zone-redundancy#create-zone-enabled-registry) indicates that you need to explicitly enable zone redundancy when creating a registry. -->
86
89
87
90
-**Migrate**. Existing Basic or Standard tier registries can be upgraded to Premium tier. However, upgrading alone doesn't enable zone redundancy for existing registries. To get zone redundancy, you must create a new Premium registry in a supported region and migrate your container images.
88
91
89
92
To migrate your container images between registries, see [Transfer artifacts to another registry](/azure/container-registry/container-registry-transfer-prerequisites), or [Import container images to a container registry](/azure/container-registry/container-registry-import-images).
90
93
91
94
-**Disable zone redundancy**. Zone redundancy cannot be disabled.
92
95
96
+
<!-- Anastasia: Not sure where is best to put this paragraph. -->
97
+
If your registry uses geo-replication and zone redundancy together, you configure zone redundancy on each regional replica. You can't change the zone redundancy setting after a replication is created, except by deleting and re-creating the replication.
93
98
94
99
### Normal operations
95
100
@@ -115,9 +120,9 @@ When a zone becomes unavailable, Azure Container Registry automatically handles
115
120
116
121
-**Active requests**. Active registry operations are automatically retried against healthy zones. Most operations complete successfully with minimal delay.
117
122
118
-
-**Expected data loss**. No data loss occurs during zone failover due to synchronous replication across zones.
123
+
-**Expected data loss**. No data loss occurs during zone failures because data is synchronously replicated across multiple zones before write operations complete.
119
124
120
-
-**Expected downtime**. Minimal downtime during automatic failover, typicallyseconds for most registry operations.
125
+
-**Expected downtime**. A small amount of downtime (typically a few seconds for most registry operations) might occur during automatic failover as traffic is redirected to healthy zones. We recommend following [transient fault handling best practices](#transient-faults) to minimize the effect of zone failover on your applications.
121
126
122
127
-**Traffic rerouting**. The platform automatically reroutes traffic to healthy zones without requiring configuration changes.
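Because a zone failover can cause a few seconds of failed requests, client code that retries with backoff is usually enough to ride through it. The following Python sketch shows one way to do that, using exponential backoff with jitter. The function name `retry_with_backoff` and its default values are illustrative assumptions, not documented Azure Container Registry settings, and the sketch treats any `OSError` as a transient fault.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation() until it succeeds or max_attempts is reached.

    All defaults are illustrative assumptions. Only OSError (which includes
    ConnectionError and TimeoutError) is treated as transient.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except OSError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error.
            # Double the delay on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from many clients.
            sleep(delay * random.uniform(0.5, 1.0))
    raise AssertionError("unreachable")
```

The `sleep` parameter exists so the delay can be replaced in tests. In production code you would leave it at its default.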
123
128
@@ -135,9 +140,9 @@ Azure Container Registry provides native multi-region support through geo-replic
135
140
136
141
If your registry isn't geo-replicated and a regional outage occurs, the registry data may become unavailable and is not automatically recovered. Customers who wish to have their registry data stored in multiple regions for better performance across different geographies or who wish to have resiliency in the event of a regional outage should enable geo-replication.
137
142
138
-
Unlike many Azure services, Container Registry geo-replication does not use Azure paired regions. You have complete flexibility to select any combination of Azure regions for replication based on your specific geographic, performance, and compliance requirements. Each geo-replicated registry functions as a complete registry endpoint, supporting all registry operations including image pushes, pulls, and management tasks.
143
+
Azure Container Registry geo-replication does not rely on Azure paired regions. You have complete flexibility to select any combination of Azure regions for replication based on your specific geographic, performance, and compliance requirements. Each geo-replicated registry functions as a complete registry endpoint, supporting all registry operations including image pushes, pulls, and management tasks.<!-- Chase: Please confirm how this works, if regions work through an active-passive approach. -->
139
144
140
-
Geo-replication automatically synchronizes container images and artifacts across all configured regions. The service uses content-addressable storage to efficiently replicate only the unique image layers, minimizing bandwidth usage and replication time. Registry operations are automatically routed to the nearest regional endpoint for optimal performance.
145
+
Geo-replication automatically synchronizes container images and artifacts across all configured regions. The service uses content-addressable storage to efficiently replicate only the unique image layers, minimizing bandwidth usage and replication time. Registry operations are automatically routed to the nearest regional endpoint for optimal performance.<!-- TODO waiting to verify this -->
141
146
142
147
:::image type="content" source="./media/reliability-acr/acr-geo-replication-healthy-ops.png" alt-text="Diagram that shows geo-replication architecture with global clients connecting to primary and replica registries across multiple regions with asynchronous replication." border="false" lightbox="./media/reliability-acr/acr-geo-replication-healthy-ops.png":::
143
148
@@ -151,10 +156,12 @@ You must use the Premium tier to enable geo-replication.
151
156
152
157
### Considerations
153
158
154
-
Each geo-replicated region functions as an independent registry endpoint. Container clients can connect to any regional endpoint for registry operations. Consider configuring your container orchestration platforms to use the regional endpoint closest to their deployment location for optimal performance.
159
+
Each geo-replicated region functions as an independent registry endpoint. Container clients can connect to any regional endpoint for registry operations. Consider configuring your container orchestration platforms to use the regional endpoint closest to their deployment location for optimal performance.<!-- TODO waiting to verify routing behaviour -->
155
160
156
161
Geo-replication provides eventual consistency across regions. Replication typically completes within minutes of a change, and most changes are expected to be reflected within approximately 15 minutes. Large container images or high-frequency updates might take longer to replicate across all regions.
157
162
163
+
When using geo-replication with zone-redundant registries, each replicated registry inherits the zone redundancy configuration of its deployment region, providing both zone-level and region-level protection.
164
+
158
165
### Cost
159
166
160
167
Each geo-replicated region is billed separately according to Premium tier pricing for the respective region. Additionally, egress charges apply for data transfer between regions during initial replication and ongoing synchronization.
@@ -189,20 +196,27 @@ When a region becomes unavailable, container operations can continue using alter
189
196
190
197
:::image type="content" source="./media/reliability-acr/acr-geo-replication-failover.png" alt-text="Diagram that shows regional failover scenario where primary region becomes unavailable and application health monitoring triggers failover to replica regions." border="false" lightbox="./media/reliability-acr/acr-geo-replication-failover.png":::
191
198
192
-
-**Detection and response**. Customer applications are responsible for detecting regional endpoint unavailability and switching to alternative regions. Configure health checks and failover logic in your container orchestration platforms.
199
+
-**Detection and response**. Customer applications are responsible for detecting regional endpoint unavailability and switching to alternative regions. Configure health checks and failover logic in your container orchestration platforms. <!-- Need to verify this. -->
200
+
193
201
-**Notification**. Regional outages are reported through Azure Service Health. Monitor registry availability metrics for each regional endpoint to detect issues. For service health information, see [Azure Service Health](/azure/service-health/).
194
-
-**Active requests**. Active requests to an unavailable region will fail and must be retried against alternative regional endpoints.
195
-
-**Expected data loss**. No data loss occurs as registry data is replicated across multiple regions. Recent changes that have not yet replicated may be temporarily unavailable.
196
-
-**Expected downtime**. No downtime for registry operations when using alternative regional endpoints. Applications must be configured to failover to available regions.
202
+
203
+
-**Active requests**. Active requests to an unavailable region will fail and must be retried against alternative regional endpoints. <!-- Need to verify this. -->
204
+
205
+
-**Expected data loss**. Some data loss is likely. Replication between regions is asynchronous, so recent writes might not have been replicated when the region becomes unavailable. Typically, less than 15 minutes of data is expected to be lost, but that's not guaranteed. <!-- Chase: Please verify this. -->
206
+
207
+
-**Expected downtime**. No downtime is expected for registry operations when your clients use alternative regional endpoints. Applications must be configured to fail over to available regions. <!-- TODO need to verify how automated failover might work and what downtime could be expected. -->
208
+
197
209
-**Traffic rerouting**. Applications must implement logic to route traffic to available regional endpoints when the primary region becomes unavailable.
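The detection and rerouting responsibilities described above could be sketched as a small health tracker that marks an endpoint unhealthy after several consecutive failed checks and prefers the first healthy endpoint in priority order. The class name `EndpointHealth`, the `failure_threshold` default, and the endpoint labels are hypothetical. This is a sketch under those assumptions rather than a description of how Azure Container Registry or any orchestrator behaves.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EndpointHealth:
    """Track consecutive health-check failures per endpoint (hypothetical helper)."""

    endpoints: List[str]          # Ordered by preference, most preferred first.
    failure_threshold: int = 3    # Illustrative default, not an Azure setting.
    _failures: Dict[str, int] = field(default_factory=dict)

    def record(self, endpoint: str, ok: bool) -> None:
        # A successful check resets the counter; a failure increments it.
        self._failures[endpoint] = 0 if ok else self._failures.get(endpoint, 0) + 1

    def is_healthy(self, endpoint: str) -> bool:
        return self._failures.get(endpoint, 0) < self.failure_threshold

    def preferred(self) -> Optional[str]:
        # Return the most preferred endpoint that is currently healthy.
        for endpoint in self.endpoints:
            if self.is_healthy(endpoint):
                return endpoint
        return None
```

Requiring several consecutive failures before switching avoids failing over on a single transient error. Because a successful check resets the counter, the same structure also returns to the preferred endpoint once it recovers.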
198
210
211
+
<!-- Chase: if it's the case that there's an active region and multiple read replicas, then during a failure does one replica get promoted? Or would all write operations fail? -->
212
+
199
213
### Failback
200
214
201
215
When a region recovers, registry operations automatically resume for that regional endpoint. The service synchronizes any changes that occurred during the outage. Applications can resume using the recovered regional endpoint, though this typically requires manual reconfiguration or automated failback logic.
202
216
203
217
### Testing for region failures
204
218
205
-
Test your applications' ability to handle regional failures by temporarily blocking access to a regional registry endpoint and verifying that container operations successfully failover to alternative regions. Use Azure Chaos Studio or manual testing procedures to validate your disaster recovery capabilities.
219
+
While you can't simulate the failure of one of the regions associated with your registry, you can test your applications' ability to fail over between regions. You can temporarily block access to a regional registry endpoint and verify that container operations successfully fail over to alternative regions. To learn more, see [Temporarily disable routing to replication](/azure/container-registry/container-registry-geo-replication#temporarily-disable-routing-to-replication).