Commit 39e29cc

Apply suggestions from proofread
Co-authored-by: Jodi Martis <[email protected]>
1 parent 1df2da6 commit 39e29cc

1 file changed

+25
-25
lines changed

articles/reliability/reliability-data-factory.md

Lines changed: 25 additions & 25 deletions
@@ -1,6 +1,6 @@
11
---
22
title: Reliability in Azure Data Factory
3-
description: Learn about reliability in Azure Data Factory by using availability zones, multiple-region deployments, and resilient pipeline practices.
3+
description: Learn about reliability in Azure Data Factory, including availability zones, multiple-region deployments, and resilient pipeline practices.
44
author: jonburchel
55
ms.author: jburchel
66
ms.topic: reliability-article
@@ -22,15 +22,15 @@ You can use Azure Data Factory to create flexible and powerful data pipelines fo
2222

2323
- *Integration runtimes (IRs)*, which connect to data stores and perform activities defined in your pipeline.
2424

25-
- *Data stores that are connected to the data factory.* To help ensure that data stores meet your business continuity requirements, consult their product reliability documentation and guidance.
25+
- *Data stores that connect to the data factory.* To help ensure that data stores meet your business continuity requirements, consult their product reliability documentation and guidance.
2626

2727
## Reliability architecture overview
2828

2929
Azure Data Factory consists of multiple infrastructure components. Each component supports infrastructure reliability in various ways.
3030

3131
The components of Azure Data Factory include:
3232

33-
- **Core Azure Data Factory service**, which manages pipeline triggers and oversees the coordination of pipeline activities. The core service also manages metadata for each component in the data factory. Microsoft manages the core service.
33+
- **The core Azure Data Factory service**, which manages pipeline triggers and oversees the coordination of pipeline activities. The core service also manages metadata for each component in the data factory. Microsoft manages the core service.
3434

3535
- **[IRs](../data-factory/concepts-integration-runtime.md#integration-runtime-types)**, which perform specific activities within a pipeline. There are different types of IRs.
3636

@@ -50,26 +50,26 @@ Your pipeline activities should be *idempotent*, which means that they can be re
5050

5151
To prevent duplicate record insertion because of a transient fault, implement the following best practices:
5252

53-
- *Use unique identifiers* to each record before you write to the database. This approach can help you find and eliminate duplicate records.
53+
- *Use unique identifiers* for each record before you write to the database. This approach can help you find and eliminate duplicate records.
5454

5555
- *Use an upsert strategy* for connectors that support upsert. Before duplicate record insertion occurs, use this approach to check whether a record already exists. If it does exist, update it. If it doesn't exist, insert it. For example, SQL commands like `MERGE` or `ON DUPLICATE KEY UPDATE` use this upsert approach.
5656

57-
- *Use copy action strategies* that are described in [Data consistency verification in copy activity](../data-factory/copy-activity-data-consistency.md).
57+
- *Use copy action strategies.* For more information, see [Data consistency verification in copy activity](../data-factory/copy-activity-data-consistency.md).
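
The upsert practice in the list above can be illustrated with a minimal sketch. It isn't Azure Data Factory code. It uses Python's built-in `sqlite3` module as a stand-in sink, and the table name `sink`, the column names, and the `upsert_records` helper are hypothetical. SQLite's `ON CONFLICT ... DO UPDATE` clause plays the role that `MERGE` plays in other databases, and it assumes SQLite 3.24 or later.

```python
import sqlite3


def upsert_records(conn, records):
    """Insert each (record_id, payload) pair, or update the payload if the ID exists."""
    conn.executemany(
        """
        INSERT INTO sink (record_id, payload) VALUES (?, ?)
        ON CONFLICT(record_id) DO UPDATE SET payload = excluded.payload
        """,
        records,
    )
    conn.commit()


# Hypothetical in-memory sink table keyed by a unique record identifier.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sink (record_id TEXT PRIMARY KEY, payload TEXT)")

batch = [("a1", "first"), ("b2", "second")]
upsert_records(conn, batch)
upsert_records(conn, batch)  # Simulates the same activity being retried.

row_count = conn.execute("SELECT COUNT(*) FROM sink").fetchone()[0]
```

Because each record carries a unique identifier and the write is an upsert, running the same batch twice leaves the same rows in the table, which is the idempotent behavior that this section recommends.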
5858

5959
### Retry policies
6060

6161
You can use retry policies to configure parts of your pipeline to retry if there's a problem, like transient faults in connected resources. In Azure Data Factory, you can configure retry policies on the following pipeline object types:
6262

63-
- [Tumbling window triggers](../data-factory/concepts-pipeline-execution-triggers.md#tumbling-window-trigger).
64-
- [Execution activities](../data-factory/concepts-pipelines-activities.md#execution-activities).
63+
- [Tumbling window triggers](../data-factory/concepts-pipeline-execution-triggers.md#tumbling-window-trigger)
64+
- [Implementation activities](../data-factory/concepts-pipelines-activities.md#execution-activities)
6565

66-
For more information about how to change or disable retry policies for your data factory triggers and activities, see [Pipeline execution and triggers](../data-factory/concepts-pipeline-execution-triggers.md).
66+
For more information about how to change or disable retry policies for your data factory triggers and activities, see [Pipeline runs and triggers](../data-factory/concepts-pipeline-execution-triggers.md).
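
As a rough illustration of where these retry settings live, the following sketch builds the relevant JSON fragments in Python. The property names (`retry`, `retryIntervalInSeconds`, and `timeout` on an activity policy, and `count` and `intervalInSeconds` on a tumbling window trigger's `retryPolicy`) reflect the Azure Data Factory JSON definitions as commonly documented, so verify them against the linked articles before you rely on them. The activity name, the helper functions, and the specific values are hypothetical.

```python
import json


def activity_policy(retries, interval_seconds, timeout="0.01:00:00"):
    """Build the policy block for an execution activity (timeout uses d.hh:mm:ss)."""
    return {
        "timeout": timeout,
        "retry": retries,
        "retryIntervalInSeconds": interval_seconds,
    }


def tumbling_window_retry_policy(count, interval_seconds):
    """Build the retryPolicy fragment for a tumbling window trigger's typeProperties."""
    return {"retryPolicy": {"count": count, "intervalInSeconds": interval_seconds}}


# Hypothetical copy activity that retries three times, one minute apart.
copy_activity = {
    "name": "CopyToSink",
    "type": "Copy",
    "policy": activity_policy(retries=3, interval_seconds=60),
}

print(json.dumps(copy_activity, indent=2))
```

Retries only help if the activity is safe to run more than once, so pair a retry policy with the idempotency practices described earlier.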
6767

6868
## Availability zone support
6969

7070
[!INCLUDE [AZ support description](includes/reliability-availability-zone-description-include.md)]
7171

72-
Azure Data Factory supports *zone redundancy*, which provides resiliency to failures in [availability zones](availability-zones-overview.md). This section describes how each part of the Azure Data Factory service supports zone redundancy.
72+
Azure Data Factory supports zone redundancy, which provides resiliency to failures in [availability zones](availability-zones-overview.md). This section describes how each part of the Azure Data Factory service supports zone redundancy.
7373

7474
### Regions supported
7575

@@ -101,7 +101,7 @@ Zone-redundant Azure Data Factory resources can be deployed in [any region that
101101

102102
### Configure availability zone support
103103

104-
**Azure Data Factory core service:** No configuration required. Azure Data Factory core service automatically supports zone redundancy.
104+
**Core service:** No configuration required. The Data Factory core service automatically supports zone redundancy.
105105

106106
**IRs:**
107107

@@ -119,7 +119,7 @@ Zone-redundant Azure Data Factory resources can be deployed in [any region that
119119

120120
- *Azure IR* scales automatically based on demand, and you don't need to plan or manage capacity.
121121

122-
- *Azure-SSIS IR* requires you to specifically configure the number of nodes that you use. To prepare for availability zone failure, consider *over-provisioning* the capacity of your IR. Over-provisioning allows the solution to tolerate some degree of capacity loss and still continue to function without degraded performance. For more information, see [Manage capacity with over-provisioning](./concept-redundancy-replication-backup.md#manage-capacity-with-over-provisioning).
122+
- *Azure-SSIS IR* requires you to specifically configure the number of nodes that you use. To prepare for availability zone failure, consider over-provisioning the capacity of your IR. Over-provisioning allows the solution to tolerate some degree of capacity loss and still continue to function without degraded performance. For more information, see [Manage capacity by over-provisioning](./concept-redundancy-replication-backup.md#manage-capacity-with-over-provisioning).
123123

124124
- *SHIR* requires you to configure your own capacity and scaling. Consider over-provisioning when you deploy a SHIR.
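
The over-provisioning guidance above can be turned into a quick sizing calculation. This is a minimal sketch under an assumption that the article doesn't state: nodes are spread evenly across availability zones, which is something you control yourself for SHIR machines. The function name and parameters are hypothetical.

```python
import math


def nodes_to_provision(required_nodes, zone_count, zones_lost=1):
    """Return the total node count that still leaves `required_nodes` running
    after `zones_lost` zones fail, assuming an even spread across zones."""
    if zone_count <= zones_lost:
        raise ValueError("zone_count must be greater than zones_lost")
    surviving_zones = zone_count - zones_lost
    # Each surviving zone must carry its share of the required capacity.
    nodes_per_zone = math.ceil(required_nodes / surviving_zones)
    return nodes_per_zone * zone_count


# A workload that needs 4 nodes, spread over 3 zones, tolerating the loss of 1 zone.
total_nodes = nodes_to_provision(required_nodes=4, zone_count=3)
```

In this example, two nodes per zone means that losing any one zone still leaves four nodes, so the workload continues without degraded performance.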
125125

@@ -131,15 +131,15 @@ During normal operations, Azure Data Factory automatically distributes pipeline
131131

132132
**Detection and response:** The Azure Data Factory platform is responsible for detecting a failure in an availability zone and responding. You don't need to do anything to initiate a zone failover in your pipelines or other components.
133133

134-
**Active requests:** Any pipelines and triggers in progress continue to run, and you won't notice a zone failure. However, activities in progress during a zone failure might fail and be restarted. It's important to design activities to be idempotent, which helps them to recover from zone failures and other faults. For more information, see [Transient faults](#transient-faults).
134+
**Active requests:** Any pipelines and triggers in progress continue to run, and you don't experience any immediate disruption from a zone failure. However, activities in progress during a zone failure might fail and be restarted. It's important to design activities to be idempotent, which helps them recover from zone failures and other faults. For more information, see [Transient faults](#transient-faults).
135135

136136
### Failback
137137

138138
When the availability zone recovers, Azure Data Factory automatically fails back to the original zone. You don't need to do anything to initiate a zone failback in your pipelines or other components.
139139

140140
However, if you use the SHIR, you might need to restart your compute resources if they've been stopped.
141141

142-
### Testing for zone failures
142+
### Test for zone failures
143143

144144
For the core service, and for Azure and Azure-SSIS IRs, Azure Data Factory manages traffic routing, failover, and failback for zone-redundant resources. Because this feature is fully managed, you don't need to initiate or validate availability zone failure processes.
145145

@@ -151,11 +151,11 @@ Azure Data Factory resources are deployed into a single Azure region. If the reg
151151

152152
### Microsoft-managed failover to a paired region
153153

154-
Azure Data Factory supports Microsoft-managed failover for data factories in *paired regions*, except for Brazil South and Southeast Asia. In the unlikely event of a prolonged region failure, Microsoft might initiate a regional failover of your Azure Data Factory instance.
154+
Azure Data Factory supports Microsoft-managed failover for data factories in paired regions, except for Brazil South and Southeast Asia. In the unlikely event of a prolonged region failure, Microsoft might initiate a regional failover of your Azure Data Factory instance.
155155

156-
Because of data residency requirements in Brazil South and Southeast Asia, Azure Data Factory data is stored in the local region only by using [Azure Storage zone-redundant storage](../storage/common/storage-redundancy.md#zone-redundant-storage). For Southeast Asia, all data is stored in Singapore. For Brazil South, all data is stored in Brazil.
156+
Because of data residency requirements in Brazil South and Southeast Asia, Azure Data Factory data is stored only in the local region by using [Azure Storage zone-redundant storage](../storage/common/storage-redundancy.md#zone-redundant-storage). For Southeast Asia, all data is stored in Singapore. For Brazil South, all data is stored in Brazil.
157157

158-
For data factories in *nonpaired regions*, or in Brazil South or Southeast Asia, Microsoft doesn't perform regional failover on your behalf.
158+
For data factories in nonpaired regions, or in Brazil South or Southeast Asia, Microsoft doesn't perform regional failover on your behalf.
159159

160160
> [!IMPORTANT]
161161
> Microsoft triggers Microsoft-managed failover. It's likely to occur after a significant delay and is done on a best-effort basis. There are also some exceptions to this process. You might experience some loss of your data factory metadata. The failover of Azure Data Factory resources might occur at a different time than the failover of other Azure services.
@@ -166,37 +166,37 @@ For data factories in *nonpaired regions*, or in Brazil South or Southeast Asia,
166166

167167
To prepare for a failover, there might be some extra considerations, depending on the IR that you use.
168168

169-
- You can configure *Azure IR* to automatically resolve the region that it uses. If the region is set to *auto resolve* and there's an outage in the primary region, the Azure IR automatically fails over to the paired region. This failover is subject to the limitations described in [Microsoft-managed failover to a paired region](#microsoft-managed-failover-to-a-paired-region). To configure the Azure IR region for your activity execution or dispatch in the IR setup, set the region to *auto resolve*.
169+
- You can configure *Azure IR* to automatically resolve the region that it uses. If the region is set to *auto resolve* and there's an outage in the primary region, the Azure IR automatically fails over to the paired region. This failover is subject to [limitations](#microsoft-managed-failover-to-a-paired-region). To configure the Azure IR region for your activity implementation or dispatch in the IR setup, set the region to *auto resolve*.
170170

171-
- *Azure-SSIS IR* failover is managed separately from Microsoft-managed failover of the data factory. For more information, see [Alternative multiple-region approaches](#alternative-multiple-region-approaches).
171+
- *Azure-SSIS IR* failover is managed separately from a Microsoft-managed failover of the data factory. For more information, see [Alternative multiple-region approaches](#alternative-multiple-region-approaches).
172172

173-
- *SHIR* runs on infrastructure that you're responsible for, and so Microsoft-managed failover doesn't apply to SHIRs. For more information, see [Alternative multiple-region approaches](#alternative-multiple-region-approaches).
173+
- *SHIR* runs on infrastructure that you're responsible for, so a Microsoft-managed failover doesn't apply to SHIRs. For more information, see [Alternative multiple-region approaches](#alternative-multiple-region-approaches).
174174

175175
#### Post-failover reconfiguration
176176

177-
After a Microsoft-managed failover is complete, you can then access your Azure Data Factory pipeline in the paired region. However, after the failover completes, you might need to perform some reconfiguration for IRs or other components. This process includes re-establishing the networking configuration.
177+
After a Microsoft-managed failover is complete, you can access your Azure Data Factory pipeline in the paired region. However, after the failover completes, you might need to perform some reconfiguration for IRs or other components. This process includes re-establishing the networking configuration.
178178

179179
### Alternative multiple-region approaches
180180

181181
If you need your pipelines to be resilient to regional outages and you need control over the failover process, consider using a metadata-driven pipeline.
182182

183-
- **Set up source control for your Azure Data Factory** to track and audit any changes made to your metadata. You can use this approach to access your metadata JSON files for pipelines, datasets, linked services, and triggers. Azure Data Factory supports different Git repository types, like Azure DevOps and GitHub. For more information, see [Source control in Azure Data Factory](../data-factory/source-control.md).
183+
- **Set up source control for Azure Data Factory** to track and audit any changes to your metadata. You can use this approach to access your metadata JSON files for pipelines, datasets, linked services, and triggers. Azure Data Factory supports different Git repository types, like Azure DevOps and GitHub. For more information, see [Source control in Azure Data Factory](../data-factory/source-control.md).
184184

185-
- **Use a continuous integration and delivery (CI/CD) system**, such as Azure DevOps, to manage your pipeline metadata and deployments. You can use CI/CD to quickly restore operations to an instance in another region. If a region is unavailable, you can provision a new data factory manually or through automation. After the new data factory is created, you can restore your pipelines, datasets, and linked services JSON from the existing Git repository. For more information, see [Business continuity and disaster recovery (BCDR) for Azure Data Factory and Azure Synapse Analytics pipelines](/azure/architecture/example-scenario/analytics/pipelines-disaster-recovery).
185+
- **Use a continuous integration and continuous delivery (CI/CD) system**, such as Azure DevOps, to manage your pipeline metadata and deployments. You can use CI/CD to quickly restore operations to an instance in another region. If a region is unavailable, you can provision a new data factory manually or through automation. After the new data factory is created, you can restore your pipelines, datasets, and linked services JSON from the existing Git repository. For more information, see [Business continuity and disaster recovery (BCDR) for Azure Data Factory and Azure Synapse Analytics pipelines](/azure/architecture/example-scenario/analytics/pipelines-disaster-recovery).
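
To make the restore step more concrete, the following sketch reads resource definitions from a local clone of the Git repository and returns them in an order that respects their dependencies. It only loads and orders the JSON and doesn't call any Azure API. The folder names (`linkedService`, `dataset`, `pipeline`, `trigger`) are an assumption about how the repository is laid out, and the `load_factory_resources` helper is hypothetical.

```python
import json
from pathlib import Path

# Assumed folder layout of the Git repository. Linked services come first because
# datasets reference them, pipelines reference datasets, and triggers reference pipelines.
DEPLOY_ORDER = ["linkedService", "dataset", "pipeline", "trigger"]


def load_factory_resources(repo_root):
    """Load resource JSON files from a repository clone in dependency order."""
    root = Path(repo_root)
    resources = []
    for folder in DEPLOY_ORDER:
        folder_path = root / folder
        if not folder_path.is_dir():
            continue  # Skip resource types that the repository doesn't contain.
        for path in sorted(folder_path.glob("*.json")):
            with path.open(encoding="utf-8") as handle:
                resources.append((folder, json.load(handle)))
    return resources
```

A deployment script or CI/CD stage could pass each returned definition to whichever deployment mechanism you use for the new data factory in the secondary region.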
186186

187187
Depending on the IR that you use, there might be other considerations.
188188

189-
- *Azure-SSIS IR* uses a database stored in Azure SQL Database or Azure SQL Managed Instance. You can configure geo-replication or a failover group for this database. The Azure-SSIS database is then located in a primary Azure region with read-write access (the *primary role*) and is continuously replicated to a secondary region with read-only access (the *secondary role*). If the primary region is lost, a failover is triggered, which causes the primary and secondary databases to swap roles.
189+
- *Azure-SSIS IR* uses a database stored in Azure SQL Database or Azure SQL Managed Instance. You can configure geo-replication or a failover group for this database. The Azure-SSIS database is located in a primary Azure region that has read-write access. The database is continuously replicated to a secondary region that has read-only access. If the primary region is unavailable, a failover triggers, which causes the primary and secondary databases to swap roles.
190190

191-
You can also configure a dual standby Azure SSIS IR pair that works in sync with Azure SQL Database or Azure SQL Managed Instance failover group.
191+
You can also configure a dual standby Azure SSIS IR pair that works in sync with a SQL Database or SQL Managed Instance failover group.
192192

193193
For more information, see [Configure Azure-SSIS IR for BCDR](../data-factory/configure-bcdr-azure-ssis-integration-runtime.md).
194194

195195
- *SHIR* runs on infrastructure that you manage. If the SHIR is deployed to an Azure VM, you can use [Azure Site Recovery](../site-recovery/site-recovery-overview.md) to trigger [VM failover](../site-recovery/azure-to-azure-architecture.md) to another region.
196196

197197
## Backup and restore
198198

199-
Azure Data Factory enables CI/CD by integrating with source control, which allows you to back up metadata from a data factory instance. This metadata can then be deployed seamlessly into a new environment. For more information, see [Continuous integration and delivery in Azure Data Factory](../data-factory/continuous-integration-delivery.md).
199+
Data Factory supports CI/CD through source control integration, so that you can back up the metadata of a data factory instance. CI/CD pipelines deploy this metadata seamlessly into a new environment. For more information, see [CI/CD in Azure Data Factory](../data-factory/continuous-integration-delivery.md).
200200

201201
## Related content
202202
