articles/data-factory/data-migration-guidance-overview.md
Lines changed: 22 additions & 21 deletions
@@ -13,42 +13,43 @@ ms.tgt_pltfrm: na
13
13
ms.topic: conceptual
14
14
ms.date: 7/30/2019
15
15
---
16
-
# Use Azure Data Factory to migrate data from your data lake or data warehouse to Azure
16
+
# Use Azure Data Factory to migrate data from your data lake or data warehouse to Azure
17
17
18
-
Azure Data Factory can be the tool to do data migration when you want to migrate your data lake or enterprise data warehouse (EDW) to Azure. The data lake migration and data warehouse migration are related to the following scenarios:
18
+
If you want to migrate your data lake or enterprise data warehouse (EDW) to Microsoft Azure, consider using Azure Data Factory. Azure Data Factory is well-suited to the following scenarios:
19
19
20
-
- Big data workload migration from AWS S3, on-prem Hadoop File System to Azure.
21
-
- EDW migration from Oracle Exadata, Netezza, Teradata, AWS Redshift to Azure.
20
+
- Big data workload migration from Amazon Simple Storage Service (Amazon S3) or an on-premises Hadoop Distributed File System (HDFS) to Azure
21
+
- EDW migration from Oracle Exadata, Netezza, Teradata, or Amazon Redshift to Azure
22
22
23
-
Azure Data Factory can move PBs of data for data lake migration, and tens of TB data for data warehouse migration.
23
+
Azure Data Factory can move petabytes (PB) of data for data lake migration, and tens of terabytes (TB) of data for data warehouse migration.
24
24
25
-
## Why Azure Data Factory can be used for data migration
25
+
## Why Azure Data Factory can be used for data migration
26
26
27
-
- Azure Data Factory can easily scale up amount of horsepower to move data in serverless manner with high performance, resilience, and scalability and only pay for what you use.
28
-
- Azure Data Factory has no limitation on data volume and number of files.
29
-
- Azure Data Factory can 100% utilize your network and storage bandwidth to achieve the highest data movement throughput in your environment.
30
-
- Azure Data Factory follows the pay-as-you-go strategy, so that you only need to pay for the time when you are using Azure Data Factory to do the data migration to Azure.
31
-
- Azure Data Factory has ability to perform one-time historical load as well as scheduled incremental load.
32
-
- Azure Data Factory uses Azure IR for moving data between publicly accessible data lake/warehouse endpoints, or alternatively use self-hosted IR for moving data for data lake/warehouse endpoints inside VNet or behind firewall.
33
-
- Azure Data Factory has enterprise-grade security: either use MSI or Service Identity for secured service-to-service integration, or alternatively leverage Azure Key Vault for credential management.
34
-
- Azure Data Factory provides a code-free authoring experience and rich built-in monitoring dashboard.
27
+
- Azure Data Factory can easily scale up the amount of processing power to move data in a serverless manner with high performance, resilience, and scalability. And you pay only for what you use. Also note the following:
28
+
- Azure Data Factory has no limitations on data volume or on the number of files.
29
+
- Azure Data Factory can fully use your network and storage bandwidth to achieve the highest data movement throughput in your environment.
30
+
- Azure Data Factory uses a pay-as-you-go method, so that you pay only for the time you actually use to run the data migration to Azure.
31
+
- Azure Data Factory can perform both a one-time historical load and scheduled incremental loads.
32
+
- Azure Data Factory uses Azure integration runtime (IR) to move data between publicly accessible data lake and warehouse endpoints. It can also use self-hosted IR for moving data for data lake and warehouse endpoints inside Azure Virtual Network (VNet) or behind a firewall.
33
+
- Azure Data Factory has enterprise-grade security: You can use managed service identity (MSI) or Service Identity for secured service-to-service integration, or use Azure Key Vault for credential management.
34
+
- Azure Data Factory provides a code-free authoring experience and a rich, built-in monitoring dashboard.
35
35
36
36
## Online vs. offline data migration
37
37
38
-
Azure Data Factory is a typical online data migration tool to transfer data over network (Internet, ER, or VPN), where offline data migration is letting people physically ship datatransfer devices from your organization to Azure Data Center.
38
+
Azure Data Factory is a standard online data migration tool to transfer data over a network (internet, Azure ExpressRoute (ER), or VPN). With offline data migration, by contrast, users physically ship data-transfer devices from their organization to an Azure Data Center.
39
39
40
-
There are three key considerations when selecting online vs. offline migration approach:
40
+
There are three key considerations when you choose between an online and offline migration approach:
41
41
42
-
- Size of data to be migrated.
43
-
- Network bandwidth.
44
-
- Migration window.
42
+
- Size of data to be migrated
43
+
- Network bandwidth
44
+
- Migration window
45
45
46
-
If you want to complete the data migration within two weeks (migration window), you can see a cut line in the picture below to show when it is good to use online migration tool (Azure Data Factory) with different data size and network bandwidth.
46
+
For example, assume you plan to use Azure Data Factory to complete your data migration within two weeks (your *migration window*). Notice the pink/blue cut line in the following table. The lowest pink cell for any given column shows the data size/network bandwidth pairing whose migration window is closest to but less than two weeks. (Any size/bandwidth pairing in a blue cell has an online migration window of more than two weeks.)
47
47
48
48

49
+
This table helps you determine whether you can meet your intended migration window through online migration (Azure Data Factory) based on the size of your data and your available network bandwidth. If the online migration window is more than two weeks, you'll want to use offline migration.
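The online-versus-offline decision above comes down to simple arithmetic on data size and bandwidth. The following is a minimal sketch of that calculation, not part of Azure Data Factory. The function names `transfer_days` and `fits_online_window` are hypothetical, and the sketch assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) and a link that is sustained at the stated utilization for the whole transfer.

```python
def transfer_days(data_tb: float, bandwidth_mbps: float, utilization: float = 1.0) -> float:
    """Estimate the days needed to move data_tb terabytes over a bandwidth_mbps link.

    Assumption: decimal units and a constant effective throughput of
    bandwidth_mbps * utilization for the entire transfer.
    """
    if bandwidth_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("bandwidth must be positive and utilization in (0, 1]")
    total_bits = data_tb * 1e12 * 8                      # terabytes -> bits
    effective_bps = bandwidth_mbps * 1e6 * utilization   # megabits/s -> bits/s
    return total_bits / effective_bps / 86_400           # seconds -> days


def fits_online_window(data_tb: float, bandwidth_mbps: float,
                       window_days: float = 14.0, utilization: float = 1.0) -> bool:
    """Return True if the estimated online transfer fits inside the migration window."""
    return transfer_days(data_tb, bandwidth_mbps, utilization) <= window_days


if __name__ == "__main__":
    # Hypothetical data size / bandwidth pairings, checked against a two-week window.
    for size_tb, mbps in [(100, 1_000), (1_000, 1_000), (1_000, 10_000)]:
        days = transfer_days(size_tb, mbps)
        verdict = "online" if fits_online_window(size_tb, mbps) else "offline"
        print(f"{size_tb} TB over {mbps} Mbps: {days:.1f} days -> {verdict}")
```

Under these assumptions, 100 TB over a 1-Gbps link takes a little over nine days, while 1 PB over the same link takes roughly three months. Real throughput is usually lower than the nominal link speed, so passing a `utilization` below 1.0 gives a more conservative estimate.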
49
50
50
51
> [!NOTE]
51
-
> The benefit of online migration approach is that you can achieve both historical data loading and incremental feeds end to end by one tool. By doing so, the data can be keeping synchronized between existing and new store during the entire migration window so that you can rebuild your ETL logic on the new store with refreshed data.
52
+
> By using online migration, you can achieve both historical data loading and incremental feeds end-to-end through a single tool. Through this approach, your data can be kept synchronized between the existing store and the new store during the entire migration window. This means you can rebuild your ETL logic on the new store with refreshed data.