Commit 76e6c28

Merge pull request #170153 from normesta/WANDISCO
WANDISCO GA content
2 parents 3e97630 + 48cde12 commit 76e6c28

6 files changed

+76
-45
lines changed


articles/storage/blobs/migrate-gen2-wandisco-live-data-platform.md

Lines changed: 76 additions & 45 deletions
@@ -1,115 +1,146 @@
11
---
2-
title: Data Lake Storage and WANdisco LiveData Platform for Azure (preview)
3-
description: Migrate on-premises Hadoop data to Azure Data Lake Storage Gen2 by using WANdisco LiveData Platform for Azure.
2+
title: Data Lake Storage and WANdisco LiveData Platform for Azure
3+
description: Learn how to migrate petabytes of on-premises Hadoop data to Azure Data Lake Storage Gen2 file systems without interrupting data operations or requiring downtime.
44
author: normesta
55
ms.topic: how-to
66
ms.author: normesta
77
ms.reviewer: b-pauls
8-
ms.date: 11/17/2020
8+
ms.date: 10/26/2021
99
ms.service: storage
1010
ms.custom: references_regions
1111
ms.subservice: data-lake-storage-gen2
1212
---
1313

14-
# Meet demanding migration requirements with WANdisco LiveData Platform for Azure (preview)
14+
# Migrate on-premises Hadoop data to Azure Data Lake Storage Gen2 with WANdisco LiveData Platform for Azure
1515

16-
Migrate on-premises Hadoop data to Azure Data Lake Storage Gen2 by using [WANdisco LiveData Platform for Azure](https://docs.wandisco.com/live-data-platform/docs/landing/). This platform eliminates the need for application downtime, remove the chance of data loss, and ensure data consistency even while operations continue on-premises.
16+
[WANdisco LiveData Platform for Azure](https://docs.wandisco.com/live-data-platform/docs/landing/) migrates petabytes of on-premises Hadoop data to Azure Data Lake Storage Gen2 file systems without interrupting data operations or requiring downtime. The platform's continuous checks prevent data from being lost while keeping it consistent at both ends of transference even while it undergoes modification.
1717

18-
> [!NOTE]
19-
> WANdisco LiveData Platform for Azure is in public preview. For regional availability, see [Supported regions](https://docs.wandisco.com/live-data-platform/docs/prereq#supported-regions).
20-
21-
The platform consists of two services: [LiveData Migrator for Azure](https://www.wandisco.com/products/livedata-migrator-for-azure) to migrate actively used data from on-premises environments to Azure storage, and [LiveData Plane for Azure](https://www.wandisco.com/products/livedata-plane-for-azure) which ensures that all modified data or ingest data are replicated consistently.
18+
The platform consists of two services. [LiveData Migrator for Azure](https://www.wandisco.com/products/livedata-migrator-for-azure) migrates actively used data from on-premises environments to Azure storage, and [LiveData Plane for Azure](https://www.wandisco.com/products/livedata-plane-for-azure) ensures that all modified or ingested data is replicated consistently.
2219

2320
> [!div class="mx-imgBorder"]
24-
> ![Live Data Platform Overview Illustration](./media/migrate-gen2-wandisco-live-data-platform/live-data-platform-overview.png)
21+
> ![Live Data Platform Overview illustration](./media/migrate-gen2-wandisco-live-data-platform/live-data-platform-overview.png)
2522
26-
You can manage both services by using the Azure portal and the Azure CLI, and both follow the same metered, pay-as-you-go billing model as all other Azure services. LiveData Platform for Azure consumption will appear on the same monthly Azure bill and will provide a consistent and convenient way to track and monitor your usage.
23+
Manage both services by using the Azure portal and the Azure CLI. Each service follows the same metered, pay-as-you-go billing model as all other Azure services: data consumption in LiveData Platform for Azure will appear on the monthly Azure bill, which will provide usage metrics.
2724

2825
Unlike migrating data *offline* by [copying static information to Azure Data Box](./data-lake-storage-migrate-on-premises-hdfs-cluster.md), or by using Hadoop tools like [DistCp](https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html), you can maintain full operation of your business systems during *online* migration with WANdisco LiveData for Azure. Keep your big data environments operating even while moving their data to Azure.
2926

30-
## Key features of WANdisco LiveData Platform for Azure
27+
## Key benefits of WANdisco LiveData Platform for Azure
28+
29+
[WANdisco LiveData Platform for Azure](https://docs.wandisco.com/live-data-platform/docs/landing/)'s wide-area network capable consensus engine achieves data consistency, and conducts real-time data replication at scale. See the following video for more information:<br><br>
30+
31+
>[!VIDEO https://www.youtube.com/embed/KRrmcYPxEho]
32+
33+
Key benefits of the platform include the following:
34+
35+
- **Data accuracy**: End-to-end validation of data prevents data loss and ensures transferred data is fit for use.
3136

32-
[WANdisco LiveData Platform for Azure](https://docs.wandisco.com/live-data-platform/docs/landing/) uses a unique, wide-area network capable consensus engine to achieve data consistency, and to conduct data replication at scale while applications can continue to modify the data under replication. <br><br>
37+
- **Data consistency**: Keep data volumes automatically consistent between environments even while they undergo continuous change.
3338

34-
> [!VIDEO https://www.youtube.com/embed/KRrmcYPxEho]
39+
- **Data efficiency**: Transfer large data volumes continuously with full control of bandwidth consumption.
40+
41+
- **Downtime elimination**: Freely create, modify, read, and delete data with other applications during migration, without the need to disrupt business operations during data transference to Azure. Continue to operate applications, analytics infrastructure, ingest jobs, and other processing.
42+
43+
- **Simple use**: Use the Platform's Azure integration to create, configure, schedule, and track the progress of automated migrations. Additionally, configure selective data replication, Hive metadata, data security, and confidentiality as needed.
44+
45+
## Key features of WANdisco LiveData Platform for Azure
3546

3647
Key features of the platform include the following:
3748

38-
- **Data consistency**: Solve the challenges of migrating large data volumes between environments and keeping those data consistent across storage systems throughput migration, even while they are under continual change. Employ WANdisco's unique, wide-area network capable consensus engine directly in Azure to achieve data consistency and to migrate data with consistency guarantees throughout data changes.
49+
- **Metadata Migration**: In addition to HDFS data, migrate metadata (from Hive and other storages) with LiveData Migrator for Azure.
50+
51+
- **Scheduled Transfer**: Use LiveData Migrator for Azure to control and automate when data transfer will initiate, eliminating the need to manually migrate changes to data.
52+
53+
- **Kerberos**: LiveData Migrator for Azure supports Kerberized clusters.
3954

40-
- **Maintain operations**: Because applications can continue to create, modify, read, and delete data during migration, there is no need to disrupt business operations or introduce an outage window just to migrate big data to Azure. Continue to operate applications, analytics infrastructure, ingest jobs, and other processing.
55+
- **Exclusion Templates**: Create rules in LiveData Migrator for Azure to prevent certain file sizes or file names (defined using glob patterns) from being migrated to your target storage. Create exclusion templates in the Azure portal or with the CLI, and apply them to any number of migrations.
4156

42-
- **Validate outcomes**: End-to-end validation that your data can be used effectively once migrated to Azure requires that you run production application workloads against them. Only a LiveData Service provides this without introducing the risk of data divergence, by maintaining data consistency regardless of whether change occurs at the source or target of your migration. Test and validate application behavior without risk or change to your processes and systems.
57+
- **Path Mappings**: Define alternate target paths for specific target file systems, which automatically move transferred data to directories you specify.
4358

44-
- **Reduce complexity**: Eliminate the need to create and manage scheduled jobs to copy data by migrating data through automation. Use the deep integration with Azure as a control plane to manage and monitor migration progress, including selective data replication, Hive metadata, data security and confidentiality.
59+
- **Bandwidth Management**: Configure the maximum amount of network bandwidth LiveData Migrator for Azure can use to prevent bandwidth over consumption.
4560

46-
- **Efficiency**: Maintain high throughput and performance, and scale to big data volumes easily. With control of bandwidth consumption, you can ensure that you can meet your migration goals without impacting other system operations.
61+
- **Exclusions**: Define template queries that prevent the migration of any files and directories that meet the criteria, allowing you to selectively migrate data from your source system.
62+
63+
- **Metrics**: View details about data transfer in LiveData Migrator for Azure, such as files transferred over time, excluded paths, items that failed to transfer and more.
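
The **Exclusion Templates** feature added above refers to file names "defined using glob patterns". As a rough illustration of how glob matching behaves, the sketch below uses the shell's own `case` pattern matching. It is not LiveData Migrator's matcher, whose exact glob dialect may differ; the `matches_any` helper, the patterns, and the file names are all hypothetical.

```shell
#!/bin/sh
# Illustration only: POSIX shell glob matching, NOT LiveData Migrator's own
# exclusion engine. Patterns and file names below are hypothetical.

# matches_any NAME PATTERN...
# Succeeds (exit status 0) if NAME matches at least one glob PATTERN.
matches_any() {
  name=$1
  shift
  for pattern in "$@"; do
    case "$name" in
      # Unquoted on purpose so the value is treated as a glob pattern.
      $pattern) return 0 ;;
    esac
  done
  return 1
}

# Decide per file whether a hypothetical exclusion template would skip it.
for f in part-00000.parquet part-00000.parquet.tmp _SUCCESS; do
  if matches_any "$f" '*.tmp' '_SUCCESS'; then
    echo "exclude $f"
  else
    echo "migrate $f"
  fi
done
```

With these sample patterns, only `part-00000.parquet` is reported as a file to migrate; the temporary file and the `_SUCCESS` marker are excluded.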
4764

4865
## Migrate big data faster without risk
4966

50-
The first service of WANdisco LiveData Platform for Azure is [LiveData Migrator for Azure](https://www.wandisco.com/products/livedata-migrator-for-azure); a solution for migrating actively used data from on-premises environments to Azure storage. LiveData Migrator for Azure is provisioned and managed entirely from the Azure portal or Azure CLI, and operates alongside your Hadoop cluster on-premises without any configuration change, application modifications, or service restarts to begin migrating data immediately.
67+
The first service included in WANdisco LiveData Platform for Azure is [LiveData Migrator for Azure](https://www.wandisco.com/products/livedata-migrator-for-azure), which migrates data from on-premises environments to Azure Storage. Once you've deployed LiveData Migrator to your on-premises Hadoop cluster, it will automatically create the best configuration for your file system. From there, supply the Kerberos details for the system. LiveData Migrator for Azure will then be ready to migrate data to Azure Storage.
5168

5269
> [!div class="mx-imgBorder"]
53-
> ![LiveData Migrator for Azure Architecture](./media/migrate-gen2-wandisco-live-data-platform/live-data-migrator-architecture.png)
70+
> ![LiveData Migrator for Azure Architecture](./media/migrate-gen2-wandisco-live-data-platform/live-data-migrator-architecture-1.png)
71+
72+
Before you start with LiveData Migrator for Azure, review these [prerequisites](https://docs.wandisco.com/live-data-platform/docs/prereq/).
73+
74+
To perform a migration:
5475

55-
Big data migrations can be complex and challenging. Moving petabytes of information without disrupting business operations has been impossible to achieve with offline data copy technologies. [LiveData Migrator for Azure](https://www.wandisco.com/products/livedata-migrator-for-azure) offers simple deployment and can establish a LiveData Service with continuous data migration and replication while applications read, write, and modify the data being migrated.
76+
1. In the Azure CLI:
5677

57-
Performing a migration is as simple as these three steps:
78+
- Register for the WANdisco resource provider in the Azure CLI by running `az provider register --namespace Wandisco.Fusion --consent-to-permissions`.
79+
- Accept the metered billing terms of LiveData Platform by running `az vm image terms accept --offer ldma --plan metered-v1 --publisher Wandisco --subscription <subscriptionID>`.
5880

59-
1. Provision the LiveData Migrator instance from the Azure portal to your on-premises Hadoop cluster. No cluster change or downtime is needed, and applications can continue to operate.
81+
2. Deploy a LiveData Migrator instance from the Azure portal to your on-premises Hadoop cluster. (You do not need to make changes to or restart the cluster.)
6082

6183
> [!div class="mx-imgBorder"]
6284
> ![Create a LiveData Migrator instance](./media/migrate-gen2-wandisco-live-data-platform/create-live-data-migrator.png)
6385
64-
2. Define the target Azure Data Lake Storage Gen2-enabled storage account.
86+
> [!NOTE]
87+
> WANdisco LiveData Migrator for Azure provides the option to create a Hadoop Test Cluster.
88+
89+
3. Configure Kerberos details, if applicable.
90+
91+
4. Define the target Azure Data Lake Storage Gen2-enabled storage account.
6592

6693
> [!div class="mx-imgBorder"]
6794
> ![Create a LiveData Migrator target](./media/migrate-gen2-wandisco-live-data-platform/create-target.png)
6895
69-
3. Define the location of the data that you want to migrate, for example: `/user/hive/warehouse`, and start the migration.
96+
5. Define the location of the data that you want to migrate, for example: `/user/hive/warehouse`.
7097

7198
> [!div class="mx-imgBorder"]
7299
> ![Create a LiveData Migrator migration](./media/migrate-gen2-wandisco-live-data-platform/create-migration.png)
73100
74-
Monitor your migration progress through standard Azure tooling including the Azure CLI and Azure portal, and continue to use your on-premises environment throughout. Before you start, review these [prerequisites](https://docs.wandisco.com/live-data-platform/docs/prereq/).
101+
6. Start the migration.
102+
103+
Monitor your migration progress through standard Azure tooling including the Azure CLI and Azure portal.
75104

76-
## Replicate data under active change
105+
For more detailed instructions, see the [LiveData Migrator for Azure How-To video series](https://fast.wistia.com/embed/channel/qg51p8erky).
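
The two Azure CLI commands introduced in step 1 can be collected into one small script. This is a minimal sketch rather than an official procedure: the `az` commands are copied from the step text, while the `run` helper and the `DRY_RUN` and `SUBSCRIPTION_ID` variables are hypothetical names added for illustration. By default the script only prints the commands, so nothing is registered or accepted until you opt in.

```shell
#!/bin/sh
# Sketch of step 1 ("In the Azure CLI"). The az commands come from the article;
# run, DRY_RUN and SUBSCRIPTION_ID are illustrative names, not WANdisco's.
set -eu

# Placeholder from the article; replace with your own subscription ID.
SUBSCRIPTION_ID="${SUBSCRIPTION_ID:-<subscriptionID>}"

# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Register the WANdisco resource provider.
register_provider() {
  run az provider register --namespace Wandisco.Fusion --consent-to-permissions
}

# Accept the metered billing terms of LiveData Platform.
accept_terms() {
  run az vm image terms accept --offer ldma --plan metered-v1 \
    --publisher Wandisco --subscription "$SUBSCRIPTION_ID"
}

register_provider
accept_terms
```

To execute the commands for real, sign in with `az login` first and then run the script with `DRY_RUN=0` and `SUBSCRIPTION_ID` set to your subscription.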
77106

78-
Large-scale migrations of on-premises data lakes to Azure need application testing and validation. Being able to do this without the risk of introducing data changes that will create multiple sources of truth that cannot be easily reconciled is critical to eliminating risk and minimizing the cost of moving to Azure. [LiveData Plane for Azure](https://www.wandisco.com/products/livedata-plane-for-azure) uses WANdisco's coordination engine technology to overcome these concerns.
107+
## Bidirectionally replicate data under active change with LiveData Plane for Azure
108+
109+
The second service included in the LiveData Platform is [LiveData Plane for Azure](https://www.wandisco.com/products/livedata-plane-for-azure). LiveData Plane uses WANdisco's coordination engine to keep data consistent across many on-premises Hadoop clusters and Azure Storage by intelligently applying changes to data on all systems, removing the risk of data conflicts at different points of use.
79110

80111
> [!div class="mx-imgBorder"]
81112
> ![LiveData Plane for Azure Architecture](./media/migrate-gen2-wandisco-live-data-platform/live-data-plane-architecture.png)
82113
83-
Keep your data consistent across on-premises Hadoop clusters and Azure storage with LiveData Plane for Azure after initial migration:
114+
After initial migration, keep your data consistent with LiveData Plane for Azure:
84115

85-
1. Provision LiveData Plane for Azure on-premises and in Azure, starting from the Azure portal. No application changes are required.
116+
1. Deploy LiveData Plane for Azure on-premises and in Azure, starting from the Azure portal. No application changes are required.
86117

87-
2. Configure replication rules that cover that data locations that you want to keep consistent, for example: `/user/contoso/sales/region/WA`.
118+
2. Configure replication rules that cover the data locations that you want to keep consistent, for example: `/user/contoso/sales/region/WA`.
88119

89-
3. Run applications that access and modify data in either location as a Hadoop-compatible file system as you need.
120+
3. Run applications that access and modify data in either location as you need.
90121

91-
LiveData Plane for Azure keeps your data consistent without imposing significant overhead on cluster operation or application performance. Modify or ingest data while all changes are replicated consistently.
122+
LiveData Plane for Azure consistently replicates data changes across all environments without significant impact on cluster operation or application performance.
92123

93-
## Next steps
124+
## Test drive or Trial
94125

95-
- [LiveData Platform for Azure](https://docs.wandisco.com/live-data-platform/docs/landing/) for Azure is used like any other Azure resource, and is available in preview now.
126+
From [LiveData Platform for Azure's Marketplace page](https://azuremarketplace.microsoft.com/marketplace/apps/wandisco.ldma?tab=Overview), you have two options:
96127

97-
- Understand the [prerequisites](https://docs.wandisco.com/live-data-platform/docs/prereq/), plan your migration, and complete a large-scale migration rapidly with LiveData Migrator for Azure.
128+
- The **Get It Now** button launches the service in your subscription. From there, you may use your own Hadoop cluster or WANdisco's Trial cluster.
98129

99-
- Try out the LiveData Migrator without needing to have an on-premise Hadoop cluster by using the [HDFS Sandbox](https://docs.wandisco.com/live-data-platform/docs/create-sandbox-intro/).
130+
- Click **Test Drive** to test LiveData Migrator for Azure in an environment that is preconfigured and hosted for you. This enables you to try LiveData Migrator for Azure before adding it to your subscription, without any cost or risk to your data.
100131

101-
## See also
132+
Watch the [Test Drive Demonstration Video](https://fast.wistia.net/embed/channel/qg51p8erky?wchannelid=qg51p8erky&wmediaid=ute6gsc60w) to see the test drive in action.
102133

103-
- [LiveData Migrator for Azure on Azure Marketplace](https://azuremarketplace.microsoft.com/marketplace/apps/wandisco.ldm?tab=Overview)
134+
## Next Steps
104135

105-
- [LiveData Plane for Azure on Azure Marketplace](https://azuremarketplace.microsoft.com/marketplace/apps/wandisco.ldp?tab=Overview)
136+
- [Plan and create a migration in LiveData Migrator for Azure](https://azuremarketplace.microsoft.com/marketplace/apps/wandisco.ldma).
106137

107-
- [LiveData Migrator for Azure plans and pricing](https://azuremarketplace.microsoft.com/marketplace/apps/wandisco.ldm?tab=PlansAndPrice)
138+
## See also
108139

109-
- [LiveData Plane for Azure plans and pricing](https://azuremarketplace.microsoft.com/marketplace/apps/wandisco.ldp?tab=PlansAndPrice)
140+
- [LiveData Migrator for Azure on Azure Marketplace](https://azuremarketplace.microsoft.com/marketplace/apps/wandisco.ldma?tab=Overview)
141+
142+
- [LiveData Migrator for Azure plans and pricing](https://azuremarketplace.microsoft.com/marketplace/apps/wandisco.ldma?tab=PlansAndPricee)
110143

111144
- [LiveData Platform for Azure Frequently Asked Questions](https://docs.wandisco.com/live-data-platform/docs/faq/)
112145

113146
- [Known Issues with LiveData Platform for Azure](https://docs.wandisco.com/live-data-platform/docs/known-issues/)
114-
115-
- [Introduction to Azure Data Lake Storage Gen2](data-lake-storage-introduction.md)
