Commit 7d82ac4

Merge pull request #78783 from alkohli/mig

added SPO article

2 parents 2698d27 + 4ea8f33

File tree: 4 files changed, +121 −18 lines changed

articles/databox/TOC.yml

Lines changed: 4 additions & 0 deletions

@@ -161,8 +161,12 @@
       href: data-box-disk-contact-microsoft-support.md
   - name: Migrate
     items:
+    - name: To SharePoint Online
+      href: data-box-heavy-migrate-spo.md
     - name: To Azure File Sync
       href: ../storage/files/storage-sync-offline-data-transfer.md?toc=/azure/databox/toc.json&bc=/azure/databox/breadcrumb/toc.json
+    - name: From HDFS store
+      href: ../storage/blobs/data-lake-storage-migrate-on-premises-hdfs-cluster.md?toc=/azure/databox/toc.json&bc=/azure/databox/breadcrumb/toc.json
   - name: Resources
     items:
     - name: Data Box product
articles/databox/data-box-heavy-migrate-spo.md

Lines changed: 91 additions & 0 deletions

@@ -0,0 +1,91 @@
---
title: Use Azure Data Box Heavy to migrate file share content to SharePoint Online | Microsoft Docs
description: Use this tutorial to learn how to migrate file share content to SharePoint Online using your Azure Data Box Heavy.
services: databox
author: alkohli

ms.service: databox
ms.subservice: heavy
ms.topic: tutorial
ms.date: 06/05/2019
ms.author: alkohli
---

# Use the Azure Data Box Heavy to migrate your file share content to SharePoint Online

Use your Azure Data Box Heavy and the SharePoint Migration Tool (SPMT) to easily migrate your file share content to SharePoint Online and OneDrive. By using Data Box Heavy, you can remove the dependency on your wide-area network (WAN) link to transfer the data.

Microsoft Azure Data Box is a service that lets you order a device from the Microsoft Azure portal. You can then copy terabytes of data from your servers to the device. After you ship the device back to Microsoft, your data is copied into Azure. Depending on the size of the data you intend to transfer, you can choose from:

- [Data Box Disk](https://docs.microsoft.com/azure/databox/data-box-disk-overview), with 35-TB usable capacity per order, for small-to-medium datasets.
- [Data Box](https://docs.microsoft.com/azure/databox/data-box-overview), with 80-TB usable capacity per device, for medium-to-large datasets.
- [Data Box Heavy](https://docs.microsoft.com/azure/databox/data-box-heavy-overview), with 770-TB usable capacity per device, for large datasets. Data Box Heavy is currently in preview.

This article describes how to use the Data Box Heavy to migrate your file share content to SharePoint Online.
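To make the capacity comparison concrete, the choice can be sketched as a tiny helper. This is an illustration only (the function and its fallback string are hypothetical); the capacities are the usable figures listed above.

```python
# Usable capacities in TB, taken from the list above.
CAPACITY_TB = {"Data Box Disk": 35, "Data Box": 80, "Data Box Heavy": 770}

def suggest_device(dataset_tb: float) -> str:
    """Return the smallest option whose usable capacity fits the dataset."""
    for name, cap in sorted(CAPACITY_TB.items(), key=lambda kv: kv[1]):
        if dataset_tb <= cap:
            return name
    # Larger than a single Data Box Heavy: split across multiple devices.
    return "multiple Data Box Heavy devices"

print(suggest_device(20))    # Data Box Disk
print(suggest_device(500))   # Data Box Heavy
```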

## Requirements and costs

### For Data Box Heavy

- Data Box Heavy is available only for Enterprise Agreement (EA), Cloud Solution Provider (CSP), or Azure sponsorship offers. If your subscription doesn't fall into any of these types, contact Microsoft Support to upgrade your subscription, or see [Azure subscription pricing](https://azure.microsoft.com/pricing/).
- There is a fee to use Data Box Heavy. Make sure to review the [Data Box Heavy pricing](https://azure.microsoft.com/pricing/details/databox/heavy/).

### For SharePoint Online

- Review the [minimum requirements for the SharePoint Migration Tool (SPMT)](https://docs.microsoft.com/sharepointmigration/how-to-use-the-sharepoint-migration-tool).

## Workflow overview

This workflow requires you to perform steps on Azure Data Box Heavy as well as in SharePoint Online.

The following steps relate to your Azure Data Box Heavy.

1. Order Azure Data Box Heavy.
2. Receive and set up your device.
3. Copy data from your on-premises file share to the folder for Azure Files on your device.
4. After the copy is complete, ship the device back as per the instructions.
5. Wait for the data to completely upload to Azure.

The following steps relate to SharePoint Online.

6. Create a VM in the Azure portal and mount the Azure file share on it.
7. Install the SPMT tool on the Azure VM.
8. Run the SPMT tool using the Azure file share as the *source*.
9. Complete the final steps of the tool.
10. Verify and confirm your data.

## Use Data Box Heavy to copy data

Take the following steps to copy data to your Data Box Heavy.

1. [Order your Data Box Heavy](data-box-heavy-deploy-ordered.md).
2. After you receive your Data Box Heavy, [set up the Data Box Heavy](data-box-heavy-deploy-set-up.md). You'll cable and configure both the nodes on your device.
3. [Copy data to Azure Data Box Heavy](data-box-heavy-deploy-copy-data.md). While copying, make sure to:

    - Use only the *AzureFile* folder on the Data Box Heavy to copy the data. This is because you want the data to end up in an Azure file share, not in block blobs or page blobs.
    - Copy files to a subfolder within the *AzureFile* folder. A subfolder within the *AzureFile* folder creates a file share; files copied directly to the *AzureFile* folder fail and are uploaded as block blobs. This is the file share that you will mount on your VM in the next step.
    - Copy data to both nodes of your Data Box Heavy.
4. Run [Prepare to ship](data-box-heavy-deploy-picked-up.md#prepare-to-ship) on your device. A successful prepare to ship ensures a successful upload of files to Azure.
5. [Return the device](data-box-heavy-deploy-picked-up.md#ship-data-box-heavy-back).
6. [Verify the data upload to Azure](data-box-heavy-deploy-picked-up.md#verify-data-upload-to-azure).
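The *AzureFile* subfolder rule in step 3 can be expressed as a quick pre-copy check. This is a hypothetical sketch of the folder semantics described above, not part of the Data Box tooling:

```python
from pathlib import PurePosixPath

def classify_target(path: str) -> str:
    """Predict where a file copied onto the device ends up in Azure.

    A subfolder under *AzureFile* becomes an Azure file share; files
    dropped directly into *AzureFile* are uploaded as block blobs instead.
    """
    parts = PurePosixPath(path).parts
    if "AzureFile" not in parts:
        return "not-azure-files"
    depth = len(parts) - parts.index("AzureFile") - 1
    # depth 1: directly under AzureFile; depth >= 2: inside a share subfolder.
    return "file-share" if depth >= 2 else "block-blob"

print(classify_target("AzureFile/myshare/docs/report.docx"))  # file-share
print(classify_target("AzureFile/stray-file.txt"))            # block-blob
```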

## Use SPMT to migrate data

After you receive confirmation from the Azure data team that your data copy is complete, you can proceed to migrate your data to SharePoint Online.

For best performance and connectivity, we recommend that you create an Azure virtual machine (VM).

1. Sign in to the Azure portal, and then [create a virtual machine](../virtual-machines/windows/quick-create-portal.md).
2. [Mount the Azure file share onto the VM](../storage/files/storage-how-to-use-files-windows.md#mount-the-azure-file-share-with-file-explorer).
3. [Download the SharePoint Migration Tool](http://spmtreleasescus.blob.core.windows.net/install/default.htm) and install it on your Azure VM.
4. Start the SharePoint Migration Tool. Click **Sign in** and enter your Office 365 username and password.
5. When prompted **Where is your data?**, select **File share**. Enter the path to the Azure file share where your data is located.
6. Follow the remaining prompts as normal, including your target location. For more information, go to [How to use the SharePoint Migration Tool](https://docs.microsoft.com/sharepointmigration/how-to-use-the-sharepoint-migration-tool).
> [!IMPORTANT]
> - The speed at which data is ingested into SharePoint Online is affected by several factors, regardless of whether your data is already in Azure. Understanding these factors helps you plan and maximize the efficiency of your migration. For more information, go to [SharePoint Online and OneDrive migration speed](/sharepointmigration/sharepoint-online-and-onedrive-migration-speed).
> - There is a risk of losing existing permissions on files when migrating the data to SharePoint Online. You may also lose certain metadata, such as *Created by* and *Date modified by*.
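As a planning aid for the speed factors noted above, a back-of-envelope estimate can be sketched as follows. The function and the 400-GB/day figure are illustrative assumptions, not published SPMT numbers; measure your own throughput in a pilot migration.

```python
def estimate_migration_days(total_gb: float, rate_gb_per_day: float) -> float:
    """Rough SPMT migration duration: data size divided by observed throughput."""
    if rate_gb_per_day <= 0:
        raise ValueError("throughput must be positive")
    return total_gb / rate_gb_per_day

# For example, 4 TB of file share content at an observed 400 GB/day:
print(estimate_migration_days(4000, 400))  # 10.0
```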

## Next steps

[Order your Data Box Heavy](./data-box-heavy-deploy-ordered.md)

articles/storage/blobs/data-lake-storage-migrate-on-premises-HDFS-cluster.md

Lines changed: 26 additions & 18 deletions

@@ -5,21 +5,21 @@ services: storage
 author: normesta

 ms.service: storage
-ms.date: 03/01/2019
+ms.date: 06/05/2019
 ms.author: normesta
 ms.topic: article
 ms.component: data-lake-storage-gen2
 ---

 # Use Azure Data Box to migrate data from an on-premises HDFS store to Azure Storage

-You can migrate data from an on-premises HDFS store of your Hadoop cluster into Azure Storage (blob storage or Data Lake Storage Gen2) by using a Data Box device.
+You can migrate data from an on-premises HDFS store of your Hadoop cluster into Azure Storage (blob storage or Data Lake Storage Gen2) by using a Data Box device. You can choose from an 80-TB Data Box or a 770-TB Data Box Heavy.

 This article helps you complete these tasks:

-:heavy_check_mark: Copy your data to a Data Box device.
+:heavy_check_mark: Copy your data to a Data Box or a Data Box Heavy device.

-:heavy_check_mark: Ship the Data Box device to Microsoft.
+:heavy_check_mark: Ship the device back to Microsoft.

 :heavy_check_mark: Move the data onto your Data Lake Storage Gen2 storage account.
@@ -33,23 +33,23 @@ You need these things to complete the migration.

 * An on-premises Hadoop cluster that contains your source data.

-* An [Azure Data Box device](https://azure.microsoft.com/services/storage/databox/).
+* An [Azure Data Box device](https://azure.microsoft.com/services/storage/databox/).

-    - [Order your Data Box](https://docs.microsoft.com/azure/databox/data-box-deploy-ordered). While ordering your Box, remember to choose a storage account that **doesn't** have hierarchical namespaces enabled on it. This is because Data Box does not yet support direct ingestion into Azure Data Lake Storage Gen2. You will need to copy into a storage account and then do a second copy into the ADLS Gen2 account. Instructions for this are given in the steps below.
-    - [Cable and connect your Data Box](https://docs.microsoft.com/azure/databox/data-box-deploy-set-up) to an on-premises network.
+    - [Order your Data Box](https://docs.microsoft.com/azure/databox/data-box-deploy-ordered) or [Data Box Heavy](https://docs.microsoft.com/azure/databox/data-box-heavy-deploy-ordered). While ordering your device, remember to choose a storage account that **doesn't** have hierarchical namespaces enabled on it. This is because Data Box devices do not yet support direct ingestion into Azure Data Lake Storage Gen2. You will need to copy into a storage account and then do a second copy into the ADLS Gen2 account. Instructions for this are given in the steps below.
+    - Cable and connect your [Data Box](https://docs.microsoft.com/azure/databox/data-box-deploy-set-up) or [Data Box Heavy](https://docs.microsoft.com/azure/databox/data-box-heavy-deploy-set-up) to an on-premises network.

 If you are ready, let's start.

 ## Copy your data to a Data Box device

 To copy the data from your on-premises HDFS store to a Data Box device, you'll set a few things up, and then use the [DistCp](https://hadoop.apache.org/docs/stable/hadoop-distcp/DistCp.html) tool.

-If the amount of data that you are copying is more than the capacity of a single Data Box, you will have to break up your data set into sizes that do fit into your Data Boxes.
+If the amount of data that you are copying is more than the capacity of a single Data Box or that of a single node on Data Box Heavy, break up your data set into sizes that do fit into your devices.

-Follow these steps to copy data via the REST APIs of Blob/Object storage to your Data Box. The REST API interface will make the Data Box appear as a HDFS store to your cluster.
+Follow these steps to copy data via the REST APIs of Blob/Object storage to your Data Box device. The REST API interface will make the device appear as an HDFS store to your cluster.

-1. Before you copy the data via REST, identify the security and connection primitives to connect to the REST interface on the Data Box. Sign in to the local web UI of Data Box and go to **Connect and copy** page. Against the Azure storage account for your Data Box, under **Access settings**, locate and select **REST(Preview)**.
+1. Before you copy the data via REST, identify the security and connection primitives to connect to the REST interface on the Data Box or Data Box Heavy. Sign in to the local web UI of Data Box and go to the **Connect and copy** page. Against the Azure storage account for your device, under **Access settings**, locate and select **REST**.

 !["Connect and copy" page](media/data-lake-storage-migrate-on-premises-HDFS-cluster/data-box-connect-rest.png)
@@ -59,14 +59,14 @@ Follow these steps to copy data via the REST APIs of Blob/Object storage to your

 !["Access storage account and upload data" dialog](media/data-lake-storage-migrate-on-premises-HDFS-cluster/data-box-connection-string-http.png)

-3. Add the endpoint and the Data Box IP address to `/etc/hosts` on each node.
+3. Add the endpoint and the Data Box or Data Box Heavy node IP address to `/etc/hosts` on each node.

    ```
    10.128.5.42 mystorageaccount.blob.mydataboxno.microsoftdatabox.com
    ```

    If you are using some other mechanism for DNS, you should ensure that the Data Box endpoint can be resolved.

-4. Set a shell variable `azjars` to point to the `hadoop-azure` and the `microsoft-windowsazure-storage-sdk` jar files. These files are under the Hadoop installation directory (You can check if these files exist by using this command `ls -l $<hadoop_install_dir>/share/hadoop/tools/lib/ | grep azure` where `<hadoop_install_dir>` is the directory where you have installed Hadoop ) Use the full paths.
+4. Set a shell variable `azjars` to point to the `hadoop-azure` and the `microsoft-windowsazure-storage-sdk` jar files. These files are under the Hadoop installation directory. (You can check if these files exist by using the command `ls -l $<hadoop_install_dir>/share/hadoop/tools/lib/ | grep azure`, where `<hadoop_install_dir>` is the directory where you have installed Hadoop.) Use the full paths.

    ```
    # azjars=$hadoop_install_dir/share/hadoop/tools/lib/hadoop-azure-2.6.0-cdh5.14.0.jar
@@ -118,22 +118,30 @@ Follow these steps to copy data via the REST APIs of Blob/Object storage to your

 To improve the copy speed:

 - Try changing the number of mappers. (The above example uses `m` = 4 mappers.)
-- Try running mutliple `distcp` in parallel.
-- Remember that large files perform better than small files.
+- Try running multiple `distcp` in parallel.
+- Remember that large files perform better than small files.

 ## Ship the Data Box to Microsoft

 Follow these steps to prepare and ship the Data Box device to Microsoft.

-1. After the data copy is complete, run [Prepare to ship](https://docs.microsoft.com/azure/databox/data-box-deploy-copy-data-via-rest) on your Data Box. After the device preparation is complete, download the BOM files. You will use these BOM or manifest files later to verify the data uploaded to Azure. Shut down the device and remove the cables.
-2. Schedule a pickup with UPS to [Ship your Data Box back to Azure](https://docs.microsoft.com/azure/databox/data-box-deploy-picked-up).
-3. After Microsoft receives your device, it is connected to the network datacenter and data is uploaded to the storage account you specified (with hierarchical namespaces disabled) when you ordered the Data Box. Verify against the BOM files that all your data is uploaded to Azure. You can now move this data to a Data Lake Storage Gen2 storage account.
+1. After the data copy is complete, run:
+
+    - [Prepare to ship on your Data Box or Data Box Heavy](https://docs.microsoft.com/azure/databox/data-box-deploy-copy-data-via-rest).
+    - After the device preparation is complete, download the BOM files. You will use these BOM or manifest files later to verify the data uploaded to Azure.
+    - Shut down the device and remove the cables.
+2. Schedule a pickup with UPS. Follow the instructions to:
+
+    - [Ship your Data Box](https://docs.microsoft.com/azure/databox/data-box-deploy-picked-up).
+    - [Ship your Data Box Heavy](https://docs.microsoft.com/azure/databox/data-box-heavy-deploy-picked-up).
+3. After Microsoft receives your device, it is connected to the datacenter network and the data is uploaded to the storage account you specified (with hierarchical namespaces disabled) when you placed the device order. Verify against the BOM files that all your data is uploaded to Azure. You can now move this data to a Data Lake Storage Gen2 storage account.

 ## Move the data onto your Data Lake Storage Gen2 storage account

 This step is needed if you are using Azure Data Lake Storage Gen2 as your data store. If you are using just a blob storage account without hierarchical namespace as your data store, you do not need to do this step.

-You can do this in 2 ways.
+You can do this in two ways.

 - Use [Azure Data Factory to move data to ADLS Gen2](https://docs.microsoft.com/azure/data-factory/load-azure-data-lake-storage-gen2). You will have to specify **Azure Blob Storage** as the source.
-486 Bytes

0 commit comments