articles/databox/data-box-deploy-copy-data.md
author: stevenmatthew
ms.service: databox
ms.subservice: pod
ms.topic: tutorial
ms.date: 03/21/2024
ms.author: shaas
# Customer intent: As an IT admin, I need to be able to copy data to Data Box to upload on-premises data from my server onto Azure.
::: zone target="docs"
> [!IMPORTANT]
> The copy process has changed to reflect new functionality that supports all access tiers for block blobs...
> For details about copying your blob data to the appropriate access tier, see [Determine appropriate access tiers for block blobs](#determine-appropriate-access-tiers-for-block-blobs).
This tutorial describes how to connect to and copy data from your host computer using the local web UI.
In this tutorial, you learn how to:
Before you begin, make sure that:
* Run a [Supported operating system](data-box-system-requirements.md).
* Be connected to a high-speed network. We strongly recommend that you have at least one 10-GbE connection. If a 10-GbE connection isn't available, you can use a 1-GbE data link, but copy speeds are affected.
## Determine appropriate access tiers for block blobs
Azure Storage allows you to store blob data in several access tiers within the same storage account. This ability allows data to be organized and stored more efficiently based on how often it's accessed. The following table contains information and recommendations about Azure Storage access tiers.
| Tier | Recommendation | Best practice |
|---------|----------------|---------------|
| Hot | Useful for online data accessed or modified frequently. This tier has the highest storage costs, but the lowest access costs. | Data in this tier should be in regular and active use. |
| Cool | Useful for online data accessed or modified infrequently. This tier has lower storage costs and higher access costs than the hot tier. | Data in this tier should be stored for at least 30 days. |
| Cold | Useful for online data accessed or modified rarely but still requiring fast retrieval. This tier has lower storage costs and higher access costs than the cool tier.| Data in this tier should be stored for a minimum of 90 days. |
| Archive | Useful for offline data rarely accessed and having lower latency requirements. | Data in this tier should be stored for a minimum of 180 days. Data removed from the archive tier within 180 days is subject to an early deletion charge. |
For more information on blob access tiers, see [Access tiers for blob data](../storage/blobs/access-tiers-overview.md). For more detailed best practices, see [Best practices for using blob access tiers](../storage/blobs/access-tiers-best-practices.md).
You can transfer your block blob data to the appropriate access tier by copying it to the corresponding folder within Data Box. Although this step is discussed in greater detail within the [Copy data to Azure Data Box](#copy-data-to-azure-data-box) section, it is important to organize your data before copying files to prevent errors during the ingestion process.
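As a sketch of that organization step, the following Python snippet stages local files into tier-named folders that mirror the block blob share layout before copying to Data Box. The `tier_for_age` helper and its day thresholds (borrowed from the retention minimums in the table above) are illustrative assumptions, not part of Data Box itself.

```python
import shutil
from pathlib import Path

def tier_for_age(days_since_access: int) -> str:
    # Hypothetical policy: map days since last access to a tier folder,
    # using the minimum-retention guidance from the table above.
    if days_since_access < 30:
        return "Hot"
    if days_since_access < 90:
        return "Cool"
    if days_since_access < 180:
        return "Cold"
    return "Archive"

def stage_file(src: Path, staging_root: Path, container: str,
               days_since_access: int) -> Path:
    """Copy src into <staging_root>/<tier>/<container>/ so the staged
    layout matches the Data Box block blob share hierarchy."""
    dest_dir = staging_root / tier_for_age(days_since_access) / container
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, dest_dir))
```

Staging the data this way means the subsequent copy to the Data Box share preserves the tier and container structure without per-file decisions during the copy itself.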
> [!IMPORTANT]
> For orders placed before April 2024, the old ingestion process will remain...
## Connect to Data Box
Based on the storage account selected, Data Box creates up to:
* Three shares for each associated storage account for GPv1 and GPv2.
* One share for premium storage.
* One share for a blob storage account, containing one folder for each of the four access tiers.
<!--Under block blob and page blob shares, first-level entities are containers, and second-level entities are blobs. Under shares for Azure Files, first-level entities are shares, second-level entities are files.-->
The following table identifies the names of the Data Box shares to which you can connect, and the type of data uploaded to your target storage account. It also identifies the hierarchy of shares and directories into which you copy your source data.
| Storage type | Share name | First-level entity | Second-level entity | Third-level entity |
For example, you want to transfer a frequently used local file, *rajinikanth.png*, to the *Hot* access tier within the *superstar* storage container of your target storage account. To accomplish this task, connect to Data Box's **<storageAccountName>_BlockBlob** share. Next, expand the **Hot** folder at the root of the share. If the **superstar** folder doesn't exist, create it. Finally, copy the **rajinikanth.png** file into the **superstar** folder.
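Concretely, the destination path from the example above can be assembled like this. This is a Python sketch; the device IP and storage account name are hypothetical placeholders, while the `<storageAccountName>_BlockBlob` share naming and the tier/container folder levels follow the description above.

```python
def blockblob_destination(device_ip: str, storage_account: str,
                          tier: str, container: str, blob_name: str) -> str:
    """Build the UNC destination path on the Data Box block blob share.

    The first folder level under the share selects the access tier,
    and the second level is the storage container.
    """
    share = f"{storage_account}_BlockBlob"
    return f"\\\\{device_ip}\\{share}\\{tier}\\{container}\\{blob_name}"

# Hypothetical device IP and account name, applied to the example file:
print(blockblob_destination("10.126.76.138", "mystorageacct",
                            "Hot", "superstar", "rajinikanth.png"))
# -> \\10.126.76.138\mystorageacct_BlockBlob\Hot\superstar\rajinikanth.png
```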
You can't copy files directly to the *root* folder of any share on Data Box. When copying files to the block blob share, the recommended best practice is to add new subfolders within the appropriate access tier. After creating new subfolders, continue adding files as appropriate. Any folder created at the root of the block blob share is created as a container; any file within the folder is copied to the storage account's default access tier.
<!--Within the block blob share, first-level entities represent access tiers. Second-level entities represent storage containers, and third-level entities represent blobs.
Within the page blob share, first-level entities represent storage containers while the second-level entities represent page blobs.
Within the files share, the first-level entities represent file shares while the second-level entities represent files.-->
The following table shows the UNC path to the shares on your Data Box and the corresponding Azure Storage path URL to which data is uploaded. The final Azure Storage path URL can be derived from the UNC share path.
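As a sketch of that derivation for the block blob share, the following Python function assumes the `\\<device>\<account>_BlockBlob\<tier>\<container>\<blob>` layout described earlier. The access tier folder selects the tier but, under this assumption, doesn't appear in the final Azure Storage URL.

```python
def storage_url_from_unc(unc_path: str) -> str:
    r"""Derive the Azure Storage path URL from a block blob share UNC path.

    Assumes the \\<device>\<account>_BlockBlob\<tier>\<container>\<blob>
    layout; the tier folder is dropped because it selects the access tier
    rather than forming part of the blob URL.
    """
    parts = unc_path.strip("\\").split("\\")
    _device, share, _tier, container, *blob_parts = parts
    account = share.removesuffix("_BlockBlob")
    return (f"https://{account}.blob.core.windows.net/"
            f"{container}/{'/'.join(blob_parts)}")
```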
Once you're connected to the Data Box shares, the next step is to copy data. Before you begin the data copy, review the following considerations:
* Make sure that you copy the data to shares that correspond to the appropriate data format. For instance, copy the block blob data to the share for block blobs. Copy VHDs to the page blob share. If the data format doesn't match the appropriate share type, the data upload to Azure will fail during a later step.
* When copying data to the *AzFile* or *PageBlob* shares, first create a folder at the share's root, then copy files to that folder.
* When copying data to the *BlockBlob* share, create a sub-folder within the desired access tier, then copy data to the newly created sub-folder. The sub-folder represents a container to which your data is uploaded as blobs. You cannot copy files directly to the *root* folder in the storage account.
* While copying data, make sure that the data size conforms to the size limits described in the [Azure storage account size limits](data-box-limits.md#azure-storage-account-size-limits).
* If you want to preserve metadata (ACLs, timestamps, and file attributes) when transferring data to Azure Files, follow the guidance in [Preserving file ACLs, attributes, and timestamps with Azure Data Box](data-box-file-acls-preservation.md).
* If data being uploaded by Data Box is simultaneously uploaded by another application outside of Data Box, upload job failures and data corruption could result.
The attributes are described in the following table.
|Attribute |Description |
|----------|------------|
|/e |Copies subdirectories, including empty directories. |
|/r: |Specifies the number of retries on failed copies. |
|/w: |Specifies the wait time between retries, in seconds. |
|/is |Includes the same files. |
|/nfl |Specifies that file names aren't logged. |
|/ndl |Specifies that directory names aren't logged. |
|/np |Specifies that the progress of the copying operation (the number of files or directories copied so far) isn't displayed. Displaying the progress significantly lowers the performance. |
|/MT |Uses multithreading; 32 or 64 threads are recommended. This option isn't used with encrypted files. You might need to separate encrypted and unencrypted files. However, single-threaded copy significantly lowers the performance. |
|/fft |Use to reduce the time stamp granularity for any file system. |
|/B |Copies files in Backup mode. |
|/z |Copies files in Restart mode; use this option if the environment is unstable. This option reduces throughput due to additional logging. |
|/zb |Uses Restart mode. If access is denied, this option uses Backup mode. This option reduces throughput due to checkpointing. |
|/efsraw |Copies all encrypted files in EFS raw mode. Use only with encrypted files. |
|/log+:\<LogFile>|Appends the output to the existing log file.|
The following sample shows the output of the robocopy command to copy files to the Data Box.
To optimize the performance, use the following robocopy parameters when copying the data.
|Platform |Mostly small files < 512 KB |Mostly medium files 512 KB - 1 MB |Mostly large files > 1 MB |
|---------|----------------------------|----------------------------------|--------------------------|
|Data Box |2 Robocopy sessions <br> 16 threads per session |3 Robocopy sessions <br> 16 threads per session |2 Robocopy sessions <br> 24 threads per session |
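For instance, a helper can build one robocopy invocation per session using only switches from the attribute table above. This is a Python sketch, not an official tool; the source paths, destination share, retry values, and thread count are placeholder assumptions you would tune per the table.

```python
def build_robocopy_commands(source_dirs, dest_share, threads):
    """Build one robocopy argument list per session (one source dir each).

    Session and thread counts should follow the tuning table above; the
    switches come from the attribute table (/e, /MT, /r:, /w:, /nfl,
    /ndl, /np). Run each list with subprocess on the Windows host.
    """
    commands = []
    for src in source_dirs:
        commands.append([
            "robocopy", src, dest_share,
            "/e",                    # copy subdirectories, including empty ones
            f"/MT:{threads}",        # multithreaded copy
            "/r:3", "/w:60",         # retries and wait between retries (assumed values)
            "/nfl", "/ndl", "/np",   # suppress per-file logging for performance
        ])
    return commands
```

Launching two such sessions with 16 threads each matches the "mostly small files" row of the table.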
For more information on the Robocopy command, go to [Robocopy and a few examples](https://social.technet.microsoft.com/wiki/contents/articles/1073.robocopy-and-a-few-examples.aspx).
To copy data via SMB:
1. If using a Windows host, use the following command to connect to the SMB shares:
`\\<Device IP address>\ShareName`
2. To retrieve the share access credentials, go to the **Connect & copy** page within the local web UI of the Data Box.
3. Use an SMB-compatible file copy tool such as Robocopy to copy data to the shares.
For step-by-step instructions, go to [Tutorial: Copy data to Azure Data Box via SMB](data-box-deploy-copy-data.md).
To copy data via NFS:
1. When using an NFS host, use the following command to mount the NFS shares on your Data Box:
`sudo mount <Data Box device IP>:/<NFS share on Data Box device> <Path to the folder on local Linux computer>`