Commit 584800a

committed
Second round editorial fixes per Balaji
1 parent 00e259e

File tree

1 file changed: 71 additions, 33 deletions

articles/databox/data-box-deploy-copy-data.md
@@ -7,7 +7,7 @@ author: stevenmatthew
ms.service: databox
ms.subservice: pod
ms.topic: tutorial
-ms.date: 03/10/2024
+ms.date: 03/21/2024
ms.author: shaas

# Customer intent: As an IT admin, I need to be able to copy data to Data Box to upload on-premises data from my server onto Azure.
@@ -27,6 +27,10 @@ ms.author: shaas

::: zone target="docs"

+> [!IMPORTANT]
+> The copy process has changed to reflect new functionality that supports all access tiers for block blobs...
+> For details on copying your blob data to an appropriate access tier, see [Determine appropriate access tiers for block blobs](#determine-appropriate-access-tiers-for-block-blobs).
+
This tutorial describes how to connect to and copy data from your host computer using the local web UI.

In this tutorial, you learn how to:
@@ -47,25 +51,59 @@ Before you begin, make sure that:
* Run a [Supported operating system](data-box-system-requirements.md).
* Be connected to a high-speed network. We strongly recommend that you have at least one 10-GbE connection. If a 10-GbE connection isn't available, use a 1-GbE data link, but copy speeds are affected.

+## Determine appropriate access tiers for block blobs
+
+Azure Storage allows you to store blob data in several access tiers within the same storage account. This ability allows data to be organized and stored more efficiently based on how often it's accessed. The following table contains information and recommendations about Azure Storage access tiers.
+
+| Tier | Recommendation | Best practice |
+|------|----------------|---------------|
+| Hot | Useful for online data accessed or modified frequently. This tier has the highest storage costs, but the lowest access costs. | Data in this tier should be in regular and active use. |
+| Cool | Useful for online data accessed or modified infrequently. This tier has lower storage costs and higher access costs than the hot tier. | Data in this tier should be stored for at least 30 days. |
+| Cold | Useful for online data accessed or modified rarely but still requiring fast retrieval. This tier has lower storage costs and higher access costs than the cool tier. | Data in this tier should be stored for a minimum of 90 days. |
+| Archive | Useful for offline data that is rarely accessed and has less stringent latency requirements. | Data in this tier should be stored for a minimum of 180 days. Data removed from the archive tier within 180 days is subject to an early deletion charge. |
+
+For more information on blob access tiers, see [Access tiers for blob data](../storage/blobs/access-tiers-overview.md). For more detailed best practices, see [Best practices for using blob access tiers](../storage/blobs/access-tiers-best-practices.md).
+
+You can transfer your block blob data to the appropriate access tier by copying it to the corresponding folder within Data Box. Although this step is discussed in greater detail within the [Copy data to Azure Data Box](#copy-data-to-azure-data-box) section, it's important to organize your data before copying files to prevent errors during the ingestion process.
+
+> [!IMPORTANT]
+> For orders placed before April 2024, the old ingestion process will remain...
+
## Connect to Data Box

Based on the storage account selected, Data Box creates up to:

* Three shares for each associated storage account for GPv1 and GPv2.
* One share for premium storage.
-* Four shares for a blob storage account.
+* One share for a blob storage account, containing one folder for each of the four access tiers.

<!--Under block blob and page blob shares, first-level entities are containers, and second-level entities are blobs. Under shares for Azure Files, first-level entities are shares, second-level entities are files.-->

-Within a block blob share, the first-level entity is a folder for each access tier type. Second-level entities are containers, and third-level entities are blobs. Under all other shares, first-level entities are shares, second-level entities are files.
+The following table identifies the names of the Data Box shares to which you can connect and the type of data uploaded to your target storage account. It also identifies the hierarchy of shares and directories into which you copy your source data.
+
+| Storage type | Share name | First-level entity | Second-level entity | Third-level entity |
+|--------------|------------|--------------------|---------------------|--------------------|
+| Block blob | `<storageAccountName>_BlockBlob` | `<accessTier>` | `<containerName>` | `<blockBlob>` |
+| Page blob | `<storageAccountName>_PageBlob` | `<containerName>` | `<pageBlob>` | |
+| File storage | `<storageAccountName>_AzFile` | `<fileShareName>` | `<file>` | |

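To make that hierarchy concrete, here's a minimal simulation; a local temp directory stands in for a mounted block blob share, and all names (`Hot`, `mycontainer`, `myBlob.txt`) are hypothetical:

```shell
#!/usr/bin/env bash
# Simulation only: a local temp directory stands in for the mounted
# <storageAccountName>_BlockBlob share; all names are hypothetical.
share=$(mktemp -d)

# First level: access tier; second level: container; third level: blob.
mkdir -p "$share/Hot/mycontainer"
echo "demo" > "$share/Hot/mycontainer/myBlob.txt"

ls "$share/Hot/mycontainer"   # prints: myBlob.txt
```

On a real device, `$share` would instead be the mounted SMB or NFS block blob share.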
-The following table shows the UNC path to the shares on your Data Box and the corresponding Azure Storage path URL where the data is uploaded. The final Azure Storage path URL can be derived from the UNC share path.
+For example, suppose you want to transfer a frequently used local file, *rajinikanth.png*, to the *Hot* access tier within the *superstar* storage container of your target storage account. To accomplish this task, connect to Data Box's `<storageAccountName>_BlockBlob` share. Next, expand the **Hot** folder at the root of the share. If the **superstar** folder doesn't exist, create it. Finally, copy the **rajinikanth.png** file into the **superstar** folder.
+
+You can't copy files directly to the *root* folder of any share on Data Box. When copying files to the block blob share, the recommended best practice is to add new subfolders within the appropriate access tier. After creating new subfolders, continue adding files as appropriate. Any folder created at the root of the block blob share is created as a container; any file within the folder is copied to the storage account's default access tier.
+
+<!--Within the block blob share, first-level entities represent access tiers. Second-level entities represent storage containers, and third-level entities represent blobs.
+Within the page blob share, first-level entities represent storage containers while the second-level entities represent page blobs.
+Within the files share, the first-level entities represent file shares while the second-level entities represent files.-->
+
+The following table shows the UNC path to the shares on your Data Box and the corresponding Azure Storage path URL to which data is uploaded. The final Azure Storage path URL can be derived from the UNC share path.

-|Azure Storage types | Data Box shares |
-|-------------------|--------------------------------------------------------------------------------|
-| Azure Block blobs | <li>UNC path to shares: `\\<DeviceIPAddress>\<storageaccountname_BlockBlob>\<accessTier>\<ContainerName>\myFile.txt`</li><li>Azure Storage URL: `https://<storageaccountname>.blob.core.windows.net/<ContainerName>/myFile.txt`</li> |
-| Azure Page blobs | <li>UNC path to shares: `\\<DeviceIPAddress>\<storageaccountname_PageBlob>\<ContainerName>\myFile.txt`</li><li>Azure Storage URL: `https://<storageaccountname>.blob.core.windows.net/<ContainerName>/myFile.txt`</li> |
-| Azure Files |<li>UNC path to shares: `\\<DeviceIPAddress>\<storageaccountname_AzFile>\<ShareName>\myFile.txt`</li><li>Azure Storage URL: `https://<storageaccountname>.file.core.windows.net/<ShareName>/myFile.txt`</li> |
+| Azure Storage types | Data Box shares |
+|---------------------|-----------------|
+| Azure Block blobs | <li>UNC path to shares: `\\<DeviceIPAddress>\<storageaccountname_BlockBlob>\<accessTier>\<ContainerName>\myBlob.txt`</li><li>Azure Storage URL: `https://<storageaccountname>.blob.core.windows.net/<ContainerName>/myBlob.txt`</li> |
+| Azure Page blobs | <li>UNC path to shares: `\\<DeviceIPAddress>\<storageaccountname_PageBlob>\<ContainerName>\myBlob.vhd`</li><li>Azure Storage URL: `https://<storageaccountname>.blob.core.windows.net/<ContainerName>/myBlob.vhd`</li> |
+| Azure Files | <li>UNC path to shares: `\\<DeviceIPAddress>\<storageaccountname_AzFile>\<ShareName>\myFile.txt`</li><li>Azure Storage URL: `https://<storageaccountname>.file.core.windows.net/<ShareName>/myFile.txt`</li> |

If using a Windows Server host computer, follow these steps to connect to the Data Box.

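The URL derivation that the table describes can be sketched as a small helper. This is illustrative only and not part of any Data Box or Azure tooling; the account name, device IP, and file names are placeholders. Note that the access tier folder doesn't appear in the final URL:

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the Azure Storage URL from a block blob
# UNC share path of the form
#   \\<DeviceIPAddress>\<account>_BlockBlob\<accessTier>\<container>\<file>
unc_to_url() {
  local p="${1//\\//}"     # turn backslashes into forward slashes
  p="${p#//}"              # drop the leading //
  local ip share tier container file
  IFS=/ read -r ip share tier container file <<<"$p"
  local account="${share%_BlockBlob}"
  # The access tier ($tier) selects the upload tier but is not part of the URL.
  echo "https://${account}.blob.core.windows.net/${container}/${file}"
}

unc_to_url '\\10.126.76.138\mystorageacct_BlockBlob\Hot\mycontainer\myBlob.txt'
# prints: https://mystorageacct.blob.core.windows.net/mycontainer/myBlob.txt
```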
@@ -119,9 +157,9 @@ sudo mount -t cifs -o vers=2.1 10.126.76.138:/utsac1_BlockBlob /home/databoxubun

Once you're connected to the Data Box shares, the next step is to copy data. Before you begin the data copy, review the following considerations:

-* Make sure that you copy the data to shares that correspond to the appropriate data format. For instance, copy the block blob data to the share for block blobs. Copy the VHDs to page blob. If the data format doesn't match the appropriate share type, then at a later step, the data upload to Azure will fail.
-* Unless copying data to be uploaded as block blobs, create a folder at the share's root, then copy files to that folder.
-* When copying data to the block blob share, create a sub-folder within the desired access tier, then copy data to the newly created sub-folder. The sub-folder represents a container to which your data is uploaded as blobs. You cannot copy files directly to the *root* folder in the storage account.
+* Make sure that you copy the data to shares that correspond to the appropriate data format. For instance, copy block blob data to the share for block blobs, and copy VHDs to the page blob share. If the data format doesn't match the appropriate share type, the data upload to Azure fails during a later step.
+* When copying data to the *AzFile* or *PageBlob* shares, first create a folder at the share's root, then copy files to that folder.
+* When copying data to the *BlockBlob* share, create a subfolder within the desired access tier, then copy data to the newly created subfolder. The subfolder represents a container to which your data is uploaded as blobs. You can't copy files directly to the *root* folder in the storage account.
* While copying data, make sure that the data size conforms to the size limits described in the [Azure storage account size limits](data-box-limits.md#azure-storage-account-size-limits).
* If you want to preserve metadata (ACLs, timestamps, and file attributes) when transferring data to Azure Files, follow the guidance in [Preserving file ACLs, attributes, and timestamps with Azure Data Box](data-box-file-acls-preservation.md).
* If another application outside Data Box uploads the same data at the same time that Data Box uploads it, upload job failures and data corruption can result.
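The folder-depth rules in the considerations above can be expressed as a quick pre-copy check. This helper is purely illustrative (it isn't part of Data Box tooling, and the share and path names are hypothetical): the *BlockBlob* share needs at least a tier folder plus a container folder, while the *AzFile* and *PageBlob* shares need at least one folder below the root:

```shell
#!/usr/bin/env bash
# Illustrative helper (not part of Data Box tooling): checks that a
# destination path, relative to a share's root, is deep enough to
# satisfy the copy rules above. Names are hypothetical.
min_depth() {
  case "$1" in
    BlockBlob) echo 2 ;;   # <accessTier>/<container>/<file>
    *)         echo 1 ;;   # <folder>/<file> for AzFile and PageBlob
  esac
}
valid_target() {
  local share="$1" rel="$2"
  local depth
  depth=$(tr -cd '/' <<<"$rel" | wc -c)   # folder depth = slash count
  [ "$depth" -ge "$(min_depth "$share")" ]
}

valid_target BlockBlob "Hot/mycontainer/myBlob.txt" && echo valid
valid_target AzFile "report.docx" || echo "invalid: create a folder at the share root first"
```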
@@ -141,21 +179,21 @@ robocopy <Source> <Target> * /e /r:3 /w:60 /is /nfl /ndl /np /MT:32 or 64 /fft

The attributes are described in the following table.

-|Attribute |Description |
-|---------|---------|
-|/e |Copies subdirectories including empty directories. |
-|/r: |Specifies the number of retries on failed copies. |
-|/w: |Specifies the wait time between retries, in seconds. |
-|/is |Includes the same files. |
-|/nfl |Specifies that file names aren't logged. |
-|/ndl |Specifies that directory names aren't logged. |
-|/np |Specifies that the progress of the copying operation (the number of files or directories copied so far) will not be displayed. Displaying the progress significantly lowers the performance. |
-|/MT | Use multithreading, recommended 32 or 64 threads. This option not used with encrypted files. You may need to separate encrypted and unencrypted files. However, single threaded copy significantly lowers the performance. |
-|/fft | Use to reduce the time stamp granularity for any file system. |
-|/B | Copies files in Backup mode. |
-|/z | Copies files in Restart mode, use this if the environment is unstable. This option reduces throughput due to additional logging. |
-| /zb | Uses Restart mode. If access is denied, this option uses Backup mode. This option reduces throughput due to checkpointing. |
-|/efsraw | Copies all encrypted files in EFS raw mode. Use only with encrypted files. |
+|Attribute |Description |
+|----------|------------|
+|/e |Copies subdirectories, including empty directories. |
+|/r: |Specifies the number of retries on failed copies. |
+|/w: |Specifies the wait time between retries, in seconds. |
+|/is |Includes the same files. |
+|/nfl |Specifies that file names aren't logged. |
+|/ndl |Specifies that directory names aren't logged. |
+|/np |Specifies that the progress of the copy operation (the number of files or directories copied so far) isn't displayed. Displaying the progress significantly lowers performance. |
+|/MT | Uses multithreading; 32 or 64 threads are recommended. This option isn't used with encrypted files. You might need to separate encrypted and unencrypted files. However, single-threaded copy significantly lowers performance. |
+|/fft | Reduces the time stamp granularity for any file system. |
+|/B | Copies files in Backup mode. |
+|/z | Copies files in Restart mode; use this option if the environment is unstable. This option reduces throughput due to additional logging. |
+| /zb | Uses Restart mode. If access is denied, this option uses Backup mode. This option reduces throughput due to checkpointing. |
+|/efsraw | Copies all encrypted files in EFS raw mode. Use only with encrypted files. |
|/log+:\<LogFile>| Appends the output to the existing log file.|

The following sample shows the output of the robocopy command to copy files to the Data Box.
@@ -224,9 +262,9 @@ For more specific scenarios such as using `robocopy` to list, copy, or delete fi

To optimize the performance, use the following robocopy parameters when copying the data.

-| Platform | Mostly small files < 512 KB | Mostly medium files 512 KB-1 MB | Mostly large files > 1 MB |
-|----------------|--------------------------------------------------------|--------------------------------------------------------|--------------------------------------------------------|
-| Data Box | 2 Robocopy sessions <br> 16 threads per sessions | 3 Robocopy sessions <br> 16 threads per sessions | 2 Robocopy sessions <br> 24 threads per sessions |
+| Platform | Mostly small files < 512 KB | Mostly medium files 512 KB - 1 MB | Mostly large files > 1 MB |
+|----------|-----------------------------|-----------------------------------|---------------------------|
+| Data Box | 2 Robocopy sessions <br> 16 threads per session | 3 Robocopy sessions <br> 16 threads per session | 2 Robocopy sessions <br> 24 threads per session |

For more information on the Robocopy command, go to [Robocopy and a few examples](https://social.technet.microsoft.com/wiki/contents/articles/1073.robocopy-and-a-few-examples.aspx).

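The exact session counts can't be reproduced here, but the general pattern of splitting one copy job into concurrent sessions can be sketched. This is a simulation in Linux shell (on Windows, you would launch separate Robocopy processes, one per session), and all paths are temporary stand-ins:

```shell
#!/usr/bin/env bash
# Simulation only: splits one copy job across two concurrent "sessions",
# mirroring the multi-session guidance above. Paths are temp stand-ins;
# real transfers would use separate Robocopy processes on Windows.
src=$(mktemp -d); dst=$(mktemp -d)
touch "$src"/file{1..8}.dat

files=("$src"/*.dat)
half=$(( ${#files[@]} / 2 ))

cp "${files[@]:0:half}" "$dst" &   # session 1: first half of the files
cp "${files[@]:half}"   "$dst" &   # session 2: second half
wait

ls "$dst" | wc -l   # prints: 8
```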
@@ -262,9 +300,9 @@ To copy data via SMB:

1. If using a Windows host, use the following command to connect to the SMB shares:

-   `\\<IP address of your device>\ShareName`
+   `\\<Device IP address>\ShareName`

-2. To get the share access credentials, go to the **Connect & copy** page in the local web UI of the Data Box.
+2. To retrieve the share access credentials, go to the **Connect & copy** page within the local web UI of the Data Box.
3. Use an SMB-compatible file copy tool, such as Robocopy, to copy data to the shares.

For step-by-step instructions, go to [Tutorial: Copy data to Azure Data Box via SMB](data-box-deploy-copy-data.md).
@@ -273,7 +311,7 @@ For step-by-step instructions, go to [Tutorial: Copy data to Azure Data Box via

To copy data via NFS:

-1. If using an NFS host, use the following command to mount the NFS shares on your Data Box:
+1. When using an NFS host, use the following command to mount the NFS shares on your Data Box:

   `sudo mount <Data Box device IP>:/<NFS share on Data Box device> <Path to the folder on local Linux computer>`
