Commit 2674062

Merge pull request #110639 from Samantha-Yu/adfupdate0408
Removed one tip
2 parents b249699 + 6c38c3d commit 2674062

1 file changed, +3 -5 lines changed

articles/data-factory/connector-azure-data-lake-storage.md

Lines changed: 3 additions & 5 deletions
@@ -10,7 +10,7 @@ ms.service: data-factory
 ms.workload: data-services
 ms.topic: conceptual
 ms.custom: seo-lt-2019
-ms.date: 03/24/2020
+ms.date: 04/08/2020
 ---
 
 # Copy and transform data in Azure Data Lake Storage Gen2 using Azure Data Factory
@@ -39,8 +39,6 @@ For Copy activity, with this connector you can:
 >[!IMPORTANT]
 >If you enable the **Allow trusted Microsoft services to access this storage account** option on Azure Storage firewall settings and want to use Azure integration runtime to connect to your Data Lake Storage Gen2, you must use [managed identity authentication](#managed-identity) for ADLS Gen2.
 
->[!TIP]
->If you enable the hierarchical namespace, currently there's no interoperability of operations between Blob and Data Lake Storage Gen2 APIs. If you hit the error "ErrorCode=FilesystemNotFound" with the message "The specified filesystem does not exist," it's caused by the specified sink file system that was created via the Blob API instead of Data Lake Storage Gen2 API elsewhere. To fix the issue, specify a new file system with a name that doesn't exist as the name of a Blob container. Then Data Factory automatically creates that file system during data copy.
 
 ## Get started
 
@@ -309,7 +307,7 @@ The following properties are supported for Data Lake Storage Gen2 under `storeSe
 | ------------------------ | ------------------------------------------------------------ | -------- |
 | type | The type property under `storeSettings` must be set to **AzureBlobFSWriteSettings**. | Yes |
 | copyBehavior | Defines the copy behavior when the source is files from a file-based data store.<br/><br/>Allowed values are:<br/><b>- PreserveHierarchy (default)</b>: Preserves the file hierarchy in the target folder. The relative path of the source file to the source folder is identical to the relative path of the target file to the target folder.<br/><b>- FlattenHierarchy</b>: All files from the source folder are in the first level of the target folder. The target files have autogenerated names. <br/><b>- MergeFiles</b>: Merges all files from the source folder to one file. If the file name is specified, the merged file name is the specified name. Otherwise, it's an autogenerated file name. | No |
-| blockSizeInMB | Specify the block size in MB used to write data to ADLS Gen2. Learn more [about Block Blobs](https://docs.microsoft.com/rest/api/storageservices/understanding-block-blobs--append-blobs--and-page-blobs#about-block-blobs). <br/>Allowed value is **between 4 and 100 MB**. <br/>By default, ADF automatically determine the block size based on your source store type and data. For non-binary copy into ADLS Gen2, the default block size is 100 MB so as to fit in at most 4.95 TB data. It may be not optimal when your data is not large, especially when you use Self-hosted Integration Runtime with poor network resulting in operation timeout or performance issue. You can explicitly specify a block size, while ensure blockSizeInMB*50000 is big enough to store the data, otherwise copy activity run will fail. | No |
+| blockSizeInMB | Specify the block size in MB used to write data to ADLS Gen2. Learn more [about Block Blobs](https://docs.microsoft.com/rest/api/storageservices/understanding-block-blobs--append-blobs--and-page-blobs#about-block-blobs). <br/>Allowed value is **between 4 and 100 MB**. <br/>By default, ADF automatically determines the block size based on your source store type and data. For non-binary copy into ADLS Gen2, the default block size is 100 MB so as to fit in at most 4.95 TB data. It may be not optimal when your data is not large, especially when you use Self-hosted Integration Runtime with poor network resulting in operation timeout or performance issue. You can explicitly specify a block size, while ensure blockSizeInMB*50000 is big enough to store the data, otherwise copy activity run will fail. | No |
 | maxConcurrentConnections | The number of connections to connect to the data store concurrently. Specify only when you want to limit the concurrent connection to the data store. | No |
 
 **Example:**
@@ -388,7 +386,7 @@ When transforming data in mapping data flow, you can read and write files from A
 
 ### Source transformation
 
-In the source transformation, you can read from a container, folder or individual file in Azure Data Lake Storage Gen2. The **Source options** tab lets you manage how the files get read.
+In the source transformation, you can read from a container, folder, or individual file in Azure Data Lake Storage Gen2. The **Source options** tab lets you manage how the files get read.
 
 ![Source options](media/data-flow/sourceOptions1.png "Source options")
 