Commit 7f605c0

Merge pull request #93196 from linda33wj/master
Update ADF connector articles
2 parents df2290d + c16afa5 commit 7f605c0

21 files changed (+467, -531 lines changed)

articles/data-factory/TOC.yml

Lines changed: 2 additions & 0 deletions
@@ -308,6 +308,8 @@
 href: connector-oracle-responsys.md
 - name: Oracle Service Cloud
 href: connector-oracle-service-cloud.md
+- name: ORC format
+href: format-orc.md
 - name: Parquet format
 href: format-parquet.md
 - name: PayPal

articles/data-factory/connector-amazon-simple-storage-service.md

Lines changed: 11 additions & 21 deletions
@@ -9,7 +9,7 @@ ms.reviewer: douglasl
 ms.service: data-factory
 ms.workload: data-services
 ms.topic: conceptual
-ms.date: 09/09/2019
+ms.date: 10/24/2019
 ms.author: jingwang

 ---
@@ -98,12 +98,9 @@ Here is an example:

 For a full list of sections and properties available for defining datasets, see the [Datasets](concepts-datasets-linked-services.md) article.

-- For **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet, delimited text, JSON, Avro and binary format dataset](#format-based-dataset) section.
-- For other formats like **ORC format**, refer to [Other format dataset](#other-format-dataset) section.
+[!INCLUDE [data-factory-v2-file-formats](../../includes/data-factory-v2-file-formats.md)]

-### <a name="format-based-dataset"></a> Parquet, delimited text, JSON, Avro and binary format dataset
-
-To copy data from Amazon S3 in **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet format](format-parquet.md), [Delimited text format](format-delimited-text.md), [Avro format](format-avro.md) and [Binary format](format-binary.md) article on format-based dataset and supported settings. The following properties are supported for Amazon S3 under `location` settings in format-based dataset:
+The following properties are supported for Amazon S3 under `location` settings in format-based dataset:

 | Property | Description | Required |
 | ---------- | ------------------------------------------------------------ | -------- |
@@ -113,9 +110,6 @@ To copy data from Amazon S3 in **Parquet, delimited text, JSON, Avro and binary
 | fileName | The file name under the given bucket + folderPath. If you want to use wildcard to filter files, skip this setting and specify in activity source settings. | No |
 | version | The version of the S3 object, if S3 versioning is enabled. If not specified, the latest version will be fetched. |No |

-> [!NOTE]
-> **AmazonS3Object** type dataset with Parquet/Text format mentioned in next section is still supported as-is for Copy/Lookup/GetMetadata activity for backward compatibility, but it doesn't work with mapping data flow. You are suggested to use this new model going forward, and the ADF authoring UI has switched to generating these new types.
-
 **Example:**

 ```json
@@ -143,9 +137,10 @@ To copy data from Amazon S3 in **Parquet, delimited text, JSON, Avro and binary
 }
 ```

-### Other format dataset
+### Legacy dataset model

-To copy data from Amazon S3 in **ORC format**, the following properties are supported:
+>[!NOTE]
+>The following dataset model is still supported as-is for backward compatibility. You are suggested to use the new model mentioned above going forward, and the ADF authoring UI has switched to generating the new model.

 | Property | Description | Required |
 |:--- |:--- |:--- |
@@ -227,12 +222,9 @@ For a full list of sections and properties available for defining activities, se

 ### Amazon S3 as source

-- To copy from **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet, delimited text, JSON, Avro and binary format source](#format-based-source) section.
-- To copy from other formats like **ORC format**, refer to [Other format source](#other-format-source) section.
+[!INCLUDE [data-factory-v2-file-formats](../../includes/data-factory-v2-file-formats.md)]

-#### <a name="format-based-source"></a> Parquet, delimited text, JSON, Avro and binary format source
-
-To copy data from Amazon S3 in **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet format](format-parquet.md), [Delimited text format](format-delimited-text.md), [Avro format](format-avro.md) and [Binary format](format-binary.md) article on format-based copy activity source and supported settings. The following properties are supported for Amazon S3 under `storeSettings` settings in format-based copy source:
+The following properties are supported for Amazon S3 under `storeSettings` settings in format-based copy source:

 | Property | Description | Required |
 | ------------------------ | ------------------------------------------------------------ | ----------------------------------------------------------- |
@@ -245,9 +237,6 @@ To copy data from Amazon S3 in **Parquet, delimited text, JSON, Avro and binary
 | modifiedDatetimeEnd | Same as above. | No |
 | maxConcurrentConnections | The number of the connections to connect to storage store concurrently. Specify only when you want to limit the concurrent connection to the data store. | No |

-> [!NOTE]
-> For Parquet/delimited text format, **FileSystemSource** type copy activity source mentioned in next section is still supported as-is for backward compatibility. You are suggested to use this new model going forward, and the ADF authoring UI has switched to generating these new types.
-
 **Example:**

 ```json
@@ -289,9 +278,10 @@ To copy data from Amazon S3 in **Parquet, delimited text, JSON, Avro and binary
 ]
 ```

-#### Other format source
+#### Legacy source model

-To copy data from Amazon S3 in **ORC format**, the following properties are supported in the copy activity **source** section:
+>[!NOTE]
+>The following copy source model is still supported as-is for backward compatibility. You are suggested to use the new model mentioned above going forward, and the ADF authoring UI has switched to generating the new model.

 | Property | Description | Required |
 |:--- |:--- |:--- |

articles/data-factory/connector-azure-blob-storage.md

Lines changed: 18 additions & 31 deletions
@@ -8,7 +8,7 @@ ms.reviewer: craigg
 ms.service: data-factory
 ms.workload: data-services
 ms.topic: conceptual
-ms.date: 09/09/2019
+ms.date: 10/24/2019
 ms.author: jingwang

 ---
@@ -38,8 +38,8 @@ Specifically, this Blob storage connector supports:
 - Copying blobs from block, append, or page blobs and copying data to only block blobs.
 - Copying blobs as is or parsing or generating blobs with [supported file formats and compression codecs](supported-file-formats-and-compression-codecs.md).

->[!NOTE]
->If you enable the _"Allow trusted Microsoft services to access this storage account"_ option on Azure Storage firewall settings, using Azure Integration Runtime to connect to Blob storage will fail with a forbidden error, as ADF is not treated as a trusted Microsoft service. Please connect via a Self-hosted Integration Runtime instead.
+>[!IMPORTANT]
+>If you enable the **Allow trusted Microsoft services to access this storage account** option on Azure Storage firewall settings and want to use Azure integration runtime to connect to your Blob Storage, you must use [managed identity authentication](#managed-identity).

 ## Get started

@@ -312,12 +312,9 @@ These properties are supported for an Azure Blob storage linked service:

 For a full list of sections and properties available for defining datasets, see the [Datasets](concepts-datasets-linked-services.md) article.

-- For **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet, delimited text, JSON, Avro and binary format dataset](#format-based-dataset) section.
-- For other formats like **ORC/JSON format**, refer to [Other format dataset](#other-format-dataset) section.
-
-### <a name="format-based-dataset"></a> Parquet, delimited text, JSON, Avro and binary format dataset
+[!INCLUDE [data-factory-v2-file-formats](../../includes/data-factory-v2-file-formats.md)]

-To copy data to and from Blob storage in Parquet, delimited text, Avro or binary format, refer to [Parquet format](format-parquet.md), [Delimited text format](format-delimited-text.md), [Avro format](format-avro.md) and [Binary format](format-binary.md) article on format-based dataset and supported settings. The following properties are supported for Azure Blob under `location` settings in format-based dataset:
+The following properties are supported for Azure Blob under `location` settings in format-based dataset:

 | Property | Description | Required |
 | ---------- | ------------------------------------------------------------ | -------- |
@@ -326,10 +323,6 @@ To copy data to and from Blob storage in Parquet, delimited text, Avro or binary
 | folderPath | The path to folder under the given container. If you want to use wildcard to filter folder, skip this setting and specify in activity source settings. | No |
 | fileName | The file name under the given container + folderPath. If you want to use wildcard to filter files, skip this setting and specify in activity source settings. | No |

-> [!NOTE]
->
-> **AzureBlob** type dataset with Parquet/Text format mentioned in next section is still supported as-is for Copy/Lookup/GetMetadata activity for backward compatibility, but it doesn't work with mapping data flow. You are suggested to use this new model going forward, and the ADF authoring UI has switched to generating these new types.
-
 **Example:**

 ```json
@@ -357,9 +350,10 @@ To copy data to and from Blob storage in Parquet, delimited text, Avro or binary
 }
 ```

-### Other format dataset
+### Legacy dataset model

-To copy data to and from Blob storage in ORC/JSON format, set the type property of the dataset to **AzureBlob**. The following properties are supported.
+>[!NOTE]
+>The following dataset model is still supported as-is for backward compatibility. You are suggested to use the new model mentioned in above section going forward, and the ADF authoring UI has switched to generating the new model.

 | Property | Description | Required |
 |:--- |:--- |:--- |
@@ -410,12 +404,9 @@ For a full list of sections and properties available for defining activities, se

 ### Blob storage as a source type

-- To copy from **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet, delimited text, JSON, Avro and binary format source](#format-based-source) section.
-- To copy from other formats like **ORC format**, refer to [Other format source](#other-format-source) section.
+[!INCLUDE [data-factory-v2-file-formats](../../includes/data-factory-v2-file-formats.md)]

-#### <a name="format-based-source"></a> Parquet, delimited text, JSON, Avro and binary format source
-
-To copy data to and from Blob storage in **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet format](format-parquet.md), [Delimited text format](format-delimited-text.md), [Avro format](format-avro.md) and [Binary format](format-binary.md) article on format-based dataset and supported settings. The following properties are supported for Azure Blob under `storeSettings` settings in format-based copy source:
+The following properties are supported for Azure Blob under `storeSettings` settings in format-based copy source:

 | Property | Description | Required |
 | ------------------------ | ------------------------------------------------------------ | --------------------------------------------- |
@@ -471,9 +462,10 @@ To copy data to and from Blob storage in **Parquet, delimited text, JSON, Avro a
 ]
 ```

-#### Other format source
+#### Legacy source model

-To copy data from Blob storage in **ORC format**, set the source type in the copy activity to **BlobSource**. The following properties are supported in the copy activity **source** section.
+>[!NOTE]
+>The following copy source model is still supported as-is for backward compatibility. You are suggested to use the new model mentioned above going forward, and the ADF authoring UI has switched to generating the new model.

 | Property | Description | Required |
 |:--- |:--- |:--- |
@@ -515,22 +507,16 @@ To copy data from Blob storage in **ORC format**, set the source type in the cop

 ### Blob storage as a sink type

-- To copy from **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet, delimited text, JSON, Avro and binary format source](#format-based-source) section.
-- To copy from other formats like **ORC format**, refer to [Other format source](#other-format-source) section.
-
-#### <a name="format-based-source"></a> Parquet, delimited text, JSON, Avro and binary format source
+[!INCLUDE [data-factory-v2-file-formats](../../includes/data-factory-v2-file-formats.md)]

-To copy data from Blob storage in **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet format](format-parquet.md), [Delimited text format](format-delimited-text.md), [Avro format](format-avro.md) and [Binary format](format-binary.md) article on format-based copy activity source and supported settings. The following properties are supported for Azure Blob under `storeSettings` settings in format-based copy sink:
+The following properties are supported for Azure Blob under `storeSettings` settings in format-based copy sink:

 | Property | Description | Required |
 | ------------------------ | ------------------------------------------------------------ | -------- |
 | type | The type property under `storeSettings` must be set to **AzureBlobStorageWriteSetting**. | Yes |
 | copyBehavior | Defines the copy behavior when the source is files from a file-based data store.<br/><br/>Allowed values are:<br/><b>- PreserveHierarchy (default)</b>: Preserves the file hierarchy in the target folder. The relative path of source file to source folder is identical to the relative path of target file to target folder.<br/><b>- FlattenHierarchy</b>: All files from the source folder are in the first level of the target folder. The target files have autogenerated names. <br/><b>- MergeFiles</b>: Merges all files from the source folder to one file. If the file or blob name is specified, the merged file name is the specified name. Otherwise, it's an autogenerated file name. | No |
 | maxConcurrentConnections | The number of the connections to connect to storage store concurrently. Specify only when you want to limit the concurrent connection to the data store. | No |

-> [!NOTE]
-> For Parquet/delimited text format, **BlobSink** type copy activity sink mentioned in next section is still supported as-is for backward compatibility. You are suggested to use this new model going forward, and the ADF authoring UI has switched to generating these new types.
-
 **Example:**

 ```json
@@ -566,9 +552,10 @@ To copy data from Blob storage in **Parquet, delimited text, JSON, Avro and bina
 ]
 ```

-#### Other format sink
+#### Legacy sink model

-To copy data to Blob storage in **ORC format**, set the sink type in the copy activity to **BlobSink**. The following properties are supported in the **sink** section.
+>[!NOTE]
+>The following copy sink model is still supported as-is for backward compatibility. You are suggested to use the new model mentioned above going forward, and the ADF authoring UI has switched to generating the new model.

 | Property | Description | Required |
 |:--- |:--- |:--- |
