articles/data-factory/connector-amazon-simple-storage-service.md (11 additions, 21 deletions)
@@ -9,7 +9,7 @@ ms.reviewer: douglasl
ms.service: data-factory
ms.workload: data-services
ms.topic: conceptual
-ms.date: 09/09/2019
+ms.date: 10/24/2019
ms.author: jingwang

---
@@ -98,12 +98,9 @@ Here is an example:
For a full list of sections and properties available for defining datasets, see the [Datasets](concepts-datasets-linked-services.md) article.

-- For **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet, delimited text, JSON, Avro and binary format dataset](#format-based-dataset) section.
-- For other formats like **ORC format**, refer to [Other format dataset](#other-format-dataset) section.
### <a name="format-based-dataset"></a> Parquet, delimited text, JSON, Avro and binary format dataset
-
-To copy data from Amazon S3 in **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet format](format-parquet.md), [Delimited text format](format-delimited-text.md), [Avro format](format-avro.md) and [Binary format](format-binary.md) article on format-based dataset and supported settings. The following properties are supported for Amazon S3 under `location` settings in format-based dataset:
+The following properties are supported for Amazon S3 under `location` settings in format-based dataset:
@@ -113,9 +110,6 @@ To copy data from Amazon S3 in **Parquet, delimited text, JSON, Avro and binary
| fileName | The file name under the given bucket + folderPath. If you want to use wildcard to filter files, skip this setting and specify in activity source settings. | No |
| version | The version of the S3 object, if S3 versioning is enabled. If not specified, the latest version will be fetched. |No |

-> [!NOTE]
-> **AmazonS3Object** type dataset with Parquet/Text format mentioned in next section is still supported as-is for Copy/Lookup/GetMetadata activity for backward compatibility, but it doesn't work with mapping data flow. You are suggested to use this new model going forward, and the ADF authoring UI has switched to generating these new types.
-

**Example:**

```json
@@ -143,9 +137,10 @@ To copy data from Amazon S3 in **Parquet, delimited text, JSON, Avro and binary
}
```
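The body of this example is collapsed in the diff view, so only the closing brace and fence are visible. As a rough, minimal sketch of a format-based dataset that uses the `location` properties above, assuming a delimited-text dataset: the dataset name, linked service reference, bucket, and folder path are placeholders, and the `AmazonS3Location` type name is an assumption based on the new-model naming these connector articles use.

```json
{
    "name": "AmazonS3DelimitedTextDataset",
    "properties": {
        "type": "DelimitedText",
        "linkedServiceName": {
            "referenceName": "<Amazon S3 linked service name>",
            "type": "LinkedServiceReference"
        },
        "schema": [],
        "typeProperties": {
            "location": {
                "type": "AmazonS3Location",
                "bucketName": "<bucket name>",
                "folderPath": "folder/subfolder"
            },
            "columnDelimiter": ",",
            "firstRowAsHeader": true
        }
    }
}
```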

-### Other format dataset
+### Legacy dataset model

-To copy data from Amazon S3 in **ORC format**, the following properties are supported:
+>[!NOTE]
+>The following dataset model is still supported as-is for backward compatibility. You are suggested to use the new model mentioned above going forward, and the ADF authoring UI has switched to generating the new model.

| Property | Description | Required |
|:--- |:--- |:--- |
@@ -227,12 +222,9 @@ For a full list of sections and properties available for defining activities, se
### Amazon S3 as source

-- To copy from **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet, delimited text, JSON, Avro and binary format source](#format-based-source) section.
-- To copy from other formats like **ORC format**, refer to [Other format source](#other-format-source) section.
#### <a name="format-based-source"></a> Parquet, delimited text, JSON, Avro and binary format source
-
-To copy data from Amazon S3 in **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet format](format-parquet.md), [Delimited text format](format-delimited-text.md), [Avro format](format-avro.md) and [Binary format](format-binary.md) article on format-based copy activity source and supported settings. The following properties are supported for Amazon S3 under `storeSettings` settings in format-based copy source:
+The following properties are supported for Amazon S3 under `storeSettings` settings in format-based copy source:
@@ -245,9 +237,6 @@ To copy data from Amazon S3 in **Parquet, delimited text, JSON, Avro and binary
| modifiedDatetimeEnd | Same as above. | No |
| maxConcurrentConnections | The number of the connections to connect to storage store concurrently. Specify only when you want to limit the concurrent connection to the data store. | No |

-> [!NOTE]
-> For Parquet/delimited text format, **FileSystemSource** type copy activity source mentioned in next section is still supported as-is for backward compatibility. You are suggested to use this new model going forward, and the ADF authoring UI has switched to generating these new types.
-

**Example:**
```json
@@ -289,9 +278,10 @@ To copy data from Amazon S3 in **Parquet, delimited text, JSON, Avro and binary
]
```
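The copy activity example is likewise collapsed here. The sketch below shows how the `storeSettings` block sits inside a copy activity source under the new model; the activity and dataset names are placeholders, the `DelimitedTextSource` and `AmazonS3ReadSetting` type names are assumptions modeled on the new-model naming pattern, and `recursive`/`wildcardFileName` stand in for the wildcard filtering that the dataset section says belongs in the activity source settings.

```json
"activities": [
    {
        "name": "CopyFromAmazonS3",
        "type": "Copy",
        "inputs": [
            {
                "referenceName": "<Amazon S3 input dataset name>",
                "type": "DatasetReference"
            }
        ],
        "outputs": [
            {
                "referenceName": "<output dataset name>",
                "type": "DatasetReference"
            }
        ],
        "typeProperties": {
            "source": {
                "type": "DelimitedTextSource",
                "storeSettings": {
                    "type": "AmazonS3ReadSetting",
                    "recursive": true,
                    "wildcardFileName": "*.csv"
                }
            },
            "sink": {
                "type": "<sink type>"
            }
        }
    }
]
```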

-#### Other format source
+#### Legacy source model

-To copy data from Amazon S3 in **ORC format**, the following properties are supported in the copy activity **source** section:
+>[!NOTE]
+>The following copy source model is still supported as-is for backward compatibility. You are suggested to use the new model mentioned above going forward, and the ADF authoring UI has switched to generating the new model.

articles/data-factory/connector-azure-blob-storage.md (18 additions, 31 deletions)
@@ -8,7 +8,7 @@ ms.reviewer: craigg
ms.service: data-factory
ms.workload: data-services
ms.topic: conceptual
-ms.date: 09/09/2019
+ms.date: 10/24/2019
ms.author: jingwang

---
@@ -38,8 +38,8 @@ Specifically, this Blob storage connector supports:
- Copying blobs from block, append, or page blobs and copying data to only block blobs.
- Copying blobs as is or parsing or generating blobs with [supported file formats and compression codecs](supported-file-formats-and-compression-codecs.md).

->[!NOTE]
->If you enable the _"Allow trusted Microsoft services to access this storage account"_ option on Azure Storage firewall settings, using Azure Integration Runtime to connect to Blob storage will fail with a forbidden error, as ADF is not treated as a trusted Microsoft service. Please connect via a Self-hosted Integration Runtime instead.
+>[!IMPORTANT]
+>If you enable the **Allow trusted Microsoft services to access this storage account** option on Azure Storage firewall settings and want to use Azure integration runtime to connect to your Blob Storage, you must use [managed identity authentication](#managed-identity).
## Get started
@@ -312,12 +312,9 @@ These properties are supported for an Azure Blob storage linked service:
For a full list of sections and properties available for defining datasets, see the [Datasets](concepts-datasets-linked-services.md) article.

-- For **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet, delimited text, JSON, Avro and binary format dataset](#format-based-dataset) section.
-- For other formats like **ORC/JSON format**, refer to [Other format dataset](#other-format-dataset) section.
-
-
### <a name="format-based-dataset"></a> Parquet, delimited text, JSON, Avro and binary format dataset
-To copy data to and from Blob storage in Parquet, delimited text, Avro or binary format, refer to [Parquet format](format-parquet.md), [Delimited text format](format-delimited-text.md), [Avro format](format-avro.md) and [Binary format](format-binary.md) article on format-based dataset and supported settings. The following properties are supported for Azure Blob under `location` settings in format-based dataset:
+The following properties are supported for Azure Blob under `location` settings in format-based dataset:
@@ -326,10 +323,6 @@ To copy data to and from Blob storage in Parquet, delimited text, Avro or binary
| folderPath | The path to folder under the given container. If you want to use wildcard to filter folder, skip this setting and specify in activity source settings. | No |
| fileName | The file name under the given container + folderPath. If you want to use wildcard to filter files, skip this setting and specify in activity source settings. | No |

-> [!NOTE]
->
-> **AzureBlob** type dataset with Parquet/Text format mentioned in next section is still supported as-is for Copy/Lookup/GetMetadata activity for backward compatibility, but it doesn't work with mapping data flow. You are suggested to use this new model going forward, and the ADF authoring UI has switched to generating these new types.
-

**Example:**
```json
@@ -357,9 +350,10 @@ To copy data to and from Blob storage in Parquet, delimited text, Avro or binary
}
```
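As above, the example body is hidden by the diff view. A minimal sketch of a format-based Blob dataset using these `location` properties follows, with placeholder names; the `AzureBlobStorageLocation` type name is an assumption following the same new-model naming pattern as the other connectors.

```json
{
    "name": "AzureBlobDelimitedTextDataset",
    "properties": {
        "type": "DelimitedText",
        "linkedServiceName": {
            "referenceName": "<Azure Blob storage linked service name>",
            "type": "LinkedServiceReference"
        },
        "schema": [],
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "<container name>",
                "folderPath": "folder/subfolder"
            },
            "columnDelimiter": ",",
            "firstRowAsHeader": true
        }
    }
}
```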

-### Other format dataset
+### Legacy dataset model

-To copy data to and from Blob storage in ORC/JSON format, set the type property of the dataset to **AzureBlob**. The following properties are supported.
+>[!NOTE]
+>The following dataset model is still supported as-is for backward compatibility. You are suggested to use the new model mentioned in above section going forward, and the ADF authoring UI has switched to generating the new model.

| Property | Description | Required |
|:--- |:--- |:--- |
@@ -410,12 +404,9 @@ For a full list of sections and properties available for defining activities, se

### Blob storage as a source type

-- To copy from **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet, delimited text, JSON, Avro and binary format source](#format-based-source) section.
-- To copy from other formats like **ORC format**, refer to [Other format source](#other-format-source) section.
#### <a name="format-based-source"></a> Parquet, delimited text, JSON, Avro and binary format source
-
-To copy data to and from Blob storage in **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet format](format-parquet.md), [Delimited text format](format-delimited-text.md), [Avro format](format-avro.md) and [Binary format](format-binary.md) article on format-based dataset and supported settings. The following properties are supported for Azure Blob under `storeSettings` settings in format-based copy source:
+The following properties are supported for Azure Blob under `storeSettings` settings in format-based copy source:
@@ -471,9 +462,10 @@ To copy data to and from Blob storage in **Parquet, delimited text, JSON, Avro a
]
```

-#### Other format source
+#### Legacy source model

-To copy data from Blob storage in **ORC format**, set the source type in the copy activity to **BlobSource**. The following properties are supported in the copy activity **source** section.
+>[!NOTE]
+>The following copy source model is still supported as-is for backward compatibility. You are suggested to use the new model mentioned above going forward, and the ADF authoring UI has switched to generating the new model.

| Property | Description | Required |
|:--- |:--- |:--- |
@@ -515,22 +507,16 @@ To copy data from Blob storage in **ORC format**, set the source type in the cop

### Blob storage as a sink type

-- To copy from **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet, delimited text, JSON, Avro and binary format source](#format-based-source) section.
-- To copy from other formats like **ORC format**, refer to [Other format source](#other-format-source) section.
-
-
#### <a name="format-based-source"></a> Parquet, delimited text, JSON, Avro and binary format source
-To copy data from Blob storage in **Parquet, delimited text, JSON, Avro and binary format**, refer to [Parquet format](format-parquet.md), [Delimited text format](format-delimited-text.md), [Avro format](format-avro.md) and [Binary format](format-binary.md) article on format-based copy activity source and supported settings. The following properties are supported for Azure Blob under `storeSettings` settings in format-based copy sink:
+The following properties are supported for Azure Blob under `storeSettings` settings in format-based copy sink:
| type | The type property under `storeSettings` must be set to **AzureBlobStorageWriteSetting**. | Yes |
| copyBehavior | Defines the copy behavior when the source is files from a file-based data store.<br/><br/>Allowed values are:<br/><b>- PreserveHierarchy (default)</b>: Preserves the file hierarchy in the target folder. The relative path of source file to source folder is identical to the relative path of target file to target folder.<br/><b>- FlattenHierarchy</b>: All files from the source folder are in the first level of the target folder. The target files have autogenerated names. <br/><b>- MergeFiles</b>: Merges all files from the source folder to one file. If the file or blob name is specified, the merged file name is the specified name. Otherwise, it's an autogenerated file name. | No |
| maxConcurrentConnections | The number of the connections to connect to storage store concurrently. Specify only when you want to limit the concurrent connection to the data store. | No |

-> [!NOTE]
-> For Parquet/delimited text format, **BlobSink** type copy activity sink mentioned in next section is still supported as-is for backward compatibility. You are suggested to use this new model going forward, and the ADF authoring UI has switched to generating these new types.
-

**Example:**
```json
@@ -566,9 +552,10 @@ To copy data from Blob storage in **Parquet, delimited text, JSON, Avro and bina
]
```
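The sink example is collapsed as well. Since the table above fixes the `storeSettings` type to **AzureBlobStorageWriteSetting**, a copy activity sink under the new model looks roughly like the fragment below; the `DelimitedTextSink` wrapper type and the specific property values are illustrative assumptions.

```json
"typeProperties": {
    "source": {
        "type": "<source type>"
    },
    "sink": {
        "type": "DelimitedTextSink",
        "storeSettings": {
            "type": "AzureBlobStorageWriteSetting",
            "copyBehavior": "PreserveHierarchy",
            "maxConcurrentConnections": 5
        }
    }
}
```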

-#### Other format sink
+#### Legacy sink model

-To copy data to Blob storage in **ORC format**, set the sink type in the copy activity to **BlobSink**. The following properties are supported in the **sink** section.
+>[!NOTE]
+>The following copy sink model is still supported as-is for backward compatibility. You are suggested to use the new model mentioned above going forward, and the ADF authoring UI has switched to generating the new model.