Commit fd8c3c1

Merge pull request #98515 from linda33wj/master
Update ADF copy content
2 parents 195e44d + 655994f commit fd8c3c1

18 files changed: 1854 additions and 1752 deletions

articles/data-factory/TOC.yml

Lines changed: 6 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -389,14 +389,18 @@
389389
href: delete-activity.md
390390
- name: Copy Data tool
391391
href: copy-data-tool.md
392-
- name: Performance and tuning
393-
href: copy-activity-performance.md
394392
- name: Format and compression support
395393
href: supported-file-formats-and-compression-codecs.md
394+
- name: Performance and tuning
395+
href: copy-activity-performance.md
396+
- name: Preserve metadata and ACLs
397+
href: copy-activity-preserve-metadata.md
396398
- name: Schema and type mapping
397399
href: copy-activity-schema-and-type-mapping.md
398400
- name: Fault tolerance
399401
href: copy-activity-fault-tolerance.md
402+
- name: Format and compression support (legacy)
403+
href: supported-file-formats-and-compression-codecs-legacy.md
400404
- name: Transform data
401405
href: transform-data.md
402406
items:

articles/data-factory/connector-amazon-simple-storage-service.md

Lines changed: 97 additions & 94 deletions
Large diffs are not rendered by default.

articles/data-factory/connector-azure-blob-storage.md

Lines changed: 122 additions & 122 deletions
Large diffs are not rendered by default.

articles/data-factory/connector-azure-data-lake-storage.md

Lines changed: 123 additions & 175 deletions
Large diffs are not rendered by default.

articles/data-factory/connector-azure-data-lake-store.md

Lines changed: 124 additions & 128 deletions
Large diffs are not rendered by default.

articles/data-factory/connector-azure-file-storage.md

Lines changed: 119 additions & 123 deletions
Large diffs are not rendered by default.

articles/data-factory/connector-file-system.md

Lines changed: 119 additions & 125 deletions
Large diffs are not rendered by default.

articles/data-factory/connector-ftp.md

Lines changed: 73 additions & 76 deletions
Original file line number | Diff line number | Diff line change
@@ -9,10 +9,8 @@ ms.reviewer: douglasl
99

1010
ms.service: data-factory
1111
ms.workload: data-services
12-
13-
1412
ms.topic: conceptual
15-
ms.date: 10/24/2019
13+
ms.date: 12/10/2019
1614
ms.author: jingwang
1715

1816
---
@@ -156,54 +154,6 @@ The following properties are supported for FTP under `location` settings in form
156154
}
157155
```
158156

159-
### Legacy dataset model
160-
161-
>[!NOTE]
162-
>The following dataset model is still supported as-is for backward compatibility. You are suggested to use the new model mentioned in above section going forward, and the ADF authoring UI has switched to generating the new model.
163-
164-
| Property | Description | Required |
165-
|:--- |:--- |:--- |
166-
| type | The type property of the dataset must be set to: **FileShare** |Yes |
167-
| folderPath | Path to the folder. Wildcard filter is supported, allowed wildcards are: `*` (matches zero or more characters) and `?` (matches zero or single character); use `^` to escape if your actual folder name has wildcard or this escape char inside. <br/><br/>Examples: rootfolder/subfolder/, see more examples in [Folder and file filter examples](#folder-and-file-filter-examples). |Yes |
168-
| fileName | **Name or wildcard filter** for the file(s) under the specified "folderPath". If you don't specify a value for this property, the dataset points to all files in the folder. <br/><br/>For filter, allowed wildcards are: `*` (matches zero or more characters) and `?` (matches zero or single character).<br/>- Example 1: `"fileName": "*.csv"`<br/>- Example 2: `"fileName": "???20180427.txt"`<br/>Use `^` to escape if your actual file name has wildcard or this escape char inside. |No |
169-
| format | If you want to **copy files as-is** between file-based stores (binary copy), skip the format section in both input and output dataset definitions.<br/><br/>If you want to parse files with a specific format, the following file format types are supported: **TextFormat**, **JsonFormat**, **AvroFormat**, **OrcFormat**, **ParquetFormat**. Set the **type** property under format to one of these values. For more information, see [Text Format](supported-file-formats-and-compression-codecs.md#text-format), [Json Format](supported-file-formats-and-compression-codecs.md#json-format), [Avro Format](supported-file-formats-and-compression-codecs.md#avro-format), [Orc Format](supported-file-formats-and-compression-codecs.md#orc-format), and [Parquet Format](supported-file-formats-and-compression-codecs.md#parquet-format) sections. |No (only for binary copy scenario) |
170-
| compression | Specify the type and level of compression for the data. For more information, see [Supported file formats and compression codecs](supported-file-formats-and-compression-codecs.md#compression-support).<br/>Supported types are: **GZip**, **Deflate**, **BZip2**, and **ZipDeflate**.<br/>Supported levels are: **Optimal** and **Fastest**. |No |
171-
| useBinaryTransfer | Specify whether to use the binary transfer mode. The values are true for binary mode (default), and false for ASCII. |No |
172-
173-
>[!TIP]
174-
>To copy all files under a folder, specify **folderPath** only.<br>To copy a single file with a given name, specify **folderPath** with folder part and **fileName** with file name.<br>To copy a subset of files under a folder, specify **folderPath** with folder part and **fileName** with wildcard filter.
175-
176-
>[!NOTE]
177-
>If you were using "fileFilter" property for file filter, it is still supported as-is, while you are suggested to use the new filter capability added to "fileName" going forward.
178-
179-
**Example:**
180-
181-
```json
182-
{
183-
"name": "FTPDataset",
184-
"properties": {
185-
"type": "FileShare",
186-
"linkedServiceName":{
187-
"referenceName": "<FTP linked service name>",
188-
"type": "LinkedServiceReference"
189-
},
190-
"typeProperties": {
191-
"folderPath": "folder/subfolder/",
192-
"fileName": "myfile.csv.gz",
193-
"format": {
194-
"type": "TextFormat",
195-
"columnDelimiter": ",",
196-
"rowDelimiter": "\n"
197-
},
198-
"compression": {
199-
"type": "GZip",
200-
"level": "Optimal"
201-
}
202-
}
203-
}
204-
}
205-
```
206-
207157
## Copy activity properties
208158

209159
For a full list of sections and properties available for defining activities, see the [Pipelines](concepts-pipelines-activities.md) article. This section provides a list of properties supported by FTP source.
@@ -264,10 +214,80 @@ The following properties are supported for FTP under `storeSettings` settings in
264214
]
265215
```
266216

267-
#### Legacy source model
217+
### Folder and file filter examples
218+
219+
This section describes the resulting behavior of the folder path and file name with wildcard filters.
220+
221+
| folderPath | fileName | recursive | Source folder structure and filter result (files in **bold** are retrieved)|
222+
|:--- |:--- |:--- |:--- |
223+
| `Folder*` | (empty, use default) | false | FolderA<br/>&nbsp;&nbsp;&nbsp;&nbsp;**File1.csv**<br/>&nbsp;&nbsp;&nbsp;&nbsp;**File2.json**<br/>&nbsp;&nbsp;&nbsp;&nbsp;Subfolder1<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;File3.csv<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;File4.json<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;File5.csv<br/>AnotherFolderB<br/>&nbsp;&nbsp;&nbsp;&nbsp;File6.csv |
224+
| `Folder*` | (empty, use default) | true | FolderA<br/>&nbsp;&nbsp;&nbsp;&nbsp;**File1.csv**<br/>&nbsp;&nbsp;&nbsp;&nbsp;**File2.json**<br/>&nbsp;&nbsp;&nbsp;&nbsp;Subfolder1<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**File3.csv**<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**File4.json**<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**File5.csv**<br/>AnotherFolderB<br/>&nbsp;&nbsp;&nbsp;&nbsp;File6.csv |
225+
| `Folder*` | `*.csv` | false | FolderA<br/>&nbsp;&nbsp;&nbsp;&nbsp;**File1.csv**<br/>&nbsp;&nbsp;&nbsp;&nbsp;File2.json<br/>&nbsp;&nbsp;&nbsp;&nbsp;Subfolder1<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;File3.csv<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;File4.json<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;File5.csv<br/>AnotherFolderB<br/>&nbsp;&nbsp;&nbsp;&nbsp;File6.csv |
226+
| `Folder*` | `*.csv` | true | FolderA<br/>&nbsp;&nbsp;&nbsp;&nbsp;**File1.csv**<br/>&nbsp;&nbsp;&nbsp;&nbsp;File2.json<br/>&nbsp;&nbsp;&nbsp;&nbsp;Subfolder1<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**File3.csv**<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;File4.json<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**File5.csv**<br/>AnotherFolderB<br/>&nbsp;&nbsp;&nbsp;&nbsp;File6.csv |
227+
228+
## Lookup activity properties
229+
230+
To learn details about the properties, check [Lookup activity](control-flow-lookup-activity.md).
231+
232+
## GetMetadata activity properties
233+
234+
To learn details about the properties, check [GetMetadata activity](control-flow-get-metadata-activity.md)
235+
236+
## Delete activity properties
237+
238+
To learn details about the properties, check [Delete activity](delete-activity.md)
239+
240+
## Legacy models
268241

269242
>[!NOTE]
270-
>The following copy source model is still supported as-is for backward compatibility. You are suggested to use the new model mentioned above going forward, and the ADF authoring UI has switched to generating the new model.
243+
>The following models are still supported as-is for backward compatibility. You are suggested to use the new model mentioned in above sections going forward, and the ADF authoring UI has switched to generating the new model.
244+
245+
### Legacy dataset model
246+
247+
| Property | Description | Required |
248+
|:--- |:--- |:--- |
249+
| type | The type property of the dataset must be set to: **FileShare** |Yes |
250+
| folderPath | Path to the folder. Wildcard filter is supported, allowed wildcards are: `*` (matches zero or more characters) and `?` (matches zero or single character); use `^` to escape if your actual folder name has wildcard or this escape char inside. <br/><br/>Examples: rootfolder/subfolder/, see more examples in [Folder and file filter examples](#folder-and-file-filter-examples). |Yes |
251+
| fileName | **Name or wildcard filter** for the file(s) under the specified "folderPath". If you don't specify a value for this property, the dataset points to all files in the folder. <br/><br/>For filter, allowed wildcards are: `*` (matches zero or more characters) and `?` (matches zero or single character).<br/>- Example 1: `"fileName": "*.csv"`<br/>- Example 2: `"fileName": "???20180427.txt"`<br/>Use `^` to escape if your actual file name has wildcard or this escape char inside. |No |
252+
| format | If you want to **copy files as-is** between file-based stores (binary copy), skip the format section in both input and output dataset definitions.<br/><br/>If you want to parse files with a specific format, the following file format types are supported: **TextFormat**, **JsonFormat**, **AvroFormat**, **OrcFormat**, **ParquetFormat**. Set the **type** property under format to one of these values. For more information, see [Text Format](supported-file-formats-and-compression-codecs-legacy.md#text-format), [Json Format](supported-file-formats-and-compression-codecs-legacy.md#json-format), [Avro Format](supported-file-formats-and-compression-codecs-legacy.md#avro-format), [Orc Format](supported-file-formats-and-compression-codecs-legacy.md#orc-format), and [Parquet Format](supported-file-formats-and-compression-codecs-legacy.md#parquet-format) sections. |No (only for binary copy scenario) |
253+
| compression | Specify the type and level of compression for the data. For more information, see [Supported file formats and compression codecs](supported-file-formats-and-compression-codecs-legacy.md#compression-support).<br/>Supported types are: **GZip**, **Deflate**, **BZip2**, and **ZipDeflate**.<br/>Supported levels are: **Optimal** and **Fastest**. |No |
254+
| useBinaryTransfer | Specify whether to use the binary transfer mode. The values are true for binary mode (default), and false for ASCII. |No |
255+
256+
>[!TIP]
257+
>To copy all files under a folder, specify **folderPath** only.<br>To copy a single file with a given name, specify **folderPath** with folder part and **fileName** with file name.<br>To copy a subset of files under a folder, specify **folderPath** with folder part and **fileName** with wildcard filter.
258+
259+
>[!NOTE]
260+
>If you were using "fileFilter" property for file filter, it is still supported as-is, while you are suggested to use the new filter capability added to "fileName" going forward.
261+
262+
**Example:**
263+
264+
```json
265+
{
266+
"name": "FTPDataset",
267+
"properties": {
268+
"type": "FileShare",
269+
"linkedServiceName":{
270+
"referenceName": "<FTP linked service name>",
271+
"type": "LinkedServiceReference"
272+
},
273+
"typeProperties": {
274+
"folderPath": "folder/subfolder/",
275+
"fileName": "myfile.csv.gz",
276+
"format": {
277+
"type": "TextFormat",
278+
"columnDelimiter": ",",
279+
"rowDelimiter": "\n"
280+
},
281+
"compression": {
282+
"type": "GZip",
283+
"level": "Optimal"
284+
}
285+
}
286+
}
287+
}
288+
```
289+
290+
### Legacy copy activity source model
271291

272292
| Property | Description | Required |
273293
|:--- |:--- |:--- |
@@ -307,28 +327,5 @@ The following properties are supported for FTP under `storeSettings` settings in
307327
]
308328
```
309329

310-
### Folder and file filter examples
311-
312-
This section describes the resulting behavior of the folder path and file name with wildcard filters.
313-
314-
| folderPath | fileName | recursive | Source folder structure and filter result (files in **bold** are retrieved)|
315-
|:--- |:--- |:--- |:--- |
316-
| `Folder*` | (empty, use default) | false | FolderA<br/>&nbsp;&nbsp;&nbsp;&nbsp;**File1.csv**<br/>&nbsp;&nbsp;&nbsp;&nbsp;**File2.json**<br/>&nbsp;&nbsp;&nbsp;&nbsp;Subfolder1<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;File3.csv<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;File4.json<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;File5.csv<br/>AnotherFolderB<br/>&nbsp;&nbsp;&nbsp;&nbsp;File6.csv |
317-
| `Folder*` | (empty, use default) | true | FolderA<br/>&nbsp;&nbsp;&nbsp;&nbsp;**File1.csv**<br/>&nbsp;&nbsp;&nbsp;&nbsp;**File2.json**<br/>&nbsp;&nbsp;&nbsp;&nbsp;Subfolder1<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**File3.csv**<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**File4.json**<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**File5.csv**<br/>AnotherFolderB<br/>&nbsp;&nbsp;&nbsp;&nbsp;File6.csv |
318-
| `Folder*` | `*.csv` | false | FolderA<br/>&nbsp;&nbsp;&nbsp;&nbsp;**File1.csv**<br/>&nbsp;&nbsp;&nbsp;&nbsp;File2.json<br/>&nbsp;&nbsp;&nbsp;&nbsp;Subfolder1<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;File3.csv<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;File4.json<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;File5.csv<br/>AnotherFolderB<br/>&nbsp;&nbsp;&nbsp;&nbsp;File6.csv |
319-
| `Folder*` | `*.csv` | true | FolderA<br/>&nbsp;&nbsp;&nbsp;&nbsp;**File1.csv**<br/>&nbsp;&nbsp;&nbsp;&nbsp;File2.json<br/>&nbsp;&nbsp;&nbsp;&nbsp;Subfolder1<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**File3.csv**<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;File4.json<br/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**File5.csv**<br/>AnotherFolderB<br/>&nbsp;&nbsp;&nbsp;&nbsp;File6.csv |
320-
321-
## Lookup activity properties
322-
323-
To learn details about the properties, check [Lookup activity](control-flow-lookup-activity.md).
324-
325-
## GetMetadata activity properties
326-
327-
To learn details about the properties, check [GetMetadata activity](control-flow-get-metadata-activity.md)
328-
329-
## Delete activity properties
330-
331-
To learn details about the properties, check [Delete activity](delete-activity.md)
332-
333330
## Next steps
334331
For a list of data stores supported as sources and sinks by the copy activity in Azure Data Factory, see [supported data stores](copy-activity-overview.md##supported-data-stores-and-formats).
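
Reviewer's sketch (not part of the commit): the "Folder and file filter examples" table that this diff moves within connector-ftp.md can be reproduced with a few lines of Python. This is a minimal illustration under stated assumptions, not Azure Data Factory's implementation. The names `wildcard_to_regex`, `select_files`, and `TREE` are hypothetical, matching is assumed to be case-sensitive, and `^` is treated as escaping the single character that follows it, as the property descriptions state.

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate the wildcard syntax described in the docs into a regex.

    `*` matches zero or more characters, `?` matches zero or one character,
    and `^` escapes the next character (assumed reading of the docs).
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "^" and i + 1 < len(pattern):
            parts.append(re.escape(pattern[i + 1]))  # literal next character
            i += 2
            continue
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".?")
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts))

# Folder structure from the table: a dict is a folder, None marks a file.
TREE = {
    "FolderA": {
        "File1.csv": None,
        "File2.json": None,
        "Subfolder1": {"File3.csv": None, "File4.json": None, "File5.csv": None},
    },
    "AnotherFolderB": {"File6.csv": None},
}

def select_files(tree, folder_pattern, file_pattern="*", recursive=False):
    """Return the paths of the files the filters would retrieve."""
    folder_re = wildcard_to_regex(folder_pattern)
    file_re = wildcard_to_regex(file_pattern)
    selected = []

    def walk(folder, path):
        for name, child in folder.items():
            if child is None:
                if file_re.fullmatch(name):
                    selected.append(f"{path}/{name}")
            elif recursive:
                walk(child, f"{path}/{name}")  # descend only when recursive

    for name, child in tree.items():
        # The folder pattern must match the whole top-level folder name.
        if child is not None and folder_re.fullmatch(name):
            walk(child, name)
    return selected

print(select_files(TREE, "Folder*", "*.csv", recursive=True))
```

Because `Folder*` has to match the whole folder name, `AnotherFolderB` is never selected, which agrees with all four rows of the table.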
