articles/data-factory/connector-azure-blob-storage.md (3 additions, 7 deletions)
@@ -9,7 +9,7 @@ ms.service: data-factory
ms.workload: data-services
ms.topic: conceptual
ms.custom: seo-lt-2019
-ms.date: 02/17/2020
+ms.date: 04/09/2020
---

# Copy and transform data in Azure Blob storage by using Azure Data Factory
@@ -20,7 +20,8 @@ ms.date: 02/17/2020
This article outlines how to use Copy Activity in Azure Data Factory to copy data from and to Azure Blob storage, and use Data Flow to transform data in Azure Blob storage. To learn about Azure Data Factory, read the [introductory article](introduction.md).
+>For data lake or data warehouse migration scenario, learn more from [Use Azure Data Factory to migrate data from your data lake or data warehouse to Azure](data-migration-guidance-overview.md).

## Supported capabilities
@@ -132,11 +133,6 @@ A shared access signature provides delegated access to resources in your storage
>- Data Factory now supports both **service shared access signatures** and **account shared access signatures**. For more information about shared access signatures, see [Grant limited access to Azure Storage resources using shared access signatures (SAS)](../storage/common/storage-sas-overview.md).
>- In later dataset configuration, the folder path is the absolute path starting from container level. You need to configure one aligned with the path in your SAS URI.
-
-> [!TIP]
-> To generate a service shared access signature for your storage account, you can execute the following PowerShell commands. Replace the placeholders and grant the needed permission.
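For context, a linked service that authenticates to Blob storage with a shared access signature supplies the SAS URI as a secure string, and the dataset folder path must then be the absolute path from the container level as noted above. A minimal sketch, assuming placeholder names and a placeholder SAS URI (not taken from the article):

```json
{
    "name": "AzureBlobStorageLinkedService",
    "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {
            "sasUri": {
                "type": "SecureString",
                "value": "https://<account name>.blob.core.windows.net/?sv=<version>&sr=<resource>&sig=<signature>"
            }
        }
    }
}
```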
articles/data-factory/connector-azure-data-lake-storage.md (3 additions, 0 deletions)
@@ -19,6 +19,9 @@ Azure Data Lake Storage Gen2 (ADLS Gen2) is a set of capabilities dedicated to b
This article outlines how to use Copy Activity in Azure Data Factory to copy data from and to Azure Data Lake Storage Gen2, and use Data Flow to transform data in Azure Data Lake Storage Gen2. To learn about Azure Data Factory, read the [introductory article](introduction.md).

+>[!TIP]
+>For data lake or data warehouse migration scenario, learn more from [Use Azure Data Factory to migrate data from your data lake or data warehouse to Azure](data-migration-guidance-overview.md).
+
## Supported capabilities

This Azure Data Lake Storage Gen2 connector is supported for the following activities:
| type | The type property of the dataset must be set to **DelimitedText**. | Yes |
| location | Location settings of the file(s). Each file-based connector has its own location type and supported properties under `location`. | Yes |
-| columnDelimiter | The character(s) used to separate columns in a file. Currently, multi-char delimiter is only supported for mapping data flow but not Copy activity. <br>The default value is **comma `,`**, When the column delimiter is defined as empty string which means no delimiter, the whole line is taken as a single column. | No |
-| rowDelimiter | The single character or "\r\n" used to separate rows in a file.<br>The default value is any of the following values **on read: ["\r\n", "\r", "\n"]**, and **"\n" or “\r\n” on write** by mapping data flow and Copy activity respectively. <br>When `rowDelimiter` is set to no delimiter (empty string), the `columnDelimiter` must be set as no delimiter (empty string) as well, which means to treat the entire content as a single value. | No |
+| columnDelimiter | The character(s) used to separate columns in a file. <br>The default value is **comma `,`**. When the column delimiter is defined as empty string which means no delimiter, the whole line is taken as a single column.<br>Currently, column delimiter as empty string or multi-char is only supported for mapping data flow but not Copy activity. | No |
+| rowDelimiter | The single character or "\r\n" used to separate rows in a file.<br>The default value is any of the following values **on read: ["\r\n", "\r", "\n"]**, and **"\n" or “\r\n” on write** by mapping data flow and Copy activity respectively. <br>When the row delimiter is set to no delimiter (empty string), the column delimiter must be set as no delimiter (empty string) as well, which means to treat the entire content as a single value.<br>Currently, row delimiter as empty string is only supported for mapping data flow but not Copy activity. | No |
| quoteChar | The single character to quote column values if it contains column delimiter. <br>The default value is **double quotes** `"`. <br>For mapping data flow, `quoteChar` cannot be an empty string. <br>For Copy activity, when `quoteChar` is defined as empty string, it means there is no quote char and column value is not quoted, and `escapeChar` is used to escape the column delimiter and itself. | No |
| escapeChar | The single character to escape quotes inside a quoted value.<br>The default value is **backslash `\`**. <br>For mapping data flow, `escapeChar` cannot be an empty string. <br/>For Copy activity, when `escapeChar` is defined as empty string, the `quoteChar` must be set as empty string as well, in which case make sure all column values don’t contain delimiters. | No |
| firstRowAsHeader | Specifies whether to treat/make the first row as a header line with names of columns.<br>Allowed values are **true** and **false** (default). | No |
| nullValue | Specifies the string representation of null value. <br>The default value is **empty string**. | No |
| encodingName | The encoding type used to read/write test files. <br>Allowed values are as follows: "UTF-8", "UTF-16", "UTF-16BE", "UTF-32", "UTF-32BE", "US-ASCII", “UTF-7”, "BIG5", "EUC-JP", "EUC-KR", "GB2312", "GB18030", "JOHAB", "SHIFT-JIS", "CP875", "CP866", "IBM00858", "IBM037", "IBM273", "IBM437", "IBM500", "IBM737", "IBM775", "IBM850", "IBM852", "IBM855", "IBM857", "IBM860", "IBM861", "IBM863", "IBM864", "IBM865", "IBM869", "IBM870", "IBM01140", "IBM01141", "IBM01142", "IBM01143", "IBM01144", "IBM01145", "IBM01146", "IBM01147", "IBM01148", "IBM01149", "ISO-2022-JP", "ISO-2022-KR", "ISO-8859-1", "ISO-8859-2", "ISO-8859-3", "ISO-8859-4", "ISO-8859-5", "ISO-8859-6", "ISO-8859-7", "ISO-8859-8", "ISO-8859-9", "ISO-8859-13", "ISO-8859-15", "WINDOWS-874", "WINDOWS-1250", "WINDOWS-1251", "WINDOWS-1252", "WINDOWS-1253", "WINDOWS-1254", "WINDOWS-1255", "WINDOWS-1256", "WINDOWS-1257", "WINDOWS-1258”.<br>Note mapping data flow doesn’t support UTF-7 encoding. | No |
-| compressionCodec | The compression codec used to read/write text files. <br>Allowed values are **bzip2**, **gzip**, **deflate**, **ZipDeflate**, **snappy**, or **lz4**. to use when saving the file. <br>Note currently Copy activity doesn’t support "snappy" & "lz4", and mapping data flow doesn’t support "ZipDeflate". <br>Note when using copy activity to decompress ZipDeflate file(s) and write to file-based sink data store, files will be extracted to the folder: `<path specified in dataset>/<folder named as source zip file>/`. | No |
+| compressionCodec | The compression codec used to read/write text files. <br>Allowed values are **bzip2**, **gzip**, **deflate**, **ZipDeflate**, **snappy**, or **lz4**. Default is not compressed. <br>**Note** currently Copy activity doesn’t support "snappy" & "lz4", and mapping data flow doesn’t support "ZipDeflate". <br>**Note** when using copy activity to decompress ZipDeflate file(s) and write to file-based sink data store, files will be extracted to the folder: `<path specified in dataset>/<folder named as source zip file>/`. | No |
| compressionLevel | The compression ratio. <br>Allowed values are **Optimal** or **Fastest**.<br>- **Fastest:** The compression operation should complete as quickly as possible, even if the resulting file is not optimally compressed.<br>- **Optimal**: The compression operation should be optimally compressed, even if the operation takes a longer time to complete. For more information, see [Compression Level](https://msdn.microsoft.com/library/system.io.compression.compressionlevel.aspx) topic. | No |

Below is an example of delimited text dataset on Azure Blob Storage:

@@ -57,6 +57,7 @@ Below is an example of delimited text dataset on Azure Blob Storage:
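For reference, a minimal sketch of such a delimited text dataset using the properties in the table above (the linked service name, container, and folder path are illustrative placeholders, not taken from the article):

```json
{
    "name": "DelimitedTextDataset",
    "properties": {
        "type": "DelimitedText",
        "linkedServiceName": {
            "referenceName": "<Azure Blob Storage linked service name>",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "containername",
                "folderPath": "folder/subfolder"
            },
            "columnDelimiter": ",",
            "rowDelimiter": "\n",
            "quoteChar": "\"",
            "escapeChar": "\\",
            "firstRowAsHeader": true,
            "compressionCodec": "gzip"
        }
    }
}
```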
articles/data-factory/format-json.md (4 additions, 2 deletions)
@@ -28,8 +28,9 @@ For a full list of sections and properties available for defining datasets, see
| type | The type property of the dataset must be set to **Json**. | Yes |
| location | Location settings of the file(s). Each file-based connector has its own location type and supported properties under `location`. **See details in connector article -> Dataset properties section**. | Yes |
-| compressionCodec | The compression codec used to read/write text files. <br>Allowed values are **bzip2**, **gzip**, **deflate**, **ZipDeflate**, **snappy**, or **lz4**. to use when saving the file. <br>Note currently Copy activity doesn’t support "snappy" & "lz4".<br>Note when using copy activity to decompress ZipDeflate file(s) and write to file-based sink data store, files will be extracted to the folder: `<path specified in dataset>/<folder named as source zip file>/`. | No |
-| compressionLevel | The compression ratio. <br>Allowed values are **Optimal** or **Fastest**.<br>- **Fastest:** The compression operation should complete as quickly as possible, even if the resulting file is not optimally compressed.<br>- **Optimal**: The compression operation should be optimally compressed, even if the operation takes a longer time to complete. For more information, see [Compression Level](https://msdn.microsoft.com/library/system.io.compression.compressionlevel.aspx) topic. | No |
+| compression | Group of properties to configure file compression. Configure this section when you want to do compression/decompression during activity execution. | No |
+| type | The compression codec used to read/write JSON files. <br>Allowed values are **bzip2**, **gzip**, **deflate**, **ZipDeflate**, **snappy**, or **lz4**. to use when saving the file. Default is not compressed.<br>**Note** currently Copy activity doesn’t support "snappy" & "lz4", and mapping data flow doesn’t support "ZipDeflate".<br>**Note** when using copy activity to decompress ZipDeflate file(s) and write to file-based sink data store, files will be extracted to the folder: `<path specified in dataset>/<folder named as source zip file>/`. | No. |
+| level | The compression ratio. <br>Allowed values are **Optimal** or **Fastest**.<br>- **Fastest:** The compression operation should complete as quickly as possible, even if the resulting file is not optimally compressed.<br>- **Optimal**: The compression operation should be optimally compressed, even if the operation takes a longer time to complete. For more information, see [Compression Level](https://msdn.microsoft.com/library/system.io.compression.compressionlevel.aspx) topic. | No |

Below is an example of JSON dataset on Azure Blob Storage:

@@ -51,6 +52,7 @@ Below is an example of JSON dataset on Azure Blob Storage:
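For reference, a minimal sketch of a JSON dataset using the new compression group above (the linked service name, container, and folder path are illustrative placeholders, not taken from the article):

```json
{
    "name": "JSONDataset",
    "properties": {
        "type": "Json",
        "linkedServiceName": {
            "referenceName": "<Azure Blob Storage linked service name>",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "containername",
                "folderPath": "folder/subfolder"
            },
            "compression": {
                "type": "gzip",
                "level": "Optimal"
            }
        }
    }
}
```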