Commit 2d4be17

moving json docs, adding parquet to wrangling
1 parent 548f24c

4 files changed: +150 −145 lines changed

.openpublishing.redirection.json

Lines changed: 5 additions & 0 deletions
```diff
@@ -28465,6 +28465,11 @@
     "redirect_url": "/azure/data-factory/v1/data-factory-web-table-connector",
     "redirect_document_id": true
   },
+  {
+    "source_path": "articles/data-factory/concepts-data-flow-json.md",
+    "redirect_url": "/azure/data-factory/format-json.md",
+    "redirect_document_id": true
+  },
   {
     "source_path": "articles/data-lake-store/data-lake-store-authenticate-using-active-directory.md",
     "redirect_url": "/azure/data-lake-store/data-lake-store-service-to-service-authenticate-using-active-directory",
```
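The entry added above redirects the retired `concepts-data-flow-json.md` article to the new format-json page. As a rough illustration of how such a redirection map is consumed, here is a minimal Python sketch (the `resolve_redirect` helper is hypothetical, not part of the Open Publishing toolchain):

```python
import json

# Illustrative fragment in the shape of .openpublishing.redirection.json.
REDIRECTIONS = json.loads("""
{
  "redirections": [
    {
      "source_path": "articles/data-factory/concepts-data-flow-json.md",
      "redirect_url": "/azure/data-factory/format-json.md",
      "redirect_document_id": true
    }
  ]
}
""")

def resolve_redirect(source_path, doc=REDIRECTIONS):
    # Return the redirect URL for a moved article, or None if no entry applies.
    for entry in doc["redirections"]:
        if entry["source_path"] == source_path:
            return entry["redirect_url"]
    return None

print(resolve_redirect("articles/data-factory/concepts-data-flow-json.md"))
```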

articles/data-factory/concepts-data-flow-json.md

Lines changed: 0 additions & 141 deletions
@@ -12,147 +12,6 @@ ms.date: 08/30/2019

# Mapping data flow JSON handling

## Creating JSON structures in Derived Column

You can add a complex column to your data flow via the derived column expression builder. In the derived column transformation, add a new column and open the expression builder by clicking on the blue box. To make a column complex, you can enter the JSON structure manually or use the UX to add subcolumns interactively.

### Using the expression builder UX

In the output schema side pane, hover over a column and click the plus icon. Select **Add subcolumn** to make the column a complex type.

![Add subcolumn](media/data-flow/addsubcolumn.png "Add Subcolumn")

You can add additional columns and subcolumns in the same way. For each non-complex field, an expression can be added in the expression editor to the right.

![Complex column](media/data-flow/complexcolumn.png "Complex column")

### Entering the JSON structure manually

To manually add a JSON structure, add a new column and enter the expression in the editor. The expression follows this general format:

```
@(
    field1=0,
    field2=@(
        field1=0
    )
)
```

If this expression were entered for a column named "complexColumn", it would be written to the sink as the following JSON:

```
{
    "complexColumn": {
        "field1": 0,
        "field2": {
            "field1": 0
        }
    }
}
```

#### Sample manual script for a complete hierarchical definition

```
@(
    title=Title,
    firstName=FirstName,
    middleName=MiddleName,
    lastName=LastName,
    suffix=Suffix,
    contactDetails=@(
        email=EmailAddress,
        phone=Phone
    ),
    address=@(
        line1=AddressLine1,
        line2=AddressLine2,
        city=City,
        state=StateProvince,
        country=CountryRegion,
        postCode=PostalCode
    ),
    ids=[
        toString(CustomerID), toString(AddressID), rowguid
    ]
)
```

## Source format options

Using a JSON dataset as a source in your data flow lets you set five additional settings. These settings can be found under the **JSON settings** accordion in the **Source Options** tab.

![JSON Settings](media/data-flow/json-settings.png "JSON Settings")

### Default

By default, JSON data is read in the following format.

```
{ "json": "record 1" }
{ "json": "record 2" }
{ "json": "record 3" }
```

### Single document

If **Single document** is selected, mapping data flows read one JSON document from each file.

```json
File1.json
{
    "json": "record 1"
}
File2.json
{
    "json": "record 2"
}
File3.json
{
    "json": "record 3"
}
```

### Unquoted column names

If **Unquoted column names** is selected, mapping data flows read JSON columns that aren't surrounded by quotes.

```
{ json: "record 1" }
{ json: "record 2" }
{ json: "record 3" }
```

### Has comments

Select **Has comments** if the JSON data has C or C++ style comments.

```json
{ "json": /** comment **/ "record 1" }
{ "json": "record 2" }
{ /** comment **/ "json": "record 3" }
```

### Single quoted

Select **Single quoted** if the JSON fields and values use single quotes instead of double quotes.

```
{ 'json': 'record 1' }
{ 'json': 'record 2' }
{ 'json': 'record 3' }
```

### Backslash escaped

Select **Backslash escaped** if backslashes are used to escape characters in the JSON data.

```
{ "json": "record 1" }
{ "json": "\} \" \' \\ \n \\n record 2" }
{ "json": "record 3" }
```

## Higher-order functions

A higher-order function is a function that takes one or more functions as an argument. Below is a list of higher-order functions supported in mapping data flows that enable array operations.
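The higher-order functions mentioned above take an expression as an argument and apply it across an array column. As a loose analogy in Python (not the data flow expression language; the data flow expressions in the comments use `#item` for the current element, and are shown only for orientation):

```python
# Python analogy for array higher-order functions in mapping data flows.
# In the data flow expression language, #item refers to the current element;
# a comprehension plays that role here.
ids = [101, 102, 103]

# map: apply an expression to each element, e.g. map(ids, toString(#item))
as_strings = [str(i) for i in ids]

# filter: keep only elements matching a predicate, e.g. filter(ids, #item > 101)
large = [i for i in ids if i > 101]

print(as_strings, large)
```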

articles/data-factory/format-json.md

Lines changed: 143 additions & 2 deletions
```diff
@@ -8,7 +8,7 @@ ms.reviewer: craigg
 ms.service: data-factory
 ms.workload: data-services
 ms.topic: conceptual
-ms.date: 11/26/2019
+ms.date: 02/05/2020
 ms.author: jingwang

 ---
```
@@ -180,7 +180,148 @@

## Mapping data flow properties

```diff
-Learn details from [source transformation](data-flow-source.md) and [sink transformation](data-flow-sink.md) in mapping data flow.
+JSON file types can be used as both a sink and a source in mapping data flow.
```

### Creating JSON structures in Derived Column

You can add a complex column to your data flow via the derived column expression builder. In the derived column transformation, add a new column and open the expression builder by clicking on the blue box. To make a column complex, you can enter the JSON structure manually or use the UX to add subcolumns interactively.

#### Using the expression builder UX

In the output schema side pane, hover over a column and click the plus icon. Select **Add subcolumn** to make the column a complex type.

![Add subcolumn](media/data-flow/addsubcolumn.png "Add Subcolumn")

You can add additional columns and subcolumns in the same way. For each non-complex field, an expression can be added in the expression editor to the right.

![Complex column](media/data-flow/complexcolumn.png "Complex column")

#### Entering the JSON structure manually

To manually add a JSON structure, add a new column and enter the expression in the editor. The expression follows this general format:

```
@(
    field1=0,
    field2=@(
        field1=0
    )
)
```

If this expression were entered for a column named "complexColumn", it would be written to the sink as the following JSON:

```
{
    "complexColumn": {
        "field1": 0,
        "field2": {
            "field1": 0
        }
    }
}
```

#### Sample manual script for a complete hierarchical definition

```
@(
    title=Title,
    firstName=FirstName,
    middleName=MiddleName,
    lastName=LastName,
    suffix=Suffix,
    contactDetails=@(
        email=EmailAddress,
        phone=Phone
    ),
    address=@(
        line1=AddressLine1,
        line2=AddressLine2,
        city=City,
        state=StateProvince,
        country=CountryRegion,
        postCode=PostalCode
    ),
    ids=[
        toString(CustomerID), toString(AddressID), rowguid
    ]
)
```

### Source format options

Using a JSON dataset as a source in your data flow lets you set five additional settings. These settings can be found under the **JSON settings** accordion in the **Source Options** tab.

![JSON Settings](media/data-flow/json-settings.png "JSON Settings")

#### Default

By default, JSON data is read in the following format.

```
{ "json": "record 1" }
{ "json": "record 2" }
{ "json": "record 3" }
```

#### Single document

If **Single document** is selected, mapping data flows read one JSON document from each file.

```json
File1.json
{
    "json": "record 1"
}
File2.json
{
    "json": "record 2"
}
File3.json
{
    "json": "record 3"
}
```

#### Unquoted column names

If **Unquoted column names** is selected, mapping data flows read JSON columns that aren't surrounded by quotes.

```
{ json: "record 1" }
{ json: "record 2" }
{ json: "record 3" }
```

#### Has comments

Select **Has comments** if the JSON data has C or C++ style comments.

```json
{ "json": /** comment **/ "record 1" }
{ "json": "record 2" }
{ /** comment **/ "json": "record 3" }
```

#### Single quoted

Select **Single quoted** if the JSON fields and values use single quotes instead of double quotes.

```
{ 'json': 'record 1' }
{ 'json': 'record 2' }
{ 'json': 'record 3' }
```

#### Backslash escaped

Select **Backslash escaped** if backslashes are used to escape characters in the JSON data.

```
{ "json": "record 1" }
{ "json": "\} \" \' \\ \n \\n record 2" }
{ "json": "record 3" }
```

## Next steps
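To make the difference between the **Default** and **Single document** source settings above concrete, here is a rough Python sketch of the two read modes (illustrative only; this is not Data Factory's parser, and the helper names are invented):

```python
import json

def read_default(text):
    # Default: each non-empty line of the file is a separate JSON record
    # (the JSON Lines pattern shown in the doc's Default example).
    return [json.loads(line) for line in text.splitlines() if line.strip()]

def read_single_document(text):
    # Single document: the whole file is parsed as one JSON document,
    # which may span multiple lines.
    return json.loads(text)

json_lines = '{ "json": "record 1" }\n{ "json": "record 2" }\n{ "json": "record 3" }'
print(read_default(json_lines))        # three records

single = '{\n    "json": "record 1"\n}'
print(read_single_document(single))    # one record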

articles/data-factory/wrangling-data-flow-overview.md

Lines changed: 2 additions & 2 deletions
```diff
@@ -38,9 +38,9 @@ and conform it to a shape for fast analytics.

 | Connector | Data format | Authentication type |
 | -- | -- | -- |
-| [Azure Blob Storage](connector-azure-blob-storage.md) | CSV | Account Key |
+| [Azure Blob Storage](connector-azure-blob-storage.md) | CSV, Parquet | Account Key |
 | [Azure Data Lake Storage Gen1](connector-azure-data-lake-store.md) | CSV | Service Principal |
-| [Azure Data Lake Storage Gen2](connector-azure-data-lake-storage.md) | CSV | Account Key, Service Principal |
+| [Azure Data Lake Storage Gen2](connector-azure-data-lake-storage.md) | CSV, Parquet | Account Key, Service Principal |
 | [Azure SQL Database](connector-azure-sql-database.md) | - | SQL authentication |
 | [Azure Synapse Analytics](connector-azure-sql-data-warehouse.md) | - | SQL authentication |
```
