
Commit c421fb8

Create how-to-sqldb-to-cosmosdb
1 parent 33cc069 commit c421fb8

1 file changed: +89 -0 lines changed
---
title: Migrate Azure SQL Database tables to Azure Cosmos DB with Azure Data Factory
description: Take an existing normalized database schema from Azure SQL Database and migrate it to an Azure Cosmos DB denormalized collection with Azure Data Factory.
services: data-factory
author: kromerm

ms.service: data-factory
ms.workload: data-services

ms.topic: conceptual
ms.date: 04/29/2020
ms.author: makromer
---
# Migrate normalized database schema from Azure SQL Database to Azure Cosmos DB denormalized collection

By using mapping data flows in Microsoft Azure Data Factory, you can transform data from fixed-width text files. In the following task, we'll define a dataset for a text file without a delimiter and then set up substring splits based on ordinal position.

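In the data flow expression language, `substring` takes a 1-based starting position and a length, so a split by ordinal position is just a series of `substring` calls over the same source column. As a quick preview of the pattern used in step 10 below (*Column_1* is the column name you'll see on the Source transformation's **Projection** tab in step 7), an expression like the following returns the first four characters of each record:

```
substring(Column_1, 1, 4)
```
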
## Create a pipeline

1. Select **+New Pipeline** to create a new pipeline.

2. Add a data flow activity, which will be used for processing fixed-width files:

![Fixed Width Pipeline](media/data-flow/fwpipe.png)

3. In the data flow activity, select **New mapping data flow**.

4. Add a Source, Derived Column, Select, and Sink transformation:

![Fixed Width Data Flow](media/data-flow/fw2.png)

5. Configure the Source transformation to use a new dataset, which will be of the Delimited Text type.

6. Don't set any column delimiter or headers.

Now we'll set field starting points and lengths for the contents of this file. Each record in the following sample is 16 characters long and will be split into four 4-character fields:

```
1234567813572468
1234567813572468
1234567813572468
1234567813572468
1234567813572468
1234567813572468
1234567813572468
1234567813572468
1234567813572468
1234567813572468
1234567813572468
1234567813572468
1234567813572468
```

7. On the **Projection** tab of your Source transformation, you should see a string column that's named *Column_1*.

8. In the Derived Column transformation, create a new column.

9. We'll give the columns simple names like *col1*.

10. In the expression builder, type the following:

```substring(Column_1,1,4)```

![derived column](media/data-flow/fwderivedcol1.png)

11. Repeat step 10 for all the columns you need to parse, adjusting the starting position for each field; see the example after this step.

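With the 16-character sample records above split into four 4-character fields, the 1-based starting positions are 1, 5, 9, and 13. As a sketch, the column names from step 9 and the expressions you'd enter for each new derived column would look like this:

```
col1: substring(Column_1,1,4)
col2: substring(Column_1,5,4)
col3: substring(Column_1,9,4)
col4: substring(Column_1,13,4)
```
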
12. Select the **Inspect** tab to see the new columns that will be generated:

![inspect](media/data-flow/fwinspect.png)

13. Use the Select transformation to remove any of the columns that you don't need for the transformation, such as the original *Column_1* if you no longer need it:

![select transformation](media/data-flow/fwselect.png)

14. Use the Sink transformation to output the data to a folder:

![fixed width sink](media/data-flow/fwsink.png)

Here's what the output looks like:

![fixed width output](media/data-flow/fxdoutput.png)

The fixed-width data is now split, with four characters assigned to each of *col1*, *col2*, *col3*, *col4*, and so on. Based on the preceding example, the data is split into four columns.
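
For example, the first sample record, `1234567813572468`, would split like this (using the illustrative column names from the preceding steps):

| col1 | col2 | col3 | col4 |
| ---- | ---- | ---- | ---- |
| 1234 | 5678 | 1357 | 2468 |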

## Next steps

* Build the rest of your data flow logic by using mapping data flow [transformations](concepts-data-flow-overview.md).
