Commit 94bcb41

Update how-to-sqldb-to-cosmosdb.md
1 parent fd606cb commit 94bcb41

1 file changed

+28
-4
lines changed

articles/data-factory/how-to-sqldb-to-cosmosdb.md

Lines changed: 28 additions & 4 deletions
@@ -57,23 +57,47 @@ The resulting CosmosDB container will embed the inner query into a single docume
5757

5858
7. On the top source, add a Derived Column transformation after "SourceOrderDetails". Call the new transformation "TypeCast". We need to round the ```UnitPrice``` column and cast it to a double data type for CosmosDB. Set the formula to: ```toDouble(round(UnitPrice,2))```.
5959

60-
8. Add another derived column and call it "MakeStruct". This is where we will create a hierarchical structure to hold the values from the details table. Remember, details is a ```M:1``` relation to header. Name the new structure "orderdetailsstruct" and create the hierarchy in this way, setting each subcolumn to the incoming column name:
60+
8. Add another derived column and call it "MakeStruct". This is where we will create a hierarchical structure to hold the values from the details table. Remember, details is a ```M:1``` relation to header. Name the new structure ```orderdetailsstruct``` and create the hierarchy in this way, setting each subcolumn to the incoming column name:
6161

6262
![Create Structure](media/data-flow/cosmosb9.png)
6363

6464
9. Now, let's go to the sales header source. Add a Join transformation. For the right-side select "MakeStruct". Leave it set to inner join and choose ```SalesOrderID``` for both sides of the join condition.
6565

6666
10. Click on the Data Preview tab in the new join that you added so that you can see your results up to this point. You should see all of the header rows joined with the detail rows. This is the result of the join being formed from the ```SalesOrderID```. Next, we'll combine the details from the common rows into the details struct and aggregate the common rows.
6767

68+
![Join](media/data-flow/cosmosb4.png)
69+
6870
11. Before we can create the arrays to denormalize these rows, we first need to remove unwanted columns and make sure the data values will match CosmosDB data types.
6971

7072
12. Add a Select transformation next and set the field mapping to look like this:
7173

72-
![Column scrubber](media/data-flow/cosmosb6.png)
74+
![Column scrubber](media/data-flow/cosmosb5.png)
7375

74-
13.
76+
13. Now let's again cast a currency column, this time ```TotalDue```. Like we did above in step 7, set the formula to: ```toDouble(round(TotalDue,2))```.
7577

76-
![Join](media/data-flow/cosmosb4.png)
78+
14. Here's where we will denormalize the rows by grouping by the common key ```SalesOrderID```. Add an Aggregate transformation and set the group by to ```SalesOrderID```.
79+
80+
15. In the aggregate formula, add a new column called "details" and use this formula to collect the values in the structure that we created earlier called ```orderdetailsstruct```: ```collect(orderdetailsstruct)```.
81+
82+
16. The aggregate transformation will only output columns that are part of aggregate or group by formulas. So, we need to include the columns from the sales header as well. To do that, add a column pattern in that same aggregate transformation. This pattern will include all other columns in the output:
83+
84+
```instr(name,'OrderQty')==0&&instr(name,'UnitPrice')==0&&instr(name,'SalesOrderID')==0```
85+
86+
17. Use the "this" syntax in the other properties so that we maintain the same column names and use the ```first()``` function as an aggregate:
87+
88+
![Aggregate](media/data-flow/cosmosb6.png)
89+
90+
18. We're ready to finish the migration flow by adding a sink transformation. Click "new" next to dataset and add a CosmosDB dataset that points to your CosmosDB database. For the collection, we'll call it "orders" and it will have no schema and no documents because it will be created on the fly.
91+
92+
19. In Sink Settings, set Partition Key to ```/SalesOrderID``` and collection action to "recreate". Make sure your mapping tab looks like this:
93+
94+
![Sink settings](media/data-flow/cosmosb7.png)
95+
96+
20. Click on data preview to make sure that you are seeing these 32 rows set to insert as new documents into your new container:
97+
98+
![Data preview](media/data-flow/cosmosb8.png)
99+
100+
If everything looks good, you are now ready to create a new pipeline, add this data flow activity to that pipeline and execute it. You can execute from debug or a triggered run. After a few minutes, you should have a new denormalized container of orders called "orders" in your CosmosDB database.
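
To make the logic of steps 7 through 17 easier to follow, here is a minimal Python sketch of the same denormalization, written outside of Data Factory. It is an illustration only, not how the data flow is implemented. The sample rows, the ```OrderDate``` column, and the helper names ```is_passthrough``` and ```denormalize``` are hypothetical, and the sketch assumes the struct holds ```OrderQty``` and ```UnitPrice```, which is what the column pattern in step 16 suggests.

```python
def is_passthrough(name):
    """Mirror of the step 16 column pattern: instr(...) == 0 means 'substring not found'."""
    return all(part not in name for part in ("OrderQty", "UnitPrice", "SalesOrderID"))


def denormalize(headers, details):
    """Embed detail rows into one document per SalesOrderID."""
    # Steps 7-8: round and cast UnitPrice, then build the struct (TypeCast + MakeStruct).
    detail_rows = [
        {
            "SalesOrderID": d["SalesOrderID"],
            "orderdetailsstruct": {
                "OrderQty": d["OrderQty"],
                "UnitPrice": float(round(d["UnitPrice"], 2)),
            },
        }
        for d in details
    ]

    # Step 9: inner join header rows to detail rows on SalesOrderID.
    joined = [
        {**h, **d}
        for h in headers
        for d in detail_rows
        if h["SalesOrderID"] == d["SalesOrderID"]
    ]

    # Step 13: round and cast TotalDue the same way.
    for row in joined:
        row["TotalDue"] = float(round(row["TotalDue"], 2))

    # Steps 14-17: group by SalesOrderID, collect the structs into "details",
    # and keep the first value of every column matched by the pattern.
    docs = {}
    for row in joined:
        key = row["SalesOrderID"]
        if key not in docs:
            doc = {"SalesOrderID": key, "details": []}
            for name, value in row.items():
                # Simplification in this sketch: the struct column itself is
                # not passed through, since it is already collected below.
                if name != "orderdetailsstruct" and is_passthrough(name):
                    doc[name] = value
            docs[key] = doc
        docs[key]["details"].append(row["orderdetailsstruct"])
    return list(docs.values())


# Hypothetical sample rows, for illustration only.
headers = [
    {"SalesOrderID": 1, "TotalDue": 10.456, "OrderDate": "2020-01-01"},
    {"SalesOrderID": 2, "TotalDue": 20.0, "OrderDate": "2020-01-02"},
    {"SalesOrderID": 3, "TotalDue": 30.0, "OrderDate": "2020-01-03"},
]
details = [
    {"SalesOrderID": 1, "OrderQty": 2, "UnitPrice": 1.234},
    {"SalesOrderID": 1, "OrderQty": 1, "UnitPrice": 5.0},
    {"SalesOrderID": 2, "OrderQty": 4, "UnitPrice": 2.5},
]
docs = denormalize(headers, details)
```

Each resulting dictionary corresponds to one document in the "orders" container, with the detail rows embedded as an array under ```details```. Because the join is an inner join, a header row with no detail rows (order 3 in the sample) produces no document.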
77101

78102
## Next steps
79103
