
Commit dd482bc

Merge pull request #97439 from kromerm/dataflow-1
Dataflow 1
2 parents ba672df + 93a87b1 commit dd482bc


2 files changed, +15 -1 lines changed

articles/data-factory/concepts-data-flow-expression-builder.md

Lines changed: 7 additions & 1 deletion
@@ -99,7 +99,13 @@ If you put a comment at the top of your expression, it will appear in the transf

```toString(toTimestamp('12/31/2016T00:12:00', 'MM/dd/yyyy\'T\'HH:mm:ss'), 'MM/dd/yyyy\'T\'HH:mm:ss')```

-Note that to include string literals in your timestamp output, you need to wrap your conversion inside of toString()
+Note that to include string literals in your timestamp output, you need to wrap your conversion inside of ```toString()```.
+
+Here is how to convert seconds since the Unix epoch to a date or timestamp:
+
+```toTimestamp(1574127407*1000l)```
+
+Notice the trailing "l" at the end of the expression above. It marks the literal as a long, using in-line syntax.
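For instance, a minimal sketch combining the two functions shown above (the epoch value and output pattern are illustrative, not from the original article) converts epoch milliseconds and formats the result in one expression:
```toString(toTimestamp(1574127407*1000l), 'yyyy-MM-dd HH:mm:ss')```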

## Handling column names with special characters

articles/data-factory/concepts-data-flow-performance.md

Lines changed: 8 additions & 0 deletions
@@ -116,6 +116,14 @@ For example, if you have a list of data files from July 2019 that you wish to pr

By using wildcarding, your pipeline will only contain one Data Flow activity. This will perform better than a Lookup against the Blob Store that then iterates across all matched files using a ForEach with an Execute Data Flow activity inside.
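For instance, assuming a hypothetical folder layout such as sales/2019/07/ (not from the original article), a single source wildcard path like ```sales/2019/07/*.csv``` would match every July 2019 file within that one Data Flow activity.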

+### Optimizing for CosmosDB
+
+Setting throughput and batch properties on CosmosDB sinks takes effect only during the execution of that data flow from a pipeline data flow activity. CosmosDB honors the original collection settings again after your data flow execution finishes.
+
+* Batch size: Calculate the rough row size of your data, and make sure that rowSize * batch size is less than two million. If it is, increase the batch size to get better throughput (see the worked example after this list).
+* Throughput: Set a higher throughput setting here to allow documents to write faster to CosmosDB. Keep in mind the higher RU costs that come with a high throughput setting.
+* Write Throughput Budget: Use a value smaller than the total RUs per minute. If your data flow has a high number of Spark partitions, setting a budget throughput allows more balance across those partitions.
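For example (an illustrative calculation, not taken from the original article): if each row serializes to roughly 1 KB, a batch size of 1,000 keeps rowSize * batch size at about 1,000,000, comfortably under the two million limit, so the batch size could be raised toward roughly 2,000 before reaching it.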
## Next steps

See other Data Flow articles related to performance:
