Commit 67dea62

Update format-delta.md
1 parent 5873f3d commit 67dea62

1 file changed

+3
-1
lines changed

articles/data-factory/format-delta.md

Lines changed: 3 additions & 1 deletion
@@ -141,8 +141,10 @@ In Settings tab, you will find three more options to optimize delta sink transfo

 * When **Auto compact** is enabled, after an individual write, transformation checks if files can further be compacted, and runs a quick OPTIMIZE job (with 128 MB file sizes instead of 1GB) to further compact files for partitions that have the most number of small files. Auto compaction helps in coalescing a large number of small files into a smaller number of large files. Auto compaction only kicks in when there are at least 50 files. Once a compaction operation is performed, it creates a new version of the table, and writes a new file containing the data of several previous files in a compact compressed form.

-* When **Optimize write** is enabled, sink transformation dynamically optimizes partition sizes based on the actual data by attempting to write out 128 MB files for each table partition. This is an approximate size and can vary depending on dataset characteristics. Optimized writes improve the overall efficiency of the *writes and subsequent reads*. It organizes partitions such that the performance of subsequent reads will improve.
+* When **Optimize write** is enabled, sink transformation dynamically optimizes partition sizes based on the actual data by attempting to write out 128 MB files for each table partition. This is an approximate size and can vary depending on dataset characteristics. Optimized writes improve the overall efficiency of the *writes and subsequent reads*. It organizes partitions such that the performance of subsequent reads will improve

+> [!TIP]
+> The optimized write process will slow down your overall ETL job because the Sink will issue the Spark Delta Lake Optimize command after your data is processed. It is recommended to use Optimized Write sparingly. For example, if you have an hourly data pipeline, execute a data flow with Optimized Write daily.

 ### Known limitations
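
Reviewer note: the scheduling advice in the tip added by this commit (hourly pipeline, Optimized Write only once a day) can be sketched as a small decision function. This is a minimal illustration, not part of the commit or of any Data Factory API: the names `OPTIMIZE_WRITE_HOUR_UTC` and `use_optimize_write` are hypothetical, and how the resulting flag selects a data flow with or without Optimize write is left to the pipeline author.

```python
from datetime import datetime, timezone

# Hypothetical setting: the UTC hour of the single daily run that
# should execute the data flow with Optimize write enabled.
OPTIMIZE_WRITE_HOUR_UTC = 2


def use_optimize_write(run_time: datetime,
                       optimize_hour: int = OPTIMIZE_WRITE_HOUR_UTC) -> bool:
    """Return True only for the hourly run that falls in the chosen UTC hour.

    run_time is expected to be timezone-aware; it is converted to UTC
    before the hour is compared.
    """
    return run_time.astimezone(timezone.utc).hour == optimize_hour


# Simulate one day of hourly pipeline runs and count how many of them
# would run with Optimize write enabled.
runs = [datetime(2024, 1, 1, hour, tzinfo=timezone.utc) for hour in range(24)]
flags = [use_optimize_write(run) for run in runs]
optimized_runs = sum(flags)
```

With this rule, 23 of the 24 hourly runs skip the extra optimize step and only one run per day pays its cost, which matches the "hourly pipeline, daily Optimized Write" example in the tip.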
