Commit 185fc53

Update how-to-move-data-in-out-of-pipelines.md
1 parent e5f0a69 commit 185fc53

1 file changed: +5 -1 lines changed

articles/machine-learning/v1/how-to-move-data-in-out-of-pipelines.md

Lines changed: 5 additions & 1 deletion
@@ -235,7 +235,11 @@ step1_output_ds = step1_output_data.register_on_complete(name='processed_data',
 
 Azure does not automatically delete intermediate data written with `OutputFileDatasetConfig`. To avoid storage charges for large amounts of unneeded data, you should either:
 
-* Programmatically delete intermediate data at the end of a pipeline job, when it is no longer needed. Data should be deleted after a 30 day period, deleting the data earlier could cause the pipeline to fail.
+> [!CAUTION]
+> Negative potential consequences of an action
+> Only delete intermediate data after 30 days from the last change date of the data. Deleting the data earlier could cause the pipeline run to fail because the pipeline will assume the intermediate data existed within 30 days.
+
+* Programmatically delete intermediate data at the end of a pipeline job, when it is no longer needed.
 * Use blob storage with a short-term storage policy for intermediate data (see [Optimize costs by automating Azure Blob Storage access tiers](/azure/storage/blobs/lifecycle-management-overview)). This policy can only be set to a workspace's non-default datastore. Use `OutputFileDatasetConfig` to export intermediate data to another datastore that isn't the default.
 ```Python
 # Get adls gen 2 datastore already registered with the workspace
