
Commit 70a1df4
Commit message: edits
Parent: b980f06

1 file changed: 5 additions, 5 deletions

articles/machine-learning/v1/how-to-move-data-in-out-of-pipelines.md (5 additions, 5 deletions)

@@ -24,7 +24,7 @@ This article provides code for importing data, transforming data, and moving dat
 
 This article shows how to:
 
-- Use `Dataset` objects for pre-existing data
+- Use `Dataset` objects for preexisting data
 - Access data within your steps
 - Split `Dataset` data into subsets, such as training and validation subsets
 - Create `OutputFileDatasetConfig` objects to transfer data to the next pipeline step

@@ -48,11 +48,11 @@ This article shows how to:
 ws = Workspace.from_config()
 ```
 
-- Some pre-existing data. This article briefly shows the use of an [Azure blob container](/azure/storage/blobs/storage-blobs-overview).
+- Some preexisting data. This article briefly shows the use of an [Azure blob container](/azure/storage/blobs/storage-blobs-overview).
 
 - Optional: An existing machine learning pipeline, such as the one described in [Create and run machine learning pipelines with Azure Machine Learning SDK](./how-to-create-machine-learning-pipelines.md).
 
-## Use `Dataset` objects for pre-existing data
+## Use `Dataset` objects for preexisting data
 
 The preferred way to ingest data into a pipeline is to use a [Dataset](/python/api/azureml-core/azureml.core.dataset%28class%29) object. `Dataset` objects represent persistent data that's available throughout a workspace.
 
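For context on the `Dataset` flow this hunk's section describes, a minimal sketch of pointing a `FileDataset` at preexisting blob data might look like the following; the datastore name and file path are illustrative placeholders, not part of this commit:

```Python
# Minimal sketch: reference preexisting blob data as a FileDataset.
# "workspaceblobstore" and the path pattern are placeholder names.
from azureml.core import Dataset, Datastore, Workspace

ws = Workspace.from_config()
datastore = Datastore.get(ws, "workspaceblobstore")

# Point a FileDataset at files that already exist in the blob container
dataset = Dataset.File.from_files(path=(datastore, "input-data/*.csv"))

# Register it so pipeline steps can retrieve it by name
dataset = dataset.register(workspace=ws, name="input_dataset", create_new_version=True)
```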

@@ -250,7 +250,7 @@ step1_output_ds = step1_output_data.register_on_complete(
 Azure doesn't automatically delete intermediate data that's written with `OutputFileDatasetConfig`. To avoid storage charges for large amounts of unneeded data, you should take one of the following actions:
 
 * Programmatically delete intermediate data at the end of a pipeline job, when it's no longer needed.
-* Use blob storage with a short-term storage policy for intermediate data. (See [Optimize costs by automating Azure Blob Storage access tiers](/azure/storage/blobs/lifecycle-management-overview).) This policy can be set only on a workspace's non-default datastore. Use `OutputFileDatasetConfig` to export intermediate data to another datastore that isn't the default.
+* Use blob storage with a short-term storage policy for intermediate data. (See [Optimize costs by automating Azure Blob Storage access tiers](/azure/storage/blobs/lifecycle-management-overview).) This policy can be set only on a workspace's nondefault datastore. Use `OutputFileDatasetConfig` to export intermediate data to another datastore that isn't the default.
 
 ```Python
 # Get Data Lake Storage Gen2 datastore that's already registered with the workspace
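
The hunk truncates the article's code block after its first comment. As a companion, a rough sketch of routing intermediate output to a nondefault datastore could look like this, assuming a nondefault blob datastore is already registered with the workspace (its name below is hypothetical):

```Python
# Rough sketch: send intermediate output to a nondefault datastore so a
# short-term lifecycle policy can expire it. The datastore name is hypothetical.
from azureml.core import Datastore, Workspace
from azureml.data import OutputFileDatasetConfig

ws = Workspace.from_config()
intermediate_store = Datastore.get(ws, "intermediate_blob_datastore")

# The {run-id} placeholder keeps each run's intermediate files in its own folder
step1_output_data = OutputFileDatasetConfig(
    name="processed_data",
    destination=(intermediate_store, "intermediate/{run-id}")
)
```
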
@@ -263,7 +263,7 @@ Azure doesn't automatically delete intermediate data that's written with `Output
 > [!CAUTION]
 > Only delete intermediate data after 30 days from the last change date of the data. Deleting intermediate data earlier could cause the pipeline run to fail because the pipeline assumes the data exists for a 30 day period for reuse.
 
-For more information, see [Plan and manage costs for Azure Machine Learning](../concept-plan-manage-cost.md).
+For more information, see [Plan to manage costs for Azure Machine Learning](../concept-plan-manage-cost.md).
 
 ## Next steps
 
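For the "programmatically delete" option the changed section mentions, a cleanup job might look like the following sketch using the azure-storage-blob SDK; the connection string, container name, and prefix are placeholders, and the date check honors the 30-day reuse window from the caution above:

```Python
# Hypothetical cleanup sketch with azure-storage-blob; all names are placeholders.
from datetime import datetime, timedelta, timezone
from azure.storage.blob import ContainerClient

container = ContainerClient.from_connection_string(
    conn_str="<storage-connection-string>",  # placeholder
    container_name="intermediate-data"       # placeholder
)

# Delete intermediate blobs only after the 30-day reuse window has passed
cutoff = datetime.now(timezone.utc) - timedelta(days=30)
for blob in container.list_blobs(name_starts_with="intermediate/"):
    if blob.last_modified < cutoff:
        container.delete_blob(blob.name)
```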
