Commit 315e4a2

Acrolizx pass
1 parent 0ccbfcc commit 315e4a2

File tree

1 file changed: +6 -6 lines


articles/machine-learning/concept-pipeline-practices-tips.md

Lines changed: 6 additions & 6 deletions
@@ -26,9 +26,9 @@ There are several options for getting started if you're new to pipelines:
## How do you modularize pipeline code?

29- Modules and the `ModuleStep` class give you a great opportunity to modularize your ML code. However, it has to be kept in mind that moving between pipeline steps is vastly more expensive than a function call. The question you must ask is not so much "Are these functions and data conceptually different than those in this other section?" but "Do I want these functions and data to evolve separately?" or "Is this an expensive computation whose output I can reuse?" For more information, see this notebook [How to create Module, ModuleVersion, and use them in a pipeline with ModuleStep](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-modulestep.ipynb).
29+ Modules and the `ModuleStep` class give you a great opportunity to modularize your ML code. However, keep in mind that moving between pipeline steps is vastly more expensive than a function call. The question you must ask isn't so much "Are these functions and data conceptually different from the ones in this other section?" but "Do I want these functions and data to evolve separately?" or "Is this an expensive computation whose output I can reuse?" For more information, see the notebook [How to create Module, ModuleVersion, and use them in a pipeline with ModuleStep](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-modulestep.ipynb).

31- As discussed previously, separating data preparation from training is often one such opportunity. Sometimes data preparation is complex and time-consuming enough that it can be appropriate to break into separate pipeline steps. Other opportunities include post-training testing and analysis.
31+ As discussed previously, separating data preparation from training is often one such opportunity. Sometimes data preparation is complex and time-consuming enough that you might break the process into separate pipeline steps. Other opportunities include post-training testing and analysis.
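That split can be sketched in plain Python (an illustration of the idea only, not the Azure Machine Learning SDK; the function and file names below are hypothetical): the preparation step persists its output, so the training step can be rerun without repeating the expensive preparation.

```python
import json
import tempfile
from pathlib import Path

def prepare_data(raw, out_path):
    """Expensive preparation step: filter the raw records and persist the result."""
    cleaned = [x for x in raw if x is not None]
    out_path.write_text(json.dumps(cleaned))

def train(in_path):
    """Training step: consumes the persisted output of the preparation step."""
    data = json.loads(in_path.read_text())
    return sum(data) / len(data)  # stand-in for a real training loop

out = Path(tempfile.mkdtemp()) / "prepared.json"
prepare_data([3, None, 5, 4], out)
model = train(out)  # training can be rerun against "prepared.json" without redoing preparation
```

Because the two steps only communicate through the persisted file, each one can evolve, and be rerun, independently.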

## How do you speed pipeline iteration?

@@ -38,19 +38,19 @@ Common techniques for quickly iterating pipelines include:
- Keeping the compute instance running, to avoid startup time
- Configuring data and steps to allow reuse, so the pipeline can skip recalculating unchanged data

41- When you want to quickly iterate, you can clone your pipeline, making a pipeline, and rerun the pipeline. Another helpful technique is If you keep your Compute warm, you will not incur the cost of spinning up the new compute. If you set up the Step to allow reuse of the result of a run, then the repeated execution will reuse results where possible (when there are no change in the Steps).
41+ When you want to iterate quickly, you can clone your pipeline and rerun the clone. Another helpful technique is to keep your compute warm, so you won't incur the cost of spinning up new compute. If you set up a step to allow reuse of the result of a run, repeated executions will reuse results where possible (when there are no changes in the steps).
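The reuse behavior can be sketched in plain Python (an illustration of the concept, not the SDK's actual mechanism; all names below are hypothetical): a step's result is cached under a hash of its name and inputs, so an unchanged step is skipped on rerun.

```python
import hashlib
import json

_results_cache = {}
prep_runs = []  # tracks how many times the step body actually executes

def run_step(name, inputs, fn):
    """Execute a step only when its name or inputs changed; otherwise reuse the cached result."""
    key = hashlib.sha256(json.dumps([name, inputs], sort_keys=True).encode()).hexdigest()
    if key not in _results_cache:
        _results_cache[key] = fn(inputs)
    return _results_cache[key]

def prep(inputs):
    prep_runs.append(1)
    return [x * 2 for x in inputs]

first = run_step("prep", [1, 2], prep)
second = run_step("prep", [1, 2], prep)  # inputs unchanged, so the cached result is reused
```

On the second call the step body never runs, which is the same saving that step reuse buys you in a real pipeline rerun.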

## How do you collaborate using ML pipelines?

Separate pipelines are natural lines along which to split effort. Multiple developers or even multiple teams can work on different steps, so long as the data and arguments flowing between steps are agreed upon.

47- During active development, you can retrieve `PipelineRun` and `StepRun` metadata from the workspace and use these to download final and intermediate output and artifacts, and use those for your own modularized work.
47+ During active development, you can retrieve `PipelineRun` and `StepRun` run results from the workspace, use these objects to download final and intermediate outputs and artifacts, and use them in your own modularized work.

## Use pipelines to test techniques in isolation

51- Real-world ML solutions generally involve considerable customization of every step. The raw data often needs to be filtered, transformed, and augmented. The training processes might have several potential architectures and, for deep learning, many possible variations in terms of layer sizes and activation functions. Even with a consistent architecture, hyperparameter search can produce significant wins.
51+ Real-world ML solutions generally involve considerable customization of every step. The raw data often needs to be filtered, transformed, and augmented. The training processes might have several potential architectures and, for deep learning, many possible variations for layer sizes and activation functions. Even with a consistent architecture, hyperparameter search can produce significant wins.

53- In addition to tools like [AutoML](concept-automated-ml.md) and [automated hyperparameter search](how-to-tune-hyperparameters.md), pipelines can be an important tool for A/B testing solutions. If you have several variants of your pipeline steps, it is easy to generate separate runs trying their variations:
53+ In addition to tools like [AutoML](concept-automated-ml.md) and [automated hyperparameter search](how-to-tune-hyperparameters.md), pipelines can be an important tool for A/B testing solutions. If you have several variants of your pipeline steps, it's easy to generate separate runs trying their variations:

```python
data_preparation_variants = [data1, data2, data3]
```
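One way such variant runs might be enumerated (a sketch only: `training_variants` and the string stand-ins below are hypothetical, and in a real pipeline each combination would become its own submitted run):

```python
from itertools import product

# Hypothetical stand-ins for the variant objects in the snippet above
data_preparation_variants = ["data1", "data2", "data3"]
training_variants = ["arch_a", "arch_b"]

# Every (preparation, training) combination becomes a candidate run
candidate_runs = [
    {"prep": prep, "train": train}
    for prep, train in product(data_preparation_variants, training_variants)
]
```

With three preparation variants and two training variants, this yields six candidate runs, each testing one combination in isolation.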
