@@ -35,11 +33,11 @@ Automated ML in a pipeline is represented by an `AutoMLStep` object. The `AutoML
 
 There are several subclasses of `PipelineStep`. In addition to the `AutoMLStep`, this article will show a `PythonScriptStep` for data preparation and another for registering the model.
 
-The preferred way to initially move data _into_ an ML pipeline is with `Dataset` objects. To move data _between_ steps and possible save data output from jobs, the preferred way is with [`OutputFileDatasetConfig`](/python/api/azureml-core/azureml.data.outputfiledatasetconfig) and [`OutputTabularDatasetConfig`](/python/api/azureml-core/azureml.data.output_dataset_config.outputtabulardatasetconfig) objects. To be used with `AutoMLStep`, the `PipelineData` object must be transformed into a `PipelineOutputTabularDataset` object. For more information, see [Input and output data from ML pipelines](how-to-move-data-in-out-of-pipelines.md).
+The preferred way to initially move data _into_ an ML pipeline is with `Dataset` objects. To move data _between_ steps and possible save data output from runs, the preferred way is with [`OutputFileDatasetConfig`](/python/api/azureml-core/azureml.data.outputfiledatasetconfig) and [`OutputTabularDatasetConfig`](/python/api/azureml-core/azureml.data.output_dataset_config.outputtabulardatasetconfig) objects. To be used with `AutoMLStep`, the `PipelineData` object must be transformed into a `PipelineOutputTabularDataset` object. For more information, see [Input and output data from ML pipelines](how-to-move-data-in-out-of-pipelines.md).
 
 The `AutoMLStep` is configured via an `AutoMLConfig` object. `AutoMLConfig` is a flexible class, as discussed in [Configure automated ML experiments in Python](./how-to-configure-auto-train.md#configure-your-experiment-settings).
 
-A `Pipeline` runs in an `Experiment`. The pipeline `Job` has, for each step, a child `StepJob`. The outputs of the automated ML `StepJob` are the training metrics and highest-performing model.
+A `Pipeline` runs in an `Experiment`. The pipeline `Run` has, for each step, a child `StepRun`. The outputs of the automated ML `StepRun` are the training metrics and highest-performing model.
 
 To make things concrete, this article creates a simple pipeline for a classification task. The task is predicting Titanic survival, but we won't be discussing the data or task except in passing.
 
@@ -104,9 +102,9 @@ After that, the code checks if the AML compute target `'cpu-cluster'` already ex
 
 The code blocks until the target is provisioned and then prints some details of the just-created compute target. Finally, the named compute target is retrieved from the workspace and assigned to `compute_target`.
 
-### Configure the training job
+### Configure the training run
 
-The runtime context is set by creating and configuring a `JobConfiguration` object. Here we set the compute target.
+The runtime context is set by creating and configuring a `RunConfiguration` object. Here we set the compute target.
 
 ```python
 from azureml.core.runconfig import RunConfiguration
-The snippet shows an idiom commonly used with `AutoMLConfig`. Arguments that are more fluid (hyperparameter-ish) are specified in a separate dictionary while the values less likely to change are specified directly in the `AutoMLConfig` constructor. In this case, the `automl_settings` specify a brief job: the job will stop after only 2 iterations or 15 minutes, whichever comes first.
+The snippet shows an idiom commonly used with `AutoMLConfig`. Arguments that are more fluid (hyperparameter-ish) are specified in a separate dictionary while the values less likely to change are specified directly in the `AutoMLConfig` constructor. In this case, the `automl_settings` specify a brief run: the run will stop after only 2 iterations or 15 minutes, whichever comes first.
 
 The `automl_settings` dictionary is passed to the `AutoMLConfig` constructor as kwargs. The other parameters aren't complex:
 
@@ -402,7 +400,7 @@ run = experiment.submit(pipeline, show_output=True)
 run.wait_for_completion()
 ```
 
-The code above combines the data preparation, automated ML, and model-registering steps into a `Pipeline` object. It then creates an `Experiment` object. The `Experiment` constructor will retrieve the named experiment if it exists or create it if necessary. It submits the `Pipeline` to the `Experiment`, creating a `Run` object that will asynchronously run the pipeline. The `wait_for_completion()` function blocks until the job completes.
+The code above combines the data preparation, automated ML, and model-registering steps into a `Pipeline` object. It then creates an `Experiment` object. The `Experiment` constructor will retrieve the named experiment if it exists or create it if necessary. It submits the `Pipeline` to the `Experiment`, creating a `Run` object that will asynchronously run the pipeline. The `wait_for_completion()` function blocks until the run completes.
 
 ### Examine pipeline results
 
@@ -456,11 +454,11 @@ with open(model_filename, "rb" ) as f:
 
 For more information on loading and working with existing models, see [Use an existing model with Azure Machine Learning](how-to-deploy-and-where.md).
 
-### Download the results of an automated ML job
+### Download the results of an automated ML run
 
-If you've been following along with the article, you'll have an instantiated `job` object. But you can also retrieve completed `Job` objects from the `Workspace` by way of an `Experiment` object.
+If you've been following along with the article, you'll have an instantiated `run` object. But you can also retrieve completed `Run` objects from the `Workspace` by way of an `Experiment` object.
 
-The workspace contains a complete record of all your experiments and jobs. You can either use the portal to find and download the outputs of experiments or use code. To access the records from a historic job, use Azure Machine Learning to find the ID of the job in which you are interested. With that ID, you can choose the specific `job` by way of the `Workspace` and `Experiment`.
+The workspace contains a complete record of all your experiments and runs. You can either use the portal to find and download the outputs of experiments or use code. To access the records from a historic run, use Azure Machine Learning to find the ID of the run in which you are interested. With that ID, you can choose the specific `run` by way of the `Workspace` and `Experiment`.
 run = next(run for run in ex.get_runs() if run.id == run_id)
 ```
 
-You would have to change the strings in the above code to the specifics of your historical job. The snippet above assumes that you've assigned `ws` to the relevant `Workspace` with the normal `from_config()`. The experiment of interest is directly retrieved and then the code finds the `Job` of interest by matching the `run.id` value.
+You would have to change the strings in the above code to the specifics of your historical run. The snippet above assumes that you've assigned `ws` to the relevant `Workspace` with the normal `from_config()`. The experiment of interest is directly retrieved and then the code finds the `Run` of interest by matching the `run.id` value.
 
-Once you have a `Job` object, you can download the metrics and model.
+Once you have a `Run` object, you can download the metrics and model.
 
 ```python
 automl_run = next(r for r in run.get_children() if r.name == 'AutoML_Classification')
-Each `Job` object contains `StepRun` objects that contain information about the individual pipeline step job. The `job` is searched for the `StepRun` object for the `AutoMLStep`. The metrics and model are retrieved using their default names, which are available even if you don't pass `PipelineData` objects to the `outputs` parameter of the `AutoMLStep`.
+Each `Run` object contains `StepRun` objects that contain information about the individual pipeline step run. The `run` is searched for the `StepRun` object for the `AutoMLStep`. The metrics and model are retrieved using their default names, which are available even if you don't pass `PipelineData` objects to the `outputs` parameter of the `AutoMLStep`.
 
 Finally, the actual metrics and model are downloaded to your local machine, as was discussed in the "Examine pipeline results" section above.
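
A note on the lookup idiom in the changed text: both the run-by-ID search and the `StepRun`-by-name search use a bare `next(...)` over a generator, which raises `StopIteration` when nothing matches. The sketch below shows one possible way to make a missing match explicit by giving `next()` a default. It is only an illustration: `FakeStepRun` and `find_step_run` are hypothetical names invented here and are not part of the Azure ML SDK. With the real SDK, the assumption is that you would pass the result of `run.get_children()` in place of the list.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class FakeStepRun:
    """Hypothetical stand-in for a child step run; not an Azure ML class."""
    name: str
    id: str


def find_step_run(children: Iterable[FakeStepRun], step_name: str) -> Optional[FakeStepRun]:
    """Return the first child whose name matches, or None if there is no match."""
    # Passing a default to next() avoids the StopIteration that a bare next()
    # raises when the generator is exhausted without finding a match.
    return next((child for child in children if child.name == step_name), None)


# Illustrative data only; real step names and IDs come from your pipeline.
children = [
    FakeStepRun(name="prepare_data", id="step-1"),
    FakeStepRun(name="AutoML_Classification", id="step-2"),
]

automl_run = find_step_run(children, "AutoML_Classification")
missing = find_step_run(children, "no_such_step")

if missing is None:
    print("No step named 'no_such_step' in this run")
```

Returning `None` lets the caller decide whether a missing step is an error, rather than surfacing an unexplained `StopIteration` from inside the lookup.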