Commit 86c51b0

revised workflow to match new pattern
1 parent 781ede9 commit 86c51b0

2 files changed

+92
-72
lines changed

articles/machine-learning/tutorial-automated-ml-forecast.md

Lines changed: 92 additions & 72 deletions
@@ -36,7 +36,7 @@ Also try automated machine learning for these other model types:
3636

3737
* An Azure Machine Learning workspace. See [Create workspace resources](quickstart-create-resources.md).
3838

39-
* Download the [bike-no.csv](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/forecasting-bike-share/bike-no.csv) data file
39+
* Download the [bike-no.csv](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/forecasting-bike-share/bike-no.csv) data file.
4040

4141
## Sign in to the studio
4242

@@ -46,69 +46,119 @@ For this tutorial, you create your automated ML experiment run in Azure Machine
4646

4747
1. Select your subscription and the workspace you created.
4848

49-
1. Select **Get started**.
5049

51-
1. In the left pane, select **Automated ML** under the **Author** section.
50+
## Create an experiment
5251

53-
1. Select **+New automated ML job**.
52+
1. On the left menu, select **Automated ML** under the **Authoring** section:
5453

55-
## Create and load dataset
54+
:::image type="content" source="media/how-to-use-automated-ml-for-ml-models/automated-ml-overview.png" border="false" alt-text="Screenshot that shows the Authoring overview page for Automated ML in Azure Machine Learning studio." lightbox="media/how-to-use-automated-ml-for-ml-models/automated-ml-overview-large.png":::
5655

57-
Before you configure your experiment, upload your data file to your workspace in the form of an Azure Machine Learning dataset. Doing so, allows you to ensure that your data is formatted appropriately for your experiment.
56+
The first time you work with experiments in the studio, you see an empty list and links to documentation. Otherwise, you see a list of your recent Automated ML experiments, including items created with the Azure Machine Learning SDK.
5857

59-
1. On the **Select dataset** form, select **From local files** from the **+Create dataset** drop-down.
58+
1. Select **New automated ML job** to start the **Submit an Automated ML job** process.
6059

61-
1. On the **Basic info** form, give your dataset a name and provide an optional description. The dataset type should default to **Tabular**, since automated ML in Azure Machine Learning studio currently only supports tabular datasets.
62-
63-
1. Select **Next** on the bottom left
60+
By default, the process selects the **Train automatically** option on the **Training method** tab and continues to the configuration settings.
61+
62+
1. On the **Basics settings** tab, enter values for the required settings, including the **Job** name and **Experiment** name. For this tutorial, use `automl-bikeshare` as the experiment name. You can also provide values for the optional settings, as desired.
63+
64+
1. Select **Next** to continue.
65+
66+
67+
68+
## Configure your task type and dataset
69+
70+
On the **Task type & data** tab, you specify the data asset for the experiment and the machine learning task type to use for training the model.
71+
72+
In this tutorial, you use the [bike-no.csv](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/forecasting-bike-share/bike-no.csv) data file. If you haven't downloaded the file, do so now.
73+
74+
1. On the **Task type & data** tab, select **Time series forecasting** as the task type.
75+
76+
1. Select **Create** to create a new data asset from the downloaded file.
77+
78+
1. On the **Create data asset** page, select **From local files** from the **+ Create data asset** drop-down menu.
79+
80+
1. Select **Next** to continue to the **Data type** page.
81+
82+
1. On the **Data type** page:
83+
84+
1. Enter a **Data asset** name and description.
85+
1. For the **Type**, select **Tabular** from the dropdown list.
86+
1. Select **Next**.
87+
88+
1. On the **Data source** page, select **From local files**.
89+
90+
Additional options are displayed in the left menu for you to configure the data source.
91+
92+
1. Select **Next** to continue to the **Destination storage type** page, where you specify the Azure Storage location to upload your data asset.
6493

65-
1. On the **Datastore and file selection** form, select the default datastore that was automatically set up during your workspace creation, **workspaceblobstore (Azure Blob Storage)**. This is the storage location where you upload your data file.
94+
1. For the **Datastore type**, select **Azure Blob Storage**.
95+
1. In the list of datastores, select the default datastore that was automatically set up during your workspace creation: "workspaceblobstore".
96+
1. Select **Next**.
6697

67-
1. Select **Upload files** from the **Upload** drop-down.
98+
1. On the **File and folder selection** page, use the **Upload files or folder** dropdown menu and select the **Upload files** option.
6899

69-
1. Choose the **bike-no.csv** file on your local computer. This is the file you downloaded as a [prerequisite](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/forecasting-bike-share/bike-no.csv).
100+
1. Choose the **bike-no.csv** file on your local computer. This is the file you downloaded as a [prerequisite](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/forecasting-bike-share/bike-no.csv).
70101

71-
1. Select **Next**
102+
1. After the files upload, select **Next**.
72103

73-
When the upload is complete, the Settings and preview form is pre-populated based on the file type.
74-
75-
1. Verify that the **Settings and preview** form is populated as follows and select **Next**.
76-
77-
Field|Description| Value for tutorial
78-
---|---|---
79-
File format|Defines the layout and type of data stored in a file.| Delimited
80-
Delimiter|One or more characters for specifying the boundary between  separate, independent regions in plain text or other data streams. |Comma
81-
Encoding|Identifies what bit to character schema table to use to read your dataset.| UTF-8
82-
Column headers| Indicates how the headers of the dataset, if any, will be treated.| Only first file has headers
83-
Skip rows | Indicates how many, if any, rows are skipped in the dataset.| None
84-
85-
1. The **Schema** form allows for further configuration of your data for this experiment.
104+
1. Check your uploaded data on the **Settings** page for accuracy. The fields on the page are prepopulated based on the file type of your data:
105+
106+
| Field | Description |
107+
| --- | --- |
108+
| **File format** | Defines the layout and type of data stored in a file. |
109+
| **Delimiter** | Identifies one or more characters for specifying the boundary between separate, independent regions in plain text or other data streams. |
110+
| **Encoding** | Identifies what bit to character schema table to use to read your dataset. |
111+
| **Column headers** | Indicates how the headers of the dataset, if any, are treated. |
112+
| **Skip rows** | Indicates how many, if any, rows are skipped in the dataset. |
113+
114+
1. Select **Next** to continue to the **Schema** page. This page is also prepopulated based on your **Settings** selections.
86115

87116
1. For this example, choose to ignore the **casual** and **registered** columns. These columns are a breakdown of the **cnt** column, so we don't include them.
88117

89118
1. Also for this example, leave the defaults for the **Properties** and **Type**.
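The reason for ignoring **casual** and **registered** can be checked locally before you upload the file. The following is a minimal sketch, not part of the studio workflow: it parses comma-delimited text with a header row (matching the **Settings** values described above) and confirms that the two columns sum to **cnt**. The sample rows and the helper names `breakdown_matches_total` and `drop_columns` are assumptions made for illustration; point the reader at your downloaded bike-no.csv to test the real data.

```python
import csv
import io

# Illustrative sample rows in the assumed shape of bike-no.csv.
# They are not read from the file; replace with open("bike-no.csv", encoding="utf-8").
SAMPLE_CSV = """date,casual,registered,cnt
2011-01-01,331,654,985
2011-01-02,131,670,801
2011-01-03,120,1229,1349
"""


def breakdown_matches_total(rows):
    """Return True when casual + registered equals cnt in every row."""
    return all(
        int(row["casual"]) + int(row["registered"]) == int(row["cnt"])
        for row in rows
    )


def drop_columns(rows, ignored):
    """Return copies of the rows without the ignored columns."""
    return [{k: v for k, v in row.items() if k not in ignored} for row in rows]


# Comma delimiter and a single header row are the csv module defaults.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
is_breakdown = breakdown_matches_total(rows)
trimmed = drop_columns(rows, {"casual", "registered"})
print(is_breakdown, list(trimmed[0].keys()))
```

If the check passes, keeping the two columns would leak the target into the features, which is why the tutorial excludes them.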
90-
91-
1. Select **Next**.
92119

93-
1. On the **Confirm details** form, verify the information matches what was previously populated on the **Basic info** and **Settings and preview** forms.
94120

95-
1. Select **Create** to complete the creation of your dataset.
121+
## Configure task and forecast settings
96122

97-
1. Select your dataset once it appears in the list.
123+
When the data asset is ready, Machine Learning studio returns to the **Task type & data** tab for the **Submit an Automated ML job** process. The new data asset is listed on the page.
98124

99-
1. Select **Next**.
125+
Follow these steps to complete the job configuration:
100126

101-
## Configure job
127+
1. Expand the **Select task type** dropdown menu, and choose the training model to use for the experiment. The options include classification, regression, time series forecasting, natural language processing (NLP), or computer vision. For more information about these options, see the descriptions of the [supported task types](concept-automated-ml.md#when-to-use-automl-classification-regression-forecasting-computer-vision--nlp).
102128

103-
After you load and configure your data, set up your remote compute target and select which column in your data you want to predict.
129+
1. After you specify the training model, select your dataset in the list.
130+
131+
1. Select **Next** to continue to the **Task settings** tab.
104132

105-
1. Populate the **Configure job** form as follows:
106-
1. Enter an experiment name: `automl-bikeshare`
133+
1. In the **Target column** dropdown list, select the **cnt** column to use for the model predictions.
107134

108-
1. Select **cnt** as the target column, what you want to predict. This column indicates the number of total bike share rentals.
135+
1. Select **date** as your **Time column** and leave **Time series identifiers** blank.
109136

110-
1. Select **compute cluster** as your compute type.
137+
1. The **Frequency** is how often your historic data is collected. Keep **Autodetect** selected.
111138

139+
1. The **forecast horizon** is the length of time into the future you want to predict. Deselect **Autodetect** and type 14 in the field.
140+
141+
1. Select **View additional configuration settings** and populate the fields as follows. These settings are to better control the training job and specify settings for your forecast. Otherwise, defaults are applied based on experiment selection and data.
142+
143+
Additional configurations|Description|Value for tutorial
144+
------|---------|---
145+
Primary metric| Evaluation metric that the machine learning algorithm will be measured by.|Normalized root mean squared error
146+
Explain best model| Automatically shows explainability on the best model created by automated ML.| Enable
147+
Blocked algorithms | Algorithms you want to exclude from the training job| Extreme Random Trees
148+
Additional forecasting settings| These settings help improve the accuracy of your model. <br><br> _**Forecast target lags:**_ how far back you want to construct the lags of the target variable <br> _**Target rolling window**_: specifies the size of the rolling window over which features, such as the *max, min* and *sum*, are generated. | <br><br>Forecast&nbsp;target&nbsp;lags: None <br> Target&nbsp;rolling&nbsp;window&nbsp;size: None
149+
Exit criterion| If a criterion is met, the training job is stopped. |Training&nbsp;job&nbsp;time (hours): 3 <br> Metric&nbsp;score&nbsp;threshold: None
150+
Concurrency| The maximum number of parallel iterations executed per iteration| Max&nbsp;concurrent&nbsp;iterations: 6
151+
152+
1. Select **Save**.
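To make the forecast horizon and the primary metric concrete, here's a minimal sketch in plain Python. It assumes that normalized root mean squared error is the RMSE divided by the range (max minus min) of the actual values, and it uses made-up daily counts with a naive "repeat the last value" forecast. The function names `normalized_rmse` and `split_by_horizon` are hypothetical and aren't part of Azure Machine Learning.

```python
import math


def normalized_rmse(actual, predicted):
    """Return RMSE divided by the range (max - min) of the actual values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    value_range = max(actual) - min(actual)
    if value_range == 0:
        raise ValueError("actual values must not all be identical")
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse) / value_range


def split_by_horizon(series, horizon):
    """Hold out the final `horizon` observations as the evaluation window."""
    if not 0 < horizon < len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    return series[:-horizon], series[-horizon:]


# Hypothetical daily rental counts; the numbers are illustrative only.
daily_counts = [100 + 5 * day for day in range(30)]
history, holdout = split_by_horizon(daily_counts, horizon=14)

# A naive forecast that repeats the last observed value across the horizon.
naive_forecast = [history[-1]] * len(holdout)
score = normalized_rmse(holdout, naive_forecast)
print(len(history), len(holdout), round(score, 3))
```

A horizon of 14 with daily data means the model is evaluated on how well it predicts the next 14 days. Lower normalized scores indicate a better fit.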
153+
154+
## Configure the compute target
155+
156+
After you configure the task and forecast settings, set up your remote compute target.
157+
158+
1. Populate the **Compute** form as follows:
159+
160+
1. Use the **Select compute type** dropdown list to select **compute cluster** as your compute type.
161+
112162
1. Select **+New** to configure your compute target. Automated ML only supports Azure Machine Learning compute.
113163

114164
1. Populate the **Select virtual machine** form to set up your compute.
@@ -121,7 +171,7 @@ After you load and configure your data, set up your remote compute target and se
121171

122172
1. Select **Next** to populate the **Configure settings** form.
123173

124-
Field | Description | Value for tutorial
174+
Field | Description | Value for tutorial
125175
----|---|---
126176
Compute name | A unique name that identifies your compute context. | bike-compute
127177
Min / Max nodes| To profile data, you must specify one or more nodes.|Min nodes: 1<br>Max nodes: 6
@@ -134,42 +184,12 @@ After you load and configure your data, set up your remote compute target and se
134184

135185
1. After creation, select your new compute target from the drop-down list.
136186

137-
1. Select **Next**.
138-
139-
## Select forecast settings
140-
141-
Complete the setup for your automated ML experiment by specifying the machine learning task type and configuration settings.
142-
143-
1. On the **Task type and settings** form, select **Time series forecasting** as the machine learning task type.
144-
145-
1. Select **date** as your **Time column** and leave **Time series identifiers** blank.
187+
1. Select **Next** to continue to the **Review** page. Review the summary of your configuration settings for the job.
146188

147-
1. The **Frequency** is how often your historic data is collected. Keep **Autodetect** selected.
148-
1.
149-
1. The **forecast horizon** is the length of time into the future you want to predict. Deselect **Autodetect** and type 14 in the field.
150-
151-
1. Select **View additional configuration settings** and populate the fields as follows. These settings are to better control the training job and specify settings for your forecast. Otherwise, defaults are applied based on experiment selection and data.
152-
153-
Additional&nbsp;configurations|Description|Value&nbsp;for&nbsp;tutorial
154-
------|---------|---
155-
Primary metric| Evaluation metric that the machine learning algorithm will be measured by.|Normalized root mean squared error
156-
Explain best model| Automatically shows explainability on the best model created by automated ML.| Enable
157-
Blocked algorithms | Algorithms you want to exclude from the training job| Extreme Random Trees
158-
Additional forecasting settings| These settings help improve the accuracy of your model. <br><br> _**Forecast target lags:**_ how far back you want to construct the lags of the target variable <br> _**Target rolling window**_: specifies the size of the rolling window over which features, such as the *max, min* and *sum*, is generated. | <br><br>Forecast&nbsp;target&nbsp;lags: None <br> Target&nbsp;rolling&nbsp;window&nbsp;size: None
159-
Exit criterion| If a criteria is met, the training job is stopped. |Training&nbsp;job&nbsp;time (hours): 3 <br> Metric&nbsp;score&nbsp;threshold: None
160-
Concurrency| The maximum number of parallel iterations executed per iteration| Max&nbsp;concurrent&nbsp;iterations: 6
161-
162-
Select **Save**.
163-
164-
1. Select **Next**.
165-
166-
1. On the **[Optional] Validate and test** form,
167-
1. Select k-fold cross-validation as your **Validation type**.
168-
1. Select 5 as your **Number of cross validations**.
169189

170190
## Run experiment
171191

172-
To run your experiment, select **Finish**. The **Job details** screen opens with the **Job status** at the top next to the job number. This status updates as the experiment progresses. Notifications also appear in the top right corner of the studio, to inform you of the status of your experiment.
192+
To run your experiment, select **Submit training job**. The **Job details** screen opens with the **Job status** at the top next to the job number. This status updates as the experiment progresses. Notifications also appear in the top right corner of the studio, to inform you of the status of your experiment.
173193

174194
>[!IMPORTANT]
175195
> Preparing the experiment job takes **10-15 minutes**.
