articles/machine-learning/tutorial-automated-ml-forecast.md
Lines changed: 92 additions & 72 deletions
@@ -36,7 +36,7 @@ Also try automated machine learning for these other model types:

 * An Azure Machine Learning workspace. See [Create workspace resources](quickstart-create-resources.md).

-* Download the [bike-no.csv](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/forecasting-bike-share/bike-no.csv) data file
+* Download the [bike-no.csv](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/forecasting-bike-share/bike-no.csv) data file.

 ## Sign in to the studio

@@ -46,69 +46,119 @@ For this tutorial, you create your automated ML experiment run in Azure Machine

 1. Select your subscription and the workspace you created.

-1. Select **Get started**.

-1. In the left pane, select **Automated ML** under the **Author** section.
+## Create an experiment

-1. Select **+New automated ML job**.
+1. On the left menu, select **Automated ML** under the **Authoring** section:

-## Create and load dataset
+:::image type="content" source="media/how-to-use-automated-ml-for-ml-models/automated-ml-overview.png" border="false" alt-text="Screenshot that shows the Authoring overview page for Automated ML in Azure Machine Learning studio." lightbox="media/how-to-use-automated-ml-for-ml-models/automated-ml-overview-large.png":::

-Before you configure your experiment, upload your data file to your workspace in the form of an Azure Machine Learning dataset. Doing so, allows you to ensure that your data is formatted appropriately for your experiment.
+The first time you work with experiments in the studio, you see an empty list and links to documentation. Otherwise, you see a list of your recent Automated ML experiments, including items created with the Azure Machine Learning SDK.

-1. On the **Select dataset** form, select **From local files** from the **+Create dataset** drop-down.
+1. Select **New automated ML job** to start the **Submit an Automated ML job** process.

-1. On the **Basic info** form, give your dataset a name and provide an optional description. The dataset type should default to **Tabular**, since automated ML in Azure Machine Learning studio currently only supports tabular datasets.
-
-1. Select **Next** on the bottom left
+By default, the process selects the **Train automatically** option on the **Training method** tab and continues to the configuration settings.
+
+1. On the **Basics settings** tab, enter values for the required settings, including the **Job** name and **Experiment** name. For this tutorial, use `automl-bikeshare` as the experiment name. You can also provide values for the optional settings, as desired.
+
+1. Select **Next** to continue.
+
+
+
+## Configure your task type and dataset
+
+On the **Task type & data** tab, you specify the data asset for the experiment and the machine learning model to use to train the data.
+
+In this tutorial, you will use the [bike-no.csv](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/forecasting-bike-share/bike-no.csv). If you have not downloaded the file, do so now.
+
+1. On the **Task type and data** form, select **Time series forecasting** as the task type
+
+1. Select **Create** to create a new data asset from the downloaded file.
+
+1. On the **Create data asset** page, select **From local files** from the **+ Create data asset** drop-down menu.
+
+1. Select **Next** to continue to the **Data type** page.
+
+1. On the **Data type** page:
+
+1. Enter a **Data asset** name and description.
+1. For the **Type**, select **Tabular** from the dropdown list.
+1. Select **Next**.
+
+1. On the **Data source** page, select **From local files**.
+
+Additional options are displayed in the left menu for you to configure the data source.
+
+1. Select **Next** to continue to the **Destination storage type** page, where you specify the Azure Storage location to upload your data asset.

-1. On the **Datastore and file selection** form, select the default datastore that was automatically set up during your workspace creation, **workspaceblobstore (Azure Blob Storage)**. This is the storage location where you upload your data file.
+1. For the **Datastore type**, select **Azure Blob Storage**.
+1. In the list of datastores, select the default datastore that was automatically set up during your workspace creation: "workspaceblobstore".
+1. Select **Next**.

-1. Select **Upload files** from the **Upload** drop-down.
+1. On the **File and folder selection** page, use the **Upload files or folder** dropdown menu and select the **Upload files** option.

-1. Choose the **bike-no.csv** file on your local computer. This is the file you downloaded as a [prerequisite](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/forecasting-bike-share/bike-no.csv).
+1. Choose the **bike-no.csv** file on your local computer. This is the file you downloaded as a [prerequisite](https://github.com/Azure/azureml-examples/blob/v1-archive/v1/python-sdk/tutorials/automl-with-azureml/forecasting-bike-share/bike-no.csv).

-1. Select **Next**
+1. After the files upload, select **Next**.

-When the upload is complete, the Settings and preview form is pre-populated based on the file type.
-
-1. Verify that the **Settings and preview** form is populated as follows and select **Next**.
-
-Field|Description| Value for tutorial
----|---|---
-File format|Defines the layout and type of data stored in a file.| Delimited
-Delimiter|One or more characters for specifying the boundary between separate, independent regions in plain text or other data streams. |Comma
-Encoding|Identifies what bit to character schema table to use to read your dataset.| UTF-8
-Column headers| Indicates how the headers of the dataset, if any, will be treated.| Only first file has headers
-Skip rows | Indicates how many, if any, rows are skipped in the dataset.| None
-
-1. The **Schema** form allows for further configuration of your data for this experiment.
+1. Check your uploaded data on the **Settings** page for accuracy. The fields on the page are prepopulated based on the file type of your data:
+
+| Field | Description |
+| --- | --- |
+|**File format**| Defines the layout and type of data stored in a file. |
+|**Delimiter**| Identifies one or more characters for specifying the boundary between separate, independent regions in plain text or other data streams. |
+|**Encoding**| Identifies what bit to character schema table to use to read your dataset. |
+|**Column headers**| Indicates how the headers of the dataset, if any, are treated. |
+|**Skip rows**| Indicates how many, if any, rows are skipped in the dataset. |
+
+1. Select **Next** to continue to the **Schema** page. This page is also prepopulated based on your **Settings** selections.

 1. For this example, choose to ignore the **casual** and **registered** columns. These columns are a breakdown of the **cnt** column so, therefore we don't include them.

 1. Also for this example, leave the defaults for the **Properties** and **Type**.
-
-1. Select **Next**.

-1. On the **Confirm details** form, verify the information matches what was previously populated on the **Basic info** and **Settings and preview** forms.

-1. Select **Create** to complete the creation of your dataset.
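
Reviewer aside on the schema step in this hunk: the tutorial ignores the **casual** and **registered** columns because they are a breakdown of **cnt**. The following is a minimal sketch of that reasoning using only the Python standard library. The sample rows are made up for illustration, and the helper name `drop_columns` is hypothetical; it is not part of the tutorial or of any Azure Machine Learning API.

```python
import csv
import io

# Made-up sample rows for illustration only; the real bike-no.csv
# has many more rows and columns.
SAMPLE_CSV = """date,casual,registered,cnt
2011-01-01,331,654,985
2011-01-02,131,670,801
"""

# Columns the tutorial chooses to ignore on the Schema page.
IGNORED_COLUMNS = {"casual", "registered"}


def drop_columns(rows, ignored):
    """Return copies of the rows without the ignored columns."""
    return [{k: v for k, v in row.items() if k not in ignored} for row in rows]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# If casual + registered always equals cnt, the two columns leak the target.
breakdown_matches = all(
    int(r["casual"]) + int(r["registered"]) == int(r["cnt"]) for r in rows
)

training_rows = drop_columns(rows, IGNORED_COLUMNS)
print(breakdown_matches)
print(list(training_rows[0].keys()))
```

Leaving the two columns in would let a model reconstruct the target by simple addition, which is why the tutorial excludes them rather than treating them as features.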
+## Configure task and forecast settings

-1. Select your dataset once it appears in the list.
+When the data asset is ready, Machine Learning studio returns to the **Task type & data** tab for the **Submit an Automated ML job** process. The new data asset is listed on the page.

-1. Select **Next**.
+Follow these steps to complete the job configuration:

-## Configure job
+1. Expand the **Select task type** dropdown menu, and choose the training model to use for the experiment. The options include classification, regression, time series forecasting, natural language processing (NLP), or computer vision. For more information about these options, see the descriptions of the [supported task types](concept-automated-ml.md#when-to-use-automl-classification-regression-forecasting-computer-vision--nlp).

-After you load and configure your data, set up your remote compute target and select which column in your data you want to predict.
+1. After you specify the training model, select your dataset in the list.
+
+1. Select **Next** to continue to the **Task settings** tab.

-1. Populate the **Configure job** form as follows:
-1. Enter an experiment name: `automl-bikeshare`
+1. In the **Target column** dropdown list, select the **cnt** column to use for the model predictions.

-1. Select **cnt** as the target column, what you want to predict. This column indicates the number of total bike share rentals.
+1. Select **date** as your **Time column** and leave **Time series identifiers** blank.

-1. Select **compute cluster** as your compute type.
+1. The **Frequency** is how often your historic data is collected. Keep **Autodetect** selected.

+1. The **forecast horizon** is the length of time into the future you want to predict. Deselect **Autodetect** and type 14 in the field.
+
+1. Select **View additional configuration settings** and populate the fields as follows. These settings are to better control the training job and specify settings for your forecast. Otherwise, defaults are applied based on experiment selection and data.
+Primary metric| Evaluation metric that the machine learning algorithm will be measured by.|Normalized root mean squared error
+Explain best model| Automatically shows explainability on the best model created by automated ML.| Enable
+Blocked algorithms | Algorithms you want to exclude from the training job| Extreme Random Trees
+Additional forecasting settings| These settings help improve the accuracy of your model. <br><br> _**Forecast target lags:**_ how far back you want to construct the lags of the target variable <br> _**Target rolling window**_: specifies the size of the rolling window over which features, such as the *max, min* and *sum*, is generated. | <br><br>Forecast target lags: None <br> Target rolling window size: None
+Exit criterion| If a criteria is met, the training job is stopped. |Training job time (hours): 3 <br> Metric score threshold: None
+Concurrency| The maximum number of parallel iterations executed per iteration| Max concurrent iterations: 6
+
+1. Select **Save**.
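
Reviewer aside on the forecast settings added above: the values entered in the studio (target column `cnt`, time column `date`, forecast horizon 14, and the additional configuration settings) can be summarized in one place, and the meaning of the horizon can be made concrete. This is a sketch under stated assumptions: `ForecastJobSettings` and `horizon_dates` are hypothetical names, not Azure Machine Learning SDK classes; the daily step is an assumption, since the tutorial leaves **Frequency** on **Autodetect**; and the last observed date is invented for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ForecastJobSettings:
    """Hypothetical record of the studio form values (not an SDK class)."""

    target_column: str = "cnt"
    time_column: str = "date"
    forecast_horizon: int = 14  # number of future periods to predict
    primary_metric: str = "Normalized root mean squared error"
    blocked_algorithms: tuple = ("Extreme Random Trees",)
    training_job_time_hours: int = 3  # exit criterion
    max_concurrent_iterations: int = 6


def horizon_dates(last_observed, horizon, step=timedelta(days=1)):
    """List the future timestamps that a forecast of `horizon` periods covers.

    Assumes one period per `step`; the daily default is an assumption.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    return [last_observed + step * i for i in range(1, horizon + 1)]


settings = ForecastJobSettings()

# Hypothetical last date in the training data.
future = horizon_dates(date(2012, 8, 31), settings.forecast_horizon)
print(future[0], future[-1], len(future))
```

With a daily series, a horizon of 14 means the model is asked for the 14 days that follow the last row of training data, not for a single value 14 days out.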
+
+## Configure the compute target
+
+After you load and configure your data, set up your remote compute target and select which column in your data you want to predict.
+
+1. Populate the **Compute** form as follows:
+
+1. Use the **Select compute type** dropdown list to select **compute cluster** as your compute type.
+
 1. Select **+New** to configure your compute target. Automated ML only supports Azure Machine Learning compute.

 1. Populate the **Select virtual machine** form to set up your compute.
@@ -121,7 +171,7 @@ After you load and configure your data, set up your remote compute target and se

 1. Select **Next** to populate the **Configure settings form**.

-Field | Description | Value for tutorial
+Field | Description | Value for tutorial
 ----|---|---
 Compute name | A unique name that identifies your compute context. | bike-compute
 Min / Max nodes| To profile data, you must specify one or more nodes.|Min nodes: 1<br>Max nodes: 6
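
Reviewer aside on the compute values in this hunk: the tutorial pairs a cluster of 1 to 6 nodes with a maximum of 6 concurrent iterations. A small consistency check along these lines may help when adapting the values. The class and function names are hypothetical, and the rule that concurrent iterations should not exceed the maximum node count is an assumption made for this sketch (one iteration per node), not something the diff states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeClusterSettings:
    """Hypothetical holder for the Configure settings form values."""

    name: str
    min_nodes: int
    max_nodes: int


def check_cluster(settings, max_concurrent_iterations):
    """Return a list of problems; an empty list means the values look consistent."""
    problems = []
    # The tutorial states that one or more nodes are needed to profile data.
    if settings.min_nodes < 1:
        problems.append("min_nodes must be at least 1 to profile data")
    if settings.max_nodes < settings.min_nodes:
        problems.append("max_nodes must not be smaller than min_nodes")
    # Assumption for this sketch: each concurrent iteration needs its own node.
    if max_concurrent_iterations > settings.max_nodes:
        problems.append("max concurrent iterations exceeds max_nodes")
    return problems


tutorial_cluster = ComputeClusterSettings("bike-compute", min_nodes=1, max_nodes=6)
print(check_cluster(tutorial_cluster, max_concurrent_iterations=6))
```

For the tutorial's values the check returns an empty list; lowering `max_nodes` to 4 while keeping 6 concurrent iterations would report one problem.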
@@ -134,42 +184,12 @@ After you load and configure your data, set up your remote compute target and se

 1. After creation, select your new compute target from the drop-down list.

-1. Select **Next**.
-
-## Select forecast settings
-
-Complete the setup for your automated ML experiment by specifying the machine learning task type and configuration settings.
-
-1. On the **Task type and settings** form, select **Time series forecasting** as the machine learning task type.
-
-1. Select **date** as your **Time column** and leave **Time series identifiers** blank.
+1. Select **Next** to continue to the **Review** page. Review the summary of your configuration settings for the job.

-1. The **Frequency** is how often your historic data is collected. Keep **Autodetect** selected.
-1.
-1. The **forecast horizon** is the length of time into the future you want to predict. Deselect **Autodetect** and type 14 in the field.
-
-1. Select **View additional configuration settings** and populate the fields as follows. These settings are to better control the training job and specify settings for your forecast. Otherwise, defaults are applied based on experiment selection and data.
-Primary metric| Evaluation metric that the machine learning algorithm will be measured by.|Normalized root mean squared error
-Explain best model| Automatically shows explainability on the best model created by automated ML.| Enable
-Blocked algorithms | Algorithms you want to exclude from the training job| Extreme Random Trees
-Additional forecasting settings| These settings help improve the accuracy of your model. <br><br> _**Forecast target lags:**_ how far back you want to construct the lags of the target variable <br> _**Target rolling window**_: specifies the size of the rolling window over which features, such as the *max, min* and *sum*, is generated. | <br><br>Forecast target lags: None <br> Target rolling window size: None
-Exit criterion| If a criteria is met, the training job is stopped. |Training job time (hours): 3 <br> Metric score threshold: None
-Concurrency| The maximum number of parallel iterations executed per iteration| Max concurrent iterations: 6
-
-Select **Save**.
-
-1. Select **Next**.
-
-1. On the **[Optional] Validate and test** form,
-1. Select k-fold cross-validation as your **Validation type**.
-1. Select 5 as your **Number of cross validations**.

 ## Run experiment

-To run your experiment, select **Finish**. The **Job details** screen opens with the **Job status** at the top next to the job number. This status updates as the experiment progresses. Notifications also appear in the top right corner of the studio, to inform you of the status of your experiment.
+To run your experiment, select **Submit training job**. The **Job details** screen opens with the **Job status** at the top next to the job number. This status updates as the experiment progresses. Notifications also appear in the top right corner of the studio, to inform you of the status of your experiment.

 >[!IMPORTANT]
 > Preparation takes **10-15 minutes** to prepare the experiment job.