articles/machine-learning/tutorial-first-experiment-automated-ml.md (+34 −32 lines changed)
@@ -9,7 +9,7 @@ ms.topic: tutorial
 author: ssalgadodev
 ms.author: ssalgado
 ms.reviewer: manashg
-ms.date: 08/08/2023
+ms.date: 08/09/2024
 ms.custom: automl, build-2023
 #Customer intent: As a non-coding data scientist, I want to use automated machine learning techniques so that I can build a classification model.
 ---
@@ -76,20 +76,28 @@ You complete the following experiment set-up and run steps via the Azure Machine
 
-1. Select **+New automated ML job**.
+1. Select **+New automated ML job**.
+
+1. Select **Train automatically**.
+
+1. Select **Start configuring job**.
+
+1. In the **Experiment name** section, select the option **Create new** and enter this experiment name: `my-1st-automl-experiment`
 
 ## Create and load a dataset as a data asset
+
 Before you configure your experiment, upload your data file to your workspace in the form of an Azure Machine Learning data asset. For this tutorial, you can think of a data asset as your dataset for the AutoML job. Doing so ensures that your data is formatted appropriately for your experiment.
 
-1. Create a new data asset by selecting **From local files** from the **+Create data asset** drop-down.
+1. Select **Classification** as your task type.
+
+1. Create a new data asset by selecting **Create**.
 
 1. On the **Basic info** form, give your data asset a name and provide an optional description. The automated ML interface currently only supports TabularDatasets, so the dataset type should default to *Tabular*.
 
 1. Select **Next** on the bottom left.
 
 1. On the **Datastore and file selection** form, select the default datastore that was automatically set up during your workspace creation, **workspaceblobstore (Azure Blob Storage)**. This is where you upload your data file to make it available to your workspace.
-
 1. Select **Upload files** from the **Upload** drop-down.
 
 1. Choose the **bankmarketing_train.csv** file on your local computer. This is the file you downloaded as a [prerequisite](https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/bankmarketing_train.csv).
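The upload step above assumes the CSV is well-formed tabular data, which is what the *Tabular* data asset type expects. As a hedged, illustrative sketch (not part of the tutorial; the helper name and the tiny inline sample are made up, and the real file is `bankmarketing_train.csv`), a quick local sanity check that every row has the same column count as the header might look like:

```python
import csv
import io

def looks_tabular(text: str) -> bool:
    """Return True if every CSV row has the same column count as the header row."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return False
    width = len(rows[0])
    return all(len(row) == width for row in rows)

# Tiny stand-in sample with a few bank-marketing-style columns.
sample = "age,job,y\n57,technician,no\n55,services,yes\n"
print(looks_tabular(sample))  # True for a well-formed file
```

Running the same check over the downloaded file before upload can catch ragged rows early, before the studio rejects or misparses the data asset.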
@@ -124,17 +132,30 @@ Before you configure your experiment, upload your data file to your workspace in
 After you load and configure your data, you can set up your experiment. This setup includes experiment design tasks such as selecting the size of your compute environment and specifying which column you want to predict.
 
-1. Select the **Create new** radio button.
-
 1. Populate the **Configure Job** form as follows:
-1. Enter this experiment name: `my-1st-automl-experiment`
 
 1. Select **y** as the target column, what you want to predict. This column indicates whether the client subscribed to a term deposit.
+
+1. Select **View additional configuration settings** and populate the fields as follows. These settings give you better control over the training job. Otherwise, defaults are applied based on experiment selection and data.
+
+   Additional configurations | Description | Value for tutorial
+   ---|---|---
+   Primary metric | Evaluation metric that the machine learning algorithm is measured by. | AUC_weighted
+   Explain best model | Automatically shows explainability on the best model created by automated ML. | Enable
+   Blocked algorithms | Algorithms you want to exclude from the training job. | None
+   Additional classification settings | These settings help improve the accuracy of your model. | Positive class label: None
+   Exit criterion | If a criterion is met, the training job is stopped. | Training job time (hours): 1 <br> Metric score threshold: None
+   Concurrency | The maximum number of parallel iterations executed per iteration. | Max concurrent iterations: 5
+
+1. Select **Save**.
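The primary metric chosen above, AUC_weighted, averages per-class AUC weighted by class frequency; for a binary target like `y` this reduces to the ordinary AUC. As a minimal pure-Python illustration of what AUC measures (the rank-based definition, not the implementation AutoML actually uses):

```python
def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count half
    return wins / (len(pos) * len(neg))

print(auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))  # 1.0: every positive outranks every negative
```

An AUC of 0.5 means the model ranks no better than chance, which is why it is a sensible default metric for an imbalanced classification task like term-deposit prediction.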
 
-1. Select **compute cluster** as your compute type.
-
-1. A compute target is a local or cloud-based resource environment used to run your training script or host your service deployment. For this experiment, you can either try a cloud-based serverless compute (preview) or create your own cloud-based compute.
-
-1. To use serverless compute, [enable the preview feature](./how-to-use-serverless-compute.md#how-to-use-serverless-compute), select **Serverless**, and skip the rest of this step.
-
-1. To create your own compute target, select **+New** to configure your compute target.
+1. On the **[Optional] Validate and test** form:
+
+   1. Select k-fold cross-validation as your **Validation type**.
+
+   1. Select 2 as your **Number of cross validations**.
+
+1. Select **Next**.
+
+1. Select **compute cluster** as your compute type.
+
+1. A compute target is a local or cloud-based resource environment used to run your training script or host your service deployment. For this experiment, you can either try a cloud-based serverless compute (preview) or create your own cloud-based compute.
+
+1. To use serverless compute, [enable the preview feature](./how-to-use-serverless-compute.md#how-to-use-serverless-compute), select **Serverless**, and skip the rest of this step.
+
+1. To create your own compute target, select **+New** to configure your compute target.
 
 1. Populate the **Select virtual machine** form to set up your compute.
 
 Field | Description | Value for tutorial
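The validation choice added above (k-fold with 2 folds) partitions the training data into k parts, trains on k−1 of them, and validates on the held-out part, rotating until every fold has served as the validation set once. A hedged stdlib sketch of that split logic (AutoML's own splitter may differ, for example by shuffling or stratifying):

```python
def kfold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation, no shuffling."""
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        val_set = set(val)
        train = [i for i in range(n_samples) if i not in val_set]
        yield train, val
        start += size

# With 6 rows and the tutorial's 2 folds, each half validates exactly once.
for train, val in kfold_indices(6, 2):
    print(train, val)
```

With only 2 folds, each model trains on half the data, so metrics are noisier than with a larger k; the tutorial uses 2 to keep the run short.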
@@ -161,31 +182,12 @@ After you load and configure your data, you can set up your experiment. This set
 1. After creation, select your new compute target from the drop-down list.
 
-1. Select **Next**.
+1. Select **Next**.
 
-1. On the **Select task and settings** form, complete the setup for your automated ML experiment by specifying the machine learning task type and configuration settings.
-
-1. Select **Classification** as the machine learning task type.
-
-1. Select **View additional configuration settings** and populate the fields as follows. These settings are to better control the training job. Otherwise, defaults are applied based on experiment selection and data.
-
-   Primary metric | Evaluation metric that the machine learning algorithm will be measured by. | AUC_weighted
-   Explain best model | Automatically shows explainability on the best model created by automated ML. | Enable
-   Blocked algorithms | Algorithms you want to exclude from the training job. | None
-   Additional classification settings | These settings help improve the accuracy of your model. | Positive class label: None
-   Exit criterion | If a criteria is met, the training job is stopped. | Training job time (hours): 1 <br> Metric score threshold: None
-   Concurrency | The maximum number of parallel iterations executed per iteration. | Max concurrent iterations: 5
-
-   Select **Save**.
-
-1. Select **Next**.
-
-1. On the **[Optional] Validate and test** form,
-
-1. Select k-fold cross-validation as your **Validation type**.
-
-1. Select 2 as your **Number of cross validations**.
-
-1. Select **Finish** to run the experiment. The **Job Detail** screen opens with the **Job status** at the top as the experiment preparation begins. This status updates as the experiment progresses. Notifications also appear in the top right corner of the studio to inform you of the status of your experiment.
+1. Select **Submit training job** to run the experiment. The **Job overview** screen opens with the **Job status** at the top as the experiment preparation begins. This status updates as the experiment progresses. Notifications also appear in the top right corner of the studio to inform you of the status of your experiment.
 
 >[!IMPORTANT]
 > Preparation takes **10-15 minutes** to prepare the experiment run.
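The job settings configured earlier cap parallelism (Max concurrent iterations: 5) and stop on an exit criterion (a 1-hour training time budget). As a hedged stdlib sketch of that control loop, with a made-up `run_trial` stand-in for training one candidate model and a seconds-scale budget in place of the real hour:

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 5   # mirrors "Max concurrent iterations: 5"
TIME_BUDGET_S = 2.0  # seconds-scale stand-in for "Training job time (hours): 1"

def run_trial(trial_id):
    """Hypothetical stand-in for training and scoring one candidate model."""
    time.sleep(0.01)
    return trial_id, 0.5 + trial_id * 0.01  # fake, monotonically improving score

def run_experiment(n_trials):
    deadline = time.monotonic() + TIME_BUDGET_S
    results = []
    # At most MAX_CONCURRENT trials run at once; the rest queue.
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT) as pool:
        futures = [pool.submit(run_trial, i) for i in range(n_trials)]
        for fut in futures:
            if time.monotonic() > deadline:
                fut.cancel()  # exit criterion met: stop pending work
                continue
            results.append(fut.result())
    return max(results, key=lambda r: r[1])  # best (trial_id, score) so far

print(run_experiment(20))
```

This is only an analogy for how the two settings interact: concurrency bounds how many candidates train at once, while the exit criterion bounds how long the sweep as a whole may run, whichever limit bites first.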