#Customer intent: As a professional data scientist, I want to know how to build and deploy a model with Azure Machine Learning by using Python in a Jupyter Notebook.
This tutorial series shows how features seamlessly integrate all phases of the machine learning lifecycle: prototyping, training, and operationalization.
-The first tutorial showed how to create a feature set specification with custom transformations, and then use that feature set to generate training data, enable materialization, and perform a backfill. This tutorial shows how to enable materialization, and perform a backfill. It also shows how to experiment with features, as a way to improve model performance.
+The first tutorial showed how to create a feature set specification with custom transformations. Then, it showed how to use that feature set to generate training data, enable materialization, and perform a backfill. This tutorial shows how to enable materialization and perform a backfill. It also shows how to experiment with features, as a way to improve model performance.

 In this tutorial, you learn how to:

 > [!div class="checklist"]
-> * Prototype a new `accounts` feature set specification, by using existing precomputed values as features. Then, register the local feature set specification as a feature set in the feature store. This process differs from the first tutorial, where you created a feature set that had custom transformations.
+> * Prototype a new `accounts` feature set specification, through use of existing precomputed values as features. Then, register the local feature set specification as a feature set in the feature store. This process differs from the first tutorial, where you created a feature set that had custom transformations.
 > * Select features for the model from the `transactions` and `accounts` feature sets, and save them as a feature retrieval specification.
 > * Run a training pipeline that uses the feature retrieval specification to train a new model. This pipeline uses the built-in feature retrieval component to generate the training data.
@@ -40,22 +40,22 @@ Before you proceed with this tutorial, be sure to complete the first tutorial in
 1. On the top menu, in the **Compute** dropdown list, select **Serverless Spark Compute** under **Azure Machine Learning Serverless Spark**.

-2. Configure the session:
+1. Configure the session:

 1. When the toolbar displays **Configure session**, select it.
-2. On the **Python packages** tab, select **Upload Conda file**.
-3. Upload the *conda.yml* file that you [uploaded in the first tutorial](./tutorial-get-started-with-feature-store.md#prepare-the-notebook-environment).
-4. Optionally, increase the session time-out (idle time) to avoid frequent prerequisite reruns.
+1. On the **Python packages** tab, select **Upload Conda file**.
+1. Upload the *conda.yml* file that you [uploaded in the first tutorial](./tutorial-get-started-with-feature-store.md#prepare-the-notebook-environment).
+1. As an option, you can increase the session time-out (idle time) to avoid frequent prerequisite reruns.
-8. Create a compute cluster named `cpu-cluster` in the project workspace.
+1. Create a compute cluster named `cpu-cluster` in the project workspace.

 You need this compute cluster when you run the training/batch inference jobs.
@@ -104,12 +104,12 @@ In the first tutorial, you created a `transactions` feature set that had custom
 To onboard precomputed features, you can create a feature set specification without writing any transformation code. You use a feature set specification to develop and test a feature set in a fully local development environment.

-You don't need to connect to a feature store. In this procedure, you create the feature set specification locally, and then sample the values from it. For capabilities of managed feature store, you must use a feature asset definition to register the feature set specification with a feature store. Later steps in this tutorial provide more details.
+You don't need to connect to a feature store. In this procedure, you create the feature set specification locally, and then sample the values from it. To benefit from the capabilities of managed feature store, you must use a feature asset definition to register the feature set specification with a feature store. Later steps in this tutorial provide more details.

 1. Explore the source data for the accounts.

 > [!NOTE]
-> This notebook uses sample data hosted in a publicly accessible blob container. Only a `wasbs` driver can read it in Spark. When you create feature sets by using your own source data, host those feature sets in an Azure Data Lake Storage Gen2 account, and use an `abfss` driver in the data path.
+> This notebook uses sample data hosted in a publicly accessible blob container. Only a `wasbs` driver can read it in Spark. When you create feature sets through use of your own source data, host those feature sets in an Azure Data Lake Storage Gen2 account, and use an `abfss` driver in the data path.
@@ -133,7 +133,7 @@ You don't need to connect to a feature store. In this procedure, you create the
 -`index_columns`: The join keys required to access values from the feature set.

-To learn more, see [Understanding top-level entities in managed feature store](./concept-top-level-entities-in-managed-feature-store.md) and the [CLI (v2) feature set specification YAML schema](./reference-yaml-featureset-spec.md).
+To learn more, visit the [Understanding top-level entities in managed feature store](./concept-top-level-entities-in-managed-feature-store.md) and the [CLI (v2) feature set specification YAML schema](./reference-yaml-featureset-spec.md) resources.

 As an extra benefit, persisting supports source control.
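The hunks above describe a feature set specification for precomputed features: a source, a timestamp column, `index_columns` as join keys, and a feature list, which you can sample locally and persist for source control. As a rough sketch of that idea only, the plain-Python stand-in below models those parts. It does not use the Azure Machine Learning SDK. The class `LocalFeatureSetSpec`, its field names, the column names, the placeholder path, and the sample row are all hypothetical, and JSON stands in here for the YAML file that the real specification uses.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class LocalFeatureSetSpec:
    """Hypothetical stand-in for a feature set specification (not the SDK class)."""

    source_path: str        # where the precomputed values live
    timestamp_column: str   # event time of each row
    index_columns: list     # join keys used to look up feature values
    features: list          # precomputed columns exposed as features

    def sample(self, rows, n=2):
        """Return the first n rows, projected onto key, timestamp, and feature columns."""
        wanted = self.index_columns + [self.timestamp_column] + self.features
        return [{col: row[col] for col in wanted} for row in rows[:n]]


spec = LocalFeatureSetSpec(
    source_path="wasbs://<container>@<account>.blob.core.windows.net/<path>",  # placeholder
    timestamp_column="timestamp",
    index_columns=["accountID"],
    features=["accountAge", "numPaymentRejects1dPerUser"],
)

# Hypothetical source rows; "accountCountry" is not listed as a feature, so it is dropped.
rows = [
    {
        "accountID": "A1",
        "timestamp": "2023-01-05",
        "accountAge": 3,
        "numPaymentRejects1dPerUser": 0,
        "accountCountry": "US",
    },
]

sampled = spec.sample(rows, n=1)

# Persisting the definition as text is what makes it easy to keep under source control.
persisted = json.dumps(asdict(spec), indent=2)
```

Because no transformation code is involved, the sketch only selects existing columns, which mirrors how precomputed features are onboarded.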
@@ -143,7 +143,7 @@ You don't need to connect to a feature store. In this procedure, you create the
 ## Locally experiment with unregistered features and register with feature store when ready

-As you develop features, you might want to locally test and validate them before you register them with the feature store or run training pipelines in the cloud. A combination of a local unregistered feature set (`accounts`) and a feature set registered in the feature store (`transactions`) generates training data for the machine learning model.
+As you develop features, you might want to locally test and validate them, before you register them with the feature store or run training pipelines in the cloud. A combination of a local unregistered feature set (`accounts`) and a feature set registered in the feature store (`transactions`) generates training data for the machine learning model.

 1. Select features for the model.
@@ -159,7 +159,7 @@ As you develop features, you might want to locally test and validate them before
 1. Register the `accounts` feature set with the feature store.

-After you locally experiment with feature definitions, and they seem reasonable, you can register a feature set asset definition with the feature store.
+After you locally experiment with feature definitions, and if they seem reasonable, you can register a feature set asset definition with the feature store.
@@ -169,7 +169,7 @@ As you develop features, you might want to locally test and validate them before
 ## Run a training experiment

-In the following steps, you select a list of features, run a training pipeline, and register the model. You can repeat these steps until the model performs as you want.
+In these steps, you select a list of features, run a training pipeline, and register the model. You can repeat these steps until the model performs as you want.

 1. Optionally, discover features from the feature store UI.
@@ -187,19 +187,19 @@ In the following steps, you select a list of features, run a training pipeline,
 1. Select features for the model, and export the model as a feature retrieval specification.

-In the previous steps, you selected features from a combination of registered and unregistered feature sets, for local experimentation and testing. You can now experiment in the cloud. Your model-shipping agility increases if you save the selected features as a feature retrieval specification, and then use the specification in the machine learning operations (MLOps) or continuous integration and continuous delivery (CI/CD) flow for training and inference.
+In the previous steps, you selected features from a combination of registered and unregistered feature sets for local experimentation and testing. You can now experiment in the cloud. Your model-shipping agility increases if you save the selected features as a feature retrieval specification, and then use the specification in the machine learning operations (MLOps) or continuous integration and continuous delivery (CI/CD) flow for training and inference.
-2. Export the selected features as a feature retrieval specification.
+1. Export the selected features as a feature retrieval specification.

-A feature retrieval specification is a portable definition of the feature list associated with a model. It can help streamline the development and operationalization of a machine learning model. It becomes an input to the training pipeline that generates the training data. It's then packaged with the model.
+A feature retrieval specification is a portable definition of the feature list associated with a model. It can help streamline the development and operationalization of a machine learning model. It becomes an input to the training pipeline that generates the training data. Then, it's packaged with the model.

 The inference phase uses the feature retrieval to look up the features. It integrates all phases of the machine learning lifecycle. Changes to the training/inference pipeline can stay at a minimum as you experiment and deploy.

-Use of the feature retrieval specification and the built-in feature retrieval component is optional. You can directly use the `get_offline_features()` API, as shown earlier. The name of the specification should be *feature_retrieval_spec.yaml* when it's packaged with the model. This way, the system can recognize it.
+Use of the feature retrieval specification and the built-in feature retrieval component is optional. You can directly use the `get_offline_features()` API, as shown earlier. The name of the specification should be *feature_retrieval_spec.yaml* when you package it with the model. This way, the system can recognize it.
@@ -211,7 +211,7 @@ In this procedure, you manually trigger the training pipeline. In a production s
 The training pipeline has these steps:

-1. Feature retrieval: For its input, this built-in component takes the feature retrieval specification, the observation data, and the time-stamp column name. It then generates the training data as output. It runs these steps as a managed Spark job.
+1. Feature retrieval: For its input, this built-in component takes the feature retrieval specification, the observation data, and the **time-stamp** column name. It then generates the training data as output. It runs these steps as a managed Spark job.

 1. Training: Based on the training data, this step trains the model and then generates a model (not yet registered).
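The feature retrieval step in this hunk takes a feature list, observation data, and a time-stamp column name, and produces training data. As a rough illustration of that kind of join, the plain-Python sketch below attaches to each observation the latest feature values recorded at or before the observation's timestamp. It is an assumption about the general technique (a point-in-time join), not the managed component's implementation: it doesn't call the Azure Machine Learning SDK, and the function `point_in_time_join`, the column names, and the sample rows are hypothetical. The built-in component runs as a Spark job and might apply extra rules, such as lookback windows, that aren't modeled here.

```python
from bisect import bisect_right
from collections import defaultdict


def point_in_time_join(observations, feature_rows, key, ts, feature_names):
    """Attach to each observation the latest feature values at or before its timestamp."""
    # Group feature rows by join key, in ascending timestamp order.
    by_key = defaultdict(list)
    for row in sorted(feature_rows, key=lambda r: r[ts]):
        by_key[row[key]].append(row)

    training_rows = []
    for obs in observations:
        history = by_key.get(obs[key], [])
        timestamps = [r[ts] for r in history]
        # Index of the last feature row whose timestamp <= the observation timestamp.
        idx = bisect_right(timestamps, obs[ts]) - 1
        match = history[idx] if idx >= 0 else {}
        joined = dict(obs)
        for name in feature_names:
            joined[name] = match.get(name)  # None when no earlier value exists
        training_rows.append(joined)
    return training_rows


# Hypothetical observation data (labels) and precomputed feature values.
# ISO date strings compare correctly as plain strings.
observations = [
    {"accountID": "A1", "timestamp": "2023-01-10", "is_fraud": 0},
    {"accountID": "A1", "timestamp": "2023-01-01", "is_fraud": 1},
    {"accountID": "A2", "timestamp": "2023-01-05", "is_fraud": 0},
]
feature_rows = [
    {"accountID": "A1", "timestamp": "2023-01-05", "accountAge": 3},
    {"accountID": "A1", "timestamp": "2023-01-09", "accountAge": 4},
    {"accountID": "A1", "timestamp": "2023-01-12", "accountAge": 5},
]

training_data = point_in_time_join(
    observations, feature_rows, key="accountID", ts="timestamp", feature_names=["accountAge"]
)
```

In this sketch, the first observation picks up the value recorded on 2023-01-09 rather than the later 2023-01-12 value, so no information from after the observation time reaches the training data.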
@@ -228,20 +228,20 @@ In this procedure, you manually trigger the training pipeline. In a production s
 - To display the pipeline steps, select the hyperlink for the **Web View** pipeline, and open it in a new window.

-2. Use the feature retrieval specification in the model artifacts:
+1. Use the feature retrieval specification in the model artifacts:

 1. On the left pane of the current workspace, select **Models** with the right mouse button.
-2. Select **Open in a new tab or window**.
-3. Select **fraud_model**.
-4. Select **Artifacts**.
+1. Select **Open in a new tab or window**.
+1. Select **fraud_model**.
+1. Select **Artifacts**.

-The feature retrieval specification is packaged along with the model. The model registration step in the training pipeline handled this step. You created the feature retrieval specification during experimentation. Now it's part of the model definition. In the next tutorial, you'll see how inferencing uses it.
+The feature retrieval specification is packaged along with the model. The model registration step in the training pipeline handled this step. You created the feature retrieval specification during experimentation. Now it's part of the model definition. In the next tutorial, you'll see how the inferencing process uses it.

 ## View the feature set and model dependencies

 1. View the list of feature sets associated with the model.

-On the same **Models** page, select the **Feature sets** tab. This tab shows both the `transactions` and `accounts` feature sets on which this model depends.
+On the same **Models** page, select the **Feature sets** tab. This tab shows both the `transactions` and `accounts` feature sets. This model depends on these feature sets.

 1. View the list of models that use the feature sets: