
Commit 1f7dea7

Merge pull request #568 from fbsolo-ms1/document-freshness-maintenance
Freshness update for tutorial-experiment-train-models-using-features.md . . .
2 parents 6db1df0 + faf89d5 commit 1f7dea7

File tree: 1 file changed, +34 −34 lines changed

articles/machine-learning/tutorial-experiment-train-models-using-features.md

Lines changed: 34 additions & 34 deletions
@@ -9,7 +9,7 @@ ms.subservice: core
 ms.topic: tutorial
 author: fbsolo-ms1
 ms.author: franksolomon
-ms.date: 10/27/2023
+ms.date: 09/30/2024
 ms.reviewer: seramasu
 ms.custom: sdkv2, build-2023, ignite-2023, update-code
 #Customer intent: As a professional data scientist, I want to know how to build and deploy a model with Azure Machine Learning by using Python in a Jupyter Notebook.
@@ -19,12 +19,12 @@ ms.custom: sdkv2, build-2023, ignite-2023, update-code
 
 This tutorial series shows how features seamlessly integrate all phases of the machine learning lifecycle: prototyping, training, and operationalization.
 
-The first tutorial showed how to create a feature set specification with custom transformations, and then use that feature set to generate training data, enable materialization, and perform a backfill. This tutorial shows how to enable materialization, and perform a backfill. It also shows how to experiment with features, as a way to improve model performance.
+The first tutorial showed how to create a feature set specification with custom transformations. Then, it showed how to use that feature set to generate training data, enable materialization, and perform a backfill. This tutorial shows how to enable materialization and perform a backfill. It also shows how to experiment with features, as a way to improve model performance.
 
 In this tutorial, you learn how to:
 
 > [!div class="checklist"]
-> * Prototype a new `accounts` feature set specification, by using existing precomputed values as features. Then, register the local feature set specification as a feature set in the feature store. This process differs from the first tutorial, where you created a feature set that had custom transformations.
+> * Prototype a new `accounts` feature set specification, through use of existing precomputed values as features. Then, register the local feature set specification as a feature set in the feature store. This process differs from the first tutorial, where you created a feature set that had custom transformations.
 > * Select features for the model from the `transactions` and `accounts` feature sets, and save them as a feature retrieval specification.
 > * Run a training pipeline that uses the feature retrieval specification to train a new model. This pipeline uses the built-in feature retrieval component to generate the training data.
@@ -40,22 +40,22 @@ Before you proceed with this tutorial, be sure to complete the first tutorial in
 
 1. On the top menu, in the **Compute** dropdown list, select **Serverless Spark Compute** under **Azure Machine Learning Serverless Spark**.
 
-2. Configure the session:
+1. Configure the session:
 
 1. When the toolbar displays **Configure session**, select it.
-2. On the **Python packages** tab, select **Upload Conda file**.
-3. Upload the *conda.yml* file that you [uploaded in the first tutorial](./tutorial-get-started-with-feature-store.md#prepare-the-notebook-environment).
-4. Optionally, increase the session time-out (idle time) to avoid frequent prerequisite reruns.
+1. On the **Python packages** tab, select **Upload Conda file**.
+1. Upload the *conda.yml* file that you [uploaded in the first tutorial](./tutorial-get-started-with-feature-store.md#prepare-the-notebook-environment).
+1. As an option, you can increase the session time-out (idle time) to avoid frequent prerequisite reruns.
 
-2. Start the Spark session.
+1. Start the Spark session.
 
 [!notebook-python[] (~/azureml-examples-main/sdk/python/featurestore_sample/notebooks/sdk_only/2.Experiment-train-models-using-features.ipynb?name=start-spark-session)]
 
-3. Set up the root directory for the samples.
+1. Set up the root directory for the samples.
 
 [!notebook-python[] (~/azureml-examples-main/sdk/python/featurestore_sample/notebooks/sdk_only/2.Experiment-train-models-using-features.ipynb?name=root-dir)]
 
-4. Set up the CLI.
+1. Set up the CLI.
 ### [Python SDK](#tab/python)
 
 Not applicable.
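The session-configuration steps in this hunk upload a *conda.yml* file from the first tutorial. For orientation only, a conda environment file for such a session generally follows this shape; the package names and versions below are assumptions, not the tutorial's actual file:

```yaml
# Illustrative shape of a conda environment file for the Spark session.
# Package names and versions here are assumptions, not the tutorial's conda.yml.
name: featurestore-session
dependencies:
  - python=3.10
  - pip
  - pip:
      - azureml-featurestore
      - azure-ai-ml
```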
@@ -66,33 +66,33 @@ Before you proceed with this tutorial, be sure to complete the first tutorial in
 
 [!notebook-python[] (~/azureml-examples-main/sdk/python/featurestore_sample/notebooks/sdk_and_cli/2.Experiment-train-models-using-features.ipynb?name=install-ml-ext-cli)]
 
-2. Authenticate.
+1. Authenticate.
 
 [!notebook-python[] (~/azureml-examples-main/sdk/python/featurestore_sample/notebooks/sdk_and_cli/2.Experiment-train-models-using-features.ipynb?name=auth-cli)]
 
-3. Set the default subscription.
+1. Set the default subscription.
 
 [!notebook-python[] (~/azureml-examples-main/sdk/python/featurestore_sample/notebooks/sdk_and_cli/2.Experiment-train-models-using-features.ipynb?name=set-default-subs-cli)]
 
 ---
 
-5. Initialize the project workspace variables.
+1. Initialize the project workspace variables.
 
 This is the current workspace, and the tutorial notebook runs in this resource.
 
 [!notebook-python[] (~/azureml-examples-main/sdk/python/featurestore_sample/notebooks/sdk_only/2.Experiment-train-models-using-features.ipynb?name=init-ws-crud-client)]
 
-6. Initialize the feature store variables.
+1. Initialize the feature store variables.
 
-Be sure to update the `featurestore_name` and `featurestore_location` values to reflect what you created in the first tutorial.
+Be sure to update the `featurestore_name` and `featurestore_location` values, to reflect what you created in the first tutorial.
 
 [!notebook-python[] (~/azureml-examples-main/sdk/python/featurestore_sample/notebooks/sdk_only/2.Experiment-train-models-using-features.ipynb?name=init-fs-crud-client)]
 
-7. Initialize the feature store consumption client.
+1. Initialize the feature store consumption client.
 
 [!notebook-python[] (~/azureml-examples-main/sdk/python/featurestore_sample/notebooks/sdk_only/2.Experiment-train-models-using-features.ipynb?name=init-fs-core-sdk)]
 
-8. Create a compute cluster named `cpu-cluster` in the project workspace.
+1. Create a compute cluster named `cpu-cluster` in the project workspace.
 
 You need this compute cluster when you run the training/batch inference jobs.
 
@@ -104,12 +104,12 @@ In the first tutorial, you created a `transactions` feature set that had custom
 
 To onboard precomputed features, you can create a feature set specification without writing any transformation code. You use a feature set specification to develop and test a feature set in a fully local development environment.
 
-You don't need to connect to a feature store. In this procedure, you create the feature set specification locally, and then sample the values from it. For capabilities of managed feature store, you must use a feature asset definition to register the feature set specification with a feature store. Later steps in this tutorial provide more details.
+You don't need to connect to a feature store. In this procedure, you create the feature set specification locally, and then sample the values from it. To benefit from the capabilities of managed feature store, you must use a feature asset definition to register the feature set specification with a feature store. Later steps in this tutorial provide more details.
 
 1. Explore the source data for the accounts.
 
 > [!NOTE]
-> This notebook uses sample data hosted in a publicly accessible blob container. Only a `wasbs` driver can read it in Spark. When you create feature sets by using your own source data, host those feature sets in an Azure Data Lake Storage Gen2 account, and use an `abfss` driver in the data path.
+> This notebook uses sample data hosted in a publicly accessible blob container. Only a `wasbs` driver can read it in Spark. When you create feature sets through use of your own source data, host those feature sets in an Azure Data Lake Storage Gen2 account, and use an `abfss` driver in the data path.
 
 [!notebook-python[] (~/azureml-examples-main/sdk/python/featurestore_sample/notebooks/sdk_only/2.Experiment-train-models-using-features.ipynb?name=explore-accts-fset-src-data)]
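The note in this hunk contrasts the `wasbs` and `abfss` Spark drivers. As a quick illustration of the two URI shapes they read, here is a minimal sketch; the account and container names are made up, not the tutorial's storage:

```python
# Illustrative only: the two Azure storage URI schemes mentioned in the note.
# "contoso" and "sampledata" are made-up names, not the tutorial's storage account.
account = "contoso"
container = "sampledata"

# Blob endpoint, read with the wasbs driver (used for the public sample data).
wasbs_path = f"wasbs://{container}@{account}.blob.core.windows.net/accounts.parquet"

# Data Lake Storage Gen2 endpoint, read with the abfss driver (recommended for your own data).
abfss_path = f"abfss://{container}@{account}.dfs.core.windows.net/accounts.parquet"

print(wasbs_path)
print(abfss_path)
```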

@@ -133,7 +133,7 @@ You don't need to connect to a feature store. In this procedure, you create the
 
 - `index_columns`: The join keys required to access values from the feature set.
 
-To learn more, see [Understanding top-level entities in managed feature store](./concept-top-level-entities-in-managed-feature-store.md) and the [CLI (v2) feature set specification YAML schema](./reference-yaml-featureset-spec.md).
+To learn more, visit the [Understanding top-level entities in managed feature store](./concept-top-level-entities-in-managed-feature-store.md) and the [CLI (v2) feature set specification YAML schema](./reference-yaml-featureset-spec.md) resources.
 
 As an extra benefit, persisting supports source control.
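The `index_columns` described in this hunk act as join keys between observation data and precomputed feature values. A minimal, framework-free sketch of that idea (plain Python with made-up names, not the managed feature store API):

```python
# Sketch: index_columns as join keys. Plain Python, not the azureml-featurestore API.
# Feature values keyed by a hypothetical index column, "accountID".
account_features = {
    "A1001": {"account_age_days": 420, "account_country": "US"},
    "A1002": {"account_age_days": 35, "account_country": "GB"},
}

def retrieve_features(observations, index_column="accountID"):
    """Attach precomputed feature values to each observation row via the join key."""
    enriched = []
    for row in observations:
        features = account_features.get(row[index_column], {})
        enriched.append({**row, **features})
    return enriched

rows = retrieve_features([{"accountID": "A1001", "amount": 25.0}])
print(rows[0]["account_age_days"])  # the joined feature value: 420
```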

@@ -143,7 +143,7 @@ You don't need to connect to a feature store. In this procedure, you create the
 
 ## Locally experiment with unregistered features and register with feature store when ready
 
-As you develop features, you might want to locally test and validate them before you register them with the feature store or run training pipelines in the cloud. A combination of a local unregistered feature set (`accounts`) and a feature set registered in the feature store (`transactions`) generates training data for the machine learning model.
+As you develop features, you might want to locally test and validate them, before you register them with the feature store or run training pipelines in the cloud. A combination of a local unregistered feature set (`accounts`) and a feature set registered in the feature store (`transactions`) generates training data for the machine learning model.
 
 1. Select features for the model.
 
@@ -159,7 +159,7 @@ As you develop features, you might want to locally test and validate them before
 
 1. Register the `accounts` feature set with the feature store.
 
-After you locally experiment with feature definitions, and they seem reasonable, you can register a feature set asset definition with the feature store.
+After you locally experiment with feature definitions, and if they seem reasonable, you can register a feature set asset definition with the feature store.
 
 [!notebook-python[] (~/azureml-examples-main/sdk/python/featurestore_sample/notebooks/sdk_only/2.Experiment-train-models-using-features.ipynb?name=reg-accts-fset)]
 
@@ -169,7 +169,7 @@ As you develop features, you might want to locally test and validate them before
 
 ## Run a training experiment
 
-In the following steps, you select a list of features, run a training pipeline, and register the model. You can repeat these steps until the model performs as you want.
+In these steps, you select a list of features, run a training pipeline, and register the model. You can repeat these steps until the model performs as you want.
 
 1. Optionally, discover features from the feature store UI.
 
@@ -187,19 +187,19 @@ In the following steps, you select a list of features, run a training pipeline,
 
 1. Select features for the model, and export the model as a feature retrieval specification.
 
-In the previous steps, you selected features from a combination of registered and unregistered feature sets, for local experimentation and testing. You can now experiment in the cloud. Your model-shipping agility increases if you save the selected features as a feature retrieval specification, and then use the specification in the machine learning operations (MLOps) or continuous integration and continuous delivery (CI/CD) flow for training and inference.
+In the previous steps, you selected features from a combination of registered and unregistered feature sets for local experimentation and testing. You can now experiment in the cloud. Your model-shipping agility increases if you save the selected features as a feature retrieval specification, and then use the specification in the machine learning operations (MLOps) or continuous integration and continuous delivery (CI/CD) flow for training and inference.
 
 1. Select features for the model.
 
 [!notebook-python[] (~/azureml-examples-main/sdk/python/featurestore_sample/notebooks/sdk_only/2.Experiment-train-models-using-features.ipynb?name=select-reg-features)]
 
-2. Export the selected features as a feature retrieval specification.
+1. Export the selected features as a feature retrieval specification.
 
-A feature retrieval specification is a portable definition of the feature list associated with a model. It can help streamline the development and operationalization of a machine learning model. It becomes an input to the training pipeline that generates the training data. It's then packaged with the model.
+A feature retrieval specification is a portable definition of the feature list associated with a model. It can help streamline the development and operationalization of a machine learning model. It becomes an input to the training pipeline that generates the training data. Then, it's packaged with the model.
 
 The inference phase uses the feature retrieval to look up the features. It integrates all phases of the machine learning lifecycle. Changes to the training/inference pipeline can stay at a minimum as you experiment and deploy.
 
-Use of the feature retrieval specification and the built-in feature retrieval component is optional. You can directly use the `get_offline_features()` API, as shown earlier. The name of the specification should be *feature_retrieval_spec.yaml* when it's packaged with the model. This way, the system can recognize it.
+Use of the feature retrieval specification and the built-in feature retrieval component is optional. You can directly use the `get_offline_features()` API, as shown earlier. The name of the specification should be *feature_retrieval_spec.yaml* when you package it with the model. This way, the system can recognize it.
 
 [!notebook-python[] (~/azureml-examples-main/sdk/python/featurestore_sample/notebooks/sdk_only/2.Experiment-train-models-using-features.ipynb?name=export-as-frspec)]
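This hunk describes a feature retrieval specification as a named feature list that is resolved against observation data. The core idea, a point-in-time lookup of the latest feature value at or before each observation timestamp, can be sketched without any SDK; every name and value below is illustrative, not the built-in feature retrieval component:

```python
# Sketch of point-in-time feature retrieval: for each observation, take the latest
# feature value with timestamp <= the observation timestamp. Illustrative only.
from bisect import bisect_right

# Hypothetical feature history: (timestamp, value) pairs per account, sorted by timestamp.
history = {
    "A1001": [(100, 0.10), (200, 0.35), (300, 0.80)],
}

def lookup(account_id, ts):
    """Return the most recent feature value with timestamp <= ts, else None."""
    rows = history.get(account_id, [])
    i = bisect_right([t for t, _ in rows], ts)
    return rows[i - 1][1] if i else None

print(lookup("A1001", 250))  # value as of t=200 -> 0.35
print(lookup("A1001", 50))   # no value yet -> None
```

Using only values at or before the observation time is what keeps future information from leaking into the training data.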

@@ -211,7 +211,7 @@ In this procedure, you manually trigger the training pipeline. In a production s
 
 The training pipeline has these steps:
 
-1. Feature retrieval: For its input, this built-in component takes the feature retrieval specification, the observation data, and the time-stamp column name. It then generates the training data as output. It runs these steps as a managed Spark job.
+1. Feature retrieval: For its input, this built-in component takes the feature retrieval specification, the observation data, and the **time-stamp** column name. It then generates the training data as output. It runs these steps as a managed Spark job.
 
 1. Training: Based on the training data, this step trains the model and then generates a model (not yet registered).
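The pipeline steps in this hunk (feature retrieval, then training, then registration) can be sketched as plain functions to show the data flow. This is only a conceptual outline under made-up names and return shapes, not the Azure Machine Learning pipeline API:

```python
# Conceptual outline of the training pipeline's data flow. Function names and
# return shapes are made up; the real pipeline uses Azure ML components.

def feature_retrieval(retrieval_spec, observation_data, timestamp_column):
    """Step 1: join features onto observation rows, yielding training data."""
    return [{"features": retrieval_spec, "row": row} for row in observation_data]

def train(training_data):
    """Step 2: train a model from the generated training data (not yet registered)."""
    return {"model": "fraud_model", "n_rows": len(training_data)}

def register(model, retrieval_spec):
    """Step 3: register the model, packaging the retrieval spec with its artifacts."""
    return {**model, "artifacts": ["feature_retrieval_spec.yaml"], "spec": retrieval_spec}

spec = ["transactions.amount_7d_sum", "accounts.account_age_days"]  # hypothetical names
data = feature_retrieval(spec, [{"accountID": "A1001"}], "timestamp")
registered = register(train(data), spec)
print(registered["artifacts"])  # the spec ships with the model
```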

@@ -228,20 +228,20 @@ In this procedure, you manually trigger the training pipeline. In a production s
 
 - To display the pipeline steps, select the hyperlink for the **Web View** pipeline, and open it in a new window.
 
-2. Use the feature retrieval specification in the model artifacts:
+1. Use the feature retrieval specification in the model artifacts:
 
 1. On the left pane of the current workspace, select **Models** with the right mouse button.
-2. Select **Open in a new tab or window**.
-3. Select **fraud_model**.
-4. Select **Artifacts**.
+1. Select **Open in a new tab or window**.
+1. Select **fraud_model**.
+1. Select **Artifacts**.
 
-The feature retrieval specification is packaged along with the model. The model registration step in the training pipeline handled this step. You created the feature retrieval specification during experimentation. Now it's part of the model definition. In the next tutorial, you'll see how inferencing uses it.
+The feature retrieval specification is packaged along with the model. The model registration step in the training pipeline handled this step. You created the feature retrieval specification during experimentation. Now it's part of the model definition. In the next tutorial, you'll see how the inferencing process uses it.
 
 ## View the feature set and model dependencies
 
 1. View the list of feature sets associated with the model.
 
-On the same **Models** page, select the **Feature sets** tab. This tab shows both the `transactions` and `accounts` feature sets on which this model depends.
+On the same **Models** page, select the **Feature sets** tab. This tab shows both the `transactions` and `accounts` feature sets. This model depends on these feature sets.
 
 1. View the list of models that use the feature sets:
 