Find detailed examples in the **2.Experiment-train-models-using-features.ipynb** notebook, hosted at [this resource](https://github.com/Azure/azureml-examples/tree/main/sdk/python/featurestore_sample/notebooks/sdk_only).
The function generates a YAML file artifact with a structure similar to this example:
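The generated artifact is typically named `feature_retrieval_spec.yaml`. An illustrative sketch of its shape follows; the field names and values here are placeholders rather than the authoritative schema:

```yaml
feature_stores:
  - uri: azureml://subscriptions/<sub-id>/resourcegroups/<rg>/workspaces/<feature-store-name>
    location: eastus
    workspace_id: <workspace-guid>
    features:
      - feature_name: transaction_amount_7d_sum
        feature_set: transactions:1
      - feature_name: transaction_amount_7d_avg
        feature_set: transactions:1
serialization_version: 2
```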
Review the **2.Experiment-train-models-using-features.ipynb** notebook, hosted at [this resource](https://github.com/Azure/azureml-examples/tree/main/sdk/python/featurestore_sample/notebooks/sdk_only), for a complete pipeline example that uses a built-in feature retrieval component to generate training data and run the training job with the packaging.
For training data generated by other methods, you can pass the feature retrieval specification as an input to the training job, and then handle the copy and packaging process in the training script.
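As a minimal sketch of that approach, the training script might copy the specification next to the model artifacts; the argument names and file layout below are assumptions for illustration:

```python
import argparse
import os
import shutil

parser = argparse.ArgumentParser()
parser.add_argument("--feature_retrieval_spec_folder")  # hypothetical job input
parser.add_argument("--model_output")                   # hypothetical job output
args = parser.parse_args()

# Copy the spec next to the model artifacts so it travels with the packaged model.
os.makedirs(args.model_output, exist_ok=True)
shutil.copy(
    os.path.join(args.feature_retrieval_spec_folder, "feature_retrieval_spec.yaml"),
    os.path.join(args.model_output, "feature_retrieval_spec.yaml"),
)
```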
```python
def init():
    # ... credential setup and feature resolution elided in this excerpt ...
    init_online_lookup(features, credential)
```
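After `init_online_lookup` starts the online lookup client, the scoring function can fetch feature values for incoming observations. A sketch, assuming the `get_online_features` helper from the `azureml-featurestore` package and illustrative join-key data (exact signatures and return types may differ):

```python
import pandas as pd
from azureml.featurestore import get_online_features

def run(raw_data):
    # Observation data carrying the join key; accountID is illustrative.
    obs_df = pd.DataFrame({"accountID": ["A1055520444618950"]})
    # Look up the latest feature values from the online store,
    # using the `features` resolved in init().
    scoring_df = get_online_features(features, obs_df)
    return scoring_df.to_json()
```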
Visit the **4.Enable-online-store-run-inference.ipynb** notebook, hosted at [this resource](https://github.com/Azure/azureml-examples/tree/main/sdk/python/featurestore_sample/notebooks/sdk_only), for a detailed code snippet.
## Use feature retrieval specification in batch inference
The feature retrieval specification operates the same way for batch inference as it does to [generate training data](#use-feature-retrieval-specification-to-create-training-data): the built-in feature retrieval component generates the inference data. As long as the feature retrieval specification is packaged with the model, the model itself can conveniently serve as the input to the component, as an alternative to passing the feature retrieval specification directly.
Visit the **3.Enable-recurrent-materialization-run-batch-inference.ipynb** notebook, hosted at [this resource](https://github.com/Azure/azureml-examples/tree/main/sdk/python/featurestore_sample/notebooks/sdk_only), for a detailed code snippet.
## Built-in feature retrieval component
To use the component, reference its component ID in a pipeline job YAML file.
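As an illustrative sketch of such a reference, the component path, input names, and asset names below are assumptions; check the component's documentation for the exact contract:

```yaml
jobs:
  feature_retrieval_step:
    component: azureml://registries/azureml/components/feature_retrieval/versions/1.0.0
    inputs:
      # Observation (entity) data to join the features onto.
      observation_data:
        type: uri_folder
        path: azureml://datastores/workspaceblobstore/paths/observation_data
      observation_data_format: parquet
      # A model with a packaged feature retrieval spec can serve as the input,
      # instead of passing the spec folder directly.
      input_model:
        type: mlflow_model
        path: azureml:fraud_model:1
    outputs:
      output_data:
        type: uri_folder
```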
Review these notebooks for examples of the built-in component, both hosted at [this resource](https://github.com/Azure/azureml-examples/tree/main/sdk/python/featurestore_sample/notebooks/sdk_only):
- **2.Experiment-train-models-using-features.ipynb**
- **3.Enable-recurrent-materialization-run-batch-inference.ipynb**
**articles/machine-learning/feature-set-materialization-concepts.md**
To avoid the limit, users should run backfill jobs in advance to fill the gaps.
Before you run a data materialization job, enable offline and/or online data materialization at the feature set level.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/4.Enable-online-store-run-inference.ipynb?name=enable-accounts-material)]
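For orientation, enabling offline materialization in the Python SDK might look like this sketch; it assumes an `MLClient` named `fs_client` scoped to the feature store, and the feature set name, version, and compute size are placeholders:

```python
from azure.ai.ml.entities import (
    MaterializationComputeResource,
    MaterializationSettings,
)

# Fetch the registered feature set and turn on offline materialization.
transactions_fset = fs_client.feature_sets.get(name="transactions", version="1")
transactions_fset.materialization_settings = MaterializationSettings(
    offline_enabled=True,
    resource=MaterializationComputeResource(instance_type="standard_e8s_v3"),
)
fs_client.feature_sets.begin_create_or_update(transactions_fset)
```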
You can submit data materialization jobs on demand (as a backfill) or on a recurring schedule. For example, users can submit a backfill request with:
- A list of data materialization status values: `Incomplete`, `Complete`, or `None`
- A feature window (optional)
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/1.Develop-feature-set-and-register.ipynb?name=backfill-txns-fset)]
After submission of the backfill request, a new materialization job is created for each *data interval* that has a matching data materialization status (Incomplete, Complete, or None). Additionally, the relevant data intervals must fall within the defined *feature window*. If the data materialization status is `Pending` for a *data interval*, no materialization job is submitted for that interval.
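A sketch of such a backfill request follows; the `begin_backfill` operation and its parameter names are assumptions patterned on the SDK, and the window and statuses are placeholders:

```python
from datetime import datetime

poller = fs_client.feature_sets.begin_backfill(
    name="transactions",
    version="1",
    feature_window_start_time=datetime(2023, 1, 1),
    feature_window_end_time=datetime(2023, 4, 1),
    data_status=["None", "Incomplete"],
)
# One materialization job is created per matching data interval.
print(poller.result().job_ids)
```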
**articles/machine-learning/tutorial-develop-feature-set-with-custom-source.md**
### Configure the Azure Machine Learning Spark notebook
You can create a new notebook and execute the instructions in this tutorial step by step. You can also open and run the existing notebook *featurestore_sample/notebooks/sdk_only/5.Develop-feature-set-custom-source.ipynb*. Keep this tutorial open and refer to it for documentation links and more explanation.
1. On the top menu, in the **Compute** dropdown list, select **Serverless Spark Compute** under **Azure Machine Learning Serverless Spark**.
## Set up the root directory for the samples
This code cell sets up the root directory for the samples. It needs about 10 minutes to install all dependencies and start the Spark session.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/5.Develop-feature-set-custom-source.ipynb?name=root-dir)]
## Initialize the CRUD client of the feature store workspace
Initialize the `MLClient` for the feature store workspace to cover its create, read, update, and delete (CRUD) operations.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/5.Develop-feature-set-custom-source.ipynb?name=init-fset-crud-client)]
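A minimal sketch of such a client, with placeholder identifiers:

```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Substitute your own subscription, resource group, and feature store name.
fs_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<FEATURE_STORE_NAME>",
)
```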
As mentioned earlier, this tutorial uses the Python feature store core SDK (`azureml-featurestore`). This initialized SDK client covers create, read, update, and delete (CRUD) operations on feature stores, feature sets, and feature store entities.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/5.Develop-feature-set-custom-source.ipynb?name=init-fs-core-sdk)]
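For orientation, initializing the core SDK client might look like this sketch, with placeholder identifiers:

```python
from azure.identity import DefaultAzureCredential
from azureml.featurestore import FeatureStoreClient

featurestore = FeatureStoreClient(
    credential=DefaultAzureCredential(),
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    name="<FEATURE_STORE_NAME>",
)
```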
You can define your own source loading logic from any data storage that has a custom source definition. Implement a source processor user-defined function (UDF) class (`CustomSourceTransformer` in this tutorial) to use this feature. This class should define an `__init__(self, **kwargs)` function, and a `process(self, start_time, end_time, **kwargs)` function. The `kwargs` dictionary is supplied as a part of the feature set specification definition. This definition is then passed to the UDF. The `start_time` and `end_time` parameters are calculated and passed to the UDF function.
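A skeleton of such a class, following the contract described above; the `source_path` keyword, data format, and timestamp column are illustrative:

```python
class CustomSourceTransformer:
    def __init__(self, **kwargs):
        # kwargs is supplied from the feature set specification definition.
        self.path = kwargs.get("source_path")
        if not self.path:
            raise ValueError("source_path is required")

    def process(self, start_time, end_time, **kwargs):
        # Load the source data and restrict it to the [start_time, end_time) window.
        import pyspark.sql.functions as F
        from pyspark.sql import SparkSession

        spark = SparkSession.builder.getOrCreate()
        df = spark.read.json(self.path)
        return df.filter(
            (F.col("timestamp") >= start_time) & (F.col("timestamp") < end_time)
        )
```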
## Create a feature set specification with a custom source, and experiment with it locally
Now, create a feature set specification with a custom source definition, and use it in your development environment to experiment with the feature set. The tutorial notebook attached to **Serverless Spark Compute** serves as the development environment.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/5.Develop-feature-set-custom-source.ipynb?name=create-fs-custom-src)]
Next, define a feature window, and display the feature values in this feature window.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/5.Develop-feature-set-custom-source.ipynb?name=display-features)]
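As a sketch, defining a window and displaying the features might look like this; the spec variable and parameter names are assumptions, and the dates are placeholders:

```python
from datetime import datetime

df = transactions_fset_spec.to_spark_dataframe(
    feature_window_start_date_time=datetime(2023, 1, 1),
    feature_window_end_date_time=datetime(2023, 2, 1),
)
display(df)
```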
To register the feature set specification with the feature store, first save that specification in a specific format. Review the generated `transactions_custom_source` feature set specification. Open this file from the file tree to see the specification: `featurestore/featuresets/transactions_custom_source/spec/FeaturesetSpec.yaml`.
To learn more about the specification, see *Understanding top-level entities in the managed feature store*.
Feature set specification persistence offers another benefit: the feature set specification can be source controlled.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/5.Develop-feature-set-custom-source.ipynb?name=dump-txn-fs-spec)]
## Register the transaction feature set with the feature store
Use this code to register a feature set asset loaded from the custom source with the feature store. You can then reuse that asset, and easily share it. Registration of a feature set asset offers managed capabilities, including versioning and materialization.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/5.Develop-feature-set-custom-source.ipynb?name=register-txn-fset)]
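A sketch of the registration call; it assumes the specification was dumped to a local folder and that an `account` entity is already registered, with names and versions as placeholders:

```python
from azure.ai.ml.entities import FeatureSet, FeatureSetSpecification

transaction_fset_config = FeatureSet(
    name="transactions_custom_source",
    version="1",
    description="transactions feature set loaded from a custom source",
    entities=["azureml:account:1"],
    stage="Development",
    specification=FeatureSetSpecification(path="<SPEC_FOLDER_PATH>"),
)
poller = fs_client.feature_sets.begin_create_or_update(transaction_fset_config)
print(poller.result())
```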
Obtain the registered feature set, and print related information.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/5.Develop-feature-set-custom-source.ipynb?name=get-txn-fset)]
## Test feature generation from registered feature set
Use the `to_spark_dataframe()` function of the feature set to test the feature generation from the registered feature set, and display the features.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/5.Develop-feature-set-custom-source.ipynb?name=print-txn-fset-sample-values)]
You should be able to successfully fetch the registered feature set as a Spark dataframe, and then display it. You can now use these features for a point-in-time join with observation data, and the subsequent steps in your machine learning pipeline.
**articles/machine-learning/tutorial-enable-recurrent-materialization-run-batch-inference.md**
Before you proceed with this tutorial, be sure to complete the first and second tutorials in the series.
2. Start the Spark session.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/3.Enable-recurrent-materialization-run-batch-inference.ipynb?name=start-spark-session)]
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/3.Enable-recurrent-materialization-run-batch-inference.ipynb?name=root-dir)]
1. Install the Azure Machine Learning extension.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_and_cli/3.Enable-recurrent-materialization-run-batch-inference.ipynb?name=install-ml-ext-cli)]
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_and_cli/3.Enable-recurrent-materialization-run-batch-inference.ipynb?name=auth-cli)]
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_and_cli/3.Enable-recurrent-materialization-run-batch-inference.ipynb?name=set-default-subs-cli)]
5. Initialize the project workspace CRUD (create, read, update, and delete) client.
The tutorial notebook runs from this current workspace.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/3.Enable-recurrent-materialization-run-batch-inference.ipynb?name=init-ws-crud-client)]
Be sure to update the `featurestore_name` value to reflect what you created in the first tutorial.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/3.Enable-recurrent-materialization-run-batch-inference.ipynb?name=init-fs-crud-client)]
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/3.Enable-recurrent-materialization-run-batch-inference.ipynb?name=init-fs-core-sdk)]
## Enable recurrent materialization on the transactions feature set
To handle inference of the model in production, you might want to set up recurrent materialization.
As explained in earlier tutorials, after data is materialized (backfill or recurrent materialization), feature retrieval uses the materialized data by default.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/3.Enable-recurrent-materialization-run-batch-inference.ipynb?name=enable-recurrent-mat-txns-fset)]
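For orientation, a recurrent materialization schedule in the SDK might look like this sketch; the frequency, interval, and start time are placeholders:

```python
from datetime import datetime
from azure.ai.ml.entities import RecurrenceTrigger

transactions_fset = fs_client.feature_sets.get(name="transactions", version="1")
# Materialize every three hours, starting from the given time.
transactions_fset.materialization_settings.schedule = RecurrenceTrigger(
    frequency="hour",
    interval=3,
    start_time=datetime(2023, 4, 15, 0, 0, 0),
)
fs_client.feature_sets.begin_create_or_update(transactions_fset)
```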
## (Optional) Save the YAML file for the feature set asset
You use the updated settings to save the YAML file.
### [Python SDK](#tab/python)
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/3.Enable-recurrent-materialization-run-batch-inference.ipynb?name=dump-txn-fset-with-mat-yaml)]
Batch inference runs as a pipeline job with several steps.
> [!NOTE]
> You use a job for batch inference in this example. You can also use batch endpoints in Azure Machine Learning.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/3.Enable-recurrent-materialization-run-batch-inference.ipynb?name=run-batch-inf-pipeline)]
In the batch inference pipeline (*/project/fraud_model/pipelines/batch_inference_pipeline.yaml*), because you didn't provide `name` or `version` values for the `outputs` of `inference_step`, the system created an untracked data asset with a GUID as its name and `1` as its version. In this cell, you derive and then display the data path from the asset.
[!notebook-python[] (~/azureml-examples-temp-fix/sdk/python/featurestore_sample/notebooks/sdk_only/3.Enable-recurrent-materialization-run-batch-inference.ipynb?name=inspect-batch-inf-output-data)]