Commit cd763fd

Odh 23178 2 Update Feature Store guide (#891)

* odh-23178-2 feature store SME review comments
* odh-23178-2 address PR and code rabbit comments
* odh-23178-2 feature store updates
* odh-23178-2 peer review comments
1 parent ce9938f commit cd763fd

12 files changed, +35 -46 lines changed

.DS_Store

0 Bytes
Binary file not shown.

assemblies/defining-ml-features.adoc

Lines changed: 1 addition & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -2,13 +2,11 @@
22

33
ifdef::context[:parent-context: {context}]
44

5-
:context: featurestore-defining
6-
75
[id="defining-ml-features_{context}"]
86
= Defining machine learning features
97

108
[role='_abstract']
11-
As part of the Feature Store workflow, ML engineers are responsible for identifying data sources and defining features of interest.
9+
As part of the Feature Store workflow, ML engineers or data scientists are responsible for identifying data sources and defining features of interest.
1210

1311
include::modules/setting-up-your-working-environment.adoc[leveloffset=+1]
1412

@@ -20,7 +18,5 @@ include::modules/about-organizing-features-by-using-entities.adoc[leveloffset=+1
2018

2119
include::modules/creating-feature-views.adoc[leveloffset=+1]
2220

23-
include::modules/making-features-available-for-real-time-inference.adoc[leveloffset=+1]
24-
2521
ifdef::parent-context[:context: {parent-context}]
2622
ifndef::parent-context[:!context:]

assemblies/retrieving-features-for-model-training.adoc

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,9 +7,9 @@ ifdef::context[:parent-context: {context}]
77

88
[role='_abstract']
99

10-
After a cluster administrator configures Feature Store on {productname-long}, a data scientist can retrieve features to train models for inference.
10+
After a cluster administrator configures Feature Store on {productname-long}, an ML engineer or a data scientist can retrieve features to train models for inference.
1111

12-
include::modules/setting-up-your-working-environment.adoc[leveloffset=+1]
12+
include::modules/making-features-available-for-real-time-inference.adoc[leveloffset=+1]
1313

1414
include::modules/retrieving-online-features-for-model-inference.adoc[leveloffset=+1]
1515

assemblies/setting-up-feature-store.adoc

Lines changed: 2 additions & 1 deletion
@@ -10,7 +10,8 @@ As a cluster administrator, you must complete the following tasks to set up Feat
1010
. Enable the Feature Store component.
1111
. Create a data science project and add a Feature Store instance.
1212
. Initialize the Feature Store instance.
13-
. Make features available to data scientists for model training and inference.
13+
. Set up Feature Store so that ML Engineers and data scientists can push and retrieve features to use for model training and inference.
14+
1415

1516
include::modules/before-you-begin.adoc[leveloffset=+1]
1617

modules/creating-a-feature-store-instance-in-a-data-science-project.adoc

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -160,4 +160,4 @@ The `feature_store.yaml` file defines the following components:
160160

161161
* Optionally, you can customize the default configurations for the offline store, online store, or registry by editing the YAML configuration for the Feature Store CR, as described in _Customizing your feature store configuration_.
162162

163-
* Give your data scientists access to the data science project so that they can create a workbench. and provide them with a copy of the `feature_store.yaml` file so that they can add it to their workbench IDE, such as Jupyter.
163+
* Give your ML engineers and data scientists access to the data science project so that they can create a workbench. and provide them with a copy of the `feature_store.yaml` file so that they can add it to their workbench IDE, such as Jupyter.

modules/enabling-the-feature-store-component.adoc

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@
44
= Enabling the Feature Store component
55

66
[role='_abstract']
7-
To allow the data scientists in your organization to work with machine learning features, you must enable the Feature Store component in {productname-long}.
7+
To allow the ML engineers and data scientists in your organization to work with machine learning features, you must enable the Feature Store component in {productname-long}.
88

99
.Prerequisites
1010

modules/feature-store-workflow.adoc

Lines changed: 12 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -3,33 +3,27 @@
33
[id='feature-store-workflow_{context}']
44
= Feature Store workflow
55

6-
The Feature Store workflow involves the following tasks for ML engineers, OpenShift cluster administrators, and data scientists:
6+
The Feature Store workflow involves the following tasks OpenShift cluster administrators, and machine learning (ML) engineers or data scientists:
77

8-
*Note:* This Feature Store workflow describes a local implementation. A production client-server architecture with full authorization and role assignments is planned for a future release.
8+
*Note:* This Feature Store workflow describes a local implementation that is available in this Technology Preview release.
99

1010
*Cluster administrator*
1111

12+
Installs and configures Feature Store, as described in _Chapter 2. Configuring Feature Store_:
13+
1214
. Installs OpenShift AI.
1315
. Enables the Feature Store component by using the Feature Store operator.
1416
. Creates a data science project.
1517
. In the data science project, creates a Feature Store instance by using a `feast.yaml` file that specifies the offline and online stores.
1618
. Sets up Feature Store so that ML Engineers and data scientists can push and retrieve features to use for model training and inference.
1719

18-
For more information, see _Chapter 2. Configuring Feature Store_.
19-
20-
*ML Engineer*
21-
22-
* Creates a feature definition file.
23-
* Defines the data sources and other feature store objects.
24-
* Makes features available for real-time inference.
25-
26-
For more information, see _Chapter 3: Defining features_.
27-
28-
*Data scientist*
20+
*ML Engineer or data scientist*
2921

30-
. Creates a workbench.
31-
. Obtains the `feature_store.yaml` from the cluster administrator.
32-
. Installs the Feature Store Python SDK in their IDE environment, for example Jupyter.
33-
. Uses `feast` Python APIs to retrieve features for model training in the workbench.
22+
* Prepares features, as described in _Chapter 3: Defining features_:
23+
. Creates a feature definition file.
24+
. Defines the data sources and other feature store objects.
25+
. Makes features available for real-time inference.
3426

35-
For more information, see _Chapter 4. Retrieving features for model training_.
27+
* Prepares features for model training and real-time inference, as described in _Chapter 4. Retrieving features for model training_:
28+
. Makes features available to models.
29+
. Uses `feast` Python APIs to retrieve features for model training and inference.

modules/making-features-available-for-real-time-inference.adoc

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@
33
[id="making-features-available-for-real-time-inference_{context}"]
44
= Making features available for real-time inference
55

6-
To make features available for real-time inference for use by the data scientists on your team, load feature data to the online store.
6+
To make features available for real-time inference for use by ML engineers and data scientists on your team, load or _materialize_ feature data to the online store.
77

88
Materialization ensures that the same features are available for both model training and real-time predictions, making your ML workflow robust and reproducible. The online store serves the latest features to models for online prediction. Data scientists on your team with access to your data science project can access the features in the online store.
99

modules/overview-of-feature-store.adoc

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -32,7 +32,7 @@ ML platform teams use Feature Store to store and serve features consistently for
3232

3333
Feature Store consists of the following key components:
3434

35-
Registry:: A central catalog of all feature definitions and their related metadata. It allows data scientists to search, discover, and collaborate on new features. The registry exposes methods to apply, list, retrieve, and delete features.
35+
Registry:: A central catalog of all feature definitions and their related metadata. It allows ML engineers and data scientists to search, discover, and collaborate on new features. The registry exposes methods to apply, list, retrieve, and delete features.
3636

3737
Offline Store:: The data store that contains historical data for scale-out batch scoring or model training. The offline store persists batch data that has been ingested into Feature Store. This data is used for producing training datasets. Examples of offline stores include Dask, Snowflake, BigQuery, Redshift, and DuckDB.
3838

modules/retrieving-online-features-for-model-inference.adoc

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -19,10 +19,10 @@ Typically, you create one feature service per model version, allowing for tracki
1919

2020
* In your IDE environment, you have installed the Feature Store Python SDK and added a `feature_store.yaml` file, as described in _Setting up your working environment_.
2121

22-
* Your cluster administrator has pushed feature data to the online store.
23-
2422
* Optionally, you have cloned a Git repository that includes your model training code.
2523

24+
* Feature data is loaded in the online store, as described in _Making features available for real-time inference_.
25+
2626
.Procedure
2727

2828
. From the {productname-short} dashboard, click *Data science projects*.
