Commit c328742

Freshness update for feature-set-specification-transformation-concepts.md . . .
1 parent 3a42ac7 commit c328742

1 file changed: +7 -7 lines changed

articles/machine-learning/feature-set-specification-transformation-concepts.md

Lines changed: 7 additions & 7 deletions
@@ -8,15 +8,15 @@ ms.topic: how-to
 ms.author: franksolomon
 author: fbsolo-ms1
 ms.reviewer: yogipandey
-ms.date: 12/06/2023
+ms.date: 01/22/2025
 ms.custom: template-concept
 ---

 # Feature transformation and best practices

-This article describes feature set specifications, the different kinds of transformations that can be used with it, and related best practices.
+This article describes feature set specifications, the different kinds of transformations that can be used with them, and related best practices.

-A feature set is a collection of features generated by source data transformations. A feature set specification is a self-contained definition for feature set development and local testing. After its development and local testing, you can register that feature set as a feature set asset with the feature store. You then have versioning and materialization available as managed capabilities.
+A feature set is a collection of features generated by source data transformations. A feature set specification is a self-contained definition for feature set development and local testing. After development and local testing of a feature set, you can register that feature set as a feature set asset with the feature store. You then have versioning and materialization available as managed capabilities.

 ## Define a feature set

@@ -85,7 +85,7 @@ The calculation happens in these steps:
 - Apply the feature transformer, defined by `feature_transformation.transformation_code`, on the data, and get the calculated features
 - Filter the feature values to return only those feature records within the feature window `[feature_window_start_ts, feature_window_end_ts)`

-In this code sample, the feature store API computes the features:
+In this code sample, the feature store API calculates the features:

 ```python
 # define the source data time window according to feature window
@@ -153,9 +153,9 @@ This shows the calculated feature values:

 ### Sliding window aggregation

-Sliding window aggregation can help handle feature values that present statistics (for example, sum, average, etc.) that accumulate over time. The SparkSQL `Window` function defines a sliding window around each row in the data, is useful in these cases.
+Sliding window aggregation can help handle feature values that present statistics (for example, sum, average, etc.) that accumulate over time. The SparkSQL `Window` function defines a sliding window around each row in the data, which is useful in these cases.

-For each row, the `Window` object can look into both future and past. In the context of machine learning features, you should define the `Window` object to look only the past, for each row. Visit the [Best Practice](#prevent-data-leakage-in-feature-transformation) section for more details.
+For each row, the `Window` object can look into both the future and the past. In the context of machine learning features, you should define the `Window` object to look only in the past, for each row. Visit the [Best Practice](#prevent-data-leakage-in-feature-transformation) section for more information.

 Start with this source data:

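Note on the hunk above: the "look only in the past" rule for sliding windows can be illustrated without Spark. The following is a minimal pure-Python sketch, not the article's transformer code. The function name `trailing_window_sum`, the sample rows, and the choice of a window that covers `(timestamp - window, timestamp]` are assumptions made for illustration. In Spark, the same idea is usually expressed with a window frame that ends at the current row (for example, `rangeBetween` with a negative start and `Window.currentRow` as the end).

```python
from datetime import datetime, timedelta


def trailing_window_sum(rows, window):
    """For each (timestamp, value) row, sum the values whose timestamps fall in
    (timestamp - window, timestamp], so no later row is ever included."""
    ordered = sorted(rows)
    result = []
    for ts, _ in ordered:
        # Only rows at or before `ts` qualify: the window never looks ahead.
        total = sum(v for t, v in ordered if ts - window < t <= ts)
        result.append((ts, total))
    return result


# Hypothetical hourly values for a single entity (illustrative data only).
rows = [
    (datetime(2023, 1, 1, 0), 10.0),
    (datetime(2023, 1, 1, 1), 20.0),
    (datetime(2023, 1, 1, 2), 30.0),
    (datetime(2023, 1, 1, 3), 40.0),
]
sums = trailing_window_sum(rows, timedelta(hours=2))
```

Because the upper bound of each window is the row's own timestamp, a feature value at time `t` depends only on data available at `t`, which is the property the linked data-leakage section asks for.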
@@ -329,7 +329,7 @@ Data leakage in the feature transformation definition can lead to these problems

 ### Set proper `source_lookback`

-For time-series (sliding/tumbling/stagger window aggregation) data aggregations, properly set the `source_lookback` property. This diagram shows the relationship between the source data window and the feature window in the feature (set) calculation:
+For time-series (sliding/tumbling/stagger window aggregation) data aggregations, set the `source_lookback` property correctly. This diagram shows the relationship between the source data window and the feature window in the feature (set) calculation:

 :::image type="content" source="./media/feature-set-specification-transformation-concepts/illustration-source-lookback.png" lightbox="./media/feature-set-specification-transformation-concepts/illustration-source-lookback.png" alt-text="Illustration showing the concept of source_lookback.":::

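Note on the hunk above: the effect of `source_lookback` can be sketched in plain Python. This is an illustration under stated assumptions, not the feature store API. It assumes the source data window starts at `feature_window_start_ts - source_lookback` and ends at `feature_window_end_ts`. The function `calculate_features`, the 3-hour trailing sum, and the sample data are hypothetical.

```python
from datetime import datetime, timedelta


def calculate_features(source_rows, feature_start, feature_end, source_lookback, agg_window):
    """Mimic the three calculation steps: load source data, transform, filter."""
    # Step 1 (assumed shape): widen the source window backwards by source_lookback.
    source_start = feature_start - source_lookback
    loaded = sorted((t, v) for t, v in source_rows if source_start <= t < feature_end)
    # Step 2: trailing sum over (t - agg_window, t], using only the loaded rows.
    features = [
        (t, sum(v for u, v in loaded if t - agg_window < u <= t))
        for t, _ in loaded
    ]
    # Step 3: keep only feature records inside [feature_start, feature_end).
    return [(t, f) for t, f in features if feature_start <= t < feature_end]


# Hypothetical source data: one event of value 1.0 per hour, hours 0 through 5.
rows = [(datetime(2023, 1, 1, h), 1.0) for h in range(6)]
start, end = datetime(2023, 1, 1, 3), datetime(2023, 1, 1, 6)
agg = timedelta(hours=3)

too_short = calculate_features(rows, start, end, timedelta(0), agg)
adequate = calculate_features(rows, start, end, timedelta(hours=3), agg)
```

With no lookback, the first records in the feature window are aggregated over missing history and come out too small. With a lookback at least as long as the aggregation window, every record in the feature window sees its full trailing history.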