Implement Chronon-style tiling transformation engine for streaming features with ComputeEngine, Aggregation support, and comprehensive testing #5644
Draft: Copilot wants to merge 7 commits into master from copilot/fix-595f737e-8dac-461b-9558-1b5ac082752d
- f7ccce3 Initial plan (Copilot)
- 7f80220 Implement tiling transformation engine for streaming features (Copilot)
- aec46b5 Complete tiling transformation implementation with documentation and … (Copilot)
- fefd284 Refactor tiled_transformation to use mode parameter following on_dema… (Copilot)
- 915896a Add sources and schema parameters to tiled_transformation following o… (Copilot)
- 1932fff Refactor tiled transformation to work with ComputeEngine and support … (Copilot)
- 29ee76d Move streaming integration to unit tests and add tiled transformation… (Copilot)
@@ -0,0 +1,53 @@

# Tiled Streaming Features Example

This example demonstrates how to use Feast's tiling transformation engine for efficient streaming feature engineering.

## Overview

Tiling in Feast is inspired by Chronon's tiled architecture and provides:
- Time-based data partitioning into manageable tiles
- Efficient temporal aggregations over sliding windows
- Chaining features across different time horizons
- Memory-efficient processing of streaming data
- Late-arriving data handling

## Examples

See the example files:
- `basic_tiling.py` - Basic tiled transformation usage
- `advanced_tiling.py` - Advanced features like chaining and complex aggregations

For production integration examples, see:
- `sdk/python/feast/templates/local/feature_repo/example_repo.py` - Template example with tiled transformations
- `sdk/python/tests/unit/transformation/test_tiled_transformation_integration.py` - Integration tests with StreamFeatureView

## Running the Examples

```bash
# Basic tiling example
python basic_tiling.py

# Advanced tiling with chaining
python advanced_tiling.py
```

## Key Concepts

### Tile Configuration
- **tile_size**: Duration of each time tile (e.g., `timedelta(hours=1)`)
- **window_size**: Window size for aggregations within tiles (defaults to tile_size)
- **overlap**: Optional overlap between tiles for continuity
- **max_tiles_in_memory**: Maximum number of tiles to keep in memory
- **enable_late_data_handling**: Whether to handle late-arriving data

### Aggregation Functions
Functions that operate within each tile to compute aggregated features.

### Chaining Functions
Functions that chain results across tiles for derived features that require continuity across time boundaries.

### ComputeEngine Integration
Tiled transformations work with Feast's ComputeEngine architecture:
- Mode specified at StreamFeatureView level (not in transformation)
- Supports Spark, Ray, and other distributed engines
- Integrates with Feast Aggregation objects
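
To make the key concepts concrete, the following is a minimal standard-library sketch of tiling, not the code from this PR. The helper names (`tile_start`, `aggregate_tiles`, `chain_tiles`, `window_sum`) and the sample events are assumptions made for illustration. It shows events being bucketed into fixed-size tiles, a per-tile aggregation, a chaining step that carries a running total across tiles, and a window computed by combining whole tiles instead of rescanning raw events.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def tile_start(ts: datetime, tile_size: timedelta) -> datetime:
    """Floor a timestamp to the start of the tile that contains it."""
    index = (ts - EPOCH) // tile_size  # timedelta // timedelta gives an int
    return EPOCH + index * tile_size


def aggregate_tiles(events, tile_size):
    """Aggregation step: keep one partial aggregate (sum, count) per tile."""
    tiles = defaultdict(lambda: {"sum": 0.0, "count": 0})
    for ts, value in events:
        tile = tiles[tile_start(ts, tile_size)]
        tile["sum"] += value
        tile["count"] += 1
    return dict(tiles)


def chain_tiles(tiles):
    """Chaining step: carry a running total across tiles in time order."""
    running = 0.0
    chained = {}
    for start in sorted(tiles):
        running += tiles[start]["sum"]
        chained[start] = running
    return chained


def window_sum(tiles, window_end, window_size):
    """Sum the tiles whose start lies in [window_end - window_size, window_end)."""
    window_start = window_end - window_size
    return sum(
        tile["sum"]
        for start, tile in tiles.items()
        if window_start <= start < window_end
    )


# Hypothetical sample data: (event timestamp, amount)
base = datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc)
events = [
    (base + timedelta(minutes=5), 10.0),
    (base + timedelta(minutes=40), 5.0),
    (base + timedelta(minutes=65), 2.0),
    (base + timedelta(minutes=130), 1.0),
]

tiles = aggregate_tiles(events, tile_size=timedelta(hours=1))
running_totals = chain_tiles(tiles)
last_two_hours = window_sum(
    tiles, window_end=base + timedelta(hours=3), window_size=timedelta(hours=2)
)
```

Because each tile stores only a partial aggregate, the two-hour window above is answered from two small tile records. Options such as `overlap`, `max_tiles_in_memory`, and late-data handling are left out of this sketch.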
It doesn't make sense to pass a pandas DataFrame transformation into the Stream Feature View, does it? I suppose it might, since we support the "spark" mode today. But this isn't compatible with that, because the "mode" is passed into tiled_transformation rather than the StreamFeatureView, right?
You're absolutely right! I've refactored the architecture to fix this inconsistency.
The mode is now specified at the `StreamFeatureView` level (where it belongs), not within the transformation itself. This makes it consistent with Feast's architecture, where transformations are engine-agnostic and the execution mode is determined by the feature view. Commit: 1932fff
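
As a rough illustration of the design point discussed in this thread, here is a self-contained sketch in plain Python. It does not use Feast, and every name in it (`TransformationSketch`, `FeatureViewSketch`, `ENGINES`, `run_view`, the `"python"` and `"chunked"` modes) is hypothetical. The idea it shows is that the transformation carries only logic, while the view carries the mode that selects how that logic is executed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Rows = List[Dict[str, float]]


@dataclass
class TransformationSketch:
    """Hypothetical engine-agnostic transformation: logic only, no mode."""
    fn: Callable[[Rows], Rows]


@dataclass
class FeatureViewSketch:
    """Hypothetical stand-in for a feature view: it owns the execution mode."""
    name: str
    mode: str
    transformation: TransformationSketch


def _run_whole(fn: Callable[[Rows], Rows], rows: Rows) -> Rows:
    # Apply the function to all rows at once.
    return fn(rows)


def _run_chunked(fn: Callable[[Rows], Rows], rows: Rows, size: int = 2) -> Rows:
    # Apply the function chunk by chunk, mimicking a partitioned engine.
    out: Rows = []
    for i in range(0, len(rows), size):
        out.extend(fn(rows[i:i + size]))
    return out


# Hypothetical registry mapping a mode string to an execution strategy.
ENGINES = {"python": _run_whole, "chunked": _run_chunked}


def run_view(view: FeatureViewSketch, rows: Rows) -> Rows:
    """Pick the engine from the view's mode, then run the transformation."""
    if view.mode not in ENGINES:
        raise ValueError(f"unsupported mode: {view.mode}")
    return ENGINES[view.mode](view.transformation.fn, rows)


def double_amount(rows: Rows) -> Rows:
    """Row-wise logic, so it behaves the same under either engine."""
    return [{**row, "amount": row["amount"] * 2} for row in rows]


shared = TransformationSketch(fn=double_amount)
rows = [{"amount": 1.0}, {"amount": 2.0}, {"amount": 3.0}]

local_view = FeatureViewSketch("txn_local", mode="python", transformation=shared)
chunked_view = FeatureViewSketch("txn_chunked", mode="chunked", transformation=shared)

local_result = run_view(local_view, rows)
chunked_result = run_view(chunked_view, rows)
```

The same `shared` transformation is reused by both views, and only the view's `mode` changes how it runs, which is the separation the refactor describes.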