CHIP: Incremental feature aggregation #979
| accuracy=Accuracy.SNAPSHOT | ||
| ) | ||
| ``` | ||
| To compute above groupBy incrementally |
An alternative approach is to store an intermediate 'tiled' representation: each day, store the aggregate for just that day, then compute the longer windows from those intermediates.
For the example above, store the count of inp_col for each day; the 3-day and 10-day windows then only need to sum those daily counts to get the final values.
The benefit is that this works for almost any kind of aggregation, including max and min.
I'm fairly sure this is how the 'tiled architecture' works for the online flow.
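A rough sketch of the tiling idea, to make the suggestion concrete. This is not Chronon's actual API: `window_aggregate`, `daily_counts`, and the sample dates and values are all hypothetical, and it assumes one pre-computed tile per day keyed by date.

```python
from datetime import date, timedelta


def window_aggregate(daily_tiles, end_day, window_days, merge):
    """Merge the per-day tiles for the `window_days` days ending at `end_day` (inclusive).

    `daily_tiles` maps a date to that day's intermediate aggregate.
    `merge` combines two intermediates (e.g. addition for counts, max for max).
    Days with no tile are skipped; returns None if no tile falls in the window.
    """
    result = None
    for offset in range(window_days):
        day = end_day - timedelta(days=offset)
        if day not in daily_tiles:
            continue
        tile = daily_tiles[day]
        result = tile if result is None else merge(result, tile)
    return result


# Hypothetical daily counts of inp_col: Jan 1 -> 1, Jan 2 -> 2, ..., Jan 10 -> 10.
daily_counts = {date(2024, 1, 1) + timedelta(days=i): i + 1 for i in range(10)}
end = date(2024, 1, 10)

# Both windows reuse the same stored tiles; only the merge range differs.
count_3d = window_aggregate(daily_counts, end, 3, lambda a, b: a + b)
count_10d = window_aggregate(daily_counts, end, 10, lambda a, b: a + b)

# The same tiles also serve a max, which cannot be maintained by subtracting old days.
max_3d = window_aggregate(daily_counts, end, 3, max)
```

Each new day then only needs one new tile to be computed from raw data; the windowed values are rebuilt from stored tiles.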
@blrnw3 Yes, I am going to change the architecture: compute the daily aggregations and store them in a table. The only change would be in how we store the IRs. For example, for avg we need to store both the sum and the count.
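To illustrate why avg needs both fields, here is a minimal sketch of a mergeable IR. The `AvgIR` class and its method names are hypothetical and are not taken from the Chronon codebase; the sample values are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AvgIR:
    """Intermediate representation for an average: keeps sum and count separately."""
    total: float = 0.0
    count: int = 0

    def update(self, value: float) -> "AvgIR":
        # Fold one raw value into the IR.
        return AvgIR(self.total + value, self.count + 1)

    def merge(self, other: "AvgIR") -> "AvgIR":
        # Combine two daily IRs; this is the step a windowed aggregation repeats.
        return AvgIR(self.total + other.total, self.count + other.count)

    def finalize(self) -> Optional[float]:
        # Only here is the IR collapsed into the user-facing average.
        return self.total / self.count if self.count else None


def daily_ir(values) -> AvgIR:
    ir = AvgIR()
    for v in values:
        ir = ir.update(v)
    return ir


# Hypothetical data: three events on day 1, one event on day 2.
day1 = daily_ir([1.0, 2.0, 3.0])
day2 = daily_ir([10.0])

window_avg = day1.merge(day2).finalize()                 # (1 + 2 + 3 + 10) / 4
avg_of_avgs = (day1.finalize() + day2.finalize()) / 2    # not the same quantity
```

Storing only the finalized daily average would lose the per-day weights, which is why the two numbers above differ.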
Summary
Proposal to support incremental aggregations.
Why / Goal
Test Plan
Checklist
Reviewers