Add Insight Table for Expenditure #58
Pull request overview
This PR introduces an Insights Table for the dashboard to store pre-calculated expenditure trend analysis, including segments, slopes, and p-values. This optimization reduces dashboard load times by computing insights ahead of time rather than on-demand.
Changes:
- Added a `TrendDetector` class that uses piecewise linear fitting to identify significant trend breakpoints in time series data
- Created an `InsightExtractor` class to compute volatility and structural segments for expenditure metrics
- Implemented a script to generate insights for Health, Education, and total expenditure across countries and persist them to a Delta table
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| analytics/insight_extractor.py | Defines core classes for trend detection using piecewise linear regression and insight extraction |
| analytics/extract_insight.py | Orchestrates insight generation by applying the extractor to expenditure data and saving results to Delta table |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
`TrendDetector` and `InsightExtractor` now imported from the standalone yukinko-iwasaki/trend-narrative package. `insight_extractor.py` becomes a thin re-export shim; `extract_insight.py` drops the `%%run` magic in favour of a direct package import.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Now redundant: `extract_insight.py` imports `InsightExtractor` directly from the trend-narrative package.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Separated the insight-extraction logic out into a separate package.
TODO: test this on Databricks
@yukinko-iwasaki The changes look good. I think we can take better advantage of Spark to parallelize the processing across countries. I pushed a commit, but haven't had the chance to verify it on Databricks. Please go ahead and verify. If it looks good, proceed with merging: cc4b9af
Thanks! I confirmed that the new commit generates the same insight output df.
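For readers unfamiliar with the pattern, per-country parallelization of the kind discussed above is often expressed in PySpark as a grouped pandas function. The sketch below is not the code from commit cc4b9af, which is not shown here. The column names (`country_name`, `year`, `expenditure`), the output schema, and the helper names are hypothetical, and a single straight-line slope stands in for the real insight computation.

```python
import numpy as np
import pandas as pd

# Hypothetical output schema; the real table's columns are not shown in this PR.
OUTPUT_SCHEMA = "country_name string, slope double"


def country_slope(pdf: pd.DataFrame) -> pd.DataFrame:
    """Fit one straight line to a single country's series and return its slope."""
    pdf = pdf.sort_values("year")
    x = pdf["year"].to_numpy(dtype=float)
    y = pdf["expenditure"].to_numpy(dtype=float)
    slope, _intercept = np.polyfit(x, y, deg=1)
    return pd.DataFrame(
        {"country_name": [pdf["country_name"].iloc[0]], "slope": [float(slope)]}
    )


def insights_on_spark(sdf):
    """Apply country_slope once per country; expects a Spark DataFrame.

    Spark hands each country's rows to the function as a pandas DataFrame,
    so the groups can be processed in parallel across the cluster.
    """
    return sdf.groupBy("country_name").applyInPandas(
        country_slope, schema=OUTPUT_SCHEMA
    )


# Local check without Spark, using made-up numbers.
sample = pd.DataFrame(
    {
        "country_name": ["A"] * 3 + ["B"] * 3,
        "year": [2000, 2001, 2002] * 2,
        "expenditure": [1.0, 2.0, 3.0, 10.0, 8.0, 6.0],
    }
)
local_result = pd.concat(
    [country_slope(group) for _, group in sample.groupby("country_name")],
    ignore_index=True,
)
```

Because the per-country function only depends on pandas, it can be compared against the non-parallel output locally before running it on a cluster, which is the kind of equivalence check described in the reply above.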
This PR introduces an Insights Table designed to support data visualization on the dashboard. The table currently stores expenditure insight data used for trend detection, including segments, slopes, and p-values.
Because these insights are pre-calculated, the dashboard no longer has to compute them on demand, which reduces its load times.
Note:
To ensure narratives remain synced with the datasets and charts, this script will be integrated into the automated pipeline within the existing job once the branch is merged into main.