Commit e9b0f8e
docs: Add ROADMAP for analytics engine (#8)
* docs: Add ROADMAP for analytics engine
Comprehensive roadmap for transforming polar-flow-server into a health
analytics engine for AI coaching applications.
Phases:
- Phase 1: Derived Metrics Engine (baselines, rolling averages)
- Phase 2: Pattern Detection (correlations, anomalies)
- Phase 3: ML Models (optional - predictions, forecasting)
- Phase 4: Insights API (unified endpoint for coaching layer)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: Address review feedback on ROADMAP
Key changes based on CI review:
Security (Critical):
- Replace pickle/joblib with JSON params or ONNX for ML model storage
- Add whitelist of allowed model classes
- Document security rationale
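Sketch of the JSON-plus-whitelist approach named above. Unpickling untrusted data can execute arbitrary code, whereas JSON carries only plain values. This is a minimal illustration, not the ROADMAP's actual code: `LinearTrendModel`, `ALLOWED_MODEL_CLASSES`, `dump_model`, and `load_model` are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass
class LinearTrendModel:
    """Hypothetical stand-in for a real model class."""
    slope: float
    intercept: float

    def predict(self, x: float) -> float:
        return self.slope * x + self.intercept


# Only classes listed here may be reconstructed from stored data.
ALLOWED_MODEL_CLASSES = {"LinearTrendModel": LinearTrendModel}


def dump_model(model) -> str:
    """Serialize a whitelisted model as its class name plus plain parameters."""
    name = type(model).__name__
    if name not in ALLOWED_MODEL_CLASSES:
        raise ValueError(f"model class {name!r} is not allowed")
    return json.dumps({"model_class": name, "params": vars(model)})


def load_model(payload: str):
    """Rebuild a model from JSON, rejecting any class not on the whitelist."""
    record = json.loads(payload)
    name = record["model_class"]
    if name not in ALLOWED_MODEL_CLASSES:
        raise ValueError(f"model class {name!r} is not allowed")
    return ALLOWED_MODEL_CLASSES[name](**record["params"])
```

Because the loader only looks names up in a fixed dictionary, a tampered payload can at worst select another whitelisted class; it cannot import or call anything else.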
Statistical correctness:
- Use Spearman correlation instead of Pearson (rank-based, so robust to non-normal data)
- Replace Z-score anomaly detection with IQR method (HRV is right-skewed)
- Increase minimum sample size from 14 to 21 for correlation
- Increase minimum training data from 30 to 60 days for ML
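Sketch of the IQR method mentioned above, using only the standard library. The 1.5 multiplier is the conventional Tukey fence and is an assumption here, as are the function name `iqr_outliers` and the `min_samples` guard; the ROADMAP may choose different values.

```python
import statistics


def iqr_outliers(values, k=1.5, min_samples=4):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    if len(values) < min_samples:
        return []  # too little data to estimate quartiles meaningfully
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]
```

Unlike a Z-score, the fences depend on quartiles rather than on the mean and standard deviation, so a single extreme reading in a right-skewed series does not widen the bounds enough to hide itself.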
Performance:
- Add required database indices for (user_id, date) lookups
- Document incremental calculation strategy
- Fix timezone handling (use UTC, not date.today())
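Sketch of the timezone fix above. `date.today()` returns the date in the server's local timezone, so the "current day" can differ between hosts; deriving the date from an aware UTC timestamp avoids that. The helper name `utc_today` and its optional `now` parameter are hypothetical.

```python
from datetime import date, datetime, timezone


def utc_today(now=None):
    """Return the current calendar date in UTC.

    `now` may be any timezone-aware datetime; it exists to make tests deterministic.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now.astimezone(timezone.utc).date()
```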
New sections:
- Testing Strategy with unit/integration test examples
- Data Privacy & Compliance (GDPR, right to deletion)
- Minimum data requirements table for ML models
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: Add data readiness convention and implementation plan
Data Readiness Convention:
- Feature unlock timeline (7/14/21/30/60/90 days)
- Consistent API response structure with status, feature_availability
- unlock_progress for gamification ("2 more days until patterns!")
- Coach integration notes for adjusting language based on data age
- New /users/{id}/status endpoint spec
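Sketch of how a status payload could be derived from the unlock timeline above. The keys `status`, `feature_availability`, and `unlock_progress` come from this commit message; everything else is an assumption for illustration: the function name `build_status`, the inner field names, the status strings, and the feature-to-day mapping (only the 21-day correlation and 60-day ML minimums are stated in this commit).

```python
# Unlock tiers, in days of collected data (from the commit message).
UNLOCK_DAYS = (7, 14, 21, 30, 60, 90)

# Hypothetical mapping; "baselines": 7 is assumed, the other two are stated minimums.
FEATURE_MIN_DAYS = {"baselines": 7, "correlations": 21, "ml_predictions": 60}


def build_status(days_of_data: int) -> dict:
    """Build an illustrative readiness payload for a user with N days of data."""
    availability = {
        name: days_of_data >= needed for name, needed in FEATURE_MIN_DAYS.items()
    }
    # First tier the user has not reached yet, or None once all are unlocked.
    next_tier = next((d for d in UNLOCK_DAYS if d > days_of_data), None)
    progress = None
    if next_tier is not None:
        progress = {
            "next_unlock_at_days": next_tier,
            "days_remaining": next_tier - days_of_data,
        }
    return {
        "status": "collecting" if next_tier is not None else "ready",
        "days_of_data": days_of_data,
        "feature_availability": availability,
        "unlock_progress": progress,
    }
```

With 19 days of data this reports 2 days remaining until the 21-day tier, which is the kind of value a client could turn into a "2 more days until patterns!" message.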
Implementation Plan:
- Sprint 1: Foundation (baselines, status endpoint)
- Sprint 2: Patterns (correlations, anomalies)
- Sprint 3: Insights API (aggregation, observations)
- Sprint 4: ML (optional predictions)
- Clear task dependencies for each sprint
Test Data Seeding:
- Realistic data generators with weekly patterns
- generate_realistic_hrv_data() with Monday dips, gradual trends
- generate_sleep_data() with weekend variations
- generate_overtraining_scenario() for pattern detection tests
- generate_anomaly_scenario() for IQR edge cases
- Pytest fixtures for 7/14/30/60/90 day scenarios
- Data age scenario test matrix
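Sketch of a seeded generator in the spirit of generate_realistic_hrv_data(). Only the function name, the Monday dips, and the gradual trend come from this commit message; the signature, default values, and return shape are assumptions.

```python
import random
from datetime import date, timedelta


def generate_realistic_hrv_data(
    days,
    start,
    baseline=50.0,       # assumed typical HRV value
    monday_dip=6.0,      # assumed size of the Monday drop
    trend_per_day=0.05,  # assumed gradual upward drift
    noise_sd=3.0,        # assumed day-to-day noise
    seed=0,
):
    """Return a list of (date, hrv) tuples with a weekly pattern and a slow trend."""
    rng = random.Random(seed)  # local RNG so fixtures are reproducible
    rows = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        value = baseline + trend_per_day * offset
        if day.weekday() == 0:  # Monday
            value -= monday_dip
        value += rng.gauss(0.0, noise_sd)
        rows.append((day, round(value, 1)))
    return rows
```

A pytest fixture for each data-age scenario could then call this with days set to 7, 14, 30, 60, or 90 and a fixed seed.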
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 3f516b4 commit e9b0f8e
1 file changed: +1401 -0 lines changed