feat: Pattern Detection & Anomaly Scanning (Sprint 2) #10
- Document all 5 baselines endpoints with examples
- Explain IQR-based anomaly detection (warning/critical)
- Document feature unlock timeline (7/14/21/30/60/90 days)
- Update index.md with analytics overview and examples

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
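The IQR-based anomaly detection mentioned in this commit can be illustrated with a minimal sketch. The function name `classify_anomaly`, the 1.5x and 3x IQR multipliers for the warning and critical levels (the conventional Tukey fences), and the handling of short or constant histories are all assumptions for illustration; the commit does not state which thresholds the service uses.

```python
from statistics import quantiles


def classify_anomaly(value, history, warning_k=1.5, critical_k=3.0):
    """Return 'critical', 'warning', or None for `value` against IQR fences.

    Hypothetical helper: the multipliers are assumed defaults, not values
    taken from the project.
    """
    if len(history) < 4:
        return None  # too few points to estimate quartiles meaningfully
    q1, _, q3 = quantiles(history, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        return None  # constant history: no spread to measure against
    # Distance outside the [q1, q3] box; 0 when the value lies inside it.
    distance = max(q1 - value, value - q3, 0.0)
    if distance > critical_k * iqr:
        return "critical"
    if distance > warning_k * iqr:
        return "warning"
    return None


history = [50, 52, 54, 56, 58, 60, 62, 64]
print(classify_anomaly(57, history), classify_anomaly(75, history), classify_anomaly(30, history))
```

Quartile-based fences are less sensitive to a single extreme reading than mean and standard deviation, which is one common reason to prefer them for noisy daily metrics.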
Add pattern detection service with Spearman correlation, overtraining
risk scoring, and trend analysis. Includes bulk anomaly scanning across
all metrics.
New endpoints:
- GET /users/{id}/patterns - List all detected patterns
- GET /users/{id}/patterns/{name} - Get specific pattern
- POST /users/{id}/patterns/detect - Trigger pattern detection
- GET /users/{id}/anomalies - Bulk scan for anomalies
Pattern types:
- sleep_hrv_correlation: Spearman correlation with p-value confidence
- overtraining_risk: Multi-metric composite score (0-100)
- hrv_trend: 7-day vs 30-day baseline comparison
- sleep_trend: 7-day vs 30-day baseline comparison
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
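As a rough illustration of what the `sleep_hrv_correlation` pattern measures, the sketch below computes Spearman's rank correlation in plain Python: rank both series (averaging ranks for ties), then take the Pearson correlation of the ranks. The function names `average_ranks` and `spearman_rho` and the minimum of three samples are assumptions for illustration. The PR itself pins scipy and uses `spearmanr`, which also provides the p-value; this sketch does not compute one.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Return Spearman's rho, or None when it is undefined (hypothetical helper)."""
    if len(x) != len(y) or len(x) < 3:
        return None
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return None  # a constant series has no rank variation
    return cov / sqrt(var_x * var_y)


sleep_scores = [62, 70, 75, 81, 90]
hrv_values = [38, 41, 45, 52, 60]
print(spearman_rho(sleep_scores, hrv_values))
```

Returning `None` for a constant series mirrors the undefined (NaN) result a correlation routine gives for such input, so callers can skip the pattern rather than store a meaningless coefficient.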
Fixes from code review:
- Fix baseline calculation to exclude recent 7 days from baseline (was including recent data which diluted trend sensitivity)
- Add NaN validation for spearmanr results (handles constant arrays)
- Use explicit lambda for metrics_involved default in model
- Fix type safety in interpretation property (check isinstance)
- Pin scipy version to >=1.14.0,<2.0.0
- Add comment clarifying migration ID generation approach

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
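The baseline fix can be sketched as follows: within a 30-day window, the most recent 7 days form the "recent" sample and only the days before them form the baseline. The function name `trend_change`, its list-based input, and the percentage return value are assumptions for illustration, not the service's actual interface.

```python
def trend_change(daily_values, recent_days=7, baseline_days=30):
    """Percent change of the recent average versus the baseline average.

    `daily_values` is ordered oldest first, one value per day (hypothetical
    input shape). Returns None when either window is empty or the baseline
    average is not positive.
    """
    window = daily_values[-baseline_days:]
    recent = window[-recent_days:]
    baseline = window[:-recent_days]  # excludes the recent days
    if not recent or not baseline:
        return None
    baseline_avg = sum(baseline) / len(baseline)
    if baseline_avg <= 0:
        return None  # avoids division by zero and meaningless ratios
    recent_avg = sum(recent) / len(recent)
    return (recent_avg - baseline_avg) / baseline_avg * 100


# 23 days at 50 followed by a 7-day drop to 40.
values = [50.0] * 23 + [40.0] * 7
print(trend_change(values))
```

With this input the function reports a change of -20%. Averaging all 30 days into the baseline would pull it down to about 47.7 and shrink the reported change to roughly -16%, which is the dilution the fix describes.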
@claude please re-review
Claude finished @StuMason's task in 1m 35s
Re-reviewing PR #10
Code Review Summary

Great work on the pattern detection implementation! The code is well-structured, thoroughly tested, and follows the project's architectural patterns. I found several issues that should be addressed before merging.

Critical Issues

1. SQL Injection Risk in Migration (src/polar_flow_server/services/pattern.py:542-548)

The upsert uses string-based constraint name which could be vulnerable. While the current implementation is safe, consider using the ORM constraint object for better type safety:

    stmt = stmt.on_conflict_do_update(
        index_elements=['user_id', 'pattern_type', 'pattern_name'],
        set_=pattern_data,
    )

This is more explicit and prevents potential issues if the constraint name changes.

2. Missing Session Rollback on Error (src/polar_flow_server/services/pattern.py:64-99)

The [...]

    try:
        # ... pattern detection ...
        await self.session.commit()
    except Exception as e:
        await self.session.rollback()
        self.logger.error("Pattern detection failed", error=str(e))
        raise

High Priority

3. Hardcoded Confidence Value (src/polar_flow_server/services/pattern.py:393)

The trend confidence is hardcoded at 0.8. This should be calculated based on sample size or data quality:

    # Current:
    confidence=0.8,  # Trend confidence is high with sufficient data
    # Better:
    confidence=min(0.9, 0.5 + (sample_count / 30) * 0.4)  # Scale with sample size

4. Potential Division by Zero (src/polar_flow_server/services/pattern.py:463-465)

While there's a check for [...]

    if baseline_avg <= 0:
        return None

5. Missing API Guard Configuration (src/polar_flow_server/api/patterns.py:178)

The patterns router uses [...]

Verify: Is this intentional? If API keys are optional for other endpoints, why require them for patterns?

6. Inefficient Trend Calculation (src/polar_flow_server/services/pattern.py:449-466)

The [...]

    # Query 1: Recent 7 days
    # Query 2: Days 8-30 for baseline

This matters for users with sparse data.

Medium Priority

7. Inconsistent Sample Count (src/polar_flow_server/services/pattern.py:395)

The [...]

    sample_count=len(recent),  # Actual count

8. Magic Numbers (Multiple locations)

Consider extracting these to class constants:
9. Missing Type Hints for Details Dict (src/polar_flow_server/services/pattern.py:41)

    details: dict[str, object] | None = None  # Too generic

Consider a TypedDict for better type checking:

    class CorrelationDetails(TypedDict, total=False):
        correlation_coefficient: float
        p_value: float
        strength: str
        direction: str
        interpretation: str

10. Unused Baseline Model Import (src/polar_flow_server/services/pattern.py:14)

    from polar_flow_server.models.baseline import UserBaseline

This import is never used in the PatternService. It's only used by AnomalyService. Clean imports.

11. SQL Query Efficiency (src/polar_flow_server/services/pattern.py:672-737)

The [...]

    # Use CTEs or subqueries to fetch all latest values in 2-3 queries max

With appropriate indexes, this could be significantly faster for large datasets.

Low Priority / Style

12. Inconsistent Docstring Style

Most docstrings are excellent, but some are missing return type descriptions (e.g., [...]

13. Test Coverage - Edge Cases

Missing test cases for:
14. API Response - Missing HTTP 404

    from litestar.exceptions import NotFoundException

    if not pattern:
        raise NotFoundException("Pattern not found")

15. Validation Function Placement (src/polar_flow_server/api/patterns.py:19-35)

The [...]

    # src/polar_flow_server/api/validation.py

Positive Highlights

✅ Excellent Statistical Approach: Using Spearman correlation over Pearson is the right choice for health metrics

Migration Review

The migration looks good overall:
Minor suggestion: Consider adding an index on [...]

    op.create_index('ix_pattern_analyzed_at', 'pattern_analyses', ['analyzed_at'])

Documentation

The API documentation is thorough and includes:
Recommendations

Before Merge:
Post-Merge (Technical Debt):
Summary

This is a solid implementation with good statistical foundations and thorough testing. The main concerns are around error handling and some API design decisions. Once the critical issues are addressed, this will be ready to merge.

Verdict: Approve with changes requested for issues #2, #5, and #14.
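To make the sample-size scaling suggested in issue 3 of the review concrete, here is a small worked example of the formula `min(0.9, 0.5 + (sample_count / 30) * 0.4)`. The wrapper name `trend_confidence` is hypothetical; only the expression itself comes from the review.

```python
def trend_confidence(sample_count):
    """Confidence that grows linearly from 0.5 and is capped at 0.9.

    Hypothetical wrapper around the expression proposed in the review.
    """
    return min(0.9, 0.5 + (sample_count / 30) * 0.4)


for n in (0, 15, 30, 60):
    print(n, round(trend_confidence(n), 2))
```

Under this formula a trend with no samples would still score 0.5, 15 samples give about 0.7, and anything from 30 samples upward is capped at 0.9, so the value never reaches the certainty implied by 1.0.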
- Add try/except with rollback to detect_all_patterns
- Return 404 for invalid/missing patterns instead of null
- API guard verified as consistent with all other endpoints

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
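The commit-or-rollback behaviour added here can be sketched in isolation. `FakeSession` is a hypothetical stand-in for an async database session, and this `detect_all_patterns` signature (a session plus a list of detector coroutines) is an assumption for illustration rather than the service's real method.

```python
import asyncio


class FakeSession:
    """Hypothetical stand-in for an async DB session; records calls only."""

    def __init__(self):
        self.calls = []

    async def commit(self):
        self.calls.append("commit")

    async def rollback(self):
        self.calls.append("rollback")


async def detect_all_patterns(session, detectors):
    """Run every detector, committing once at the end or rolling back on error."""
    try:
        results = [await detector() for detector in detectors]
        await session.commit()
        return results
    except Exception:
        # Leave the session clean for the caller, then re-raise the error.
        await session.rollback()
        raise


async def hrv_trend():
    return "hrv_trend"


async def failing_detector():
    raise ValueError("bad data")


session = FakeSession()
print(asyncio.run(detect_all_patterns(session, [hrv_trend])), session.calls)
```

Re-raising after the rollback keeps the failure visible to the API layer, which can then turn it into an error response instead of silently returning partial results.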
Summary
New Endpoints
- /users/{id}/patterns
- /users/{id}/patterns/{name}
- /users/{id}/patterns/detect
- /users/{id}/anomalies

Pattern Types
- sleep_hrv_correlation - Spearman correlation with p-value confidence
- overtraining_risk - Multi-metric composite score with recovery recommendations
- hrv_trend - 7-day vs 30-day HRV baseline comparison
- sleep_trend - 7-day vs 30-day sleep score comparison

Test plan
🤖 Generated with Claude Code