Feature/analytics engine by StuMason · Pull Request #9 · StuMason/polar-flow-server

StuMason · 2026-01-13T05:39:40Z

No description provided.

- Add UserBaseline model with IQR-based anomaly detection - Add Alembic migration for user_baselines table and analytics indices - Implement BaselineService with calculations for: - HRV RMSSD (from nightly recharge) - Sleep score - Resting heart rate - Training load and ratio - Add baselines API endpoints: - GET /users/{id}/baselines - List all baselines - GET /users/{id}/baselines/{metric} - Get specific baseline - POST /users/{id}/baselines/calculate - Trigger calculation - GET /users/{id}/baselines/check/{metric}/{value} - Anomaly check - GET /users/{id}/analytics/status - Data readiness and feature unlocks Features available based on data days: - 7 days: Basic stats - 14 days: Trend analysis - 21 days: Personalized baselines - 30 days: Predictive models - 60 days: Advanced ML - 90 days: Long-term patterns Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…1.9) - Add automatic baseline recalculation after sync operations - Create test data seeding fixtures with 90 days of realistic data - Weekly HRV patterns (Monday dips, Friday peaks) - Sleep score variations - Training load periodization (4-week cycles) - Occasional anomalies (2% chance) - Add analytics fixtures for different data scenarios (90d, 21d, 7d, 3d) - Write 13 comprehensive tests covering: - Baseline status levels (ready/partial/insufficient) - IQR calculations and bounds - Anomaly detection (warning/critical thresholds) - Multi-metric baseline calculation Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude · 2026-01-13T05:44:47Z

Test comment from review

- Add type params to ASGIConnection[Any, Any, Any, Any] - Add type params to Response[Any] - Add type params to Request[Any, Any, Any] - Import ASGIApp from litestar.types - Add type: ignore for scope comparison (litestar typing issue) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

StuMason · 2026-01-13T05:55:39Z

PR Review: Feature/analytics engine

Executive Summary

This PR adds a comprehensive analytics baseline system for health metrics. Overall implementation is solid and well-architected, with proper multi-tenancy, good statistical methods, and security-conscious design. Found several issues that should be addressed before merge.

🔴 Critical Issues

1. Authorization Bypass: Missing user_id Validation

Location: src/polar_flow_server/core/auth.py:265-271

Issue: In per_user_api_key_guard, user-scoped API keys are validated ONLY when api_key.user_id is not None. However, if an attacker uses a service-level API key (where user_id=None), they can access ANY user's data by changing the path parameter.

Current logic:

if api_key.user_id is not None and path_user_id:
    if api_key.user_id != path_user_id:
        raise NotAuthorizedException(...)

Problem: When api_key.user_id is None (service-level key), the check is skipped entirely.

Impact: Anyone with a service-level API key can access all users' baseline data.

Recommendation: Add explicit service-level key validation or require admin privileges for cross-user access.

Severity: 🔴 HIGH - Authorization bypass in multi-tenant system

2. Type Ignore Comments Hide Real Issue

Location: src/polar_flow_server/models/baseline.py:182-183

extreme_lower = self.q1 - 3 * self.iqr  # type: ignore
extreme_upper = self.q3 + 3 * self.iqr  # type: ignore

Issue: The type ignores are suppressing legitimate type errors. self.q1 and self.q3 can be None, but the check on line 181 only validates self.iqr.

Correct fix:

if self.iqr is not None and self.q1 is not None and self.q3 is not None:
    extreme_lower = self.q1 - 3 * self.iqr
    extreme_upper = self.q3 + 3 * self.iqr

Severity: 🟠 Medium (could cause runtime errors if q1/q3 are None)

⚠️ Security Concerns

3. SQL Injection Risk via User-Controlled user_id

Location: All API endpoints in src/polar_flow_server/api/baselines.py

Issue: The user_id path parameter is passed directly to SQLAlchemy queries without validation. While SQLAlchemy's parameterization provides protection, there's no validation that user_id is well-formed.

Recommendation: Add validation or use Litestar's path parameter constraints.

Severity: Medium-High (mitigated by SQLAlchemy but defense-in-depth is important)

4. Race Condition in Baseline Calculation

Location: src/polar_flow_server/services/baseline.py:71

Issue: calculate_all_baselines() commits once at the end, but concurrent requests for the same user could cause lost updates or database deadlocks.

Recommendation: Add user-level locking using PostgreSQL advisory locks.

Severity: Medium

5. No Input Validation on Float Values

Location: src/polar_flow_server/api/baselines.py:149-155

The check_anomaly endpoint accepts value: float without bounds checking. Could pass Infinity, -Infinity, or NaN.

Recommendation: Add validation using math.isfinite(value).

Severity: Low-Medium

🐛 Bugs & Logic Issues

6. Incorrect Quantile Handling

Location: src/polar_flow_server/services/baseline.py:309-312

if sample_count >= 4:
    qs = quantiles(values, n=4)
    q1 = qs[0]
    q3 = qs[2] if len(qs) > 2 else None

Issue: The if len(qs) > 2 check is redundant—quantiles(n=4) always returns exactly 3 values.

Recommendation: Simplify to q1, q3 = qs[0], qs[2]

Severity: Low (no functional impact)

✅ What's Done Well

Excellent multi-tenancy: UserScopedMixin and user_id on all queries
Robust statistics: IQR-based anomaly detection is appropriate for skewed health metrics
Good migration: Proper indices, composite keys, and comments
Idempotent upserts: ON CONFLICT DO UPDATE prevents duplicates
Rate limiting: Proper implementation in auth guard
Clear API documentation: Well-documented endpoints

📋 Recommendations Summary

Must Fix Before Merge:

✅ Fix authorization bypass for service-level API keys (Add Claude Code GitHub Workflow #1)
✅ Fix type ignore issues in baseline.py is_anomaly method (feat: Complete V3 API Integration with HTMX Admin Panel #2)

Should Fix:

Add user_id validation across all endpoints (feat: Complete Polar Experience - Dashboard, API, Biosensing & Analytics #3)
Add race condition protection for concurrent calculations (ci(deps)(deps): Bump actions/checkout from 4 to 6 #4)
Add input validation for float values (feat: API key authentication and Alembic migrations #5)

Nice to Have:

Remove redundant quantile check (feat: Admin authentication and dynamic OAuth URLs #6)

Overall Assessment

Recommendation: Approve with required changes

The implementation demonstrates strong architectural design and appropriate statistical methods. The critical authorization issue (#1) and type safety issue (#2) should be addressed. Once fixed, this will be a solid foundation for the analytics engine.

Great work on the comprehensive baseline infrastructure! 🎉

Review generated by Claude Code Review workflow

- Enable use_sticky_comment for automatic PR comment posting - Remove gh pr comment from allowed tools (was causing shell escaping issues) - Add Write tool for flexibility - Update prompt to output review directly instead of posting manually The previous approach had Claude try to post via gh pr comment, but markdown with code blocks caused shell escaping failures. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Fix type safety in is_anomaly(): Remove type: ignore comments by checking all required values (q1, q3, iqr) before calculations - Add user_id format validation: Regex-based validation prevents injection attacks, limits to alphanumeric with _ and - only - Add float input validation: Reject NaN, inf, -inf values in check_anomaly endpoint to prevent edge case errors - Remove redundant quantile check: quantiles(n=4) always returns exactly [Q1, Q2, Q3], no length check needed - Document service-level API key authorization model: Clarify that user_id=None keys intentionally have admin access for SaaS backends Security improvements: - ValidationException for invalid user_id format (1-100 chars) - ValidationException for non-finite float values - Don't reveal all valid metric names in error messages Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

StuMason and others added 3 commits January 13, 2026 05:24

style: Fix import sorting and formatting

f92d5c8

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

StuMason and others added 2 commits January 13, 2026 05:55

StuMason merged commit 9016c3c into main Jan 13, 2026
4 of 5 checks passed

StuMason deleted the feature/analytics-engine branch January 13, 2026 06:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/analytics engine#9

Feature/analytics engine#9
StuMason merged 6 commits intomainfrom
feature/analytics-engine

StuMason commented Jan 13, 2026

Uh oh!

claude bot commented Jan 13, 2026

Uh oh!

StuMason commented Jan 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

StuMason commented Jan 13, 2026

Uh oh!

claude bot commented Jan 13, 2026

Uh oh!

StuMason commented Jan 13, 2026

PR Review: Feature/analytics engine

Executive Summary

🔴 Critical Issues

1. Authorization Bypass: Missing user_id Validation

2. Type Ignore Comments Hide Real Issue

⚠️ Security Concerns

3. SQL Injection Risk via User-Controlled user_id

4. Race Condition in Baseline Calculation

5. No Input Validation on Float Values

🐛 Bugs & Logic Issues

6. Incorrect Quantile Handling

✅ What's Done Well

📋 Recommendations Summary

Must Fix Before Merge:

Should Fix:

Nice to Have:

Overall Assessment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant