Conversation
📝 Walkthrough

Adds a data lifecycle system: new archival models, repositories, a service, API endpoints, a Celery task and schedule, a migration and seed, frontend UI and hooks, and repository/service integrations to aggregate, archive, and retain historical time-series data as daily summaries.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Frontend
    participant API as FastAPI
    participant Service as ArchivalService
    participant RepoA as ArchivalSettingRepo
    participant RepoD as DataPointArchiveRepo
    participant DB as Database
    rect rgb(100,200,150,0.5)
        Client->>API: GET /api/v1/settings/archival
        API->>Service: get_settings(db)
        Service->>RepoA: get(db)
        RepoA->>DB: SELECT archival_settings
        DB-->>RepoA: setting
        Service->>RepoD: get_storage_estimate(db)
        RepoD->>DB: pg_catalog queries
        DB-->>RepoD: size metrics
        RepoD-->>Service: StorageEstimate
        Service-->>API: ArchivalSettingWithEstimate
        API-->>Client: JSON
    end
    rect rgb(150,150,200,0.5)
        Client->>API: POST /api/v1/settings/archival/run
        API->>Service: run_daily_archival(db)
        Service->>RepoD: archive_data_before(cutoff)
        RepoD->>DB: SELECT/INSERT/DELETE (batched)
        DB-->>RepoD: affected counts
        RepoD-->>Service: archived_rows
        Service->>RepoD: delete_archive_before(cutoff_retention)
        RepoD->>DB: DELETE old archives
        RepoD-->>Service: deleted_rows
        Service-->>API: summary
        API-->>Client: JSON (summary)
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues
Linked issues
🚥 Pre-merge checks: ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
🚅 Deployed to the open-wearables-pr-535 environment in Open Wearables
Actionable comments posted: 10
🧹 Nitpick comments (8)
backend/tests/conftest.py (1)
218-218: Align variable naming with repository rule. On line 218, rename `webhook_module` to camelCase to match the project convention for variables.

Proposed fix

```diff
- webhook_module = "app.api.routes.v1.garmin_webhooks"
+ webhookModule = "app.api.routes.v1.garmin_webhooks"
  ...
-     patch(f"{webhook_module}.get_trace_id", return_value=None),
-     patch(f"{webhook_module}.mark_type_success", return_value=False),
+     patch(f"{webhookModule}.get_trace_id", return_value=None),
+     patch(f"{webhookModule}.mark_type_success", return_value=False),
  ...
-         f"{webhook_module}.get_backfill_status",
+         f"{webhookModule}.get_backfill_status",
  ...
-     patch(f"{webhook_module}.trigger_next_pending_type", return_value=None),
+     patch(f"{webhookModule}.trigger_next_pending_type", return_value=None),
```

As per coding guidelines, "Follow PEP 8 naming conventions: camelCase for variables, UPPER_SNAKE_CASE for constants, PascalCase for classes".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/conftest.py` at line 218, The variable webhook_module should be renamed to camelCase (webhookModule) to follow project conventions; rename the identifier webhook_module to webhookModule in backend/tests/conftest.py and update every reference/usage in that file (fixtures, tests, or helper calls) to the new name so references like webhook_module -> webhookModule remain consistent and tests import/use the updated symbol.

backend/app/schemas/series_types.py (1)
434-439: Fail fast when a series type has no aggregation mapping. Defaulting to `AVG` can silently corrupt archival semantics for future cumulative metrics. Prefer explicit failure so mapping gaps are caught during development.

Proposed hardening

```diff
 def get_aggregation_method(series_type: SeriesType) -> AggregationMethod:
     """Get the aggregation method for a series type.

-    Falls back to AVG for unknown types (safe default for rate-like metrics).
+    Raises when no mapping exists to avoid silent mis-aggregation.
     """
-    return AGGREGATION_METHOD_BY_TYPE.get(series_type, AggregationMethod.AVG)
+    return AGGREGATION_METHOD_BY_TYPE[series_type]
```
Verify each finding against the current code and only fix it if needed. In `@backend/app/schemas/series_types.py` around lines 434 - 439, Replace the silent default in get_aggregation_method with an explicit failure: check AGGREGATION_METHOD_BY_TYPE for the provided series_type inside get_aggregation_method (rather than using .get(..., AggregationMethod.AVG)), and raise a clear exception (e.g., ValueError or KeyError) that includes the SeriesType value when no mapping exists; keep the function signature and return type but remove the fallback to AggregationMethod.AVG so missing mappings fail fast during development.

frontend/src/lib/api/services/archival.service.ts (1)
49-84: Optional cleanup: remove duplicated error wrapping around `apiClient` calls. `apiClient` already normalizes non-`ApiError` failures into `ApiError.networkError`, so these `try/catch` blocks repeat central behavior. Flattening this service reduces noise.
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/api/services/archival.service.ts` around lines 49 - 84, Remove the duplicated try/catch wrappers around apiClient calls in the methods getSettings, updateSettings, and triggerArchival and let apiClient propagate its normalized ApiError; specifically, replace each async method body so it directly returns await apiClient.get/put/post with the same generic types and endpoints (keep method names and signatures: getSettings, updateSettings, triggerArchival) rather than catching and re-wrapping errors, thereby flattening the service code and relying on apiClient's centralized error normalization.

backend/app/schemas/archival.py (1)
59-59: Constrain `growth_class` to a literal union instead of free-form `str`. Line 59 currently allows any string. Tightening this field improves validation and keeps backend/frontend contracts aligned.

🔒 Suggested schema tightening

```diff
+from typing import Literal
@@
-    growth_class: str = Field(description="Growth complexity class: 'bounded' | 'linear_efficient' | 'linear'")
+    growth_class: Literal["bounded", "linear_efficient", "linear"] = Field(
+        description="Growth complexity class"
+    )
```
Verify each finding against the current code and only fix it if needed. In `@backend/app/schemas/archival.py` at line 59, The growth_class field on the Pydantic model (growth_class: str = Field(...)) is too permissive—change its type to a Literal union to enforce allowed values (Literal['bounded', 'linear_efficient', 'linear']), import Literal from typing (or typing_extensions if needed), and keep the existing Field(description=...) on growth_class so validation and schema generation reflect the constrained enum-like choices.

frontend/src/hooks/api/use-archival.ts (1)
21-33: Add `onMutate` + rollback to `useUpdateArchivalSettings` for immediate UI feedback. Lines 21-33 currently wait for server success before local state updates. This hook can follow the project’s optimistic mutation pattern to avoid a stale-feeling settings form.

💡 Suggested refactor

```diff
 export function useUpdateArchivalSettings() {
   const queryClient = useQueryClient();
   return useMutation({
     mutationFn: (data: ArchivalSettingsUpdate) =>
       archivalService.updateSettings(data),
+    onMutate: async (next) => {
+      await queryClient.cancelQueries({
+        queryKey: queryKeys.archival.settings(),
+      });
+      const previous = queryClient.getQueryData(queryKeys.archival.settings());
+      queryClient.setQueryData(queryKeys.archival.settings(), (current: any) =>
+        current
+          ? {
+              ...current,
+              settings: {
+                ...current.settings,
+                ...next,
+              },
+            }
+          : current
+      );
+      return { previous };
+    },
     onSuccess: async () => {
       await queryClient.invalidateQueries({
         queryKey: queryKeys.archival.all,
       });
       toast.success('Data lifecycle settings updated');
     },
-    onError: (error) => {
+    onError: (error, _vars, context) => {
+      if (context?.previous) {
+        queryClient.setQueryData(
+          queryKeys.archival.settings(),
+          context.previous
+        );
+      }
       toast.error(`Failed to update settings: ${getErrorMessage(error)}`);
     },
   });
 }
```

Based on learnings: Applies to frontend/src/hooks/api/**/*.ts : Use optimistic updates in mutations with onMutate for immediate UI feedback and onError rollback.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/hooks/api/use-archival.ts` around lines 21 - 33, The mutation in useUpdateArchivalSettings (useMutation calling archivalService.updateSettings) lacks an onMutate optimistic update and rollback; add onMutate that cancels queries for queryKeys.archival.all, snapshots the previous cache state, applies the updated ArchivalSettings to the query cache for immediate UI feedback, return the snapshot as context; modify onError to restore the previous snapshot from context to rollback if the server call fails; keep invalidateQueries in onSuccess or onSettled to refetch authoritative data.

backend/app/services/archival_service.py (2)
27-31: Add explicit return type annotation to `__init__`. Per coding guidelines, all functions must have explicit return type annotations.

Proposed fix

```diff
-    def __init__(self, log: Logger):
+    def __init__(self, log: Logger) -> None:
         self.logger = log
         self.settings_repo = ArchivalSettingRepository()
         self.archive_repo = DataPointSeriesArchiveRepository()
```

As per coding guidelines, "Type hints are required on all functions with explicit parameter and return type annotations".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/archival_service.py` around lines 27 - 31, The __init__ method in ArchivalService is missing an explicit return type; update the constructor signature of def __init__(self, log: Logger) to include an explicit return type annotation (-> None) so it reads def __init__(self, log: Logger) -> None:, leaving body unchanged (self.logger, self.settings_repo = ArchivalSettingRepository(), self.archive_repo = DataPointSeriesArchiveRepository()); ensure any required imports for typing are present if your linter/enforcer requires them.
56-70: Consider adding `@handle_exceptions` for consistency or using a TypedDict return type. The `run_daily_archival` method lacks the `@handle_exceptions` decorator that other public methods use. While the Celery task has its own try/except, this creates inconsistency. The return type `dict` could also be more specific.
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/archival_service.py` around lines 56 - 70, The public method run_daily_archival should be decorated with the existing `@handle_exceptions` decorator for consistency with other service methods and to centralize error handling; add `@handle_exceptions` above the run_daily_archival method definition and ensure imports/reference to handle_exceptions are present; also replace the broad return type dict with a specific TypedDict (e.g., ArchivalSummary with keys archived_rows, deleted_rows, deleted_live_rows) and use that type for the method signature and the summary variable to make the return shape explicit (update any callers if needed).backend/app/repositories/archival_repository.py (1)
backend/app/repositories/archival_repository.py (1)

345-360: Inconsistent `else_` clauses in case statements. The `distance_sum` and `flights_climbed_sum` case statements are missing `else_=0` or `else_=None`, unlike the other aggregations. While the post-processing handles `None`, this inconsistency reduces readability.

Proposed fix for consistency

```diff
 func.sum(
     case(
         (
             DataPointSeriesArchive.series_type_definition_id == distance_id,
             DataPointSeriesArchive.value,
         ),
+        else_=None,
     )
 ).label("distance_sum"),
 func.sum(
     case(
         (
             DataPointSeriesArchive.series_type_definition_id == flights_id,
             DataPointSeriesArchive.value,
         ),
+        else_=None,
     )
 ).label("flights_climbed_sum"),
```
Verify each finding against the current code and only fix it if needed. In `@backend/app/repositories/archival_repository.py` around lines 345 - 360, The two SUM+CASE aggregations for distance_sum and flights_climbed_sum in archival_repository.py are missing explicit else_ clauses; update the case(...) expressions used inside func.sum for DataPointSeriesArchive.series_type_definition_id == distance_id and == flights_id (the ones labeled "distance_sum" and "flights_climbed_sum") to include else_=0 (or else_=None to match desired behavior) so they are consistent with the other aggregations—locate the case() calls wrapped by func.sum(...) and add the else_ parameter to each.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/api/routes/v1/archival.py`:
- Around line 21-25: Add explicit response_model and status_code kwargs to the
three `@router` decorators for the archival endpoints: the GET
"/settings/archival" route, the route in the 33-41 block, and the route in the
50-54 block; set status_code=200 for normal GET responses and status_code=202
for the "run" endpoint (the long-running/run handler referenced in the review),
and set response_model to the Pydantic model that the handler actually returns
(use the handler's return type/model defined for each route). Ensure you add
these arguments directly on the `@router.get/`@router.post decorators for those
endpoints so OpenAPI and response contracts are explicit.
In `@backend/app/models/data_point_series_archive.py`:
- Around line 25-33: The model's __table_args__ currently only defines a
composite UniqueConstraint ("data_source_id","series_type_definition_id","date",
name="uq_archive_source_type_date"); add a separate Index on the "date" column
to that __table_args__ tuple to support efficient retention/cleanup queries
(e.g., an Index("ix_data_point_series_archive_date", "date")), ensuring the
index is declared alongside the existing UniqueConstraint.
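A declarative sketch of that `__table_args__` shape, with the standalone date index next to the composite unique constraint. Column types here are simplified placeholders for the real model's columns.

```python
# Sketch: UniqueConstraint plus a standalone Index in __table_args__.
# Column types are simplified stand-ins for the actual model.
from sqlalchemy import (
    Column, Date, Float, Index, Integer, UniqueConstraint, create_engine,
    inspect,
)
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class DataPointSeriesArchive(Base):
    __tablename__ = "data_point_series_archive"
    id = Column(Integer, primary_key=True)
    data_source_id = Column(Integer, nullable=False)
    series_type_definition_id = Column(Integer, nullable=False)
    date = Column(Date, nullable=False)
    value = Column(Float)
    __table_args__ = (
        UniqueConstraint(
            "data_source_id",
            "series_type_definition_id",
            "date",
            name="uq_archive_source_type_date",
        ),
        # Retention deletes filter on date alone; the composite unique index
        # leads with data_source_id, so it cannot serve that predicate well.
        Index("ix_data_point_series_archive_date", "date"),
    )

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
```

The rationale: a B-tree index is only usable for predicates on its leading column(s), so `WHERE date < :cutoff` cannot use `uq_archive_source_type_date` efficiently.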
In `@backend/app/models/user_invitation_code.py`:
- Line 3: Replace the raw sqlalchemy.Index import/usages with the project's
mapping wrapper: import and use Indexed from app.mappings instead of Index, and
update any model column/type annotations that currently rely on plain Index to
use the Indexed constraint (alongside Mapped/PrimaryKey/Unique patterns) in the
UserInvitationCode model (references: the import line and the places where Index
is applied around the fields at the spots corresponding to the Index usages on
the model, e.g., the definitions around lines where Index is referenced). Ensure
the file uses Mapped typing with the app.mappings constraints (Indexed) rather
than sqlalchemy.Index so model-layer contracts are consistent.
In `@backend/app/repositories/archival_repository.py`:
- Around line 232-243: The deletion block can run even when values_list is empty
due to unknown series_type_definition_id values, risking data loss; modify the
logic around the DataPointSeries delete (references: values_list,
series_type_definition_id, source_id_list, cutoff_date, deleted_count,
total_deleted, db.commit()) to skip the delete and db.commit() when values_list
(or the computed source_id_list for the batch) is empty—either continue to next
batch or log and raise—so no live rows are removed unless there are archived
values to back them up.
- Around line 278-283: The deletion uses Query.delete() with .limit(), which
SQLAlchemy forbids; modify the batch delete in archival_repository.py to first
select a limited set of primary keys and then delete by those IDs: use
DataPointSeries.with_entities(DataPointSeries.id).filter(cast(DataPointSeries.recorded_at,
Date) < cutoff_date).limit(ARCHIVE_BATCH_SIZE) to produce a subquery or list of
ids, then call
db.query(DataPointSeries).filter(DataPointSeries.id.in_(...)).delete(synchronize_session=False)
(or fetch ids and delete by id list) so the deletion is performed safely in
batches; keep references to DataPointSeries, cutoff_date and ARCHIVE_BATCH_SIZE
when implementing the change.
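The "select a bounded set of ids, then delete by id" pattern described above can be shown on a toy table (the model here is illustrative, not the project's `DataPointSeries`): `Query.delete()` cannot be combined with `.limit()`, so the limit goes on the id query.

```python
# Runnable sketch of batched delete-by-id; table and batch size illustrative.
import datetime

from sqlalchemy import Column, Date, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Point(Base):
    __tablename__ = "points"
    id = Column(Integer, primary_key=True)
    recorded_on = Column(Date, nullable=False)

BATCH_SIZE = 2  # tiny for demonstration; production code would use thousands

def delete_before(session: Session, cutoff: datetime.date) -> int:
    """Delete rows older than *cutoff* in bounded batches, committing each."""
    total = 0
    while True:
        # Select at most BATCH_SIZE primary keys matching the predicate...
        ids = [
            pk
            for (pk,) in session.query(Point.id)
            .filter(Point.recorded_on < cutoff)
            .limit(BATCH_SIZE)
        ]
        if not ids:
            break
        # ...then delete exactly those rows.
        total += (
            session.query(Point)
            .filter(Point.id.in_(ids))
            .delete(synchronize_session=False)
        )
        session.commit()
    return total

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)
session.add_all(
    Point(recorded_on=datetime.date(2025, 1, 1) + datetime.timedelta(days=i))
    for i in range(5)
)
session.commit()
deleted = delete_before(session, datetime.date(2025, 1, 4))  # drops Jan 1-3
```

Committing per batch keeps transactions short, which matters for the lock-duration concern raised elsewhere in this review.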
In `@backend/app/services/summaries_service.py`:
- Around line 202-206: The current code in summaries_service.py returns
live_results when archival_settings_repo.get(db_session) yields
settings.archive_after_days is None, which hides previously archived data;
remove that early return and always merge archive reads with live results.
Specifically, in the method using archival_settings_repo.get and returning
live_results, stop gating archive reads on settings.archive_after_days being
set; instead call the archive read path unconditionally (or detect whether any
archived data exists) and merge its results with live_results, while keeping the
existing exception handling (catch errors from archival reads but do not return
only live_results unless the archive read actually fails). Use the existing
symbols archival_settings_repo.get, live_results, and the archive read/merge
logic in this file to implement the change.
In
`@backend/migrations/versions/2026_02_27_1248-0421f956a084_archive_data_point_series.py`:
- Around line 29-41: Add a standalone index on the archive table's date column
to speed retention deletes: in the migration where
op.create_table("data_point_series_archive", ...) is defined (the block that
declares columns id, data_source_id, series_type_definition_id, date, value,
sample_count and the UniqueConstraint "uq_archive_source_type_date"), add an
op.create_index call that creates an index on the "date" column (choose a clear
name like "ix_data_point_series_archive_date") so DELETE ... WHERE date <
cutoff_date can use the index instead of scanning the unique composite index.
- Around line 22-28: The migration creates "archival_settings" but does not
enforce the singleton row (id=1) or insert it; update the migration to (1) add a
CheckConstraint on the table creation (e.g., sa.CheckConstraint("id = 1",
name="ck_archival_settings_singleton")) to enforce singleton semantics and (2)
after op.create_table("archival_settings", ...) insert the required row with
id=1 via op.execute or op.bulk_insert (provide sensible defaults or NULLs for
archive_after_days/delete_after_days). Also update the downgrade to delete the
row where id=1 (or rely on DROP TABLE) before dropping the table so the
migration cleanly reverses.
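The singleton guard the prompt describes can be demonstrated outside Alembic: `op.create_table` accepts the same `sa.CheckConstraint`, and plain SQLAlchemy against SQLite shows the semantics. Table and column names mirror the review text; defaults are placeholders.

```python
# Sketch of the id=1 singleton constraint plus the seeded row; the real
# migration would express the same via op.create_table / op.execute.
from sqlalchemy import (
    CheckConstraint, Column, Integer, MetaData, Table, create_engine, insert,
)
from sqlalchemy.exc import IntegrityError

metadata = MetaData()
archival_settings = Table(
    "archival_settings",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("archive_after_days", Integer, nullable=True),
    Column("delete_after_days", Integer, nullable=True),
    # Any row other than id=1 is rejected, so at most one row can exist.
    CheckConstraint("id = 1", name="ck_archival_settings_singleton"),
)

engine = create_engine("sqlite://")
metadata.create_all(engine)

with engine.begin() as conn:
    # The migration would insert this row right after create_table.
    conn.execute(insert(archival_settings).values(id=1))

try:
    with engine.begin() as conn:
        conn.execute(insert(archival_settings).values(id=2))
    second_row_allowed = True
except IntegrityError:
    second_row_allowed = False
```

With the constraint in place, application code can rely on `get(id=1)` always addressing the one legal row.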
In `@backend/scripts/init/seed_archival_settings.py`:
- Around line 11-20: The seed logic for ArchivalSetting uses a check-then-insert
(db.query(ArchivalSetting).filter(ArchivalSetting.id == 1).first() then
db.add/commit) which is racy; update the seed to handle a concurrent insert by
attempting the insert and catching the DB uniqueness error (e.g.,
sqlalchemy.exc.IntegrityError) on commit, performing db.rollback() and treating
that as “already initialized”; alternatively use an upsert/ON CONFLICT DO
NOTHING or a get_or_create-style operation for ArchivalSetting(id=1) so two
simultaneous instances won’t crash on PK conflict.
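The race-tolerant variant of the seed can be sketched like this: attempt the insert and treat a PK/uniqueness violation as "already initialized". The model is an illustrative stand-in for the project's `ArchivalSetting`.

```python
# Sketch: insert-and-catch instead of check-then-insert for the seed.
from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class ArchivalSetting(Base):
    __tablename__ = "archival_settings"
    id = Column(Integer, primary_key=True)
    archive_after_days = Column(Integer, nullable=True)

def seed_archival_settings(db: Session) -> bool:
    """Insert the singleton row; return False if it already existed."""
    db.add(ArchivalSetting(id=1, archive_after_days=None))
    try:
        db.commit()
        return True
    except IntegrityError:
        # A concurrent seeder won the race; roll back and carry on.
        db.rollback()
        return False

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
first = seed_archival_settings(Session(engine))
second = seed_archival_settings(Session(engine))
```

On PostgreSQL the same intent can be expressed as `INSERT ... ON CONFLICT DO NOTHING`, which avoids the exception round-trip entirely.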
In `@backend/tests/conftest.py`:
- Line 234: Rename the snake_case variable webhook_module to camelCase
webhookModule where it's declared and referenced (ensure all uses in this file,
including in the patch call, are updated), and update the patch call for
trigger_next_pending_type to include an explicit return_value matching its
contract (e.g., return_value={} or return_value={"status": "ok"}) so the patched
function returns a dict[str, Any] like the surrounding patches.
---
Nitpick comments:
In `@backend/app/repositories/archival_repository.py`:
- Around line 345-360: The two SUM+CASE aggregations for distance_sum and
flights_climbed_sum in archival_repository.py are missing explicit else_
clauses; update the case(...) expressions used inside func.sum for
DataPointSeriesArchive.series_type_definition_id == distance_id and ==
flights_id (the ones labeled "distance_sum" and "flights_climbed_sum") to
include else_=0 (or else_=None to match desired behavior) so they are consistent
with the other aggregations—locate the case() calls wrapped by func.sum(...) and
add the else_ parameter to each.
In `@backend/app/schemas/archival.py`:
- Line 59: The growth_class field on the Pydantic model (growth_class: str =
Field(...)) is too permissive—change its type to a Literal union to enforce
allowed values (Literal['bounded', 'linear_efficient', 'linear']), import
Literal from typing (or typing_extensions if needed), and keep the existing
Field(description=...) on growth_class so validation and schema generation
reflect the constrained enum-like choices.
In `@backend/app/schemas/series_types.py`:
- Around line 434-439: Replace the silent default in get_aggregation_method with
an explicit failure: check AGGREGATION_METHOD_BY_TYPE for the provided
series_type inside get_aggregation_method (rather than using .get(...,
AggregationMethod.AVG)), and raise a clear exception (e.g., ValueError or
KeyError) that includes the SeriesType value when no mapping exists; keep the
function signature and return type but remove the fallback to
AggregationMethod.AVG so missing mappings fail fast during development.
In `@backend/app/services/archival_service.py`:
- Around line 27-31: The __init__ method in ArchivalService is missing an
explicit return type; update the constructor signature of def __init__(self,
log: Logger) to include an explicit return type annotation (-> None) so it reads
def __init__(self, log: Logger) -> None:, leaving body unchanged (self.logger,
self.settings_repo = ArchivalSettingRepository(), self.archive_repo =
DataPointSeriesArchiveRepository()); ensure any required imports for typing are
present if your linter/enforcer requires them.
- Around line 56-70: The public method run_daily_archival should be decorated
with the existing `@handle_exceptions` decorator for consistency with other
service methods and to centralize error handling; add `@handle_exceptions` above
the run_daily_archival method definition and ensure imports/reference to
handle_exceptions are present; also replace the broad return type dict with a
specific TypedDict (e.g., ArchivalSummary with keys archived_rows, deleted_rows,
deleted_live_rows) and use that type for the method signature and the summary
variable to make the return shape explicit (update any callers if needed).
In `@backend/tests/conftest.py`:
- Line 218: The variable webhook_module should be renamed to camelCase
(webhookModule) to follow project conventions; rename the identifier
webhook_module to webhookModule in backend/tests/conftest.py and update every
reference/usage in that file (fixtures, tests, or helper calls) to the new name
so references like webhook_module -> webhookModule remain consistent and tests
import/use the updated symbol.
In `@frontend/src/hooks/api/use-archival.ts`:
- Around line 21-33: The mutation in useUpdateArchivalSettings (useMutation
calling archivalService.updateSettings) lacks an onMutate optimistic update and
rollback; add onMutate that cancels queries for queryKeys.archival.all,
snapshots the previous cache state, applies the updated ArchivalSettings to the
query cache for immediate UI feedback, return the snapshot as context; modify
onError to restore the previous snapshot from context to rollback if the server
call fails; keep invalidateQueries in onSuccess or onSettled to refetch
authoritative data.
In `@frontend/src/lib/api/services/archival.service.ts`:
- Around line 49-84: Remove the duplicated try/catch wrappers around apiClient
calls in the methods getSettings, updateSettings, and triggerArchival and let
apiClient propagate its normalized ApiError; specifically, replace each async
method body so it directly returns await apiClient.get/put/post with the same
generic types and endpoints (keep method names and signatures: getSettings,
updateSettings, triggerArchival) rather than catching and re-wrapping errors,
thereby flattening the service code and relying on apiClient's centralized error
normalization.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (28)
backend/app/api/routes/v1/__init__.py
backend/app/api/routes/v1/archival.py
backend/app/integrations/celery/core.py
backend/app/integrations/celery/tasks/__init__.py
backend/app/integrations/celery/tasks/archival_task.py
backend/app/models/__init__.py
backend/app/models/archival_setting.py
backend/app/models/data_point_series.py
backend/app/models/data_point_series_archive.py
backend/app/models/refresh_token.py
backend/app/models/user_invitation_code.py
backend/app/repositories/__init__.py
backend/app/repositories/archival_repository.py
backend/app/repositories/repositories.py
backend/app/schemas/archival.py
backend/app/schemas/series_types.py
backend/app/services/__init__.py
backend/app/services/archival_service.py
backend/app/services/summaries_service.py
backend/migrations/versions/2026_02_27_1248-0421f956a084_archive_data_point_series.py
backend/scripts/init/seed_archival_settings.py
backend/scripts/start/app.sh
backend/tests/conftest.py
frontend/src/hooks/api/use-archival.ts
frontend/src/lib/api/services/archival.service.ts
frontend/src/lib/query/keys.ts
frontend/src/routes/_authenticated/settings.tsx
frontend/src/routes/_authenticated/settings/data-lifecycle-tab.tsx
Actionable comments posted: 5
🧹 Nitpick comments (2)
backend/app/services/summaries_service.py (1)
201-204: Narrow the exception handling to expected failure cases. Using bare `except Exception` could mask unexpected errors (e.g., database connectivity issues) and silently skip archive merging. Consider catching the specific exception for a missing settings row.

Proposed narrower exception handling

```diff
+from sqlalchemy.orm.exc import NoResultFound
+
 def _merge_archive_activity( ... ):
     try:
         self.archival_settings_repo.get(db_session)
-    except Exception:
+    except NoResultFound:
+        # Settings row not seeded yet; archival not available
         return live_results
```
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/summaries_service.py` around lines 201 - 204, The code currently swallows all exceptions when calling archival_settings_repo.get(db_session); replace the bare except with a narrow catch for the specific "missing row" error (e.g., sqlalchemy.orm.exc.NoResultFound or your repository's NotFound/EntityNotFound) so only the expected "no settings" case returns live_results, and let other exceptions propagate (or re-raise) so database/connection errors are not hidden; update the try/except around archival_settings_repo.get to import and catch that specific exception and keep the existing return live_results behavior for that case.

backend/app/repositories/archival_repository.py (1)
266-277: Consider adding batching to `delete_archive_before` for consistency. Unlike `delete_live_before` (which batches with safety limits), this method performs an unbounded `DELETE`. While archive data is expected to be smaller (~1/500 of raw), large retention windows could still accumulate significant data. An unbounded delete may cause long table locks.

Optional: Add batching similar to delete_live_before

```diff
 def delete_archive_before(self, db: DbSession, cutoff_date: date) -> int:
-    deleted = (
-        db.query(DataPointSeriesArchive)
-        .filter(DataPointSeriesArchive.date < cutoff_date)
-        .delete(synchronize_session=False)
-    )
-    db.commit()
-    return deleted
+    """Permanently delete archive rows older than *cutoff_date*.
+
+    Uses batched deletes to avoid long-running transactions.
+    """
+    total_deleted = 0
+    while True:
+        ids_to_delete = (
+            db.query(DataPointSeriesArchive.id)
+            .filter(DataPointSeriesArchive.date < cutoff_date)
+            .limit(ARCHIVE_BATCH_SIZE)
+            .subquery()
+        )
+        deleted = (
+            db.query(DataPointSeriesArchive)
+            .filter(DataPointSeriesArchive.id.in_(ids_to_delete))
+            .delete(synchronize_session=False)
+        )
+        if deleted == 0:
+            break
+        total_deleted += deleted
+        db.commit()
+    return total_deleted
```
Verify each finding against the current code and only fix it if needed. In `@backend/app/repositories/archival_repository.py` around lines 266 - 277, The current delete_archive_before in function delete_archive_before deletes all DataPointSeriesArchive rows older than cutoff_date in one unbounded operation; change it to perform batched deletes (mirror delete_live_before behavior) by looping delete(...) with a batch_size limit (e.g., max_rows_per_pass) using .filter(DataPointSeriesArchive.date < cutoff_date).delete(synchronize_session=False), committing after each batch, accumulating a total_deleted count, and breaking the loop when a pass deletes fewer than the batch size or when a safety max_passes/max_rows limit is reached; return the accumulated total_deleted to preserve the original return contract.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/integrations/celery/core.py`:
- Around line 5-8: The import block is not isort-compliant; reorder the imports
so third-party packages come before first-party modules: import Celery, signals,
current_app as current_celery_app and crontab from celery first, then import
settings from app.config (i.e., ensure the symbols Celery, signals,
current_celery_app, crontab appear before from app.config import settings) so
isort will keep the file stable.
- Around line 89-92: The comment for the "run-daily-archival" Celery beat entry
is misleading: crontab(hour=3, minute=0) runs at 03:00 in the configured
timezone (timezone="Europe/Warsaw" set near line 50), not UTC; update the inline
comment next to the "run-daily-archival" schedule (task
"app.integrations.celery.tasks.archival_task.run_daily_archival", schedule
crontab(hour=3, minute=0)) to say "Daily at 03:00 Europe/Warsaw" (or change the
Celery timezone to UTC if you want it to run at 03:00 UTC instead).
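The timezone claim behind that comment fix can be checked with the standard library alone: 03:00 Europe/Warsaw corresponds to 02:00 UTC in winter (CET, UTC+1) and 01:00 UTC in summer (CEST, UTC+2), so the schedule never fires at 03:00 UTC. The specific dates below are arbitrary sample days.

```python
# Stdlib check: crontab(hour=3) under timezone="Europe/Warsaw" means 03:00
# local, which is 01:00 or 02:00 UTC depending on DST, never 03:00 UTC.
from datetime import datetime
from zoneinfo import ZoneInfo

warsaw = ZoneInfo("Europe/Warsaw")
utc = ZoneInfo("UTC")

winter_run = datetime(2026, 1, 15, 3, 0, tzinfo=warsaw).astimezone(utc)
summer_run = datetime(2026, 7, 15, 3, 0, tzinfo=warsaw).astimezone(utc)
```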
In `@frontend/src/routes/_authenticated/settings/data-lifecycle-tab.tsx`:
- Around line 195-198: The current parseInt(archiveDays) || 90 and
parseInt(deleteDays) || 365 silently mask invalid/zero inputs; change the
projection props to pass explicit numeric values or undefined/NaN instead of
falling back to defaults (e.g., compute const archiveDaysNum =
Number(archiveDays); const deleteDaysNum = Number(deleteDays); and pass those
values directly), and update the GrowthProjection component to treat non-finite
numbers and values <= 0 as “invalid / awaiting valid input” (render an invalid
state instead of showing a default projection) so the chart reflects form
validity.
- Around line 323-326: The "Run Now" Button currently calls
triggerMutation.mutate() directly (and is only disabled by
triggerMutation.isPending), which allows runs using stale persisted settings
when there are unsaved edits; implement a handleRunNow function that first
checks for unsaved changes (the form's dirty flag or the existing save function
for the retention policy), save those changes (call the existing
savePolicy/saveChanges function or submit the form) and only after that resolves
call triggerMutation.mutate(); replace onClick={() => triggerMutation.mutate()}
with onClick={handleRunNow} and also disable the button when the form is dirty
or save is in progress to prevent concurrent runs.
- Around line 43-52: The effect currently freezes form state to the first
payload by gating updates on hasInitialized; update the useEffect in this file
so it syncs whenever data.settings changes (remove or change the hasInitialized
check) and always call setArchiveEnabled, setArchiveDays, setDeleteEnabled,
setDeleteDays from that effect when data is present; also ensure the toggle and
input handlers for archive/delete call setIsDirty(true) when the user mutates
fields, and after the successful save handler (e.g., in your saveSettings or
onSaveSuccess function) reset setIsDirty(false) so the UI correctly reflects
dirty/clean state.
---
Nitpick comments:
In `@backend/app/repositories/archival_repository.py`:
- Around line 266-277: The current delete_archive_before in function
delete_archive_before deletes all DataPointSeriesArchive rows older than
cutoff_date in one unbounded operation; change it to perform batched deletes
(mirror delete_live_before behavior) by looping delete(...) with a batch_size
limit (e.g., max_rows_per_pass) using .filter(DataPointSeriesArchive.date <
cutoff_date).delete(synchronize_session=False), committing after each batch,
accumulating a total_deleted count, and breaking the loop when a pass deletes
fewer than the batch size or when a safety max_passes/max_rows limit is reached;
return the accumulated total_deleted to preserve the original return contract.
In `@backend/app/services/summaries_service.py`:
- Around line 201-204: The code currently swallows all exceptions when calling
archival_settings_repo.get(db_session); replace the bare except with a narrow
catch for the specific "missing row" error (e.g.,
sqlalchemy.orm.exc.NoResultFound or your repository's NotFound/EntityNotFound)
so only the expected "no settings" case returns live_results, and let other
exceptions propagate (or re-raise) so database/connection errors are not hidden;
update the try/except around archival_settings_repo.get to import and catch that
specific exception and keep the existing return live_results behavior for that
case.
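A sketch of the narrowed exception handling, assuming the repository raises SQLAlchemy's `NoResultFound` when the settings row is missing; `FakeArchivalSettingsRepo` and `get_summaries` below are illustrative stand-ins, not the real service code:

```python
from sqlalchemy.orm.exc import NoResultFound


class FakeArchivalSettingsRepo:
    """Illustrative stand-in; the real repo would raise NoResultFound via .one()."""

    def __init__(self, settings=None):
        self._settings = settings

    def get(self, db_session):
        if self._settings is None:
            raise NoResultFound("no archival_settings row")
        return self._settings


def get_summaries(repo, db_session, live_results):
    """Only the expected 'no settings row' case falls back to live results."""
    try:
        settings = repo.get(db_session)
    except NoResultFound:
        # Archival was never configured: safe, expected fallback.
        return live_results
    # Any other exception (connection loss, bad SQL, ...) propagates to the caller.
    return {"live": live_results, "archive_enabled": settings["archive_enabled"]}
```

The key difference from a bare `except` is that infrastructure failures now surface instead of being masked as "no settings".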
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- backend/app/api/routes/v1/archival.py
- backend/app/integrations/celery/core.py
- backend/app/models/archival_setting.py
- backend/app/models/data_point_series_archive.py
- backend/app/repositories/archival_repository.py
- backend/app/schemas/archival.py
- backend/app/services/summaries_service.py
- backend/migrations/versions/2026_02_27_1710-8f618017e710_archive_data_point_series.py
- backend/scripts/init/seed_archival_settings.py
- backend/tests/conftest.py
- frontend/src/lib/api/services/archival.service.ts
- frontend/src/routes/_authenticated/settings/data-lifecycle-tab.tsx
🚧 Files skipped from review as they are similar to previous changes (3)
- backend/app/models/data_point_series_archive.py
- frontend/src/lib/api/services/archival.service.ts
- backend/app/api/routes/v1/archival.py
Actionable comments posted: 4
♻️ Duplicate comments (1)
backend/migrations/versions/2026_03_03_2227-cd18aa553afd_archive.py (1)
30-50: ⚠️ Potential issue | 🟠 Major

Add a standalone `bucket_start_at` index for archive retention queries.

At Line 30-49, the archive table has only a PK + composite unique index. Retention and range reads are time-based, so this will degrade to scans as the archive grows.

Proposed migration patch

```diff
 op.create_table(
     "data_point_series_archive",
     sa.Column("id", sa.UUID(), nullable=False),
     sa.Column("data_source_id", sa.UUID(), nullable=False),
     sa.Column("series_type_definition_id", sa.Integer(), nullable=False),
     sa.Column("bucket_start_at", sa.DateTime(timezone=True), nullable=False),
     sa.Column("aggregation_type", sa.String(length=32), nullable=False),
     sa.Column("value", sa.Numeric(precision=10, scale=3), nullable=False),
     sa.Column("sample_count", sa.Integer(), nullable=False),
     sa.ForeignKeyConstraint(["data_source_id"], ["data_source.id"], ondelete="CASCADE"),
     sa.ForeignKeyConstraint(["series_type_definition_id"], ["series_type_definition.id"], ondelete="RESTRICT"),
     sa.PrimaryKeyConstraint("id"),
     sa.UniqueConstraint(
         "data_source_id",
         "series_type_definition_id",
         "bucket_start_at",
         "aggregation_type",
         name="uq_archive_source_type_start_agg",
     ),
 )
+op.create_index(
+    op.f("ix_data_point_series_archive_bucket_start_at"),
+    "data_point_series_archive",
+    ["bucket_start_at"],
+    unique=False,
+)
@@
 def downgrade() -> None:
     # ### commands auto generated by Alembic - please adjust! ###
     op.create_index(
         op.f("idx_data_point_series_source_type_time"),
         "data_point_series",
         ["data_source_id", "series_type_definition_id", "recorded_at"],
         unique=False,
     )
+    op.drop_index(op.f("ix_data_point_series_archive_bucket_start_at"), table_name="data_point_series_archive")
     op.drop_table("data_point_series_archive")
     op.drop_table("archival_settings")
```

Based on learnings, for new tables and indexes in backend/app/models, use SQLAlchemy's default `ix_` prefix.

Also applies to: 56-63
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/migrations/versions/2026_03_03_2227-cd18aa553afd_archive.py` around lines 30 - 50, Add a standalone time index on bucket_start_at for the archive table to avoid full scans: create an index on "data_point_series_archive"."bucket_start_at" using SQLAlchemy's default ix_ prefix (e.g. ix_data_point_series_archive_bucket_start_at) in the migration where op.create_table(...) defines data_point_series_archive; also apply the same pattern for the other archive-related table/section referenced around lines 56-63 so both migrations add standalone ix_... indexes for bucket_start_at instead of relying solely on the composite unique constraint.
🧹 Nitpick comments (1)
backend/migrations/versions/2026_03_03_2227-cd18aa553afd_archive.py (1)
1-6: Use the migration title convention in the Alembic header.

At Line 1, `"""archive` does not follow the project naming pattern (`Add <feature> <object>`), which makes migration history harder to scan.

As per coding guidelines, backend/migrations/versions/*.py: use Alembic migrations with naming convention `'Add <feature> <object>'`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/migrations/versions/2026_03_03_2227-cd18aa553afd_archive.py` around lines 1 - 6, The migration header docstring currently starts with """archive which doesn't follow our Alembic title convention; update the top-level docstring in the migration file with Revision ID cd18aa553afd to use the pattern "Add <feature> <object>" (e.g., "Add archive support for X" or "Add archive table"), keeping the existing Revision ID and Reviser lines intact so the migration metadata isn't changed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/repositories/archival_repository.py`:
- Around line 163-167: The deletion step is too broad and removes rows of
unknown series types when mixed with known types; modify the archival flow so
deletes only target rows that were actually archived by adding a series-type
constraint (e.g., DataPointSeries.series_type.in_(processed_types) or
equivalent) to the queries used to build source_ids and to the final delete
call, or else collect the exact primary keys archived and delete by those ids;
update the logic around the parts that skip unknown types (the branch that skips
unknown types at the block referencing unknown types) and the delete block (the
scope currently using source+date) to include the processed/known series types
or archived ids so unknown types are never deleted.
- Around line 165-166: The predicate uses cast(DataPointSeries.recorded_at,
Date) which prevents index use; replace those casts with a timestamp cutoff
comparison so the DB can use indexes. Specifically, change
.filter(cast(DataPointSeries.recorded_at, Date) < cutoff_date) to compare the
raw column against a datetime timestamp (e.g.
.filter(DataPointSeries.recorded_at < cutoff_timestamp)), where cutoff_timestamp
is a datetime (timezone-aware as needed) representing the cutoff instant (e.g.
midnight of cutoff_date or end/start boundary you intend). Apply the same
replacement for every occurrence of cast(DataPointSeries.recorded_at, Date) in
this module so all queries use DataPointSeries.recorded_at < cutoff_timestamp
instead.
In `@frontend/src/routes/_authenticated/settings/data-lifecycle-tab.tsx`:
- Around line 527-587: The early conditional return for invalid config
(archiveInvalid || deleteInvalid) must be moved after all hooks to avoid
breaking hook ordering; keep the validation logic but remove the early return at
the top and instead compute all hooks and derived values (archivalEffective,
growthClass useMemo, cfg, liveTotalBytes, spanDays, dailyRawEstimate useMemo,
chartData useMemo) first, then render the invalid-configuration JSX when
archiveInvalid || deleteInvalid (e.g., by returning that JSX after the hooks or
setting a variable like invalidView and returning it at the end) so hooks
(useMemo calls in growthClass, dailyRawEstimate, chartData) always run in the
same order across renders.
---
Duplicate comments:
In `@backend/migrations/versions/2026_03_03_2227-cd18aa553afd_archive.py`:
- Around line 30-50: Add a standalone time index on bucket_start_at for the
archive table to avoid full scans: create an index on
"data_point_series_archive"."bucket_start_at" using SQLAlchemy's default ix_
prefix (e.g. ix_data_point_series_archive_bucket_start_at) in the migration
where op.create_table(...) defines data_point_series_archive; also apply the
same pattern for the other archive-related table/section referenced around lines
56-63 so both migrations add standalone ix_... indexes for bucket_start_at
instead of relying solely on the composite unique constraint.
---
Nitpick comments:
In `@backend/migrations/versions/2026_03_03_2227-cd18aa553afd_archive.py`:
- Around line 1-6: The migration header docstring currently starts with
"""archive which doesn't follow our Alembic title convention; update the
top-level docstring in the migration file with Revision ID cd18aa553afd to use
the pattern "Add <feature> <object>" (e.g., "Add archive support for X" or "Add
archive table"), keeping the existing Revision ID and Reviser lines intact so
the migration metadata isn't changed.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- backend/app/database.py
- backend/app/integrations/celery/core.py
- backend/app/models/data_point_series_archive.py
- backend/app/repositories/archival_repository.py
- backend/migrations/versions/2026_03_03_2227-cd18aa553afd_archive.py
- frontend/src/routes/_authenticated/settings/data-lifecycle-tab.tsx
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/app/models/data_point_series_archive.py
```python
from app.schemas.series_types import (
    AGGREGATION_METHOD_BY_TYPE,
    AggregationMethod,
    SeriesType,
    get_series_type_from_id,
    get_series_type_id,
)
```
🛠️ Refactor suggestion | 🟠 Major
Repository layer is coupled to app.schemas; move aggregation policy mapping out of repository.
At Line 14-20, the repository depends on schema-layer enums/mappings. This breaks the repository boundary and makes DB code depend on API/schema concerns.
As per coding guidelines, backend/app/repositories/**/*.py: repository layer must handle only database operations using SQLAlchemy models as input/output (never Pydantic schemas).
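One way to honor that boundary, sketched under the assumption that the service layer owns the enum-to-method policy: resolve the schema mapping into plain ids and strings before calling the repository. All names below are illustrative, not the project's actual API:

```python
VALID_AGGREGATIONS = {"avg", "sum", "max"}


def build_aggregation_map(
    method_by_type: dict[str, str], type_id_by_type: dict[str, int]
) -> dict[int, str]:
    """Service-layer helper: flatten schema-level enums into plain data."""
    return {
        type_id_by_type[series_type]: method.lower()
        for series_type, method in method_by_type.items()
    }


class ArchivalRepository:
    """Receives plain ids and method names, so it never imports app.schemas."""

    def archive_data_before(
        self, db, cutoff_ts, aggregation_by_type_id: dict[int, str]
    ) -> int:
        archived = 0
        for type_id, method in aggregation_by_type_id.items():
            if method not in VALID_AGGREGATIONS:
                raise ValueError(f"unknown aggregation {method!r} for type {type_id}")
            # ... build the aggregate SELECT (func.avg / func.sum / func.max)
            # and the scoped INSERT/DELETE here, keyed only by type_id ...
        return archived
```

With this shape, the repository's inputs are SQLAlchemy models plus primitive data, and the aggregation policy stays testable in the service layer.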
```python
source_ids = (
    db.query(DataPointSeries.data_source_id)
    .filter(cast(DataPointSeries.recorded_at, Date) < cutoff_date)
    .distinct()
    .limit(10)
```
Unknown series types can still be deleted without archival in mixed batches.
At Line 229-232 unknown types are skipped, but at Line 253-259 delete is scoped only by source+date.
If a batch has both known and unknown types, unknown rows are deleted even though they were never archived.
Proposed safety fix
def archive_data_before(self, db: DbSession, cutoff_date: date) -> int:
@@
- while True:
+ known_series_type_ids = [get_series_type_id(series_type) for series_type in SeriesType]
+
+ while True:
@@
source_ids = (
db.query(DataPointSeries.data_source_id)
- .filter(cast(DataPointSeries.recorded_at, Date) < cutoff_date)
+ .filter(
+ cast(DataPointSeries.recorded_at, Date) < cutoff_date,
+ DataPointSeries.series_type_definition_id.in_(known_series_type_ids),
+ )
.distinct()
.limit(10)
.all()
)
@@
.filter(
DataPointSeries.data_source_id.in_(source_id_list),
cast(DataPointSeries.recorded_at, Date) < cutoff_date,
+ DataPointSeries.series_type_definition_id.in_(known_series_type_ids),
)
@@
deleted_count = (
db.query(DataPointSeries)
.filter(
DataPointSeries.data_source_id.in_(source_id_list),
cast(DataPointSeries.recorded_at, Date) < cutoff_date,
+ DataPointSeries.series_type_definition_id.in_(known_series_type_ids),
)
.delete(synchronize_session=False)
)Also applies to: 229-237, 253-259
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/repositories/archival_repository.py` around lines 163 - 167, The
deletion step is too broad and removes rows of unknown series types when mixed
with known types; modify the archival flow so deletes only target rows that were
actually archived by adding a series-type constraint (e.g.,
DataPointSeries.series_type.in_(processed_types) or equivalent) to the queries
used to build source_ids and to the final delete call, or else collect the exact
primary keys archived and delete by those ids; update the logic around the parts
that skip unknown types (the branch that skips unknown types at the block
referencing unknown types) and the delete block (the scope currently using
source+date) to include the processed/known series types or archived ids so
unknown types are never deleted.
```python
.filter(cast(DataPointSeries.recorded_at, Date) < cutoff_date)
.distinct()
```
Avoid cast(recorded_at, Date) in predicates; use a timestamp cutoff to keep filters index-friendly.
At Line 165/191/257/307, casting the column side prevents efficient index use on recorded_at and will hurt at scale.
Proposed predicate change
def archive_data_before(self, db: DbSession, cutoff_date: date) -> int:
@@
- total_deleted = 0
+ total_deleted = 0
+ cutoff_ts = datetime.combine(cutoff_date, datetime.min.time(), tzinfo=timezone.utc)
@@
- .filter(cast(DataPointSeries.recorded_at, Date) < cutoff_date)
+ .filter(DataPointSeries.recorded_at < cutoff_ts)
@@
DataPointSeries.data_source_id.in_(source_id_list),
- cast(DataPointSeries.recorded_at, Date) < cutoff_date,
+ DataPointSeries.recorded_at < cutoff_ts,
@@
DataPointSeries.data_source_id.in_(source_id_list),
- cast(DataPointSeries.recorded_at, Date) < cutoff_date,
+ DataPointSeries.recorded_at < cutoff_ts,
)
def delete_live_before(self, db: DbSession, cutoff_date: date) -> int:
@@
- total_deleted = 0
+ total_deleted = 0
+ cutoff_ts = datetime.combine(cutoff_date, datetime.min.time(), tzinfo=timezone.utc)
@@
- .filter(cast(DataPointSeries.recorded_at, Date) < cutoff_date)
+ .filter(DataPointSeries.recorded_at < cutoff_ts)Also applies to: 191-192, 257-258, 307-307
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/repositories/archival_repository.py` around lines 165 - 166, The
predicate uses cast(DataPointSeries.recorded_at, Date) which prevents index use;
replace those casts with a timestamp cutoff comparison so the DB can use
indexes. Specifically, change .filter(cast(DataPointSeries.recorded_at, Date) <
cutoff_date) to compare the raw column against a datetime timestamp (e.g.
.filter(DataPointSeries.recorded_at < cutoff_timestamp)), where cutoff_timestamp
is a datetime (timezone-aware as needed) representing the cutoff instant (e.g.
midnight of cutoff_date or end/start boundary you intend). Apply the same
replacement for every occurrence of cast(DataPointSeries.recorded_at, Date) in
this module so all queries use DataPointSeries.recorded_at < cutoff_timestamp
instead.
```tsx
if (archiveInvalid || deleteInvalid) {
  return (
    <div className="bg-zinc-900/50 border border-zinc-800 rounded-xl overflow-hidden min-h-[200px] flex items-center justify-center">
      <div className="text-center text-zinc-500">
        <AlertTriangle className="h-8 w-8 mx-auto mb-2 opacity-50" />
        <p>Invalid configuration</p>
        <p className="text-xs mt-1">
          Please enter valid duration in days ({archiveInvalid && 'Archival'}
          {archiveInvalid && deleteInvalid && ' & '}
          {deleteInvalid && 'Retention'})
        </p>
      </div>
    </div>
  );
}

const archivalEffective =
  archiveEnabled &&
  archiveDays > 0 &&
  (!deleteEnabled || deleteDays > archiveDays);

const growthClass: GrowthClass = useMemo(() => {
  if (deleteEnabled) return 'bounded';
  if (archiveEnabled) return 'linear_efficient';
  return 'linear';
}, [archiveEnabled, deleteEnabled]);

const cfg = GROWTH_CONFIG[growthClass];

const liveTotalBytes = storage.live_data_bytes + storage.live_index_bytes;
const spanDays = Math.max(storage.live_data_span_days, 1);

const dailyRawEstimate = useMemo(() => {
  if (storage.live_row_count === 0) return 0;
  return liveTotalBytes / spanDays;
}, [storage.live_row_count, liveTotalBytes, spanDays]);

const chartData = useMemo(
  () =>
    computeProjection(
      archiveEnabled,
      archiveDays,
      deleteEnabled,
      deleteDays,
      liveTotalBytes,
      storage.live_row_count,
      storage.archive_data_bytes + storage.archive_index_bytes,
      storage.live_data_span_days
    ),
  [
    archiveEnabled,
    archiveDays,
    deleteEnabled,
    deleteDays,
    liveTotalBytes,
    storage.live_row_count,
    storage.archive_data_bytes,
    storage.archive_index_bytes,
    storage.live_data_span_days,
  ]
);
```
Move validation check after all hooks to fix React hook ordering violation.
The early return on line 527 breaks React's hook ordering rules by conditionally skipping the useMemo calls on lines 548, 559, and 564. Move the validation check after all hooks are called to ensure consistent execution order across renders.
Proposed fix
- const archiveInvalid =
- archiveEnabled && (!Number.isFinite(archiveDays) || archiveDays <= 0);
- const deleteInvalid =
- deleteEnabled && (!Number.isFinite(deleteDays) || deleteDays <= 0);
-
- if (archiveInvalid || deleteInvalid) {
- return (
- <div className="bg-zinc-900/50 border border-zinc-800 rounded-xl overflow-hidden min-h-[200px] flex items-center justify-center">
- <div className="text-center text-zinc-500">
- <AlertTriangle className="h-8 w-8 mx-auto mb-2 opacity-50" />
- <p>Invalid configuration</p>
- <p className="text-xs mt-1">
- Please enter valid duration in days ({archiveInvalid && 'Archival'}
- {archiveInvalid && deleteInvalid && ' & '}
- {deleteInvalid && 'Retention'})
- </p>
- </div>
- </div>
- );
- }
-
const archivalEffective =
archiveEnabled &&
archiveDays > 0 &&
(!deleteEnabled || deleteDays > archiveDays);
@@
const chartData = useMemo(
@@
);
+
+ const archiveInvalid =
+ archiveEnabled && (!Number.isFinite(archiveDays) || archiveDays <= 0);
+ const deleteInvalid =
+ deleteEnabled && (!Number.isFinite(deleteDays) || deleteDays <= 0);
+
+ if (archiveInvalid || deleteInvalid) {
+ return (
+ <div className="bg-zinc-900/50 border border-zinc-800 rounded-xl overflow-hidden min-h-[200px] flex items-center justify-center">
+ <div className="text-center text-zinc-500">
+ <AlertTriangle className="h-8 w-8 mx-auto mb-2 opacity-50" />
+ <p>Invalid configuration</p>
+ <p className="text-xs mt-1">
+ Please enter valid duration in days ({archiveInvalid && 'Archival'}
+ {archiveInvalid && deleteInvalid && ' & '}
+ {deleteInvalid && 'Retention'})
+ </p>
+ </div>
+ </div>
+ );
+ }🧰 Tools
🪛 Biome (2.4.4)
[error] 548-548: This hook is being called conditionally, but all hooks must be called in the exact same order in every component render.
(lint/correctness/useHookAtTopLevel)
[error] 559-559: This hook is being called conditionally, but all hooks must be called in the exact same order in every component render.
(lint/correctness/useHookAtTopLevel)
[error] 564-564: This hook is being called conditionally, but all hooks must be called in the exact same order in every component render.
(lint/correctness/useHookAtTopLevel)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@frontend/src/routes/_authenticated/settings/data-lifecycle-tab.tsx` around
lines 527 - 587, The early conditional return for invalid config (archiveInvalid
|| deleteInvalid) must be moved after all hooks to avoid breaking hook ordering;
keep the validation logic but remove the early return at the top and instead
compute all hooks and derived values (archivalEffective, growthClass useMemo,
cfg, liveTotalBytes, spanDays, dailyRawEstimate useMemo, chartData useMemo)
first, then render the invalid-configuration JSX when archiveInvalid ||
deleteInvalid (e.g., by returning that JSX after the hooks or setting a variable
like invalidView and returning it at the end) so hooks (useMemo calls in
growthClass, dailyRawEstimate, chartData) always run in the same order across
renders.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/schemas/series_types.py`:
- Around line 343-437: AGGREGATION_METHOD_BY_TYPE is missing
SeriesType.body_fat_mass, SeriesType.skeletal_muscle_mass, and
SeriesType.hydration (causing silent AVG fallback) — add these series types to
AGGREGATION_METHOD_BY_TYPE with the correct AggregationMethod (e.g., SUM for
cumulative metrics like hydration) and remove any code paths that use a
.get(..., AggregationMethod.AVG) fallback; instead, ensure lookups use direct
mapping access (AGGREGATION_METHOD_BY_TYPE[series_type]) or explicitly raise a
clear error (KeyError/ValueError) so unmapped SeriesType values fail fast.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7ea8d091-20eb-4e7f-8a0d-09dd27befe47
📒 Files selected for processing (4)
- backend/app/integrations/celery/core.py
- backend/app/integrations/celery/tasks/__init__.py
- backend/app/schemas/series_types.py
- backend/tests/conftest.py
🚧 Files skipped from review as they are similar to previous changes (2)
- backend/tests/conftest.py
- backend/app/integrations/celery/core.py
```python
AGGREGATION_METHOD_BY_TYPE: dict[SeriesType, AggregationMethod] = {
    # ── Heart & Cardiovascular ──
    SeriesType.heart_rate: AggregationMethod.AVG,
    SeriesType.resting_heart_rate: AggregationMethod.AVG,
    SeriesType.heart_rate_variability_sdnn: AggregationMethod.AVG,
    SeriesType.heart_rate_recovery_one_minute: AggregationMethod.AVG,
    SeriesType.walking_heart_rate_average: AggregationMethod.AVG,
    SeriesType.recovery_score: AggregationMethod.AVG,
    SeriesType.heart_rate_variability_rmssd: AggregationMethod.AVG,
    # ── Blood & Respiratory ──
    SeriesType.oxygen_saturation: AggregationMethod.AVG,
    SeriesType.blood_glucose: AggregationMethod.AVG,
    SeriesType.blood_pressure_systolic: AggregationMethod.AVG,
    SeriesType.blood_pressure_diastolic: AggregationMethod.AVG,
    SeriesType.respiratory_rate: AggregationMethod.AVG,
    SeriesType.sleeping_breathing_disturbances: AggregationMethod.SUM,
    SeriesType.blood_alcohol_content: AggregationMethod.AVG,
    SeriesType.peripheral_perfusion_index: AggregationMethod.AVG,
    SeriesType.forced_vital_capacity: AggregationMethod.AVG,
    SeriesType.forced_expiratory_volume_1: AggregationMethod.AVG,
    SeriesType.peak_expiratory_flow_rate: AggregationMethod.AVG,
    # ── Body Composition ──
    SeriesType.height: AggregationMethod.AVG,
    SeriesType.weight: AggregationMethod.AVG,
    SeriesType.body_fat_percentage: AggregationMethod.AVG,
    SeriesType.body_mass_index: AggregationMethod.AVG,
    SeriesType.lean_body_mass: AggregationMethod.AVG,
    SeriesType.body_temperature: AggregationMethod.AVG,
    SeriesType.skin_temperature: AggregationMethod.AVG,
    SeriesType.waist_circumference: AggregationMethod.AVG,
    # ── Fitness Metrics ──
    SeriesType.vo2_max: AggregationMethod.AVG,
    SeriesType.six_minute_walk_test_distance: AggregationMethod.MAX,
    # ── Activity — Basic ──
    SeriesType.steps: AggregationMethod.SUM,
    SeriesType.energy: AggregationMethod.SUM,
    SeriesType.basal_energy: AggregationMethod.SUM,
    SeriesType.stand_time: AggregationMethod.SUM,
    SeriesType.exercise_time: AggregationMethod.SUM,
    SeriesType.physical_effort: AggregationMethod.AVG,
    SeriesType.flights_climbed: AggregationMethod.SUM,
    SeriesType.average_met: AggregationMethod.AVG,
    # ── Activity — Distance ──
    SeriesType.distance_walking_running: AggregationMethod.SUM,
    SeriesType.distance_cycling: AggregationMethod.SUM,
    SeriesType.distance_swimming: AggregationMethod.SUM,
    SeriesType.distance_downhill_snow_sports: AggregationMethod.SUM,
    SeriesType.distance_other: AggregationMethod.SUM,
    # ── Activity — Walking Metrics ──
    SeriesType.walking_step_length: AggregationMethod.AVG,
    SeriesType.walking_speed: AggregationMethod.AVG,
    SeriesType.walking_double_support_percentage: AggregationMethod.AVG,
    SeriesType.walking_asymmetry_percentage: AggregationMethod.AVG,
    SeriesType.walking_steadiness: AggregationMethod.AVG,
    SeriesType.stair_descent_speed: AggregationMethod.AVG,
    SeriesType.stair_ascent_speed: AggregationMethod.AVG,
    # ── Activity — Running Metrics ──
    SeriesType.running_power: AggregationMethod.AVG,
    SeriesType.running_speed: AggregationMethod.AVG,
    SeriesType.running_vertical_oscillation: AggregationMethod.AVG,
    SeriesType.running_ground_contact_time: AggregationMethod.AVG,
    SeriesType.running_stride_length: AggregationMethod.AVG,
    # ── Activity — Swimming Metrics ──
    SeriesType.swimming_stroke_count: AggregationMethod.SUM,
    SeriesType.underwater_depth: AggregationMethod.AVG,
    # ── Activity — Generic ──
    SeriesType.cadence: AggregationMethod.AVG,
    SeriesType.power: AggregationMethod.AVG,
    SeriesType.speed: AggregationMethod.AVG,
    SeriesType.workout_effort_score: AggregationMethod.AVG,
    SeriesType.estimated_workout_effort_score: AggregationMethod.AVG,
    # ── Environmental ──
    SeriesType.environmental_audio_exposure: AggregationMethod.AVG,
    SeriesType.headphone_audio_exposure: AggregationMethod.AVG,
    SeriesType.environmental_sound_reduction: AggregationMethod.AVG,
    SeriesType.time_in_daylight: AggregationMethod.SUM,
    SeriesType.water_temperature: AggregationMethod.AVG,
    SeriesType.uv_exposure: AggregationMethod.SUM,
    SeriesType.inhaler_usage: AggregationMethod.SUM,
    SeriesType.weather_temperature: AggregationMethod.AVG,
    SeriesType.weather_humidity: AggregationMethod.AVG,
    # ── Garmin-specific ──
    SeriesType.garmin_stress_level: AggregationMethod.AVG,
    SeriesType.garmin_skin_temperature: AggregationMethod.AVG,
    SeriesType.garmin_fitness_age: AggregationMethod.AVG,
    SeriesType.garmin_body_battery: AggregationMethod.AVG,
    # ── Other ──
    SeriesType.electrodermal_activity: AggregationMethod.AVG,
    SeriesType.push_count: AggregationMethod.SUM,
    SeriesType.atrial_fibrillation_burden: AggregationMethod.SUM,
    SeriesType.insulin_delivery: AggregationMethod.SUM,
    SeriesType.number_of_times_fallen: AggregationMethod.SUM,
    SeriesType.number_of_alcoholic_beverages: AggregationMethod.SUM,
    SeriesType.nike_fuel: AggregationMethod.SUM,
}
```
Fail fast on unmapped series types instead of silently defaulting to AVG.
AGGREGATION_METHOD_BY_TYPE currently omits SeriesType.body_fat_mass, SeriesType.skeletal_muscle_mass, and SeriesType.hydration. With the current AVG fallback, omissions are silent and can produce incorrect canonical archive values (especially for cumulative metrics like hydration).
💡 Proposed fix
AGGREGATION_METHOD_BY_TYPE: dict[SeriesType, AggregationMethod] = {
@@
SeriesType.body_temperature: AggregationMethod.AVG,
SeriesType.skin_temperature: AggregationMethod.AVG,
SeriesType.waist_circumference: AggregationMethod.AVG,
+ SeriesType.body_fat_mass: AggregationMethod.AVG,
+ SeriesType.skeletal_muscle_mass: AggregationMethod.AVG,
@@
SeriesType.number_of_alcoholic_beverages: AggregationMethod.SUM,
SeriesType.nike_fuel: AggregationMethod.SUM,
+ SeriesType.hydration: AggregationMethod.SUM,
}
+_UNMAPPED_SERIES_TYPES: set[SeriesType] = set(SeriesType) - set(AGGREGATION_METHOD_BY_TYPE)
+if _UNMAPPED_SERIES_TYPES:
+ raise RuntimeError(
+ "Missing aggregation mapping for series types: "
+ + ", ".join(sorted(series_type.value for series_type in _UNMAPPED_SERIES_TYPES))
+ )
+
def get_aggregation_method(series_type: SeriesType) -> AggregationMethod:
- """Get the aggregation method for a series type.
-
- Falls back to AVG for unknown types (safe default for rate-like metrics).
- """
- return AGGREGATION_METHOD_BY_TYPE.get(series_type, AggregationMethod.AVG)
+ """Get the aggregation method for a series type."""
+ return AGGREGATION_METHOD_BY_TYPE[series_type]Also applies to: 440-445
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/schemas/series_types.py` around lines 343 - 437,
AGGREGATION_METHOD_BY_TYPE is missing SeriesType.body_fat_mass,
SeriesType.skeletal_muscle_mass, and SeriesType.hydration (causing silent AVG
fallback) — add these series types to AGGREGATION_METHOD_BY_TYPE with the
correct AggregationMethod (e.g., SUM for cumulative metrics like hydration) and
remove any code paths that use a .get(..., AggregationMethod.AVG) fallback;
instead, ensure lookups use direct mapping access
(AGGREGATION_METHOD_BY_TYPE[series_type]) or explicitly raise a clear error
(KeyError/ValueError) so unmapped SeriesType values fail fast.
You can find the new settings in the "Data Lifecycle" tab. There, an admin can now set up automatic archival of data point series (data is re-aggregated into daily summaries instead of batch-size chunks and moved into a new table) and/or automatic retention (data is removed from both the live and archive tables). Both options let the admin configure the number of days after which the policy is triggered (once per day).
This tab also includes a projection of how storage size will grow given the currently existing data.
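The daily re-aggregation described above can be sketched end-to-end with plain SQL; everything below (table shapes, type names, the method map) is an illustrative reduction of the real schema, not the actual migration:

```python
import sqlite3

# Collapse raw data_point_series rows into one archive row per
# (source, type, day), using each series type's aggregation method.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE data_point_series (
        data_source_id INTEGER,
        series_type TEXT,
        recorded_at TEXT,
        value REAL
    );
    CREATE TABLE data_point_series_archive (
        data_source_id INTEGER,
        series_type TEXT,
        bucket_start_at TEXT,
        aggregation_type TEXT,
        value REAL,
        sample_count INTEGER
    );
    """
)

rows = [
    (1, "steps", "2024-01-01T08:00:00", 100),
    (1, "steps", "2024-01-01T12:00:00", 250),
    (1, "heart_rate", "2024-01-01T08:00:00", 60),
    (1, "heart_rate", "2024-01-01T12:00:00", 80),
]
conn.executemany("INSERT INTO data_point_series VALUES (?, ?, ?, ?)", rows)

AGG_SQL = {"sum": "SUM(value)", "avg": "AVG(value)"}
METHOD_BY_TYPE = {"steps": "sum", "heart_rate": "avg"}  # illustrative subset

for series_type, method in METHOD_BY_TYPE.items():
    conn.execute(
        f"""
        INSERT INTO data_point_series_archive
        SELECT data_source_id, series_type, DATE(recorded_at), ?, {AGG_SQL[method]}, COUNT(*)
        FROM data_point_series
        WHERE series_type = ?
        GROUP BY data_source_id, series_type, DATE(recorded_at)
        """,
        (method, series_type),
    )

archived = conn.execute(
    "SELECT series_type, aggregation_type, value, sample_count"
    " FROM data_point_series_archive ORDER BY series_type"
).fetchall()
```

Keeping `sample_count` alongside the aggregate preserves enough information to weight later roll-ups correctly.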
Closes #489
Summary by CodeRabbit