Async opportunistic repair queue for unresolved historical range misses #165

@unav4ila8le

Description

Context

We recently disabled blocking live quote repair for range/history requests to restore dashboard and asset-page latency.
That solved UX latency, but it also reduced opportunities to repair historical cache gaps during range reads.

Problem

For range paths (e.g. net worth history), unresolved exact-date misses now rely on fallback behavior without attempting synchronous provider repair.
Result: user-facing performance is good, but historical quote gaps may persist longer unless cron naturally backfills them.

Goal

Add a lightweight async repair mechanism that:

  • preserves fast range reads
  • gradually repairs unresolved historical quote gaps in the background
  • avoids introducing blocking latency on user requests

Proposed Solution

Implement an opportunistic quote repair queue:

  1. During range fetches, when a requested date has no exact-date quote (it is either unresolved or resolved only via fallback), enqueue a repair item for that symbol and date.
  2. Use a small queue table with dedup and attempt tracking.
  3. Extend cron (or add a small cron step) to drain the queue in bounded batches and upsert repaired quotes.
  4. Mark queue items as done or failed (matching the status values in the draft schema), and apply a retry/backoff policy to items that are neither.
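
A minimal sketch of step 1, to make the enqueue condition concrete. The `RangeQuoteResult` shape and the `collectRepairCandidates` helper are hypothetical and not part of the existing codebase; the only assumption is that the range read can report, per symbol and date, which quote date was actually used. The function is pure and does no I/O, so the request path can hand its output to a fire-and-forget enqueue without waiting on a provider.

```typescript
// Hypothetical shape of one per-date result produced by a range read.
interface RangeQuoteResult {
  symbolId: string;
  targetDate: string; // ISO date, e.g. "2024-03-01"
  resolvedDate: string | null; // date of the quote actually used, null if none
}

interface RepairCandidate {
  symbolId: string;
  targetDate: string;
}

// Returns one candidate per (symbolId, targetDate) that lacks an exact-date
// quote: either nothing resolved, or a fallback date was used instead.
function collectRepairCandidates(
  results: RangeQuoteResult[],
): RepairCandidate[] {
  const seen = new Set<string>();
  const candidates: RepairCandidate[] = [];
  for (const r of results) {
    if (r.resolvedDate === r.targetDate) continue; // exact hit, nothing to repair
    const key = `${r.symbolId}|${r.targetDate}`;
    if (seen.has(key)) continue; // already collected within this request
    seen.add(key);
    candidates.push({ symbolId: r.symbolId, targetDate: r.targetDate });
  }
  return candidates;
}
```

Deduplicating within the request only reduces write volume; the unique constraint on the queue table remains the real guard against duplicates across requests.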

Suggested Schema (draft)

quote_repair_queue:

  • id (uuid pk)
  • symbol_id (uuid, indexed)
  • target_date (date, indexed)
  • status (pending|processing|done|failed)
  • attempt_count (int, default 0)
  • last_attempt_at (timestamptz null)
  • next_attempt_at (timestamptz null)
  • last_error (text null)
  • created_at, updated_at
  • unique: (symbol_id, target_date) to dedupe
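
For illustration, the draft schema could map to a row type like the one below. The camelCase field names, the `InMemoryRepairQueue` class, and the counter-based id are stand-ins invented for this sketch; a real implementation would use the database's uuid generation and rely on the unique index rather than an in-memory map. The class only demonstrates the intended dedup semantics: a second enqueue for the same (symbol_id, target_date) is a no-op.

```typescript
type RepairStatus = "pending" | "processing" | "done" | "failed";

// Mirrors the draft quote_repair_queue columns (names converted to camelCase).
interface QuoteRepairQueueRow {
  id: string;
  symbolId: string;
  targetDate: string; // ISO date
  status: RepairStatus;
  attemptCount: number;
  lastAttemptAt: Date | null;
  nextAttemptAt: Date | null;
  lastError: string | null;
  createdAt: Date;
  updatedAt: Date;
}

// In-memory stand-in for the table, keyed by the unique (symbolId, targetDate) pair.
class InMemoryRepairQueue {
  private rows = new Map<string, QuoteRepairQueueRow>();
  private nextId = 1; // placeholder for a uuid primary key

  // Returns true if a row was inserted, false if the pair was already queued.
  enqueue(symbolId: string, targetDate: string, now: Date): boolean {
    const key = `${symbolId}|${targetDate}`;
    if (this.rows.has(key)) return false;
    this.rows.set(key, {
      id: String(this.nextId++),
      symbolId,
      targetDate,
      status: "pending",
      attemptCount: 0,
      lastAttemptAt: null,
      nextAttemptAt: null,
      lastError: null,
      createdAt: now,
      updatedAt: now,
    });
    return true;
  }

  size(): number {
    return this.rows.size;
  }
}
```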

Processing Rules (draft)

  • Drain in small batches (e.g. 50–200 items per run).
  • Respect max attempts and exponential backoff.
  • Treat a quote as repaired only when a row for the exact target date has been inserted.
  • Keep non-fatal failures in the queue and schedule the retry via next_attempt_at.
  • Symbols the provider permanently does not support can be marked failed once they pass an attempt threshold.
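
The rules above can be sketched as a small state machine. The constants (`MAX_ATTEMPTS`, `BASE_DELAY_MS`, `BATCH_SIZE`) and the function names are assumptions chosen for illustration, not values from this issue; the doubling schedule is one common form of exponential backoff.

```typescript
interface QueueItem {
  status: "pending" | "processing" | "done" | "failed";
  attemptCount: number;
  nextAttemptAt: Date | null;
  lastError: string | null;
}

const MAX_ATTEMPTS = 5; // assumed threshold before marking an item failed
const BASE_DELAY_MS = 60 * 60 * 1000; // assumed base delay of one hour
const BATCH_SIZE = 100; // assumed, within the suggested 50-200 range

// Delay after the Nth attempt: base, 2x base, 4x base, ...
function backoffDelayMs(attemptCount: number): number {
  return BASE_DELAY_MS * Math.pow(2, attemptCount - 1);
}

// Picks at most `limit` pending items whose next attempt is due.
function selectBatch<T extends QueueItem>(
  items: T[],
  now: Date,
  limit: number = BATCH_SIZE,
): T[] {
  return items
    .filter(
      (i) =>
        i.status === "pending" &&
        (i.nextAttemptAt === null || i.nextAttemptAt.getTime() <= now.getTime()),
    )
    .slice(0, limit);
}

// Computes the item's next state after one repair attempt.
// `exactRowInserted` must be true only if the exact target-date row was written.
function applyAttempt(
  item: QueueItem,
  exactRowInserted: boolean,
  error: string | null,
  now: Date,
): QueueItem {
  const attemptCount = item.attemptCount + 1;
  if (exactRowInserted) {
    return { status: "done", attemptCount, nextAttemptAt: null, lastError: null };
  }
  if (attemptCount >= MAX_ATTEMPTS) {
    return { status: "failed", attemptCount, nextAttemptAt: null, lastError: error };
  }
  const nextAttemptAt = new Date(now.getTime() + backoffDelayMs(attemptCount));
  return { status: "pending", attemptCount, nextAttemptAt, lastError: error };
}
```

Returning a new state object instead of mutating the item keeps the transition easy to test and lets the cron step persist it in a single update per item.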

Acceptance Criteria

  1. Range/read paths remain non-blocking (no live provider fetch in request path).
  2. Unresolved historical misses are queued and eventually retried asynchronously.
  3. Queue processing is bounded and safe under retries/restarts.
  4. Duplicate queue entries are prevented.
  5. Observability exists for queue depth, processed count, success/failure count.
  6. No regression in existing quote cron behavior.
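
As a rough sketch of criterion 5, a drain run could emit a summary like the one below. The `Outcome` values, the `RunMetrics` fields, and `summarizeRun` are hypothetical names; how the numbers are actually reported (logs, metrics backend) is left open here.

```typescript
// Per-item result of one drain run: repaired, kept for retry, or given up.
type Outcome = "done" | "retry" | "failed";

interface RunMetrics {
  queueDepth: number; // items still pending after the run
  processed: number;
  succeeded: number;
  failed: number;
  retried: number;
}

// pendingBefore: number of pending items when the run started.
function summarizeRun(pendingBefore: number, outcomes: Outcome[]): RunMetrics {
  const succeeded = outcomes.filter((o) => o === "done").length;
  const failed = outcomes.filter((o) => o === "failed").length;
  const retried = outcomes.filter((o) => o === "retry").length;
  return {
    // Retried items remain pending, so only done/failed items leave the queue.
    queueDepth: pendingBefore - succeeded - failed,
    processed: outcomes.length,
    succeeded,
    failed,
    retried,
  };
}
```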

Out of Scope

  • UI changes
  • Full worker infrastructure
  • Multi-provider quote source redesign

Notes

This is a follow-up to the range-latency fix where liveFetchOnMiss is disabled for range quote fetches in symbolHandler.fetchForPositionsRange.

Metadata

Labels

enhancement (New feature or request)