Skip to content

chore: operational debt sweep — sync, dead code, retry, docs#128

Merged
nickmeinhold merged 4 commits intomainfrom
chore/operational-debt-sweep
Feb 27, 2026
Merged

chore: operational debt sweep — sync, dead code, retry, docs#128
nickmeinhold merged 4 commits intomainfrom
chore/operational-debt-sweep

Conversation

@nickmeinhold
Copy link
Collaborator

Summary

10-item operational debt sweep addressing lifecycle wiring, cleanup scheduling, error resilience, and documentation accuracy across 20 files (+846/−425 lines).

CRDT robustness (Items 1–4):

  • Invalidate knowledgeGraphProvider after merge so remote data appears without restart
  • Purge tombstones + sync_log entries after successful pull (prevents unbounded growth)
  • Exponential backoff on periodic sync errors (capped at 6 ticks ≈ 30 min)
  • Injectable clock in DriftGraphRepository via clockProvider (fixes DateTime.now() leak)

Dead code + tests (Items 7–8):

  • FirestoreGraphRepositoryFirestoreGraphLoader (keeps only load(), removes 100+ lines of dead save/watch/updateQuizItem)
  • 5 new test cases for seedFromFirestoreIfNeeded migration path

UX guards + docs (Items 5–6, 9):

  • Narrate button now checks both Anthropic + ElevenLabs API keys before starting
  • 3 inline // ignore: → 1 file-level ignore_for_file with documented rationale
  • SM-2 → FSRS, sqlite_crdt → custom CRDT references fixed in 3 doc files

Resilience (Item 10):

  • New retryWithBackoff() utility with exponential delay + jitter
  • Wraps all 5 Claude API + ElevenLabs call sites (extraction ×2, narration, recommendation, TTS)
  • Only retries transient errors (429, 529, SocketException)

Test plan

  • flutter analyze — 0 issues
  • flutter test — 921 tests, 0 failures
  • New tests: CRDT invalidation, tombstone purge, backoff, clock injection, seed migration (5 branches), retry utility (8 cases), FirestoreGraphLoader
  • Smoke test: ingest → quiz → check sync state on macOS

🤖 Generated with Claude Code

nickmeinhold and others added 2 commits February 27, 2026 21:16
10-item sweep addressing lifecycle wiring, cleanup scheduling, error
resilience, and documentation accuracy:

CRDT robustness:
- Invalidate knowledgeGraphProvider after merge so remote data is visible
- Purge tombstones + sync_log after successful pull (bounded growth)
- Exponential backoff on periodic sync errors (cap at 6 ticks = 30 min)
- Injectable clock in DriftGraphRepository (testable sync metadata)

Dead code + tests:
- FirestoreGraphRepository → FirestoreGraphLoader (load-only, no
  dead save/watch/updateQuizItem methods)
- seedFromFirestoreIfNeeded: 5 test cases covering all migration branches

UX guards + docs:
- Narrate button checks Anthropic + ElevenLabs keys before starting
- Consolidate 3 inline // ignore: into 1 file-level ignore_for_file
- Fix SM-2 → FSRS, sqlite_crdt → custom CRDT in 3 doc files

Resilience:
- retryWithBackoff() utility with jitter for 429/529/SocketException
- Applied to all Claude API + ElevenLabs call sites (5 wraps)

921 tests pass, 0 analyzer issues.

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Copy link
Collaborator

@MaxwellMergeSlam MaxwellMergeSlam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MaxwellMergeSlam's Review 🤼

Verdict: APPROVE

Summary: A clean operational debt sweep that tightens the CRDT sync lifecycle, kills 100+ lines of dead Firestore code, and wraps all API calls in battle-tested retry logic — all backed by 434 new lines of tests.

Tyler Durden: "It's only after we've lost everything that we're free to do anything." And that's exactly what happened to FirestoreGraphRepository.save() — 100 lines of dead write code purged into the void. Beautiful.


The Good:

  • Dead code removal is chef's kiss (firestore_graph_repository.dart): Renaming FirestoreGraphRepositoryFirestoreGraphLoader and deleting save(), updateQuizItem(), saveSplitData(), watch(), and _commitBatched() removes 100+ lines that were unreachable since DualWrite removal in Phase 5. The show FirestoreGraphLoader import guard in graph_store_provider.dart:4 is a nice touch — prevents anyone from accidentally using the old class name.

  • Retry utility is well-designed (retry.dart): The _isTransient() guard correctly distinguishes 429/529 from 400/401/403. Exponential backoff with jitter (baseDelay * 2^attempt + random(0..50%)) is textbook thundering-herd prevention. maxRetries: 2 is conservative and appropriate — you don't want to hammer a rate-limited API.

  • CRDT sync robustness (crdt_sync_provider.dart): Three solid improvements stacked:

    1. ref.invalidate(knowledgeGraphProvider) after merge (line 151) — without this, users wouldn't see remote data until app restart. Critical fix.
    2. Tombstone + sync_log purge (lines 155-161) wrapped in try/catch — prevents unbounded growth without risking sync failure on cleanup error.
    3. Exponential backoff via _skipTicks (line 178) — min(1 << _consecutiveErrors, 6) caps at 6 ticks (30 min at 5-min intervals), which is reasonable.
  • Injectable clock (drift_graph_repository.dart:43): _clock = clock ?? _defaultClock replaces DateTime.now().toUtc() for testability. Follows the existing clockProvider pattern used elsewhere.

  • Narration API key guard (dashboard_screen.dart:725-738): Checks both Anthropic and ElevenLabs keys before starting narration, with a clear SnackBar message listing exactly which keys are missing. Good UX.

  • Test coverage is excellent: 434 new test lines covering retry (8 cases including transient/non-transient), CRDT sync (invalidation, purge, backoff, clock injection), FirestoreGraphLoader (3 cases), and seedFromFirestore (5 branch paths). The _ThrowingDriftRepo mock for testing backoff is clever.

  • Doc updates are accurate — SM-2 → FSRS references fixed, sqlite_crdt → custom CRDT, migration steps updated with completion checkmarks.


Findings / Minor Observations:

  1. retry_test.dart:4 — Importing package internals (package:anthropic_sdk_dart/src/generated/client.dart show HttpMethod): This reaches into the package's generated internals, which could break on any version bump. Not a blocker since it's test-only code and there's no public re-export of HttpMethod, but worth noting.

    John McClane: "Welcome to the party, pal!" — said every future dev who has to figure out why tests broke after a pub upgrade.

  2. _isTransient ElevenLabs check (retry.dart:56): Matching e.message.contains('429') is brittle — if the error message changes to "Rate limit (HTTP 429)" or whatever, it still works, but a message containing "4290" or "5291" would false-positive. In practice this is fine because the ElevenLabsException message format is controlled by this codebase ('TTS request failed: ${response.statusCode} ${response.body}'), so the status code will always be a standalone token. Low risk.

  3. crdt_sync_provider.dart:155 — Double highestPullHlc.isNotEmpty check: The outer if (pulled.isNotEmpty) at line 134 already guarantees we entered the merge block, and line 146 already checks highestPullHlc.isNotEmpty before updating the bookmark. The second check at line 155 is technically redundant (if we entered the merge block and updated the bookmark, highestPullHlc must be non-empty). Harmless defensive coding.

  4. ASCII art alignment in CRDT_SYNC_ARCHITECTURE.md — The replacement text custom CRDT layer (HLC+LWW) is longer than the original sqlite_crdt / custom, which pushes the box border out of alignment. Purely cosmetic for a markdown doc.


Verdict: APPROVE

Rocky Balboa: "It ain't about how hard you hit. It's about how hard you can get hit and keep moving forward." This PR makes the sync layer resilient to exactly that — transient errors, unbounded tombstone growth, and missing API keys. Ship it.

No blocking issues. The retry utility, dead code removal, and sync lifecycle improvements are clean, well-tested, and correctly scoped. The minor observations above are informational only.

Copy link
Collaborator

@KelvinBitBrawler KelvinBitBrawler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KelvinBitBrawler's Review

Verdict: APPROVE

Summary: You've finally decided to stop carrying the dead weight of that legacy Firestore write logic—about time you lowered the system's entropy instead of contributing to its heat death.

Findings:

  • lib/src/providers/crdt_sync_provider.dart:150
    • ref.invalidate(knowledgeGraphProvider);
    • Comment: Finally. The UI was stale as a corpse in deep freeze before this. Forcing a refresh after merge ensures the user actually sees what they synced.
  • lib/src/providers/crdt_sync_provider.dart:157
    • await driftRepo.purgeTombstones(before: highestPullHlc);
    • Comment: You're purging tombstones based on the server's HLC. In a star topology, this is efficient. If you ever try to move to Peer-to-Peer sync, this device will have amnesia about what it deleted, and those ghosts will come back to haunt you.
    • Quote: MacReady (The Thing): "I know I'm human. And if you were all these things, then you'd just attack me right now." — Make sure your server is the "human" holding the truth, because your clients are forgetting it.
  • lib/src/storage/firestore_graph_repository.dart -> FirestoreGraphLoader
    • Comment: Renaming to Loader and gutting the write methods? Chef's kiss. That code was vestigial trash. Watching 100+ lines of dead logic vanish drops the temperature of my rage by at least 2 degrees.
  • lib/src/services/retry.dart:44
    • if (e is SocketException || e is TimeoutException) return true;
    • Comment: Good discrimination on the transient errors. No point retrying a 400 Bad Request; that's just you being stupid, not the network being busy.

The Good:

  • Thermodynamic Efficiency: Deleting FirestoreGraphRepository's write path (-425 lines). You're reducing the surface area for bugs to breed.
  • Resilience: The exponential backoff with jitter in retry.dart. Without jitter, a retry storm is just a DDoS attack on yourself.
  • Test Coverage: You actually wrote tests for the retry logic and the backoff. I expected you to be lazy. You disappointed me in the best way.

The Concerns:

  • lib/src/providers/narration_provider.dart:1
    • // ignore_for_file: experimental_member_use
    • Comment: You're suppressing the warning for StreamAudioSource because just_audio can't make up its mind on an API? Fine. But if that experimental API breaks in a future update and crashes the app, I'm coming to your house to freeze your pipes.
  • Star Topology Dependency: As mentioned, purgeTombstones locks you into a server-authoritative model. Don't forget that when you're dreaming of "local-first" utopias.

Final Word:
This PR is cold, hard, and efficient. Just how I like it. Get it merged before I change my mind and pile drive you through the diff.

Ash (Alien): "I can't lie to you about your chances, but... you have my sympathies." (Just kidding, the tests passed. You're fine.)

- Fix ASCII art alignment in CRDT_SYNC_ARCHITECTURE.md (box borders)
- Add star-topology assumption comment on purgeTombstones
- Document internal import rationale in retry_test.dart

Co-Authored-By: Claude <noreply@anthropic.com>
@nickmeinhold nickmeinhold enabled auto-merge (squash) February 27, 2026 17:57
New ADR: `docs/P2P_SYNC_ARCHITECTURE.md` — optional peer-to-peer sync
transport alongside Firestore. Covers tombstone lifecycle changes,
LAN/Bluetooth peer discovery, self-hosted relay for on-prem, and
security considerations. Decision pending data sovereignty review.

Co-Authored-By: Claude <noreply@anthropic.com>
Copy link
Collaborator

@MaxwellMergeSlam MaxwellMergeSlam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MaxwellMergeSlam — Re-approval (docs-only delta)

Verdict: APPROVE

Previous cage-match review covered all code changes. New commits since last review:

  1. chore: address review suggestions — added comment explaining internal HttpMethod import, fixed ASCII art alignment in CRDT doc, added star-topology assumption comment on purgeTombstones. All three are documentation/comment changes addressing the original cage-match feedback.

  2. docs: add Phase 7 P2P sync architecture — new ADR for optional peer-to-peer sync, pending data sovereignty review. 221 lines of architecture documentation. No code changes.

John McClane: "Now I have a machine gun. Ho ho ho." — just kidding, it's a markdown file. Ship it.

Copy link
Collaborator

@KelvinBitBrawler KelvinBitBrawler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KelvinBitBrawler — Re-approval (docs-only delta)

Verdict: APPROVE

Two new commits since my last review — both documentation. Zero code changes.

  1. Review suggestion fixes: ASCII art alignment, star-topology comment, test import annotation. These address points from our cage match. Satisfactory.

  2. New P2P_SYNC_ARCHITECTURE.md — Phase 7 proposal covering the tombstone lifecycle implications I flagged. Time-based retention (90 days) as default, per-peer version vectors as opt-in. The "Key Questions for Data Sovereignty Review" section is well-scoped.

HAL 9000: "I'm sorry, Dave. I'm afraid I can't find any code changes to complain about."

Approved. The entropy of this PR has reached its minimum. Merge it before it gains any more commits.

@nickmeinhold nickmeinhold merged commit ce07269 into main Feb 27, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants