Decision: Migrate to local-first with Drift/SQLite as primary storage and Firestore as a sync peer. The personal knowledge atlas should live on the device first and sync to the cloud for backup and collaboration.
Engram has two souls:
- Personal knowledge atlas — FSRS scheduling, spaced repetition, knowledge graph, sub-concept mastery, cross-discipline semantic relationships. Inherently personal and offline-friendly.
- Cooperative team game — guardians, glory board, challenges, nudges, repair missions. Inherently networked.
The current architecture (Firestore-primary, local JSON fallback) optimizes for soul #2 at the expense of soul #1. Local-first inverts this: the device is the primary read/write path, the server handles sync, compute, and social coordination.
Local-first does not mean server-less. It means:
- Local storage is the primary read/write path — no spinners, no network in the hot path
- The app works fully offline for personal features (quiz review, sub-concept splitting, knowledge graph)
- A server exists for sync, backup, Claude API calls, and social feature coordination
- Changes sync via CRDTs for conflict-free multi-device and multi-user merging (see `CRDT_SYNC_ARCHITECTURE.md`)
Current flow (Firestore-primary):

```
User action → Riverpod provider → Firestore write (network) → Local state update
                                        ↓
                                  Other devices (real-time)
```
- Quiz review requires network write (~200ms)
- Offline mode is degraded (JSON fallback, no social features)
- Every quiz review costs a Firestore write
Local-first flow (Drift-primary):

```
User action → Riverpod provider → Drift/SQLite write (<1ms) → UI update
                                        ↓ (background)
                                  CRDT sync → Server → Other devices
```
- Quiz review is instant (local write)
- Full offline experience for personal learning
- Social features gracefully degrade offline, fully functional online
- Server handles Claude API, friend discovery, challenge routing
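The local-first write path can be sketched outside the app. The example below is a minimal illustration in Python with the standard-library `sqlite3` module rather than Dart/Drift, so that it runs on its own. The `pending_ops` outbox table and the `record_review` and `flush_pending` helpers are hypothetical names introduced for this sketch, not part of the Engram codebase.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open a database with a trimmed-down quiz table and a sync outbox."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS quiz_items (
            id TEXT PRIMARY KEY,
            review_count INTEGER NOT NULL DEFAULT 0,
            last_review TEXT
        );
        -- Hypothetical outbox: one row per local change awaiting sync.
        CREATE TABLE IF NOT EXISTS pending_ops (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            item_id TEXT NOT NULL,
            op TEXT NOT NULL
        );
    """)
    return db


def record_review(db, item_id, reviewed_at):
    """Apply a review locally; no network call sits in this path."""
    with db:  # the data change and the outbox entry commit together
        db.execute("INSERT OR IGNORE INTO quiz_items (id) VALUES (?)", (item_id,))
        db.execute(
            "UPDATE quiz_items SET review_count = review_count + 1, "
            "last_review = ? WHERE id = ?",
            (reviewed_at, item_id),
        )
        db.execute(
            "INSERT INTO pending_ops (item_id, op) VALUES (?, 'review')",
            (item_id,),
        )


def flush_pending(db, send):
    """Push queued ops in order through `send`; stop at the first failure."""
    rows = db.execute(
        "SELECT seq, item_id, op FROM pending_ops ORDER BY seq"
    ).fetchall()
    sent = 0
    for seq, item_id, op in rows:
        if not send(item_id, op):
            break  # offline or server error: keep the op and retry later
        with db:
            db.execute("DELETE FROM pending_ops WHERE seq = ?", (seq,))
        sent += 1
    return sent
```

Because the review and its outbox row share one transaction, a crash cannot leave a local change that is never offered to the server. A background task would call `flush_pending` whenever connectivity returns.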
The server is not diminished — its role is clarified:
| Server Role | What It Does | Why It Needs a Server |
|---|---|---|
| Compute node | Claude API for concept extraction + embeddings | API keys, rate limits, heavy compute |
| Sync peer | Receives CRDT operations, merges, fans out | Durable storage, always-on availability |
| Social hub | Friend discovery, challenge/nudge routing | Needs central index to match wiki URLs |
| Backup | Durable storage of merged CRDT state | Device loss recovery |
| Aggregation | Glory board rankings, team health (optional) | Can also be computed client-side |
- Instant quiz reviews — no network latency in the learning loop
- Full offline capability — review on planes, in tunnels, during outages
- Sub-concept splitting is frictionless — restructure your graph freely, sync later
- Embeddings work offline — Claude computes them at extraction time, stored locally forever
- Privacy by default — user data stays on device unless team features are enabled
- Lower costs — no Firestore read/write charges for personal operations
- Resilience — Firestore outage doesn't break the core experience
- No vendor lock-in — local SQLite is portable; sync backend can be swapped
- Social features work exactly like now when online (server mediates)
- Guardian points, goal contributions, glory sync via CRDTs (naturally additive)
- Offline operations queue and sync when connectivity returns
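One common way to make additive values such as guardian points merge cleanly is a grow-only counter (G-Counter). The document does not say which CRDT Engram uses for these values, so the sketch below is an assumption about how "naturally additive" could be realized, and the `GCounter` class is a hypothetical name.

```python
class GCounter:
    """Grow-only counter: one slot per device, merged by per-slot maximum."""

    def __init__(self, counts=None):
        self.counts = dict(counts or {})

    def increment(self, device_id, amount=1):
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[device_id] = self.counts.get(device_id, 0) + amount

    def merge(self, other):
        """Return the merged counter; neither input is modified."""
        merged = dict(self.counts)
        for device_id, count in other.counts.items():
            merged[device_id] = max(merged.get(device_id, 0), count)
        return GCounter(merged)

    @property
    def value(self):
        return sum(self.counts.values())
```

Each device only ever writes its own slot, so two devices that earn points while offline never overwrite each other. Merging is commutative and idempotent, which means the server can fan out states in any order and replays are harmless.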
- Implementing Drift/SQLite tables to mirror the existing Firestore schema
- Adding HLC timestamps for CRDT ordering
- Building the sync layer (custom CRDT on Drift — done, see Phase 4-5)
- Dual-running period where both storage paths coexist (completed and removed)
- Testing sync edge cases (offline for weeks, large deltas)
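For readers unfamiliar with Hybrid Logical Clocks, the sketch below shows the standard update rules in Python. The three-part layout and the zero-padded string encoding are assumptions made for illustration; the actual format stored in the `hlc` columns is defined in the sync implementation, not here.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class HLC:
    wall_ms: int   # highest physical time seen so far, in milliseconds
    counter: int   # tie-breaker for events within the same millisecond
    node_id: str   # final tie-breaker between devices

    def pack(self):
        # Zero-padded so that plain string comparison matches tuple order
        # (holds while counter stays below 100000).
        return f"{self.wall_ms:015d}-{self.counter:05d}-{self.node_id}"

    @staticmethod
    def unpack(text):
        wall, counter, node = text.split("-", 2)
        return HLC(int(wall), int(counter), node)


def tick(local, now_ms):
    """Timestamp a local event; never moves backwards if the clock does."""
    if now_ms > local.wall_ms:
        return HLC(now_ms, 0, local.node_id)
    return HLC(local.wall_ms, local.counter + 1, local.node_id)


def receive(local, remote, now_ms):
    """Advance the local clock past a timestamp received from a peer."""
    wall = max(local.wall_ms, remote.wall_ms, now_ms)
    if wall == local.wall_ms and wall == remote.wall_ms:
        counter = max(local.counter, remote.counter) + 1
    elif wall == local.wall_ms:
        counter = local.counter + 1
    elif wall == remote.wall_ms:
        counter = remote.counter + 1
    else:
        counter = 0
    return HLC(wall, counter, local.node_id)
```

The "offline for weeks" edge case is where this matters: a device whose wall clock is wrong or has drifted still produces timestamps that increase monotonically and sort after everything it has already seen.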
What the migration does not require:
- Rewriting providers — Riverpod still manages UI state; reads come from Drift instead of Firestore
- Losing social features — server stays; social features route through it
- Changing the data model — concepts, relationships, quiz items stay the same
- Claude API changes — extraction still hits the server
- Sync conflicts — mitigated by CRDT design (see `CRDT_SYNC_ARCHITECTURE.md`)
- Data loss on device — mitigated by server backup
- Complexity — more moving parts than Firestore-only, but each part is simpler
- Already normalized — Firestore already stores concepts, relationships, quiz items in separate subcollections. Drift tables mirror this exactly.
- Reactive queries — `watch()` returns streams, giving fine-grained UI rebuilds for free
- Graph traversal — recursive CTEs for shortest path, dependency chains
- FTS5 — full-text search across concept descriptions
- Battle-tested — cross-platform, actively maintained, type-safe
- CRDT-compatible — custom HLC + LWW merge layer built on top of Drift tables (Phases 1-5 complete)
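The graph-traversal point can be made concrete with a recursive CTE over a `relationships` table that has `from_concept_id` and `to_concept_id` columns. This is a sketch run through Python's `sqlite3` for convenience; in the app the same SQL would go through Drift. The `shortest_path` helper, the `max_depth` bound, and the `/`-delimited path encoding are assumptions for illustration, and the encoding presumes concept IDs contain no `/`.

```python
import sqlite3

SHORTEST_PATH_SQL = """
WITH RECURSIVE walk(concept_id, depth, path) AS (
    SELECT :start, 0, '/' || :start || '/'
    UNION ALL
    SELECT r.to_concept_id, w.depth + 1, w.path || r.to_concept_id || '/'
    FROM walk AS w
    JOIN relationships AS r ON r.from_concept_id = w.concept_id
    WHERE w.depth < :max_depth
      -- Skip concepts already on this path so cycles terminate.
      AND instr(w.path, '/' || r.to_concept_id || '/') = 0
)
SELECT path, depth
FROM walk
WHERE concept_id = :goal
ORDER BY depth
LIMIT 1
"""


def shortest_path(db, start, goal, max_depth=8):
    """Return concept IDs from start to goal with the fewest hops, or None."""
    row = db.execute(
        SHORTEST_PATH_SQL,
        {"start": start, "goal": goal, "max_depth": max_depth},
    ).fetchone()
    if row is None:
        return None
    return row[0].strip("/").split("/")
```

The query enumerates every simple path up to `max_depth`, which is acceptable for a personal graph but can grow quickly on dense graphs. A dependency-chain query follows the same shape with the `WHERE concept_id = :goal` filter removed.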
```sql
CREATE TABLE concepts (
  id TEXT PRIMARY KEY,
  name TEXT NOT NULL,
  description TEXT NOT NULL,
  source_document_id TEXT NOT NULL,
  hlc TEXT NOT NULL -- Hybrid Logical Clock for CRDT
);

CREATE TABLE relationships (
  id TEXT PRIMARY KEY,
  from_concept_id TEXT REFERENCES concepts(id),
  to_concept_id TEXT REFERENCES concepts(id),
  label TEXT NOT NULL,
  hlc TEXT NOT NULL
);

CREATE TABLE quiz_items (
  id TEXT PRIMARY KEY,
  concept_id TEXT REFERENCES concepts(id),
  question TEXT NOT NULL,
  answer TEXT NOT NULL,
  difficulty REAL NOT NULL DEFAULT 0.0,
  stability REAL NOT NULL DEFAULT 0.0,
  fsrs_state INTEGER NOT NULL DEFAULT 0,
  lapses INTEGER NOT NULL DEFAULT 0,
  interval INTEGER NOT NULL DEFAULT 0,
  next_review TEXT,
  last_review TEXT,
  predicted_difficulty REAL,
  review_count INTEGER NOT NULL DEFAULT 0,
  hlc TEXT NOT NULL,
  is_deleted INTEGER NOT NULL DEFAULT 0
);
```

The existing GraphStore interface already abstracts storage. The migration was incremental:
- ✅ Added Drift as a parallel storage backend (dual-write with Firestore)
- ✅ Switched reads to Drift-primary (Firestore becomes write-through sync)
- ✅ Added CRDT sync layer for multi-device consistency (Phases 1-5 complete)
- Make Firestore sync optional (personal-only mode works without it) — Phase 6, next up
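The `hlc` and `is_deleted` columns on `quiz_items` suggest a last-writer-wins merge with tombstones. The sketch below shows one way such a merge could be expressed as a single SQLite upsert (SQLite 3.24 or later), run here through Python's `sqlite3`. It assumes `hlc` strings sort lexicographically in clock order, merges only a few columns for brevity, and uses the hypothetical helper name `apply_remote`. It is not the project's actual sync code.

```python
import sqlite3

# Insert the remote row, or overwrite the local one only when the remote
# clock is strictly newer. Older or equal remote rows are ignored.
MERGE_SQL = """
INSERT INTO quiz_items (id, concept_id, question, answer, hlc, is_deleted)
VALUES (:id, :concept_id, :question, :answer, :hlc, :is_deleted)
ON CONFLICT(id) DO UPDATE SET
    concept_id = excluded.concept_id,
    question   = excluded.question,
    answer     = excluded.answer,
    hlc        = excluded.hlc,
    is_deleted = excluded.is_deleted
WHERE excluded.hlc > quiz_items.hlc
"""


def apply_remote(db, row):
    """Merge one remote quiz item (a dict keyed by column name) into the local table."""
    with db:
        db.execute(MERGE_SQL, row)
```

Deletes travel as ordinary rows with `is_deleted = 1`, so a stale edit that arrives after a delete loses the comparison and cannot resurrect the item. Applying the same row twice is a no-op, which makes retries after a dropped connection safe.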
- Graph state management (`GRAPH_STATE_MANAGEMENT.md`): Drift's reactive queries provide per-entity granularity naturally, reducing the need for manual Riverpod `family` provider splitting.
- CRDT sync (`CRDT_SYNC_ARCHITECTURE.md`): Local-first requires CRDTs for conflict-free sync. The knowledge graph's additive nature makes this clean.
- Sub-concept splitting: Local-first makes this frictionless — split, experiment, restructure without network round-trips.
- Embeddings: Computed by Claude at extraction (online), stored locally, queried offline forever.
- Ink & Switch: "Local-First Software" (Martin Kleppmann et al.) — the foundational paper
- `sqlite_crdt` (Daniel Cachapa) — SQLite with built-in CRDT support for Dart
- `crdt_sync` — sync protocol companion to `sqlite_crdt`
- PowerSync — commercial local-first sync for Flutter (SQLite + Postgres)
- Drift documentation — reactive SQLite for Dart