
Commit e0c1f12

nickmeinhold and claude authored
feat(crdt): changeset generation, LWW merge, and serialization (#41) (#124)
* feat(crdt): changeset generation, LWW merge, and serialization (#41)

  Phase 4 of the CRDT sync migration: schema v3 with HLC indexes, GraphChangeset data class, getChangeset/mergeChangeset/getLastModified on DriftGraphRepository, toInsertCompanion reverse mappers, and drift_sync_metadata table for future transport layer.

  Also adds SOCIAL_KNOWLEDGE_PLAN.md documenting the federated knowledge graph vision (shared concepts, per-user relationships, quiz format evolution).

  Co-Authored-By: Claude <noreply@anthropic.com>

* fix(crdt): add @TableIndex for HLC on all tables, fix migration test (#41)

  Addresses cage match review feedback:
  - Add @TableIndex annotations for HLC on all 6 tables so fresh v3 databases get HLC indexes via onCreate (not just onUpgrade)
  - Fix schema migration test to actually verify all 6 indexes exist
  - Add TODO comment for batch SELECT optimization in mergeChangeset
  - Add doc warning on save() about tombstoning partial graphs

  Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
1 parent 43d9ab9 commit e0c1f12

8 files changed, +2089 -11 lines changed


docs/CRDT_SYNC_ARCHITECTURE.md

Lines changed: 31 additions & 9 deletions
@@ -205,26 +205,48 @@ PowerSync provides a complete local-first sync layer for Flutter:
 
 ## Migration Strategy
 
-### Phase 1: Dual-Write Foundation
+### Phase 1: Dual-Write Foundation
 
 - Add Drift/SQLite as a parallel storage backend
 - Write to both Drift and Firestore on every operation
 - Read from Drift (local-primary)
 - Verify consistency between both stores
 
-### Phase 2: CRDT Timestamps
+### Phase 2: HLC Timestamps + Tombstones ✅
 
-- Add HLC columns to all Drift tables
-- Start recording timestamps on every write
-- Build changeset generation (`getChangeset(since: hlc)`)
+- Add `hlc` and `is_deleted` columns to all 6 Drift tables
+- `HlcManager` stamps every write with a monotonic HLC
+- `purgeTombstones()` for garbage collection after sync confirmation
 
-### Phase 3: Sync Layer
+### Phase 3: Upsert + Orphan Tombstoning ✅
 
-- Implement merge logic (per CRDT type mapping above)
-- Build background sync service (push changesets to server, pull from server)
+- `save()` uses INSERT OR REPLACE (upserts) + orphan soft-deletion
+- Active rows not in incoming graph are tombstoned, not physically deleted
+- Tombstones preserved for changeset propagation
+
+### Phase 4: Changeset Generation & Merge ✅
+
+- Schema v3: HLC indexes on all 6 tables, `drift_sync_metadata` table
+- `GraphChangeset` data class: typed Drift rows internally, JSON wire format
+  (table name → list of column maps, compatible with `package:crdt`'s
+  `CrdtChangeset` typedef)
+- `getChangeset(modifiedAfter:)`: extract modified rows since an HLC, includes
+  tombstones for deletion propagation
+- `mergeChangeset()`: LWW per HLC, `receive()` advances local clock for causal
+  ordering, idempotent, atomic (transaction), fires `watch()` listeners
+- `getLastModified()`: highest HLC across all tables for sync bookkeeping
+- `toInsertCompanion()` reverse mappers on all 6 Drift data classes
+- Same-node guard prevents `DuplicateNodeException` from `Hlc.merge()`
+- Row-level LWW (sufficient for single-user multi-device sync; per-field LWW
+  deferred to social knowledge layer — see `docs/SOCIAL_KNOWLEDGE_PLAN.md`)
+
+### Phase 5: Sync Transport Layer (next)
+
+- Background sync service (push/pull changesets over network)
+- Populate `drift_sync_metadata` with per-peer last-synced HLC
 - Server-side merge in Firestore (or migrate to Postgres)
 
-### Phase 4: Firestore Optional
+### Phase 6: Firestore Optional
 
 - Personal features work entirely offline with Drift
 - Firestore (or replacement) used only for social sync + backup
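
The row-level LWW rule listed under Phase 4 in this diff can be sketched in plain Dart. This is a minimal illustration under stated assumptions, not the repository's `mergeChangeset()`: `SimpleHlc`, `Row`, and `mergeRows` are hypothetical names, the clock is a simplified stand-in for the `Hlc` in `package:crdt`, and an in-memory map stands in for the Drift tables. It omits the clock advance, the transaction, and the `watch()` notifications that the doc mentions.

```dart
// Hypothetical, simplified stand-in for a hybrid logical clock value.
// It is NOT the Hlc class from package:crdt.
class SimpleHlc implements Comparable<SimpleHlc> {
  final int millis; // physical wall-clock component
  final int counter; // logical counter, breaks ties within one millisecond
  final String nodeId; // final tie-breaker between devices

  const SimpleHlc(this.millis, this.counter, this.nodeId);

  @override
  int compareTo(SimpleHlc other) {
    if (millis != other.millis) return millis.compareTo(other.millis);
    if (counter != other.counter) return counter.compareTo(other.counter);
    return nodeId.compareTo(other.nodeId);
  }
}

// Hypothetical row shape: primary key, payload, HLC stamp, tombstone flag.
class Row {
  final String id;
  final String value;
  final SimpleHlc hlc;
  final bool isDeleted;

  const Row(this.id, this.value, this.hlc, {this.isDeleted = false});
}

/// Merges [incoming] rows into [local] (keyed by id) with row-level LWW.
/// Returns how many rows were written.
int mergeRows(Map<String, Row> local, Iterable<Row> incoming) {
  var written = 0;
  for (final row in incoming) {
    final existing = local[row.id];
    // Write only when the incoming stamp is strictly newer. Equal stamps are
    // skipped, so applying the same changeset twice changes nothing.
    if (existing == null || row.hlc.compareTo(existing.hlc) > 0) {
      local[row.id] = row; // tombstones are stored like any other row
      written++;
    }
  }
  return written;
}

void main() {
  final local = {
    'c1': Row('c1', 'Docker', SimpleHlc(1000, 0, 'phone')),
  };
  final changeset = [
    Row('c1', 'Docker (edited)', SimpleHlc(1005, 0, 'laptop')),
    Row('c2', 'Pods', SimpleHlc(1001, 0, 'laptop'), isDeleted: true),
  ];
  print(mergeRows(local, changeset)); // 2
  print(mergeRows(local, changeset)); // 0 (idempotent)
  print(local['c1']!.value); // Docker (edited)
}
```

Comparing whole rows by a single stamp is what makes this row-level rather than per-field: a newer write to one column replaces the entire row, which matches the trade-off the doc accepts for single-user multi-device sync.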

docs/SOCIAL_KNOWLEDGE_PLAN.md

Lines changed: 177 additions & 0 deletions
@@ -0,0 +1,177 @@
+# Social Knowledge Building & Competition
+
+> Vision: A federated knowledge graph where each learner has their own evolving
+> view. Graphs grow as you learn; you can browse how others structure the same
+> concepts, selectively adopt relationships, compete on quiz items, and
+> negotiate "ground truth" collaboratively.
+
+## Core Principles
+
+1. **Personal graph is sovereign** — your concepts, relationships, quiz items,
+   and FSRS scheduling state belong to you. Your devices sync automatically
+   via the CRDT layer (Phase 4, row-level LWW).
+2. **Concepts are shared, opinions differ** — a concept like "Docker" exists
+   once in a shared pool, but the *relationships* around it ("Docker is
+   prerequisite of Kubernetes" vs "Docker is related to Kubernetes") are
+   per-user opinions that can diverge intentionally.
+3. **Adoption by choice, not automatic merge** — when you see someone's
+   different relationship structure, you can inspect it, compare it to yours,
+   and adopt it with a tap. This is explicit, not background sync.
+4. **Quiz formats evolve** — flash cards are the starting point. The quiz item
+   model should be extensible toward multiple-choice, visual quizzes, and
+   diagram-based questions without breaking the knowledge graph or FSRS state.
+
+## Data Model Evolution
+
+### Current (Personal Graph)
+
+```
+User A's graph:
+  concepts: [Docker, Kubernetes, Pods]
+  relationships: [K8s --prerequisite--> Docker, Pods --composition--> K8s]
+  quizItems: [personal FSRS state per item]
+```
+
+### Future (Federated Graph)
+
+```
+Shared concept pool:
+  concepts: [Docker, Kubernetes, Pods, ...] (canonical definitions)
+
+User A's view:
+  relationships: [K8s --prerequisite--> Docker] (A's opinion)
+  quizItems: [A's FSRS state]
+  adoptedFrom: {r42: userB} (tracking provenance)
+
+User B's view:
+  relationships: [K8s --enables--> Docker] (B's opinion)
+  quizItems: [B's FSRS state]
+```
+
+### Key Schema Changes
+
+1. **`owner_id`** on relationships, quiz items — distinguishes "mine" from
+   "theirs" without separate tables
+2. **`provenance`** on adopted relationships — tracks who you adopted from and
+   when, enabling "undo adoption" and social credit
+3. **`quiz_format`** enum on quiz items — `flashcard | multipleChoice | visual |
+   diagramLabel` — extensible without schema migration
+4. **`alternatives`** JSON on multiple-choice quiz items — stores distractor
+   options alongside the correct answer
+5. **Concept deduplication** — shared concept pool uses content-addressable IDs
+   (hash of name + source) or server-assigned canonical IDs after merge
+
+## Social Features
+
+### Browse & Compare
+
+- View another user's relationship graph overlaid on yours (different edge
+  colors)
+- Diff view: "User B has 3 relationships you don't, you have 2 they don't"
+- Tap any foreign relationship to preview, then adopt or dismiss
+
+### Selective Adoption
+
+- One-tap adopt: copies a relationship into your graph with provenance metadata
+- Batch adopt: "adopt all of User B's relationships for this concept cluster"
+- Undo adoption: removes the relationship and its provenance record
+
+### Competition
+
+- Challenge a friend: "quiz me on your mastered concepts" (existing mechanic)
+- Leaderboard: who has the most stable (high-retrievability) graph?
+- Concept coverage race: who can master a topic cluster first?
+
+### Negotiation
+
+- Propose a relationship change to another user (like a PR)
+- Vote on contested relationships within a wiki group
+- "Ground truth" emerges from consensus, not authority
+
+## Quiz Format Evolution
+
+### Phase 1: Current (Flash Cards)
+
+```dart
+QuizItem(question: 'What is Docker?', answer: 'A container runtime')
+```
+
+### Phase 2: Multiple Choice
+
+```dart
+QuizItem(
+  question: 'What is Docker?',
+  answer: 'A container runtime',
+  format: QuizFormat.multipleChoice,
+  alternatives: ['A programming language', 'An operating system', 'A database'],
+)
+```
+
+Distractors can be:
+- **AI-generated** — Claude generates plausible wrong answers from nearby
+  concepts in the graph
+- **Graph-derived** — siblings of the correct concept (same parent, same
+  relationship type) make natural distractors
+
+### Phase 3: Visual Quizzes
+
+- "Which node in this subgraph represents Docker?" (highlight the correct node)
+- "Draw the missing relationship" (given two concepts, name the edge)
+- "Label this diagram" (given a subgraph, fill in concept names)
+
+### Phase 4: Concept Splitting Quizzes
+
+- "Docker was split into Docker Images, Docker Containers, and Docker Networks.
+  Which sub-concept does this description belong to?"
+- Tests understanding at the sub-concept level after a split
+
+## Implementation Phases
+
+### Phase S1: Shared Concept Pool
+
+- Server-side concept deduplication (name + source hash → canonical ID)
+- Personal graphs reference shared concept IDs
+- No relationship sharing yet — just concepts
+
+### Phase S2: Relationship Browsing
+
+- API to fetch another user's relationships for a given concept cluster
+- Overlay UI on the knowledge graph (foreign edges as dashed/colored lines)
+- Diff computation (your edges vs theirs)
+
+### Phase S3: Selective Adoption + Provenance
+
+- `adoptRelationship(fromUser, relationshipId)` — copies into personal graph
+- Provenance tracking on adopted relationships
+- Undo adoption
+
+### Phase S4: Multiple Choice + Visual Quizzes
+
+- `QuizFormat` enum on `QuizItem`
+- AI distractor generation during extraction
+- Graph-derived distractors from sibling concepts
+- New quiz UI widgets per format
+
+### Phase S5: Negotiation + Consensus
+
+- Relationship proposals (like PRs)
+- Wiki-group voting on contested edges
+- Consensus threshold for "ground truth" relationships
+
+## Dependencies
+
+- **CRDT sync (Phase 4, #41)** — personal graph sync must work first
+- **Concept embeddings (#39)** — enables smart deduplication and distractor
+  generation
+- **Local-first Drift (#40)** — all social features read from local DB, server
+  pushes updates via changeset sync
+
+## Open Questions
+
+1. Should concept *definitions* (description field) also be per-user, or is
+   the shared pool's definition canonical?
+2. How to handle concept splits in a shared pool — does splitting create new
+   shared concepts, or personal sub-concepts?
+3. Should FSRS state ever be shared (e.g., "this concept is hard for 80% of
+   learners")?
+4. How to prevent spam in relationship proposals / voting?
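
The "Diff computation (your edges vs theirs)" item in Phase S2 of the plan above could be as small as two set differences. The sketch below is only an illustration: the plan defines no concrete relationship class, so `Edge`, `EdgeDiff`, and `diffEdges` are hypothetical names, and edges are compared purely by source, relationship type, and target.

```dart
// Hypothetical edge type; the plan does not specify a relationship class.
class Edge {
  final String from;
  final String type;
  final String to;

  const Edge(this.from, this.type, this.to);

  // Value equality, so the same relationship held by two users compares equal.
  @override
  bool operator ==(Object other) =>
      other is Edge && other.from == from && other.type == type && other.to == to;

  @override
  int get hashCode => Object.hash(from, type, to);

  @override
  String toString() => '$from --$type--> $to';
}

// Hypothetical result type: edges unique to each side.
class EdgeDiff {
  final Set<Edge> onlyMine;
  final Set<Edge> onlyTheirs;

  const EdgeDiff(this.onlyMine, this.onlyTheirs);
}

/// Returns the edges present on one side but not on the other.
EdgeDiff diffEdges(Set<Edge> mine, Set<Edge> theirs) =>
    EdgeDiff(mine.difference(theirs), theirs.difference(mine));

void main() {
  final mine = {
    const Edge('K8s', 'prerequisite', 'Docker'),
    const Edge('Pods', 'composition', 'K8s'),
  };
  final theirs = {
    const Edge('K8s', 'enables', 'Docker'),
    const Edge('Pods', 'composition', 'K8s'),
  };
  final diff = diffEdges(mine, theirs);
  print(diff.onlyMine); // {K8s --prerequisite--> Docker}
  print(diff.onlyTheirs); // {K8s --enables--> Docker}
}
```

Under this comparison, two users who link the same pair of concepts with different relationship types show up as one edge on each side of the diff, which is the "opinions differ" case the plan describes for Docker and Kubernetes.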
