This document explains the core design decisions behind the decentralized database. Together, these decisions provide:
- Immutability and deterministic replay
- Simple conflict resolution and easy sync
- Offline operation: no loading spinners, works without a network
- User-owned data
- Shared logic between backend and browser
- A strong concurrency model and easier scaling
- No trust in servers required; relays are replaceable
The guiding principles:
- Offline-first: Operations work without network connectivity
- Conflict-free: CRDT merge logic ensures eventual consistency
- Privacy-first: End-to-end encryption, zero server trust
- Simple sync: Dumb relay server keeps infrastructure minimal
Traditional databases use mutable state:
- UPDATE statements modify rows in-place
- DELETE removes data permanently
- Concurrent writes require locks
- Conflicts need central coordination
Every operation is appended, never modified:
```
[op1: set user:1 = "Alice"]
[op2: set user:1 = "Bob"]     ← doesn't override op1
[op3: delete user:1]          ← doesn't erase previous ops
```
Crash safety:
- System crashes mid-write? No corruption.
- Restart and replay the log from disk.
- Atomic writes: an operation is fully written or not at all.
Mergeability:
- Logs from different devices can be merged
- No "last modified" conflicts
- Deterministic replay on all peers
Full history:
- Complete history of all changes
- State can be reconstructed at any point in time
- Useful for debugging and compliance
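The replay model above can be sketched as a fold over the log. This is a minimal sketch; the `Operation` type and its field names here are illustrative, not the real schema:

```go
package main

// Operation is a minimal log entry: a set or delete on a key.
type Operation struct {
	Type  string // "set" or "delete"
	Key   string
	Value string
}

// Replay folds the append-only log into current state.
// Later ops win at read time; deletes act as tombstones,
// but the log itself keeps every operation.
func Replay(ops []Operation) map[string]string {
	state := make(map[string]string)
	for _, op := range ops {
		switch op.Type {
		case "set":
			state[op.Key] = op.Value
		case "delete":
			delete(state, op.Key)
		}
	}
	return state
}
```

Replaying a prefix of the log reconstructs the state at any earlier point in time.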
```go
func (s *Store) Append(op Operation) error {
	data, err := json.Marshal(op)
	if err != nil {
		return err
	}
	// Just append — the file is the log.
	_, err = s.file.Write(append(data, '\n'))
	return err
}
```

No in-place updates, no B-trees, no WAL complexity.
- ❌ Log grows unbounded (needs compaction strategy)
- ❌ Slower queries (must replay entire log)
- ✅ Rock-solid consistency guarantees
- ✅ Trivial to implement correctly
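The unbounded-growth trade-off is usually handled with periodic compaction. A sketch of a last-write-wins compactor, with an illustrative `Operation` type (not the real schema):

```go
package main

// Operation is a minimal log entry (field names illustrative).
type Operation struct {
	Type  string // "set" or "delete"
	Key   string
	Value string
}

// Compact rewrites the log so it holds at most one op per live key.
// Deleted keys are dropped entirely. History is lost, which is
// the price of bounding log size.
func Compact(ops []Operation) []Operation {
	latest := make(map[string]Operation)
	var order []string // preserve first-seen key order for determinism
	for _, op := range ops {
		if _, seen := latest[op.Key]; !seen {
			order = append(order, op.Key)
		}
		latest[op.Key] = op
	}
	out := make([]Operation, 0, len(order))
	for _, k := range order {
		if op := latest[k]; op.Type != "delete" {
			out = append(out, op)
		}
	}
	return out
}
```

Compaction would typically run offline or at snapshot points, so live replicas keep their deterministic replay guarantee.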
Race condition example:

```go
// Thread 1                    // Thread 2
ops := store.ops               ops := store.ops
ops = append(ops, op1)         ops = append(ops, op2)
store.ops = ops                store.ops = ops
// Whichever write lands last wins — op1 (or op2) is lost!
```

Traditional solutions:
- Mutexes (slow, blocking)
- Lock-free algorithms (complex, error-prone)
- Transactions (heavyweight overhead)
```go
func (s *Store) run() {
	for op := range s.input {
		s.ops = append(s.ops, op) // Only this goroutine writes
	}
}

func (s *Store) Apply(op Operation) {
	s.input <- op // Everyone else just sends
}
```

- Only one goroutine modifies `s.ops`
- Go channels provide the synchronization
- No mutex overhead
Backpressure for free:
- Slow disk I/O? The channel fills up
- Fast producers automatically wait
- No manual flow control needed
Ordering:
- Operations are processed in channel order
- Easier to reason about
- Deterministic behavior
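A runnable sketch of the single-writer pattern, with a `Close` added so callers can wait for the writer to drain (the type and buffer size here are illustrative):

```go
package main

type Operation struct{ Key, Value string }

// Store funnels all writes through one goroutine via a channel.
type Store struct {
	input chan Operation
	done  chan struct{}
	ops   []Operation
}

func NewStore() *Store {
	s := &Store{
		input: make(chan Operation, 64),
		done:  make(chan struct{}),
	}
	go s.run()
	return s
}

func (s *Store) run() {
	for op := range s.input {
		s.ops = append(s.ops, op) // only this goroutine touches s.ops
	}
	close(s.done)
}

func (s *Store) Apply(op Operation) { s.input <- op }

// Close stops accepting writes and waits for the writer to drain,
// so reading s.ops afterwards is race-free.
func (s *Store) Close() {
	close(s.input)
	<-s.done
}
```

Because every write passes through one channel, no op can be lost to the read-modify-write race shown earlier.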
```sh
go test -race   # Zero data races!
```

- ✅ Single-threaded writes scale to ~100K ops/sec
- ✅ Reads can be concurrent (RWMutex for Get)
- ❌ Can't parallelize writes across cores
- ❌ Bounded by single disk throughput
For most CRDT use cases (collaborative apps, local-first), this is plenty fast.
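One way to get the concurrent reads the RWMutex bullet mentions, sketched with illustrative names; readers take only the read lock, so any number can run in parallel with each other:

```go
package main

import "sync"

// Store guards derived state with an RWMutex so reads can be concurrent.
type Store struct {
	mu    sync.RWMutex
	state map[string]string
}

// Get takes only a read lock: many Gets proceed in parallel.
func (s *Store) Get(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.state[key]
	return v, ok
}

// set is called only from the single writer goroutine, which
// takes the write lock briefly per operation.
func (s *Store) set(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.state[key] = value
}
```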
| Storage | Capacity | Async | Structured | Verdict |
|---|---|---|---|---|
| localStorage | 5-10 MB | ❌ Sync | ❌ String-only | Too limited |
| sessionStorage | 5-10 MB | ❌ Sync | ❌ String-only | Too limited |
| IndexedDB | 50+ MB | ✅ Async | ✅ Objects | Winner |
| WebSQL | Deprecated | ✅ | ✅ | Dead standard |
Capacity:

```js
// localStorage: only ~5 MB
localStorage.setItem('ops', JSON.stringify(ops)) // Quota exceeded!

// IndexedDB: 50 MB+ (browser-dependent)
db.put('operations', ops) // Plenty of room
```

Async API:

```js
// localStorage blocks the main thread
const data = localStorage.getItem('big-data') // UI freezes 😰

// IndexedDB doesn't block
db.getAll('operations').then(ops => {
  // UI stays responsive ✨
})
```

Structured data:

```js
// localStorage: everything is a string
const user = JSON.parse(localStorage.getItem('user')) // Manual parse

// IndexedDB: store objects directly
db.put('users', { id: 1, name: 'Alice' }) // Native objects
```

Indexed queries:

```js
// Find all ops after timestamp 100
const tx = db.transaction('operations')
const index = tx.objectStore('operations').index('timestamp')
const ops = await index.getAll(IDBKeyRange.lowerBound(100))
```

Atomic transactions:

```js
const tx = db.transaction(['operations', 'seen'], 'readwrite')
tx.objectStore('operations').add(op)
tx.objectStore('seen').put(true, op.id) // put(value, key)
// Atomic: both succeed or both fail
```

The Go storage layer targets the browser through WebAssembly:

```go
//go:build js && wasm

import "syscall/js"

func (s *IndexedDBStorage) Append(op Operation) error {
	db := js.Global().Get("indexedDB")
	// Call browser IndexedDB APIs from Go!
	_ = db
	return nil
}
```

```
┌─────────────────┐
│  User's Device  │
│  ┌───────────┐  │
│  │ Plain Ops │  │      Encryption happens HERE
│  └─────┬─────┘  │                  ↓
│        │ AES-GCM│      ┌──────────────┐
│  ┌─────▼─────┐  │      │ Encrypted Op │
│  │ Cipher Op │──┼────> │   (Base64)   │
│  └───────────┘  │      └──────┬───────┘
└─────────────────┘             │
                                │
                         ┌──────▼──────┐
                         │   Server    │
                         │ (can't read)│
                         └─────────────┘
```
```js
// Server receives this:
{
  "id": "op123",
  "value": "nfj38f9h2f9h2f9..." // Encrypted blob
}
// Server has zero idea what this means
```

Even if the server is compromised:
- ✅ Attacker gets encrypted gibberish
- ✅ No encryption keys on server
- ✅ No plaintext data
- GDPR: "Right to be forgotten" → Just delete your key
- HIPAA: PHI never leaves device unencrypted
- SOC2: Server doesn't process sensitive data
- User A's key encrypts User A's data
- User B's key encrypts User B's data
- The server stores both and can't distinguish them

No user can decrypt another user's data, even if they hack the server.
```go
// Works without network
encrypted, _ := Encrypt(data, identity.Key)
storage.Append(encrypted)
// Sync later when online
```

```go
func Encrypt(data, key []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize()) // 12 bytes for GCM
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Authenticated encryption: nonce is prepended to the ciphertext
	return gcm.Seal(nonce, nonce, data, nil), nil
}
```

AES-GCM provides:
- Confidentiality (encryption)
- Authenticity (can't tamper without detection)
- Associated data (can include metadata)
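A matching `Decrypt` might look like the sketch below, assuming the nonce-prefixed layout that `Encrypt` produces (`Encrypt` is restated so the block is self-contained). Because GCM is authenticated, `gcm.Open` rejects tampered ciphertext instead of returning garbage:

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
)

func Encrypt(data, key []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Nonce is prepended to the ciphertext
	return gcm.Seal(nonce, nonce, data, nil), nil
}

// Decrypt splits off the nonce and opens the ciphertext.
// gcm.Open verifies the auth tag, so tampering (or the wrong
// key) fails loudly rather than yielding corrupted plaintext.
func Decrypt(ciphertext, key []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(ciphertext) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, sealed := ciphertext[:gcm.NonceSize()], ciphertext[gcm.NonceSize():]
	return gcm.Open(nil, nonce, sealed, nil)
}
```

This is also what makes per-user key isolation enforceable: opening a blob with any key other than the one that sealed it simply fails.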
```
┌──────────────────────┐
│        Server        │
│  ┌────────────────┐  │
│  │ User Auth      │  │
│  │ Permissions    │  │  ← Complex logic
│  │ Conflict Res   │  │  ← State machine
│  │ Data Transform │  │  ← Business rules
│  └────────────────┘  │
└──────────────────────┘
```
Problems:
- ❌ Server must understand application logic
- ❌ Tight coupling between client and server
- ❌ Complex deployment and scaling
- ❌ Single point of failure
```go
func relay() {
	for msg := range broadcast {
		for client := range clients {
			client.Send(msg) // Just forward!
		}
	}
}
```

That's it. 42 lines of code.
Server doesn't know:
- What the data means
- Who can access what
- How to resolve conflicts
All intelligence in clients.
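The "dumb relay" loop above can be made concrete with channels standing in for network connections; this is a toy sketch (the real transport would be WebSockets or similar), but the forwarding logic is the same:

```go
package main

// relay forwards every message to every connected client without
// inspecting it. Channels stand in for sockets in this sketch.
func relay(broadcast <-chan []byte, clients []chan []byte) {
	for msg := range broadcast {
		for _, c := range clients {
			c <- msg // just forward!
		}
	}
	// Broadcast closed: disconnect all clients.
	for _, c := range clients {
		close(c)
	}
}
```

Note what is absent: no parsing, no permissions, no conflict handling. The relay cannot misbehave in interesting ways because it does nothing interesting.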
```sh
# Run 10 relays behind a load balancer
for i in {1..10}; do
  ./relay --port=$((8080 + i)) &
done
```

No shared state, no coordination.
```
Client A ──┐
           ├──> Relay ($5/month VPS) ──┐
Client B ──┘                           ├──> Client C
                                       └──> Client D
```
Compare to Firebase ($25-$500/month) with vendor lock-in.
```sh
# Users can run their own relay
docker run -p 8080:8080 decentralized-db/relay
```

No need to trust a third party.
```
┌─────────┐
│ WebRTC  │  ← Peer-to-peer, no server
└─────────┘
┌─────────┐
│  Relay  │  ← Simple broadcast
└─────────┘
┌─────────┐
│ libp2p  │  ← Decentralized network
└─────────┘
```
Same CRDT core, different transport layers.
```
Client sends:  ENCRYPTED operation
Relay sees:    random bytes
Relay does:    broadcast to peers
Peers:         decrypt using their keys
```

No auth needed! Encrypted data is self-authenticating.
What we gained:
- ✅ Simplicity: Core CRDT logic is ~300 lines
- ✅ Correctness: Deterministic merge, no race conditions
- ✅ Privacy: E2E encryption, server can't read data
- ✅ Resilience: Offline-first, crash recovery
- ✅ Cost: Self-host on a $5/month VPS
What we accepted:
- ❌ Storage overhead: Log needs compaction
- ❌ Query performance: Must replay the log
- ❌ No indexes: Linear scan for queries
- ❌ No transactions: Eventually consistent only
Good fit:
- ✅ Collaborative apps (Notion, Figma)
- ✅ Local-first software (offline editing)
- ✅ Privacy-critical apps (healthcare, finance)
- ✅ Multi-device sync (phone + laptop + tablet)
Poor fit:
- ❌ Real-time analytics (needs aggregations)
- ❌ Complex joins (need a relational DB)
- ❌ Strong consistency (needs distributed consensus)
Further reading:
- CRDTs: The Hard Parts - Martin Kleppmann
- Local-First Software - Ink & Switch
- Designing Data-Intensive Applications - Martin Kleppmann
Architecture designed for simplicity, privacy, and resilience.