Skip to content

Jimmymonster/settle-ledger-db

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Settle — a database-depth project disguised as a ledger

A deliberately constrained learning project: build the money-movement core of a fintech wallet / SaaS-credits system on MySQL / InnoDB, where the rule is I never get handed a schema. The schema is trivial; keeping money correct under concurrency and at scale is not. That hard part can't be answered by a schema dump — it has to be reasoned about. So I write the design and the reasoning; my AI tooling is configured to only attack it, never author it.

This repo is two things at once:

  1. A portfolio piece — a public record of how I think about data modeling, indexing, transactions, and query-plan tuning.
  2. Deliberate practice — I'm a backend engineer (Go, ~1.5y) whose honest weakest area is the database layer: I'd never designed a schema, an index, or read an EXPLAIN plan. This project exists to close exactly that gap, on purpose, the hard way.

Why the AI isn't allowed to help (the interesting part)

If you ask an AI for a ledger schema, you get a correct one instantly and learn nothing. So the .claude/CLAUDE.md in this repo configures Claude Code as a skeptical senior DBA that is forbidden from writing any schema, SQL, or code for me. It may only ask questions, cite the official MySQL/InnoDB docs, and tell me when I'm wrong — never how to fix it. Every design decision in this repo is one I had to sweat for and defend out loud.

Using AI as an adversary that sharpens reasoning, rather than a vending machine that replaces it, is itself the meta-skill I'm practicing here.


The system

The money-movement core of a product where users can top up, spend / withdraw, transfer to another user, and refund. Modeled from six correctness invariants rather than from a given table layout:

  1. Money is conserved — internal ops never create or destroy value (→ double-entry).
  2. No negative balance unless overdraft is explicitly allowed.
  3. Every operation is idempotent — a retried top-up applies exactly once.
  4. The ledger is append-only & auditable — no UPDATE/DELETE of posted entries; corrections are compensating entries.
  5. History is queryable — "balance of account X at time T?" must be answerable.
  6. Reads are hot — "last 50 transactions" and "current balance" are called constantly.

The schema is derived from these invariants. See the full brief: ledger_db_learning_project.md.


The adversarial scenarios

Each scenario is run against a real MySQL instance to watch a specific failure happen:

# Scenario What it teaches
A Two concurrent withdrawals of 80 from a balance of 100 Isolation levels, SELECT … FOR UPDATE, optimistic versioning
B Transfer A→B crashes after debit, before credit Atomicity, transaction boundaries, dual-write problem
C SUM() over 10M ledger rows takes 2s EXPLAIN the scan; running-balance vs snapshot vs sum-on-read
D "Transactions between two dates for account X" is slow Composite index, column order, leftmost-prefix, selectivity
E LIMIT 50 OFFSET 1000000 crawls Keyset / cursor pagination
F Transfer A→B and B→A concurrently Deadlocks, lock ordering, reading the InnoDB deadlock log

Roadmap

  • Phase 0 — Model on paper. No DB, no AI. Invariants → tables → exact queries.
  • Phase 1 — B-tree reality: full scan vs range scan, B+-tree fan-out, InnoDB clustered PK.
  • Phase 2 — Selectivity & composite indexes; covering indexes (Using index).
  • Phase 3 — Transactions & isolation levels; finding write skew under REPEATABLE READ.
  • Phase 4 — Locking & deadlocks; pessimistic vs optimistic; InnoDB gap locks.
  • Phase 5 — The aggregation trade-off: sum-on-read vs running balance vs snapshot, measured.

Tech

  • MySQL 8.x / InnoDB (primary), Postgres notes where behavior differs
  • Go for the thin application layer (the DB is the point)
  • Dockerized local MySQL; synthetic data generator for scale tests

Repo layout

.
├── README.md                          # you are here
├── ledger_db_learning_project.md      # full project brief / curriculum
├── .claude/CLAUDE.md                  # the "DBA adversary" rules for Claude Code
└── docs/                              # my design artifacts: paper model, notes, EXPLAIN logs

Status: active learning project, not production software, and not intended to be. The point is the reasoning trail, not a deployable product.

About

Settle — a database-depth learning project disguised as a fintech ledger. Schema-from-invariants on MySQL/InnoDB; AI is configured to grill the design, never write it.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors