A deliberately constrained learning project: build the money-movement core of a fintech wallet / SaaS-credits system on MySQL / InnoDB, where the rule is I never get handed a schema. The schema is trivial; keeping money correct under concurrency and at scale is not. That hard part can't be answered by a schema dump — it has to be reasoned about. So I write the design and the reasoning; my AI tooling is configured to only attack it, never author it.
This repo is two things at once:
- A portfolio piece — a public record of how I think about data modeling, indexing, transactions, and query-plan tuning.
- Deliberate practice — I'm a backend engineer (Go, ~1.5y) whose honest weakest area is
the database layer: I'd never designed a schema, an index, or read an
EXPLAINplan. This project exists to close exactly that gap, on purpose, the hard way.
If you ask an AI for a ledger schema, you get a correct one instantly and learn nothing. So the
.claude/CLAUDE.md in this repo configures Claude Code as a skeptical
senior DBA that is forbidden from writing any schema, SQL, or code for me. It may only ask
questions, cite the official MySQL/InnoDB docs, and tell me when I'm wrong — never how to
fix it. Every design decision in this repo is one I had to sweat for and defend out loud.
Using AI as an adversary that sharpens reasoning, rather than a vending machine that replaces it, is itself the meta-skill I'm practicing here.
The money-movement core of a product where users can top up, spend / withdraw, transfer to another user, and refund. Modeled from six correctness invariants rather than from a given table layout:
- Money is conserved — internal ops never create or destroy value (→ double-entry).
- No negative balance unless overdraft is explicitly allowed.
- Every operation is idempotent — a retried top-up applies exactly once.
- The ledger is append-only & auditable — no
UPDATE/DELETEof posted entries; corrections are compensating entries. - History is queryable — "balance of account X at time T?" must be answerable.
- Reads are hot — "last 50 transactions" and "current balance" are called constantly.
The schema is derived from these invariants. See the full brief: ledger_db_learning_project.md.
Each scenario is run against a real MySQL instance to watch a specific failure happen:
| # | Scenario | What it teaches |
|---|---|---|
| A | Two concurrent withdrawals of 80 from a balance of 100 | Isolation levels, SELECT … FOR UPDATE, optimistic versioning |
| B | Transfer A→B crashes after debit, before credit | Atomicity, transaction boundaries, dual-write problem |
| C | SUM() over 10M ledger rows takes 2s |
EXPLAIN the scan; running-balance vs snapshot vs sum-on-read |
| D | "Transactions between two dates for account X" is slow | Composite index, column order, leftmost-prefix, selectivity |
| E | LIMIT 50 OFFSET 1000000 crawls |
Keyset / cursor pagination |
| F | Transfer A→B and B→A concurrently | Deadlocks, lock ordering, reading the InnoDB deadlock log |
- Phase 0 — Model on paper. No DB, no AI. Invariants → tables → exact queries.
- Phase 1 — B-tree reality: full scan vs range scan, B+-tree fan-out, InnoDB clustered PK.
- Phase 2 — Selectivity & composite indexes; covering indexes (
Using index). - Phase 3 — Transactions & isolation levels; finding write skew under
REPEATABLE READ. - Phase 4 — Locking & deadlocks; pessimistic vs optimistic; InnoDB gap locks.
- Phase 5 — The aggregation trade-off: sum-on-read vs running balance vs snapshot, measured.
- MySQL 8.x / InnoDB (primary), Postgres notes where behavior differs
- Go for the thin application layer (the DB is the point)
- Dockerized local MySQL; synthetic data generator for scale tests
.
├── README.md # you are here
├── ledger_db_learning_project.md # full project brief / curriculum
├── .claude/CLAUDE.md # the "DBA adversary" rules for Claude Code
└── docs/ # my design artifacts: paper model, notes, EXPLAIN logs
Status: active learning project, not production software, and not intended to be. The point is the reasoning trail, not a deployable product.