A compact, heavily-commented reference implementation of a .NET-style generational GC using unmanaged memory, raw pointer references (
nint), card tables (bitmap-like), a heap-wide brick index, thread-local allocation (TLH) for Gen0, minor (ephemeral) GC with compaction + promotion, and optional non-moving regions/arenas with free-all semantics.
- Mirrors key CLR GC concepts while staying approachable.
- Uses unmanaged memory per segment so object references are actual addresses.
- No console I/O in the engine; exposes a report object for inspection/tests.
- Clean, modular structure — easy to extend to multi-segment, pinning, background GC, etc.
- Gen0 / Gen1 / Gen2 / LOH / Region each have a
Segment:- Unmanaged buffer (
IntPtr BasePtr,SizeBytes), bump pointerAllocPtr - Card table (bitmap-like) per segment for write barrier tracking
- Unmanaged buffer (
- Sorted vector of segments by base address for fast address→segment mapping.
- Objects are
[Header | Payload]:- Header: 16 bytes —
[SyncBlock(8)][MethodTable(8=TypeId)] - Payload: computed by
TypeDesc(explicit alignment/padding)
- Header: 16 bytes —
- References are raw 64-bit addresses (
nint) into segments.
- Fixed-size, C-like types only (classes and nested structs).
- Arrays/variable-sized objects are out of scope for simplicity.
- Gen0 allocation uses TLH slabs (e.g., 32 KiB) carved from the Gen0 segment.
ThreadLocalHeapholds{ SlabStart, SlabCursor, SlabLimit }.- When a slab is full, reserve a new slab (or trigger minor GC first).
- On every reference write, if
parent ∈ {Gen1, Gen2, LOH}andchild ∈ Gen0:- Mark the card covering that write as dirty in the parent’s segment.
- During minor GC, only dirty card ranges are scanned for Gen0 children.
- A card covers a small contiguous range of bytes in a segment (default: 256 B).
- Implementation uses 1 byte per card (
0 = clean,1 = dirty). Conceptually, this is a bitmap. - Overhead:
segmentSize / cardSize.- Example: 32 KiB segment / 256 B per card ⇒ 128 cards ⇒ 128 bytes overhead.
- Purpose: bound scanning during minor GC to just modified ranges in old generations.
ASCII intuition:
Segment bytes (e.g. 32 KiB)
[--256B card0--][--256B card1--] ... [--card127--]
CardTable bytes:
index 0..127, value {0=clean, 1=dirty}
Real runtimes pick a small size that balances precision and overhead, and fits CPU cache lines well (e.g., 64–512 B). 256 is a common sweet spot; tweak to explore trade-offs.
- For each brick (e.g., 2 KiB), store the last object start ≤ brick start.
- Lets the collector snap an arbitrary interior address to a nearby object start.
- Used when scanning dirty card ranges (you know a byte range, not object boundaries).
Arbitrary interior address: any address inside an object (not necessarily the header).
Given a dirty range [A, B), we:
- Snap A back to a known object start with the brick index.
- Walk object-by-object until we pass
B.
- Mark:
- Seed with roots that point into Gen0/Gen1.
- Add Gen0 children while scanning dirty cards in Gen1/Gen2/LOH.
- Traverse only inside Gen0/Gen1.
- Compact Gen0:
- Copy survivors densely within Gen0; build relocation map (oldAbs→newAbs).
- Rebuild brick entries for the Gen0 address range.
- Fix all references (roots + heap) via relocation map.
- Promote Gen0 → Gen1:
- Copy remaining Gen0 survivors into Gen1; add entries to the brick index.
- Apply relocation, fix references again.
- Reset TLHs and clear dirty cards in old generations.
- Mark from all roots (including region→GC external roots).
- Traversal visits all generations.
- Sweep/compaction of Gen2/LOH is intentionally omitted to keep the model compact.
- Region is a non-moving arena with free-all semantics:
AllocInRegion(type, region)bumps inside the region’s segment.- Disposing the region makes all its objects unreachable immediately.
- Reference policy:
- GC→Region references are disallowed (could dangle after
Dispose()). - Region→GC references are allowed; we record these addresses as external GC roots so the GC keeps them alive while the region lives.
- GC→Region references are disallowed (could dangle after
- Typical uses:
- Transient graphs with clear lexical lifetime.
- High-throughput allocations with cheap teardown (
Dispose()).
| Method | Count | Mean | Error | StdDev | Median | Ratio | RatioSD | Gen0 | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|---|---|---|
| 'GC: Gen0 TLH (minor GCs as needed)' | 10000 | 1.574 ms | 0.2579 ms | 0.7605 ms | 1.333 ms | 1.31 | 0.98 | - | 2.52 MB | 1.00 |
| 'GC: Region (arena) free-all' | 10000 | 2.126 ms | 0.0404 ms | 0.0758 ms | 2.107 ms | 1.76 | 0.92 | - | 1.91 MB | 0.76 |
| 'GC: Gen0 TLH (minor GCs as needed)' | 50000 | 5.435 ms | 0.1071 ms | 0.2139 ms | 5.399 ms | 1.00 | 0.05 | 1000.0000 | 12.59 MB | 1.00 |
| 'GC: Region (arena) free-all' | 50000 | 2.852 ms | 0.0682 ms | 0.1748 ms | 2.844 ms | 0.53 | 0.04 | 1000.0000 | 9.54 MB | 0.76 |
| 'GC: Gen0 TLH (minor GCs as needed)' | 200000 | 27.924 ms | 0.2964 ms | 0.2314 ms | 27.954 ms | 1.00 | 0.01 | 6000.0000 | 50.36 MB | 1.00 |
| 'GC: Region (arena) free-all' | 200000 | 10.750 ms | 0.2301 ms | 0.6711 ms | 10.691 ms | 0.39 | 0.02 | 4000.0000 | 38.15 MB | 0.76 |