
Commit a0db03b ("update findings", parent 7a92c17)

1 file changed: +90 -27 lines

## Memory Diagnostics Results

### Data Collection

Server: smp19.simplex.im, PostgreSQL backend, `useCache = False`
RTS flags: `+RTS -N -A16m -I0.01 -Iw15 -s -RTS` (16 cores)
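The rts_* metrics used below map onto fields that GHC exposes through `GHC.Stats`. A minimal sampler sketch (an assumption about how such metrics can be read, not the server's actual instrumentation; it uses the definitions from this document, rts_frag = rts_heap - rts_live and non-large = rts_live - rts_large, and requires the process to run with `-T` or `-s` so stats are populated):

```haskell
import Data.Word (Word64)
import GHC.Stats
import Text.Printf (printf)

-- Word64-safe subtraction: clamp at zero instead of wrapping around.
sub :: Word64 -> Word64 -> Word64
sub a b = if a >= b then a - b else 0

gb :: Word64 -> Double
gb n = fromIntegral n / 1e9

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS stats are off; start with +RTS -T (or -s)"
    else do
      s <- getRTSStats
      let d     = gc s                             -- details of the most recent GC
          live  = gcdetails_live_bytes d           -- rts_live
          heap  = gcdetails_mem_in_use_bytes d     -- rts_heap: memory retained from the OS
          large = gcdetails_large_objects_bytes d  -- rts_large
      mapM_ (\(name, v) -> printf "%-9s %6.2f GB\n" name (gb v))
        [ ("rts_live", live)
        , ("rts_heap", heap)
        , ("rts_large", large)
        , ("rts_frag", heap `sub` live)    -- heap held from the OS but not live
        , ("non-large", live `sub` large)  -- ordinary (copied) heap objects
        ]
```

Sampling this every 5 minutes is enough to reproduce the tables below.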

### Mar 20 Data (1 hour, 07:19-08:19)

```
Time   rts_live  rts_heap  rts_large  rts_frag  clients  non-large
07:19   7.5 GB    8.2 GB    5.5 GB    0.03 GB   14,000    2.0 GB
07:24   6.4 GB   10.8 GB    5.2 GB     3.6 GB   14,806    1.2 GB
07:29   8.2 GB   10.8 GB    6.5 GB     1.8 GB   15,667    1.7 GB
07:34  10.0 GB   12.3 GB    7.9 GB     1.4 GB   15,845    2.1 GB
07:39   6.7 GB   13.0 GB    5.3 GB     5.6 GB   16,589    1.4 GB
07:44   8.5 GB   13.0 GB    6.7 GB     3.7 GB   16,283    1.8 GB
07:49   6.5 GB   13.0 GB    5.2 GB     5.8 GB   16,532    1.3 GB
07:54   6.0 GB   13.0 GB    4.8 GB     6.3 GB   16,636    1.2 GB
07:59   6.4 GB   13.0 GB    5.1 GB     5.9 GB   16,769    1.3 GB
08:04   8.3 GB   13.0 GB    6.5 GB     3.9 GB   17,352    1.8 GB
08:09  10.2 GB   13.0 GB    8.0 GB     1.9 GB   17,053    2.2 GB
08:14   5.6 GB   13.0 GB    4.5 GB     6.8 GB   17,147    1.1 GB
08:19   7.6 GB   13.0 GB    6.1 GB     4.6 GB   17,496    1.5 GB
```

non-large = rts_live - rts_large (normal Haskell heap objects: Maps, TVars, closures)

### Mar 19 Data (5.5 hours, 07:49-13:19)

rts_heap grew from 10.1 GB to 20.7 GB over 5.5 hours (~1.9 GB/hour); GHC never returned memory to the OS. The post-GC rts_live floor rose from 5.5 GB to 9.1 GB.
### Findings

**1. Large/pinned objects dominate live data (60-80%)**

`rts_large` = 4.5-8.0 GB out of 5.6-10.2 GB live. These are allocations larger than ~3 KB that go on GHC's large-object heap. They oscillate rather than growing monotonically, meaning they are constantly being allocated and freed — transient, not leaked.

**2. Fragmentation is the heap growth mechanism**

`rts_heap ≈ rts_live + rts_frag`. The heap grows because pinned/large objects fragment GHC's block allocator. Once GHC expands the heap, it never shrinks. Growth pattern:

- Large objects allocated → occupy blocks
- Large objects freed → blocks can't be reused while ANY other object shares the block
- New allocations need fresh blocks → heap expands
- Heap never returns memory to OS
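This pattern can be illustrated with a toy model (hypothetical, far simpler than GHC's real block allocator: each block is just a counter of the live objects it holds). Pair every long-lived object with a transient one in the same block, free all the transient ones, and no block ever becomes reusable:

```haskell
-- Toy model of block-level fragmentation. NOT GHC's allocator; it only
-- demonstrates the "blocks can't be reused while any object shares them" rule.
import qualified Data.Map.Strict as M

type BlockId = Int

-- Heap = for each block, how many live objects it still holds.
type Heap = M.Map BlockId Int

allocate, free :: BlockId -> Heap -> Heap
allocate b = M.insertWith (+) b 1
free     b = M.adjust (subtract 1) b

-- Blocks that could be returned to the OS: the completely empty ones.
reusable :: Heap -> Int
reusable = M.size . M.filter (== 0)

main :: IO ()
main = do
  -- 1000 blocks, each shared by one long-lived object and one large
  -- transient object (e.g. a query-result ByteString).
  let full  = foldr allocate M.empty ([0 .. 999] ++ [0 .. 999])
      -- every transient object is freed:
      after = foldr free full [0 .. 999]
  print (M.size after)   -- 1000: the heap still holds all 1000 blocks
  print (reusable after) -- 0: none of them can be reused or returned
```

Half the allocated bytes are dead, yet the heap's block count stays at its high-water mark, which is exactly the rts_frag behaviour in the table above.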

**3. Non-large heap data is stable (~1.0-2.2 GB)**

Normal Haskell objects (Maps, TVars, closures, client structures) account for only 1-2 GB. This scales with client count at ~100-130 KB/client and does NOT grow over time.

**4. All tracked data structures are NOT the cause**

- `clientSndQ=0, clientMsgQ=0` — TBQueues empty, no message accumulation
- `smpQSubs` oscillates ~1.0-1.4M — entries are cleaned up, not leaking
- `ntfStore` < 2K entries — negligible
- All proxy agent maps near 0
- `loadedQ=0` — `useCache = False` confirmed working
**5. Source of large objects is unclear without heap profiling**

The 4.5-8.0 GB of large objects could come from:

- PostgreSQL driver (`postgresql-simple`/`libpq`) — pinned ByteStrings for query results
- TLS library (`tls`) — pinned buffers per connection
- Network socket I/O — pinned ByteStrings for recv/send
- SMP protocol message blocks

These cannot be distinguished without `-hT` heap profiling (which is too expensive for this server).
### Root Cause

**GHC heap fragmentation from constant churn of large/pinned ByteString allocations.**

This is not a data structure leak. The live data itself is reasonable (5-10 GB for 15-17K clients). The problem is that GHC's copying GC cannot compact around pinned objects, so the heap grows through fragmentation and never shrinks.
### Mitigation Options

All are RTS flag changes — no rebuild needed, reversible by restart.

**1. `-F1.2`** (reduce GC trigger factor from the default 2.0)
- Triggers a major GC when the heap reaches 1.2x live data instead of 2x
- Reclaims fragmented blocks sooner
- Trade-off: more frequent GC, slightly higher CPU
- Risk: low — just makes GC run more often

**2. Reduce `-A16m` to `-A4m`** (smaller nursery)
- More frequent minor GCs → short-lived pinned objects freed faster
- Trade-off: more GC cycles, but each is smaller
- Risk: low — may actually improve latency by reducing GC pause times

**3. `+RTS -xn`** (nonmoving GC)
- Designed for pinned-heavy workloads — the old generation is collected in place, without copying
- Available since GHC 8.10, improved in 9.x
- Trade-off: different GC characteristics, less battle-tested
- Risk: medium — different GC algorithm, should be tested first

**4. Limit concurrent connections** (application-level)
- Since large objects scale per client, fewer clients = less fragmentation
- Trade-off: reduced capacity
- Risk: low, but impacts users
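Options 1 and 2 combine into a single flag change at start-up. A hypothetical invocation (the binary name and subcommand are assumptions; the other flags are the ones listed under Data Collection):

```
smp-server start +RTS -N -A4m -F1.2 -I0.01 -Iw15 -s -RTS
```

If option 3 tests well, `-xn` would be appended to the same `+RTS ... -RTS` group.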
