fix: reduce in-process cluster memory usage by huntharo · Pull Request #33 · pwrdrvr/ghcrawl

huntharo · 2026-03-24T17:54:11Z

Summary

I added opt-in heap diagnostics for cluster and refresh, then reduced peak memory usage in the in-process clustering path by loading and normalizing one source kind at a time instead of materializing the full parsed embedding cache for the repo up front.

What I changed

I added --heap-snapshot-dir and --heap-log-interval-ms to the CLI cluster/refresh flows so I can capture heap snapshots and memory samples during long runs.
I added a SIGUSR2 heap snapshot hook and lightweight memory logging for cluster investigations.
I changed the non-worker clustering path to stream source kinds one at a time instead of building a repo-wide parsed embedding cache before edge generation.
I added a regression test that asserts in-process clustering no longer populates the parsed embedding cache.

Verification

I ran pnpm --filter ghcrawl test.
I ran pnpm --filter @ghcrawl/api-core test.
I reproduced the OOM on origin/main with:
- pnpm --filter ghcrawl cli cluster openclaw/openclaw --heap-snapshot-dir /tmp/ghcrawl-heaps-origin-main --heap-log-interval-ms 5000
I reran the same clustering flow on this branch and confirmed it progressed into repeated edge-building updates instead of aborting immediately after loading embeddings.

Notes

I also checked the vectorlite experiment path separately and it still shows the same class of memory problem; I left that follow-up out of this PR to keep the scope tight.

(cherry picked from commit 0308a94f6b65257c8f4a28a5a68c971b882e785b)

(cherry picked from commit 509de2e838f47cadc1934a7156b7a01846d2bc67)

github-actions · 2026-03-24T17:55:12Z

Cluster Performance

Status: PASS
Fixture median: 445.2 ms (12 samples, 3 cluster rebuilds/sample)
Fixture baseline: 535.1 ms
Fixture delta: -89.9 ms (-16.8%)
Projected openclaw/openclaw duration: 8m 19.2s
Projected openclaw/openclaw baseline: 10m 0.0s
Projected delta: -100810.7 ms (-16.8%)
Regression threshold: +50.0%
Fixture shape: 512 threads x 3 source kinds
Sample durations: 449.5 ms, 460.1 ms, 441.0 ms, 450.4 ms, 442.2 ms, 448.2 ms, 437.4 ms, 574.0 ms, 437.3 ms, 453.2 ms, 434.7 ms, 441.1 ms

Run: workflow run for 43a9de8

huntharo added 2 commits March 24, 2026 13:53

feat: add heap diagnostics for cluster runs

769c566

(cherry picked from commit 0308a94f6b65257c8f4a28a5a68c971b882e785b)

fix: reduce in-process cluster memory usage

43a9de8

(cherry picked from commit 509de2e838f47cadc1934a7156b7a01846d2bc67)

huntharo merged commit 7fb20f1 into main Mar 24, 2026
8 checks passed

huntharo deleted the codex/cluster-oom-fix-pr branch March 24, 2026 19:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: reduce in-process cluster memory usage#33

fix: reduce in-process cluster memory usage#33
huntharo merged 2 commits intomainfrom
codex/cluster-oom-fix-pr

huntharo commented Mar 24, 2026

Uh oh!

github-actions bot commented Mar 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

huntharo commented Mar 24, 2026

Summary

What I changed

Verification

Notes

Uh oh!

github-actions bot commented Mar 24, 2026

Cluster Performance

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant