-
Notifications
You must be signed in to change notification settings - Fork 23
Heap snapshot diffs for accurate memtest leak detection #1372
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🚀 Snapshot Release (
|
Package | Version | Info |
---|---|---|
@graphql-hive/gateway |
1.16.3-alpha-b036abed553a3f749bb6b76a78885feaca2ce77a |
npm ↗︎ unpkg ↗︎ |
@graphql-hive/nestjs |
1.0.22-alpha-b036abed553a3f749bb6b76a78885feaca2ce77a |
npm ↗︎ unpkg ↗︎ |
@graphql-hive/plugin-aws-sigv4 |
1.0.18-alpha-b036abed553a3f749bb6b76a78885feaca2ce77a |
npm ↗︎ unpkg ↗︎ |
🚀 Snapshot Release (Bun Docker Image)The latest changes of this PR are available as image on GitHub Container Registry (based on the declared
|
🚀 Snapshot Release (Node Docker Image)The latest changes of this PR are available as image on GitHub Container Registry (based on the declared
|
Wow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR replaces the existing heap allocation sampling approach with a more accurate heap snapshot diffing method for memory leak detection in memtests. Instead of relying on allocation profiles that can produce false positives, the system now takes heap snapshots at the end of each test run and compares them to identify objects that consistently grow across all snapshots.
- Implements heap snapshot parsing and diffing using a forked Chrome DevTools heap snapshot worker
- Changes default memtest behavior to use heap snapshots instead of heap sampling profiles
- Updates memtest configuration with new defaults and flags for better control
Reviewed Changes
Copilot reviewed 31 out of 55 changed files in this pull request and generated 4 comments.
Show a summary per file
File | Description |
---|---|
internal/perf/src/memtest.ts | Core memtest logic updated to use heap snapshot diffing instead of heap sampling analysis |
internal/perf/src/loadtest.ts | Modified to support both heap snapshots and sampling with improved file naming |
internal/perf/src/heapsnapshot.ts | New module for heap snapshot diffing and leak detection logic |
internal/perf/src/heapsampling.ts | Cleaned up heap sampling code, removed unused heap snapshot analysis |
internal/heapsnapshot/ | Complete new package containing forked Chrome DevTools heap snapshot worker |
internal/perf/tests/heapsnapshot.test.ts | Test suite for heap snapshot leak detection functionality |
Comments suppressed due to low confidence (1)
internal/perf/src/loadtest.ts:284
- [nitpick] The variable name 'filenameSafeStartTime' is misleading since it's not the start time itself but a formatted version of it. Consider renaming to 'formattedStartTime' or 'filenameSafeTimestamp'.
const filenameSafeStartTime = startTime
@@ -1,6 +1,7 @@ | |||
node_modules/ | |||
.DS_Store | |||
dist/ | |||
!/internal/heapsnapshot/dist/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps indicate here why it's commited ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks amazing ! Great work !
ab101c1
to
5e052af
Compare
Ref GW-276
Perform 5 runs of load testing on each memtest; take a heap snapshot at the end of each "calmdown" phase1; compare every heap snapshot consecutively and find leaks by find all objects that grew in size in every2 snapshot.
Because of increased accuracy, there is no need to retry tests as there are not flakey. Also, the tests can run for a shorter time since we're looking for leaks by comparing heap snapshots instead of detecting high frames from allocation sampling. Tests now run for at least half, sometimes third, of the time.
Background
We were using heap allocation sampling profiles that can guess leaks. It does so because it measures how much memory a certain frame allocates over time. If that frame, in total, allocates and frees memory - it wont show up in the test results, but if it allocates and does not free (enough) memory it will show up.
This approach works, but is not very accurate, because there can be frames that are under constant pressure during load-testing that do allocate a lot, but eventually (slower than the others) all of their memory gets freed.
A much more accurate approach would be heap snapshot comparison, looking at which classes/constructors have been growing in allocation size over the compared snapshots.
However, there's not much out there in the ecosystem helping with parsing or comparing heap snapshots.
The only option at the moment is to use Chrome DevTools heap snapshot worker which is fast (leveraging workers), gives accurate results and has primitives for efficiently comparing snapshots. It is tightly coupled for the browser and Chrome machinery. Meaning, it needs to be forked, analysed how the heap snapshot worker is used (DevTools have no programmatic documentation, just code) and finally stripped of its browser requirements implementing it for Node (and other JS) environments - which has been done in this PR.
TODO
Footnotes
After each load testing, there is a "calmdown" phase which triggers garbage collection (through the DevTools inspector protocol) and waits for a while to let the GC do it thing. ↩
If an object grew in one heap snapshot, but not another - we do not consider it a leak. ↩