runtime (gc): remove recursion from "conservative" GC #1028
Conversation
I think the algorithm can be more efficient: it seems to force more full heap scans than necessary.
If I'm correct, it currently works roughly as follows:
    while stackOverflow:
        stackOverflow = false
        for each pointer in roots:
            mark pointer and descendants
        rescan entire heap
I think the following would be more efficient:

    for each pointer in roots:
        mark pointer and descendants
    while stackOverflow:
        stackOverflow = false
        rescan entire heap
This should be a relatively simple change. For example, I think the marking phase could look like this:

    // Mark phase: mark all reachable objects, recursively.
    // Keep each mark call on the left of ||, so it still runs even when an
    // earlier phase has already set stackOverflow (|| short-circuits).
    stackOverflow := markGlobals()
    stackOverflow = markStack() || stackOverflow
    for stackOverflow {
        stackOverflow = markRoots(poolStart, endBlock.address())
    }

(This of course requires that many GC-related functions return a stackOverflow boolean, or share a global stackOverflow variable.)
Alright, I split the marking into 2 phases as requested.

Let me know when it is ready to retest, please.

I have made the requested changes. @aykevl is there anything else before he retests?
aykevl left a comment:
Looks good to me. @deadprogram it would be great if you could re-test this PR with your MQTT code to make sure it didn't break anything.
    bytesPerBlock = wordsPerBlock * unsafe.Sizeof(heapStart)
    stateBits = 2 // how many bits a block state takes (see blockState type)
    blocksPerStateByte = 8 / stateBits
    markStackSize = 4 * unsafe.Sizeof((*int)(nil)) // number of to-be-marked blocks to queue before forcing a rescan
This number seems arbitrary but I guess it's reasonable (without data it's hard to say anything about it).
I basically decided it would be reasonable to spend 64 bytes on ARM, and then reduced that to 16 on AVR. I just went with 4*(wordWidth^2) bytes spent on the stack, since 256 bytes also seemed fine for AMD64, where there is plenty of memory.
Retested and working as expected. Now merging. Great work @jaddr2line on this important set of fixes!
This PR attempts to fix #436 and hopefully reduce stack usage.
The GC modifications take a similar approach to the MicroPython GC, in that a fixed-size stack is used to track allocations that need to be scanned. If the stack overflows, this forces a full rescan of all marked allocations. This is potentially a lot slower in some extreme cases, but does not blow up the stack like the current version does.
This seems to pass some basic sanity checks, but we should try this out with some larger TinyGo programs before merging it.