Skip to content

Conversation

@niaow
Copy link
Member

@niaow niaow commented Apr 6, 2020

This PR attempts to fix #436 and hopefully reduce stack usage.

The GC modifications take a similar approach to the MicroPython GC, in that a fixed-size stack is used to track allocations that need to be scanned. If the stack overflows, this forces a full rescan of all marked allocations. This is potentially a lot slower in some extreme cases, but does not blow up the stack like the current version does.

This seems to pass some basic sanity checks, but we should try this out with some larger tinygo programs before merging it.

@niaow niaow requested a review from aykevl April 6, 2020 00:55
@deadprogram
Copy link
Member

Any feedback on this PR @aykevl ?

It seems needed in order to address #1009

Copy link
Member

@aykevl aykevl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the algorithm can be more efficient. I think it forces more full heap scans than necessary.
If I'm correct, it works at the moment rougly as follows:

for each pointer in roots:
    mark pointer and descendants
    while stackOverflow:
        stackOverflow = false
        rescan entire heap

I think the following would be more efficient:

for each pointer in roots:
    mark pointer and descendants
while stackOverflow:
    stackOverflow = false
    rescan entire heap

This should be a relatively simple change. For example, I think the marking phase could look like this:

	// Mark phase: mark all reachable objects, recursively.
	stackOverflow := false
	stackOverflow = stackOverflow || markGlobals()
	stackOverflow = stackOverflow || markStack()
	for stackOverflow {
		stackOverflow = markRoots(poolStart, endBlock.address())
	}

(This of course requires that many GC-related functions return a stackOverflow boolean, or share a global stackOverflow variable).

@niaow
Copy link
Member Author

niaow commented Apr 8, 2020

Alright, I split the marking into 2 phases as requested.

@niaow niaow added this to the v0.13 milestone Apr 8, 2020
@deadprogram
Copy link
Member

Let me know when it is ready to retest, please.

@niaow
Copy link
Member Author

niaow commented Apr 8, 2020

I have made the requested changes. @aykevl is there anything else before he retests?

Copy link
Member

@aykevl aykevl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. @deadprogram it would be great if you could re-test this PR with your MQTT code to make sure it didn't break anything.

bytesPerBlock = wordsPerBlock * unsafe.Sizeof(heapStart)
stateBits = 2 // how many bits a block state takes (see blockState type)
blocksPerStateByte = 8 / stateBits
markStackSize = 4 * unsafe.Sizeof((*int)(nil)) // number of to-be-marked blocks to queue before forcing a rescan
Copy link
Member

@aykevl aykevl Apr 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This number seems arbitrary but I guess it's reasonable (without data it's hard to say anything about it).

Copy link
Member Author

@niaow niaow Apr 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I basically had decided that it would be reasonable to spend 64 bytes on ARM, and then decided to reduce that to 16 on AVR. I just went with 4*(wordWidth^2) bytes spent on the stack, since 256 bytes also seemed fine for AMD64 where there is plenty of memory.

@deadprogram
Copy link
Member

Retested and working as expected.

Now merging, great work @jaddr2line on this important set of fixes!

@deadprogram deadprogram merged commit e077e35 into tinygo-org:dev Apr 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants