-
Notifications
You must be signed in to change notification settings - Fork 135
Description
We noticed a very peculiar profile:
The top frame is StackProf.start. It is supposedly occupying 20% of the overall aggregated profile time (~10 out of ~50 minutes). How can this be possible?
Checking the code, it looks like it is technically possible that we can have a signal be received within stackprof_start, as the signal handler is installed but there are a few lines left to execute before we return:
https://github.com/tmm1/stackprof/blob/master/ext/stackprof/stackprof.c#L204-L234
Many of the other variables on _stackprof struct are cleared between runs, but notably, last_sample_at is never reset within stackprof_results
Since we never explicitly acquire and relinquish the GVL within stackprof_start, and never reset the value, this is my theory as to what happened:
- stackprof run A occurs, and on its final sample collection, sets
last_sample_at, t_0 - Some time passes
- stackprof run B occurs, and during the initialization in
stackprof_start, between the installation of the signal handler and before we resetlast_sample_atto the current time, the ruby VM swaps to another thread - Enough wall clock time is spent on the other thread that our timer fires, and we collect a sample
- When calculating the weight of the sample, we use t_0 as the base, instead of t_1, because it has not been reset yet
- Execution returns and we finish executing
stackprof_start, andlast_sample_atis set to t_1, the current time.
I'm not sure if the above is plausible in ruby's threading model, but it's the best I could come up with^.
I was very surprised to see StackProf.start at all in profiles, let alone so prominently. But from what I can tell, the above scenario is a plausible explanation for how this could occur.
Looking at the same interval of time, we do not see it appear in :cpu mode profiles collected. This makes sense though, as those don't use timestamps for their weighting.
The solution to this I think would be some combination of:
- Acquire the GVL during
stackprof_startand release it just before we return - Reset the
last_sample_atduring theresultsfunction, as we do with other variables on the_stackprofstruct.