-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Improve best_effort_wfe_or_timeout #1822
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| // already. This means that repeated calls to this function with the same timeout will work correctly | ||
| // after the first one! This is fine, because we ask callers to use a polling loop on another | ||
| // event variable when using this function | ||
| if (ta_wakes_up_on_or_before(alarm_pool_get_default()->timer, alarm_pool_get_default()->timer_alarm_num, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if the other core (or a preemptive IRQ on the current core) modifies this default timer alarm in the middle of this function call? This call here doesn't look atomic, and the timer may be updated (cancelled, moved further into the future) before the __wfe() just below, which would make the __wfe() wait longer than the requested timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ugh true; that's what i get for trying to write code with COVID!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
although actually the circumstances where it could not get woken up are really just the other core canceling an alarm (or adding the first one further out in the future)... i think adding a SEV at the end of the alarm (IRQ) handler should solve that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's what i get for trying to write code with COVID!
Oh no! You should just rest.
although actually the circumstances where it could not get woken up are really just the other core canceling an alarm (or adding the first one further out in the future)... i think adding a SEV at the end of the alarm (IRQ) handler should solve that.
I think the alarm needs to guarantee that, if its IRQ gets enabled then:
- when the IRQ fires it will do a
__sev() - the alarm time will never be delayed or cancelled, it can only ever be brought forward (earlier) in time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah actually I think I misunderstood what you meant when I read it yesterday… when the irq handler sets a new time, it should never make it later than the current time - it should always slow the previous target to happen first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when the irq handler sets a new time, it should never make it later than the current time
Yes.
Also, it should never cancel/disable an IRQ if it's already set up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, sorry, this is in the context of the alarm for the alarm pool which is never disabled any more - it is doomed to fire every 2^32 us
|
This is targeting |
…e the alarm target backwards; needs a new test case to make sure though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote some test code to reproduce the scenario I think this is trying to fix. Without this change I can see the problem happening, a sev gets swallowed by best_effort_wfe_or_timeout.
After these changes I see that best_effort_wfe_or_timeout will return early due to the event.
|
I should probably merge this into develop, and add another issue for additional tests |
|
I can add a test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm very happy with this :)
Fix some parts of #1812
The problem with the rewrite of the alarm pool code is that adding an alarm forces an IRQ which cases a SEV. We attempted to fix this in
best_effort_wfe_or_timeoutby consuming the event then doing a WFE again. This is clearly wrong if the event was set prior to the call.The best we can do (since we cant tell whether an WFE will block) is to only add the alarm if there isn't an IRQ (we only check this alarm pool alarm) already set to happen on or before the target time (which will cause a SEV). This means that the first call will to this method will always return, but when used in a loop, the second will do the right thing.
This is OK as the prescribed behavior is to use the method in a polling loop