-
Notifications
You must be signed in to change notification settings - Fork 404
Fix lost logcontext when using timeout_deferred(...)
#19090
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Fix lost logcontext when using timeout_deferred(...)
#19090
Conversation
This reverts commit 4b9441e.
| if current_context() != calling_context: | ||
| logcontext_error( | ||
| "run_in_background(%s): deferred already completed but the function did not maintain the calling logcontext %s (found %s)" | ||
| % ( | ||
| instance_id, | ||
| calling_context, | ||
| current_context(), | ||
| ) | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra protection when our assumption is wrong.
This did happen for the case we're fixing.
| with PreserveLoggingContext(): | ||
| deferred.cancel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the main fix!
See the Deferred callbacks section of our logcontext docs for more info (specifically using solution 2).
Heads-up, I wrote the docs too so it's my assumptions/understanding all the way down. Apply your own scrutiny.
| # logcontext from finishing as soon as we exit this function, in case `f` | ||
| # returns an awaitable/deferred which would continue running and may try to | ||
| # restore the `loop_call` context when it's done (because it's trying to | ||
| # restore the `call_later` context when it's done (because it's trying to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copy/paste typo
| # Log so we can still see it in the logs like normal | ||
| logger.warning(msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quality of life so these the logcontext errors appear in the logs (_trial_temp/test.log) just like if we never used @logcontext_clean to patch it.
logger.warning(...) is the same as what we're doing in the actual thing:
synapse/synapse/logging/context.py
Lines 105 to 107 in 3b59ac3
| # a hook which can be set during testing to assert that we aren't abusing logcontexts. | |
| def logcontext_error(msg: str) -> None: | |
| logger.warning(msg) |
Perhaps it would be even better if we just called the original implementation alongside raising the AssertionError but the implementation is simple enough for now so just going to duplicate.
| async def test_logcontext_is_not_lost_when_awaiting_on_timeout_cancellation( | ||
| self, | ||
| ) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test reproduces the Expected logging context call_later was lost problem with timeout_deferred(...)
And now passes with the fix introduced in this PR.
| with PreserveLoggingContext(): | ||
| deferred.cancel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In an ideal world, I think it should be possible to call the deferred callback/errbacks/cancel with some logcontext. I spent too much time trying to figure out the intricacies here and trying to use solution 3 from the Deferred callbacks docs but wasn't successful.
It makes me question if what I wrote there is correct in the first place 🤔
The problem is that calling deferred.cancel() can change the logcontext but it's also complete at that point. Because the deferred is already complete, run_in_background(lambda: (deferred.cancel(), deferred)[1]) assumes that the logcontext was unchanged and returns it as-is which leaves the logcontext is messed up for the caller.
I can't tell if the problem is a) our function just isn't following logcontext rules (and how to resolve that well) or b) we should set_current_context(calling_context) regardless of whether the deferred is already complete.
Instead of banging my head against this more, I've opted to go for the simple route with PreserveLoggingContext():. Which also matches what we do elsewhere in the codebase. Something to improve in the future ⏩
…ut-deferred-call-later
Fix lost logcontext when using
timeout_deferred(...)and things actually timeout.Fix #19087 (our HTTP client times out requests using
timeout_deferred(...)Fix #19066 (
/syncusesnotifier.wait_for_events()which usestimeout_deferred(...)under the hood)When/why did these lost logcontext warnings start happening?
In #18828, we switched
timeout_deferred(...)from usingreactor.callLater(...)toclock.call_later(...)under the hood. This meant it started dealing with logcontexts but ourtime_it_out()callback didn't follow our Synapse logcontext rules.Dev notes
Pull Request Checklist
EventStoretoEventWorkerStore.".code blocks.