diff --git a/changelog.d/18914.doc b/changelog.d/18914.doc new file mode 100644 index 00000000000..9d4f03ade7f --- /dev/null +++ b/changelog.d/18914.doc @@ -0,0 +1 @@ +Explain how Deferred callbacks interact with logcontexts. diff --git a/docs/log_contexts.md b/docs/log_contexts.md index 4e0c0e64f79..bbe9e86827b 100644 --- a/docs/log_contexts.md +++ b/docs/log_contexts.md @@ -143,8 +143,7 @@ cares about. The following sections describe pitfalls and helpful patterns when implementing these rules. -Always await your awaitables ----------------------------- +## Always await your awaitables Whenever you get an awaitable back from a function, you should `await` on it as soon as possible. Do not pass go; do not do any logging; do not @@ -203,6 +202,171 @@ async def sleep(seconds): return await context.make_deferred_yieldable(get_sleep_deferred(seconds)) ``` +## Deferred callbacks + +When a deferred callback is called, it inherits the current logcontext. The deferred +callback chain can resume a coroutine, which if following our logcontext rules, will +restore its own logcontext, then run: + + - until it yields control back to the reactor, setting the sentinel logcontext + - or until it finishes, restoring the logcontext it was started with (calling context) + +This behavior creates two specific issues: + +**Issue 1:** The first issue is that the callback may have reset the logcontext to the +sentinel before returning. This means our calling function will continue with the +sentinel logcontext instead of the logcontext it was started with (bad). + +**Issue 2:** The second issue is that the current logcontext that called the deferred +callback could finish before the callback finishes (bad). + +In the following example, the deferred callback is called with the "main" logcontext and +runs until we yield control back to the reactor in the `await` inside `clock.sleep(0)`. +Since `clock.sleep(0)` follows our logcontext rules, it sets the logcontext to the +sentinel before yielding control back to the reactor. Our `main` function continues with +the sentinel logcontext (first bad thing) instead of the "main" logcontext. Then the +`with LoggingContext("main")` block exits, finishing the "main" logcontext and yielding +control back to the reactor again. Finally, later on when `clock.sleep(0)` completes, +our `with LoggingContext("competing")` block exits, and restores the previous "main" +logcontext which has already finished, resulting in `WARNING: Re-starting finished log +context main` and leaking the `main` logcontext into the reactor which will then +erronously be associated with the next task the reactor picks up. + +```python +async def competing_callback(): + # Since this is run with the "main" logcontext, when the "competing" + # logcontext exits, it will restore the previous "main" logcontext which has + # already finished and results in "WARNING: Re-starting finished log context main" + # and leaking the `main` logcontext into the reactor. + with LoggingContext("competing"): + await clock.sleep(0) + +def main(): + with LoggingContext("main"): + d = defer.Deferred() + d.addCallback(lambda _: defer.ensureDeferred(competing_callback())) + # Call the callback within the "main" logcontext. + d.callback(None) + # Bad: This will be logged against sentinel logcontext + logger.debug("ugh") + +main() +``` + +**Solution 1:** We could of course fix this by following the general rule of "always +await your awaitables": + +```python +async def main(): + with LoggingContext("main"): + d = defer.Deferred() + d.addCallback(lambda _: defer.ensureDeferred(competing_callback())) + d.callback(None) + # Wait for `d` to finish before continuing so the "main" logcontext is + # still active. This works because `d` already follows our logcontext + # rules. If not, we would also have to use `make_deferred_yieldable(d)`. + await d + # Good: This will be logged against the "main" logcontext + logger.debug("phew") +``` + +**Solution 2:** We could also fix this by surrounding the call to `d.callback` with a +`PreserveLoggingContext`, which will reset the logcontext to the sentinel before calling +the callback, and restore the "main" logcontext afterwards before continuing the `main` +function. This solves the problem because when the "competing" logcontext exits, it will +restore the sentinel logcontext which is never finished by its nature, so there is no +warning and no leakage into the reactor. + +```python +async def main(): + with LoggingContext("main"): + d = defer.Deferred() + d.addCallback(lambda _: defer.ensureDeferred(competing_callback())) + d.callback(None) + with PreserveLoggingContext(): + # Call the callback with the sentinel logcontext. + d.callback(None) + # Good: This will be logged against the "main" logcontext + logger.debug("phew") +``` + +**Solution 3:** But let's say you *do* want to run (fire-and-forget) the deferred +callback in the current context without running into issues: + +We can solve the first issue by using `run_in_background(...)` to run the callback in +the current logcontext and it handles the magic behind the scenes of a) restoring the +calling logcontext before returning to the caller and b) resetting the logcontext to the +sentinel after the deferred completes and we yield control back to the reactor to avoid +leaking the logcontext into the reactor. + +To solve the second issue, we can extend the lifetime of the "main" logcontext by +avoiding the `LoggingContext`'s context manager lifetime methods +(`__enter__`/`__exit__`). We can still set "main" as the current logcontext by using +`PreserveLoggingContext` and passing in the "main" logcontext. + + +```python +async def main(): + main_context = LoggingContext("main") + with PreserveLoggingContext(main_context): + d = defer.Deferred() + d.addCallback(lambda _: defer.ensureDeferred(competing_callback())) + # The whole lambda will be run in the "main" logcontext. But we're using + # a trick to return the deferred `d` itself so that `run_in_background` + # will wait on that to complete and reset the logcontext to the sentinel + # when it does to avoid leaking the "main" logcontext into the reactor. + run_in_background(lambda: (d.callback(None), d)[1]) + # Good: This will be logged against the "main" logcontext + logger.debug("phew") + +... + +# Wherever possible, it's best to finish the logcontext by calling `__exit__` at some +# point. This allows us to catch bugs if we later try to erroneously restart a finished +# logcontext. +# +# Since the "main" logcontext stores the `LoggingContext.previous_context` when it is +# created, we can wrap this call in `PreserveLoggingContext()` to restore the correct +# previous logcontext. Our goal is to have the calling context remain unchanged after +# finishing the "main" logcontext. +with PreserveLoggingContext(): + # Finish the "main" logcontext + with main_context: + # Empty block - We're just trying to call `__exit__` on the "main" context + # manager to finish it. We can't call `__exit__` directly as the code expects us + # to `__enter__` before calling `__exit__` to `start`/`stop` things + # appropriately. And in any case, it's probably best not to call the internal + # methods directly. + pass +``` + +The same thing applies if you have some deferreds stored somewhere which you want to +callback in the current logcontext. + + +### Deferred errbacks and cancellations + +The same care should be taken when calling errbacks on deferreds. An errback and +callback act the same in this regard (see section above). + +```python +d = defer.Deferred() +d.addErrback(some_other_function) +d.errback(failure) +``` + +Additionally, cancellation is the same as directly calling the errback with a +`twisted.internet.defer.CancelledError`: + +```python +d = defer.Deferred() +d.addErrback(some_other_function) +d.cancel() +``` + + + + ## Fire-and-forget Sometimes you want to fire off a chain of execution, but not wait for diff --git a/tests/util/test_logcontext.py b/tests/util/test_logcontext.py index 651dd844835..5fecc7fd536 100644 --- a/tests/util/test_logcontext.py +++ b/tests/util/test_logcontext.py @@ -249,6 +249,204 @@ async def competing_callback() -> None: # Back to the sentinel context self._check_test_key("sentinel") + @logcontext_clean + async def test_deferred_callback_await_in_current_logcontext(self) -> None: + """ + Test that calling the deferred callback in the current logcontext ("foo") and + waiting for it to finish in a logcontext blocks works as expected. + + Works because "always await your awaitables". + + Demonstrates one pattern that we can use fix the naive case where we just call + `d.callback(None)` without anything else. See the *Deferred callbacks* section + of docs/log_contexts.md for more details. + """ + clock = Clock(reactor) + + # Sanity check that we start in the sentinel context + self._check_test_key("sentinel") + + callback_finished = False + + async def competing_callback() -> None: + nonlocal callback_finished + try: + # The deferred callback should have the same logcontext as the caller + self._check_test_key("foo") + + with LoggingContext("competing"): + await clock.sleep(0) + self._check_test_key("competing") + + self._check_test_key("foo") + finally: + # When exceptions happen, we still want to mark the callback as finished + # so that the test can complete and we see the underlying error. + callback_finished = True + + with LoggingContext("foo"): + d: defer.Deferred[None] = defer.Deferred() + d.addCallback(lambda _: defer.ensureDeferred(competing_callback())) + self._check_test_key("foo") + d.callback(None) + # The fix for the naive case is here (i.e. things don't work correctly if we + # don't await here). + # + # Wait for `d` to finish before continuing so the "main" logcontext is + # still active. This works because `d` already follows our logcontext + # rules. If not, we would also have to use `make_deferred_yieldable(d)`. + await d + self._check_test_key("foo") + + await clock.sleep(0) + + self.assertTrue( + callback_finished, + "Callback never finished which means the test probably didn't wait long enough", + ) + + # Back to the sentinel context + self._check_test_key("sentinel") + + @logcontext_clean + async def test_deferred_callback_preserve_logging_context(self) -> None: + """ + Test that calling the deferred callback inside `PreserveLoggingContext()` (in + the sentinel context) works as expected. + + Demonstrates one pattern that we can use fix the naive case where we just call + `d.callback(None)` without anything else. See the *Deferred callbacks* section + of docs/log_contexts.md for more details. + """ + clock = Clock(reactor) + + # Sanity check that we start in the sentinel context + self._check_test_key("sentinel") + + callback_finished = False + + async def competing_callback() -> None: + nonlocal callback_finished + try: + # The deferred callback should have the same logcontext as the caller + self._check_test_key("sentinel") + + with LoggingContext("competing"): + await clock.sleep(0) + self._check_test_key("competing") + + self._check_test_key("sentinel") + finally: + # When exceptions happen, we still want to mark the callback as finished + # so that the test can complete and we see the underlying error. + callback_finished = True + + with LoggingContext("foo"): + d: defer.Deferred[None] = defer.Deferred() + d.addCallback(lambda _: defer.ensureDeferred(competing_callback())) + self._check_test_key("foo") + # The fix for the naive case is here (i.e. things don't work correctly if we + # don't `PreserveLoggingContext()` here). + # + # `PreserveLoggingContext` will reset the logcontext to the sentinel before + # calling the callback, and restore the "foo" logcontext afterwards before + # continuing the foo block. This solves the problem because when the + # "competing" logcontext exits, it will restore the sentinel logcontext + # which is never finished by its nature, so there is no warning and no + # leakage into the reactor. + with PreserveLoggingContext(): + d.callback(None) + self._check_test_key("foo") + + await clock.sleep(0) + + self.assertTrue( + callback_finished, + "Callback never finished which means the test probably didn't wait long enough", + ) + + # Back to the sentinel context + self._check_test_key("sentinel") + + @logcontext_clean + async def test_deferred_callback_fire_and_forget_with_current_context(self) -> None: + """ + Test that it's possible to call the deferred callback with the current context + while fire-and-forgetting the callback (no adverse effects like leaking the + logcontext into the reactor or restarting an already finished logcontext). + + Demonstrates one pattern that we can use fix the naive case where we just call + `d.callback(None)` without anything else. See the *Deferred callbacks* section + of docs/log_contexts.md for more details. + """ + clock = Clock(reactor) + + # Sanity check that we start in the sentinel context + self._check_test_key("sentinel") + + callback_finished = False + + async def competing_callback() -> None: + nonlocal callback_finished + try: + # The deferred callback should have the same logcontext as the caller + self._check_test_key("foo") + + with LoggingContext("competing"): + await clock.sleep(0) + self._check_test_key("competing") + + self._check_test_key("foo") + finally: + # When exceptions happen, we still want to mark the callback as finished + # so that the test can complete and we see the underlying error. + callback_finished = True + + # Part of fix for the naive case is here (i.e. things don't work correctly if we + # don't `PreserveLoggingContext(...)` here). + # + # We can extend the lifetime of the "foo" logcontext is to avoid calling the + # context manager lifetime methods of `LoggingContext` (`__enter__`/`__exit__`). + # And we can still set the current logcontext by using `PreserveLoggingContext` + # and passing in the "foo" logcontext. + with PreserveLoggingContext(LoggingContext("foo")): + d: defer.Deferred[None] = defer.Deferred() + d.addCallback(lambda _: defer.ensureDeferred(competing_callback())) + self._check_test_key("foo") + # Other part of fix for the naive case is here (i.e. things don't work + # correctly if we don't `run_in_background(...)` here). + # + # `run_in_background(...)` will run the whole lambda in the current + # logcontext and it handles the magic behind the scenes of a) restoring the + # calling logcontext before returning to the caller and b) resetting the + # logcontext to the sentinel after the deferred completes and we yield + # control back to the reactor to avoid leaking the logcontext into the + # reactor. + # + # We're using a lambda here as a little trick so we can still get everything + # to run in the "foo" logcontext, but return the deferred `d` itself so that + # `run_in_background` will wait on that to complete before resetting the + # logcontext to the sentinel. + # + # type-ignore[call-overload]: This appears like a mypy type inference bug. A + # function that returns a deferred is exactly what `run_in_background` + # expects. + # + # type-ignore[func-returns-value]: This appears like a mypy type inference + # bug. We're always returning the deferred `d`. + run_in_background(lambda: (d.callback(None), d)[1]) # type: ignore[call-overload, func-returns-value] + self._check_test_key("foo") + + await clock.sleep(0) + + self.assertTrue( + callback_finished, + "Callback never finished which means the test probably didn't wait long enough", + ) + + # Back to the sentinel context + self._check_test_key("sentinel") + async def _test_run_in_background(self, function: Callable[[], object]) -> None: clock = Clock(reactor)