Skip to content
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
158 changes: 156 additions & 2 deletions docs/log_contexts.md
Original file line number Diff line number Diff line change
Expand Up @@ -143,8 +143,7 @@ cares about.
The following sections describe pitfalls and helpful patterns when
implementing these rules.

Always await your awaitables
----------------------------
## Always await your awaitables

Whenever you get an awaitable back from a function, you should `await` on
it as soon as possible. Do not pass go; do not do any logging; do not
Expand Down Expand Up @@ -203,6 +202,161 @@ async def sleep(seconds):
return await context.make_deferred_yieldable(get_sleep_deferred(seconds))
```

## Deferred callbacks

When a deferred callback is called, it will use the current logcontext.

The deferred callback chain can resume a coroutine, which if following our logcontext
rules, will restore its own logcontext, then run:

- until it yields control back to the reactor, setting the sentinel logcontext
- or until it finishes, restoring the logcontext it was started with (calling context)

The first issue is that the callback may have reset the logcontext to the sentinel
before returning. This means our calling function will continue with the sentinel
logcontext instead of the logcontext it was started with (bad).

The second issue is that the current logcontext that called the deferred callback could
finish before the callback finishes (bad).

In the following example, the deferred callback is called with the "main" logcontext and
runs until we yield control back to the reactor in the `await` inside `clock.sleep(0)`.
Since `clock.sleep(0)` follows our logcontext rules, it sets the logcontext to the
sentinel before yielding control back to the reactor. Our `main` function continues with
the sentinel logcontext (first bad thing) instead of the "main" logcontext. Then the
`with LoggingContext("main")` block exits, finishing the "main" logcontext and yielding
control back to the reactor again. Finally, later on when `clock.sleep(0)` completes,
our `with LoggingContext("competing")` block exits, and restores the previous "main"
logcontext which has already finished, resulting in `WARNING: Re-starting finished log
context main` and leaking the `main` logcontext into the reactor which will then
erronously be associated with the next task the reactor picks up.

```python
async def competing_callback():
# Since this is run with the "main" logcontext, when the "competing"
# logcontext exits, it will restore the previous "main" logcontext which has
# already finished and results in "WARNING: Re-starting finished log context main"
# and leaking the `main` logcontext into the reactor.
with LoggingContext("competing"):
await clock.sleep(0)

def main():
with LoggingContext("main"):
d = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
# Call the callback within the "main" logcontext.
d.callback(None)
# Bad: This will be logged against sentinel logcontext
logger.debug("ugh")

main()
```

We could of course fix this by following the general rule of "always await your
awaitables":

```python
async def main():
with LoggingContext("main"):
d = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
d.callback(None)
# Wait for `d` to finish before continuing so the "main" logcontext is
# still active. This works because `d` already follows our logcontext
# rules. If not, we would also have to use `make_deferred_yieldable(d)`.
await d
# Good: This will be logged against the "main" logcontext
logger.debug("phew")
```

We could also fix this by surrounding the call to `d.callback` with a
`PreserveLoggingContext`, which will reset the logcontext to the sentinel before calling
the callback, and restore the "foo" logcontext afterwards before continuing the `main`
function. This solves the problem because when the "competing" logcontext exits, it will
restore the sentinel logcontext which is never finished by its nature, so there is no
warning and no leakage into the reactor.

```python
async def main():
with LoggingContext("main"):
d = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
d.callback(None)
with PreserveLoggingContext():
# Call the callback with the sentinel logcontext.
d.callback(None)
# Good: This will be logged against the "main" logcontext
logger.debug("phew")
```

But let's say you *do* want to run the deferred callback in the current context without
running into issues:

We can solve the first issue by using `run_in_background(...)` to run the callback in
the current logcontext and it handles the magic behind the scenes of a) restoring the
calling logcontext before returning to the caller and b) resetting the logcontext to the
sentinel after the deferred completes and we yield control back to the reactor to avoid
leaking the logcontext into the reactor.

To solve the second problem, we can extend the lifetime of the "main" logcontext is to
avoid calling the context manager lifetime methods of `LoggingContext`
(`__enter__`/`__exit__`). And we can still set the current logcontext by using
`PreserveLoggingContext` and passing in the "main" logcontext.


```python
async def main():
main_context = LoggingContext("main")
with PreserveLoggingContext(main_context):
d = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
# The whole lambda will be run in the "main" logcontext. But we're using
# a trick to return the deferred `d` itself so that `run_in_background`
# will wait on that to complete and reset the logcontext to the sentinel
# when it does to avoid leaking the "main" logcontext into the reactor.
run_in_background(lambda: (d.callback(None), d)[1])
# Good: This will be logged against the "main" logcontext
logger.debug("phew")

...

# Wherever possible, it's best to finish the logcontext by calling `__exit__` at some
# point. This allows us to catch bugs if we later try to erroneously restart a finished
# logcontext.
#
# Since the "main" logcontext stores the `LoggingContext.previous_context` when it is
# created, we can wrap this call in `PreserveLoggingContext()` to restore the correct
# previous logcontext. Our goal is to have the calling context remain unchanged after
# finishing the "main" logcontext.
with PreserveLoggingContext():
# Finish the "main" logcontext
with main_context:
pass
```

### Deferred errbacks and cancellations

The same care should be taken when calling errbacks on deferreds. An errback and
callback act the same in this regard (see section above).

```python
d = defer.Deferred()
d.addErrback(some_other_function)
d.errback(failure)
```

Additionally, cancellation is the same as directly calling the errback with a
`twisted.internet.defer.CancelledError`:

```python
d = defer.Deferred()
d.addErrback(some_other_function)
d.cancel()
```




## Fire-and-forget

Sometimes you want to fire off a chain of execution, but not wait for
Expand Down
107 changes: 105 additions & 2 deletions tests/util/test_logcontext.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,38 +19,141 @@
#
#


import logging
from typing import Callable, Generator, cast

import twisted.python.failure
from twisted.internet import defer, reactor as _reactor

from synapse.logging.context import (
SENTINEL_CONTEXT,
LoggingContext,
PreserveLoggingContext,
current_context,
make_deferred_yieldable,
nested_logging_context,
run_in_background,
_Sentinel,
)
from synapse.types import ISynapseReactor
from synapse.util import Clock

from .. import unittest

Check failure on line 42 in tests/util/test_logcontext.py

View workflow job for this annotation

GitHub Actions / lint

Ruff (I001)

tests/util/test_logcontext.py:23:1: I001 Import block is un-sorted or un-formatted

logger = logging.getLogger(__name__)

reactor = cast(ISynapseReactor, _reactor)


class LoggingContextTestCase(unittest.TestCase):
def _check_test_key(self, value: str) -> None:
context = current_context()
assert isinstance(context, LoggingContext)
self.assertEqual(context.name, value)
assert isinstance(context, LoggingContext) or isinstance(context, _Sentinel), (
f"Expected LoggingContext({value}) but saw {context}"
)
self.assertEqual(
str(context), value, f"Expected LoggingContext({value}) but saw {context}"
)

def test_with_context(self) -> None:
with LoggingContext("test"):
self._check_test_key("test")

async def test_deferred_asdf1(self) -> None:
clock = Clock(reactor)

# Sanity check that we start in the sentinel context
self._check_test_key("sentinel")

async def competing_callback():
# Since this is run with the "foo" logcontext, when the "competing"
# logcontext exits, it will restore the previous "foo" logcontext which has
# already finished and results in "WARNING: Re-starting finished log context foo"
with LoggingContext("competing"):
await clock.sleep(0)

with LoggingContext("foo"):
d = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
# Call the callback with the "foo" context.
# d.callback(None)
with PreserveLoggingContext():
d.callback(None)

# This will be logged against sentinel logcontext
logger.debug("ugh")

async def test_deferred_asdf2(self) -> None:
clock = Clock(reactor)

# Sanity check that we start in the sentinel context
self._check_test_key("sentinel")

async def competing_callback():
logger.info("competing_callback1")
# Since this is run with the "foo" logcontext, when the "competing"
# logcontext exits, it will restore the previous "foo" logcontext which has
# already finished and results in "WARNING: Re-starting finished log context foo"
with LoggingContext("competing"):
await clock.sleep(0)

with PreserveLoggingContext(LoggingContext("foo")):
d = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
# Call the callback with the "foo" context.
# d.callback(None)
# with PreserveLoggingContext():
# d.callback(None)
# run_in_background(d.callback, None)
# run_in_background(lambda: d.callback(None))
run_in_background(lambda: (d.callback(None), d)[1])
# run_in_background(lambda: defer.ensureDeferred(competing_callback()))

# This will be logged against sentinel logcontext
logger.debug("ugh")

async def test_deferred(self) -> None:
clock = Clock(reactor)

# Sanity check that we start in the sentinel context
self._check_test_key("sentinel")

callback_finished = False

async def competing_callback() -> None:
nonlocal callback_finished
logger.info("competing_callback1")
# The deferred callback should have the same logcontext as the caller
self._check_test_key("one")

logger.info("competing_callback2")
with LoggingContext("competing"):
await clock.sleep(0)
self._check_test_key("competing")

self._check_test_key("one")
logger.info("competing_callback3")
callback_finished = True

with LoggingContext("one"):
d = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
self._check_test_key("one")
d.callback(None)
self._check_test_key("one")
await clock.sleep(0)
self._check_test_key("one")
await clock.sleep(0)

self.assertTrue(
callback_finished,
"Callback never finished which means the test probably didn't wait long enough",
)

# Back to the sentinel context
self._check_test_key("sentinel")

async def test_sleep(self) -> None:
clock = Clock(reactor)

Expand Down
Loading