Skip to content
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
42 changes: 9 additions & 33 deletions synapse/logging/context.py
Original file line number Diff line number Diff line change
Expand Up @@ -857,8 +857,11 @@ def run_in_background(

# The deferred has already completed
if d.called and not d.paused:
# The function should have maintained the logcontext, so we can
# optimise out the messing about
# The function should have maintained the calling logcontext, so we can avoid
# messing with it further. Additionally, if the deferred has already completed,
# then it would be a mistake to then add a deferred callback (below) to reset
# the logcontext to the sentinel logcontext as that would run immediately
# (remember our goal is to maintain the calling logcontext).
return d

# The function may have reset the context before returning, so we need to restore it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're trusting that the function maintained the calling logcontext just above, why here do we not trust that the function maintained the calling logcontext?

Copy link
Contributor Author

@MadLittleMods MadLittleMods Sep 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This plays into the Synapse logcontext rules:

**Rules for functions returning awaitables:**
> - If the awaitable is already complete, the function returns with the
> same logcontext it started with.
> - If the awaitable is incomplete, the function clears the logcontext
> before returning; when the awaitable completes, it restores the
> logcontext before running any callbacks.

I've updated the comments with some further details:

# The deferred has already completed
if d.called and not d.paused:
# If the function messes with logcontexts, we can assume it follows the Synapse
# logcontext rules (Rules for functions returning awaitables: "If the awaitable
# is already complete, the function returns with the same logcontext it started
# with."). If it function doesn't touch logcontexts at all, we can also assume
# the logcontext is unchanged.
#
# Either way, the function should have maintained the calling logcontext, so we
# can avoid messing with it further. Additionally, if the deferred has already
# completed, then it would be a mistake to then add a deferred callback (below)
# to reset the logcontext to the sentinel logcontext as that would run
# immediately (remember our goal is to maintain the calling logcontext when we
# return).
return d
# Since the function we called may follow the Synapse logcontext rules (Rules for
# functions returning awaitables: "If the awaitable is incomplete, the function
# clears the logcontext before returning"), the function may have reset the
# logcontext before returning, so we need to restore the calling logcontext now
# before we return ourselves.
#
# Our goal is to have the caller logcontext unchanged after firing off the
# background task and returning.
set_current_context(calling_context)
# If the function we called is playing nice and following the Synapse logcontext
# rules, it will restore original calling logcontext when the deferred completes;
# but there is nothing waiting for it, so it will get leaked into the reactor (which
# would then get picked up by the next thing the reactor does). We therefore need to
# reset the logcontext here (set the `sentinel` logcontext) before yielding control
# back to the reactor.
#
# (If this feels asymmetric, consider it this way: we are
# effectively forking a new thread of execution. We are
# probably currently within a ``with LoggingContext()`` block,
# which is supposed to have a single entry and exit point. But
# by spawning off another deferred, we are effectively
# adding a new exit point.)
d.addBoth(_set_context_cb, SENTINEL_CONTEXT)

Let me know if that makes more sense otherwise we can continue iterating in another PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so there's a difference in the potential current logcontext dependent on whether the awaitable is complete or incomplete. That makes sense why we don't need to bother when it has completed.

Thanks for the clarification and the extra comments!

Expand Down Expand Up @@ -894,10 +897,9 @@ def run_coroutine_in_background(
Useful for wrapping coroutines that you don't yield or await on (for
instance because you want to pass it to deferred.gatherResults()).

This is a special case of `run_in_background` where we can accept a
coroutine directly rather than a function. We can do this because coroutines
do not run until called, and so calling an async function without awaiting
cannot change the log contexts.
Comment on lines -899 to -900
Copy link
Contributor Author

@MadLittleMods MadLittleMods Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this last sentence because this isn't true on multiple levels:

  1. Calling an async function will run immediately until it yields (hits an await for an incomplete awaitable)
  2. Calling an async function can change the logcontext (and happens all the time). This is exactly why we set the logcontext back to the calling_context before returning.

This is a special case of `run_in_background` where we can accept a coroutine
directly rather than a function. We can do this because coroutines do not continue
running once they have yielded.

This is an ergonomic helper so we can do this:
```python
Expand All @@ -908,33 +910,7 @@ def run_coroutine_in_background(
run_in_background(lambda: func1(arg1))
```
"""
calling_context = current_context()

# Wrap the coroutine in a deferred, which will have the side effect of executing the
# coroutine in the background.
d = defer.ensureDeferred(coroutine)

Copy link
Contributor Author

@MadLittleMods MadLittleMods Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To fix the root problem, all we need to do is add this same fix that run_in_background already has for already completed deferreds:

# The deferred has already completed
if d.called and not d.paused:
# The function should have maintained the logcontext, so we can
# optimise out the messing about
return d

But instead of duplicating all of this specialty logic and context into run_coroutine_in_background(...), we can just simplify to using run_in_background(...). Especially when run_coroutine_in_background(...) is just an ergonomic wrapper around run_in_background(...)

See #18900 (comment) for more information on how this shortcut and the logcontext logic works for run_in_background(...)

Related conversation where I asked why we even have run_coroutine_in_background(...) -> #18900 (comment)

# The function may have reset the context before returning, so we need to restore it
# now.
#
# Our goal is to have the caller logcontext unchanged after firing off the
# background task and returning.
set_current_context(calling_context)

# The original logcontext will be restored when the deferred completes, but
# there is nothing waiting for it, so it will get leaked into the reactor (which
# would then get picked up by the next thing the reactor does). We therefore
# need to reset the logcontext here (set the `sentinel` logcontext) before
# yielding control back to the reactor.
#
# (If this feels asymmetric, consider it this way: we are
# effectively forking a new thread of execution. We are
# probably currently within a ``with LoggingContext()`` block,
# which is supposed to have a single entry and exit point. But
# by spawning off another deferred, we are effectively
# adding a new exit point.)
d.addBoth(_set_context_cb, SENTINEL_CONTEXT)
return d
return run_in_background(lambda: coroutine)


T = TypeVar("T")
Expand Down
85 changes: 85 additions & 0 deletions tests/util/test_logcontext.py
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@
current_context,
make_deferred_yieldable,
nested_logging_context,
run_coroutine_in_background,
run_in_background,
)
from synapse.types import ISynapseReactor
Expand Down Expand Up @@ -332,6 +333,90 @@ async def testfunc() -> None:

return self._test_run_in_background(testfunc)

@logcontext_clean
async def test_run_coroutine_in_background(self) -> None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New run_coroutine_in_background tests are here.

We could use a similar pattern to run_in_background(...) where we have a test helper that shares a bunch of the logic. Given we only have to test two variants with coroutines, I've opted not to dry this out.

"""
Test `run_coroutine_in_background`
"""
clock = Clock(reactor)

# Sanity check that we start in the sentinel context
self._check_test_key("sentinel")

callback_finished = False

async def competing_callback() -> None:
nonlocal callback_finished
try:
# The callback should have the same logcontext as the caller
self._check_test_key("foo")

with LoggingContext("competing"):
await clock.sleep(0)
self._check_test_key("competing")

self._check_test_key("foo")
finally:
# When exceptions happen, we still want to mark the callback as finished
# so that the test can complete and we see the underlying error.
callback_finished = True

with LoggingContext("foo"):
run_coroutine_in_background(competing_callback())
self._check_test_key("foo")
await clock.sleep(0)
self._check_test_key("foo")

self.assertTrue(
callback_finished,
"Callback never finished which means the test probably didn't wait long enough",
)

# Back to the sentinel context
self._check_test_key("sentinel")

# @logcontext_clean
async def test_run_coroutine_in_background_already_complete(self) -> None:
"""
Test `run_coroutine_in_background` with a coroutine that is already complete
"""
# Sanity check that we start in the sentinel context
self._check_test_key("sentinel")

callback_finished = False

async def competing_callback() -> None:
nonlocal callback_finished
try:
# The callback should have the same logcontext as the caller
self._check_test_key("foo")

with LoggingContext("competing"):
# We `await` here but there is nothing to wait for here since the
# deferred is already complete so we should immediately continue
# executing in the same context.
await defer.succeed(None)

self._check_test_key("competing")

self._check_test_key("foo")
finally:
# When exceptions happen, we still want to mark the callback as finished
# so that the test can complete and we see the underlying error.
callback_finished = True

with LoggingContext("foo"):
run_coroutine_in_background(competing_callback())
self._check_test_key("foo")

self.assertTrue(
callback_finished,
"Callback never finished which means the test probably didn't wait long enough",
)

# Back to the sentinel context
self._check_test_key("sentinel")

@logcontext_clean
@defer.inlineCallbacks
def test_make_deferred_yieldable(
Expand Down
Loading