-
Notifications
You must be signed in to change notification settings - Fork 49
Replace manual LoggingContext usage with ModuleApi.defer_to_threadpool
#134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
||
|
|
||
| def s3_download_task(s3_client, bucket, key, extra_args, deferred, parent_logcontext): | ||
| def s3_download_task(s3_client, bucket, key, extra_args, deferred): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only real changes here are:
- Remove
parent_logcontext. - Removing
with LoggingContext ...and de-indenting all of the code underneath it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these changes make sense. s3_download_task will use whatever the caller logcontext is.
And I think we maintain the logcontext when calling s3_download_task ✅ (at-least with the suggested patterns).
Attempt to replace manual usage of LoggingContext with the provided module API's `run_in_background` method.
77b1bc5 to
5bdb5d9
Compare
|
|
||
|
|
||
| def s3_download_task(s3_client, bucket, key, extra_args, deferred, parent_logcontext): | ||
| def s3_download_task(s3_client, bucket, key, extra_args, deferred): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these changes make sense. s3_download_task will use whatever the caller logcontext is.
And I think we maintain the logcontext when calling s3_download_task ✅ (at-least with the suggested patterns).
We also make `store_file` and `fetch` async, as they are async in the base class. This also simplifies the implementation. We could go through and convert the whole module from deferreds to async, but that should be done separately.
|
190686c makes use of This module previously used its own threadpool ( The default threadpool size in Twisted is 10. I worry that switching to Synapse's default threadpool will hurt the performance of this module, with no way for users to configure it. Perhaps we should additionally expose |
s3_storage_provider.py
Outdated
| s3_download_task( | ||
| self._get_s3_client(), self.bucket, self.prefix + path, self.extra_args, d, logcontext | ||
| ) | ||
| await self._module_api.defer_to_thread( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This module previously used its own threadpool (
self._s3_pool), and even had an option to configure the size (threadpool_size, default 40). In previous issues, we've suggested users configure boto3's connection count to match the configured threadpool size: #117The default threadpool size in Twisted is 10. I worry that switching to Synapse's default threadpool will hurt the performance of this module, with no way for users to configure it.
Can we recommend people increase the size of the the default Twisted threadpool?
Is there any difference in having a separate threadpool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed in #synapse-dev:matrix.org,
Here is why a separate threadpool is important:
I'd be worried about s3 monopolising the thread pool and blocking other operations (like DNS lookups)
-- @richvdh
As @anoadragon453 points out, DNS lookups have their own threadpool already but the point still stands generally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we recommend people increase the size of the the default Twisted threadpool?
We don't currently have such an option in Synapse, but it would be better than nothing.
Perhaps we should try deploying the module with the default threadpool on matrix.org and see if performance suffers? I'm just worried that if we go ahead with not adding any way to configure the threadpool size, yet require people to upgrade this module to use the latest Synapse, they could be stuck between a rock and a hard place.
Otherwise, I think:
Perhaps we should additionally expose
defer_to_threadpoolto modules, allowing modules to specify a threadpool? Though this would break compatibility with older versions of Synapse. We could check if the method existed onModuleApiand use the default threadpool if not.
May be the way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we recommend people increase the size of the the default Twisted threadpool?
The size of the default Twisted threadpool can only be increased through code, i.e.:
we don't provide a configuration option which tweaks this currently, so users are unable to increase the size of the default threadpool.
Above I suggested exposing X to modules. Another alternative is to add an argument to the already-exposed defer_to_thread ModuleApi method to allow specifying a threadpool, defaulting to the default Twisted threadpool if not provided. It would then use defer_to_threadpool under the hood instead of defer_to_thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds workable 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried the latter approach in element-hq/synapse#19032.
|
FYI I tried this branch on jki.re with latest develop and no new media is being downloaded. It seems to fail with: |
ModuleApi.run_in_backgroundModuleApi.defer_to_threadpool
As introduced by element-hq/synapse#19032.
Discovered by @erikj: > the thread is waiting for synapse to use the S3Responder, but the responder isn't returned to Synapse until the thread is finished
> the `d` we pass in to `s3_download_task` gets resolved once we connect to S3, and the thread is concluded only once we finish the download.
|
Thanks for the reviews both! My theory as to why this PR passed the integration test CI, but failed on @erikjohnston's homeserver, is that the integration tests do not delete the media from Synapse's local storage after it's uploaded. Thus we don't end up actually fetching the media from the minIO S3 storage upon download. I'll have a look at modifying the tests to actually do that. |
| # DO await on `d`, as it will resolve once a connection to S3 has been | ||
| # opened. We only want to return to Synapse once we can start streaming | ||
| # chunks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cross-linking internal discussion where the cause of the Exception: Timed out waiting to resume was figured out.
|
Using While Synapse v1.140.0rc1 warns about Users on the current stable Synapse release (v1.139.0) are better off staying with |
…age_provider_installation_old_boto_workaround_enabled`" This reverts commit 2b0ea94. We're going back to s3-storage-provider=v1.5.0 Ref: matrix-org/synapse-s3-storage-provider#134 (comment)
|
@spantaleev thanks for raising! We completely forgot to signal that. I've added a warning to the top of the latest release notes: https://github.com/matrix-org/synapse-s3-storage-provider/releases/tag/v1.6.0. Synapse 1.140.0 is expected to be released tomorrow (as long as no other regressions are found). |
Attempt to replace manual usage of
LoggingContextwith the provided module API'srun_in_backgroundmethod.I'm not entirely convinced about the changes to
fetch(and subsequentlys3_download_task). The fact we hand it a deferred directly is confusing me.Spawning from #133. #74 can be closed after this PR is merged.