Skip to content

Conversation

@elyesbenamor
Copy link

Closes #121
Closes #115
Closes #96
Closes #28

Changes Made

  1. Fixed file handling logic in store_file method:
  • store_local: true with retention - files are kept locally for the specified duration
  • S3 upload functionality - files are properly uploaded and stored
  • Verified configuration parsing for boolean values and retention settings
  1. Improved configuration handling:

    • Fixed boolean parsing to handle YAML native boolean values correctly
    • Moved media_retention settings outside the config block
    • Added proper defaults when configuration values are not set
  2. Added retention features:

    • Implemented remote_media_lifetime functionality to delete files from S3 after specified duration
    • Files are kept indefinitely in S3 if remote_media_lifetime is not set
    • Added scheduled deletion using reactor.callLater for both local and remote files
  3. Enhanced logging:

    • Added detailed logging for file operations
    • Improved error messages for failed uploads and deletions
    • Added logging for retention scheduling

Documentation

  • Updated README with clearer examples and explanations of retention settings
  • Added documentation for the remote_media_lifetime feature

@elyesbenamor elyesbenamor requested a review from a team as a code owner January 23, 2025 21:26
@elyesbenamor

This comment was marked as spam.

@drallgood

This comment was marked as off-topic.

@elyesbenamor

This comment was marked as off-topic.

@dynamyc010
Copy link

Would this clean up files already not in the database, or would I have to manually painstakingly clean up myself after this rolls out? I don't want to set a temporary super short retention time so it'll clean up files already "gone" according to Synapse (like when accounts get deleted and such)

Copy link
Member

@anoadragon453 anoadragon453 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your efforts on the PR, and apologies for the slow review response.

I've had a brief look at this PR, but there are several red flags that signal "sloppy AI" to me. Please review the comments I've left and clean up the PR a bit before the next round of review.

In addition, when making multiple distinct changes ("Fixed file handling logic in store_file method", "Added retention features"), please open separate PRs for each to make review simpler.

Comment on lines +90 to +111
s3_config = config.get("config", {})
if not s3_config:
logger.warning("No config block found, falling back to root-level settings")
self.bucket = config.get("bucket")
self.api_kwargs = {
"region_name": config.get("region_name"),
"endpoint_url": config.get("endpoint_url"),
"aws_access_key_id": config.get("access_key_id"),
"aws_secret_access_key": config.get("secret_access_key"),
}
self.prefix = config.get("prefix", "")
self.extra_args = config.get("extra_args", {})
else:
self.bucket = s3_config.get("bucket")
self.api_kwargs = {
"region_name": s3_config.get("region_name"),
"endpoint_url": s3_config.get("endpoint_url"),
"aws_access_key_id": s3_config.get("access_key_id"),
"aws_secret_access_key": s3_config.get("secret_access_key"),
}
self.prefix = s3_config.get("prefix", "")
self.extra_args = s3_config.get("extra_args", {})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config contains the contents of the config key. It won't contain a config key itself, so this if isn't necessary.

Comment on lines +131 to +136
"""Get or create an S3 client."""
if self._s3_client is None:
with self._s3_client_lock:
if self._s3_client is None:
self._s3_client = boto3.client("s3", **self.api_kwargs)
return self._s3_client
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have you removed the comments from this method?

# all the data has been written (or there has been a fatal error).
self.deferred = defer.Deferred()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many instances of whitespace being added unnecessarily. Please remove them.

from setuptools import setup

__version__ = "1.5.0"
__version__ = "1.5.1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't update the version in a feature PR. We'll do that during a release.

Comment on lines +137 to +138
self.local_media_lifetime = parse_duration(media_retention.get("local_media_lifetime"))
self.remote_media_lifetime = parse_duration(media_retention.get("remote_media_lifetime"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse_duration will raise an exception if media_retention.get returns None.

Comment on lines -298 to -308
# Triggered by responder when more data has been requested (or
# stop_event has been triggered)
self.wakeup_event = threading.Event()
# Trigered by responder when we should abort the download.
self.stop_event = threading.Event()

# The consumer we're registered to
self.consumer = None

# The deferred returned by write_to_consumer, which should resolve when
# all the data has been written (or there has been a fatal error).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, why are you removing only comments?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

4 participants