@rpmcginty rpmcginty commented Dec 24, 2025

What's in this Change?

This change adds a new function for S3 path tagging, update_path_tags. It allows tagging the object at a given S3 path, or all objects prefixed by that path.

Testing

Added unit tests.

codecov bot commented Dec 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.45%. Comparing base (7ea2dad) to head (a47a526).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #37      +/-   ##
==========================================
+ Coverage   89.39%   89.45%   +0.05%     
==========================================
  Files          37       37              
  Lines        3489     3508      +19     
  Branches      518      525       +7     
==========================================
+ Hits         3119     3138      +19     
  Misses        251      251              
  Partials      119      119              
Files with missing lines Coverage Δ
src/aibs_informatics_aws_utils/s3.py 90.16% <100.00%> (+0.34%) ⬆️

There are three modes for updating tags:
- replace: Replace all existing tags with new tags
- append: Merge new tags with existing tags
- delete: Delete specified tags from existing tags

Worth clarifying that this deletes the specified tag keys regardless of the value provided with each key, i.e., the value doesn't have to match the value of that tag key on the object.
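
For illustration, the three modes and the key-only delete behavior described above could be sketched as a pure function over tag dicts. This is a minimal sketch, not the code in this PR: merge_tags is a hypothetical helper, tags are assumed to be plain str-to-str dicts, and letting new values win on key collisions in append mode is an assumption.

```python
from typing import Dict


def merge_tags(existing: Dict[str, str], new: Dict[str, str], mode: str) -> Dict[str, str]:
    """Hypothetical helper: combine tag dicts according to the update mode."""
    if mode == "replace":
        # Discard whatever was on the object and keep only the new tags.
        return dict(new)
    if mode == "append":
        # Keep existing tags; new values win where keys collide (an assumption).
        return {**existing, **new}
    if mode == "delete":
        # Remove by key only; the values supplied in `new` are ignored.
        return {k: v for k, v in existing.items() if k not in new}
    raise ValueError(f"Unsupported mode: {mode!r}")
```

With this shape, deleting {"a": "anything"} from an object tagged {"a": "1", "b": "2"} leaves {"b": "2"}, even though the supplied value does not match.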

Clarify that values do not matter when deleting tags.
@rpmcginty rpmcginty requested a review from sheriferson January 6, 2026 00:55
s3_paths = list_s3_paths(s3_path=s3_path, **kwargs)
logger.info(f"Updating tags for {len(s3_paths)} objects under prefix {s3_path}")
for nested_s3_path in s3_paths:
    update_path_tags(nested_s3_path, tags, mode, recursive=False, **kwargs)

@njmei njmei Jan 6, 2026

Nit: Loop is probably more "pythonic" compared to recursion

You could create an s3_objects_to_tag: list[S3URI] that gets populated by list_s3_paths if recursive=True; otherwise, s3_objects_to_tag would just be a 1-element list if s3_path is_object. Then just loop over L617-643.
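
A rough sketch of that loop-based shape, under stated assumptions: paths are plain strings rather than S3URI, and the listing, object check, and per-object tagging are passed in as hypothetical callables (list_paths, is_object, tag_object) so the example stands alone. It is not the code in this PR.

```python
from typing import Callable, Dict, List


def update_tags_iteratively(
    s3_path: str,
    tags: Dict[str, str],
    mode: str,
    recursive: bool,
    list_paths: Callable[[str], List[str]],
    is_object: Callable[[str], bool],
    tag_object: Callable[[str, Dict[str, str], str], None],
) -> List[str]:
    """Collect the objects to tag first, then tag them in one flat loop."""
    if recursive:
        # Every object at or under the prefix.
        s3_objects_to_tag = list_paths(s3_path)
    elif is_object(s3_path):
        # Non-recursive: only the single object, if the path is one.
        s3_objects_to_tag = [s3_path]
    else:
        s3_objects_to_tag = []

    for object_path in s3_objects_to_tag:
        tag_object(object_path, tags, mode)
    return s3_objects_to_tag
```

Returning the list of tagged paths is a convenience for the example only.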

Collaborator Author

I think recursion vs. loop is probably not where things are held up. We could try to make this multithreaded, because a lot of the time is spent just waiting for network responses.
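
One way the multithreaded idea could look, as a sketch only: a thread pool fans out the per-object tagging calls, since the work is mostly network waits. The helper name tag_paths_concurrently, the tag_object callable, and the max_workers default are all assumptions for illustration, not part of this PR.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def tag_paths_concurrently(
    paths: Iterable[str],
    tags: Dict[str, str],
    mode: str,
    tag_object: Callable[[str, Dict[str, str], str], None],
    max_workers: int = 8,
) -> List[str]:
    """Tag each path on a worker thread; return the paths whose call raised."""
    failed: List[str] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one task per object and remember which path each future belongs to.
        futures = {pool.submit(tag_object, path, tags, mode): path for path in paths}
        for future, path in futures.items():
            try:
                future.result()  # Re-raises any exception from the worker.
            except Exception:
                failed.append(path)
    return failed
```

Collecting failures instead of raising on the first one is a design choice for the sketch; whether a partial failure should abort the whole update is a separate decision.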


@njmei njmei left a comment

Small nit but otherwise LGTM

@rpmcginty rpmcginty merged commit ca85293 into main Jan 6, 2026
8 checks passed