-
Notifications
You must be signed in to change notification settings - Fork 2k
Simplify tagging #6165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Simplify tagging #6165
Conversation
|
Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `beets/autotag/hooks.py:48-57` </location>
<code_context>
+def correct_list_fields(data: JSONDict) -> JSONDict:
</code_context>
<issue_to_address>
**issue (bug_risk):** The correct_list_fields function mutates its input dictionary in place.
This dual behavior may cause unexpected side effects for callers. Recommend copying the input or clearly documenting the mutation.
</issue_to_address>
### Comment 2
<location> `beets/autotag/hooks.py:88-89` </location>
<code_context>
class AttrDict(dict[str, V]):
"""Mapping enabling attribute-style access to stored metadata values."""
def copy(self) -> Self:
- return deepcopy(self)
+ return self.__class__(**deepcopy(self))
def __getattr__(self, attr: str) -> V:
</code_context>
<issue_to_address>
**suggestion:** AttrDict.copy uses **deepcopy(self), which may not preserve all dict semantics.
deepcopy(self) may fail for keys that are not valid identifiers or contain special characters. Using self.__class__(deepcopy(dict(self))) ensures all keys are handled correctly.
```suggestion
def copy(self) -> Self:
return self.__class__(deepcopy(dict(self)))
```
</issue_to_address>
### Comment 3
<location> `beets/autotag/hooks.py:120-129` </location>
<code_context>
+ def nullable_fields(cls) -> set[str]:
+ return set(config["overwrite_null"][cls.type.lower()].as_str_seq())
+
@cached_property
def name(self) -> str:
raise NotImplementedError
+ @cached_property
+ def raw_data(self) -> JSONDict:
+ """Provide metadata with artist credits applied when configured."""
+ data = self.copy()
+ if config["artist_credit"]:
+ data.update(
+ artist=self.artist_credit or self.artist,
+ artists=self.artists_credit or self.artists,
+ )
+ return correct_list_fields(data)
+
+ @cached_property
+ def item_data(self) -> JSONDict:
+ """Metadata for items with field mappings and exclusions applied.
+
</code_context>
<issue_to_address>
**issue (bug_risk):** item_data uses dict union with data | {v: data.pop(k) ...}, which mutates data.
Using data.pop(k) within the dict comprehension alters the original data dictionary, which could cause issues if data is accessed elsewhere. To avoid side effects, create a new dictionary without mutating data.
</issue_to_address>
### Comment 4
<location> `test/autotag/test_hooks.py:186-194` </location>
<code_context>
+ assert self.items[0].month == 0
+ assert self.items[0].day == 0
+
+ def test_missing_date_applies_nothing(self):
+ self.items = [Item(year=1, month=2, day=3)]
+ self.info.update(year=None, month=None, day=None)
+
+ self._apply()
+
+ assert self.items[0].year == 1
+ assert self.items[0].month == 2
+ assert self.items[0].day == 3
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a test for partial date updates (e.g., only year or only month present).
Adding tests for cases where only some date fields are updated will help verify correct handling of partial updates and prevent unintended data changes.
```suggestion
def test_missing_date_applies_nothing(self):
self.items = [Item(year=1, month=2, day=3)]
self.info.update(year=None, month=None, day=None)
self._apply()
assert self.items[0].year == 1
assert self.items[0].month == 2
assert self.items[0].day == 3
def test_partial_date_update_year_only(self):
self.items = [Item(year=1, month=2, day=3)]
self.info.update(year=2020, month=None, day=None)
self._apply()
assert self.items[0].year == 2020
assert self.items[0].month == 0
assert self.items[0].day == 0
def test_partial_date_update_month_only(self):
self.items = [Item(year=1, month=2, day=3)]
self.info.update(year=None, month=5, day=None)
self._apply()
assert self.items[0].year == 1
assert self.items[0].month == 5
assert self.items[0].day == 0
def test_partial_date_update_day_only(self):
self.items = [Item(year=1, month=2, day=3)]
self.info.update(year=None, month=None, day=15)
self._apply()
assert self.items[0].year == 1
assert self.items[0].month == 2
assert self.items[0].day == 15
```
</issue_to_address>
### Comment 5
<location> `test/autotag/test_hooks.py:162-167` </location>
<code_context>
+ },
+ ]
+
+ def test_autotag_items(self):
+ self._apply()
+
+ keys = self.expected_tracks[0].keys()
+ get_values = operator.itemgetter(*keys)
+
+ applied_data = [
+ dict(zip(keys, get_values(dict(i)))) for i in self.items
+ ]
+
+ assert applied_data == self.expected_tracks
+
+ def test_artist_credit_prefers_artist_over_albumartist_credit(self):
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a test for 'from_scratch' import configuration.
Please add a test that enables 'from_scratch' and checks that items are cleared before metadata is applied.
```suggestion
def test_from_scratch_clears_items_before_applying_metadata(self):
# Enable 'from_scratch' configuration
self.config['from_scratch'] = True
# Simulate items with pre-existing metadata
for item in self.items:
item.artist = "PreExistingArtist"
item.title = "PreExistingTitle"
# Assert items have pre-existing metadata
for item in self.items:
assert item.artist == "PreExistingArtist"
assert item.title == "PreExistingTitle"
# Apply autotag with 'from_scratch' enabled
self._apply()
# Assert items are cleared before metadata is applied
for item in self.items:
assert item.artist != "PreExistingArtist"
assert item.title != "PreExistingTitle"
# Assert metadata is correctly applied
keys = self.expected_tracks[0].keys()
get_values = operator.itemgetter(*keys)
applied_data = [
dict(zip(keys, get_values(dict(i)))) for i in self.items
]
assert applied_data == self.expected_tracks
def test_artist_credit_prefers_artist_over_albumartist_credit(self):
self.info.tracks[0].update(artist="oldArtist", artist_credit=None)
self._apply(artist_credit=True)
assert self.items[0].artist == "oldArtist"
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6165 +/- ##
==========================================
+ Coverage 68.68% 68.70% +0.02%
==========================================
Files 138 138
Lines 18532 18509 -23
Branches 3061 3053 -8
==========================================
- Hits 12729 12717 -12
+ Misses 5149 5144 -5
+ Partials 654 648 -6
🚀 New features to boost your workflow:
|
8b417f1 to
5f8ca35
Compare
1cca6b5 to
1ef5f76
Compare
5f8ca35 to
9514f7d
Compare
c766ec8 to
56126ad
Compare
9514f7d to
0b570a9
Compare
56126ad to
a32c45e
Compare
dfb48ad to
05bc57f
Compare
0b570a9 to
02f3cb7
Compare
02f3cb7 to
188f6ab
Compare
05bc57f to
f214346
Compare
|
@semohr @henry-oberholtzer @JOJ0 would be happy to have your eyes over this as I think this is a significant piece of work making autotagging somewhat simpler (hopefully!). |
188f6ab to
a88e477
Compare
f214346 to
22e5461
Compare
I'll try to find some time after first wave of xmadness is over |
527d77e to
829ca5c
Compare
22e5461 to
acac74b
Compare
829ca5c to
c966f17
Compare
acac74b to
8b60550
Compare
c966f17 to
9d6df17
Compare
8b60550 to
d9ed74d
Compare
9d6df17 to
163e153
Compare
d9ed74d to
d5eeb26
Compare
163e153 to
cf72187
Compare
d5eeb26 to
cf31d00
Compare
cf72187 to
60b4a38
Compare
Consolidate multiple granular test methods in ApplyTest into a single comprehensive test that validates all applied metadata at once. This improves test maintainability and clarity by: - Replacing ~20 individual test methods with one data-driven test - Using expected data dictionaries to validate all fields together - Removing ApplyCompilationTest class (covered by va=True in main test) - Keeping focused tests for edge cases (artist_credit, date handling) - Switching from BeetsTestCase to standard TestCase for speed - Adding operator import for efficient data extraction The new approach makes it easier to validate all applied metadata at once.
cf31d00 to
ed73a6f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request refactors the item tagging system to move metadata application logic into Match objects and fix synchronization issues for artist-related fields. The changes address bugs where artist_sort, artists_sort, artist_credit, and artists_credit were not properly synchronized, and fix the overwrite_null configuration that was previously ignored for certain fields.
Key Changes:
- Introduced
Match.apply_metadata()methods to replace legacy free functions, centralizing metadata application logic within match objects. - Added structured metadata facilities to
Infoclasses includingraw_dataanditem_dataproperties for transforming metadata before application. - Moved and enhanced
correct_list_fields()to synchronize single/list field pairs for artist, albumtype, and related fields.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| test/test_autotag.py | Removed legacy tests that were moved to new test files |
| test/plugins/test_edit.py | Updated assertions to reflect that albumartist and albumartists are now modified alongside titles |
| test/autotag/test_match.py | New test file for track assignment logic, moved from test_autotag.py |
| test/autotag/test_hooks.py | New comprehensive test suite for metadata application, overwrite behavior, and list field synchronization |
| beetsplug/mbsync.py | Updated to use TrackMatch.apply_metadata() and AlbumMatch.apply_metadata() instead of removed functions |
| beetsplug/bpsync.py | Updated to use TrackMatch.apply_metadata() and AlbumMatch.apply_metadata() instead of removed functions |
| beets/importer/tasks.py | Simplified metadata application to delegate to match objects; improved fallback logic for album artist fields |
| beets/autotag/match.py | Updated tag_item() to pass Item objects into TrackMatch constructor |
| beets/autotag/hooks.py | Major refactoring introducing Info.raw_data, Info.item_data, Match.apply_metadata(), and enhanced correct_list_fields() |
| beets/autotag/init.py | Removed legacy metadata application functions and related globals |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| assert self.items[1].artist == "albumArtist" | ||
|
|
||
| def test_date_only_zeroes_month_and_day(self): |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected spelling of 'zeroes' to 'zeros'.
| def test_date_only_zeroes_month_and_day(self): | |
| def test_date_only_zeros_month_and_day(self): |
test/autotag/test_hooks.py
Outdated
| ("1", ["2"], ("1", ["1", "2"])), | ||
| ("1 ft 2", ["1", "1 ft 2"], ("1 ft 2", ["1 ft 2", "1"])), | ||
| ("1 ft 2", ["2", "1"], ("1 ft 2", ["2", "1"])), | ||
| ("1 ft 2", ["2", "1"], ("1 FT 2", ["2", "1"])), |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test case on line 263 appears to have an inconsistency. The expected single value is '1 FT 2' (uppercase), but the input is '1 ft 2' (lowercase). This suggests the test may be verifying case-insensitive matching, but the expected output doesn't match the input, which could indicate a bug in the test or the implementation.
beets/autotag/hooks.py
Outdated
| if (_single_val := single_val.lower()) not in ( | ||
| _list_val := set(map(str.lower, list_val)) | ||
| ) and set(_single_val.split()) & _list_val: | ||
| return |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic in this condition is complex and unclear. Consider adding a comment explaining when this early return occurs (i.e., when the single value appears as a substring within list values, like 'artist ft other' in ['artist', 'artist ft other']).
| changes["albumartists"] = first.artists or [first.artist] | ||
| if not first.mb_albumartistid: | ||
| changes["mb_albumartistid"] = first.mb_artistid | ||
| if not first.mb_albumartistids: | ||
| changes["mb_albumartistids"] = first.mb_artistids or [ | ||
| first.mb_artistid | ||
| ] |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fallback logic first.artists or [first.artist] could produce an unexpected empty list if both first.artists is an empty list and first.artist is None or empty. Consider using first.artists if first.artists else [first.artist] for more explicit behavior.
| changes["albumartists"] = first.artists or [first.artist] | |
| if not first.mb_albumartistid: | |
| changes["mb_albumartistid"] = first.mb_artistid | |
| if not first.mb_albumartistids: | |
| changes["mb_albumartistids"] = first.mb_artistids or [ | |
| first.mb_artistid | |
| ] | |
| changes["albumartists"] = ( | |
| first.artists if first.artists else [first.artist] | |
| ) | |
| if not first.mb_albumartistid: | |
| changes["mb_albumartistid"] = first.mb_artistid | |
| if not first.mb_albumartistids: | |
| changes["mb_albumartistids"] = ( | |
| first.mb_artistids if first.mb_artistids else [first.mb_artistid] | |
| ) |
| changes["mb_albumartistids"] = first.mb_artistids or [ | ||
| first.mb_artistid | ||
| ] |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to line 432, the fallback first.mb_artistids or [first.mb_artistid] could produce an unexpected result if first.mb_artistids is an empty list. Consider using first.mb_artistids if first.mb_artistids else [first.mb_artistid] for clarity.
| changes["mb_albumartistids"] = first.mb_artistids or [ | |
| first.mb_artistid | |
| ] | |
| changes["mb_albumartistids"] = ( | |
| first.mb_artistids if first.mb_artistids else [first.mb_artistid] | |
| ) |
ed73a6f to
08531ee
Compare
Refactor item tagging and fix several underlying issues.
Fixes
artist_sort/artists_sortandartist_credit/artists_creditfields have not been synchronised.overwrite_nullconfiguration which was previously ignored for fields defined inautotag/__init__.py::SPECIAL_FIELDS.Updates
Matchobjects: addMatch.apply_metadata,AlbumMatch.apply_metadata,AlbumMatch.apply_album_metadata, andTrackMatch.apply_metadata; callers now use those methods instead of legacy free functions.beets.autotag.__init__(apply_item_metadata,apply_album_metadata,apply_metadata) and related globals (SPECIAL_FIELDS,log), and export only core types (AlbumInfo,AlbumMatch,TrackInfo,TrackMatch,Proposal,Recommendation,tag_album,tag_item).Infoand subclasses:Info.typeclass property andnullable_fieldsfor per-type 'overwrite_null' config.Info.raw_dataandInfo.item_datacomputed properties to applyartist_creditrules, filter nulls, and map media-specific field names.AlbumInfoandTrackInfoextendraw_data/item_databehavior to handle album/track specifics (date zeroing,tracktotal,mb_releasetrackid, per-disc numbering).TrackInfo.merge_with_albumto merge track-level data with album-level fallback for a final item payload.correct_list_fieldstohooks.pyand update it to keep unmapped / non-media single/list fields in sync (artist<->artists,albumtype<->albumtypes, etc.).Itemobjects intoTrackMatchinmatch.tag_itemto enable item-level metadata application.autotagapply functions withMatch.apply_metadatainvocations inbeets/importer/tasks.py,beetsplug/bpsync.py, andbeetsplug/mbsync.py.albumartists/mb_albumartistidswhen missing.test/autotag/test_hooks.pyandtest/autotag/test_match.pyto validate new data mapping, list field synchronization, overwrite behavior, and assignment logic.