Simplify tagging #6165

snejus · 2025-11-15T16:05:35Z

Refactor item tagging and fix several underlying issues.

Fixes

Synchronise all artist list fields. Notably, the artist_sort / artists_sort and artist_credit / artists_credit fields were previously not synchronised.
Fix overwrite_null configuration which was previously ignored for fields defined in autotag/__init__.py::SPECIAL_FIELDS.

Updates

Move metadata application logic into Match objects: add Match.apply_metadata, AlbumMatch.apply_metadata, AlbumMatch.apply_album_metadata, and TrackMatch.apply_metadata; callers now use those methods instead of legacy free functions.
Remove legacy functions from beets.autotag.__init__ (apply_item_metadata, apply_album_metadata, apply_metadata) and related globals (SPECIAL_FIELDS, log), and export only core types (AlbumInfo, AlbumMatch, TrackInfo, TrackMatch, Proposal, Recommendation, tag_album, tag_item).
Add structured metadata facilities to Info and subclasses:
- Info.type class property and nullable_fields for per-type 'overwrite_null' config.
- Info.raw_data and Info.item_data computed properties to apply artist_credit rules, filter nulls, and map media-specific field names.
- AlbumInfo and TrackInfo extend raw_data/item_data behavior to handle album/track specifics (date zeroing, tracktotal, mb_releasetrackid, per-disc numbering).
Introduce TrackInfo.merge_with_album to merge track-level data with album-level fallback for a final item payload.
Move correct_list_fields to hooks.py and update it to keep unmapped / non-media single/list fields in sync (artist <-> artists, albumtype <-> albumtypes, etc.).
Wire changes through the codebase:
- Pass Item objects into TrackMatch in match.tag_item to enable item-level metadata application.
- Replace calls to removed autotag apply functions with Match.apply_metadata invocations in beets/importer/tasks.py, beetsplug/bpsync.py, and beetsplug/mbsync.py.
- Update importer logic to set album artist fallbacks for albumartists / mb_albumartistids when missing.
Add and update tests:
- New test/autotag/test_hooks.py and test/autotag/test_match.py to validate new data mapping, list field synchronization, overwrite behavior, and assignment logic.
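
The null handling mentioned above (filtering nulls unless a field is listed under `overwrite_null`) can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `filter_nulls` and its arguments are hypothetical names, and plain dicts stand in for `Info` objects.

```python
from typing import Any


def filter_nulls(data: dict[str, Any], nullable_fields: set[str]) -> dict[str, Any]:
    """Drop None values unless the field is explicitly allowed to be nulled.

    Hypothetical helper: ``nullable_fields`` stands in for the fields listed
    under the ``overwrite_null`` option for the given type (album or track).
    """
    return {
        key: value
        for key, value in data.items()
        if value is not None or key in nullable_fields
    }


# Made-up sample payload for illustration only.
incoming = {"title": "New title", "composer": None, "year": None}

# With no nullable fields, None values never reach the item.
print(filter_nulls(incoming, set()))

# With "composer" marked as nullable, its None value is kept so that it can
# overwrite an existing tag.
print(filter_nulls(incoming, {"composer"}))
```

Keeping the filter as a pure function of the payload and the configured field set makes the overwrite behaviour easy to test per type.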

github-actions · 2025-11-15T16:05:48Z

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

sourcery-ai

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `beets/autotag/hooks.py:48-57` </location>
<code_context>
+def correct_list_fields(data: JSONDict) -> JSONDict:
</code_context>

<issue_to_address>
**issue (bug_risk):** The correct_list_fields function mutates its input dictionary in place while also returning a result.

This dual behavior may cause unexpected side effects for callers. Recommend copying the input or clearly documenting the mutation.
</issue_to_address>

### Comment 2
<location> `beets/autotag/hooks.py:88-89` </location>
<code_context>
 class AttrDict(dict[str, V]):
     """Mapping enabling attribute-style access to stored metadata values."""

     def copy(self) -> Self:
-        return deepcopy(self)
+        return self.__class__(**deepcopy(self))

     def __getattr__(self, attr: str) -> V:
</code_context>

<issue_to_address>
**suggestion:** AttrDict.copy uses **deepcopy(self), which may not preserve all dict semantics.

Unpacking with **deepcopy(self) passes every key as a keyword argument, which raises TypeError if any key is not a string. Using self.__class__(deepcopy(dict(self))) passes the mapping positionally, so all keys are handled correctly.

```suggestion
    def copy(self) -> Self:
        return self.__class__(deepcopy(dict(self)))
```
</issue_to_address>
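
To make the trade-off concrete, here is a small self-contained sketch comparing the two constructions. The `AttrDict` below is a simplified stand-in written for illustration, and the method names `copy_kwargs` and `copy_positional` are hypothetical; it is not the class from `beets/autotag/hooks.py`.

```python
from copy import deepcopy


class AttrDict(dict):
    """Simplified stand-in: a dict with attribute-style read access."""

    def __getattr__(self, attr):
        try:
            return self[attr]
        except KeyError:
            # AttributeError keeps getattr()/hasattr() working as expected.
            raise AttributeError(attr) from None

    def copy_kwargs(self):
        # Keys travel as keyword arguments, so they must all be strings.
        return self.__class__(**deepcopy(self))

    def copy_positional(self):
        # Keys travel inside a mapping, so any hashable key works.
        return self.__class__(deepcopy(dict(self)))


d = AttrDict(artist="A", artists=["A", "B"])
by_kwargs = d.copy_kwargs()
by_positional = d.copy_positional()

# A non-string key only works with the positional form.
odd = AttrDict({1: "non-string key"})
try:
    odd.copy_kwargs()
    kwargs_failed = False
except TypeError:
    kwargs_failed = True
```

With string-only keys, as the `dict[str, V]` annotation implies, both forms yield an equal deep copy of the same type.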

### Comment 3
<location> `beets/autotag/hooks.py:120-129` </location>
<code_context>
+    def nullable_fields(cls) -> set[str]:
+        return set(config["overwrite_null"][cls.type.lower()].as_str_seq())
+
     @cached_property
     def name(self) -> str:
         raise NotImplementedError

+    @cached_property
+    def raw_data(self) -> JSONDict:
+        """Provide metadata with artist credits applied when configured."""
+        data = self.copy()
+        if config["artist_credit"]:
+            data.update(
+                artist=self.artist_credit or self.artist,
+                artists=self.artists_credit or self.artists,
+            )
+        return correct_list_fields(data)
+
+    @cached_property
+    def item_data(self) -> JSONDict:
+        """Metadata for items with field mappings and exclusions applied.
+
</code_context>

<issue_to_address>
**issue (bug_risk):** item_data uses dict union with data | {v: data.pop(k) ...}, which mutates data.

Using data.pop(k) within the dict comprehension alters the original data dictionary, which could cause issues if data is accessed elsewhere. To avoid side effects, create a new dictionary without mutating data.
</issue_to_address>
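
As an illustration of the difference, the sketch below contrasts the flagged `pop`-based pattern with a non-mutating rename. Both function names and the sample field mapping are hypothetical and chosen only for this example; they do not reproduce the actual `item_data` implementation.

```python
def rename_keys(data: dict, mapping: dict[str, str]) -> dict:
    """Return a new dict with keys renamed per ``mapping``; ``data`` is untouched."""
    return {mapping.get(key, key): value for key, value in data.items()}


def rename_keys_in_place(data: dict, mapping: dict[str, str]) -> dict:
    """Pattern flagged in the review: ``pop`` removes the old keys from ``data``."""
    return data | {new: data.pop(old) for old, new in mapping.items() if old in data}


# Made-up sample data and mapping for illustration only.
original = {"track_id": "abc", "title": "T"}
mapping = {"track_id": "mb_trackid"}

untouched = dict(original)
safe = rename_keys(untouched, mapping)

consumed = dict(original)
mutated = rename_keys_in_place(consumed, mapping)
```

Both calls return the same renamed payload; the difference is that `consumed` has lost its `track_id` key afterwards, while `untouched` still equals `original`.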

### Comment 4
<location> `test/autotag/test_hooks.py:186-194` </location>
<code_context>
+        assert self.items[0].month == 0
+        assert self.items[0].day == 0
+
+    def test_missing_date_applies_nothing(self):
+        self.items = [Item(year=1, month=2, day=3)]
+        self.info.update(year=None, month=None, day=None)
+
+        self._apply()
+
+        assert self.items[0].year == 1
+        assert self.items[0].month == 2
+        assert self.items[0].day == 3
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding a test for partial date updates (e.g., only year or only month present).

Adding tests for cases where only some date fields are updated will help verify correct handling of partial updates and prevent unintended data changes.

```suggestion
    def test_missing_date_applies_nothing(self):
        self.items = [Item(year=1, month=2, day=3)]
        self.info.update(year=None, month=None, day=None)

        self._apply()

        assert self.items[0].year == 1
        assert self.items[0].month == 2
        assert self.items[0].day == 3

    def test_partial_date_update_year_only(self):
        self.items = [Item(year=1, month=2, day=3)]
        self.info.update(year=2020, month=None, day=None)

        self._apply()

        assert self.items[0].year == 2020
        assert self.items[0].month == 0
        assert self.items[0].day == 0

    def test_partial_date_update_month_only(self):
        self.items = [Item(year=1, month=2, day=3)]
        self.info.update(year=None, month=5, day=None)

        self._apply()

        assert self.items[0].year == 1
        assert self.items[0].month == 5
        assert self.items[0].day == 0

    def test_partial_date_update_day_only(self):
        self.items = [Item(year=1, month=2, day=3)]
        self.info.update(year=None, month=None, day=15)

        self._apply()

        assert self.items[0].year == 1
        assert self.items[0].month == 2
        assert self.items[0].day == 15
```
</issue_to_address>

### Comment 5
<location> `test/autotag/test_hooks.py:162-167` </location>
<code_context>
+            },
+        ]
+
+    def test_autotag_items(self):
+        self._apply()
+
+        keys = self.expected_tracks[0].keys()
+        get_values = operator.itemgetter(*keys)
+
+        applied_data = [
+            dict(zip(keys, get_values(dict(i)))) for i in self.items
+        ]
+
+        assert applied_data == self.expected_tracks
+
+    def test_artist_credit_prefers_artist_over_albumartist_credit(self):
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding a test for 'from_scratch' import configuration.

Please add a test that enables 'from_scratch' and checks that items are cleared before metadata is applied.

```suggestion
    def test_from_scratch_clears_items_before_applying_metadata(self):
        # Enable 'from_scratch' configuration
        self.config['from_scratch'] = True

        # Simulate items with pre-existing metadata
        for item in self.items:
            item.artist = "PreExistingArtist"
            item.title = "PreExistingTitle"

        # Assert items have pre-existing metadata
        for item in self.items:
            assert item.artist == "PreExistingArtist"
            assert item.title == "PreExistingTitle"

        # Apply autotag with 'from_scratch' enabled
        self._apply()

        # Assert items are cleared before metadata is applied
        for item in self.items:
            assert item.artist != "PreExistingArtist"
            assert item.title != "PreExistingTitle"

        # Assert metadata is correctly applied
        keys = self.expected_tracks[0].keys()
        get_values = operator.itemgetter(*keys)
        applied_data = [
            dict(zip(keys, get_values(dict(i)))) for i in self.items
        ]
        assert applied_data == self.expected_tracks

    def test_artist_credit_prefers_artist_over_albumartist_credit(self):
        self.info.tracks[0].update(artist="oldArtist", artist_credit=None)

        self._apply(artist_credit=True)

        assert self.items[0].artist == "oldArtist"
```
</issue_to_address>


codecov · 2025-11-15T16:07:44Z

Codecov Report

❌ Patch coverage is 95.53571% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.70%. Comparing base (1a899cc) to head (08531ee).
⚠️ Report is 5 commits behind head on master.
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
beets/autotag/hooks.py	97.75%	1 Missing and 1 partial ⚠️
beetsplug/bpsync.py	50.00%	2 Missing ⚠️
beets/importer/tasks.py	92.30%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #6165      +/-   ##
==========================================
+ Coverage   68.68%   68.70%   +0.02%     
==========================================
  Files         138      138              
  Lines       18532    18509      -23     
  Branches     3061     3053       -8     
==========================================
- Hits        12729    12717      -12     
+ Misses       5149     5144       -5     
+ Partials      654      648       -6

Files with missing lines	Coverage Δ
beets/autotag/__init__.py	`72.72% <ø> (-14.89%)`	⬇️
beets/autotag/match.py	`76.92% <100.00%> (ø)`
beetsplug/mbsync.py	`82.05% <100.00%> (+0.23%)`	⬆️
beets/importer/tasks.py	`91.22% <92.30%> (+0.32%)`	⬆️
beets/autotag/hooks.py	`98.98% <97.75%> (-1.02%)`	⬇️
beetsplug/bpsync.py	`19.04% <50.00%> (+0.97%)`	⬆️


snejus · 2025-12-17T11:23:04Z

@semohr @henry-oberholtzer @JOJ0 would be happy to have your eyes over this as I think this is a significant piece of work making autotagging somewhat simpler (hopefully!).

JOJ0 · 2025-12-17T21:03:40Z

> @semohr @henry-oberholtzer @JOJ0 would be happy to have your eyes over this as I think this is a significant piece of work making autotagging somewhat simpler (hopefully!).

I'll try to find some time after first wave of xmadness is over

Consolidate multiple granular test methods in ApplyTest into a single comprehensive test that validates all applied metadata at once. This improves test maintainability and clarity by:
- Replacing ~20 individual test methods with one data-driven test
- Using expected data dictionaries to validate all fields together
- Removing ApplyCompilationTest class (covered by va=True in main test)
- Keeping focused tests for edge cases (artist_credit, date handling)
- Switching from BeetsTestCase to standard TestCase for speed
- Adding operator import for efficient data extraction

The new approach makes it easier to validate all applied metadata at once.
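
The data-driven comparison described in this commit message follows the pattern visible in `test_autotag_items`. Below is a minimal standalone sketch of that pattern, with plain dicts standing in for `Item` objects and made-up sample values.

```python
import operator

# Expected values for the fields under test (sample data for illustration).
expected = [
    {"title": "a", "track": 1},
    {"title": "b", "track": 2},
]

# Stand-ins for items; they carry extra fields the test does not care about.
items = [
    {"title": "a", "track": 1, "path": "/x"},
    {"title": "b", "track": 2, "path": "/y"},
]

keys = expected[0].keys()
get_values = operator.itemgetter(*keys)

# Project every item onto the expected keys, then compare in one assertion.
applied = [dict(zip(keys, get_values(item))) for item in items]
```

One caveat with this approach: `operator.itemgetter` returns a bare value rather than a tuple when given a single key, so the `zip` projection assumes at least two expected keys.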

Copilot

Pull request overview

This pull request refactors the item tagging system to move metadata application logic into Match objects and fix synchronization issues for artist-related fields. The changes address bugs where artist_sort, artists_sort, artist_credit, and artists_credit were not properly synchronized, and fix the overwrite_null configuration that was previously ignored for certain fields.

Key Changes:

Introduced Match.apply_metadata() methods to replace legacy free functions, centralizing metadata application logic within match objects.
Added structured metadata facilities to Info classes including raw_data and item_data properties for transforming metadata before application.
Moved and enhanced correct_list_fields() to synchronize single/list field pairs for artist, albumtype, and related fields.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

File	Description
test/test_autotag.py	Removed legacy tests that were moved to new test files
test/plugins/test_edit.py	Updated assertions to reflect that `albumartist` and `albumartists` are now modified alongside titles
test/autotag/test_match.py	New test file for track assignment logic, moved from test_autotag.py
test/autotag/test_hooks.py	New comprehensive test suite for metadata application, overwrite behavior, and list field synchronization
beetsplug/mbsync.py	Updated to use `TrackMatch.apply_metadata()` and `AlbumMatch.apply_metadata()` instead of removed functions
beetsplug/bpsync.py	Updated to use `TrackMatch.apply_metadata()` and `AlbumMatch.apply_metadata()` instead of removed functions
beets/importer/tasks.py	Simplified metadata application to delegate to match objects; improved fallback logic for album artist fields
beets/autotag/match.py	Updated `tag_item()` to pass `Item` objects into `TrackMatch` constructor
beets/autotag/hooks.py	Major refactoring introducing `Info.raw_data`, `Info.item_data`, `Match.apply_metadata()`, and enhanced `correct_list_fields()`
beets/autotag/__init__.py	Removed legacy metadata application functions and related globals


Copilot · 2026-01-07T11:31:35Z

test/autotag/test_hooks.py

+
+        assert self.items[1].artist == "albumArtist"
+
+    def test_date_only_zeroes_month_and_day(self):


Corrected spelling of 'zeroes' to 'zeros'.

Suggested change

-    def test_date_only_zeroes_month_and_day(self):
+    def test_date_only_zeros_month_and_day(self):

Copilot · 2026-01-07T11:31:36Z

test/autotag/test_hooks.py

+        ("1", ["2"], ("1", ["1", "2"])),
+        ("1 ft 2", ["1", "1 ft 2"], ("1 ft 2", ["1 ft 2", "1"])),
+        ("1 ft 2", ["2", "1"], ("1 ft 2", ["2", "1"])),
+        ("1 ft 2", ["2", "1"], ("1 FT 2", ["2", "1"])),


The test case on line 263 appears to have an inconsistency. The expected single value is '1 FT 2' (uppercase), but the input is '1 ft 2' (lowercase). This suggests the test may be verifying case-insensitive matching, but the expected output doesn't match the input, which could indicate a bug in the test or the implementation.

Copilot · 2026-01-07T11:31:36Z

beets/autotag/hooks.py

+        if (_single_val := single_val.lower()) not in (
+            _list_val := set(map(str.lower, list_val))
+        ) and set(_single_val.split()) & _list_val:
+            return


The logic in this condition is complex and unclear. Consider adding a comment explaining when this early return occurs (i.e., when the single value is not itself in the list but one of its whitespace-separated words matches a list value, like '1 ft 2' with ['2', '1']).
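
To see when the early return fires, the quoted condition can be restated as a standalone predicate. The helper name `looks_like_joined_form` is hypothetical, and the sample inputs are taken from the parametrized test cases quoted in the earlier comment; this is a sketch of the condition only, not of the surrounding function.

```python
def looks_like_joined_form(single_val: str, list_val: list[str]) -> bool:
    """Mirror the quoted condition under a hypothetical name.

    True when the single value is not itself a list entry, but at least one
    of its whitespace-separated words matches a list entry (case-insensitive).
    """
    lowered_single = single_val.lower()
    lowered_list = set(map(str.lower, list_val))
    return lowered_single not in lowered_list and bool(
        set(lowered_single.split()) & lowered_list
    )


# Joined artist string whose parts are the list entries: early return.
joined = looks_like_joined_form("1 ft 2", ["2", "1"])

# The single value is already a list entry: no early return.
already_listed = looks_like_joined_form("1 ft 2", ["1", "1 ft 2"])

# Unrelated single and list values: no early return.
unrelated = looks_like_joined_form("1", ["2"])
```

Note that the match is made on whole whitespace-separated words, so the check only recognises list entries that appear as single words inside the joined string.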

Copilot · 2026-01-07T11:31:36Z

beets/importer/tasks.py

+                changes["albumartists"] = first.artists or [first.artist]
+            if not first.mb_albumartistid:
+                changes["mb_albumartistid"] = first.mb_artistid
+            if not first.mb_albumartistids:
+                changes["mb_albumartistids"] = first.mb_artistids or [
+                    first.mb_artistid
+                ]


The fallback logic first.artists or [first.artist] could produce a list containing None or an empty string if first.artists is an empty list and first.artist is None or empty. Consider using first.artists if first.artists else [first.artist] for more explicit behavior.

Suggested change

-                changes["albumartists"] = first.artists or [first.artist]
-            if not first.mb_albumartistid:
-                changes["mb_albumartistid"] = first.mb_artistid
-            if not first.mb_albumartistids:
-                changes["mb_albumartistids"] = first.mb_artistids or [
-                    first.mb_artistid
-                ]
+                changes["albumartists"] = (
+                    first.artists if first.artists else [first.artist]
+                )
+            if not first.mb_albumartistid:
+                changes["mb_albumartistid"] = first.mb_artistid
+            if not first.mb_albumartistids:
+                changes["mb_albumartistids"] = (
+                    first.mb_artistids if first.mb_artistids else [first.mb_artistid]
+                )
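
For reference, `x or y` and `x if x else y` both select on the truthiness of `x`, so the two spellings in this suggestion return the same values. A quick check, using hypothetical function names and sample inputs:

```python
def fallback_with_or(values, single):
    # Original spelling: use the list if it is truthy, else wrap the single value.
    return values or [single]


def fallback_with_conditional(values, single):
    # Suggested spelling: the same truthiness test, written out explicitly.
    return values if values else [single]


# Sample inputs covering populated, empty, and missing values.
cases = [
    (["a", "b"], "a"),
    ([], "a"),
    ([], None),
    (None, ""),
]

results = [
    (fallback_with_or(values, single), fallback_with_conditional(values, single))
    for values, single in cases
]
```

In both spellings an empty list combined with a missing single value yields `[None]` rather than an empty list, so guarding against that case would need an additional check on the single value.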

Copilot · 2026-01-07T11:31:37Z

beets/importer/tasks.py

+                changes["mb_albumartistids"] = first.mb_artistids or [
+                    first.mb_artistid
+                ]


Similar to line 432, the fallback first.mb_artistids or [first.mb_artistid] could produce an unexpected result if first.mb_artistids is an empty list. Consider using first.mb_artistids if first.mb_artistids else [first.mb_artistid] for clarity.

Suggested change

-                changes["mb_albumartistids"] = first.mb_artistids or [
-                    first.mb_artistid
-                ]
+                changes["mb_albumartistids"] = (
+                    first.mb_artistids if first.mb_artistids else [first.mb_artistid]
+                )

snejus requested a review from a team as a code owner November 15, 2025 16:05

snejus requested review from henry-oberholtzer and semohr and removed request for a team November 15, 2025 16:05

sourcery-ai bot reviewed Nov 15, 2025


snejus requested a review from a team November 15, 2025 16:07

snejus force-pushed the introduce-info-name-property branch from 8b417f1 to 5f8ca35 Compare November 15, 2025 16:14

snejus force-pushed the simplify-item-tagging branch from 1cca6b5 to 1ef5f76 Compare November 15, 2025 16:14

snejus force-pushed the introduce-info-name-property branch from 5f8ca35 to 9514f7d Compare November 15, 2025 17:25

snejus force-pushed the simplify-item-tagging branch 3 times, most recently from c766ec8 to 56126ad Compare November 16, 2025 16:31

JOJ0 mentioned this pull request Nov 17, 2025

Add native support for multiple genres per album/track #6169

Open

snejus force-pushed the introduce-info-name-property branch from 9514f7d to 0b570a9 Compare November 17, 2025 10:21

snejus force-pushed the simplify-item-tagging branch from 56126ad to a32c45e Compare November 17, 2025 10:21

snejus force-pushed the simplify-item-tagging branch 2 times, most recently from dfb48ad to 05bc57f Compare December 1, 2025 00:26

snejus force-pushed the introduce-info-name-property branch from 0b570a9 to 02f3cb7 Compare December 1, 2025 00:26

snejus force-pushed the introduce-info-name-property branch from 02f3cb7 to 188f6ab Compare December 17, 2025 11:19

snejus force-pushed the simplify-item-tagging branch from 05bc57f to f214346 Compare December 17, 2025 11:19

snejus force-pushed the introduce-info-name-property branch from 188f6ab to a88e477 Compare December 17, 2025 13:19

snejus force-pushed the simplify-item-tagging branch from f214346 to 22e5461 Compare December 17, 2025 13:19

snejus force-pushed the introduce-info-name-property branch 2 times, most recently from 527d77e to 829ca5c Compare December 19, 2025 12:10

snejus force-pushed the simplify-item-tagging branch from 22e5461 to acac74b Compare December 19, 2025 12:12

snejus force-pushed the introduce-info-name-property branch from 829ca5c to c966f17 Compare December 19, 2025 12:31

snejus force-pushed the simplify-item-tagging branch from acac74b to 8b60550 Compare December 20, 2025 01:32

snejus force-pushed the introduce-info-name-property branch from c966f17 to 9d6df17 Compare December 20, 2025 01:32

snejus force-pushed the simplify-item-tagging branch from 8b60550 to d9ed74d Compare December 20, 2025 01:37

snejus force-pushed the introduce-info-name-property branch from 9d6df17 to 163e153 Compare December 20, 2025 01:37

snejus force-pushed the simplify-item-tagging branch from d9ed74d to d5eeb26 Compare December 21, 2025 01:10

snejus force-pushed the introduce-info-name-property branch from 163e153 to cf72187 Compare December 21, 2025 01:10

snejus force-pushed the simplify-item-tagging branch from d5eeb26 to cf31d00 Compare December 24, 2025 11:01

snejus force-pushed the introduce-info-name-property branch from cf72187 to 60b4a38 Compare December 24, 2025 11:01

Base automatically changed from introduce-info-name-property to master December 24, 2025 11:06

henry-oberholtzer added this to the 2.6.0 milestone Dec 30, 2025

snejus added 7 commits January 7, 2026 11:30

autotag: add a test for overwrite_null configuration

ab5b7a5

tagging: allow nulling singleton fields

ccb196e

autotag: fix list fields

750f466

tagging: fix list albumartists field

96f8d35

import: simplify tagging item

b160f2e

autotag: split tests into test_match.py and test_hooks.py

6173fa7

Copilot AI review requested due to automatic review settings January 7, 2026 11:30

snejus force-pushed the simplify-item-tagging branch from cf31d00 to ed73a6f Compare January 7, 2026 11:30

Copilot AI reviewed Jan 7, 2026

autotag: do not sync joined artist to artists field

08531ee

snejus force-pushed the simplify-item-tagging branch from ed73a6f to 08531ee Compare January 7, 2026 11:52

JOJ0 added the core label (Pull requests that modify the beets core `beets`) Jan 10, 2026


		assert self.items[1].artist == "albumArtist"

		def test_date_only_zeroes_month_and_day(self):

	def test_date_only_zeroes_month_and_day(self):
	def test_date_only_zeros_month_and_day(self):

Simplify tagging #6165

Are you sure you want to change the base?

Simplify tagging #6165

Uh oh!

Conversation

snejus commented Nov 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Fixes

Updates

Uh oh!

github-actions bot commented Nov 15, 2025

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov bot commented Nov 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

snejus commented Dec 17, 2025

Uh oh!

JOJ0 commented Dec 17, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

snejus commented Nov 15, 2025 •

edited

Loading

codecov bot commented Nov 15, 2025 •

edited

Loading