-
Notifications
You must be signed in to change notification settings - Fork 2k
Merge mbpseudo functionality into musicbrainz and improve #6255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Merge mbpseudo functionality into musicbrainz and improve #6255
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey - I've found 7 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `beets/metadata_plugins.py:39-42` </location>
<code_context>
@notify_info_yielded("albuminfo_received")
-def candidates(*args, **kwargs) -> Iterable[AlbumInfo]:
+def candidates(items, *args, **kwargs) -> Iterable[AlbumInfo]:
"""Return matching album candidates from all metadata source plugins."""
for plugin in find_metadata_source_plugins():
- yield from plugin.candidates(*args, **kwargs)
+ for candidate in plugin.candidates(items, *args, **kwargs):
+ send("album_info_received", items=items, album_info=candidate)
+ yield candidate
</code_context>
<issue_to_address>
**issue (bug_risk):** Passing `items` positionally to plugin.candidates can break existing metadata source plugins.
This new signature requires all metadata source plugins to accept `items` as the first positional argument. Any existing plugin still using the old positional signature (e.g. `candidates(artist, album, va_likely, ...)`) will now fail with a `TypeError`.
To keep this backwards compatible, detect whether a plugin supports `items` (e.g. via `inspect.signature` or a feature flag like `supports_items`) and only pass `items` when supported, falling back to the previous call pattern otherwise.
</issue_to_address>
### Comment 2
<location> `beets/metadata_plugins.py:51-54` </location>
<code_context>
yield from plugin.item_candidates(*args, **kwargs)
-def album_for_id(_id: str) -> AlbumInfo | None:
+def album_for_id(_id: str, items: Iterable[Item]) -> AlbumInfo | None:
"""Get AlbumInfo object for the given ID string.
</code_context>
<issue_to_address>
**issue (bug_risk):** Changing the public album_for_id signature may break existing callers that don’t provide items.
Because `metadata_plugins.album_for_id` is a public helper, external callers that use `metadata_plugins.album_for_id(mbid)` without `items` will now raise errors.
To preserve compatibility, you could give `items` a default of `None` and only call `send("album_info_received", ...)` when `items` is not `None`. That keeps the new behavior for internal callers while avoiding a breaking API change.
</issue_to_address>
### Comment 3
<location> `test/plugins/test_musicbrainz_pseudo.py:132-141` </location>
<code_context>
+class TestMBPseudoReleases(TestMBPseudoMixin):
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding tests for `_handle_main_pseudo_release` fallback paths (no relation and no desired script cases).
The current tests only cover pseudo-releases that point to an official release with a desired script via `_handle_intercepted_pseudo_releases`. Two branches in `_handle_main_pseudo_release` remain untested:
1. Pseudo-release with no `transl-tracklisting` backward relation (should return the pseudo-release as-is).
2. Pseudo-release with a backward relation where `_has_desired_script` is `False` (should return the merged pseudo+official `AlbumInfo` via `_merge_pseudo_and_actual_album`).
Targeted tests for these cases would better safeguard behavior when MB data is incomplete or uses an unexpected script.
Suggested implementation:
```python
class TestMBPseudoReleaseFallbacks(TestMBPseudoMixin):
def test_main_pseudo_release_no_back_relation_returns_pseudo(
self,
musicbrainz_plugin: MusicBrainzPlugin,
pseudo_release_info: AlbumInfo,
monkeypatch,
):
# Force the "no transl-tracklisting backward relation" path
# by making the internal relation lookup return None.
# _handle_main_pseudo_release should then return the pseudo-release as-is.
def fake_get_back_relation(pseudo: AlbumInfo):
assert pseudo is pseudo_release_info
return None
# Adjust attribute name here if the plugin uses a different helper
monkeypatch.setattr(
musicbrainz_plugin,
"_get_transl_tracklisting_backward_relation",
fake_get_back_relation,
raising=False,
)
result = musicbrainz_plugin._handle_main_pseudo_release(pseudo_release_info)
assert result is pseudo_release_info
def test_main_pseudo_release_no_desired_script_merges_pseudo_and_official(
self,
musicbrainz_plugin: MusicBrainzPlugin,
official_release_info: AlbumInfo,
pseudo_release_info: AlbumInfo,
monkeypatch,
):
# Force the "has backward relation but no desired script" path:
# 1. Provide a synthetic transl-tracklisting backward relation.
# 2. Make _has_desired_script return False.
# 3. Stub _merge_pseudo_and_actual_album and assert it is used.
def fake_get_back_relation(pseudo: AlbumInfo):
assert pseudo is pseudo_release_info
# Adjust structure to match whatever the real code expects.
return {"release": {"id": official_release_info.album_id}}
monkeypatch.setattr(
musicbrainz_plugin,
"_get_transl_tracklisting_backward_relation",
fake_get_back_relation,
raising=False,
)
def fake_has_desired_script(relation):
# We explicitly force the "no desired script" path
return False
monkeypatch.setattr(
musicbrainz_plugin,
"_has_desired_script",
fake_has_desired_script,
raising=True,
)
merged_sentinel = object()
def fake_merge_pseudo_and_actual_album(
pseudo: AlbumInfo,
actual: AlbumInfo,
) -> AlbumInfo:
assert pseudo is pseudo_release_info
assert actual is official_release_info
return merged_sentinel # type: ignore[return-value]
monkeypatch.setattr(
musicbrainz_plugin,
"_merge_pseudo_and_actual_album",
fake_merge_pseudo_and_actual_album,
raising=True,
)
result = musicbrainz_plugin._handle_main_pseudo_release(pseudo_release_info)
assert result is merged_sentinel
class TestMBPseudoReleases(TestMBPseudoMixin):
```
These tests assume a couple of internal helper names and relation shapes that may differ slightly in your code:
1. If the helper used by `_handle_main_pseudo_release` to obtain the backward transl-tracklisting relation is not named `_get_transl_tracklisting_backward_relation`, adjust the `monkeypatch.setattr` targets `"_get_transl_tracklisting_backward_relation"` in both tests to the actual helper name.
2. If `_handle_main_pseudo_release` reads the related official release ID from a different structure than `{"release": {"id": official_release_info.album_id}}`, update `fake_get_back_relation` to return a dict (or object) that matches what the real implementation expects.
3. If `_handle_main_pseudo_release` does not call `_get_transl_tracklisting_backward_relation` at all, but instead reads relations directly from `AlbumInfo` / `pseudo_release_info`, you can instead:
- Remove the monkeypatch of `_get_transl_tracklisting_backward_relation`.
- Modify `pseudo_release_info` (or create a copy) so that its backing MB JSON has either:
a) no `transl-tracklisting` backward relation (for the first test), or
b) exactly one such relation pointing to `official_release_info` (for the second test).
4. Ensure that `_handle_main_pseudo_release` is indeed a method on `musicbrainz_plugin`. If it is instead a free function or lives on another object, adjust the call sites accordingly.
</issue_to_address>
### Comment 4
<location> `test/plugins/test_musicbrainz_pseudo.py:210-173` </location>
<code_context>
- assert not isinstance(album_info, PseudoAlbumInfo)
- assert album_info.data_source == "MusicBrainzPseudoRelease"
-
- def test_interception(
- self,
- mbpseudo_plugin: MusicBrainzPseudoReleasePlugin,
- official_release: JSONDict,
- ):
- album_info = mbpseudo_plugin.album_info(official_release)
- assert isinstance(album_info, PseudoAlbumInfo)
- assert album_info.data_source == "MusicBrainzPseudoRelease"
-
</code_context>
<issue_to_address>
**suggestion (testing):** Add tests that explicitly exercise `_intercept_mb_release` when no scripts are configured and when release-relations are absent.
Current tests cover mismatched relation fields and scripts, but not these additional branches of `_intercept_mb_release`:
* `self._scripts` is empty → should immediately return `[]`.
* Release has no `release-relations` → interception should be a no-op.
Please add tests that construct these cases, assert `_intercept_mb_release` returns `[]`, and verify that `album_for_id` falls back to using only the main release.
Suggested implementation:
```python
album_info = self.get_album_info(musicbrainz_plugin, official_release)
assert not isinstance(album_info, PseudoAlbumInfo)
assert album_info.data_source == "MusicBrainz"
def test_intercept_mb_release_no_scripts(
self,
mbpseudo_plugin: MusicBrainzPseudoReleasePlugin,
musicbrainz_plugin: MusicBrainzPlugin,
official_release: JSONDict,
):
# When no scripts are configured, interception should immediately return [].
mbpseudo_plugin._scripts = {}
intercepted = mbpseudo_plugin._intercept_mb_release(
musicbrainz_plugin, official_release
)
assert intercepted == []
# album_for_id should fall back to using only the main release.
albums = list(
musicbrainz_plugin.album_for_id(official_release["id"], None, None)
)
assert albums
assert all(not isinstance(a, MultiPseudoAlbumInfo) for a in albums)
def test_intercept_mb_release_no_release_relations(
self,
mbpseudo_plugin: MusicBrainzPseudoReleasePlugin,
musicbrainz_plugin: MusicBrainzPlugin,
official_release: JSONDict,
):
# When the release has no "release-relations", interception should be a no-op.
release_without_relations = dict(official_release)
release_without_relations.pop("release-relations", None)
intercepted = mbpseudo_plugin._intercept_mb_release(
musicbrainz_plugin, release_without_relations
)
assert intercepted == []
# album_for_id should again fall back to using only the main release.
albums = list(
musicbrainz_plugin.album_for_id(official_release["id"], None, None)
)
assert albums
assert all(not isinstance(a, MultiPseudoAlbumInfo) for a in albums)
import pytest
```
These changes assume:
1. `MusicBrainzPseudoReleasePlugin`, `MusicBrainzPlugin`, `MultiPseudoAlbumInfo`, and `PseudoAlbumInfo` are already imported in this test module.
If any are missing, add them alongside the existing imports, for example:
- `from beetsplug.musicbrainz_pseudo import MusicBrainzPseudoReleasePlugin, PseudoAlbumInfo`
- `from beetsplug.musicbrainz import MusicBrainzPlugin, MultiPseudoAlbumInfo`
2. `musicbrainz_plugin.album_for_id` accepts `(mbid, includes, release_status)` (or similar) and yields/returns an iterable of `AlbumInfo` / `MultiPseudoAlbumInfo`.
If your signature differs, adjust the `album_for_id` call arguments accordingly.
3. If `mbpseudo_plugin._scripts` is not a dict, set it to the appropriate “empty” value that makes the `_scripts` falsy check pass (e.g. `[]` instead of `{}`) to hit the branch `if not self._scripts: return []`.
</issue_to_address>
### Comment 5
<location> `test/plugins/test_musicbrainz_pseudo.py:263-272` </location>
<code_context>
+class TestMBMultiplePseudoReleases(PluginMixin):
</code_context>
<issue_to_address>
**suggestion (testing):** Extend `MultiPseudoAlbumInfo` tests to cover deepcopy and data integrity after cloning.
`TestMBMultiplePseudoReleases` currently only checks `unwrap()` and ordering, but not the custom `__deepcopy__` on `MultiPseudoAlbumInfo`. Since beets frequently clones `AlbumInfo` instances, a regression here could be hard to spot.
Please add a test that:
- Builds a `MultiPseudoAlbumInfo` via the plugin,
- Calls `deepcopy()` on it,
- Verifies the copy has the same metadata and `unwrap()` contents, and
- Confirms mutations to the copy or its wrapped infos do not affect the original.
This will better validate the deepcopy behavior and guard against subtle regressions.
Suggested implementation:
```python
import copy
class TestMBMultiplePseudoReleases(PluginMixin):
```
```python
@pytest.fixture(autouse=True)
def patch_get_release(
self,
monkeypatch,
official_release: JSONDict,
pseudo_release: JSONDict,
):
"""Patch the MusicBrainz client to return both the official and pseudo releases.
This ensures the plugin can build a MultiPseudoAlbumInfo from real-looking
MusicBrainz JSON while keeping the tests fully deterministic.
"""
def mock_get_release(_, album_id: str):
if album_id == "official":
return official_release
if album_id == "pseudo":
return pseudo_release
raise AssertionError(f"Unexpected album_id requested in test: {album_id!r}")
# NOTE: the exact target here may differ depending on how the client is imported
# in the musicbrainz plugin; adjust the dotted path if necessary.
monkeypatch.setattr(
"beetsplug.musicbrainz.MusicBrainzClient.get_release",
mock_get_release,
raising=True,
)
def test_multi_pseudo_albuminfo_deepcopy_independent(self, library, tmp_path):
"""MultiPseudoAlbumInfo deepcopy keeps data but fully detaches references.
The musicbrainz plugin frequently clones AlbumInfo instances; this test
ensures that MultiPseudoAlbumInfo's custom __deepcopy__ implementation
preserves metadata and unwrap() contents while preventing mutations on
the copy from affecting the original.
"""
from beets.autotag.hooks import MultiPseudoAlbumInfo
# Obtain the plugin instance via the PluginMixin facilities.
# PluginMixin typically exposes a mapping of loaded plugins keyed by name.
musicbrainz_plugin = self.plugins[self.plugin]
# Build a MultiPseudoAlbumInfo via the plugin. The exact public surface
# used here should mirror how the plugin actually exposes pseudo releases.
# For example, this might internally resolve both official and pseudo
# releases using the patched MusicBrainz client above.
multi_info: MultiPseudoAlbumInfo = musicbrainz_plugin._get_pseudo_album_info(
album_id="official"
)
assert isinstance(multi_info, MultiPseudoAlbumInfo)
# Sanity‑check that unwrap() returns underlying AlbumInfo objects and that
# they are consistent before we start mutating anything.
original_wrapped = multi_info.unwrap()
assert original_wrapped
original_album_titles = [ai.album for ai in original_wrapped]
original_ids = [ai.album_id for ai in original_wrapped]
# Perform a deepcopy through the custom __deepcopy__ implementation.
multi_info_copy = copy.deepcopy(multi_info)
assert isinstance(multi_info_copy, MultiPseudoAlbumInfo)
# The copy should expose equivalent metadata and unwrap() contents.
copy_wrapped = multi_info_copy.unwrap()
assert [ai.album for ai in copy_wrapped] == original_album_titles
assert [ai.album_id for ai in copy_wrapped] == original_ids
# Mutate top‑level attributes on the copy and ensure the original is unaffected.
multi_info_copy.album = "Changed Album Title"
multi_info_copy.album_id = "changed-id"
assert multi_info.album != multi_info_copy.album
assert multi_info.album_id != multi_info_copy.album_id
# Mutate the wrapped AlbumInfo objects in the copy and confirm the originals
# remain unchanged, verifying that deepcopy correctly duplicated the list and
# its contents rather than sharing references.
copy_wrapped[0].album = "Wrapped Changed"
copy_wrapped[0].album_id = "wrapped-changed-id"
# Re-fetch unwrap() from the original to avoid relying on previous references.
original_wrapped_after = multi_info.unwrap()
assert [ai.album for ai in original_wrapped_after] == original_album_titles
assert [ai.album_id for ai in original_wrapped_after] == original_ids
```
1. The `monkeypatch.setattr` target (`"beetsplug.musicbrainz.MusicBrainzClient.get_release"`) may differ in your codebase depending on how the MusicBrainz client is imported; update the dotted path to match the actual symbol used by the plugin to fetch releases.
2. The call to `musicbrainz_plugin._get_pseudo_album_info(album_id="official")` assumes the plugin exposes a helper that builds a `MultiPseudoAlbumInfo` from an official release plus related pseudo releases. If the actual helper has a different name or signature, replace this call with the correct plugin API that returns a `MultiPseudoAlbumInfo` constructed from the patched MusicBrainz responses.
3. If `PluginMixin` exposes the plugin instance differently (e.g., `self.plugin` already being the instance or a `self.musicbrainz` attribute), adjust `musicbrainz_plugin = self.plugins[self.plugin]` accordingly.
4. If `MultiPseudoAlbumInfo` is imported elsewhere in the file already, move or deduplicate the `from beets.autotag.hooks import MultiPseudoAlbumInfo` import to the file top-level to comply with your project’s import conventions.
</issue_to_address>
### Comment 6
<location> `docs/plugins/musicbrainz.rst:179-180` </location>
<code_context>
+pseudo-releases. For ``artist`` in particular, keep in mind that even
+pseudo-releases might specify it with the original script, so you should also
+configure import :ref:`languages` to give artist aliases more priority.
+Therefore, the minimum configuration to enable this functionality like this:
+
+.. code-block:: yaml
</code_context>
<issue_to_address>
**suggestion (typo):** Sentence is grammatically awkward; likely missing a word such as "is" or could be rephrased.
The phrase "the minimum configuration to enable this functionality like this" is grammatically incomplete. Consider something like "Therefore, the minimum configuration to enable this functionality is as follows:" or "…looks like this:" for clarity.
```suggestion
configure import :ref:`languages` to give artist aliases more priority.
Therefore, the minimum configuration to enable this functionality is as follows:
```
</issue_to_address>
### Comment 7
<location> `docs/plugins/musicbrainz.rst:197-199` </location>
<code_context>
+pseudo-releases with desired scripts.
+
+A release may have multiple pseudo-releases, for example when there is both a
+transliteration and a translation available. By default, only 1 pseudo-release
+per original release is emitted as candidate, using the languages from the
+configuration to decide which one has most priority. If you're importing in
+timid mode and you would like to receive all valid pseudo-releases as additional
</code_context>
<issue_to_address>
**suggestion (typo):** Missing article in "emitted as candidate"; should be "emitted as a candidate".
Here it should read "emitted as a candidate."
```suggestion
A release may have multiple pseudo-releases, for example when there is both a
transliteration and a translation available. By default, only 1 pseudo-release
per original release is emitted as a candidate, using the languages from the
```
</issue_to_address>
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #6255 +/- ##
==========================================
- Coverage 68.68% 68.57% -0.12%
==========================================
Files 138 137 -1
Lines 18532 18384 -148
Branches 3061 3033 -28
==========================================
- Hits 12729 12606 -123
+ Misses 5149 5132 -17
+ Partials 654 646 -8
🚀 New features to boost your workflow:
|
d1973e4 to
84dd0e2
Compare
| if isinstance(album_info, PseudoAlbumInfo): | ||
| for item in items: | ||
| # particularly relevant for reimport but could also happen during import | ||
| if "mb_albumid" in item: | ||
| del item["mb_albumid"] | ||
| if "mb_trackid" in item: | ||
| del item["mb_trackid"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you reimport because you want to switch from an official release to a pseudo-release, the MB ids from the official release introduce a lot of penalty here, that's why I delete them. I think this is fine as long as it isn't flushed to disk/DB, right?
75e0337 to
2b43058
Compare
472385a to
6614da8
Compare
6614da8 to
3314445
Compare
Description
Supersedes #6163, but the same improvements apply:
album_info_receivedeventSee also my comment below.
To Do
docs/to describe it.)docs/changelog.rstto the bottom of one of the lists near the top of the document.)