-
Notifications
You must be signed in to change notification settings - Fork 2k
Discogs: fixes for: 6177, 6068 #6179
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
91ece4c to
d68514e
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6179 +/- ##
==========================================
+ Coverage 68.68% 68.70% +0.01%
==========================================
Files 138 138
Lines 18532 18600 +68
Branches 3061 3069 +8
==========================================
+ Hits 12729 12779 +50
- Misses 5149 5169 +20
+ Partials 654 652 -2
🚀 New features to boost your workflow:
|
This comment was marked as resolved.
This comment was marked as resolved.
f0902d3 to
cb4b4fc
Compare
cb4b4fc to
ec23eec
Compare
f1e1620 to
21a6456
Compare
semohr
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just had some time to have a look here. Added some comments for potential improvements.
Thanks for starting to tackle the discogs plugin, it is in dire need of some love 😄
|
Thanks Sebastian! I'm out of town at the moment but will get these applied once I'm back. |
|
No hurry! Take your time and enjoy the holidays. |
beetsplug/discogs.py
Outdated
| index = 0 | ||
| divisions: list[str] = [] | ||
| next_divisions: list[str] = [] | ||
| t: TracklistInfo = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we try to not use single character variable names? I get that they simplify things in writing but my fish brain forgets what t is after reading like 20 lines 🤣
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed - they're fine for hashing out the ideas but they do make retreading the code a lot harder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should good here!
While the whole logic is still a bit convoluted, I feel like it is a major improvement from before the PR 👍
Thank you for taking the initiative!
|
Updated the variable name - can go ahead and merge if all looks good! |
|
I think @snejus wants to have a look too since he self assigned this to himself too. |
snejus
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry that it took me a long time to finally get to this! Well done, this is a big and important piece of work.
It seems to me that DiscogsPlugin now has a bit too many responsibilities. I think it would be a good idea to move the logic from _build_* methods to intermediate dataclasses which should make it easier for us to reason about it all and hopefully simplify the logic. I refactored a couple of bits as I reviewed this, so I'm opening a PR based from your last commit for you to review!
| class IntermediateTrackInfo(TrackInfo): | ||
| """Allows work with string mediums from | ||
| get_track_info""" | ||
| class AlbumArtistInfo(ArtistInfo): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❗ Note that AlbumInfo object uses artist, artists etc. fields without album prefix
| artist, artist_id = self.get_artist(artist_list, join_key="join") | ||
| return self.strip_disambiguation(artist), artist_id | ||
|
|
||
| def _build_albumartistinfo(self, artists: list[Artist]) -> AlbumArtistInfo: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method above get_artist_with_anv is now unused and should be removed
| ) | ||
| artist_credit = album_artist_anv | ||
| # Information for the album artist | ||
| albumartist: AlbumArtistInfo = self._build_albumartistinfo(artist_data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You want to use this method directly since AlbumInfo does not support albumartist* fields.
| albumartist: AlbumArtistInfo = self._build_albumartistinfo(artist_data) | |
| albumartist = self._build_artistinfo(artists, for_album_artist=True) |
| title: str | ||
| duration: str | ||
| artists: list[Artist] | ||
| extraartists: NotRequired[list[Artist]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Artist definition above needs fixing:
class Artist(TypedDict):
name: str
anv: str
join: str
role: str
tracks: str
- id: str
+ id: int
resource_url: str| } | ||
| ], | ||
| "artists": [ | ||
| {"name": "ARTIST", "anv": "", "id": 2}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure that all "artists" and "extraartists" reflect the expected Artist shape:
| {"name": "ARTIST", "anv": "", "id": 2}, | |
| { | |
| "name": "ARTIST", | |
| "anv": "", | |
| "id": 2, | |
| "role": "", | |
| "join": "", | |
| "tracks": "", | |
| "resource_url": "", | |
| }, |
I'd strongly suggest defining a helper method:
def _artist(name: str, **kwargs):
return {
"id": 1,
"name": name,
"join": "",
"role": "",
"anv": "",
"tracks": "",
"resource_url": "",
} | kwargs| anv = a.get("anv", "") or name | ||
| role = a.get("role", "").lower() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All Artist keys are always present:
| anv = a.get("anv", "") or name | |
| role = a.get("role", "").lower() | |
| anv = a["anv"] or name | |
| role = a["role"].lower() |
| name = config["va_name"].as_str() | ||
| anv = name | ||
| # If the artist is listed as featured | ||
| if "featuring" in role: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about cases like this one? See feat Yannou which has role: ""
>>> pprint(r.tracklist[0].data)
{
'position': '1-01',
'type_': 'track',
'artists': [
{'name': 'DJ Sammy', 'anv': '', 'join': '&', 'role': '', 'tracks': '', 'id': 47752, 'resource_url': 'https://api.discogs.com/artists/47752'},
{'name': 'Yanou', 'anv': '', 'join': 'feat.', 'role': '', 'tracks': '', 'id': 43728, 'resource_url': 'https://api.discogs.com/artists/43728'},
{'name': 'Do', 'anv': '', 'join': '', 'role': '', 'tracks': '', 'id': 52560, 'resource_url': 'https://api.discogs.com/artists/52560'}
],
'title': 'Heaven',
'extraartists': [
{'name': 'Do', 'anv': '', 'join': '', 'role': 'Featuring', 'tracks': '', 'id': 52560, 'resource_url': 'https://api.discogs.com/artists/52560'},
{
'name': 'Frank Reinert',
'anv': '',
'join': '',
'role': 'Producer [Assistant Producer]',
'tracks': '',
'id': 200378,
'resource_url': 'https://api.discogs.com/artists/200378'
},
{'name': 'DJ Sammy', 'anv': '', 'join': '', 'role': 'Producer [Produced By]', 'tracks': '', 'id': 47752, 'resource_url': 'https://api.discogs.com/artists/47752'},
{'name': 'Yanou', 'anv': '', 'join': '', 'role': 'Producer [Produced By]', 'tracks': '', 'id': 43728, 'resource_url': 'https://api.discogs.com/artists/43728'},
{'name': 'Bryan Adams', 'anv': 'B.Adams', 'join': '', 'role': 'Written-By', 'tracks': '', 'id': 10933, 'resource_url': 'https://api.discogs.com/artists/10933'},
{'name': 'Jim Vallance', 'anv': 'J.Vallance', 'join': '', 'role': 'Written-By', 'tracks': '', 'id': 266699, 'resource_url': 'https://api.discogs.com/artists/266699'}
],
'duration': '3:39'
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of those weird quirks with the Discogs system. If they're listed as a main artist, they'll never have a role listed, it'll just be blank, but will get assembled properly for the main artist field.
In extraartists, Do appears again, this time with featuring as the role, which will cause the featured artist to appear twice - an I noticed in #6166 Both these ways of noting an artist as featuring are apparently within discogs guidelines, and sometimes releases include them in the main artist field, both, or just the extra artists. It's really a wild west out there.
| if not featured_flag: | ||
| artist += feat_str | ||
| artist_anv += feat_str | ||
| artist += name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is very complicated and hard to reason about. I think it can be simplified, and moved to a dedicated intermediate class, something along these lines:
@dataclass
class ArtistState:
class ValidArtist(NamedTuple):
name: str
credit: str
join: str
is_feat: bool
def get_artist(self, property_name: str) -> str:
return getattr(self, property_name) + (
{",": ", ", "": ""}.get(self.join, f" {self.join} ")
)
raw_artists: list[Artist]
use_anv: bool
use_credit_anv: bool
featured_string: str
should_strip_disambiguation: bool
@property
def info(self) -> ArtistInfo:
return {k: getattr(self, k) for k in ArtistInfo.__annotations__} # type: ignore[return-value]
@property
def artists_ids(self) -> list[str]:
return [str(a["id"]) for a in self.raw_artists]
@property
def artist_id(self) -> str:
return self.artists_ids[0]
def strip_disambiguation(self, text: str) -> str:
"""Removes discogs specific disambiguations from a string.
Turns 'Label Name (5)' to 'Label Name' or 'Artist (1) & Another Artist (2)'
to 'Artist & Another Artist'. Does nothing if strip_disambiguation is False."""
if self.should_strip_disambiguation:
return DISAMBIGUATION_RE.sub("", text)
return text
@cached_property
def valid_artists(self) -> list[ValidArtist]:
va_name = config["va_name"].as_str()
return [
self.ValidArtist(
self.strip_disambiguation(anv if self.use_anv else name),
self.strip_disambiguation(anv if self.use_credit_anv else name),
a["join"],
is_feat,
)
for a in self.raw_artists
if (
(name := va_name if a["name"] == "Various" else a["name"])
and (anv := a["anv"] or name)
and (
(is_feat := ("featuring" in a["role"].lower()))
or not a["role"]
)
)
]
@property
def artists(self) -> list[str]:
return [a.name for a in self.valid_artists]
@property
def artists_credit(self) -> list[str]:
return [a.credit for a in self.valid_artists]
@property
def artist(self) -> str:
return self.join_artists("name")
@property
def artist_credit(self) -> str:
return self.join_artists("credit")
def join_artists(self, property_name: str) -> str:
non_featured = [a for a in self.valid_artists if not a.is_feat]
featured = [a for a in self.valid_artists if a.is_feat]
artist = "".join(a.get_artist(property_name) for a in non_featured)
if featured:
if "feat" not in artist:
artist += f" {self.featured_string} "
artist += ", ".join(a.get_artist(property_name) for a in featured)
return artist
@classmethod
def from_plugin(
cls,
plugin: DiscogsPlugin,
artists: list[Artist],
for_album_artist: bool = False,
) -> ArtistState:
return cls(
artists,
plugin.config["anv"][
"album_artist" if for_album_artist else "artist"
].get(bool),
plugin.config["anv"]["artist_credit"].get(bool),
plugin.config["featured_string"].as_str(),
plugin.config["strip_disambiguation"].get(bool),
)Note I'd implemented this functionality in my fork a while ago (I also parse composers and remixers), so the dataclass above takes inspiration from there.
|
See #6277 |
Description
Fixes #6177, #6068.
I fixed the issue in #6177 by removing the derived class interface, and moving those fields back into function variables. They're a bit unwieldy still, but that's the algorithm it came with. There's a lot of room to continue to improve the clarity of the code in that section, but I think that'll require a deeper overhaul.
For #6068, I created the
ArtistInfoandAlbumArtistInfotyped dictionaries, and was able to centralize the logic of building the artist info intobuild_artistinfoandbuild_albumartistinfo. Tests for these scenarios were created largely by expanding existing tests to incorporate the new fields.I think I'll have to re-think the entire algorithm for 6030 to make it more flexible at parsing the issue for #6030, so I'll move that to a later PR in the interest of getting the flex attr fix in.
To Do
docs/changelog.rstto the bottom of one of the lists near the top of the document.)