-
Notifications
You must be signed in to change notification settings - Fork 2k
fromfilename: improvements & fixes #6259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
7c22d71 to
f015620
Compare
f015620 to
af07f45
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #6259 +/- ##
==========================================
+ Coverage 68.22% 69.04% +0.81%
==========================================
Files 138 138
Lines 18792 18736 -56
Branches 3167 3096 -71
==========================================
+ Hits 12821 12936 +115
+ Misses 5298 5149 -149
+ Partials 673 651 -22
🚀 New features to boost your workflow:
|
af63e8f to
3267205
Compare
|
Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry. |
f8b9f44 to
f930875
Compare
f930875 to
4cd29f9
Compare
JOJ0
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just some tiny things I saw during a quick scroll through. this is a lot! but worth doing! thanks! I came along a bunch of old badly tagged mp3's lately that will be perfect for testing. hope I get to it.
5ee59cf to
177e997
Compare
|
I've been using it a good amount with my vinyl ripping workflow, and it's been working way better than the old version. If everything looks good & works well for you, I can update the changelog & get it merged in. |
I merged it into my personal master and want to check it alongside current discogs changes. I got distracted with a crash in Beets elsewhere though.... Will try to do some testing with badly tagged files as promised soonish. One thing I wanted to ask: How did you come up with all the regex patterns? 1. By heart since I'm a regex wizard. 2. Had AI generate them. 3. Used regex101 (or similar) to develop and test them. :-))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a very high level, the first things I notice:
Given a badly tagged collection of Alternative Rock/Pop music residing in a folder named Alternative we find a hand full of single files as well as subfolders with artist names or even subfolders with full albums.
This might be out of scope with your suggested changes but since you are kind of the expert in refactoring this plugin at the moment I'd like to feedback anyway:
First beets tries to process the 7 files in the root of the collection as an album, thus I tell it to "Group Albums":
/......./Alternative (7 items)
...
...
fromfilename: Item updated with: dict_items([])
fromfilename: Item updated with: dict_items([])
fromfilename: Item updated with: dict_items([('album', 'Alternative')])
fromfilename: Item updated with: dict_items([('album', 'Alternative')])
fromfilename: Item updated with: dict_items([])
fromfilename: Item updated with: dict_items([('album', 'Alternative')])
fromfilename: Item updated with: dict_items([('album', 'Alternative')])
...
...
Enter search, enter Id, aBort, eDit, edit Candidates, plaY? g
...
...
fromfilename: Item updated with: dict_items([])
Looking up: /...Alternative/caesars+palace+-+jerk+it+out.mp3
In this case what does the "Item updated with " message mean exactly? Note that this single track actually does have an artist, album and title tag set.
On a next file i see:
In this case updating the item album Alternative is a bad idea since we just asked the importer to "Group Albums" and to not consider the containing folder's name as the album title (or I am mistaken and this is how Beets works even in "Group albums" mode. Let's investigate....).
fromfilename: Item updated with: dict_items([('album', 'Alternative')])
Looking up: /...Alternative/Elastica - Connection.mp3
This next one is probably impossible for the plugin, so we might just ignore that weird namings like that can't really be considered:
fromfilename: Item updated with: dict_items([('album', 'Alternative')])
Looking up: /.../Alternative/Stigmata Soundtrack - 01_Mary Mary (Stigmatic Mix) - Chumbaw.mp3
But also here we see that the missing album is replaced with the folder name which in this case doesn't improve the metadata search request. Also note that this file had a correct artist and title tag. Album was missing though.
/...Alternative/Stigmata Soundtrack - 01_Mary Mary (Stigmatic Mix) - Chumbaw.mp3 (1 items)
Sending event: import_task_before_choice
Sending event: before_choose_candidate
Finding tags for album "Chumbawamba - Alternative".
Candidates:
I'm stopping now since I'm running out of time but will continue with this folder. There is loads of difficult material in it ;-)
Please don't take this too critical. Just trying to give a real life example and I'm sure there might be situations were no automation on this world will prevent a manual intervention, eg. search/rename/restructure.
One thing I always wanted to have in Beets is the ability to edit what is actually sent to the metadata source withihn the importer. This way an only halfway correct suggestion from filename or wonky tags could be quickly fixed. Similar to the edit plugin, but before the search, but this is out of scope here!
So the main thing I notice and question is: Is it expected that "non-album-folders" names will be considered as album names in "Group Albums" mode, and can we do anything about it?
Anyway, hope it helps, and again: Love that you took on the task of making this plugin more useful! I'll try to do more playing around this weekend I hope!!!
|
Woohoo! Thank you for this very thorough and constructive feedback. The rewrite was mainly done through my own lense, so there's a lot of scenarios I may have missed or things I didn't catch. Regex was a combination of regex101 and by heart. I'm not gonna let a chatbot have all the fun writing regex!! I unfortunately(?) enjoy the process of writing out all the code myself, so unless someone else requires me to, I tend to stay away from LLMs with my code. Good issue raised here - this log message could be more clear. It essentially shows the fields that it thinks are empty and it should substitute. Since all the files are lacking the album tag, it's using "Alternative" as the album tag. This would be so useful! Maybe it could be an expansion of the edit plugin, since it's sort of on the right track, and could stand to allow more fields to be edited too. Here are the changes I think I 'll make to have it perform better that will hopefully 1.I think I will restructure that log message to the following: This should be more instructive as to what it is actually doing with each file.
Thanks for this great feedback - would much rather get this right the first time than have to keep patching it. |
Haha! Yes! That's the spirit! I'm lazy often I have to admit and let AI help me, but especially with regex I had bad experiences and regex101 and human thinking is the only way! Brings back one thought I'd like to throw in in case you are not aware already: Putting more complex regex into yaml can be a PITA, which is the case here with the custom patterns you implemented. If people try to do crazy black magic with this feature we might get issues in the future. Didn't look how you are escaping these patterns. Will do in my final low level review but I'll let you continue first withe below's awesome feature ideas! ❤️
Let's definitely keep that in mind for the future! Yeah an edit plugin PR sounds doable!
That sounds super clear, explains everything and is still quite concise! ❤️
Please help me out here: When we set it on the CLI already it will be enabled throughout the entire import run.? No matter in which folder a file resides, the importer will always try to mere everything together right? Now if I enable it interactively during the importer run (like in my example), only for this one folder (and it's subfolders???), will the group albums feature unset itself when we jump out of the folder again. Eg. in my example, once we Or will it stay enabled always from that point on, in the import run? Sorry, confused how Beets actually works in that matter 🤷
Hmm if that is easy to implement, do it. Also add a CLI option for it then. I'm picturing a workflow where you What I missed in my first read-through is that you actually already thought of disabling this in a way: So with
you tried to point to this feature? Right? This is a great addition, let's try not ot overengineer it but brainstorm a little. How would a CLI option look like that tells the importer (well actually the fromfilename plugin) that for this run the guessing by foldernames for albumnames shouldn't be done. Something like
Dozens of thanks going out here! This will be so so super helpful! |
This pull request improves the usability and configurability of the
fromfilenameplugin to encompass more file name patterns and folder names, which can contain usable information that will lead to better results when put through the metadata plugins, even for files with partial tags.These changes gives the plugin the ability, by default, to extract the following fields:
album,artist,albumartist,disc,track,title,media,catalognum,yearA comprehensive set of tests has been written to make sure the patterns can recognize the information in a wide
variety of layouts and file names.
For more unusual folder and filename patterns, users can now specify a pattern, using the same method as our
path format strings, e.g. "$albumartist - $album ($comments)"
Changes & Issues Addressed:
Notes:
Housekeeping