Hello,
I'm a complete beginner to OCR. I did a CS degree about 10 years ago, so I feel like an almost complete beginner to ML too, as I haven't done any since my dissertation. I've been developing an app to teach myself to sight-read (and mainly to learn to supercharge my development with Claude Code).
As part of the app, I have been looking at pulling in scans of music that I have lying around and sticking them into the app, which then detects how well the song is being played. I have been very impressed with homr on the whole, and it was remarkably easy to get it hooked into my codebase.
One thing that I have noticed is that while it is very good at picking up staves, clefs, bar lines, time signatures, notes, rests, etc., it doesn't deal at all with other markings such as note relations (slurs etc.), dynamics, articulation and ornamentation.
I'm slightly tempted to use this as a project to relearn some ML and see how good Claude is at teaching me something new. Is this an incredibly stupid idea, and is there a very obvious reason why these haven't been included in this model?
Thank you for the great work that you have done on this. I have really enjoyed using it in my little project.