Documentation
- Well organized.
- What is a vlink? I presume that that's a link to a video recording. It would make sense to use this part of your documentation to annotate any attributes whose names do not make their meanings obvious.
- Steps and figures ... hmmm ... OK, we probably should have had this conversation earlier, and it's really OK anyway, but ... We do not really have steps in ECD, we just have figures. A step is an ordinary walking step, often used (actually) with a count to ensure timing with the music, e.g., "Four changes of rights and lefts, four steps per change." The normal assumption is that you take a step on every beat of the music, if you are moving at all. Four changes of rights and lefts is an example of a figure. Another is up a double (and variants, e.g., up a double and back, down a double). The website Up a Double has a nice list of figures or figure categories. Notice how short the list is!
E-R Analysis
- I'm confused about the meanings of step and figure, following on the above remarks. For you all, a step has a sequence. I would have said that a dance has a sequence of figures, and that a figure is a sequence of moves or steps. But, anyway, I'm not sure how you model the sequence, or how you model a move of a figure. I don't quite know what those words mean. (Maybe this information belongs in your documentation.)
- Otherwise, good modeling of relationship sets in particular.
Schema and SQL
- By the time I got to reviewing this, your schema.sql had mysteriously disappeared, so I looked at the one in P3. This is slightly unfortunate, because I wanted to understand your schema better by seeing the data types, which (understandably) are not present in your textual representation of the schema.
- The Figure relation here makes sense, though specifying the duration would probably not work. If the integer type duration represents the number of steps or beats, this makes a back-to-back in 8 steps a completely different figure from one in 12 steps, which does not quite agree with most dancers' intuitions. Moreover, most figures involve a sequence of steps and changes of direction. For instance, in a 2-couple minor set, four changes of rights and lefts, the dancers walk around the square boundary of their set in some direction, depending on their initial position: the first man and second woman go clockwise, the others counter-clockwise. On each side of the square, the two passing each other clasp right or left hands (generally, right on the first pass, left on the second, etc.). Each side would require some number of steps, and the number is usually, but not necessarily, the same on each side. So the duration of the whole figure does not give us enough information.
- The above remarks suggest that the list of possible figures should be small (see above), and that there be some other entity set, and therefore some other relation, that parametrizes the variant of that figure.
- I still do not know what a step (Step) is or why a step would refer back to a particular dance. I'm baffled.
- The FigureStep relation looks interesting, but, again, I don't know what it is supposed to model, or what place is exactly.
- Declaring your IDs as autoincrement is actually inefficient in SQLite. If you leave that off, then when you declare an attribute with type integer and make it the primary key, SQLite automatically treats it as an alias for its own rowid, which it creates for every ordinary table and which is already assigned automatically on insert. This is a special feature of SQLite. Practically everybody made this mistake, and it's my fault. I myself did not learn about this feature until after the first few weeks of class. I mentioned it, but probably not enough times.
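The rowid point can be checked directly. Here is a minimal sketch using Python's built-in sqlite3 module; the Figure table and its columns are hypothetical stand-ins, not your actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: only the primary key declaration matters here.
# Note there is no AUTOINCREMENT keyword.
conn.execute("CREATE TABLE Figure (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Insert without supplying an id; SQLite assigns one from the rowid.
conn.execute("INSERT INTO Figure (name) VALUES ('circle left')")
conn.execute("INSERT INTO Figure (name) VALUES ('back-to-back')")

# id and rowid are the same value, because id is an alias for rowid.
rows = conn.execute("SELECT rowid, id, name FROM Figure ORDER BY id").fetchall()
for row in rows:
    print(row)

# The bookkeeping table that AUTOINCREMENT requires was never created.
has_sequence_table = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE name = 'sqlite_sequence'"
).fetchone()[0]
print(has_sequence_table)  # 0
```

The IDs still count up on their own; what you give up by dropping autoincrement is only the guarantee that an ID is never reused after its row is deleted.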
Code
You all put some extra effort into the deduplication algorithm, which is ingenious and super-cool. Now that I've described a bit more of what a figure is, let's consider how else you might have defined the problem and then addressed it. In this case, you have a fairly small and finite set of figures to start with, and you need to classify a large set of terms for those figures into that small set of classes. This would actually be easier for AI: it is close to a traditional NLP classification problem, at which an LLM should do quite well. Or you could use a pre-LLM technology, such as simple minimum edit distance (Levenshtein distance), K-means clustering, or some other proximity-based method. You would want the algorithm to fail where the dissimilarity to any known class is too great, because that would enable you to identify special minority classes. Then you could add those minority classes to your set of figures.
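To make the edit-distance option concrete, here is a rough sketch in plain Python. The list KNOWN_FIGURES, the function classify, and the max_ratio cutoff of 0.4 are all assumptions made up for illustration; a real cutoff would need tuning against your data.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical, deliberately short list of figure classes.
KNOWN_FIGURES = ["circle", "back-to-back", "rights and lefts", "up a double"]


def classify(term, classes=KNOWN_FIGURES, max_ratio=0.4):
    """Return the nearest known class, or None if the term is too dissimilar."""
    term = term.lower().strip()
    best = min(classes, key=lambda c: levenshtein(term, c))
    dist = levenshtein(term, best)
    # Normalize by length so long names are not penalized unfairly.
    if dist / max(len(term), len(best)) > max_ratio:
        return None  # candidate for a new minority class
    return best


print(classify("back to back"))    # back-to-back
print(classify("Rights & Lefts"))  # rights and lefts
print(classify("poussette"))       # None
```

The None result is the "fail" case: a term such as "poussette" that is far from every known class gets flagged for review rather than forced into the nearest one.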
There is also a potentially interesting problem involved in separating out what I am calling the figure class from the parameters that determine its specific variety. For instance, there's a circle left and a circle right. Either one can go all the way, half way, or a quarter way. (Moreover, a single file actually is a circle, but without holding hands. Should it be classified as a circle, and have hand-holding be a parameter?) If the parameters are separated out from the figure classes, then how would they be modeled and stored?
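One possible answer, purely as a sketch: keep a small table of figure classes, a table of variants, and a table of name/value parameters per variant. All table names, column names, and parameter names below (FigureClass, FigureVariant, VariantParam, direction, extent, hands) are hypothetical, and this shape is only one of several reasonable designs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FigureClass (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE FigureVariant (
    id       INTEGER PRIMARY KEY,
    class_id INTEGER NOT NULL REFERENCES FigureClass(id)
);
CREATE TABLE VariantParam (
    variant_id INTEGER NOT NULL REFERENCES FigureVariant(id),
    name       TEXT NOT NULL,
    value      TEXT NOT NULL,
    PRIMARY KEY (variant_id, name)
);
""")

conn.execute("INSERT INTO FigureClass (id, name) VALUES (1, 'circle')")

# Two variants of the same class; variant 2 treats a single file
# as a circle without hand-holding.
variants = {
    1: {"direction": "left", "extent": "half", "hands": "joined"},
    2: {"direction": "right", "extent": "full", "hands": "none"},
}
for variant_id, params in variants.items():
    conn.execute(
        "INSERT INTO FigureVariant (id, class_id) VALUES (?, 1)", (variant_id,)
    )
    conn.executemany(
        "INSERT INTO VariantParam (variant_id, name, value) VALUES (?, ?, ?)",
        [(variant_id, name, value) for name, value in params.items()],
    )

# Find every variant, of any class, danced without holding hands.
no_hands = conn.execute("""
    SELECT c.name, v.id
    FROM FigureClass c
    JOIN FigureVariant v ON v.class_id = c.id
    JOIN VariantParam p ON p.variant_id = v.id
    WHERE p.name = 'hands' AND p.value = 'none'
""").fetchall()
print(no_hands)  # [('circle', 2)]
```

The name/value table is flexible but untyped. If the set of parameters turns out to be small and stable, ordinary typed columns on the variant table would be easier to constrain and query.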
But hats off for your experiment in applying AI to this problem!
Great job overall, team!