Conversation
superbock left a comment
Nice work! I am not sure about some defaults (e.g. the 5-frame look-back or a threshold of 0.1); these should at least be documented somehow or reconsidered completely -- e.g. the 5 frames could maybe be expressed as a minimum interval or something like that.
madmom/features/beats.py
buffer = self.buffer(activation)
# create an online interval histogram
histogram = self.tempo_estimator.online_interval_histogram(
    activation, reset=False)
You should pass the reset argument to the tempo_estimator as well, i.e. reset=reset. Otherwise the beat tracker resets its state, but the tempo estimation still keeps its old state.
Hmm, actually I think it's ok the way it is? The tempo_estimator gets reset inside the reset() method of the BeatTracker, which may be called whenever process_online() is called.
If I want to compute a whole song, process_online() gets called only once with all the activations and reset=True. If I used reset=reset here, then while iterating through all activations it would reset the tempo_estimator for each activation.
Ok, I missed the resetting inside reset(). Anyway, I still think passing reset=reset is the cleaner solution compared to calling self.tempo_estimator.reset(). When iterating over activations frame by frame, reset must of course be set to False -- and this is exactly what is done in online mode.
Ah sorry, I forgot to pass the online=True parameter to process_single() (I'm calling stuff with a script). Nevermind, works now ;)
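For reference, a minimal sketch of the reset forwarding discussed above (the class and method names are simplified stand-ins, not the actual madmom API): only the first frame of a fresh run resets the shared tempo state, so a whole-song call with reset=True and frame-by-frame calls with reset=False both behave correctly.

```python
class TempoEstimator:
    """Toy tempo estimator that keeps state across online calls."""

    def __init__(self):
        self.history = []

    def reset(self):
        self.history = []

    def online_interval_histogram(self, activation, reset=True):
        # forwarded reset clears the tempo state together with the tracker
        if reset:
            self.reset()
        self.history.append(activation)
        return self.history


class BeatTracker:
    """Toy tracker forwarding its reset flag to the tempo estimator."""

    def __init__(self):
        self.tempo_estimator = TempoEstimator()

    def process_online(self, activations, reset=True):
        lengths = []
        # only the very first frame of a run may reset the state
        for i, act in enumerate(activations):
            hist = self.tempo_estimator.online_interval_histogram(
                act, reset=reset and i == 0)
            lengths.append(len(hist))
        return lengths
```

Called once per song with reset=True the state is cleared exactly once; called frame by frame with reset=False the state accumulates across calls.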
madmom/features/beats.py
    activation, reset=False)
# get the dominant interval
interval = self.tempo_estimator.dominant_interval(histogram)
# compute current and next beat times
This comment is a bit misleading, especially the part about the next beat. If I get it right, it is the first position at which the next beat can possibly occur, right?
# for every frame the tempo may change and therefore the last beat
# can easily be missed.
if len(detections) > 0 and detections[-1] >= len(buffer) - 5 and \
        buffer[detections[-1]] > self.threshold:
The indentation is weird and the if-statement is very long. I'd prefer to have multiple logical expressions and then test for them, e.g.:
cond_1 = detections[-1] >= len(buffer) - 5
cond_2 = buffer[detections[-1]] > self.threshold
if detections and cond_1 and cond_2:
Of course, more meaningful names would be nice, but you get the idea. What I still don't like are hard-coded values such as -5. Why 5 frames?
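The suggested refactor with (hypothetical) meaningful names might look roughly like this; `MIN_LOOK_BACK` stands in for the hard-coded 5 and could instead be derived from the frame rate or a minimum beat interval:

```python
import numpy as np

# MIN_LOOK_BACK is a made-up constant name replacing the hard-coded 5
MIN_LOOK_BACK = 5
threshold = 0.1

# toy data standing in for the tracker's activation buffer and detections
buffer = np.array([0.0, 0.2, 0.7, 0.1, 0.9, 0.3])
detections = [2, 4]

last_beat = None
if detections:
    # is the last detection close to the end of the buffer?
    near_buffer_end = detections[-1] >= len(buffer) - MIN_LOOK_BACK
    # is its activation strong enough to count as a real beat?
    above_threshold = buffer[detections[-1]] > threshold
    if near_buffer_end and above_threshold:
        last_beat = detections[-1]
```

Testing the emptiness of `detections` first also avoids indexing into an empty list when computing the two conditions.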
I just tested different values here and this one seemed to work best. But I know what you mean... maybe I can find something more rational. For the threshold, the intention was that we probably don't want false positive predictions on silent signals. This also seems to improve the dataset results a bit.
I was never questioning the rationale behind using a threshold, e.g. DBNBeatTrackingProcessor uses it for the very same reason 😏. Back then I added the threshold for the DBN stuff and did not back-port it to the other methods. Maybe we should do so now and add it to the other methods (after evaluating its positive effect, of course).
Some more notes regarding the threshold:
- What I do is determine the first/last frame where the activation is above/below the threshold and use only the segment in between. Although this works quite nicely, I am not sure whether this isn't simply some sort of adaptation towards the annotations...
- The same could be done for multiple segments, not just one per song. Although this sounds logical and straightforward, I am not sure if this segmentation helps in general. We must be aware of the many songs which have segments without any perceivable beats where counting continues. For these cases, splitting into segments doesn't make sense at all.
Ah ok, cool! Yeah, using only "active" segments seems like a good idea. Maybe the distance between those active segments gives a hint whether tracking should stop or not.
For online mode it's a bit different anyway, because we don't know how long the break will be, so we have to decide on the spot (or maybe with a bit of delay).
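The first/last-frame trimming described above can be sketched like this (`trim_to_active_segment` is a made-up name for illustration, not a madmom function); the returned offset is needed to map detections in the trimmed segment back to absolute frame positions:

```python
import numpy as np

def trim_to_active_segment(activations, threshold=0.1):
    """Keep only the segment between the first and last frame whose
    activation exceeds the threshold; also return the start offset."""
    above = np.nonzero(activations > threshold)[0]
    if len(above) == 0:
        # nothing above the threshold, e.g. a silent signal
        return activations[:0], 0
    first, last = above[0], above[-1]
    return activations[first:last + 1], first

# toy activation: silence, an "active" part, then silence again
act = np.array([0.0, 0.05, 0.4, 0.8, 0.3, 0.02, 0.0])
segment, offset = trim_to_active_segment(act, threshold=0.1)
# segment covers frames 2..4; detections found in it must be shifted
# by `offset` to get absolute frame indices
```

Extending this to multiple active segments per song would mean applying the same idea to every run of above-threshold frames, with the caveats about beat-free passages discussed above.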
# display tempo
display.append('| %5.1f | ' % float(self.fps * 60 / interval))
sys.stderr.write('\r%s' % ''.join(display))
sys.stderr.flush()
We should consider refactoring the visualisation stuff as a mixin which provides a visualize() method. BeatTrackingProcessor and DBNBeatTrackingProcessor could then inherit from this mixin and simply call visualize() whenever needed. The mixin should have/keep its own state and counters. Some more notes: we should consider using constants to define how long the beats are displayed (e.g. self.beat_counter = DISPLAY_DURATION) or the inertia of the RNN beat activation strength (e.g. DISPLAY_INERTIA).
I think that would be very nice! Every BeatTracker could show the activation strength, the tempo and whether the current frame is a beat. My only concern is that different trackers may want to show different additional information (like the DBNTracker does with the state space).
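A rough sketch of how such a mixin could look; the constant name DISPLAY_DURATION follows the suggestion above, but everything else (class names, the `extra` hook for tracker-specific information) is made up for illustration:

```python
import sys

# how many frames a detected beat stays highlighted (hypothetical constant)
DISPLAY_DURATION = 10


class VisualizeMixin:
    """Mixin providing a visualize() method with its own state/counters."""

    def __init__(self):
        self.beat_counter = 0

    def visualize(self, activation, beat=False, extra=''):
        if beat:
            # keep the beat marker visible for DISPLAY_DURATION frames
            self.beat_counter = DISPLAY_DURATION
        marker = '*' if self.beat_counter > 0 else ' '
        self.beat_counter = max(0, self.beat_counter - 1)
        # crude activation strength bar
        bar = '#' * int(activation * 10)
        line = '| %s | %-10s | %s' % (marker, bar, extra)
        sys.stderr.write('\r%s' % line)
        sys.stderr.flush()
        return line  # returned here for testing; real code would only display


class SimpleBeatTracker(VisualizeMixin):
    """A tracker inheriting the mixin would call visualize() per frame."""
    pass
```

The `extra` argument is one way to address the concern above: each tracker could pass its own additional information (e.g. the DBN state) as a preformatted string.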
tests/test_features_beats.py
        1.81, 2.15, 2.49]))

def test_process_online(self):
    processor = BeatTrackingProcessor(fps=sample_blstm_act.fps,
minor note: use fps=sample_lstm_act.fps instead.
Thanks for all your comments, I will work through all of them and then amend ;)
    return np.array(positions)


def detect_beats_with_ccf(activations, interval, look_aside=0.2):
I am actually not sure if you want to keep this method :) It is completely interchangeable with detect_beats() but seems to produce slightly worse results. On the other hand, it's much faster. Maybe it's an option for the BeatDetector (with constant tempo).
The problem you are facing here is the fact that many musical pieces simply don't have a constant tempo.
Although you're allowing some deviation from a constant tempo, this is only considered for each beat individually, i.e. the tempo remains stable and only some jitter is allowed. If one beat is a bit earlier (faster), the position of the next one is not based on the current beat, but rather on the previously pre-computed positions.
detect_beats, on the other hand, also allows some deviation from the main tempo, but bases the position of the next beat on the current beat rather than on the pre-computed positions. This makes the method more flexible, but also slower.
If detect_beats_with_ccf performs worse than detect_beats on most datasets, I don't see a point in including it, even if it is faster. I'd rather speed up detect_beats if speed is really an issue.
Yes, I think the same! At least I can use it for my thesis and evaluate it :)
I'm very curious now about the MultiAgent approach because, similar to detect_beats, it always jumps from beat to beat but considers multiple tempo hypotheses and should be a very good fit for online mode 😋
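The general idea behind such a constant-tempo CCF detector can be sketched as follows (a simplification, not the actual detect_beats_with_ccf code): score every possible phase of a pulse train with the dominant interval against the activations, pick the best one, and lay out beats at a fixed interval from it.

```python
import numpy as np

def ccf_beats(activations, interval):
    """Place beats at a constant `interval`, choosing the phase whose
    pulse train best matches the activation function."""
    n = len(activations)
    # correlate each candidate phase with the activations
    scores = [activations[phase::interval].sum() for phase in range(interval)]
    phase = int(np.argmax(scores))
    # pre-computed beat positions at constant tempo
    return np.arange(phase, n, interval)

# toy activation with a clear beat every 4 frames, starting at frame 3
act = np.zeros(20)
act[[3, 7, 11, 15, 19]] = 1.0
beats = ccf_beats(act, 4)
```

This makes the limitation discussed above concrete: every position is fixed once the phase is chosen, so a beat that drifts away from the grid cannot pull the following beats with it, unlike in detect_beats.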
I merged #292, please rebase your branch on master.
Ok, rebased this branch and applied your stuck-AppVeyor fixes to the tests. I think the -0.1% coverage is due to the CCF commit (which can be omitted). Anyway, it would be awesome if you could take it from here \m/
This PR extends the BeatTracker with online capability based on #292.
A simple CCF beat detection method for experimenting was also added.