4.3 Integrate our Segmentation model and instrument classifier to automatically label film clips #9

@hughmancoder

Description

Adopt a CRNN (Convolutional Recurrent Neural Network) architecture from the following paper and reproduce its experiment:

Onset Detection for String Instruments Using Bidirectional Temporal and Convolutional Recurrent Networks

Training Process

  • Pre-train the model on the QTDS/Böck dataset.
  • Fine-tune on the curated martial-arts + Jingju dataset.
  • Convert audio to logarithmic spectrograms using a Hanning window with an 11.6 ms hop size [29].
  • Use Librosa for spectrogram generation.

Training

  • Train the CRNN for frame-wise onset classification using the specifications from “Onset Detection for String Instruments Using Bidirectional Temporal and Convolutional Recurrent Networks” [29].
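As a rough illustration of the frame-wise setup, the sketch below shows a generic CRNN in PyTorch: a convolutional front-end over the spectrogram, a bidirectional GRU over time, and a per-frame onset probability. The layer sizes, 80 input bins, and use of a GRU are all assumptions for illustration, not the hyperparameters from [29].

```python
import torch
import torch.nn as nn

class CRNNOnset(nn.Module):
    """Hypothetical CRNN sketch: conv front-end -> BiGRU -> per-frame sigmoid."""

    def __init__(self, n_bins=80, conv_ch=16, rnn_hidden=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, conv_ch, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d((2, 1)),  # pool frequency only, preserving time resolution
        )
        self.rnn = nn.GRU(conv_ch * (n_bins // 2), rnn_hidden,
                          batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * rnn_hidden, 1)

    def forward(self, spec):                    # spec: (batch, freq, time)
        x = self.conv(spec.unsqueeze(1))        # (B, C, F/2, T)
        b, c, f, t = x.shape
        x = x.permute(0, 3, 1, 2).reshape(b, t, c * f)  # one feature vector per frame
        x, _ = self.rnn(x)
        return torch.sigmoid(self.head(x)).squeeze(-1)  # onset probability per frame

model = CRNNOnset()
probs = model(torch.randn(2, 80, 100))          # 2 clips, 80 bins, 100 frames
```

Training would then minimize a per-frame binary cross-entropy against onset/non-onset labels; the bidirectional recurrence lets each frame's prediction use both past and future context.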

Model Evaluation

  • Use the F1-score as the primary evaluation metric.
  • Apply a 70% train / 15% validation / 15% test data split.
  • Merge detected onsets that fall within 30 ms of each other into a single event.
  • Count a detected onset/offset as correct if it falls within ±50 ms of the ground truth.

Metadata

Labels

No labels

Projects

Status

Backlog

Milestone

No milestone
