Code for the paper "Impact of time and note duration tokenizations on deep learning symbolic music modeling" (ISMIR 2023).
In this work, we analyze the tokenization methods in common use and experiment with different time and note duration representations. We compare how these two design choices affect performance on several tasks, including composer classification, emotion classification, music generation, and sequence representation.
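To make the two design choices concrete, here is a minimal sketch of how the same notes could be serialized under different time and duration representations. It is not the tokenizer used in this repository: the function names, the token names (`TimeShift`, `Bar`, `Position`, `Duration`, `NoteOn`, `NoteOff`), the toy notes, and the 16-ticks-per-bar grid are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# A note is (onset, duration, pitch), with times in ticks. Toy data, not from any dataset.
Note = Tuple[int, int, int]


def tokenize_timeshift_duration(notes: List[Note]) -> List[str]:
    """Time as relative shifts between onsets; note length as an explicit Duration token."""
    tokens, current = [], 0
    for onset, duration, pitch in sorted(notes):
        if onset > current:
            tokens.append(f"TimeShift_{onset - current}")
            current = onset
        tokens += [f"Pitch_{pitch}", f"Duration_{duration}"]
    return tokens


def tokenize_timeshift_noteoff(notes: List[Note]) -> List[str]:
    """Time as relative shifts; note length implied by a later NoteOff token."""
    events = []
    for onset, duration, pitch in notes:
        events.append((onset, 1, pitch))             # 1 marks a NoteOn event
        events.append((onset + duration, 0, pitch))  # 0 marks a NoteOff, sorted first on ties
    tokens, current = [], 0
    for time, is_on, pitch in sorted(events):
        if time > current:
            tokens.append(f"TimeShift_{time - current}")
            current = time
        tokens.append(f"NoteOn_{pitch}" if is_on else f"NoteOff_{pitch}")
    return tokens


def tokenize_bar_position_duration(notes: List[Note], ticks_per_bar: int = 16) -> List[str]:
    """Time as Bar markers plus a Position inside the bar; explicit Duration tokens."""
    tokens, current_bar, current_pos = [], -1, None
    for onset, duration, pitch in sorted(notes):
        bar, pos = divmod(onset, ticks_per_bar)
        while current_bar < bar:  # emit one Bar token per bar boundary crossed
            tokens.append("Bar")
            current_bar += 1
            current_pos = None
        if pos != current_pos:
            tokens.append(f"Position_{pos}")
            current_pos = pos
        tokens += [f"Pitch_{pitch}", f"Duration_{duration}"]
    return tokens


notes = [(0, 4, 60), (4, 4, 64), (16, 8, 67)]
print(tokenize_timeshift_duration(notes))
print(tokenize_timeshift_noteoff(notes))
print(tokenize_bar_position_duration(notes))
```

The three functions encode the same musical content, but the sequences differ in length and in what a model must infer: with `NoteOff` tokens a duration has to be recovered from the distance between two events, while with `Bar` and `Position` tokens the onset is given relative to the bar rather than to the previous event.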
- `pip install -r requirements` to install requirements
- `sh scripts/download_datasets.sh` to download the POP909 and EMOPIA datasets
- Download the GiantMIDI dataset and put it in `data/`
- `python scripts/tokenize_datasets.py` to tokenize data and learn BPE
- `python exp_generation.py` to train generative models and generate results
- `python exp_pretrain.py` to pretrain classification and contrastive models
- `python exp_cla_finetune.py` to train classification models and test them
- `python exp_contrastive.py` to train contrastive models and test them
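
The steps can also be chained from a small driver script. The following is a hypothetical sketch, not a file shipped with this repository: `run_pipeline`, the `--run` flag, and the dry-run default are assumptions. It also assumes the requirements are already installed and that the GiantMIDI dataset has been placed in `data/` by hand, since that step is not scripted.

```python
import subprocess
import sys

# Commands taken from the steps listed in this README, in the same order.
STEPS = [
    ["sh", "scripts/download_datasets.sh"],
    ["python", "scripts/tokenize_datasets.py"],
    ["python", "exp_generation.py"],
    ["python", "exp_pretrain.py"],
    ["python", "exp_cla_finetune.py"],
    ["python", "exp_contrastive.py"],
]


def run_pipeline(steps, dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    Stops at the first failing step because subprocess.run is called with check=True.
    Returns the list of command lines that were processed.
    """
    processed = []
    for cmd in steps:
        line = " ".join(cmd)
        print(line)
        if not dry_run:
            subprocess.run(cmd, check=True)
        processed.append(line)
    return processed


if __name__ == "__main__":
    # Only execute the commands when --run is passed; otherwise just list them.
    run_pipeline(STEPS, dry_run="--run" not in sys.argv)
```

Running the script without arguments only lists the commands; passing `--run` executes them from the repository root, one after another.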