The Transformer, a deep learning architecture, has proven highly adaptable across fields, including music information retrieval (MIR). Its self-attention mechanism captures global, long-range dependencies in a sequence, which is valuable for tracking rhythmic patterns over time. Temporal Convolutional Networks (TCNs), by contrast, use dilated convolutions to process local temporal patterns with reduced complexity. Combining these complementary strengths, global sequence modeling from Transformers and local temporal detail from TCNs, enhances beat tracking while reducing the model's overall complexity.
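To make the point about dilated convolutions concrete, the following is a minimal pure-Python sketch, not the model described here. It shows how a stack of stride-1 dilated 1D convolutions covers a wide temporal context with few weights: the receptive field is 1 + (k - 1) * (sum of dilations). The function names, the kernel size of 3, and the eight doubling dilations are hypothetical values chosen only for illustration.

```python
def receptive_field(kernel_size, dilations):
    """Input frames seen by one output frame of stacked stride-1 dilated convs."""
    return 1 + (kernel_size - 1) * sum(dilations)


def dilated_conv1d(signal, kernel, dilation):
    """'Valid' dilated 1D convolution (cross-correlation form, no padding)."""
    span = (len(kernel) - 1) * dilation  # distance between first and last tap
    return [
        sum(w * signal[t + i * dilation] for i, w in enumerate(kernel))
        for t in range(len(signal) - span)
    ]


# Hypothetical configuration (an assumption, not taken from the text):
# kernel size 3, dilation doubling at each of 8 layers.
dilations = [2 ** i for i in range(8)]      # 1, 2, 4, ..., 128
rf = receptive_field(3, dilations)          # context covered by the stack
weights_per_channel = 3 * len(dilations)    # taps grow linearly with depth

# A two-tap kernel with dilation 2 combines samples that are two steps apart.
example = dilated_conv1d([1, 2, 3, 4, 5], [1, 1], 2)

print(rf, weights_per_channel, example)
```

Under these assumed settings the number of taps per channel grows linearly with depth while the covered context grows roughly exponentially, which is one way to read the reduced-complexity claim for the TCN branch. Self-attention, in contrast, relates every frame to every other frame directly, so it supplies the global view that the local convolutional stack lacks.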