
1.2 Balance Dataset #6

@hughmancoder

Description

  • Curate and expand the dataset to address class imbalance and other limitations.
  • More percussion recordings are needed.
  • Include the following instrument classes in the project scope, each with more than 600 three-second recordings:
  • Suona — double-reed horn
  • Erhu — two-string bowed fiddle
  • Pipa — four-string plucked lute
  • Dizi — bamboo flute
  • Guzheng — plucked zither
  • Sheng — mouth-blown free-reed pipe organ
  • Percussion — gongs, drums, and cymbals
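
A quick way to track progress toward the per-class target is to count clips per label and report the shortfall. The sketch below assumes each clip has already been cut to 3 s and carries a single class label; the lowercase class names and the `clip_deficits` helper are illustrative, not part of the existing codebase.

```python
from collections import Counter

# Class names from the list above (lowercased here as an assumption).
CLASSES = ["suona", "erhu", "pipa", "dizi", "guzheng", "sheng", "percussion"]
MIN_CLIPS = 600  # the issue asks for more than 600 three-second clips per class


def clip_deficits(labels, classes=CLASSES, min_clips=MIN_CLIPS):
    """Return {class: clips still needed to exceed min_clips}.

    `labels` is an iterable with one class label per 3 s clip.
    Classes that already exceed the target are left out of the result.
    """
    counts = Counter(labels)
    return {
        name: min_clips + 1 - counts[name]
        for name in classes
        if counts[name] <= min_clips
    }
```

For example, a class with exactly 600 clips still shows a deficit of 1, since the target is strictly more than 600.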

Datasets

  • Obtain the CTIS and ChMusic datasets and integrate them into our dataset. The new files can be added to our instrument_name.json file.
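
One possible shape for the integration step, as a rough sketch: it assumes each instrument_name.json file is a flat JSON list of audio file paths, which may not match the real schema. The `add_files_to_manifest` helper is hypothetical.

```python
import json
from pathlib import Path


def add_files_to_manifest(manifest_path, new_files):
    """Append file paths to a JSON list manifest, skipping duplicates.

    Assumes the manifest is a flat JSON list of path strings (hypothetical
    schema). Creates the manifest if it does not exist yet.
    Returns the number of paths actually added.
    """
    path = Path(manifest_path)
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    seen = set(existing)
    added = 0
    for file_path in new_files:
        if file_path not in seen:
            seen.add(file_path)
            existing.append(file_path)
            added += 1
    path.write_text(json.dumps(existing, indent=2, ensure_ascii=False), encoding="utf-8")
    return added
```

Skipping duplicates keeps the step safe to re-run if a dataset is imported more than once.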

Dataset improvements

  • Use data augmentation (mixing Chinese instruments with other instruments) to simulate realistic polyphony.

  • We can do this by taking existing labeled clips and overlaying their audio signals.

  • Consider multi-label detection (presence/absence of each instrument) rather than strictly predicting which single instrument is dominant.
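
The two ideas above fit together: overlaying two labeled clips yields a polyphonic example whose target is the union of their labels, encoded as a presence/absence vector. The following is a minimal sketch, assuming mono float waveforms in [-1, 1] at the same sample rate. The function names, the `snr_db` parameter, and the choice to ignore labels outside the seven target classes are all assumptions for illustration.

```python
import numpy as np

# Target classes from the issue (lowercased here as an assumption).
CLASSES = ["suona", "erhu", "pipa", "dizi", "guzheng", "sheng", "percussion"]


def overlay(primary, secondary, snr_db=10.0):
    """Mix `secondary` under `primary` at the given primary-to-secondary ratio in dB."""
    n = min(len(primary), len(secondary))  # trim both clips to the shorter one
    a = np.asarray(primary[:n], dtype=np.float64)
    b = np.asarray(secondary[:n], dtype=np.float64)
    rms_a = np.sqrt(np.mean(a ** 2))
    rms_b = np.sqrt(np.mean(b ** 2))
    if rms_a == 0.0 or rms_b == 0.0:
        return a + b  # one clip is silent, so there is no ratio to enforce
    gain = rms_a / (rms_b * 10.0 ** (snr_db / 20.0))
    mix = a + gain * b
    peak = np.max(np.abs(mix))
    return mix / peak if peak > 1.0 else mix  # rescale only if the mix would clip


def multi_hot(present, classes=CLASSES):
    """Presence/absence target vector; labels outside `classes` are ignored."""
    present = set(present)
    return np.array([1.0 if name in present else 0.0 for name in classes],
                    dtype=np.float32)


def mix_example(clip_a, labels_a, clip_b, labels_b, snr_db=10.0):
    """Return (mixed audio, multi-hot target) for two labeled clips."""
    audio = overlay(clip_a, clip_b, snr_db)
    target = multi_hot(set(labels_a) | set(labels_b))
    return audio, target
```

With targets in this form, a classifier would typically use one independent sigmoid output per class instead of a single softmax, so that several instruments can be marked present at once.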


Projects

Status

In review
