Releases: FrenchKrab/datasets-pyannote

Release v0.1.0

02 Jun 16:02

This release has not been thoroughly tested, but it has been sitting unreleased for some time, so here it is, even if it's not fully baked.

It adds some very common free diarization datasets. The datasets currently supported are:

  • AISHELL-4
  • AMI
  • AliMeeting
  • MSDWild
  • MagicData-RAMC
  • VoxConverse
  • DIHARD-3 (no download since it's paid)

This release also turns the old, awkward scripts/ folder into a "proper" package (sorry, more stuff to install). There are now dedicated CLI tools for most of the preparation operations (I'm not sure how useful they are for others, but at least they make the current bash scripts cleaner!).

Things left to do before a v1.0.0 (if it ever comes):

  • check that everything actually works,
  • move everything into one big CLI file so that the user can run pyannote_datasets install aishell4,
  • check that the code is clear enough and has docstrings,
  • look into uv.
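
As a rough illustration of the single-entry-point idea in the list above, here is a minimal sketch using argparse from the Python standard library. It is not the package's actual code: only the command name pyannote_datasets, the install subcommand and the aishell4 key come from these notes. The other dataset keys are hypothetical, and the install step is a placeholder that downloads nothing.

```python
import argparse

# Only "aishell4" appears in the release notes; the other keys are
# hypothetical names chosen for illustration.
DATASETS = ("aishell4", "ami", "alimeeting", "msdwild", "ramc", "voxconverse")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a single 'install' subcommand."""
    parser = argparse.ArgumentParser(prog="pyannote_datasets")
    subparsers = parser.add_subparsers(dest="command", required=True)
    install = subparsers.add_parser("install", help="download and prepare a dataset")
    # argparse rejects any value that is not listed in DATASETS.
    install.add_argument("dataset", choices=DATASETS)
    return parser


def main(argv=None) -> str:
    """Parse the arguments and report what would be installed."""
    args = build_parser().parse_args(argv)
    # Placeholder: a real tool would call the dataset's preparation code here.
    message = f"would install {args.dataset}"
    print(message)
    return message
```

Calling main(["install", "aishell4"]) mirrors running pyannote_datasets install aishell4 on the command line. An unknown dataset name makes argparse exit with a usage error.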