Release v0.1.0

@FrenchKrab FrenchKrab released this 02 Jun 16:02

This release has not been thoroughly tested, but it has been sitting around for some time, so let's release it even if it isn't fully baked.

It adds some very common free diarization datasets. The current datasets are:

  • AISHELL-4
  • AMI
  • AliMeeting
  • MSDWild
  • MagicData-RAMC
  • VoxConverse
  • DIHARD-3 (no automatic download, since it is a paid dataset)

This release also turns the old, awkward scripts/ folder into a "proper" package (sorry, more stuff to install). There are now dedicated CLI tools for most of the preparation operations (I'm not sure how useful they are for others, but at least they make the current bash scripts cleaner!).

Things left to do before a v1.0.0 (if it ever comes):

  • checking that everything actually works,
  • moving everything into one big CLI file so that the user can run pyannote_datasets install aishell4,
  • checking that the code is clear enough and has docstrings,
  • looking into uv.
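
To make the single-entry-point idea from the list above a bit more concrete, here is a minimal sketch of what a pyannote_datasets install aishell4 command could look like using argparse subcommands. This is not the package's current code: the DATASETS registry, its keys (other than aishell4, which appears above), and the build_parser and main functions are all hypothetical names chosen for illustration, and the install step is only a placeholder message.

```python
import argparse

# Hypothetical registry mapping CLI keys to dataset names.
# Only "aishell4" comes from the release notes; the other keys are assumptions.
# DIHARD-3 is left out because it has no download (it is paid).
DATASETS = {
    "aishell4": "AISHELL-4",
    "ami": "AMI",
    "alimeeting": "AliMeeting",
    "msdwild": "MSDWild",
    "magicdata-ramc": "MagicData-RAMC",
    "voxconverse": "VoxConverse",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a single 'install' subcommand."""
    parser = argparse.ArgumentParser(prog="pyannote_datasets")
    subparsers = parser.add_subparsers(dest="command", required=True)
    install = subparsers.add_parser("install", help="download and prepare a dataset")
    install.add_argument("dataset", choices=sorted(DATASETS))
    return parser


def main(argv=None) -> str:
    """Parse arguments and report which dataset would be installed."""
    args = build_parser().parse_args(argv)
    # Placeholder: a real implementation would run the preparation steps here.
    message = f"Would install {DATASETS[args.dataset]}"
    print(message)
    return message
```

Calling main(["install", "aishell4"]) prints the placeholder message for AISHELL-4, and an unknown dataset name is rejected by argparse through the choices list. Exposing main as a console script in the package metadata would be one way to get the pyannote_datasets command itself.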