Releases: FrenchKrab/datasets-pyannote
Release v0.1.0
This release has not been thoroughly tested, but it has been sitting there for some time, so let's release it even if it's not fully baked.
It adds some very common free diarization datasets. The current datasets are:
- AISHELL-4
- AMI
- AliMeeting
- MSDWild
- MagicData-RAMC
- VoxConverse
- DIHARD-3 (no download since it's paid)
This release also turns the old, awkward scripts/ folder into a "proper" package (sorry, more stuff to install). There are now dedicated CLI tools for most of the preparation operations (I'm not sure how useful they are for others, but at least they make the current bash scripts cleaner!).
Things left to do before a v1.0.0 (if it ever comes):
- checking everything actually works,
- moving everything into one big CLI entry point so that the user can run `pyannote_datasets install aishell4`,
- checking that the code is clear enough and has docstrings,
- looking into uv
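
To make the planned single-CLI idea a bit more concrete, here is a rough sketch of what a `pyannote_datasets install <dataset>` entry point could look like with Python's standard `argparse`. This is not code from the package: the dataset keys, the `install` placeholder, and the function names are all assumptions made for illustration, and DIHARD-3 is left out because it has no download.

```python
import argparse

# Hypothetical dataset keys, loosely derived from the dataset list in these notes.
DATASETS = ("aishell4", "ami", "alimeeting", "msdwild", "magicdata-ramc", "voxconverse")


def install(dataset: str) -> str:
    """Placeholder: a real implementation would download and prepare the dataset."""
    return f"would install {dataset}"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a single 'install' subcommand (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="pyannote_datasets")
    subparsers = parser.add_subparsers(dest="command", required=True)
    install_parser = subparsers.add_parser(
        "install", help="download and prepare a dataset"
    )
    # argparse rejects any dataset name that is not in DATASETS.
    install_parser.add_argument("dataset", choices=DATASETS)
    return parser


def main(argv=None) -> str:
    """Parse arguments and dispatch to the matching subcommand."""
    args = build_parser().parse_args(argv)
    if args.command == "install":
        return install(args.dataset)
    raise SystemExit(f"unknown command: {args.command}")
```

Calling `main(["install", "aishell4"])` goes through the same path as running `pyannote_datasets install aishell4` would once `main` is registered as a console script; more subcommands could be added next to `install` in the same way.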