Description
Hi, thanks for releasing the great nvidia/AudioSkills dataset!
I'm currently exploring the dataset (e.g., audioskills_xl/AudioSet.json) and noticed that the "id" fields (e.g., "YJRaxh5RfawI", "YFHuxuM-iRo4") don't seem to correspond to any YouTube IDs in the official AudioSet CSVs (balanced_train_segments.csv, unbalanced_train_segments.csv, eval_segments.csv).
I checked using:
grep -E "YJRaxh5RfawI|YFHuxuM-iRo4" Audioset/{eval_segments.csv,unbalanced_train_segments.csv,balanced_train_segments.csv}
No matches were found.
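For anyone reproducing the check, here is a rough shell sketch of the lookup. It also tries one guess that I have not confirmed: both example IDs are 12 characters long and start with "Y", while YouTube IDs are 11 characters, so the sketch retries with the leading "Y" removed. The function names lookup and try_id are made up for illustration, and the "Y" prefix is purely an assumption, not something documented for AudioSkills.

```shell
#!/usr/bin/env bash
# Sketch: look up an AudioSkills-style ID in AudioSet segment CSVs.
# ASSUMPTION (unconfirmed): a 12-character ID starting with "Y" may be an
# 11-character YouTube ID with a "Y" prefix.

# lookup ID CSV...
# Prints rows whose first comma-separated field equals ID exactly.
# Returns 0 if at least one row matched, 1 otherwise.
lookup() {
  local id="$1"; shift
  awk -F',' -v id="$id" '$1 == id { print; found = 1 } END { exit !found }' "$@"
}

# try_id ID CSV...
# First tries the ID verbatim, then (hypothetically) without a leading "Y".
try_id() {
  local raw="$1"; shift
  if lookup "$raw" "$@"; then
    return 0
  fi
  if [ "${#raw}" -eq 12 ] && [ "${raw#Y}" != "$raw" ]; then
    lookup "${raw#Y}" "$@"
  else
    return 1
  fi
}
```

Usage would look like `try_id YJRaxh5RfawI Audioset/{eval_segments.csv,unbalanced_train_segments.csv,balanced_train_segments.csv}`. Matching on the first CSV field rather than grepping the whole line avoids accidental hits inside the label column.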
My questions:
• Are these IDs derived from original AudioSet YouTube IDs?
• If not, what’s the correct way to align AudioSkills samples with entries from the original AudioSet (for example, to obtain labels or categories)?
• Is there a mapping file between AudioSkills IDs and original AudioSet segment IDs?
Any clarification would be greatly appreciated. Thanks again for this excellent work!