Zero-shot Audio Classification using Whisper #673
jumon
started this conversation in
Show and tell
Can we add more classes to it?
Hi there!
I have found Whisper to be good at recognizing environmental sounds without fine-tuning, so I wrote some code to perform zero-shot audio classification with Whisper.
Code: https://github.com/jumon/zac
Demo: https://huggingface.co/spaces/Jumon/whisper-zero-shot-audio-classification
If you are interested, give it a try and let me know what you think.
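For anyone wondering how a speech recognition model can classify sounds without any training, here is a minimal sketch of one possible scoring scheme. It is an illustration under stated assumptions, not necessarily what the linked code does: each class name is treated as a candidate transcription, the model assigns a log-probability to each token of that name, and the class with the highest average log-probability wins. The function names and the numbers in `example` are hypothetical, and no Whisper call is made here.

```python
from typing import Dict, List


def average_log_prob(token_log_probs: List[float]) -> float:
    """Mean log-probability per token, so longer class names are not penalized."""
    return sum(token_log_probs) / len(token_log_probs)


def classify(token_log_probs_per_class: Dict[str, List[float]]) -> str:
    """Return the class name whose tokens the model found most likely."""
    return max(
        token_log_probs_per_class,
        key=lambda name: average_log_prob(token_log_probs_per_class[name]),
    )


# Hypothetical per-token log-probabilities for three candidate class names.
# In a real pipeline these would come from the model's decoder for one audio clip.
example = {
    "dog": [-0.4, -0.6],
    "rain": [-2.0, -1.5, -1.8],
    "siren": [-3.1],
}

prediction = classify(example)
print(prediction)
```

Under this scheme, adding more classes only means adding more candidate names to score, since nothing is trained per class.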
I evaluated the code on zero-shot environmental sound classification using the ESC-50 dataset, which contains 2,000 audio samples from 50 classes (40 samples per class), and it achieved 31.8% accuracy. Since random guessing over 50 balanced classes would score 2%, I think the result is not bad.
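To put those numbers in context, the arithmetic can be checked with a few lines. This only uses the figures quoted above. The count of correct predictions is approximate because 31.8% is a rounded value.

```python
num_samples = 2000
num_classes = 50
reported_accuracy = 0.318

# ESC-50 is balanced, so every class has the same number of clips.
samples_per_class = num_samples // num_classes

# Uniform random guessing picks the right class 1 time in 50.
chance_accuracy = 1 / num_classes

# Approximate number of correctly classified clips (the reported accuracy is rounded).
approx_correct = round(reported_accuracy * num_samples)

# How many times better than chance the reported result is.
ratio_to_chance = reported_accuracy / chance_accuracy

print(f"samples per class: {samples_per_class}")
print(f"chance accuracy:   {chance_accuracy:.1%}")
print(f"approx. correct:   {approx_correct} of {num_samples}")
print(f"ratio to chance:   {ratio_to_chance:.1f}x")
```

So the reported accuracy corresponds to roughly 636 correct clips and is about 16 times the chance level.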