How to Use Whisper for Feature representation? #1837
Unanswered
YiYang-github asked this question in Q&A
Replies: 2 comments 2 replies
-
What you need is here:
-
Have you found a solution?
-
I've noticed that Whisper is capable of speech translation and language identification, but it appears to lack the ability to directly convert audio input into a structured feature representation, like a 1024-dimensional vector. For my project, I am looking to process a dataset of Chinese audio clips, each containing a single word, and I would like to use Whisper to perform initial feature extraction. Could you provide guidance on how this might be achieved, or suggest alternative methods if Whisper isn't suited for this type of feature extraction?
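One possible approach, offered as a sketch rather than an official recipe: Whisper's audio encoder already produces a sequence of hidden states, and averaging those over time gives one fixed-size vector per clip. The example below assumes the `openai-whisper` package and PyTorch are installed and uses the `medium` checkpoint, whose encoder width is 1024. The names `extract_features` and `num_valid_frames` are hypothetical helpers written for this illustration, not part of the library. Because single-word clips are much shorter than the 30-second window Whisper pads to, the sketch averages only the frames that cover real audio; that pooling choice is an assumption and may need tuning for your data.

```python
import math

# Whisper resamples to 16 kHz and uses a 10 ms mel hop (160 samples); the
# encoder's strided convolution halves the frame rate, so each encoder frame
# spans 320 samples (20 ms).
SAMPLES_PER_ENCODER_FRAME = 320


def num_valid_frames(num_samples: int, max_frames: int = 1500) -> int:
    """Number of encoder frames that cover real audio rather than padding."""
    frames = math.ceil(num_samples / SAMPLES_PER_ENCODER_FRAME)
    return max(1, min(max_frames, frames))


def extract_features(path: str, model_name: str = "medium"):
    """Return one fixed-size vector for the clip at `path` (hypothetical helper)."""
    # Imported here so num_valid_frames stays usable without the heavy dependencies.
    import torch
    import whisper

    model = whisper.load_model(model_name)
    audio = whisper.load_audio(path)        # float32 waveform at 16 kHz
    n_valid = num_valid_frames(len(audio))  # count frames before padding
    audio = whisper.pad_or_trim(audio)      # pad or trim to 30 s
    mel = whisper.log_mel_spectrogram(audio, n_mels=model.dims.n_mels)
    mel = mel.to(model.device)

    with torch.no_grad():
        # Encoder output has shape (1, 1500, n_audio_state).
        hidden = model.embed_audio(mel.unsqueeze(0))

    # Mean-pool over the frames that correspond to actual audio.
    return hidden[0, :n_valid].mean(dim=0).cpu()
```

Calling `extract_features("clip.wav")` (the file name is a placeholder) should yield a 1024-element tensor with the `medium` model; other checkpoints have different widths. For many clips, load the model once outside the function instead of on every call.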