How to Use Whisper for Feature Representation? #1837

YiYang-github · 2023-11-24T13:46:57Z

I've noticed that Whisper is capable of speech translation and language identification, but it appears to lack the ability to directly convert audio input into a structured feature representation, like a 1024-dimensional vector. For my project, I am looking to process a dataset of Chinese audio clips, each containing a single word, and I would like to use Whisper to perform initial feature extraction. Could you provide guidance on how this might be achieved, or suggest alternative methods if Whisper isn't suited for this type of feature extraction?

EtienneAb3d · 2023-11-24T14:52:48Z

What you need is here:
https://github.com/facebookresearch/LASER
;-)
PS: LASER works on text, not audio, so you would apply it to Whisper's transcription output. Keep in mind that Whisper is probably not very accurate on single-word audio.

1 reply

YiYang-github Nov 25, 2023
Author

Thank you for the LASER recommendation.
However, I'm looking for an audio signal representation like Wav2Vec's, one that captures features such as timbre and intonation rather than just text transcriptions. Still, I appreciate your suggestion!
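One possible way to get a Wav2Vec-style vector straight from audio is to run only Whisper's encoder and pool its hidden states. Nobody in this thread confirmed this approach, so treat the following as a minimal sketch. It assumes the `openai-whisper` package (which needs `torch` and ffmpeg). `embed_clip` and `valid_frame_count` are hypothetical helper names. Mean pooling over the non-padded frames is an assumed design choice, not something Whisper prescribes. The `medium` checkpoint is used because its encoder width is, to my understanding, 1024; whether the resulting vector captures timbre or intonation well enough for single-word clips would need to be checked on real data.

```python
# Sketch only: requires `pip install openai-whisper` (pulls in torch) and ffmpeg.
SAMPLES_PER_ENCODER_FRAME = 320  # 16 kHz audio, 10 ms mel hop, then a stride-2 conv
MAX_ENCODER_FRAMES = 1500        # encoder positions for the padded 30 s window


def valid_frame_count(n_samples: int) -> int:
    """Number of encoder frames that overlap real audio rather than padding."""
    frames = (n_samples + SAMPLES_PER_ENCODER_FRAME - 1) // SAMPLES_PER_ENCODER_FRAME
    return max(1, min(MAX_ENCODER_FRAMES, frames))


def embed_clip(path: str, model_name: str = "medium"):
    """Return one fixed-size vector for an audio file (hypothetical helper)."""
    # Imported lazily so valid_frame_count can be used without Whisper installed.
    import torch
    import whisper

    model = whisper.load_model(model_name)
    audio = whisper.load_audio(path)  # float32 waveform resampled to 16 kHz
    n_frames = valid_frame_count(len(audio))

    # Whisper's encoder expects a 30 s window, so short clips are zero-padded.
    mel = whisper.log_mel_spectrogram(
        whisper.pad_or_trim(audio), n_mels=model.dims.n_mels
    ).to(model.device)

    with torch.no_grad():
        hidden = model.embed_audio(mel.unsqueeze(0))  # (1, frames, width)

    # Assumption: average only the frames that cover real audio, so the
    # padding does not dominate the vector for a single-word clip.
    return hidden[0, :n_frames].mean(dim=0).cpu()
```

A call such as `vec = embed_clip("word.wav")` (the path is a placeholder) would return a 1-D tensor. For a whole dataset it would be better to load the model once and reuse it rather than reloading it on every call as this sketch does.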

perpetual-pj · 2024-04-25T02:55:46Z

Have you found a way to do this?

1 reply

superFilicos Aug 7, 2024

Yes, I found one.
