Whisper for keyword spotting #1249
-
Hi, if I have a list of keywords, how can I use Whisper to get, for each word in the list, the probability that the audio is that keyword? Using Whisper directly can transcribe a similar-sounding word instead of the keyword, whereas I am only interested in the keywords in the list.
Replies: 4 comments
-
If the keywords are very common words, so that each of them is represented as a single token by the tokenizer, you can compare the probabilities directly by taking the softmax of the logits predicted by the model. If some keywords span multiple tokens, the comparison becomes trickier: you can use the sum of the log probabilities, which makes more sense mathematically; the average log probability, which tends to work better in practice; or something in between, like the length penalty used in equation 14 of this paper.
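To make the three scoring options concrete, here is a minimal sketch. It assumes you have already extracted per-token log-probabilities for each keyword from the model (the helper names and example numbers below are illustrative, not part of Whisper's API):

```python
import math

def softmax(logits):
    """Convert a vector of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sum_logprob(token_logprobs):
    # Mathematically principled: log P(keyword) is the sum of per-token log-probs.
    return sum(token_logprobs)

def avg_logprob(token_logprobs):
    # Length-normalised score that often works better in practice.
    return sum(token_logprobs) / len(token_logprobs)

def length_penalized(token_logprobs, alpha=0.6):
    # In-between option: GNMT-style length penalty (equation 14 of the
    # paper referenced above): score = log P / ((5 + |Y|) / 6) ** alpha.
    return sum(token_logprobs) / (((5 + len(token_logprobs)) / 6) ** alpha)
```

With `alpha=0` the length-penalized score reduces to the plain sum; larger `alpha` moves it toward the average, so you can tune how strongly longer keywords are favoured.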
-
Hi, I know this might be a stupid question, but how exactly do we get the logits predicted by the model and apply the softmax?
-
Can you please tell me how you did this? I am fairly new to LLMs, and I would like to use Whisper for keyword spotting in a personal project.
-
Does putting the keywords directly into the prompt work? I noticed that it works with a single keyword, but not properly with more than one. The single keyword also interferes with other similar-sounding words: for example, after I added "Artem" to the prompt, it now gets confused with "startup".
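For reference, prompt biasing as described above can be done through the `initial_prompt` parameter of the openai-whisper `transcribe` function. A minimal sketch, assuming that package; the model size, audio file name, and keyword list are placeholders:

```python
import whisper

# Bias the decoder toward the keyword(s) via the initial prompt.
# As noted above, more than one keyword may degrade results.
model = whisper.load_model("base")
keywords = ["Artem"]
result = model.transcribe("audio.wav", initial_prompt=" ".join(keywords))
print(result["text"])
```

This only nudges the decoder's prior toward the prompted text; it does not give per-keyword probabilities like the logit-comparison approach in the first reply.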