It's not trivial, but it would be simpler if every word in your list can be represented as a single token, like:

from whisper.tokenizer import get_tokenizer

tokenizer = get_tokenizer(multilingual=False)

tokenizer.encode(" cat")  # [3797]
tokenizer.encode(" dog")  # [3290]
tokenizer.encode(" mouse")  # [10211]

At this point it becomes just a multi-class classification, as in the sketch above. Otherwise, during decoding, you could add another instance of LogitFilter which would block all token sequences except the ones you allow:

From whisper/whisper/decoding.py, lines 367 to 380 at 0f39c89:

class LogitFilter:
    def apply(self, logits: Tensor, tokens: Tensor) -> None:
        """Apply any filtering or masking to logits in-place"""
        raise NotImplementedError
