Looking for complete guide for how to get best performance for transcribing Chinese TV show using Whisper #2245

vlashada · 2024-06-25T21:12:50Z


I have MP4 files of Peppa Pig with a Chinese dub. I am trying to make a transcript using OpenAI Whisper, but I am struggling to get high-quality output. The transcripts frequently contain hallucinations or repeated words.

For preprocessing, I am removing the intro and outro of the video. For postprocessing, I am removing lines if they are very long, have many repeated words, or have many non-Hanzi characters. These are my current options for the transcript:

res = model.transcribe(
    filename,
    temperature = (0.0, 0.2, 0.4),
    compression_ratio_threshold = 1.3,
    logprob_threshold = -0.5,
    no_speech_threshold = 0.6,
    condition_on_previous_text = True,
    initial_prompt = "",
    word_timestamps = False,
    prepend_punctuations = "\"'“¿([{-",
    append_punctuations = "\"'.。,，!！?？:：”)]}、",
    language = "Chinese",
    sample_len = None,
    best_of = 20,
    beam_size = 5,
    patience = None,
    length_penalty = 0.1,
    prompt = None,
    prefix = None,
    suppress_tokens = "-1",
    suppress_blank = True,
    without_timestamps = False,
    max_initial_timestamp = 1.0,
    fp16 = True,
)
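
For reference, the postprocessing rules described above (drop lines that are very long, highly repetitive, or mostly non-Hanzi) could be sketched roughly as follows. This is only an illustration: the helper names and all three thresholds are hypothetical placeholders rather than the values actually used, repetition is measured per character rather than per word, and "Hanzi" is approximated by the basic CJK Unified Ideographs range.

```python
import re
from collections import Counter

# Hypothetical thresholds (assumptions); tune them on your own transcripts.
MAX_CHARS = 60             # drop lines longer than this
MAX_REPEAT_FRACTION = 0.5  # drop lines dominated by one repeated character
MIN_HANZI_FRACTION = 0.6   # drop lines that are mostly non-Hanzi

# Basic CJK Unified Ideographs block; punctuation is deliberately excluded.
HANZI = re.compile(r"[\u4e00-\u9fff]")


def hanzi_fraction(text: str) -> float:
    """Share of non-whitespace characters that are Hanzi."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if HANZI.match(c)) / len(chars)


def repeat_fraction(text: str) -> float:
    """Share of Hanzi taken up by the single most frequent character."""
    hanzi = HANZI.findall(text)
    if not hanzi:
        return 0.0
    return Counter(hanzi).most_common(1)[0][1] / len(hanzi)


def keep_line(text: str) -> bool:
    """Return True if a transcript line passes all three filters."""
    text = text.strip()
    if not text or len(text) > MAX_CHARS:
        return False
    if hanzi_fraction(text) < MIN_HANZI_FRACTION:
        return False
    # Very short lines repeat characters naturally, so only check longer ones.
    if len(text) >= 8 and repeat_fraction(text) > MAX_REPEAT_FRACTION:
        return False
    return True
```

A filter like this could be applied to the `text` field of each entry in `res["segments"]`, keeping only the segments for which `keep_line` returns True.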

I have looked for a comprehensive guide to what each parameter does and what the best strategy is for getting good results, but I am struggling to find any good resources. I have a Quadro RTX 6000, so I can easily run large-v3.
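
One parameter that may be worth understanding in isolation is `compression_ratio_threshold`. As far as I can tell, Whisper computes the ratio of a decoded segment's UTF-8 byte length to its zlib-compressed length, and retries at the next temperature in the tuple when that ratio exceeds the threshold. The sketch below reproduces that measure outside the library so candidate thresholds can be compared on real text. The two sample strings are invented for illustration.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Raw UTF-8 size divided by zlib-compressed size; repetitive text scores higher."""
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw))


# Invented examples (assumptions): one ordinary sentence, one looping hallucination.
varied = "佩奇和乔治今天去奶奶家玩，他们在花园里发现了一只小乌龟。"
looped = "我喜欢跳泥坑。" * 20

print("varied:", round(compression_ratio(varied), 2))
print("looped:", round(compression_ratio(looped), 2))
```

The library default for this threshold is 2.4, so a value of 1.3 is much stricter and may also reject ordinary dialogue that happens to repeat a few phrases, pushing more segments onto the higher-temperature fallbacks. Running the function over known-good and known-bad lines from an episode is one way to pick a value empirically.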

dgoryeo · 2025-06-16T15:16:59Z


Hi @vlashada, I came across this old post. I'm keen to hear whether you have found better strategies or settings than the parameter values above.
