I have a small Python script that transcribes a batch of WAV files, and I recently made some changes to speed it up. The following part has remained unchanged, though:

```python
model = whisper.load_model('medium.en')  # detects the applicable device automatically
text = model.transcribe(
    audio=wav,
    fp16=has_cuda,
    language='en',
    condition_on_previous_text=False,
    verbose=True
)['text'].strip()
```

I get warnings about the SHA256 checksum not matching, and it downloads the model over and over again until it eventually crashes with this error:
I'm using CUDA 12.1, cuDNN 12.x, and Whisper v20231117. My old setup used CUDA 11, so maybe that has something to do with it? Also, if there are better parameters I could pass to the model, or a better way to load the model and transcribe the audio, please let me know!
I suggest a quick test with a single file using the command line. Some relevant past discussions here:
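It can also help to hash the cached checkpoint yourself and see whether the digest actually changes between runs, so you can tell disk corruption apart from a version mismatch. A minimal sketch, assuming the default cache location `~/.cache/whisper` (adjust the path if you set `download_root`):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical location -- adjust to your actual cache directory:
# print(sha256_of(Path.home() / ".cache" / "whisper" / "medium.en.pt"))
```

If I remember correctly, the expected digest is embedded in the model's download URL, so you can compare against that directly.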
I changed the `download_root` to a directory one level above the CWD, so that shouldn't be a problem. I've tried doing this:

And it's still happening. I was able to fix it, but in order to do so I downloaded the files manually, then passed the proper path as the `name` parameter.

If you want to download the model only one time, first call `load_model` effectively like a void statement. Then load…
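The download-once pattern described above can be sketched like this (the `models` directory name is hypothetical; the first `load_model` call is used only for its download side effect, and the second call loads the checkpoint straight from disk, skipping the checksum/re-download path):

```python
import os

def checkpoint_path(download_root: str, name: str = "medium.en") -> str:
    """Path where whisper.load_model(name, download_root=...) stores its checkpoint."""
    return os.path.join(download_root, f"{name}.pt")

# Usage sketch (requires openai-whisper; not executed here):
# import whisper
# whisper.load_model("medium.en", download_root="models")   # download only, result discarded
# model = whisper.load_model(checkpoint_path("models"))     # load the local .pt file directly
```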