tensor mismatch error when load testing whisper #2019
gupta9ankit5 asked this question in Q&A
You cannot use a model across multiple threads. See the discussion here.
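One way to apply this advice without restructuring the Flask app is to guard the single shared model with a lock so that only one request decodes at a time. The sketch below uses a `FakeWhisperModel` stand-in (an assumption, not the real whisper API) whose `transcribe` mutates per-call state on the shared object, which is essentially why concurrent calls into the real model corrupt each other's `kv_cache`:

```python
import threading

# Stand-in for the Whisper model: like whisper's DecodingTask, it mutates
# shared per-call state on the model object, so unsynchronized concurrent
# calls can corrupt each other.
class FakeWhisperModel:
    def __init__(self):
        self.kv_cache = []

    def transcribe(self, audio):
        self.kv_cache = []               # per-call state on the shared object
        for chunk in audio:
            self.kv_cache.append(chunk)  # grows as decoding proceeds
        return {"text": " ".join(self.kv_cache),
                "cache_len": len(self.kv_cache)}

model = FakeWhisperModel()          # in the real app: whisper.load_model("small")
model_lock = threading.Lock()       # one lock guarding the single shared model

def transcribe_serialized(audio):
    # Only one request may run the model at a time; the others wait here.
    with model_lock:
        return model.transcribe(audio)

results = []

def worker(words):
    results.append(transcribe_serialized(words))  # list.append is thread-safe in CPython

threads = [threading.Thread(target=worker, args=(["hello", "world"],))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# With the lock, every call sees a consistent cache built from its own input.
assert all(r["cache_len"] == 2 for r in results)
```

In the Flask handler this means acquiring the lock around `small.transcribe(...)`. It serializes throughput, but it removes the race; scaling then comes from running more worker processes rather than more threads.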
I am load testing (using Locust) the Whisper model, deployed as a Flask application, and I get the error below. The strange thing is that the error occurs only when the number of users per second is greater than 1; there is no error at all when the number of users is 1.
Error:
RuntimeError: Expected size for first two dimensions of batch2 tensor to be: [12, 1198] but got: [12, 598].
The detailed stack trace is as follows:
[2024-02-13 11:12:44,102] ERROR in app: Exception on / [POST]
Traceback (most recent call last):
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/flask/app.py", line 1463, in wsgi_app
response = self.full_dispatch_request()
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/flask/app.py", line 872, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/flask/app.py", line 870, in full_dispatch_request
rv = self.dispatch_request()
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/flask/app.py", line 855, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) # type: ignore[no-any-return]
File "/Users/user101/Documents/whisper_latest_version_LT/whisp.py", line 44, in transcribe
response = get_transcriptions(audios, whisper_variant)
File "/Users/user101/Documents/whisper_latest_version_LT/whisp.py", line 29, in get_transcriptions
transcriptions = [small.transcribe(data, fp16=False)['text'] for data in datas]
File "/Users/user101/Documents/whisper_latest_version_LT/whisp.py", line 29, in <listcomp>
transcriptions = [small.transcribe(data, fp16=False)['text'] for data in datas]
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/whisper/transcribe.py", line 240, in transcribe
result: DecodingResult = decode_with_fallback(mel_segment)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/whisper/transcribe.py", line 170, in decode_with_fallback
decode_result = model.decode(segment, options)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/whisper/decoding.py", line 824, in decode
result = DecodingTask(model, options).run(mel)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/whisper/decoding.py", line 737, in run
tokens, sum_logprobs, no_speech_probs = self._main_loop(audio_features, tokens)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/whisper/decoding.py", line 687, in _main_loop
logits = self.inference.logits(tokens, audio_features)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/whisper/decoding.py", line 163, in logits
return self.model.decoder(tokens, audio_features, kv_cache=self.kv_cache)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/whisper/model.py", line 211, in forward
x = block(x, xa, mask=self.mask, kv_cache=kv_cache)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/whisper/model.py", line 136, in forward
x = x + self.attn(self.attn_ln(x), mask=mask, kv_cache=kv_cache)[0]
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/whisper/model.py", line 90, in forward
wv, qk = self.qkv_attention(q, k, v, mask)
File "/Users/user101/Documents/whisper_latest_version_LT/lib/python3.9/site-packages/whisper/model.py", line 108, in qkv_attention
return (w @ v).permute(0, 2, 1, 3).flatten(start_dim=2), qk.detach()
RuntimeError: Expected size for first two dimensions of batch2 tensor to be: [12, 1198] but got: [12, 598].