Replies: 1 comment
Half precision is only supported on GPU. You're running on CPU, so don't call model.half().
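A minimal sketch of how the script in the question could pick its precision from the device, rather than forcing float16. It assumes the fp16 field of whisper.DecodingOptions (which defaults to True, so it is worth setting explicitly on CPU) and leaves the model in the float32 it is loaded in. The helper names use_fp16 and transcribe_first_window are hypothetical, made up for illustration.

```python
def use_fp16(device_type: str) -> bool:
    """Return True only for CUDA devices; float16 is treated as GPU-only here."""
    return device_type == "cuda"


def transcribe_first_window(file_path: str, lang: str = "ja") -> str:
    """Decode the first 30-second window of an audio file (hypothetical helper)."""
    # Imported lazily so use_fp16() can be used without Whisper installed.
    import whisper

    # The model is left as loaded: no model.half() call.
    model = whisper.load_model("base")

    # pad_or_trim keeps a single 30-second window, as in the original script.
    audio = whisper.pad_or_trim(whisper.load_audio(file_path))
    mel = whisper.log_mel_spectrogram(audio).to(model.device)

    # fp16 follows the device: False on CPU, True on CUDA.
    options = whisper.DecodingOptions(
        language=lang,
        without_timestamps=True,
        fp16=use_fp16(model.device.type),
    )
    return whisper.decode(model, mel, options).text
```

On a CPU runtime this takes the float32 path; on a GPU runtime the same code requests float16 without any other change.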
Issue Description:
I encountered a runtime error while using the Whisper library for audio processing and transcription. The specific error message I received is: "RuntimeError: 'slow_conv2d_cpu' not implemented for 'Half'." This error occurred during the execution of the whisper.decode function.
Steps Taken:
Initially, I attempted to address the issue by converting the input mel tensor to half-precision using the .half() method, but it did not resolve the error.
I also checked the audio source for any potential problems, but I'm unsure if it is the cause of the issue.
Current Status:
I am still experiencing the same error, and I'm uncertain about the underlying cause. It would be greatly appreciated if anyone could provide assistance in resolving this issue or offer guidance on potential solutions.
Additional Information:
To provide further context, here are some additional details:
Whisper library version: 20230314
Audio source: MP3, 72 minutes
Complete relevant code:
_________________________________
import os
import torch
import whisper

fileName = "test"
lang = "ja"
model = whisper.load_model("base")

# Load audio
file_path = "/content/drive/MyDrive/audio/test.mp3"
audio = whisper.load_audio(file_path)
audio = whisper.pad_or_trim(audio)
mel = whisper.log_mel_spectrogram(audio)

# Change mel spectrogram data type to float32
mel = mel.float()

# Move mel spectrogram to the device
device = model.device
mel = mel.to(device)

# Convert model to float16 for faster inference
model = model.half()

# Output the recognized text
options = whisper.DecodingOptions(language=lang, without_timestamps=True)
result = whisper.decode(model, mel, options)
print(result.text)

# Write into a text file
output_dir = "download"
os.makedirs(output_dir, exist_ok=True)
output_file = os.path.join(output_dir, f"{fileName}.txt")
with open(output_file, "w") as f:
    f.write(f"▼ Transcription of {fileName}\n")
    f.write(result.text)
_________________________________
Error Traceback:
RuntimeError Traceback (most recent call last)
in <cell line: 28>()
26 # Output the recognized text
27 options = whisper.DecodingOptions(language=lang, without_timestamps=True)
---> 28 result = whisper.decode(model, mel, options)
29 print(result.text)
30
10 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
307 weight, bias, self.stride,
308 _single(0), self.dilation, self.groups)
--> 309 return F.conv1d(input, weight, bias, self.stride,
310 self.padding, self.dilation, self.groups)
311
RuntimeError: "slow_conv2d_cpu" not implemented for 'Half'