Extremely slow performance compared with Descript #1515

Xi23000 · 2023-07-10T23:23:07Z


I understand that Descript and Whisper are targeted at different audiences, but I would like to understand why there is such a huge difference in the time they take to transcribe the same audio (a 30-minute WAV file).

Are there particular factors that speed up / slow down transcription?

whisper audio.wav --model tiny

takes about 20 minutes

Descript takes about 110 secs

jwnacnud · 2023-07-10T23:40:14Z


Is there a difference in the quality of the two results?


3 replies

markrmiller Oct 5, 2023

Last I knew, Descript used Google for transcription. I won't weigh in on whether that is at all related to any difference in performance, but I found Descript's transcription quality to be some of the best I'd seen last year. That said, in my initial tests with out-of-the-box Whisper and the large model, Whisper kicks the crap out of Descript ;)

ryanheise Oct 6, 2023

Descript was using Rev last year and, since receiving backing from OpenAI at the end of last year, has had plans to switch to Whisper. That switch is in progress, with some customers getting Rev and some getting Whisper.

markrmiller Oct 6, 2023

Ah, I was going by some version of this that I had run into, perhaps out of date then: https://cloud.google.com/customers/descript

If they move to Whisper, I'd be much more likely to subscribe again. Whisper does much, much better with tech jargon and strong foreign accents.

Tex2002ans · 2023-07-11T05:49:29Z


You may have accidentally run Whisper in CPU mode, which is much slower.

What GPU are you running Whisper on?
Did you correctly install the CUDA Toolkit?
Does your Whisper or PyTorch install recognize your CUDA device?

If you run whisper --help, you should see a line along the lines of:

--device DEVICE device to use for PyTorch inference (default: cuda)

Where it says default: cuda, that means Whisper recognized your GPU.

(I think, in an install without a recognized GPU, Whisper would say default: cpu there.)
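The same check can be made directly from Python. This is a minimal sketch, assuming PyTorch was installed alongside Whisper; the helper name `detect_device` is made up for illustration. It mirrors how the Whisper command line picks its `--device` default, which depends on `torch.cuda.is_available()`.

```python
def detect_device() -> str:
    """Return "cuda" if PyTorch can see a CUDA GPU, otherwise "cpu".

    Hypothetical helper for illustration; it is not part of Whisper.
    """
    try:
        import torch  # normally installed as a Whisper dependency
    except ImportError:
        # Without PyTorch there is nothing to run on the GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = detect_device()
print(f"PyTorch inference device: {device}")

if device == "cuda":
    import torch

    # Name of the first visible GPU, e.g. the card you expect Whisper to use.
    print("GPU:", torch.cuda.get_device_name(0))
```

If this prints `cpu` on a machine with an NVIDIA card, the usual suspects are a CPU-only PyTorch build or a missing/mismatched CUDA install.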

[...] (30 mins wav file).

whisper audio.wav --model tiny

takes about 20 minutes

Descript takes about 110 secs

For example, for a 30-minute audio file...

On my RTX 3060 on the latest Whisper master:

~10 mins = large-v2 model.

If you look at the relative speeds given in the table in the Whisper readme:

Size	Relative speed
tiny	~32x
base	~16x
small	~6x
medium	~2x
large	1x

it would take:

~19 seconds = tiny model.

CPU is much, much slower than GPU, and anything beyond the small/medium models starts taking much longer than real time on a CPU.

Are there particular factors that speed up / slow down transcription?
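The ~19 second estimate is just the large-model time divided by the relative speed from the readme table. A rough sketch of that arithmetic follows; the function name is made up, and real timings will vary with hardware and settings.

```python
# Relative speeds as listed in the Whisper readme table quoted above.
RELATIVE_SPEED = {"tiny": 32, "base": 16, "small": 6, "medium": 2, "large": 1}


def estimate_seconds(large_seconds: float, model: str) -> float:
    """Scale a measured large-model time by the readme's relative speed."""
    return large_seconds / RELATIVE_SPEED[model]


# ~10 minutes measured for the large model on a 30-minute file.
large_seconds = 10 * 60

for name in RELATIVE_SPEED:
    print(f"{name:>6}: ~{estimate_seconds(large_seconds, name):.0f} s")
```

For `tiny` this gives 600 / 32 = 18.75 s, which rounds to the ~19 seconds quoted above.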

Yes, there are settings that can have a huge effect. For example:

beam_size
best_of

See some benchmarks given back in October 2022:

The command-line version of Whisper defaults to:

beam_size = 5
best_of = 5

If you lower those, you can get huge speedups at the cost of accuracy (and more hallucinations), because Whisper will be checking fewer candidate results against each other.
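As a sketch, lowering both settings on the command line could look like the following. The helper `whisper_command` is made up for illustration; it only builds and prints the command, using the `--model`, `--beam_size` and `--best_of` flags of the Whisper CLI. The value 1 is just an example of "lower", not a recommendation.

```python
import shlex


def whisper_command(audio, model="tiny", beam_size=5, best_of=5):
    """Build a whisper CLI invocation (hypothetical helper, not part of Whisper)."""
    return [
        "whisper", audio,
        "--model", model,
        "--beam_size", str(beam_size),
        "--best_of", str(best_of),
    ]


# Trade some accuracy for speed by considering fewer candidates.
cmd = whisper_command("audio.wav", beam_size=1, best_of=1)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd)` if `whisper` is on your PATH.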

For some more details on those 2 settings, see:

Much slower via command line tool than in Python? #177 (comment)

Also, there are alternative implementations of Whisper, like:

faster-whisper

which can run the model at lower precision (fp16 or int8 instead of fp32).

This allows major speedups + massively lowered RAM/VRAM usage.
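A minimal sketch of what that looks like with faster-whisper, assuming the `faster-whisper` package is installed. The two helper functions are made up for illustration, and the choice of `float16` on GPU and `int8` on CPU is an assumption, not a benchmarked recommendation.

```python
def choose_compute_type(device: str) -> str:
    """Pick a reduced precision: fp16 on GPU, int8 on CPU (illustrative choice)."""
    return "float16" if device == "cuda" else "int8"


def transcribe_fast(path: str, model_size: str = "tiny", device: str = "cuda") -> str:
    """Transcribe a file with faster-whisper and return the joined text."""
    # Imported lazily so the helper above works without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(
        model_size,
        device=device,
        compute_type=choose_compute_type(device),
    )
    # transcribe() yields segments lazily; iterating them runs the decoding.
    segments, _info = model.transcribe(path, beam_size=5)
    return " ".join(segment.text.strip() for segment in segments)


# Example call (needs faster-whisper and an audio file on disk):
# print(transcribe_fast("audio.wav", model_size="tiny", device="cpu"))
```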

0 replies
