It requires more VRAM when you use a bigger beam size. Is that expected? #396

FurkanGozukara · 2022-10-22T17:50:18Z

I have found that a bigger beam size (10) increases accuracy.

For example, with the default settings it transcribes the word as "GPX" although I am saying "JPEG", while with beam size 10 it spells it correctly: https://www.youtube.com/watch?v=_nKwisL8dTs

However, with an even bigger beam size (20), I start getting out-of-memory errors on my 12 GB RTX 3060.

Is this expected, and why does it happen?

Moreover, which other hyperparameters have you found to increase accuracy?

Answered by shervinemami

shervinemami · 2022-10-22T19:46:30Z

Yes, this is expected. Some of these GitHub discussions mention that you can fit into less GPU memory by turning off beam_size and best_of (so that you use the greedy decoder instead of the beam search decoder). Also, FP16 uses less memory than FP32. I posted some GDDR measurements for small beam search values in #391, where you can see that beam_size=7 has a slightly larger memory requirement than the others, but not by much.
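To illustrate why memory grows with beam size, here is a rough back-of-envelope sketch of the decoder's key/value cache. It assumes each beam hypothesis keeps its own cached keys and values for both self-attention and cross-attention, and it assumes the dimensions of a "large" checkpoint (32 decoder layers, width 1280, 448 text positions, 1500 audio positions). These numbers and the function name `kv_cache_bytes` are assumptions for illustration, not measurements from this thread.

```python
# Back-of-envelope estimate of decoder key/value cache size versus beam size.
# All dimensions below are ASSUMED values for a "large" Whisper checkpoint.


def kv_cache_bytes(beam_size, n_layers=32, d_model=1280,
                   n_text_ctx=448, n_audio_ctx=1500, bytes_per_value=2):
    """Worst-case bytes held in cached attention keys and values.

    Assumes every hypothesis stores keys and values for the full text
    context (self-attention) and the full audio context (cross-attention)
    in every decoder layer. bytes_per_value=2 corresponds to FP16,
    bytes_per_value=4 to FP32.
    """
    positions = n_text_ctx + n_audio_ctx
    keys_and_values = 2  # one key tensor and one value tensor per layer
    return (beam_size * n_layers * keys_and_values
            * positions * d_model * bytes_per_value)


if __name__ == "__main__":
    for beam in (1, 5, 10, 20):
        gib = kv_cache_bytes(beam) / 2**30
        print(f"beam_size={beam:2d}: ~{gib:.2f} GiB of cached keys/values")
```

Under these assumptions the cache is roughly 0.3 GiB per hypothesis in FP16, so close to 6 GiB at beam_size=20 before counting model weights and activations. The estimate is a worst case (the cache only reaches this size once the text context is full) and real usage depends on the implementation, but it shows that the cost scales linearly with beam size and doubles when moving from FP16 to FP32.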

12 replies

dgoryeo Sep 15, 2023

Hi @FurkanGozukara, I came across your old post on experimenting with beam size. Did you by any chance arrive at a good balance between beam size, compression ratio, and logprob?

FurkanGozukara Sep 15, 2023
Author

Yes, I am using these settings with V1:

string srCommand = @$"cmd /c whisper ""{srExtractMp3Name}"" --model large-v1 --language en --initial_prompt ""Welcome our Youtube channel."" --best_of 10 --beam_size 10 --output_dir ""{srFileDirectory}""";
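For readers who drive the CLI from Python rather than C#, the same invocation could be assembled as an argument list, which avoids the nested quoting of a shell string. This is only a sketch: the helper name `build_whisper_command` and the example paths are hypothetical, while the flags are the ones used in the command above.

```python
import shlex
from pathlib import Path


def build_whisper_command(audio_path, output_dir, beam_size=10, best_of=10):
    """Return the whisper CLI invocation from the post as an argv list."""
    return [
        "whisper", str(Path(audio_path)),
        "--model", "large-v1",
        "--language", "en",
        "--initial_prompt", "Welcome our Youtube channel.",
        "--best_of", str(best_of),
        "--beam_size", str(beam_size),
        "--output_dir", str(Path(output_dir)),
    ]


if __name__ == "__main__":
    cmd = build_whisper_command("talk.mp3", "out")  # hypothetical paths
    # Show the command with shell-safe quoting for inspection.
    print(" ".join(shlex.quote(part) for part in cmd))
    # To actually run it (requires the whisper CLI on PATH):
    # import subprocess; subprocess.run(cmd, check=True)
```

Lowering `beam_size` in this call is the simplest knob to try if the run exhausts GPU memory.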

dgoryeo Sep 16, 2023

Thanks @FurkanGozukara. Am I right in understanding that the best_of parameter would be ignored because the temperature is zero and beam search is active? Admittedly, I've had difficulty understanding how beam_size and best_of play together.

FurkanGozukara Sep 16, 2023
Author

I am not sure; it could be. I did some testing, and this yields the best results for my speech, though of course it is still not good enough. After this, I do punctuation correction with AI and then convert the result back into VTT.
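On the question of how beam_size and best_of interact, the sketch below models the selection rule as I understand it from reading Whisper's transcription code: at temperature 0 only beam_size is used (falling back to greedy decoding if it is unset), and at temperatures above 0 only best_of is used for sampling. Treat this as an assumption to verify against the version you have installed; the function `effective_decoder` is hypothetical and is not part of the whisper package.

```python
def effective_decoder(temperature, beam_size=None, best_of=None):
    """Model which decoding strategy applies for a given temperature.

    ASSUMED rule (verify against your installed whisper version):
      - temperature == 0: best_of is dropped; beam search if beam_size
        is set, otherwise plain greedy decoding.
      - temperature > 0: beam_size is dropped; best_of independent
        samples are drawn and the best-scoring one is kept.
    Returns a (strategy, number_of_candidates) tuple.
    """
    if temperature > 0:
        return ("sampling", best_of or 1)
    if beam_size:
        return ("beam_search", beam_size)
    return ("greedy", 1)


if __name__ == "__main__":
    # Settings from the post: both options set to 10.
    for t in (0.0, 0.2, 0.4):
        print(t, effective_decoder(t, beam_size=10, best_of=10))
```

If this reading is right, `--best_of 10` in the command above only matters when decoding falls back to a non-zero temperature, and in both cases the number of candidates kept in memory is what drives the extra VRAM use.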

dgoryeo Sep 17, 2023

Thanks @FurkanGozukara.
