Replies: 2 comments 1 reply
-
One possibility is that a different process is using GPU memory. If so, find the process and kill it. You can find the PID of the process from the output of nvidia-smi.
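For example (a sketch; the PID value is a placeholder):

    # List the processes currently holding GPU memory, with their PIDs
    nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv
    # Stop the offending process (12345 is a placeholder PID taken from the listing above)
    kill 12345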
-
Ok, if you only have 8 GB of VRAM you don't have enough space to use the large model; it requires ~10 GB. Try one of the smaller models from the model sizes table in the README.
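For instance, the medium model needs roughly 5 GB of VRAM, so it should fit on an 8 GB card. A sketch, reusing the command from the question with only the model name swapped:

    # The question's invocation, with the ~5 GB medium model instead of large-v2
    whisper --model medium --language French --task translate --device cuda output.mkv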
-
I am running Whisper on Arch Linux, seeking to use an NVIDIA 2070 Super's CUDA.
I avoided the CUDA mismatch error by downgrading to CUDA 12.1 to match PyTorch's CUDA version.
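(For reference, the CUDA build PyTorch was installed with, and whether it can see the GPU, can be checked with something like the following; a sketch run from the same myenv virtualenv:)

    # CUDA version PyTorch was built against, and whether the GPU is visible to it
    python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"
    # Driver-side CUDA version for comparison
    nvidia-smi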
I am getting an odd error when trying it on a test file; immediately I get:
(myenv) [kris@archlinux Ep1]$ whisper --model large-v2 --language French --task translate --device cuda output.mkv
Traceback (most recent call last):
  File "/home/kris/myenv/bin/whisper", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/home/kris/myenv/lib/python3.11/site-packages/whisper/transcribe.py", line 458, in cli
    model = load_model(model_name, device=device, download_root=model_dir)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kris/myenv/lib/python3.11/site-packages/whisper/__init__.py", line 156, in load_model
    return model.to(device)
           ^^^^^^^^^^^^^^^^
  File "/home/kris/myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1160, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/kris/myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  File "/home/kris/myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  File "/home/kris/myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/home/kris/myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 833, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "/home/kris/myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1158, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 26.00 MiB. GPU 0 has a total capacty of 7.78 GiB of which 27.19 MiB is free. Including non-PyTorch memory, this process has 7.05 GiB memory in use. Of the allocated memory 6.60 GiB is allocated by PyTorch, and 344.11 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
If I am reading this right, Whisper is trying to allocate memory but PyTorch has already allocated most of it?
"Of the allocated memory 6.60 GiB is allocated by PyTorch"
Why isn't Whisper trying to use the memory PyTorch has already allocated instead of allocating its own?
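For reference, the free-versus-total VRAM that PyTorch sees can be checked right before running Whisper (a sketch, run from the same myenv environment):

    # Free and total VRAM in bytes as reported by the CUDA driver
    python -c "import torch; print(torch.cuda.mem_get_info())"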