I am trying to run your golden data set with the default model and parameters, with a batch size of 1, on sequencer-train.sh. In the paper you mention using a K80 GPU, which has 24 GB of memory. I am using GeForce RTX 2080 Ti cards, which have 11 GB of memory each, so I am using two of them and changed -world_size 2 and -gpu_ranks 0 1, but I am still getting CUDA out of memory. Could you please guide me on what the possible issue might be?
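For reference, here is a minimal check (just a sketch using standard torch.cuda calls, nothing SequenceR-specific) that I can run before launching sequencer-train.sh to confirm both cards are visible to PyTorch and to see how much memory each one reports:

```python
import torch

# List every GPU PyTorch can see, with its total and currently allocated memory.
# The device indices printed here should match the values passed to -gpu_ranks.
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"cuda:{i}: {props.name}, "
          f"{props.total_memory / 2**30:.1f} GiB total, "
          f"{torch.cuda.memory_allocated(i) / 2**30:.2f} GiB already allocated")
```

Both GPUs show up in this output, which is why the out-of-memory error at model build time surprises me.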
Traceback (most recent call last):
  File "/home/anuragswar.yadav/Anurag/chai/lib/OpenNMT-py/train.py", line 63, in run
    single_main(opt, device_id)
  File "/home/anuragswar.yadav/Anurag/chai/lib/OpenNMT-py/onmt/train_single.py", line 132, in main
    model = build_model(model_opt, opt, fields, checkpoint)
  File "/home/anuragswar.yadav/Anurag/chai/lib/OpenNMT-py/onmt/model_builder.py", line 301, in build_model
    model = build_base_model(model_opt, fields, use_gpu(opt), checkpoint)
  File "/home/anuragswar.yadav/Anurag/chai/lib/OpenNMT-py/onmt/model_builder.py", line 294, in build_base_model
    model.to(device)
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 607, in to
    return self._apply(convert)
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 354, in _apply
    module._apply(fn)
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 354, in _apply
    module._apply(fn)
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 354, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 376, in _apply
    param_applied = fn(param)
  File "/home/anuragswar.yadav/anaconda3/envs/sequencer/lib/python3.6/site-packages/torch-1.6.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 605, in convert
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: out of memory