LAMMPS: Resource exhausted on Tesla P100 GPU #294
Answered by jameswind
LiangMD-BGI asked this question in Q&A
-
We will have a good method for this soon. For now, you can only reduce memory usage by using a smaller network size or by running the MPI version of LAMMPS on multiple GPUs.
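For the network-size route, the knobs are the descriptor and fitting-network widths in the DeePMD-kit training input, and the model has to be retrained after changing them. The OOM tensor shape [10800,200,100] looks like (atoms of one type) x (maximum neighbors, sel) x (last embedding-layer width), so smaller "neuron" lists, and a smaller "sel" if your actual neighbor counts allow it, should shrink the largest allocations. A minimal sketch of the relevant "model" section of input.json with the se_a descriptor; the values below are illustrative only, not recommendations for this system:

"model": {
    "descriptor": {
        "type":        "se_a",
        "sel":         [150, 150],
        "rcut":        6.0,
        "neuron":      [20, 40, 80],
        "axis_neuron": 4
    },
    "fitting_net": {
        "neuron":      [120, 120, 120]
    }
}

Keep in mind that sel must stay at or above the true maximum neighbor count per type, and that a smaller network generally trades some accuracy for memory.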
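For the multi-GPU route, the usual pattern is to launch the MPI build of LAMMPS with one MPI rank per GPU, so the spatial decomposition splits the 21600 atoms across cards and each rank's DeePMD evaluation works on a smaller slice of the system. A hedged sketch of the launch line; the binary name (lmp vs. lmp_mpi), the input-script name, and the rank-to-GPU mapping are assumptions about a typical deepmd-kit build, not details confirmed in this thread:

# make two GPUs visible and start one MPI rank per card
export CUDA_VISIBLE_DEVICES=0,1
mpirun -np 2 lmp -in in.lammps

How ranks are mapped to GPUs depends on the deepmd-kit version; if both ranks end up on the same card, restricting CUDA_VISIBLE_DEVICES per rank (for example through a small wrapper script) is a common workaround. Note that every rank still loads a full copy of the model, so this splits the neighbor-list-sized tensors but not the network weights.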
-
Hello All,
I am running the GPU build of LAMMPS with a DeePMD potential on a single Tesla P100 GPU (16 GB of memory).
My system contains 21600 atoms, and I get an error saying that the memory resource is exhausted. Is there a way to reduce memory usage?
The details of the error are shown below:
2020-11-16 21:54:46.278230: W tensorflow/core/common_runtime/bfc_allocator.cc:429] _____***_****__________****___
2020-11-16 21:54:46.278259: W tensorflow/core/framework/op_kernel.cc:1655] OP_REQUIRES failed at concat_op.cc:153 : Resource exhausted: OOM when allocating tensor with shape[10800,200,100] and type double on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Resource exhausted: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[10800,200,100] and type double on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node filter_type_1/concat_4}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[o_force/_27]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
(1) Resource exhausted: OOM when allocating tensor with shape[10800,200,100] and type double on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node filter_type_1/concat_4}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
Many thanks,
Liang