A segmentation fault is reported during DeepMD simulation using LAMMPS #2868
Unanswered
21al07se09t asked this question in Q&A
Replies: 3 comments 1 reply
-
I would appreciate it if you could provide these files so that we can reproduce the problem.
-
I think the error report mentioned the cause of the crash: E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
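One way to check what that message says is to look, from inside the job, at which GPUs the process can actually see. The sketch below is not part of DeePMD-kit or LAMMPS; it only reads the `CUDA_VISIBLE_DEVICES` environment variable and calls `nvidia-smi -L`, and the helper name `visible_gpus` is made up for illustration.

```python
import os
import shutil
import subprocess


def visible_gpus():
    """Return the GPU lines printed by `nvidia-smi -L`, or [] if none can be listed."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # NVIDIA tools are not on PATH on this node
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []  # the tool is present but could not list any device
    # Keep only the top-level "GPU N: ..." lines
    return [line for line in result.stdout.splitlines() if line.startswith("GPU ")]


if __name__ == "__main__":
    print("CUDA_VISIBLE_DEVICES =", os.environ.get("CUDA_VISIBLE_DEVICES", "<unset>"))
    gpus = visible_gpus()
    print(f"{len(gpus)} GPU(s) listed by nvidia-smi")
    for line in gpus:
        print(" ", line)
```

An empty result on the compute node would match the `CUDA_ERROR_NO_DEVICE` message.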
-
Has the virtual environment been activated? Is the TensorFlow package installed in the environment?
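Both questions can be answered from the Python interpreter that the job uses. This is only a rough sketch using the standard library; the function name `environment_report` is hypothetical, and it assumes the packages are importable as `tensorflow` and `deepmd`.

```python
import importlib.util
import os
import sys


def environment_report():
    """Collect basic facts about the active Python environment."""
    return {
        "python": sys.executable,
        # True inside a venv/virtualenv; conda environments are reported separately
        "in_venv": sys.prefix != sys.base_prefix,
        # Set by `conda activate`; None when no conda environment is active
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # find_spec returns None when the package cannot be found from here
        "tensorflow_installed": importlib.util.find_spec("tensorflow") is not None,
        "deepmd_installed": importlib.util.find_spec("deepmd") is not None,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running it inside the batch script, rather than on the login node, shows the environment the simulation actually gets.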
冯泰熙

---- Replied Message ----
Date: 9/29/2023 09:18
Subject: Re: [deepmodeling/deepmd-kit] A segmentation fault is reported during DeepMD simulation using LAMMPS (Discussion #2868)
Thanks for your response! I checked the output of the jobs that run normally, and they show the same error, as highlighted below:
/opt/gridview/slurm/spool/slurmd/job2139458/slurm_script: /usr/bin/modulecmd: No such file or directory
/opt/gridview/slurm/spool/slurmd/job2139458/slurm_script: /usr/bin/modulecmd: No such file or directory
DeePMD-kit WARNING: Environmental variable TF_INTRA_OP_PARALLELISM_THREADS is not set. Tune TF_INTRA_OP_PARALLELISM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit WARNING: Environmental variable TF_INTER_OP_PARALLELISM_THREADS is not set. Tune TF_INTER_OP_PARALLELISM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit WARNING: Environmental variable OMP_NUM_THREADS is not set. Tune OMP_NUM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit WARNING: Environmental variable TF_INTRA_OP_PARALLELISM_THREADS is not set. Tune TF_INTRA_OP_PARALLELISM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit WARNING: Environmental variable TF_INTER_OP_PARALLELISM_THREADS is not set. Tune TF_INTER_OP_PARALLELISM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit WARNING: Environmental variable OMP_NUM_THREADS is not set. Tune OMP_NUM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit: Successfully load libcudart.so
2023-09-20 14:43:17.678675: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-09-20 14:43:17.684037: E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2023-09-20 14:43:17.684066: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: node27
2023-09-20 14:43:17.684073: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: node27
2023-09-20 14:43:17.684102: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:200] libcuda reported version is: 525.105.17
2023-09-20 14:43:17.684126: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:204] kernel reported version is: 525.105.17
2023-09-20 14:43:17.684133: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:310] kernel version seems to match DSO: 525.105.17
2023-09-20 14:43:17.684174: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2023-09-20 14:43:17.741581: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled
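The three DeePMD-kit warnings in this log only say that the thread-count variables are unset. A minimal sketch of setting them for a launched process is shown here; the helper `threading_env` and the values 4 and 2 are placeholders chosen for illustration, not recommendations, and suitable numbers depend on the node.

```python
import os


def threading_env(intra_op, inter_op, base=None):
    """Return a copy of the environment with the three thread variables set."""
    env = dict(os.environ if base is None else base)
    # Assumption of this sketch: OpenMP uses the same count as TF intra-op
    env["OMP_NUM_THREADS"] = str(intra_op)
    env["TF_INTRA_OP_PARALLELISM_THREADS"] = str(intra_op)
    env["TF_INTER_OP_PARALLELISM_THREADS"] = str(inter_op)
    return env


if __name__ == "__main__":
    env = threading_env(intra_op=4, inter_op=2)
    # Pass `env` to the launcher, e.g. subprocess.run([...], env=env)
    for name in (
        "OMP_NUM_THREADS",
        "TF_INTRA_OP_PARALLELISM_THREADS",
        "TF_INTER_OP_PARALLELISM_THREADS",
    ):
        print(f"{name}={env[name]}")
```

Setting these variables silences the warnings and may affect performance; the warnings themselves do not point to the segmentation fault.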
-
Dear all,
I'm trying to run DeePMD simulations with the model developed in Physical Review Letters 126, 236001 (2021).
My system contains more than 100,000 water molecules, and a segmentation fault is reported:
I'm not sure whether this error is caused by an incorrect input setting or by the large number of particles (simulations with 90,000 water molecules run without problems).
Below is my input file:
Thanks for any help!