For the water example, I trained a model with DP_INTERFACE_PREC=low. When I run inference with lmp_mpi, it raises a type mismatch exception. How can I fix it?
# training
cd examples/water/se_e2_a
export DP_INTERFACE_PREC=low
dp train input.json # use example config
...
DEEPMD INFO build float prec: float
...
cp graph.pb ../lmp/frozen_model.pb
# inference
cd examples/water/lmp
../../../source/3rdparty/lammps-stable_29Oct2020/src/lmp_mpi -echo screen < in.lammps # use example config
...
build float prec: double
...
2021-10-14 10:01:44.853361: F tensorflow/core/framework/tensor.cc:636] Check failed: dtype() == expected_dtype (1 vs. 2) double expected, got float
[34a1cc5eb3b4:237092] *** Process received signal ***
[34a1cc5eb3b4:237092] Signal: Aborted (6)
[34a1cc5eb3b4:237092] Signal code: (-6)
[34a1cc5eb3b4:237092] [ 0] /usr/lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7fbf4ea303c0]
[34a1cc5eb3b4:237092] [ 1] /usr/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7fbf4e86f18b]
[34a1cc5eb3b4:237092] [ 2] /usr/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7fbf4e84e859]
[34a1cc5eb3b4:237092] [ 3] /usr/local/lib/python3.8/dist-packages/tensorflow_core/libtensorflow_cc.so.1(+0xbe69fe8)[0x7fbf5ca90fe8]
[34a1cc5eb3b4:237092] [ 4] /usr/local/lib/python3.8/dist-packages/tensorflow_core/libtensorflow_framework.so.1(+0x7575d6)[0x7fbf4f5fb5d6]
[34a1cc5eb3b4:237092] [ 5] /codes/opensource/deepmd-kit/dp/lib/libdeepmd_cc.so(_ZN6deepmd18session_get_scalarIdEET_PN10tensorflow7SessionESsSs+0x2f3)[0x7fbfa7603af3]
[34a1cc5eb3b4:237092] [ 6] /codes/opensource/deepmd-kit/dp/lib/libdeepmd_cc.so(_ZNK6deepmd7DeepPot10get_scalarIdEET_RKSs+0x5b)[0x7fbfa75f23ab]
[34a1cc5eb3b4:237092] [ 7] /codes/opensource/deepmd-kit/dp/lib/libdeepmd_cc.so(_ZN6deepmd7DeepPot4initERKSsRKiS2_+0x132)[0x7fbfa75ece82]
[34a1cc5eb3b4:237092] [ 8] ../../../source/3rdparty/lammps-stable_29Oct2020/src/lmp_mpi(+0x255608)[0x5638ea9bb608]
[34a1cc5eb3b4:237092] [ 9] ../../../source/3rdparty/lammps-stable_29Oct2020/src/lmp_mpi(+0xdaa25)[0x5638ea840a25]
[34a1cc5eb3b4:237092] [10] ../../../source/3rdparty/lammps-stable_29Oct2020/src/lmp_mpi(+0xdfc84)[0x5638ea845c84]
[34a1cc5eb3b4:237092] [11] ../../../source/3rdparty/lammps-stable_29Oct2020/src/lmp_mpi(+0xdfe75)[0x5638ea845e75]
[34a1cc5eb3b4:237092] [12] ../../../source/3rdparty/lammps-stable_29Oct2020/src/lmp_mpi(+0xd34ed)[0x5638ea8394ed]
[34a1cc5eb3b4:237092] [13] /usr/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fbf4e8500b3]
[34a1cc5eb3b4:237092] [14] ../../../source/3rdparty/lammps-stable_29Oct2020/src/lmp_mpi(+0xd419e)[0x5638ea83a19e]
[34a1cc5eb3b4:237092] *** End of error message ***
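The two "build float prec" lines in the output above already show the mismatch: training reports float, while lmp_mpi reports double. Below is a minimal sketch of checking this from saved logs. It assumes the training and LAMMPS output were redirected to files. The names train.log and lmp.log and the helper get_prec are hypothetical, and the sample lines are copied from the output above.

```shell
#!/bin/sh
# Sample logs for illustration only; in practice these would be the saved
# outputs of `dp train` and `lmp_mpi` (the file names are hypothetical).
workdir=$(mktemp -d)
printf 'DEEPMD INFO build float prec: float\n' > "$workdir/train.log"
printf 'build float prec: double\n' > "$workdir/lmp.log"

# Print the precision reported in a log: the last word of the first
# line that contains "build float prec:".
get_prec() {
    awk '/build float prec:/ { print $NF; exit }' "$1"
}

train_prec=$(get_prec "$workdir/train.log")
lmp_prec=$(get_prec "$workdir/lmp.log")

if [ "$train_prec" = "$lmp_prec" ]; then
    echo "precision matches: $train_prec"
else
    echo "precision mismatch: model=$train_prec lmp=$lmp_prec"
fi
```

If the two values differ, the model and the lmp_mpi build do not agree on the floating-point type, which is consistent with the "double expected, got float" check failure.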
Replies: 3 comments 7 replies
@njzjz Is lmp_mpi built only with double precision?
On a similar note, I am also trying to use low precision and followed the suggestions, with some success (deepmd 2.1.3). The run gets far, but I get the following error right before the LAMMPS MD loop:
See output.txt for more details. Please note that high precision works, but it is much slower on the A40 GPU. Any suggestions are much appreciated!
Fix: in CMakeLists.txt, change
set(HIGH_PREC_DEF "HIGH_PREC")
to
set(HIGH_PREC_DEF "LOW_PREC")
then reinstall dp and lmp_mpi.
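The same edit can be scripted. Below is a minimal sketch with sed, run against a scratch file that contains only the relevant line so that it is self-contained. The scratch path is an assumption for illustration; the location of the real CMakeLists.txt in the source tree is not stated in this thread.

```shell
#!/bin/sh
# Scratch copy for illustration; point cmake_file at the real CMakeLists.txt
# when applying this to an actual source tree.
workdir=$(mktemp -d)
cmake_file="$workdir/CMakeLists.txt"
printf 'set(HIGH_PREC_DEF "HIGH_PREC")\n' > "$cmake_file"

# Rewrite the precision definition. Writing to a new file and moving it
# into place avoids relying on the non-portable `sed -i`.
sed 's/set(HIGH_PREC_DEF "HIGH_PREC")/set(HIGH_PREC_DEF "LOW_PREC")/' \
    "$cmake_file" > "$cmake_file.new"
mv "$cmake_file.new" "$cmake_file"

# Show the edited line to confirm the change took effect.
result=$(grep 'HIGH_PREC_DEF' "$cmake_file")
echo "$result"
```

The substitution only changes the build configuration; dp and lmp_mpi still have to be reinstalled afterwards for the new precision to take effect.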