Low precision model in lammps for inference #1216

leonf88 · 2021-10-14T10:07:19Z

leonf88
Oct 14, 2021

For the water example, I used DP_INTERFACE_PREC=low to train a model. When I use lmp_mpi for inference, it raises a type mismatch exception. How can I fix it?

# training
cd examples/water/se_e2_a
export DP_INTERFACE_PREC=low
dp train input.json # use example config
...
DEEPMD INFO    build float prec:     float
...
cp graph.pb ../lmp/frozen_model.pb
# inference
cd examples/water/lmp
../../../source/3rdparty/lammps-stable_29Oct2020/src/lmp_mpi -echo screen < in.lammps  # use example config
...
  build float prec:   double
...
2021-10-14 10:01:44.853361: F tensorflow/core/framework/tensor.cc:636] Check failed: dtype() == expected_dtype (1 vs. 2) double expected, got float
[34a1cc5eb3b4:237092] *** Process received signal ***
[34a1cc5eb3b4:237092] Signal: Aborted (6)
[34a1cc5eb3b4:237092] Signal code:  (-6)
[34a1cc5eb3b4:237092] [ 0] /usr/lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7fbf4ea303c0]
[34a1cc5eb3b4:237092] [ 1] /usr/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7fbf4e86f18b]
[34a1cc5eb3b4:237092] [ 2] /usr/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7fbf4e84e859]
[34a1cc5eb3b4:237092] [ 3] /usr/local/lib/python3.8/dist-packages/tensorflow_core/libtensorflow_cc.so.1(+0xbe69fe8)[0x7fbf5ca90fe8]
[34a1cc5eb3b4:237092] [ 4] /usr/local/lib/python3.8/dist-packages/tensorflow_core/libtensorflow_framework.so.1(+0x7575d6)[0x7fbf4f5fb5d6]
[34a1cc5eb3b4:237092] [ 5] /codes/opensource/deepmd-kit/dp/lib/libdeepmd_cc.so(_ZN6deepmd18session_get_scalarIdEET_PN10tensorflow7SessionESsSs+0x2f3)[0x7fbfa7603af3]
[34a1cc5eb3b4:237092] [ 6] /codes/opensource/deepmd-kit/dp/lib/libdeepmd_cc.so(_ZNK6deepmd7DeepPot10get_scalarIdEET_RKSs+0x5b)[0x7fbfa75f23ab]
[34a1cc5eb3b4:237092] [ 7] /codes/opensource/deepmd-kit/dp/lib/libdeepmd_cc.so(_ZN6deepmd7DeepPot4initERKSsRKiS2_+0x132)[0x7fbfa75ece82]
[34a1cc5eb3b4:237092] [ 8] ../../../source/3rdparty/lammps-stable_29Oct2020/src/lmp_mpi(+0x255608)[0x5638ea9bb608]
[34a1cc5eb3b4:237092] [ 9] ../../../source/3rdparty/lammps-stable_29Oct2020/src/lmp_mpi(+0xdaa25)[0x5638ea840a25]
[34a1cc5eb3b4:237092] [10] ../../../source/3rdparty/lammps-stable_29Oct2020/src/lmp_mpi(+0xdfc84)[0x5638ea845c84]
[34a1cc5eb3b4:237092] [11] ../../../source/3rdparty/lammps-stable_29Oct2020/src/lmp_mpi(+0xdfe75)[0x5638ea845e75]
[34a1cc5eb3b4:237092] [12] ../../../source/3rdparty/lammps-stable_29Oct2020/src/lmp_mpi(+0xd34ed)[0x5638ea8394ed]
[34a1cc5eb3b4:237092] [13] /usr/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fbf4e8500b3]
[34a1cc5eb3b4:237092] [14] ../../../source/3rdparty/lammps-stable_29Oct2020/src/lmp_mpi(+0xd419e)[0x5638ea83a19e]
[34a1cc5eb3b4:237092] *** End of error message ***
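
For readers hitting the same abort, here is a minimal sketch of how the fatal TensorFlow line can be read. It assumes the message keeps the format shown in the log above, and it relies on the TensorFlow DataType enum values DT_FLOAT = 1 and DT_DOUBLE = 2. The helper name decode_dtype_mismatch and the dictionary keys are made up for illustration and are not part of DeePMD-kit or TensorFlow.

```python
import re

# Values from TensorFlow's DataType enum: DT_FLOAT = 1, DT_DOUBLE = 2.
TF_DTYPE_NAMES = {1: "float", 2: "double"}

# Matches the tail of the fatal line printed from tensor.cc, e.g.
# "dtype() == expected_dtype (1 vs. 2) double expected, got float".
_MISMATCH = re.compile(
    r"dtype\(\) == expected_dtype \((\d+) vs\. (\d+)\) (\w+) expected, got (\w+)"
)


def decode_dtype_mismatch(line):
    """Hypothetical helper: return the tensor and requested dtypes, or None."""
    m = _MISMATCH.search(line)
    if m is None:
        return None
    tensor_code, requested_code = int(m.group(1)), int(m.group(2))
    return {
        # First number: dtype of the tensor actually stored in the model.
        "tensor_dtype": TF_DTYPE_NAMES.get(tensor_code, m.group(4)),
        # Second number: dtype the calling code asked for.
        "requested_dtype": TF_DTYPE_NAMES.get(requested_code, m.group(3)),
    }


line = (
    "2021-10-14 10:01:44.853361: F tensorflow/core/framework/tensor.cc:636] "
    "Check failed: dtype() == expected_dtype (1 vs. 2) double expected, got float"
)
print(decode_dtype_mismatch(line))
```

Read this way, the log above says the model holds float tensors while the caller asked for double, which matches the session_get_scalar<double> frame (mangled as session_get_scalarIdE) in the stack trace.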

leonf88 · 2021-10-14T12:28:30Z

leonf88
Oct 14, 2021
Author

Fixed. In CMakeLists.txt, change set(HIGH_PREC_DEF "HIGH_PREC") to set(HIGH_PREC_DEF "LOW_PREC"), then reinstall dp and lmp_mpi.

0 replies

wanghan-iapcm · 2021-10-15T00:02:16Z

wanghan-iapcm
Oct 15, 2021
Maintainer

@njzjz is the lmp_mpi built only with double precision?

2 replies

njzjz Oct 15, 2021
Maintainer

https://docs.deepmodeling.org/projects/deepmd/en/v2.0.2/install/install-lammps.html

If you need the low-precision version, move env_low.sh to env.sh in the directory.

leonf88 Oct 15, 2021
Author

Yes, it works. But you need to delete Makefile.package before recompiling LAMMPS, or else compile LAMMPS completely from scratch.
Thanks!
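
A quick way to confirm that a rebuild took effect is to compare the "build float prec" lines that dp train and lmp_mpi print in the output quoted in the question. The following is only a sketch under the assumption that both programs keep printing that line in the same form. The function names build_precision and precisions_match are hypothetical and not part of DeePMD-kit.

```python
import re

# Matches lines such as "DEEPMD INFO    build float prec:     float"
# and "  build float prec:   double", as quoted in the question.
_PREC = re.compile(r"build float prec:\s*(\w+)")


def build_precision(log_text):
    """Return the first 'build float prec' value found in the text, or None."""
    m = _PREC.search(log_text)
    return m.group(1) if m else None


def precisions_match(train_log, lammps_log):
    """True only if both logs report a precision and the two values agree."""
    train = build_precision(train_log)
    infer = build_precision(lammps_log)
    return train is not None and train == infer


# The two lines from the question: training in float, lmp_mpi built for double.
train_log = "DEEPMD INFO    build float prec:     float\n"
lammps_log = "  build float prec:   double\n"
print(build_precision(train_log), build_precision(lammps_log))
print(precisions_match(train_log, lammps_log))
```

With the two lines from the question the check reports a mismatch; after a successful low-precision rebuild both sides should report float.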

sigbjobo · 2022-08-03T00:26:14Z

sigbjobo
Aug 3, 2022

On a similar note, I am also trying to use low precision and followed the suggestions above, with some success (deepmd 2.1.3). I get far, but hit the following error right before the LAMMPS MD loop:
2022-08-02 19:09:37.313538: F tensorflow/core/framework/tensor.cc:718] Check failed: dtype() == expected_dtype (2 vs. 1) float expected, got double
Is this due to:

- training with a different precision (double precision), or
- how I compiled TensorFlow?

See output.txt for more details. Please note that high precision works, but is much slower for the A40 GPU. Any suggestions are much appreciated!

5 replies

njzjz Aug 3, 2022
Maintainer

You have to train the model in single precision. The precision is defined in the model.

sigbjobo Aug 3, 2022

I understand. So there is no way of converting a double precision model into single precision (for example with dp freeze)?

wanghan-iapcm Aug 4, 2022
Maintainer

dp transform will convert the model.

sigbjobo Aug 4, 2022

> dp transform will convert the model.

dp: error: argument command: invalid choice: 'transform' (choose from 'config', 'transfer', 'train', 'freeze', 'test', 'compress', 'doc-train-input', 'model-devi', 'convert-from', 'neighbor-stat', 'train-nvnmd')
Did you perhaps mean transfer? If so, which arguments should I use?

wanghan-iapcm Aug 5, 2022
Maintainer

dp transfer https://docs.deepmodeling.com/projects/deepmd/en/v2.1.3/cli.html#transfer
