Reply: We'll track the issue in #4167.
Dear Developers,
I am using DeePMD-kit v3.0.0b3 for pretraining and fine-tuning with DPA-2. The software was installed offline with a CUDA 11.8 build, which matches my system's CUDA version. Both pretraining and fine-tuning complete successfully, and molecular dynamics simulations with ASE run without issues. However, when I run LAMMPS with the frozen model file (model.pth) using the command lmp -in in.lmp, I get an error; the same error also occurs with a version I compiled myself. Could you please help me resolve this issue? Thank you for your assistance.
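For context, a minimal LAMMPS input for a frozen DeePMD PyTorch model typically looks like the sketch below; the data file name and atom style are illustrative and may differ from the actual in.lmp shared via the link.

```
# Minimal DeePMD-kit LAMMPS input sketch (file names are placeholders)
units        metal
boundary     p p p
atom_style   atomic          # the actual run may use a different atom style
read_data    system.data     # placeholder for the user's data file

pair_style   deepmd model.pth   # frozen PyTorch-backend model
pair_coeff   * *

timestep     0.0001
run          100
```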
Here are all the relevant files:
link: https://pan.baidu.com/s/1dfFwZhzANTwI70Pf7We5Hg
extract code: a88t
The error output of lmp -in in.lmp:
WARNING: There was an error initializing an OpenFabrics device.
Local host: xc06n08
Local device: mlx5_0
LAMMPS (2 Aug 2023)
OMP_NUM_THREADS environment is not set. Defaulting to 1 thread. (src/comm.cpp:98)
using 1 OpenMP thread(s) per MPI task
DeePMD-kit: Successfully load libcudart.so.12
2024-09-02 01:44:51.820682: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-09-02 01:44:51.820809: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-09-02 01:44:51.821687: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
Loaded 1 plugins from /share/home/yjli/apps/dp-v300b3-cuda124/lib/deepmd_lmp
Reading data file ...
orthogonal box = (0 0 0) to (50 50 50)
1 by 1 by 1 MPI processor grid
reading atoms ...
6 atoms
Finding 1-2 1-3 1-4 neighbors ...
special bond factors lj: 0 0 0
special bond factors coul: 0 0 0
0 = max # of 1-2 neighbors
0 = max # of 1-3 neighbors
0 = max # of 1-4 neighbors
1 = max # of special neighbors
special bonds CPU = 0.000 seconds
read_data CPU = 0.003 seconds
DeePMD-kit WARNING: Environmental variable DP_INTRA_OP_PARALLELISM_THREADS is not set. Tune DP_INTRA_OP_PARALLELISM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit WARNING: Environmental variable DP_INTER_OP_PARALLELISM_THREADS is not set. Tune DP_INTER_OP_PARALLELISM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit WARNING: Environmental variable OMP_NUM_THREADS is not set. Tune OMP_NUM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
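The three warnings above only concern performance tuning, not the crash itself; they can be addressed by exporting the variables before launching LAMMPS. The values below are illustrative, assuming a few CPU cores per MPI task, and should be tuned for the actual node:

```shell
# Illustrative thread settings; tune per node (see https://deepmd.rtfd.io/parallelism/)
export DP_INTRA_OP_PARALLELISM_THREADS=2   # threads used within a single op
export DP_INTER_OP_PARALLELISM_THREADS=1   # ops executed concurrently
export OMP_NUM_THREADS=2                   # OpenMP threads per MPI task
```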
Summary of lammps deepmd module ...
CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE
Your simulation uses code contributions which should be cited:
The log file lists these citations in BibTeX format.
CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE
Generated 0 of 1 mixed pair_coeff terms from geometric mixing rule
Neighbor list info ...
update: every = 2 steps, delay = 10 steps, check = yes
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 11
ghost atom cutoff = 11
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair deepmd, perpetual
attributes: full, newton on
pair build: full/nsq
stencil: none
bin: none
WARNING: Proc sub-domain size < neighbor skin, could lead to lost atoms (src/domain.cpp:966)
Setting up Verlet run ...
Unit style : metal
Current step : 0
Time step : 0.0001
WARNING: Communication cutoff adjusted to 11 (src/comm.cpp:732)
ERROR on proc 0: DeePMD-kit C API Error: DeePMD-kit Error: DeePMD-kit PyTorch backend error: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch/deepmd/pt/model/model/transform_output.py", line 156, in forward_lower
vvi = split_vv1[_44]
svvi = split_svv1[_44]
_45 = _36(vvi, svvi, coord_ext, do_virial, do_atomic_virial, create_graph, )