Commit 7faa186

author
Han Wang
committed
smooth version of deepmd. better mixed systems. add mpi support in lammps. add small box support.
1 parent 21fb516 commit 7faa186

42 files changed

+3832
-529
lines changed

README.md

Lines changed: 82 additions & 17 deletions
Original file line number | Diff line number | Diff line change
@@ -17,7 +17,9 @@
1717
- [License](#license)
1818

1919
# Install DeePMD-kit
20-
The installation of the DeePMD-kit is lengthy, but do not be panic. Just follow step by step. Wish you good luck..
20+
The installation of the DeePMD-kit is lengthy, but do not be panic. Just follow step by step. Wish you good luck..
21+
22+
A docker for installing the DeePMD-kit on CentOS 7 is available [here](https://github.com/TimChen314/deepmd-kit_docker).
2123

2224
## Install tensorflow's Python interface
2325
There are two ways of installing the Python interface of tensorflow, either [using google's binary](https://www.tensorflow.org/install/install_linux), or [installing from sources](https://www.tensorflow.org/install/install_sources). When you are using google's binary, do not forget to add the option `-DTF_GOOGLE_BIN=true` when building DeePMD-kit.
@@ -168,12 +170,12 @@ dp_frz dp_ipi dp_mdnn dp_test dp_train
168170
```
169171

170172
## Install Lammps' DeePMD-kit module
171-
DeePMD-kit provide module for running serial MD simulation with Lammps. Notice that the parallel running is not support at this moment. Now make the DeePMD-kit module for lammps.
173+
DeePMD-kit provide module for running MD simulation with Lammps. Now make the DeePMD-kit module for lammps.
172174
```bash
173175
cd $deepmd_source_dir/source/build
174176
make lammps
175177
```
176-
If everything works fine, DeePMD-kit will generate a module called `USER-DEEPMD` in the `build` directory. Now download your favorite Lammps code, and uncompress it (I assume that you have downloaded the tar `lammps-stable.tar.gz`)
178+
DeePMD-kit will generate a module called `USER-DEEPMD` in the `build` directory. Now download your favorite Lammps code, and uncompress it (I assume that you have downloaded the tar `lammps-stable.tar.gz`)
177179
```bash
178180
cd /some/workspace
179181
tar xf lammps-stable.tar.gz
@@ -186,10 +188,16 @@ cp -r $deepmd_source_dir/source/build/USER-DEEPMD .
186188
Now build Lammps
187189
```bash
188190
make yes-user-deepmd
189-
make serial -j4
191+
make mpi -j4
192+
```
193+
The option `-j4` means using 4 processes in parallel. You may want to be use a different number according to your hardware.
194+
195+
If everything works fine, you will end up with an executable `lmp_mpi`.
196+
197+
The DeePMD-kit module can be removed from Lammps source code by
198+
```bash
199+
make no-user-deepmd
190200
```
191-
The option `-j4` means using 4 processes in parallel. You may want to be use a different number according to your hardware. If everything works fine, you will end up with an executable
192-
`lmp_serial`.
193201

194202
# Use DeePMD-kit
195203
In this text, we will call the deep neural network that is used to represent the interatomic interactions (Deep Potential) the **model**. The typical procedure of using DeePMD-kit is
@@ -238,6 +246,7 @@ box.raw coord.raw energy.raw force.raw set.000 set.001 set.002 type.raw
238246
It generates two sets `set.000`, `set.001` and `set.002`, with each set contains 2000 frames. The last set (`set.002`) is used as testing set, while the rest sets (`set.000` and `set.001`) are used as training sets. One do not need to take care the binary data files in each of the `set.*` directories. The path containing `set.*` and `type.raw` is called a *system*.
239247

240248
## Train a model
249+
### The standard DeePMD model
241250
The method of training is explained in our [DeePMD paper][1]. With the source code we provide a small training dataset taken from 400 frames generated by NVT ab-initio water MD trajectory with 300 frames for training and 100 for testing. [An example training parameter file](./examples/train/water.json) is provided. One can try with the training by
242251
```bash
243252
$ cd $deepmd_source_dir/examples/train/
@@ -247,10 +256,10 @@ $ $deepmd_root/bin/dp_train water.json
247256
```json
248257
{
249258
"_comment": " model parameters",
259+
"use_smooth": false,
250260
"sel_a": [16, 32],
251261
"sel_r": [30, 60],
252-
"rcut_a": -1,
253-
"rcut_r": 6.00,
262+
"rcut": 6.00,
254263
"axis_rule": [0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0],
255264
"_comment": " default rule: []",
256265
"_comment": " user defined rule: for each type provides two axes, ",
@@ -273,10 +282,9 @@ $ $deepmd_root/bin/dp_train water.json
273282
"limit_pref_e": 8,
274283
"start_pref_f": 1000,
275284
"limit_pref_f": 1,
276-
"start_pref_v": 0.02,
277-
"limit_pref_v": 8,
285+
"start_pref_v": 0,
286+
"limit_pref_v": 0,
278287

279-
"num_threads": 4,
280288
"seed": 1,
281289

282290
"_comment": " display and restart",
@@ -286,7 +294,6 @@ $ $deepmd_root/bin/dp_train water.json
286294
"numb_test": 100,
287295
"save_freq": 100,
288296
"save_ckpt": "model.ckpt",
289-
"restart": false,
290297
"load_ckpt": "model.ckpt",
291298
"disp_training": true,
292299
"time_training": true,
@@ -295,7 +302,7 @@ $ $deepmd_root/bin/dp_train water.json
295302
}
296303
```
297304

298-
The option **`rcut_r`** is the cut-off radius for neighbor searching. The `sel_a` and `sel_r` are the maximum selected numbers of fully-local-coordinate and radial-only-coordinate atoms from the neighbor list, respectively. `sel_a + sel_r` should larger than the maximum possible number of neighbors in the cut-off radius. `sel_a` and `sel_r` are vectors, the length of the vectors are same as the number of atom types in the system. `sel_a[i]` and `sel_r[i]` denote the selected number of neighbors of type `i`.
305+
The option **`rcut`** is the cut-off radius for neighbor searching. The `sel_a` and `sel_r` are the maximum selected numbers of fully-local-coordinate and radial-only-coordinate atoms from the neighbor list, respectively. `sel_a + sel_r` should larger than the maximum possible number of neighbors in the cut-off radius. `sel_a` and `sel_r` are vectors, the length of the vectors are same as the number of atom types in the system. `sel_a[i]` and `sel_r[i]` denote the selected number of neighbors of type `i`.
299306

300307
The option **`axis_rule`** specifies how to make the axis for the local coordinate of each atom. For each atom type, 6 integers should be provided. The first three for the first axis, while the last three for the second axis. Within the three integers, the first one specifies if the axis atom is fully-local-coordinated (`0`) or radial-only-coordinated (`1`). The second integer specifies the type of the axis atom. If this number is less than 0, saying `t < 0`, then this axis exclude atom of type `-(t+1)`. If the third integer is, saying `s`, then the axis atom is the `s`th nearest neighbor satisfying the previous two conditions.
301308

@@ -314,15 +321,63 @@ The options **`start_pref_e`**, **`limit_pref_e`**, **`start_pref_f`**, **`limit
314321
```math
315322
w_f(t) = start_pref_f * ( lr(t) / start_lr ) + limit_pref_f * ( 1 - lr(t) / start_lr )
316323
```
324+
Since we do not have virial data, the virial prefactors `start_pref_v` and `limit_pref_v` are set to 0.
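
As a rough illustration of the prefactor formula above, the following Python sketch evaluates `w_f(t)` along a learning-rate schedule. The exponential decay `lr(t) = start_lr * decay_rate ** (t / decay_steps)` and the helper names `learning_rate` and `prefactor` are assumptions made for this example and are not taken from the DeePMD-kit source; the numerical values are borrowed from the example parameter files.

```python
def learning_rate(step, start_lr=0.005, decay_rate=0.95, decay_steps=5000):
    """Assumed exponential decay; the schedule used by dp_train may differ."""
    return start_lr * decay_rate ** (step / decay_steps)


def prefactor(lr, start_lr, start_pref, limit_pref):
    """Interpolate between start_pref and limit_pref, driven by lr / start_lr."""
    ratio = lr / start_lr
    return start_pref * ratio + limit_pref * (1.0 - ratio)


if __name__ == "__main__":
    start_lr = 0.005
    for step in (0, 100000, 1000000):
        lr = learning_rate(step, start_lr=start_lr)
        # Force prefactor with start_pref_f = 1000 and limit_pref_f = 1.
        w_f = prefactor(lr, start_lr, start_pref=1000, limit_pref=1)
        print(step, lr, w_f)
```

At the start of training the force term carries the weight `start_pref_f`, and as the learning rate decays the weight approaches `limit_pref_f`. Setting both virial prefactors to 0 makes the corresponding weight 0 for every step.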
317325

318-
The option **`num_threads`** specifies the number of threads used in the training.
319-
320-
The option **`seed`** specifies the random seed for neural network initialization.
326+
The option **`seed`** specifies the random seed for neural network initialization. If not provided, the `seed` will be initialized with `None`.
321327

322328
During the training, the error of the model is tested every **`disp_freq`** batches with **`numb_test`** frames from the last set in the **`systems`** directory on the fly, and the results are output to **`disp_file`**.
323329

324330
Checkpoints will be written to files with prefix **`save_ckpt`** every **`save_freq`** batches. If **`restart`** is set to `true`, then the training will start from the checkpoint named **`load_ckpt`**, rather than from scratch.
325331

332+
Several command line options can be passed to `dp_train`, this can be checked with
333+
```bash
334+
$ $deepmd_root/bin/dp_train --help
335+
```
336+
An explanation will be provided
337+
```
338+
positional arguments:
339+
INPUT the input json database
340+
341+
optional arguments:
342+
-h, --help show this help message and exit
343+
-t INTER_THREADS, --inter-threads INTER_THREADS
344+
With default value 0. Setting the "inter_op_parallelism_threads" key for the tensorflow, the "intra_op_parallelism_threads" will be set by the env variable OMP_NUM_THREADS
345+
--init-model INIT_MODEL
346+
Initialize a model by the provided checkpoint
347+
--restart RESTART Restart the training from the provided checkpoint
348+
```
349+
The keys `intra_op_parallelism_threads` and `inter_op_parallelism_threads` are Tensorflow configurations for multithreading, which are explained [here](https://www.tensorflow.org/performance/performance_guide#optimizing_for_cpu). Skipping `-t` and `OMP_NUM_THREADS` leads to the default setting of these keys in the Tensorflow.
350+
351+
**`--init-model model.ckpt`**, for example, initializes the model training with an existing model that is stored in the checkpoint `model.ckpt`, the network architectures should match.
352+
353+
**`--restart model.ckpt`**, continues the training from the checkpoint `model.ckpt`.
354+
355+
### The smooth DeePMD model
356+
The smooth version of DeePMD can be trained by the DeePMD-kit. [An example training parameter file](./examples/train/water_smth.json) is provided. One can try with the training by
357+
```bash
358+
$ cd $deepmd_source_dir/examples/train/
359+
$ $deepmd_root/bin/dp_train water_smth.json
360+
```
361+
The difference between the standard and smooth DeePMD models lies in the model parameters:
362+
```json
363+
"use_smooth": true,
364+
"sel_a": [46, 92],
365+
"rcut_smth": 5.80,
366+
"rcut": 6.00,
367+
"filter_neuron": [25, 50, 100],
368+
"filter_resnet_dt": false,
369+
"n_axis_neuron": 16,
370+
"n_neuron": [240, 240, 240],
371+
"resnet_dt": true,
372+
```
373+
The `sel_r` option is skipped by the smooth version and the model use fully-local-coordinate for all neighboring atoms. The `sel_a` should larger than the maximum possible number of neighbors in the cut-off radius `rcut`.
374+
375+
The descriptors will decay smoothly from **`rcut_smth`** to the cutoff radius `rcut`.
376+
377+
**`filter_neuron`** provides the size of the filter network (also called local-embedding network). If the size of the next layer is the same or twice as the previous layer, then a skip connection is build (ResNet). **`filter_resnet_dt`** tells if a timestep is used in the skip connection. By default it is `false`. **`n_axis_neuron`** specifies the number of axis filter, which should be much smaller than the size of the last layer of the filter network.
378+
379+
**`n_neuron`** specifies the fitting network. If the size of the next layer is the same as the previous layer, then a skip connection is build (ResNet). **`resnet_dt`** tells if a timestep is used in the skip connection. By default it is `true`.
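
The skip-connection rule described above can be sketched in a few lines of Python. This is only a reading of the README text, not the DeePMD-kit implementation: the function name `skip_connection_plan` and the labels `"same"` and `"double"` are hypothetical, and the sketch only looks at transitions between the listed layers, since the README does not say how the input to the first layer is treated.

```python
def skip_connection_plan(layer_sizes, allow_doubling):
    """Return one entry per transition between consecutive layers.

    "same"   : next layer has the same size as the previous one
    "double" : next layer is twice as large (only if allow_doubling is True)
    None     : no skip connection according to the rule in the text
    """
    plan = []
    for previous, current in zip(layer_sizes, layer_sizes[1:]):
        if current == previous:
            plan.append("same")
        elif allow_doubling and current == 2 * previous:
            plan.append("double")
        else:
            plan.append(None)
    return plan


if __name__ == "__main__":
    # Filter network: same-size and doubled layers both qualify.
    print(skip_connection_plan([25, 50, 100], allow_doubling=True))
    # Fitting network: only same-size layers qualify.
    print(skip_connection_plan([240, 240, 240], allow_doubling=False))
```

With the example values, every transition in `filter_neuron` doubles the layer size and every transition in `n_neuron` keeps it, so both networks would get a skip connection at each listed transition under this reading.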
380+
326381

327382
## Freeze the model
328383
The trained neural network is extracted from a checkpoint and dumped into a database. This process is called "freeze" a model. Typically one does
@@ -331,7 +386,6 @@ $ $deepmd_root/bin/dp_frz -o graph.pb
331386
```
332387
in the folder where the model is trained. The output database is called `graph.pb`.
333388

334-
335389
## Run MD with Lammps
336390
Run an MD simulation with Lammps is simpler. In the Lammps input file, one needs to specify the pair style as follows
337391
```bash
@@ -340,6 +394,17 @@ pair_coeff
340394
```
341395
where `graph.pb` is the file name of the frozen model. The `pair_coeff` should be left blank. It should be noted that Lammps counts atom types starting from 1, therefore, all Lammps atom type will be firstly subtracted by 1, and then passed into the DeePMD-kit engine to compute the interactions.
342396

397+
### With long-range interaction
398+
The reciprocal space part of the long-range interaction can be calculated by lammps command `kspace_style`. To use it with DeePMD-kit, one writes
399+
```bash
400+
pair_style hybrid/overlay deepmd graph.pb coul/long 9.0
401+
pair_coeff * * deepmd
402+
pair_coeff * * coul/long
403+
pair_modify pair coul/long compute no
404+
kspace_style pppm 1.0e-5
405+
kspace_modify gewald 0.45
406+
```
407+
In this setting, the direct space part of the long-range interaction is ignored by the `pair_modify` command, because this part is fitted in the DeePMD model. The splitting parameter `gewald` is modified by the `kspace_modify` command.
343408

344409
## Run path-integral MD with i-PI
345410
The i-PI works in a client-server model. The i-PI provides the server for integrating the replica positions of atoms, while the DeePMD-kit provides a client named `dp_ipi` that computes the interactions (including energy, force and virial). The server and client communicates via the Unix domain socket or the Internet socket. The client can be started by

data/raw/copy_raw.py

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -7,9 +7,9 @@
77
def copy (in_dir,
88
out_dir,
99
ncopies = [1,1,1]) :
10-
has_energy = os.path.isfile (in_dir + "energy.raw")
11-
has_force = os.path.isfile (in_dir + "force.raw")
12-
has_virial = os.path.isfile (in_dir + "virial.raw")
10+
has_energy = os.path.isfile (in_dir + "/energy.raw")
11+
has_force = os.path.isfile (in_dir + "/force.raw")
12+
has_virial = os.path.isfile (in_dir + "/virial.raw")
1313

1414
i_box = np.loadtxt (in_dir + "/box.raw")
1515
i_coord = np.loadtxt (in_dir + "/coord.raw")
@@ -65,7 +65,7 @@ def copy (in_dir,
6565
np.savetxt (out_dir + "/force.raw", o_force)
6666
if has_virial :
6767
np.savetxt (out_dir + "/virial.raw", o_virial)
68-
np.savetxt (out_dir + "/type.raw", o_type)
68+
np.savetxt (out_dir + "/type.raw", o_type, fmt = '%d')
6969
np.savetxt (out_dir + "/ncopies.raw", ncopies, fmt = "%d")
7070

7171
def _main () :
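
The change to `copy` above matters when `in_dir` is given without a trailing slash: plain concatenation then produces a path such as `dataenergy.raw`, so the file is never found. A minimal sketch of the difference follows, together with `os.path.join` as one way to avoid the problem altogether. The helper `raw_file_exists` is hypothetical and not part of `copy_raw.py`.

```python
import os
import tempfile


def raw_file_exists(in_dir, name):
    """Check for a raw file; works with or without a trailing slash on in_dir."""
    return os.path.isfile(os.path.join(in_dir, name))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as in_dir:
        # Create an empty energy.raw inside the temporary directory.
        with open(os.path.join(in_dir, "energy.raw"), "w"):
            pass

        # Old behaviour: no separator, so the path points at a sibling name.
        print(os.path.isfile(in_dir + "energy.raw"))
        # Patched behaviour: explicit "/" separator.
        print(os.path.isfile(in_dir + "/energy.raw"))
        # Alternative: let os.path.join insert the separator.
        print(raw_file_exists(in_dir, "energy.raw"))
```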

data/raw/raw_to_set.sh

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@ if test $# -ge 1; then
66
nline_per_set=$1
77
fi
88

9+
rm -fr set.*
910
echo nframe is `cat energy.raw | wc -l`
1011
echo nline per set is $nline_per_set
1112

data/raw/shuffle_raw.py

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -42,17 +42,25 @@ def _main () :
4242
print ("# no file to shuffle, exit")
4343
return
4444

45+
assert ("box.raw" in raws)
46+
tmp = np.loadtxt(os.path.join(inpath, "box.raw"))
47+
tmp = np.reshape(tmp, [-1, 9])
48+
nframe = tmp.shape[0]
49+
print(nframe)
50+
4551
print ("# will shuffle raw files " + str(raws) +
4652
" in dir " + inpath +
4753
" and output to dir " + outpath)
4854

4955
tmp = np.loadtxt (inpath + "/" + raws[0])
56+
tmp = np.reshape(tmp, [nframe, -1])
5057
nframe = tmp.shape[0]
5158
idx = np.arange (nframe)
5259
np.random.shuffle(idx)
5360

5461
for ii in raws :
5562
data = np.loadtxt(inpath + "/" + ii)
63+
data = np.reshape(data, [nframe, -1])
5664
data = data [idx]
5765
np.savetxt (outpath + "/" + ii, data)
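
The reshapes added in this hunk give every raw file the shape `[nframe, -1]`, so that one permutation of frame indices can be applied to all of them, including single-column files such as `energy.raw`, which `np.loadtxt` returns as a one-dimensional array. The idea can be sketched without NumPy as follows. The function names and the in-memory `raws` dictionary are assumptions for illustration; the actual script reads and writes files.

```python
import random


def reshape_frames(flat_values, nframe):
    """Split a flat list into nframe rows of equal width (like reshape to [nframe, -1])."""
    if nframe <= 0 or len(flat_values) % nframe != 0:
        raise ValueError("cannot split %d values into %d frames"
                         % (len(flat_values), nframe))
    width = len(flat_values) // nframe
    return [flat_values[i * width:(i + 1) * width] for i in range(nframe)]


def shuffle_together(raws, nframe, seed=None):
    """Apply one shared random permutation of frame indices to every raw array."""
    idx = list(range(nframe))
    random.Random(seed).shuffle(idx)
    shuffled = {}
    for name, flat_values in raws.items():
        frames = reshape_frames(flat_values, nframe)
        shuffled[name] = [frames[i] for i in idx]
    return shuffled


if __name__ == "__main__":
    raws = {
        "energy.raw": [0.0, 1.0, 2.0],               # one value per frame
        "coord.raw": [0.0, 0.0, 1.0, 1.0, 2.0, 2.0],  # two values per frame
    }
    print(shuffle_together(raws, nframe=3, seed=1))
```

Because the same `idx` is used for every entry, frame `k` of the shuffled energies still belongs to frame `k` of the shuffled coordinates.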
5866

examples/train/water.json

Lines changed: 2 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,9 @@
11
{
22
"_comment": " model parameters",
3+
"use_smooth": false,
34
"sel_a": [16, 32],
45
"sel_r": [30, 60],
5-
"rcut_a": -1,
6-
"rcut_r": 6.00,
6+
"rcut": 6.00,
77
"axis_rule": [0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0],
88
"_comment": " default rule: []",
99
"_comment": " user defined rule: for each type provides two axes, ",
@@ -29,7 +29,6 @@
2929
"start_pref_v": 0,
3030
"limit_pref_v": 0,
3131

32-
"num_threads": 4,
3332
"seed": 1,
3433

3534
"_comment": " display and restart",
@@ -39,7 +38,6 @@
3938
"numb_test": 100,
4039
"save_freq": 100,
4140
"save_ckpt": "model.ckpt",
42-
"restart": false,
4341
"load_ckpt": "model.ckpt",
4442
"disp_training": true,
4543
"time_training": true,

examples/train/water_smth.json

Lines changed: 46 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
1+
{
2+
"_comment": " model parameters",
3+
"use_smooth": true,
4+
"sel_a": [46, 92],
5+
"rcut_smth": 5.80,
6+
"rcut": 6.00,
7+
"filter_neuron": [25, 50, 100],
8+
"filter_resnet_dt": false,
9+
"n_axis_neuron": 16,
10+
"n_neuron": [240, 240, 240],
11+
"resnet_dt": true,
12+
13+
"_comment": " traing controls",
14+
"systems": ["../data/water/"],
15+
"set_prefix": "set",
16+
"stop_batch": 1000000,
17+
"batch_size": 1,
18+
"start_lr": 0.005,
19+
"decay_steps": 5000,
20+
"decay_rate": 0.95,
21+
22+
"start_pref_e": 0.02,
23+
"limit_pref_e": 1,
24+
"start_pref_f": 1000,
25+
"limit_pref_f": 1,
26+
"start_pref_v": 0,
27+
"limit_pref_v": 0,
28+
29+
"seed": 1,
30+
31+
"_comment": " display and restart",
32+
"_comment": " frequencies counted in batch",
33+
"disp_file": "lcurve.out",
34+
"disp_freq": 100,
35+
"numb_test": 50,
36+
"save_freq": 100,
37+
"save_ckpt": "model.ckpt",
38+
"load_ckpt": "model.ckpt",
39+
"disp_training": true,
40+
"time_training": true,
41+
"profiling": false,
42+
"profiling_file": "timeline.json",
43+
44+
"_comment": "that's all"
45+
}
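
Before training with a file like this, it can help to sanity-check the smooth-model parameters. The sketch below is not part of DeePMD-kit. The function `check_smooth_params` and its rules are assumptions drawn from the README text in this commit: the descriptor decays from `rcut_smth` to `rcut`, which suggests `rcut_smth < rcut`, and `n_axis_neuron` should be smaller than the last filter layer. The embedded fragment repeats only the model parameters.

```python
import json

# Model-parameter fragment copied from the example file above.
FRAGMENT = """
{
    "use_smooth": true,
    "sel_a": [46, 92],
    "rcut_smth": 5.80,
    "rcut": 6.00,
    "filter_neuron": [25, 50, 100],
    "n_axis_neuron": 16
}
"""


def check_smooth_params(params):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if not params.get("use_smooth", False):
        problems.append("use_smooth is not true")
    if not params["rcut_smth"] < params["rcut"]:
        problems.append("rcut_smth should be smaller than rcut")
    if not params["n_axis_neuron"] < params["filter_neuron"][-1]:
        problems.append("n_axis_neuron should be smaller than the last filter layer")
    return problems


if __name__ == "__main__":
    print(check_smooth_params(json.loads(FRAGMENT)))
```

Note that the full example file repeats the `_comment` key several times; Python's `json` module accepts this and keeps only the last value for a repeated key.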
46+
