source/_posts/tutorial1.md
Lines changed: 12 additions & 49 deletions (+12 -49)
@@ -7,9 +7,7 @@ tags:
7
7
- DeePMD-kit
8
8
---
9
9
10
-
DeePMD-kit is a software to implement Deep Potential.
11
-
There is a lot of information on the Internet, but there are not so many tutorials for the new hand, and the official guide is too long.
12
-
Today, I’ll take you 5 minutes to get started with DeePMD-kit.
10
+
DeePMD-kit is a software package that implements Deep Potential. There is plenty of information on the Internet, but few tutorials for newcomers, and the official guide is long. Today, I'll take 5 minutes to get you started with DeePMD-kit.
13
11
14
12
Let's take a look at the training process of DeePMD-kit:
Preparing data is converting the computational results of DFT to data that can be recognized by the DeePMD-kit.
24
-
Training is train a Deep Potential model using the DeePMD-kit with data prepared in the previous step.
25
-
Finally, what we need to do is to freeze the restart file in the training process into a model, in other words is to extract the neural network parameters into a file for subsequent use.
26
-
I believe you can't wait to get started. Let's go!
19
+
What? Only three steps? Yes, it's that simple. Preparing data means converting the computational results of DFT into data that DeePMD-kit can recognize. Training means fitting a Deep Potential model with DeePMD-kit on the data prepared in the previous step. Finally, we freeze the restart file written during training into a model; in other words, we extract the neural network parameters into a file for subsequent use. I believe you can't wait to get started. Let's go!
27
20
28
-
The data format of the DeePMD-kit is introduced in the [official document](https://deepmd.readthedocs.io/) but seems complex.
29
-
Don't worry, I'd like to introduce a data processing tool: dpdata!
30
-
You can use only one line Python scripts to process data.
31
-
So easy!
21
+
The data format of DeePMD-kit is described in the [official documentation](https://deepmd.readthedocs.io/), but it can seem complex. Don't worry, I'd like to introduce a data processing tool: dpdata! A single line of Python is enough to process the data. So easy!
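As a minimal sketch of such a conversion, assuming dpdata is installed and a VASP `OUTCAR` sits in the working directory: `LabeledSystem`, the `vasp/outcar` reader and the `deepmd/npy` writer come from dpdata, while the wrapper name `convert_outcar` and its default arguments are hypothetical choices made for this illustration.

```python
def convert_outcar(outcar_path="OUTCAR", out_dir="data", set_size=200):
    """Convert a VASP OUTCAR into DeePMD-kit's npy layout (sketch)."""
    # Imported lazily so this file still loads where dpdata is not installed.
    import dpdata

    # LabeledSystem reads the frames together with their energies and forces.
    system = dpdata.LabeledSystem(outcar_path, fmt="vasp/outcar")
    # Write the frames under out_dir, grouped into sets of set_size frames.
    system.to("deepmd/npy", out_dir, set_size=set_size)
    return system
```

Calling `convert_outcar()` with these defaults reads `OUTCAR` and writes the converted data to `data`.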
In this example, we converted the computaional results of the VASP in the `OUTCAR` to the data format of the DeePMD-kit and saved in to a directory named `data`,
39
-
where `npy` is the compressed format of the numpy, which is required by the DeePMD-kit training.
40
-
We assume `OUTCAR` stores 1000 frames of molecular dynamics trajectory, then where will be 1000 points after converting.
41
-
`set_size=200` means these 1000 points will be divided into 5 subsets, which is named as `data/set.000`~`data/set.004`, respectively.
42
-
The size of each set is 200.
43
-
In these 5 sets, `data/set.000`~`data/set.003` will be considered as the training set by the DeePMD-kit, and `data/set.004` will be considered as the test set.
44
-
The last set will be considered as the test set by the DeePMD-kit by default.
45
-
If there is only one set, the set will be both the training set and the test set. (Of course, such test set is meaningless.)
46
-
It's required to prepare an input script to start the DeePMD-kit training.
47
-
Are you still out of the fear of being dominated by INCAR script?
48
-
Don't worry, it's much easier to configure the DeePMD-kit than configuring the VASP.
49
-
First, let's download an example and save to `input.json`:
28
+
In this example, we converted the computational results of VASP stored in `OUTCAR` to the data format of DeePMD-kit and saved them to a directory named `data`, where `npy` is NumPy's binary array format, which DeePMD-kit training requires. Suppose `OUTCAR` stores a molecular dynamics trajectory of 1000 frames; then there will be 1000 data points after conversion. `set_size=200` means these 1000 points are divided into 5 subsets of 200 frames each, named `data/set.000` through `data/set.004`. By default, DeePMD-kit treats the last set as the test set, so `data/set.000` through `data/set.003` form the training set and `data/set.004` is the test set. If there is only one set, it serves as both the training set and the test set. (Of course, such a test set is meaningless.) An input script is required to start DeePMD-kit training. Still haunted by the fear of being dominated by INCAR scripts? Don't worry, configuring DeePMD-kit is much easier than configuring VASP. First, let's download an example and save it to `input.json`:
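As an aside on the data layout, the set bookkeeping described above can be sketched in plain Python. The helper `split_into_sets` is a hypothetical name, and keeping a trailing partial set as a smaller set is an assumption; the 1000-frame, `set_size=200` case from the text divides evenly either way.

```python
def split_into_sets(n_frames, set_size):
    """Return (set_name, first_frame, end_frame_exclusive) for each subset."""
    sets = []
    for index, start in enumerate(range(0, n_frames, set_size)):
        # Assumption: a trailing partial set is kept as a smaller set.
        stop = min(start + set_size, n_frames)
        sets.append((f"set.{index:03d}", start, stop))
    return sets


sets = split_into_sets(n_frames=1000, set_size=200)

# Per the text: the last set is the test set and the others are training sets.
# With a single set, that set plays both roles.
test_set = sets[-1][0]
training_sets = [name for name, _, _ in sets[:-1]] or [test_set]

print(training_sets)  # ['set.000', 'set.001', 'set.002', 'set.003']
print(test_set)       # set.004
```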
The strength of the DeePMD-kit is that the same training parameters are suitable for different systems, so we only need to slightly modify `input.json` to start training.
56
-
Here is the first parameter to modify:
34
+
The strength of the DeePMD-kit is that the same training parameters are suitable for different systems, so we only need to slightly modify `input.json` to start training. Here is the first parameter to modify:
57
35
58
36
```json
"type_map": ["O", "H"],
```
61
39
62
-
In the DeePMD-kit data, each atom type is numbered as an integer starting from 0.
63
-
The parameter gives an element name to each atom in the numbering system.
64
-
Here, we can copy from the content of `data/type_map.raw`.
65
-
For example,
40
+
In the DeePMD-kit data, each atom type is numbered with an integer starting from 0. This parameter assigns an element name to each type number. Here, we can copy the contents of `data/type_map.raw`. For example,
66
41
67
42
```json
68
43
"type_map": ["A", "B","C"],
@@ -74,13 +49,7 @@ Next, we are going to modify the neighbour searching parameter:
74
49
"sel": [46, 92],
75
50
```
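Each entry of `sel` is an upper bound on how many neighbours of one type an atom can have within the cutoff. When that maximum is unknown, one rough way to guess it is from the number density: multiply the density of each type by the volume of the cutoff sphere and add some headroom. The sketch below does this under stated assumptions: the function name `estimate_sel`, the cutoff radius of 6.0 Å, and the safety factor of 1.5 are all hypothetical choices, not values taken from DeePMD-kit.

```python
import math


def estimate_sel(number_densities, rcut, safety_factor=1.5):
    """Estimate per-type neighbour limits from number densities (atoms/Å^3)."""
    # Volume of the cutoff sphere around one atom.
    sphere_volume = 4.0 / 3.0 * math.pi * rcut**3
    # Average neighbour count per type, padded by the safety factor.
    return [
        math.ceil(density * sphere_volume * safety_factor)
        for density in number_densities
    ]


# Illustrative values only: liquid water near ambient conditions has roughly
# 0.0334 O atoms and 0.0668 H atoms per cubic angstrom.
print(estimate_sel([0.0334, 0.0668], rcut=6.0))
```

An average-density estimate can still be too small for inhomogeneous systems, so the warnings printed by DeePMD-kit remain the final check.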
76
51
77
-
Each number in this list gives the maximum number of atoms of each type among neighbor atoms of an atom.
78
-
For example, `46` means there are at most 46 `O` (type `0`) neighbours.
79
-
Here, our elements were modified to `A`, `B`, and `C`, so this parameters is also required to modify.
80
-
What to do if you don’t know the maximum number of neighbors?
81
-
You can be roughly estimate one by the density of the system, or try a number blindly.
82
-
If it is not big enough, the DeePMD-kit will shoot WARNINGS.
83
-
Below we changed it to
52
+
Each number in this list gives the maximum number of atoms of the corresponding type among the neighbors of any atom. For example, `46` means there are at most 46 `O` (type `0`) neighbors. Here, our elements were changed to `A`, `B`, and `C`, so this parameter must be modified as well. What if you don't know the maximum number of neighbors? You can roughly estimate it from the density of the system, or simply try a number. If it is not big enough, DeePMD-kit will print warnings. Below we changed it to
84
53
85
54
```json
86
55
"sel": [64, 64, 64]
@@ -98,15 +67,11 @@ to
98
67
"systems": ["./data/"],
99
68
```
100
69
101
-
It is because that the directory to write to is `./data/` in the current directory.
102
-
Here I'd like to introduce the definition of the data system.
103
-
The DeePMD-kit considers that data with corresponding element types and atomic numbers form a system.
104
-
Our data is generated from a molecular dynamics simulation and meets this condition, so we can put them into one system.
105
-
Dpdata works the same way.
106
-
If data cannot be put into a system, multiple systems is required to be set as a list here:
70
+
This is because the data was written to `./data/` in the current directory. Here I'd like to introduce the definition of a data system. DeePMD-kit considers data with the same element types and the same number of atoms of each type to form a system. Our data is generated from a single molecular dynamics simulation and meets this condition, so we can put it all into one system. dpdata works the same way. If the data cannot be put into one system, multiple systems must be given as a list here:
107
71
```json
108
72
"training": {
109
-
"systems": [system1, system2]
73
+
"systems": ["system1", "system2"]
74
+
```
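The edits made so far (`type_map`, `sel`, and `systems`) can also be applied programmatically. The sketch below works on a small stand-in dictionary rather than the real downloaded file. Where these keys sit (`model`, `model.descriptor`, `training`) is an assumption about the example input and may differ between DeePMD-kit versions. The original `systems` value is a placeholder, and `customize` is a hypothetical helper name.

```python
import json

# Stand-in for the downloaded input.json; the nesting is an assumption.
config = {
    "model": {
        "type_map": ["O", "H"],
        "descriptor": {"sel": [46, 92]},
    },
    "training": {"systems": ["placeholder/path/"]},
}


def customize(config, type_map, sel, systems):
    """Return a copy of config with the three tutorial parameters replaced."""
    if len(type_map) != len(sel):
        raise ValueError("sel needs exactly one entry per atom type")
    updated = json.loads(json.dumps(config))  # cheap deep copy of JSON data
    updated["model"]["type_map"] = list(type_map)
    updated["model"]["descriptor"]["sel"] = list(sel)
    updated["training"]["systems"] = list(systems)
    return updated


new_config = customize(config, ["A", "B", "C"], [64, 64, 64], ["./data/"])
print(json.dumps(new_config, indent=4))
```

Checking that `sel` and `type_map` have the same length catches the easy mistake of changing the elements without updating the neighbour limits.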
110
75
111
76
Finally, we will likely want to modify two more parameters:
112
77
@@ -128,14 +93,12 @@ Now we have successfully set up an input file! To start training, we execute
```sh
dp train input.json
```
130
95
131
-
and wait for results. During the training process, we can see `lcurve.out` to observe the error reduction.
132
-
Among them, Column 4 and 5 are the test and training errors of energy (normalized by the number of atoms), and Column 6 and 7 are the test and training errors of the force.
96
+
and wait for results. During training, we can check `lcurve.out` to watch the errors decrease. In this file, columns 4 and 5 are the test and training errors of the energy (normalized by the number of atoms), and columns 6 and 7 are the test and training errors of the force.
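To pull those columns out of `lcurve.out`, a small parser is enough. The sketch below assumes whitespace-separated columns, comment lines starting with `#`, and a batch index in column 1. The function name `read_lcurve` is hypothetical, and the sample text is synthetic, not real training output.

```python
def read_lcurve(text):
    """Extract the energy and force error columns from lcurve.out text."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = [float(value) for value in line.split()]
        rows.append({
            "batch": int(fields[0]),    # assumed: column 1 is the batch index
            "energy_test": fields[3],   # column 4
            "energy_train": fields[4],  # column 5
            "force_test": fields[5],    # column 6
            "force_train": fields[6],   # column 7
        })
    return rows


# Synthetic numbers for illustration only.
sample = """\
# synthetic lcurve-like data
0    2.0e+01 1.9e+01 1.0e+00 1.1e+00 8.0e-01 7.5e-01 1.0e-03
100  5.0e+00 4.8e+00 2.0e-01 2.1e-01 3.0e-01 2.9e-01 9.0e-04
"""

for row in read_lcurve(sample):
    print(row["batch"], row["energy_test"], row["force_test"])
```

With a real run, the same function could be applied to the contents of `lcurve.out` read from disk.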
133
97
134
98
After training, we can use the following script to freeze the model:
135
99
136
100
```sh
dp freeze
```
139
103
140
-
The default filename of the output model is `frozen_model.pb`. As so, we have got a good or bad DP model.
141
-
As for the reliability of this model and how to use it, I will give you a detailed tutorial in the next post.
104
+
The default filename of the output model is `frozen_model.pb`. And with that, we have a DP model, whether good or bad. As for how reliable this model is and how to use it, I will give a detailed tutorial in the next post.