Commit b00e29d

Merge pull request #24 from deepmodeling/master
master update
2 parents 07c42c1 + b2662e2 commit b00e29d

127 files changed: +7338 -1970 lines changed

.travis.yml

Lines changed: 23 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -14,9 +14,10 @@ addons:
1414
- g++-7
1515
- gcc-8
1616
- g++-8
17-
matrix:
17+
jobs:
1818
include:
19-
- python: 3.6
19+
- stage: unit tests
20+
python: 3.6
2021
env:
2122
- CC=gcc-4.8
2223
- CXX=g++-4.8
@@ -65,14 +66,31 @@ matrix:
6566
env:
6667
- CC=gcc-5
6768
- CXX=g++-5
68-
- TENSORFLOW_VERSION=2.0
69+
- TENSORFLOW_VERSION=2.1
6970
- python: 3.7
7071
env:
7172
- CC=gcc-8
7273
- CXX=g++-8
73-
- TENSORFLOW_VERSION=2.0
74+
- TENSORFLOW_VERSION=2.1
75+
- stage: build whls
76+
services: docker
77+
env:
78+
- TWINE_USERNAME=__token__
79+
- CIBW_BUILD="cp36-* cp37-*"
80+
- CIBW_BEFORE_BUILD="pip install tensorflow && sed -i 's/libresolv.so.2\"/libresolv.so.2\", \"libtensorflow_framework.so.2\"/g' \$(find / -name policy.json)"
81+
- CIBW_SKIP="*-win32 *-manylinux_i686"
82+
- CC=gcc-7
83+
- CXX=g++-7
84+
- TENSORFLOW_VERSION=2.1
85+
install:
86+
- python -m pip install twine cibuildwheel==1.1.0 scikit-build setuptools_scm
87+
script:
88+
- python -m cibuildwheel --output-dir wheelhouse
89+
- python setup.py sdist
90+
after_success:
91+
- if [[ $TRAVIS_TAG ]]; then python -m twine upload wheelhouse/*; python -m twine upload dist/*.tar.gz; fi
7492
before_install:
75-
# - pip install --upgrade pip
93+
#- pip install --upgrade pip
7694
- pip install --upgrade setuptools
7795
- pip install tensorflow==$TENSORFLOW_VERSION
7896
install:

README.md

Lines changed: 56 additions & 41 deletions
Original file line numberDiff line numberDiff line change
@@ -10,9 +10,9 @@
1010
- [Deep Potential in a nutshell](#deep-potential-in-a-nutshell)
1111
- [Download and install](#download-and-install)
1212
- [Easy installation methods](#easy-installation-methods)
13+
- [Offline packages](#offline-packages)
1314
- [With Docker](#with-docker)
1415
- [With conda](#with-conda)
15-
- [Offline packages](#offline-packages)
1616
- [Install the python interaction](#install-the-python-interface)
1717
- [Install the Tensorflow's python interface](#install-the-tensorflows-python-interface)
1818
- [Install the DeePMD-kit's python interface](#install-the-deepmd-kits-python-interface)
@@ -90,8 +90,10 @@ Please follow our [github](https://github.com/deepmodeling/deepmd-kit) webpage t
9090
## Easy installation methods
9191
There various easy methods to install DeePMD-kit. Choose one that you prefer. If you want to build by yourself, jump to the next two sections.
9292

93-
### With Docker
94-
A docker for installing the DeePMD-kit on CentOS 7 is available [here](https://github.com/frankhan91/deepmd-kit_docker).
93+
After your easy installation, DeePMD-kit (`dp`) and LAMMPS (`lmp`) will be available to execute. You can try `dp -h` and `lmp -h` to see the help. `mpirun` is also available considering you may want to run LAMMPS in parallel.
94+
95+
### Offline packages
96+
Both CPU and GPU version offline packages are avaiable in [the Releases page](https://github.com/deepmodeling/deepmd-kit/releases).
9597

9698
### With conda
9799
DeePMD-kit is avaiable with [conda](https://github.com/conda/conda). Install [Anaconda](https://www.anaconda.com/distribution/#download-section) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html) first.
@@ -101,28 +103,37 @@ To install the CPU version:
101103
conda install deepmd-kit=*=*cpu lammps-dp=*=*cpu -c deepmodeling
102104
```
103105

104-
To install the GPU version containing [CUDA 10.0](https://docs.nvidia.com/deploy/cuda-compatibility/index.html#binary-compatibility__table-toolkit-driver):
106+
To install the GPU version containing [CUDA 10.1](https://docs.nvidia.com/deploy/cuda-compatibility/index.html#binary-compatibility__table-toolkit-driver):
105107
```bash
106108
conda install deepmd-kit=*=*gpu lammps-dp=*=*gpu -c deepmodeling
107109
```
108110

109-
### Offline packages
110-
Both CPU and GPU version offline package are avaiable in [the Releases page](https://github.com/deepmodeling/deepmd-kit/releases).
111+
### With Docker
112+
A docker for installing the DeePMD-kit is available [here](https://github.com/orgs/deepmodeling/packages/container/deepmd-kit).
113+
114+
To pull the CPU version:
115+
```bash
116+
docker pull ghcr.io/deepmodeling/deepmd-kit:1.2.0_cpu
117+
```
118+
119+
To pull the GPU version:
120+
```bash
121+
docker pull ghcr.io/deepmodeling/deepmd-kit:1.2.0_cuda10.1_gpu
122+
```
111123

112124
## Install the python interface
113125
### Install the Tensorflow's python interface
114-
First, check the python version and compiler version on your machine
126+
First, check the python version on your machine
115127
```bash
116-
python --version; gcc --version
128+
python --version
117129
```
118-
If your python version is 3.7.x, it is highly recommended that the GNU C/C++ compiler is higher than or equal to 5.0.
119130

120131
We follow the virtual environment approach to install the tensorflow's Python interface. The full instruction can be found on [the tensorflow's official website](https://www.tensorflow.org/install/pip). Now we assume that the Python interface will be installed to virtual environment directory `$tensorflow_venv`
121132
```bash
122133
virtualenv -p python3 $tensorflow_venv
123134
source $tensorflow_venv/bin/activate
124135
pip install --upgrade pip
125-
pip install --upgrade tensorflow==1.14.0
136+
pip install --upgrade tensorflow==2.1.0
126137
```
127138
It is notice that everytime a new shell is started and one wants to use `DeePMD-kit`, the virtual environment should be activated by
128139
```bash
@@ -136,31 +147,21 @@ If one has multiple python interpreters named like python3.x, it can be specifie
136147
```bash
137148
virtualenv -p python3.7 $tensorflow_venv
138149
```
139-
If one needs the GPU support of deepmd-kit, the GPU version of tensorflow should be installed by
140-
```bash
141-
pip install --upgrade tensorflow-gpu==1.14.0
150+
If one does not need the GPU support of deepmd-kit and is concerned about package size, the CPU-only version of tensorflow should be installed by
151+
```bash
152+
pip install --upgrade tensorflow-cpu==2.1.0
142153
```
143154
To verify the installation, run
144155
```bash
145-
python -c "import tensorflow as tf; sess=tf.Session(); print(sess.run(tf.reduce_sum(tf.random_normal([1000, 1000]))))"
156+
python -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
146157
```
147158
One should remember to activate the virtual environment every time he/she uses deepmd-kit.
148159

149160
### Install the DeePMD-kit's python interface
150161

151-
Clone the DeePMD-kit source code
162+
Execute
152163
```bash
153-
cd /some/workspace
154-
git clone --recursive https://github.com/deepmodeling/deepmd-kit.git deepmd-kit -b devel
155-
```
156-
If one downloads the .zip file from the github, then the default folder of source code would be `deepmd-kit-master` rather than `deepmd-kit`. For convenience, you may want to record the location of source to a variable, saying `deepmd_source_dir` by
157-
```bash
158-
cd deepmd-kit
159-
deepmd_source_dir=`pwd`
160-
```
161-
Then execute
162-
```bash
163-
pip install .
164+
pip install deepmd-kit
164165
```
165166
To test the installation, one may execute
166167
```bash
@@ -189,11 +190,30 @@ If one does not need to use DeePMD-kit with Lammps or I-Pi, then the python inte
189190

190191
### Install the Tensorflow's C++ interface
191192

192-
It is highly recommended that one keeps the same C/C++ compiler as the python interface. The C++ interface of DeePMD-kit was tested with compiler gcc >= 4.8. It is noticed that the I-Pi support is only compiled with gcc >= 4.9.
193+
Check the compiler version on your machine
194+
195+
```
196+
gcc --version
197+
```
198+
199+
The C++ interface of DeePMD-kit was tested with compiler gcc >= 4.8. It is noticed that the I-Pi support is only compiled with gcc >= 4.9.
193200

194201
First the C++ interface of Tensorflow should be installed. It is noted that the version of Tensorflow should be in consistent with the python interface. We assume that you have followed our instruction and installed tensorflow python interface 1.14.0 with, then you may follow [the instruction for CPU](doc/install-tf.1.14.md) to install the corresponding C++ interface (CPU only). If one wants GPU supports, he/she should follow [the instruction for GPU](doc/install-tf.1.14-gpu.md) to install the C++ interface.
195202

196203
### Install the DeePMD-kit's C++ interface
204+
205+
Clone the DeePMD-kit source code
206+
```bash
207+
cd /some/workspace
208+
git clone --recursive https://github.com/deepmodeling/deepmd-kit.git deepmd-kit
209+
```
210+
211+
For convenience, you may want to record the location of source to a variable, saying `deepmd_source_dir` by
212+
```bash
213+
cd deepmd-kit
214+
deepmd_source_dir=`pwd`
215+
```
216+
197217
Now goto the source code directory of DeePMD-kit and make a build place.
198218
```bash
199219
cd $deepmd_source_dir/source
@@ -437,8 +457,6 @@ positional arguments:
437457
438458
optional arguments:
439459
-h, --help show this help message and exit
440-
-t INTER_THREADS, --inter-threads INTER_THREADS
441-
With default value 0. Setting the "inter_op_parallelism_threads" key for the tensorflow, the "intra_op_parallelism_threads" will be set by the env variable OMP_NUM_THREADS
442460
--init-model INIT_MODEL
443461
Initialize a model by the provided checkpoint
444462
--restart RESTART Restart the training from the provided checkpoint
@@ -449,6 +467,15 @@ The keys `intra_op_parallelism_threads` and `inter_op_parallelism_threads` are T
449467

450468
**`--restart model.ckpt`**, continues the training from the checkpoint `model.ckpt`.
451469

470+
On some resources limited machines, one may want to control the number of threads used by DeePMD-kit. This is achieved by three environmental variables: `OMP_NUM_THREADS`, `TF_INTRA_OP_PARALLELISM_THREADS` and `TF_INTER_OP_PARALLELISM_THREADS`. `OMP_NUM_THREADS` controls the multithreading of DeePMD-kit implemented operations. `TF_INTRA_OP_PARALLELISM_THREADS` and `TF_INTER_OP_PARALLELISM_THREADS` controls `intra_op_parallelism_threads` and `inter_op_parallelism_threads`, which are Tensorflow configurations for multithreading. An explanation is found [here](https://stackoverflow.com/questions/41233635/meaning-of-inter-op-parallelism-threads-and-intra-op-parallelism-threads).
471+
472+
For example if you wish to use 3 cores of 2 CPUs on one node, you may set the environmental variables and run DeePMD-kit as follows:
473+
```bash
474+
export OMP_NUM_THREADS=6
475+
export TF_INTRA_OP_PARALLELISM_THREADS=3
476+
export TF_INTER_OP_PARALLELISM_THREADS=2
477+
dp train input.json
478+
```
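
The thread settings in the example above can also be derived programmatically. The following is a minimal sketch, not part of this commit: `thread_env` and `training_env` are hypothetical helper names, and the mapping (all cores for `OMP_NUM_THREADS`, cores per CPU for the intra-op setting, number of CPUs for the inter-op setting) is only one reading of the "3 cores of 2 CPUs" example.

```python
import os


def thread_env(cores_per_cpu, n_cpus):
    """Return the three thread-control variables as strings.

    Hypothetical helper: the mapping mirrors the README example
    (3 cores on each of 2 CPUs) and is an assumption, not a rule
    stated by DeePMD-kit.
    """
    return {
        # Threads for the operations implemented by DeePMD-kit itself.
        "OMP_NUM_THREADS": str(cores_per_cpu * n_cpus),
        # Threads used inside a single TensorFlow operation.
        "TF_INTRA_OP_PARALLELISM_THREADS": str(cores_per_cpu),
        # Number of TensorFlow operations that may run concurrently.
        "TF_INTER_OP_PARALLELISM_THREADS": str(n_cpus),
    }


def training_env(cores_per_cpu, n_cpus, base=None):
    """Copy an environment (os.environ by default) and add the thread settings."""
    env = dict(os.environ if base is None else base)
    env.update(thread_env(cores_per_cpu, n_cpus))
    return env
```

The resulting dictionary could then be handed to a child process, for example `subprocess.run(["dp", "train", "input.json"], env=training_env(3, 2))`, which leaves the parent shell's environment untouched.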
452479

453480
## Freeze a model
454481

@@ -606,18 +633,6 @@ rm -r *
606633
```
607634
and redo the `cmake` process.
608635

609-
## Training: TensorFlow abi binary cannot be found when doing training
610-
If you confront such kind of error:
611-
612-
```
613-
$deepmd_root/lib/deepmd/libop_abi.so: undefined symbol:
614-
_ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringB5cxx11Ev
615-
```
616-
617-
This may happen if you are using a gcc >= 5.0, and tensorflow was compiled with gcc < 5.0. You may set `-DOP_CXX_ABI=0` in the process of `cmake`.
618-
619-
Another possible reason might be the large gap between the python version of TensorFlow and the TensorFlow c++ interface.
620-
621636
## MD: cannot run LAMMPS after installing a new version of DeePMD-kit
622637
This typically happens when you install a new version of DeePMD-kit and copy directly the generated `USER-DEEPMD` to a LAMMPS source code folder and re-install LAMMPS.
623638

doc/install-tf.1.14.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -42,6 +42,7 @@ Now, copy the libraries to the tensorflow's installation directory:
4242
mkdir $tensorflow_root/lib
4343
cp -d bazel-bin/tensorflow/libtensorflow_cc.so* $tensorflow_root/lib/
4444
cp -d bazel-bin/tensorflow/libtensorflow_framework.so* $tensorflow_root/lib/
45+
cp -d $tensorflow_root/lib/libtensorflow_framework.so.1 $tensorflow_root/lib/libtensorflow_framework.so
4546
```
4647
Then copy the headers
4748
```bash

examples/water/train/polar.json

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -31,9 +31,9 @@
3131

3232
"learning_rate" :{
3333
"type": "exp",
34-
"start_lr": 0.001,
3534
"decay_steps": 5000,
36-
"decay_rate": 0.95,
35+
"start_lr": 0.001,
36+
"stop_lr": 3.51e-8,
3737
"_comment": "that's all"
3838
},
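
The hunk above replaces `decay_rate` with `stop_lr`. As a rough check of how the two relate, the sketch below assumes a plain exponential schedule, `lr = start_lr * decay_rate ** (step / decay_steps)`, and a total of 1,000,000 training steps. Neither the formula nor the step count appears in this diff, so both are assumptions, and the function names are hypothetical.

```python
import math


def implied_decay_rate(start_lr, stop_lr, decay_steps, total_steps):
    """Decay rate per `decay_steps` that takes start_lr to stop_lr.

    Assumes lr = start_lr * rate ** (step / decay_steps); this is an
    illustrative formula, not one quoted from the commit.
    """
    n_decays = total_steps / decay_steps
    return math.exp(math.log(stop_lr / start_lr) / n_decays)


def final_lr(start_lr, decay_rate, decay_steps, total_steps):
    """Learning rate reached after total_steps under the same assumed schedule."""
    return start_lr * decay_rate ** (total_steps / decay_steps)


# Assumed total number of training steps (not shown in the diff).
TOTAL_STEPS = 1_000_000

rate = implied_decay_rate(0.001, 3.51e-8, 5000, TOTAL_STEPS)
print(f"implied decay_rate: {rate:.4f}")
print(f"final lr with decay_rate=0.95: {final_lr(0.001, 0.95, 5000, TOTAL_STEPS):.3e}")
```

Under these assumptions the implied rate comes out very close to the removed value of 0.95, which suggests the new `stop_lr` values were chosen to reproduce the old schedules rather than to change them.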
3939

examples/water/train/polar_se_a.json

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@
33
"_comment": " model parameters",
44
"model":{
55
"type_map": ["O", "H"],
6-
"data_stat_nbatch": 1,
6+
"data_stat_nbatch": 10,
77
"descriptor" :{
88
"type": "se_a",
99
"sel": [46, 92],
@@ -18,7 +18,7 @@
1818
"fitting_net": {
1919
"type": "polar",
2020
"sel_type": [0],
21-
"fit_diag": true,
21+
"fit_diag": false,
2222
"neuron": [100, 100, 100],
2323
"resnet_dt": true,
2424
"seed": 1,
@@ -29,9 +29,9 @@
2929

3030
"learning_rate" :{
3131
"type": "exp",
32-
"start_lr": 0.01,
3332
"decay_steps": 5000,
34-
"decay_rate": 0.95,
33+
"start_lr": 0.01,
34+
"stop_lr": 3.51e-7,
3535
"_comment": "that's all"
3636
},
3737

examples/water/train/wannier.json

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -32,9 +32,9 @@
3232

3333
"learning_rate" :{
3434
"type": "exp",
35-
"start_lr": 0.001,
3635
"decay_steps": 5000,
37-
"decay_rate": 0.95,
36+
"start_lr": 0.001,
37+
"stop_lr": 3.51e-8,
3838
"_comment": "that's all"
3939
},
4040

examples/water/train/water.json

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,7 @@
33
"_comment": " model parameters",
44
"model":{
55
"type_map": ["O", "H"],
6+
"data_stat_nbatch": 10,
67
"descriptor": {
78
"type": "loc_frame",
89
"sel_a": [16, 32],
@@ -28,9 +29,9 @@
2829

2930
"learning_rate" :{
3031
"type": "exp",
31-
"start_lr": 0.001,
3232
"decay_steps": 5000,
33-
"decay_rate": 0.95,
33+
"start_lr": 0.001,
34+
"stop_lr": 3.51e-8,
3435
"_comment": "that's all"
3536
},
3637

examples/water/train/water_se_a.json

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -24,9 +24,9 @@
2424

2525
"learning_rate" :{
2626
"type": "exp",
27-
"start_lr": 0.001,
2827
"decay_steps": 5000,
29-
"decay_rate": 0.95,
28+
"start_lr": 0.001,
29+
"stop_lr": 3.51e-8,
3030
"_comment": "that's all"
3131
},
3232

examples/water/train/water_se_ar.json

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -35,9 +35,9 @@
3535

3636
"learning_rate" :{
3737
"type": "exp",
38-
"start_lr": 0.005,
3938
"decay_steps": 5000,
40-
"decay_rate": 0.95,
39+
"start_lr": 0.005,
40+
"stop_lr": 1.76e-7,
4141
"_comment": "that's all"
4242
},
4343

examples/water/train/water_se_r.json

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -23,9 +23,10 @@
2323
},
2424

2525
"learning_rate" : {
26-
"start_lr": 0.005,
26+
"type": "exp",
2727
"decay_steps": 5000,
28-
"decay_rate": 0.95,
28+
"start_lr": 0.005,
29+
"stop_lr": 1.76e-7,
2930
"_comment": " that's all"
3031
},
3132
