Commit bc01883

Merge pull request #226 from amcadmus/master
merge stable changes on devel into master
2 parents 07c42c1 + e015eac commit bc01883

121 files changed: +6408 -1769 lines changed

.travis.yml

Lines changed: 23 additions & 5 deletions
@@ -14,9 +14,10 @@ addons:
 - g++-7
 - gcc-8
 - g++-8
-matrix:
+jobs:
 include:
-- python: 3.6
+- stage: unit tests
+python: 3.6
 env:
 - CC=gcc-4.8
 - CXX=g++-4.8
@@ -65,14 +66,31 @@ matrix:
 env:
 - CC=gcc-5
 - CXX=g++-5
-- TENSORFLOW_VERSION=2.0
+- TENSORFLOW_VERSION=2.1
 - python: 3.7
 env:
 - CC=gcc-8
 - CXX=g++-8
-- TENSORFLOW_VERSION=2.0
+- TENSORFLOW_VERSION=2.1
+- stage: build whls
+services: docker
+env:
+- TWINE_USERNAME=__token__
+- CIBW_BUILD="cp36-* cp37-*"
+- CIBW_BEFORE_BUILD="pip install tensorflow && sed -i 's/libresolv.so.2\"/libresolv.so.2\", \"libtensorflow_framework.so.2\"/g' \$(find / -name policy.json)"
+- CIBW_SKIP="*-win32 *-manylinux_i686"
+- CC=gcc-7
+- CXX=g++-7
+- TENSORFLOW_VERSION=2.1
+install:
+- python -m pip install twine cibuildwheel==1.1.0 scikit-build
+script:
+- python -m cibuildwheel --output-dir wheelhouse
+- python setup.py sdist
+after_success:
+- if [[ $TRAVIS_TAG ]]; then python -m twine upload wheelhouse/*; python -m twine upload dist/*.tar.gz; fi
 before_install:
-# - pip install --upgrade pip
+#- pip install --upgrade pip
 - pip install --upgrade setuptools
 - pip install tensorflow==$TENSORFLOW_VERSION
 install:
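
Note on `CIBW_BEFORE_BUILD` in the `build whls` stage: the `sed` expression appends `libtensorflow_framework.so.2` after every `libresolv.so.2` entry in whatever `policy.json` files `find` locates in the build image. The Python sketch below mirrors that substitution on a sample string. It is an illustration only: the JSON fragment and its `lib_whitelist` key are assumed stand-ins (the real file's layout is not shown in this commit), and `patch_whitelist` is a hypothetical helper name.

```python
import json
import re

# Same pattern and replacement as the sed expression, after YAML and shell
# unescaping. As in sed, the unescaped dots match any single character.
PATTERN = r'libresolv.so.2"'
REPLACEMENT = 'libresolv.so.2", "libtensorflow_framework.so.2"'


def patch_whitelist(text: str) -> str:
    """Insert the TensorFlow framework library after each libresolv.so.2 entry."""
    return re.sub(PATTERN, REPLACEMENT, text)


# Assumed sample content; not copied from any real policy.json.
sample = '{"lib_whitelist": ["libm.so.6", "libresolv.so.2"]}'
patched = json.loads(patch_whitelist(sample))
print(patched["lib_whitelist"])
```

Because the replacement reuses the closing quote consumed by the pattern, the patched text remains valid JSON, which is why `json.loads` succeeds here.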

README.md

Lines changed: 38 additions & 35 deletions
@@ -111,18 +111,17 @@ Both CPU and GPU version offline package are avaiable in [the Releases page](htt

 ## Install the python interface
 ### Install the Tensorflow's python interface
-First, check the python version and compiler version on your machine
+First, check the python version on your machine
 ```bash
-python --version; gcc --version
+python --version
 ```
-If your python version is 3.7.x, it is highly recommended that the GNU C/C++ compiler is higher than or equal to 5.0.

 We follow the virtual environment approach to install the tensorflow's Python interface. The full instruction can be found on [the tensorflow's official website](https://www.tensorflow.org/install/pip). Now we assume that the Python interface will be installed to virtual environment directory `$tensorflow_venv`
 ```bash
 virtualenv -p python3 $tensorflow_venv
 source $tensorflow_venv/bin/activate
 pip install --upgrade pip
-pip install --upgrade tensorflow==1.14.0
+pip install --upgrade tensorflow==2.1.0
 ```
 It is notice that everytime a new shell is started and one wants to use `DeePMD-kit`, the virtual environment should be activated by
 ```bash
@@ -136,31 +135,21 @@ If one has multiple python interpreters named like python3.x, it can be specifie
 ```bash
 virtualenv -p python3.7 $tensorflow_venv
 ```
-If one needs the GPU support of deepmd-kit, the GPU version of tensorflow should be installed by
-```bash
-pip install --upgrade tensorflow-gpu==1.14.0
+If one does not need the GPU support of deepmd-kit and is concerned about package size, the CPU-only version of tensorflow should be installed by
+```bash
+pip install --upgrade tensorflow-cpu==2.1.0
 ```
 To verify the installation, run
 ```bash
-python -c "import tensorflow as tf; sess=tf.Session(); print(sess.run(tf.reduce_sum(tf.random_normal([1000, 1000]))))"
+python -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
 ```
 One should remember to activate the virtual environment every time he/she uses deepmd-kit.

 ### Install the DeePMD-kit's python interface

-Clone the DeePMD-kit source code
-```bash
-cd /some/workspace
-git clone --recursive https://github.com/deepmodeling/deepmd-kit.git deepmd-kit -b devel
-```
-If one downloads the .zip file from the github, then the default folder of source code would be `deepmd-kit-master` rather than `deepmd-kit`. For convenience, you may want to record the location of source to a variable, saying `deepmd_source_dir` by
-```bash
-cd deepmd-kit
-deepmd_source_dir=`pwd`
-```
-Then execute
+Execute
 ```bash
-pip install .
+pip install deepmd-kit
 ```
 To test the installation, one may execute
 ```bash
@@ -189,11 +178,30 @@ If one does not need to use DeePMD-kit with Lammps or I-Pi, then the python inte

 ### Install the Tensorflow's C++ interface

-It is highly recommended that one keeps the same C/C++ compiler as the python interface. The C++ interface of DeePMD-kit was tested with compiler gcc >= 4.8. It is noticed that the I-Pi support is only compiled with gcc >= 4.9.
+Check the compiler version on your machine
+
+```
+gcc --version
+```
+
+The C++ interface of DeePMD-kit was tested with compiler gcc >= 4.8. It is noticed that the I-Pi support is only compiled with gcc >= 4.9.

 First the C++ interface of Tensorflow should be installed. It is noted that the version of Tensorflow should be in consistent with the python interface. We assume that you have followed our instruction and installed tensorflow python interface 1.14.0 with, then you may follow [the instruction for CPU](doc/install-tf.1.14.md) to install the corresponding C++ interface (CPU only). If one wants GPU supports, he/she should follow [the instruction for GPU](doc/install-tf.1.14-gpu.md) to install the C++ interface.

 ### Install the DeePMD-kit's C++ interface
+
+Clone the DeePMD-kit source code
+```bash
+cd /some/workspace
+git clone --recursive https://github.com/deepmodeling/deepmd-kit.git deepmd-kit
+```
+
+For convenience, you may want to record the location of source to a variable, saying `deepmd_source_dir` by
+```bash
+cd deepmd-kit
+deepmd_source_dir=`pwd`
+```
+
 Now goto the source code directory of DeePMD-kit and make a build place.
 ```bash
 cd $deepmd_source_dir/source
@@ -437,8 +445,6 @@ positional arguments:

 optional arguments:
 -h, --help show this help message and exit
--t INTER_THREADS, --inter-threads INTER_THREADS
-With default value 0. Setting the "inter_op_parallelism_threads" key for the tensorflow, the "intra_op_parallelism_threads" will be set by the env variable OMP_NUM_THREADS
 --init-model INIT_MODEL
 Initialize a model by the provided checkpoint
 --restart RESTART Restart the training from the provided checkpoint
@@ -449,6 +455,15 @@ The keys `intra_op_parallelism_threads` and `inter_op_parallelism_threads` are T

 **`--restart model.ckpt`**, continues the training from the checkpoint `model.ckpt`.

+On some resources limited machines, one may want to control the number of threads used by DeePMD-kit. This is achieved by three environmental variables: `OMP_NUM_THREADS`, `TF_INTRA_OP_PARALLELISM_THREADS` and `TF_INTER_OP_PARALLELISM_THREADS`. `OMP_NUM_THREADS` controls the multithreading of DeePMD-kit implemented operations. `TF_INTRA_OP_PARALLELISM_THREADS` and `TF_INTER_OP_PARALLELISM_THREADS` controls `intra_op_parallelism_threads` and `inter_op_parallelism_threads`, which are Tensorflow configurations for multithreading. An explanation is found [here](https://stackoverflow.com/questions/41233635/meaning-of-inter-op-parallelism-threads-and-intra-op-parallelism-threads).
+
+For example if you wish to use 3 cores of 2 CPUs on one node, you may set the environmental variables and run DeePMD-kit as follows:
+```bash
+export OMP_NUM_THREADS=6
+export TF_INTRA_OP_PARALLELISM_THREADS=3
+export TF_INTER_OP_PARALLELISM_THREADS=2
+dp train input.json
+```

 ## Freeze a model
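
Note on the thread-control variables added in this hunk: a program could collect the three settings from the environment roughly as sketched below. This is not DeePMD-kit's actual code. `read_thread_settings` is a hypothetical helper, and falling back to `0` when a variable is unset is an assumption, borrowed from the default of the removed `--inter-threads` option.

```python
import os
from typing import Dict, Mapping

# The three variables named in the README text added by this commit.
THREAD_VARS = (
    "OMP_NUM_THREADS",
    "TF_INTRA_OP_PARALLELISM_THREADS",
    "TF_INTER_OP_PARALLELISM_THREADS",
)


def read_thread_settings(env: Mapping[str, str] = os.environ) -> Dict[str, int]:
    """Return each thread-count variable as an int (assumed default: 0 if unset)."""
    return {name: int(env.get(name, "0")) for name in THREAD_VARS}


# The README example: 3 cores on each of 2 CPUs.
example_env = {
    "OMP_NUM_THREADS": "6",
    "TF_INTRA_OP_PARALLELISM_THREADS": "3",
    "TF_INTER_OP_PARALLELISM_THREADS": "2",
}
settings = read_thread_settings(example_env)
print(settings)
```

In the README example the OpenMP thread count happens to equal the product of the two TensorFlow settings (6 = 3 × 2); the commit does not state this as a general rule.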

@@ -606,18 +621,6 @@ rm -r *
 ```
 and redo the `cmake` process.

-## Training: TensorFlow abi binary cannot be found when doing training
-If you confront such kind of error:
-
-```
-$deepmd_root/lib/deepmd/libop_abi.so: undefined symbol:
-_ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringB5cxx11Ev
-```
-
-This may happen if you are using a gcc >= 5.0, and tensorflow was compiled with gcc < 5.0. You may set `-DOP_CXX_ABI=0` in the process of `cmake`.
-
-Another possible reason might be the large gap between the python version of TensorFlow and the TensorFlow c++ interface.
-
 ## MD: cannot run LAMMPS after installing a new version of DeePMD-kit
 This typically happens when you install a new version of DeePMD-kit and copy directly the generated `USER-DEEPMD` to a LAMMPS source code folder and re-install LAMMPS.

doc/install-tf.1.14.md

Lines changed: 1 addition & 0 deletions
@@ -42,6 +42,7 @@ Now, copy the libraries to the tensorflow's installation directory:
 mkdir $tensorflow_root/lib
 cp -d bazel-bin/tensorflow/libtensorflow_cc.so* $tensorflow_root/lib/
 cp -d bazel-bin/tensorflow/libtensorflow_framework.so* $tensorflow_root/lib/
+cp -d $tensorflow_root/lib/libtensorflow_framework.so.1 $tensorflow_root/lib/libtensorflow_framework.so
 ```
 Then copy the headers
 ```bash

examples/water/train/polar.json

Lines changed: 2 additions & 2 deletions
@@ -31,9 +31,9 @@

 "learning_rate" :{
 "type": "exp",
-"start_lr": 0.001,
 "decay_steps": 5000,
-"decay_rate": 0.95,
+"start_lr": 0.001,
+"stop_lr": 3.51e-8,
 "_comment": "that's all"
 },
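
Note on the `decay_rate` to `stop_lr` change, which recurs in every example input of this commit: the new `stop_lr` values look consistent with the old `decay_rate` of 0.95. The sketch below checks this under two assumptions that the diff itself does not state: a plain exponential schedule `lr(t) = start_lr * decay_rate ** (t / decay_steps)`, and a training length of 1,000,000 steps. `implied_decay_rate` is a hypothetical helper, not a DeePMD-kit function.

```python
def implied_decay_rate(start_lr: float, stop_lr: float,
                       decay_steps: int, total_steps: int) -> float:
    """Rate r with start_lr * r ** (total_steps / decay_steps) == stop_lr."""
    return (stop_lr / start_lr) ** (decay_steps / total_steps)


TOTAL_STEPS = 1_000_000  # assumed training length; not taken from the diff

# (start_lr, stop_lr) pairs as they appear in the changed example inputs.
pairs = [(0.001, 3.51e-8), (0.01, 3.51e-7), (0.005, 1.76e-7)]
rates = [implied_decay_rate(start, stop, 5000, TOTAL_STEPS) for start, stop in pairs]
for (start, stop), rate in zip(pairs, rates):
    print(f"start_lr={start} stop_lr={stop} implied decay_rate={rate:.4f}")
```

Under these assumptions each pair implies a decay rate very close to 0.95, so the examples appear to keep the same schedule while expressing it through its end point.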

examples/water/train/polar_se_a.json

Lines changed: 4 additions & 4 deletions
@@ -3,7 +3,7 @@
 "_comment": " model parameters",
 "model":{
 "type_map": ["O", "H"],
-"data_stat_nbatch": 1,
+"data_stat_nbatch": 10,
 "descriptor" :{
 "type": "se_a",
 "sel": [46, 92],
@@ -18,7 +18,7 @@
 "fitting_net": {
 "type": "polar",
 "sel_type": [0],
-"fit_diag": true,
+"fit_diag": false,
 "neuron": [100, 100, 100],
 "resnet_dt": true,
 "seed": 1,
@@ -29,9 +29,9 @@

 "learning_rate" :{
 "type": "exp",
-"start_lr": 0.01,
 "decay_steps": 5000,
-"decay_rate": 0.95,
+"start_lr": 0.01,
+"stop_lr": 3.51e-7,
 "_comment": "that's all"
 },

examples/water/train/wannier.json

Lines changed: 2 additions & 2 deletions
@@ -32,9 +32,9 @@

 "learning_rate" :{
 "type": "exp",
-"start_lr": 0.001,
 "decay_steps": 5000,
-"decay_rate": 0.95,
+"start_lr": 0.001,
+"stop_lr": 3.51e-8,
 "_comment": "that's all"
 },

examples/water/train/water.json

Lines changed: 3 additions & 2 deletions
@@ -3,6 +3,7 @@
 "_comment": " model parameters",
 "model":{
 "type_map": ["O", "H"],
+"data_stat_nbatch": 10,
 "descriptor": {
 "type": "loc_frame",
 "sel_a": [16, 32],
@@ -28,9 +29,9 @@

 "learning_rate" :{
 "type": "exp",
-"start_lr": 0.001,
 "decay_steps": 5000,
-"decay_rate": 0.95,
+"start_lr": 0.001,
+"stop_lr": 3.51e-8,
 "_comment": "that's all"
 },

examples/water/train/water_se_a.json

Lines changed: 2 additions & 2 deletions
@@ -24,9 +24,9 @@

 "learning_rate" :{
 "type": "exp",
-"start_lr": 0.001,
 "decay_steps": 5000,
-"decay_rate": 0.95,
+"start_lr": 0.001,
+"stop_lr": 3.51e-8,
 "_comment": "that's all"
 },

examples/water/train/water_se_ar.json

Lines changed: 2 additions & 2 deletions
@@ -35,9 +35,9 @@

 "learning_rate" :{
 "type": "exp",
-"start_lr": 0.005,
 "decay_steps": 5000,
-"decay_rate": 0.95,
+"start_lr": 0.005,
+"stop_lr": 1.76e-7,
 "_comment": "that's all"
 },

examples/water/train/water_se_r.json

Lines changed: 3 additions & 2 deletions
@@ -23,9 +23,10 @@
 },

 "learning_rate" : {
-"start_lr": 0.005,
+"type": "exp",
 "decay_steps": 5000,
-"decay_rate": 0.95,
+"start_lr": 0.005,
+"stop_lr": 1.76e-7,
 "_comment": " that's all"
 },
