Commit 7d0d617

Merge pull request #92 from amcadmus/devel
support gromacs gro file format
2 parents a54a7ab + 14456cd commit 7d0d617

7 files changed: +169 -29 lines changed
README.md

Lines changed: 30 additions & 29 deletions
@@ -51,6 +51,34 @@ The labels provided in the `OUTCAR`, i.e. energies, forces and virials (if any),
 
 The `System` or `LabeledSystem` can be constructed from the following file formats with the `format key` in the table passed to argument `fmt`:
 
+| Software| format | multi frames | labeled | class | format key |
+| ------- | :--- | :---: | :---: | :--- | :--- |
+| vasp | poscar | False | False | System | 'vasp/poscar' |
+| vasp | outcar | True | True | LabeledSystem | 'vasp/outcar' |
+| vasp | xml | True | True | LabeledSystem | 'vasp/xml' |
+| lammps | lmp | False | False | System | 'lammps/lmp' |
+| lammps | dump | True | False | System | 'lammps/dump' |
+| deepmd | raw | True | False | System | 'deepmd/raw' |
+| deepmd | npy | True | False | System | 'deepmd/npy' |
+| deepmd | raw | True | True | LabeledSystem | 'deepmd/raw' |
+| deepmd | npy | True | True | LabeledSystem | 'deepmd/npy' |
+| gaussian| log | False | True | LabeledSystem | 'gaussian/log'|
+| gaussian| log | True | True | LabeledSystem | 'gaussian/md' |
+| siesta | output | False | True | LabeledSystem | 'siesta/output'|
+| siesta | aimd_output | True | True | LabeledSystem | 'siesta/aimd_output' |
+| cp2k | output | False | True | LabeledSystem | 'cp2k/output' |
+| cp2k | aimd_output | True | True | LabeledSystem | 'cp2k/aimd_output' |
+| QE | log | False | True | LabeledSystem | 'qe/pw/scf' |
+| QE | log | True | False | System | 'qe/cp/traj' |
+| QE | log | True | True | LabeledSystem | 'qe/cp/traj' |
+|quip/gap|xyz|True|True|MultiSystems|'quip/gap/xyz'|
+| PWmat | atom.config | False | False | System | 'pwmat/atom.config' |
+| PWmat | movement | True | True | LabeledSystem | 'pwmat/movement' |
+| PWmat | OUT.MLMD | True | True | LabeledSystem | 'pwmat/out.mlmd' |
+| Amber | multi | True | True | LabeledSystem | 'amber/md' |
+| Gromacs | gro | False | False | System | 'gromacs/gro' |
+
+
 The class `dpdata.MultiSystems` can read data from a directory that may contain many files of different systems, or from a single xyz file that contains different systems.
 
 Use `dpdata.MultiSystems.from_dir` to read from a directory, `dpdata.MultiSystems` will walk in the directory
@@ -82,35 +110,8 @@ xyz_multi_systems.systems['B1C9'].to_deepmd_raw('./my_work_dir/B1C9_raw')
 
 # dump all systems
 xyz_multi_systems.to_deepmd_raw('./my_deepmd_data/')
-
-
 ```
 
-| Software| format | multi frames | labeled | class | format key |
-| ------- | :--- | :---: | :---: | :--- | :--- |
-| vasp | poscar | False | False | System | 'vasp/poscar' |
-| vasp | outcar | True | True | LabeledSystem | 'vasp/outcar' |
-| vasp | xml | True | True | LabeledSystem | 'vasp/xml' |
-| lammps | lmp | False | False | System | 'lammps/lmp' |
-| lammps | dump | True | False | System | 'lammps/dump' |
-| deepmd | raw | True | False | System | 'deepmd/raw' |
-| deepmd | npy | True | False | System | 'deepmd/npy' |
-| deepmd | raw | True | True | LabeledSystem | 'deepmd/raw' |
-| deepmd | npy | True | True | LabeledSystem | 'deepmd/npy' |
-| gaussian| log | False | True | LabeledSystem | 'gaussian/log'|
-| gaussian| log | True | True | LabeledSystem | 'gaussian/md' |
-| siesta | output | False | True | LabeledSystem | 'siesta/output'|
-| siesta | aimd_output | True | True | LabeledSystem | 'siesta/aimd_output' |
-| cp2k | output | False | True | LabeledSystem | 'cp2k/output' |
-| cp2k | aimd_output | True | True | LabeledSystem | 'cp2k/aimd_output' |
-| QE | log | False | True | LabeledSystem | 'qe/pw/scf' |
-| QE | log | True | False | System | 'qe/cp/traj' |
-| QE | log | True | True | LabeledSystem | 'qe/cp/traj' |
-|quip/gap|xyz|True|True|MultiSystems|'quip/gap/xyz'|
-| PWmat | atom.config | False | False | System | 'pwmat/atom.config' |
-| PWmat | movement | True | True | LabeledSystem | 'pwmat/movement' |
-| PWmat | OUT.MLMD | True | True | LabeledSystem | 'pwmat/out.mlmd' |
-| Amber | multi | True | True | LabeledSystem | 'amber/md' |
 ## Access data
 These properties stored in `System` and `LabeledSystem` can be accessed by operator `[]` with the key of the property supplied, for example
 ```python
@@ -130,7 +131,6 @@ Available properties are (nframe: number of frames in the system, natoms: total
 | 'virials' | np.ndarray | nframes x 3 x 3 | True | The virial tensor of each frame
 
 
-
 ## Dump data
 The data stored in `System` or `LabeledSystem` can be dumped in 'lammps/lmp' or 'vasp/poscar' format, for example:
 ```python
@@ -142,7 +142,6 @@ d_outcar.to('vasp/poscar', 'POSCAR', frame_idx=-1)
 ```
 The last frame of `d_outcar` will be dumped to 'POSCAR'.
 
-
 The data stored in `LabeledSystem` can be dumped to deepmd-kit raw format, for example
 ```python
 d_outcar.to('deepmd/raw', 'dpmd_raw')
@@ -157,13 +156,15 @@ dpdata.LabeledSystem('OUTCAR').sub_system([0,-1]).to('deepmd/raw', 'dpmd_raw')
 ```
 by which only the first and last frames are dumped to `dpmd_raw`.
 
+
 ## replicate
 dpdata will create a super cell of the current atom configuration.
 ```python
 dpdata.System('./POSCAR').replicate((1,2,3,) )
 ```
 The tuple `(1,2,3)` means: do not copy the atom configuration in the x direction, make 2 copies in the y direction, and make 3 copies in the z direction.
 
+
 ## perturb
 By the following example, each frame of the original system (`dpdata.System('./POSCAR')`) is perturbed to generate three new frames. For each frame, the cell is perturbed by 5% and the atom positions are perturbed by 0.6 Angstrom. `atom_pert_style` indicates that the perturbation to the atom positions is subject to normal distribution. Other available options to `atom_pert_style` are `uniform` (uniform in a ball), and `const` (uniform on a sphere).
 ```python

dpdata/gromacs/__init__.py

Whitespace-only changes.

dpdata/gromacs/gro.py

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+#!/usr/bin/env python3
+
+import numpy as np
+
+nm2ang = 10.
+
+def _get_line(line):
+    atom_name = line[10:15].split()[0]
+    atom_idx = int(line[15:20].split()[0])
+    posis = [float(line[ii:ii+8]) for ii in range(20,44,8)]
+    posis = np.array(posis) * nm2ang
+    return atom_name, atom_idx, posis
+
+def _get_cell(line):
+    cell = np.zeros([3,3])
+    lengths = [float(ii) for ii in line.split()]
+    if len(lengths) >= 3:
+        for dd in range(3):
+            cell[dd][dd] = lengths[dd]
+    else:
+        raise RuntimeError('wrong box format: ', line)
+    if len(lengths) == 9:
+        cell[0][1] = lengths[3]
+        cell[0][2] = lengths[4]
+        cell[1][0] = lengths[5]
+        cell[1][2] = lengths[6]
+        cell[2][0] = lengths[7]
+        cell[2][1] = lengths[8]
+    cell = cell * nm2ang
+    return cell
+
+def file_to_system_data(fname):
+    names = []
+    idxs = []
+    posis = []
+    with open(fname) as fp:
+        fp.readline()
+        natoms = int(fp.readline())
+        for ii in range(natoms):
+            n, i, p = _get_line(fp.readline())
+            names.append(n)
+            idxs.append(i)
+            posis.append(p)
+        cell = _get_cell(fp.readline())
+        posis = np.array(posis)
+    system = {}
+    system['orig'] = np.array([0, 0, 0])
+    system['atom_names'] = list(set(names))
+    system['atom_names'].sort()
+    system['atom_numbs'] = [names.count(ii) for ii in system['atom_names']]
+    system['atom_types'] = [system['atom_names'].index(ii) for ii in names]
+    system['atom_types'] = np.array(system['atom_types'], dtype = int)
+    system['coords'] = np.array([posis])
+    system['cells'] = np.array([cell])
+    return system
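
Note on the atom-record parsing: `_get_line` relies on the fixed-width column layout of `.gro` atom records (atom name in columns 10-15, atom number in 15-20, three 8-character coordinates from column 20) and converts nanometers to Angstrom. The following is a minimal standalone sketch of that idea without NumPy; `parse_gro_atom_line` and `NM_TO_ANGSTROM` are illustrative names, not part of this commit, and the sample record is generated with the usual `.gro` format string rather than read from a file.

```python
NM_TO_ANGSTROM = 10.0  # same factor as nm2ang in the commit


def parse_gro_atom_line(line):
    """Split one fixed-width .gro atom record into (name, index, xyz in Angstrom)."""
    atom_name = line[10:15].strip()       # columns 10-14: atom name
    atom_idx = int(line[15:20])           # columns 15-19: atom number
    # Three coordinates, each 8 characters wide, starting at column 20.
    xyz_nm = [float(line[start:start + 8]) for start in range(20, 44, 8)]
    return atom_name, atom_idx, [v * NM_TO_ANGSTROM for v in xyz_nm]


# Build a record with the conventional .gro layout:
# residue number, residue name, atom name, atom number, x, y, z (nm).
sample = "%5d%-5s%5s%5d%8.3f%8.3f%8.3f" % (1, "SOL", "O", 1, 0.135, 0.183, 0.341)
name, idx, xyz = parse_gro_atom_line(sample)
print(name, idx, xyz)
```

Because the slices are positional, a record whose columns have been collapsed to single spaces would not parse; the file has to keep its fixed-width layout.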

dpdata/system.py

Lines changed: 14 additions & 0 deletions
@@ -22,6 +22,7 @@
 import dpdata.pwmat.movement
 import dpdata.pwmat.atomconfig
 import dpdata.fhi_aims.output
+import dpdata.gromacs.gro
 from copy import deepcopy
 from monty.json import MSONable
 from monty.serialization import loadfn,dumpfn
@@ -607,6 +608,19 @@ def from_deepmd_raw(self, folder, type_map = None) :
         if tmp_data is not None :
             self.data = tmp_data
 
+    @register_from_funcs.register_funcs("gro")
+    @register_from_funcs.register_funcs("gromacs/gro")
+    def from_gromacs_gro(self, file_name) :
+        """
+        Load gromacs .gro file
+
+        Parameters
+        ----------
+        file_name : str
+            The input file name
+        """
+        self.data = dpdata.gromacs.gro.file_to_system_data(file_name)
+
     @register_to_funcs.register_funcs("deepmd/npy")
     def to_deepmd_npy(self, folder, set_size = 5000, prec=np.float32) :
         """
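
Note on the stacked decorators: the new loader is registered under two keys, `"gro"` and `"gromacs/gro"`, so either string selects the same function. The sketch below shows how a decorator-based registry of this kind can work in general. It is an assumption-laden illustration, not dpdata's actual `register_from_funcs` implementation: `FormatRegistry`, `loaders`, and `load_gro` are hypothetical names.

```python
class FormatRegistry:
    """Hypothetical registry mapping format keys to loader functions."""

    def __init__(self):
        self.funcs = {}

    def register(self, key):
        # Return a decorator that records the function under `key`
        # and hands the function back unchanged, so decorators can stack.
        def decorator(func):
            self.funcs[key] = func
            return func
        return decorator


loaders = FormatRegistry()


@loaders.register("gro")
@loaders.register("gromacs/gro")
def load_gro(file_name):
    # Placeholder body: a real loader would parse the file here.
    return "would parse %s as a .gro file" % file_name


# Both keys resolve to the same function object.
print(loaders.funcs["gro"] is loaders.funcs["gromacs/gro"])
print(loaders.funcs["gromacs/gro"]("conf.gro"))
```

Returning the undecorated function from the decorator is what lets one function carry several format keys.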

tests/gromacs/1h.gro

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+
+9
+    1SOL      O    1   0.135   0.183   0.341
+    1SOL      H    2   0.177   0.149   0.262
+    1SOL      H    3   0.046   0.149   0.339
+    2SOL      O    4   0.520   0.447   0.111
+    2SOL      H    5   0.567   0.481   0.035
+    2SOL      H    6   0.568   0.481   0.186
+    3SOL      O    7   0.651   0.539   0.335
+    3SOL      H    8   0.653   0.634   0.336
+    3SOL      H    9   0.743   0.512   0.336
+0.7822838765564372 0.7353572647182051 0.9036518515423753
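
Note on the type bookkeeping: for the nine atoms in this file, `file_to_system_data` derives `atom_names`, `atom_numbs`, and `atom_types` from the per-atom name list. The sketch below repeats those three steps in plain Python (the commit additionally wraps `atom_types` in a NumPy array); the `names` list is written out by hand here to mirror the O, H, H ordering of the three water molecules.

```python
# Per-atom names in file order: three water molecules, each O, H, H.
names = ["O", "H", "H"] * 3

# Unique names, sorted alphabetically, define the type indices.
atom_names = sorted(set(names))

# Number of atoms of each type, in the same order as atom_names.
atom_numbs = [names.count(n) for n in atom_names]

# Type index of every atom, in file order.
atom_types = [atom_names.index(n) for n in names]

print(atom_names)
print(atom_numbs)
print(atom_types)
```

Sorting the names means the type index depends on alphabetical order, not on the order of first appearance in the file, which is why oxygen gets type 1 here.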

tests/gromacs/1h.tri.gro

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
2+
9
3+
1SOL O 1 0.135 0.183 0.341
4+
1SOL H 2 0.177 0.149 0.262
5+
1SOL H 3 0.046 0.149 0.339
6+
2SOL O 4 0.520 0.447 0.111
7+
2SOL H 5 0.567 0.481 0.035
8+
2SOL H 6 0.568 0.481 0.186
9+
3SOL O 7 0.651 0.539 0.335
10+
3SOL H 8 0.653 0.634 0.336
11+
3SOL H 9 0.743 0.512 0.336
12+
0.7822838765564372 0.7353572647182051 0.9036518515423753 0.0 0.1 0.2 0.3 0.4 0.5
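
Note on the nine-field box line: a `.gro` box line lists the three diagonal components first, followed by the six off-diagonal components in the order v1(y) v1(z) v2(x) v2(z) v3(x) v3(y). The sketch below mirrors the index mapping used by `_get_cell` in plain Python. `gro_box_to_cell` is an illustrative name, and unlike the commit's function this version rejects any field count other than 3 or 9, which is an assumption made here for clarity.

```python
def gro_box_to_cell(fields):
    """Map .gro box fields (nm) to a 3x3 cell with one box vector per row (Angstrom)."""
    if len(fields) not in (3, 9):
        raise ValueError("expected 3 or 9 box fields, got %d" % len(fields))
    cell = [[0.0] * 3 for _ in range(3)]
    # First three fields: v1(x), v2(y), v3(z) on the diagonal.
    for d in range(3):
        cell[d][d] = fields[d]
    if len(fields) == 9:
        # Remaining fields: v1(y) v1(z) v2(x) v2(z) v3(x) v3(y).
        (cell[0][1], cell[0][2],
         cell[1][0], cell[1][2],
         cell[2][0], cell[2][1]) = fields[3:9]
    # Convert nanometers to Angstrom.
    return [[v * 10.0 for v in row] for row in cell]


box_line = ("0.7822838765564372 0.7353572647182051 0.9036518515423753 "
            "0.0 0.1 0.2 0.3 0.4 0.5")
cell = gro_box_to_cell([float(x) for x in box_line.split()])
for row in cell:
    print(row)
```

With the off-diagonal values 0.0 to 0.5 nm, the off-diagonal entries read row by row come out as 0 to 5 Angstrom, which is the pattern `test_read_file_tri` counts through.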

tests/test_gromacs_gro.py

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+import os
+import numpy as np
+import unittest
+from context import dpdata
+
+class TestGromacsGro(unittest.TestCase):
+    def test_read_file(self):
+        system = dpdata.System('gromacs/1h.gro')
+        self.assertEqual(system['atom_names'], ['H', 'O'])
+        self.assertEqual(system['atom_numbs'], [6, 3])
+        for cc,ii in enumerate([1, 0, 0, 1, 0, 0, 1, 0, 0]):
+            self.assertEqual(system['atom_types'][cc], ii)
+        self.assertEqual(len(system['cells']), 1)
+        self.assertEqual(len(system['coords']), 1)
+        for ii in range(3):
+            for jj in range(3):
+                if ii != jj:
+                    self.assertAlmostEqual(system['cells'][0][ii][jj], 0)
+        self.assertAlmostEqual(system['cells'][0][0][0], 7.822838765564372)
+        self.assertAlmostEqual(system['cells'][0][1][1], 7.353572647182051)
+        self.assertAlmostEqual(system['cells'][0][2][2], 9.036518515423753)
+        self.assertAlmostEqual(system['coords'][0][8][0], 7.43)
+        self.assertAlmostEqual(system['coords'][0][8][1], 5.12)
+        self.assertAlmostEqual(system['coords'][0][8][2], 3.36)
+
+    def test_read_file_tri(self):
+        system = dpdata.System('gromacs/1h.tri.gro')
+        self.assertEqual(system['atom_names'], ['H', 'O'])
+        self.assertEqual(system['atom_numbs'], [6, 3])
+        for cc,ii in enumerate([1, 0, 0, 1, 0, 0, 1, 0, 0]):
+            self.assertEqual(system['atom_types'][cc], ii)
+        self.assertEqual(len(system['cells']), 1)
+        self.assertEqual(len(system['coords']), 1)
+        count = 0
+        for ii in range(3):
+            for jj in range(3):
+                if ii != jj:
+                    self.assertAlmostEqual(system['cells'][0][ii][jj], count)
+                    count += 1
+        self.assertAlmostEqual(system['cells'][0][0][0], 7.822838765564372)
+        self.assertAlmostEqual(system['cells'][0][1][1], 7.353572647182051)
+        self.assertAlmostEqual(system['cells'][0][2][2], 9.036518515423753)
+        self.assertAlmostEqual(system['coords'][0][8][0], 7.43)
+        self.assertAlmostEqual(system['coords'][0][8][1], 5.12)
+        self.assertAlmostEqual(system['coords'][0][8][2], 3.36)
+        system.to('vasp/poscar', 'POSCAR')
