Commit ecdf566

format Python codes in docs (#414)
This PR adds a pre-commit hook to use Black to format Python codes in the documentation.
1 parent ffa52c5 commit ecdf566
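
Since the change is meant to be formatting-only, one way to sanity-check a before/after pair from the diff is to compare the syntax trees of the two versions. The sketch below uses only the standard-library `ast` module; the helper name `same_ast` is ours, and the sample pairs are copied from the README hunks of this commit.

```python
import ast


def same_ast(before: str, after: str) -> bool:
    """Return True if two snippets parse to the same syntax tree.

    ast.dump() omits line/column attributes by default, so quote style,
    spacing, and line wrapping do not affect the comparison.
    """
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))


# Before/after pairs taken from the README diff in this commit.
pairs = [
    (
        "d_poscar = dpdata.System('POSCAR', fmt = 'vasp/poscar')",
        'd_poscar = dpdata.System("POSCAR", fmt="vasp/poscar")',
    ),
    (
        "dpdata.System('./POSCAR').replicate((1,2,3,) )",
        'dpdata.System("./POSCAR").replicate(\n'
        "    (\n        1,\n        2,\n        3,\n    )\n)",
    ),
]

for before, after in pairs:
    print(same_ast(before, after))
```

The second pair is also the likely reason the `replicate` call grows from one line to seven: the original tuple ends with a trailing comma, which Black treats as a request to keep one element per line.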

2 files changed: +60 -35 lines changed

.pre-commit-config.yaml

Lines changed: 5 additions & 0 deletions
@@ -21,5 +21,10 @@ repos:
     rev: 22.12.0
     hooks:
       - id: black-jupyter
+  # Python inside docs
+  - repo: https://github.com/asottile/blacken-docs
+    rev: 1.13.0
+    hooks:
+      - id: blacken-docs
 ci:
   autoupdate_branch: devel
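
The new hook only touches Python that sits inside documentation code fences. As a rough illustration of that idea, the sketch below pulls the `python` fenced blocks out of a Markdown string with a regular expression. The pattern and the function name `python_blocks` are assumptions made for this example, not the logic blacken-docs itself uses.

```python
import re

# Matches a fenced block whose info string is exactly "python".
# Illustrative pattern only; it is not blacken-docs' own regex.
PY_FENCE = re.compile(
    r"^```python[ \t]*\n(?P<code>.*?)^```[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def python_blocks(markdown: str) -> list:
    """Return the source text of every python fenced block in `markdown`."""
    return [m.group("code") for m in PY_FENCE.finditer(markdown)]


sample = (
    "## Load data\n"
    "```python\n"
    "d_poscar = dpdata.System('my.POSCAR')\n"
    "```\n"
    "Some prose.\n"
    "```python\n"
    "coords = d_outcar['coords']\n"
    "```\n"
)

for code in python_blocks(sample):
    print(code, end="")
```

Each extracted string could then be handed to a formatter and written back in place, which is the kind of rewrite visible in the README diff.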

README.md

Lines changed: 55 additions & 35 deletions
@@ -34,18 +34,18 @@ The typicall workflow of `dpdata` is
 
 ## Load data
 ```python
-d_poscar = dpdata.System('POSCAR', fmt = 'vasp/poscar')
+d_poscar = dpdata.System("POSCAR", fmt="vasp/poscar")
 ```
 or let dpdata infer the format (`vasp/poscar`) of the file from the file name extension
 ```python
-d_poscar = dpdata.System('my.POSCAR')
+d_poscar = dpdata.System("my.POSCAR")
 ```
 The number of atoms, atom types, coordinates are loaded from the `POSCAR` and stored to a data `System` called `d_poscar`.
 A data `System` (a concept used by [deepmd-kit](https://github.com/deepmodeling/deepmd-kit)) contains frames that has the same number of atoms of the same type. The order of the atoms should be consistent among the frames in one `System`.
 It is noted that `POSCAR` only contains one frame.
 If the multiple frames stored in, for example, a `OUTCAR` is wanted,
 ```python
-d_outcar = dpdata.LabeledSystem('OUTCAR')
+d_outcar = dpdata.LabeledSystem("OUTCAR")
 ```
 The labels provided in the `OUTCAR`, i.e. energies, forces and virials (if any), are loaded by `LabeledSystem`. It is noted that the forces of atoms are always assumed to exist. `LabeledSystem` is a derived class of `System`.
 
@@ -100,51 +100,58 @@ The following commands relating to `Class dpdata.MultiSystems` may be useful.
 ```python
 # load data
 
-xyz_multi_systems = dpdata.MultiSystems.from_file(file_name='tests/xyz/xyz_unittest.xyz',fmt='quip/gap/xyz')
-vasp_multi_systems = dpdata.MultiSystems.from_dir(dir_name='./mgal_outcar', file_name='OUTCAR', fmt='vasp/outcar')
+xyz_multi_systems = dpdata.MultiSystems.from_file(
+    file_name="tests/xyz/xyz_unittest.xyz", fmt="quip/gap/xyz"
+)
+vasp_multi_systems = dpdata.MultiSystems.from_dir(
+    dir_name="./mgal_outcar", file_name="OUTCAR", fmt="vasp/outcar"
+)
 
 # use wildcard
-vasp_multi_systems = dpdata.MultiSystems.from_dir(dir_name='./mgal_outcar', file_name='*OUTCAR', fmt='vasp/outcar')
+vasp_multi_systems = dpdata.MultiSystems.from_dir(
+    dir_name="./mgal_outcar", file_name="*OUTCAR", fmt="vasp/outcar"
+)
 
 # print the multi_system infomation
 print(xyz_multi_systems)
-print(xyz_multi_systems.systems) # return a dictionaries
+print(xyz_multi_systems.systems)  # return a dictionaries
 
 # print the system infomation
-print(xyz_multi_systems.systems['B1C9'].data)
+print(xyz_multi_systems.systems["B1C9"].data)
 
 # dump a system's data to ./my_work_dir/B1C9_raw folder
-xyz_multi_systems.systems['B1C9'].to_deepmd_raw('./my_work_dir/B1C9_raw')
+xyz_multi_systems.systems["B1C9"].to_deepmd_raw("./my_work_dir/B1C9_raw")
 
 # dump all systems
-xyz_multi_systems.to_deepmd_raw('./my_deepmd_data/')
+xyz_multi_systems.to_deepmd_raw("./my_deepmd_data/")
 ```
 
 You may also use the following code to parse muti-system:
 ```python
-from dpdata import LabeledSystem,MultiSystems
+from dpdata import LabeledSystem, MultiSystems
 from glob import glob
+
 """
 process multi systems
 """
-fs=glob('./*/OUTCAR') # remeber to change here !!!
-ms=MultiSystems()
+fs = glob("./*/OUTCAR")  # remeber to change here !!!
+ms = MultiSystems()
 for f in fs:
     try:
-        ls=LabeledSystem(f)
+        ls = LabeledSystem(f)
     except:
         print(f)
-    if len(ls)>0:
+    if len(ls) > 0:
         ms.append(ls)
 
-ms.to_deepmd_raw('deepmd')
-ms.to_deepmd_npy('deepmd')
+ms.to_deepmd_raw("deepmd")
+ms.to_deepmd_npy("deepmd")
 ```
 
 ## Access data
 These properties stored in `System` and `LabeledSystem` can be accessed by operator `[]` with the key of the property supplied, for example
 ```python
-coords = d_outcar['coords']
+coords = d_outcar["coords"]
 ```
 Available properties are (nframe: number of frames in the system, natoms: total number of atoms in the system)
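
A note on the multi-system loop in this hunk: the formatting change leaves its logic untouched, so if `LabeledSystem(f)` raises, the bare `except` prints the path and execution still reaches `len(ls)`, where `ls` is either undefined (on the first file) or left over from the previous file. One possible restructuring is sketched below. It uses a hypothetical `load` callable standing in for `LabeledSystem` so that it runs without dpdata installed; `collect_systems` and `fake_load` are names invented for this example.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def collect_systems(
    paths: Iterable[str],
    load: Callable[[str], Sequence],
) -> Tuple[List[Sequence], List[str]]:
    """Load each path, keeping non-empty systems and recording failures.

    `load` is a stand-in for something like dpdata.LabeledSystem.
    """
    loaded, failed = [], []
    for path in paths:
        try:
            system = load(path)
        except Exception:  # narrow this to the loader's real errors if known
            failed.append(path)
            continue  # never fall through with a stale or undefined system
        if len(system) > 0:
            loaded.append(system)
    return loaded, failed


def fake_load(path: str) -> list:
    """Toy loader for demonstration: 'bad' paths raise, others give frames."""
    if "bad" in path:
        raise ValueError(f"cannot parse {path}")
    return [] if "empty" in path else ["frame0", "frame1"]


loaded, failed = collect_systems(
    ["a/OUTCAR", "bad/OUTCAR", "empty/OUTCAR"], fake_load
)
print(len(loaded), failed)
```

With dpdata, `load` could be `LabeledSystem` itself, and each returned system could be appended to a `MultiSystems` with `ms.append(...)` as in the README loop.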
 
@@ -163,61 +170,74 @@ Available properties are (nframe: number of frames in the system, natoms: total
 ## Dump data
 The data stored in `System` or `LabeledSystem` can be dumped in 'lammps/lmp' or 'vasp/poscar' format, for example:
 ```python
-d_outcar.to('lammps/lmp', 'conf.lmp', frame_idx=0)
+d_outcar.to("lammps/lmp", "conf.lmp", frame_idx=0)
 ```
 The first frames of `d_outcar` will be dumped to 'conf.lmp'
 ```python
-d_outcar.to('vasp/poscar', 'POSCAR', frame_idx=-1)
+d_outcar.to("vasp/poscar", "POSCAR", frame_idx=-1)
 ```
 The last frames of `d_outcar` will be dumped to 'POSCAR'.
 
 The data stored in `LabeledSystem` can be dumped to deepmd-kit raw format, for example
 ```python
-d_outcar.to('deepmd/raw', 'dpmd_raw')
+d_outcar.to("deepmd/raw", "dpmd_raw")
 ```
 Or a simpler command:
 ```python
-dpdata.LabeledSystem('OUTCAR').to('deepmd/raw', 'dpmd_raw')
+dpdata.LabeledSystem("OUTCAR").to("deepmd/raw", "dpmd_raw")
 ```
 Frame selection can be implemented by
 ```python
-dpdata.LabeledSystem('OUTCAR').sub_system([0,-1]).to('deepmd/raw', 'dpmd_raw')
+dpdata.LabeledSystem("OUTCAR").sub_system([0, -1]).to("deepmd/raw", "dpmd_raw")
 ```
 by which only the first and last frames are dumped to `dpmd_raw`.
 
 
 ## replicate
 dpdata will create a super cell of the current atom configuration.
 ```python
-dpdata.System('./POSCAR').replicate((1,2,3,) )
+dpdata.System("./POSCAR").replicate(
+    (
+        1,
+        2,
+        3,
+    )
+)
 ```
 tuple(1,2,3) means don't copy atom configuration in x direction, make 2 copys in y direction, make 3 copys in z direction.
 
 
 ## perturb
 By the following example, each frame of the original system (`dpdata.System('./POSCAR')`) is perturbed to generate three new frames. For each frame, the cell is perturbed by 5% and the atom positions are perturbed by 0.6 Angstrom. `atom_pert_style` indicates that the perturbation to the atom positions is subject to normal distribution. Other available options to `atom_pert_style` are`uniform` (uniform in a ball), and `const` (uniform on a sphere).
 ```python
-perturbed_system = dpdata.System('./POSCAR').perturb(pert_num=3,
+perturbed_system = dpdata.System("./POSCAR").perturb(
+    pert_num=3,
     cell_pert_fraction=0.05,
     atom_pert_distance=0.6,
-    atom_pert_style='normal')
+    atom_pert_style="normal",
+)
 print(perturbed_system.data)
 ```
 
 ## replace
 By the following example, Random 8 Hf atoms in the system will be replaced by Zr atoms with the atom postion unchanged.
 ```python
-s=dpdata.System('tests/poscars/POSCAR.P42nmc',fmt='vasp/poscar')
-s.replace('Hf', 'Zr', 8)
-s.to_vasp_poscar('POSCAR.P42nmc.replace')
+s = dpdata.System("tests/poscars/POSCAR.P42nmc", fmt="vasp/poscar")
+s.replace("Hf", "Zr", 8)
+s.to_vasp_poscar("POSCAR.P42nmc.replace")
 ```
 
 # BondOrderSystem
 A new class `BondOrderSystem` which inherits from class `System` is introduced in dpdata. This new class contains information of chemical bonds and formal charges (stored in `BondOrderSystem.data['bonds']`, `BondOrderSystem.data['formal_charges']`). Now BondOrderSystem can only read from .mol/.sdf formats, because of its dependency on rdkit (which means rdkit must be installed if you want to use this function). Other formats, such as pdb, must be converted to .mol/.sdf format (maybe with software like open babel).
 ```python
 import dpdata
-system_1 = dpdata.BondOrderSystem("tests/bond_order/CH3OH.mol", fmt="mol") # read from .mol file
-system_2 = dpdata.BondOrderSystem("tests/bond_order/methane.sdf", fmt="sdf") # read from .sdf file
+
+system_1 = dpdata.BondOrderSystem(
+    "tests/bond_order/CH3OH.mol", fmt="mol"
+)  # read from .mol file
+system_2 = dpdata.BondOrderSystem(
+    "tests/bond_order/methane.sdf", fmt="sdf"
+)  # read from .sdf file
 ```
 In sdf file, all molecules must be of the same topology (i.e. conformers of the same molecular configuration).
 `BondOrderSystem` also supports initialize from a `rdkit.Chem.rdchem.Mol` object directly.
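
To make the three `atom_pert_style` options from the `perturb` section of this hunk more concrete, here is a standard-library sketch of how displacement vectors with those shapes can be drawn. It follows the README wording only ("uniform in a ball", "uniform on a sphere"). The function name, and in particular the scaling chosen for `normal`, are assumptions and may differ from what dpdata actually does.

```python
import math
import random


def random_displacement(distance: float, style: str, rng: random.Random) -> tuple:
    """Draw one 3D displacement vector for the given perturbation style."""
    # A normalized Gaussian vector gives a direction uniform on the unit sphere.
    g = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in g))
    unit = [c / norm for c in g]

    if style == "const":
        # Fixed length: uniform on a sphere of radius `distance`.
        scale = distance
    elif style == "uniform":
        # The cube root makes points uniform in volume inside the ball.
        scale = distance * rng.random() ** (1.0 / 3.0)
    elif style == "normal":
        # Assumed scaling: per-component std of distance / sqrt(3),
        # so the root-mean-square length equals `distance`.
        return tuple(distance / math.sqrt(3.0) * c for c in g)
    else:
        raise ValueError(f"unknown style: {style!r}")
    return tuple(scale * c for c in unit)


rng = random.Random(0)
for style in ("normal", "uniform", "const"):
    vec = random_displacement(0.6, style, rng)
    print(style, round(math.sqrt(sum(c * c for c in vec)), 3))
```

With `const` every displacement has length 0.6, with `uniform` the length never exceeds 0.6, and with `normal` individual displacements can be longer or shorter than 0.6.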
@@ -244,16 +264,16 @@ According to our test, our sanitization procedure can successfully read 4852 sma
 import dpdata
 
 for sdf_file in glob.glob("bond_order/refined-set-ligands/obabel/*sdf"):
-    syst = dpdata.BondOrderSystem(sdf_file, sanitize_level='high', verbose=False)
+    syst = dpdata.BondOrderSystem(sdf_file, sanitize_level="high", verbose=False)
 ```
 ## Formal Charge Assignment
 BondOrderSystem implement a method to assign formal charge for each atom based on the 8-electron rule (see below). Note that it only supports common elements in bio-system: B,C,N,O,P,S,As
 ```python
 import dpdata
 
-syst = dpdata.BondOrderSystem("tests/bond_order/CH3NH3+.mol", fmt='mol')
-print(syst.get_formal_charges()) # return the formal charge on each atom
-print(syst.get_charge()) # return the total charge of the system
+syst = dpdata.BondOrderSystem("tests/bond_order/CH3NH3+.mol", fmt="mol")
+print(syst.get_formal_charges())  # return the formal charge on each atom
+print(syst.get_charge())  # return the total charge of the system
 ```
 
 If a valence of 3 is detected on carbon, the formal charge will be assigned to -1. Because for most cases (in alkynyl anion, isonitrile, cyclopentadienyl anion), the formal charge on 3-valence carbon is -1, and this is also consisent with the 8-electron rule.
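
The carbon case above follows from simple octet bookkeeping. As a worked illustration (textbook arithmetic, not dpdata's code): if an atom with V valence electrons has a total bond order b and fills the rest of its octet with lone pairs, it holds 8 - 2b nonbonding electrons, so its formal charge is V - (8 - 2b) - b = V - 8 + b. The lookup table and function name below are ours, and the sketch is limited to C, N and O, which reliably follow the octet rule.

```python
# Valence electrons for a few octet-following elements (illustrative subset).
VALENCE_ELECTRONS = {"C": 4, "N": 5, "O": 6}


def octet_formal_charge(element: str, total_bond_order: int) -> int:
    """Formal charge assuming a complete octet: V - 8 + total bond order."""
    return VALENCE_ELECTRONS[element] - 8 + total_bond_order


print(octet_formal_charge("C", 3))  # -1: 3-valent carbon, e.g. an alkynyl anion
print(octet_formal_charge("C", 4))  # 0: ordinary 4-valent carbon
print(octet_formal_charge("N", 4))  # 1: ammonium-type nitrogen, as in CH3NH3+
```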
