Commit c685e62

v0.2.20 (#713)
2 parents 4f6854d + bad0285 commit c685e62

53 files changed: +1615 -670 lines changed

.git_archival.txt

Lines changed: 0 additions & 1 deletion
@@ -1,4 +1,3 @@
 node: $Format:%H$
 node-date: $Format:%cI$
 describe-name: $Format:%(describe:tags=true,match=*[0-9]*)$
-ref-names: $Format:%D$

.github/workflows/benchmark.yml

Lines changed: 4 additions & 4 deletions
@@ -1,11 +1,11 @@
-name: Python package
+name: Benchmark
 
 on:
   - push
   - pull_request
 
 jobs:
-  build:
+  benchmark:
     runs-on: ubuntu-latest
     steps:
     - uses: actions/checkout@v4
@@ -15,9 +15,9 @@ jobs:
         python-version: 3.12
     - run: curl -LsSf https://astral.sh/uv/install.sh | sh
     - name: Install dependencies
-      run: uv pip install --system .[amber,ase,pymatgen,benchmark] rdkit openbabel-wheel
+      run: uv pip install --system .[test,amber,ase,pymatgen,benchmark] rdkit openbabel-wheel
     - name: Run benchmarks
-      uses: CodSpeedHQ/action@v2
+      uses: CodSpeedHQ/action@v3
       with:
         token: ${{ secrets.CODSPEED_TOKEN }}
         run: pytest benchmark/ --codspeed

.github/workflows/test.yml

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ jobs:
         python-version: ${{ matrix.python-version }}
     - run: curl -LsSf https://astral.sh/uv/install.sh | sh
     - name: Install dependencies
-      run: uv pip install --system .[amber,ase,pymatgen] coverage ./tests/plugin rdkit openbabel-wheel
+      run: uv pip install --system .[test,amber,ase,pymatgen] coverage ./tests/plugin rdkit openbabel-wheel
     - name: Test
       run: cd tests && coverage run --source=../dpdata -m unittest && cd .. && coverage combine tests/.coverage && coverage report
     - name: Run codecov

.pre-commit-config.yaml

Lines changed: 2 additions & 2 deletions
@@ -21,7 +21,7 @@ repos:
   # Python
   - repo: https://github.com/astral-sh/ruff-pre-commit
     # Ruff version.
-    rev: v0.4.7
+    rev: v0.6.3
     hooks:
       - id: ruff
         args: ["--fix"]
@@ -36,7 +36,7 @@ repos:
         args: ["--write"]
   # Python inside docs
   - repo: https://github.com/asottile/blacken-docs
-    rev: 1.16.0
+    rev: 1.18.0
     hooks:
       - id: blacken-docs
 ci:

README.md

Lines changed: 25 additions & 298 deletions
Large diffs are not rendered by default.

docs/conf.py

Lines changed: 1 addition & 1 deletion
@@ -49,7 +49,7 @@
     "sphinx.ext.viewcode",
     "sphinx.ext.intersphinx",
     "numpydoc",
-    "m2r2",
+    "myst_parser",
     "sphinxarg.ext",
     "jupyterlite_sphinx",
 ]

docs/index.rst

Lines changed: 5 additions & 4 deletions
@@ -6,22 +6,23 @@
 Welcome to dpdata's documentation!
 ==================================
 
+dpdata is a Python package for manipulating atomistic data of software in computational science.
+
 .. toctree::
    :maxdepth: 2
    :caption: Contents:
 
-   Overview <self>
+   installation
+   systems/index
    try_dpdata
    cli
    formats
    drivers
    minimizers
+   plugin
    api/api
    credits
 
-.. mdinclude:: ../README.md
-
-
 Indices and tables
 ==================
 
docs/installation.md

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+# Installation
+
+dpdata only supports Python 3.7 and above. You can [set up a conda/pip environment](https://docs.deepmodeling.com/faq/conda.html), and then use one of the following methods to install dpdata:
+
+- Install via pip: `pip install dpdata`
+- Install via conda: `conda install -c conda-forge dpdata`
+- Install from source code: `git clone https://github.com/deepmodeling/dpdata && pip install ./dpdata`
+
+To test whether the installation succeeded, run
+
+```bash
+dpdata --version
+```
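
The same check can also be made from Python. The sketch below is an illustration rather than part of dpdata: it relies only on the standard-library `importlib.metadata`, and the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "dpdata"):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


found = installed_version()
print(f"dpdata {found}" if found else "dpdata is not installed")
```

Returning `None` instead of raising keeps the check usable in setup scripts that need to decide whether to install the package.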

docs/plugin.md

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+# Plugins
+
+You can follow the simple example under the `plugin_example/` directory to add your own format by creating and installing a plugin.
+It is critical to add the :class:`Format` class to `entry_points['dpdata.plugins']` in `pyproject.toml`:
+
+```toml
+[project.entry-points.'dpdata.plugins']
+random = "dpdata_random:RandomFormat"
+```

docs/systems/bond_order_system.md

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+
+## BondOrderSystem
+A new class :class:`BondOrderSystem`, which inherits from :class:`System`, is introduced in dpdata. It stores chemical bonds and formal charges in `BondOrderSystem.data['bonds']` and `BondOrderSystem.data['formal_charges']`. Currently, BondOrderSystem can only read .mol/.sdf files because it depends on rdkit, which must be installed to use this class. Other formats, such as pdb, must first be converted to .mol/.sdf (for example with Open Babel).
+```python
+import dpdata
+
+system_1 = dpdata.BondOrderSystem(
+    "tests/bond_order/CH3OH.mol", fmt="mol"
+)  # read from .mol file
+system_2 = dpdata.BondOrderSystem(
+    "tests/bond_order/methane.sdf", fmt="sdf"
+)  # read from .sdf file
+```
+In an sdf file, all molecules must have the same topology (i.e. be conformers of the same molecular configuration).
+`BondOrderSystem` can also be initialized directly from a :class:`rdkit.Chem.rdchem.Mol` object.
+```python
+from rdkit import Chem
+from rdkit.Chem import AllChem
+import dpdata
+
+mol = Chem.MolFromSmiles("CC")
+mol = Chem.AddHs(mol)
+AllChem.EmbedMultipleConfs(mol, 10)
+system = dpdata.BondOrderSystem(rdkit_mol=mol)
+```
+
+### Bond Order Assignment
+The :class:`BondOrderSystem` implements a more robust sanitization procedure for rdkit Mol objects, as defined in :class:`dpdata.rdkit.sanitize.Sanitizer`. This class defines three levels of sanitization: low, medium and high (the default is medium).
++ low: use the `rdkit.Chem.SanitizeMol()` function to sanitize the molecule.
++ medium: before using rdkit, the program first assigns the formal charge of each atom to avoid inappropriate-valence exceptions. However, this mode requires the bond order information in the given molecule to be correct.
++ high: the program tries to fix inappropriate bond orders in aromatic heterocycles and in phosphate, sulfate, carboxyl, nitro, nitrine and guanidine groups. If this procedure fails to sanitize the given molecule, the program then tries to call `obabel` to pre-process the mol and repeats the sanitization procedure. **That is to say, if you want to use this level of sanitization, please ensure `obabel` is installed in the environment.**
+According to our test, our sanitization procedure can successfully read 4852 small molecules in the PDBBind-refined-set. Note that the number of explicit hydrogens in the molecule file (mol/sdf) has to be correct. Thus, we recommend using
+`obabel xxx -O xxx -h` to pre-process the file. We do not implement this hydrogen-adding procedure in dpdata because we cannot ensure its correctness.
+
+```python
+import glob
+import dpdata
+
+for sdf_file in glob.glob("bond_order/refined-set-ligands/obabel/*sdf"):
+    syst = dpdata.BondOrderSystem(sdf_file, sanitize_level="high", verbose=False)
+```
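
Because the `high` level may call `obabel`, it can help to check that the executable is on `PATH` before starting a long batch. The sketch below uses only the standard library; the helper names `obabel_available` and `add_hydrogens_command` are hypothetical, and the argument list simply mirrors the `obabel xxx -O xxx -h` command quoted above.

```python
import shutil


def obabel_available() -> bool:
    """Return True if an `obabel` executable can be found on PATH."""
    return shutil.which("obabel") is not None


def add_hydrogens_command(src: str, dst: str) -> list:
    """Build the `obabel <src> -O <dst> -h` pre-processing command as an argv list."""
    return ["obabel", src, "-O", dst, "-h"]


if not obabel_available():
    print('obabel not found on PATH; sanitize_level="high" cannot fall back to it')
```

The command list can be passed to `subprocess.run(..., check=True)` to pre-process a file; building it as a list avoids shell quoting problems with unusual file names.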
+### Formal Charge Assignment
+BondOrderSystem implements a method to assign a formal charge to each atom based on the 8-electron rule (see below). Note that it only supports elements common in biological systems: B, C, N, O, P, S, As.
+```python
+import dpdata
+
+syst = dpdata.BondOrderSystem("tests/bond_order/CH3NH3+.mol", fmt="mol")
+print(syst.get_formal_charges())  # return the formal charge on each atom
+print(syst.get_charge())  # return the total charge of the system
+```
+
+If a valence of 3 is detected on carbon, the formal charge is assigned as -1, because in most cases (alkynyl anion, isonitrile, cyclopentadienyl anion) the formal charge on 3-valence carbon is -1, which is also consistent with the 8-electron rule.
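
The arithmetic behind the 8-electron rule can be worked through in a few lines. This is a simplified illustration, not dpdata's implementation: the function name and lookup table are hypothetical, and it covers only C, N and O, since B, P and S commonly deviate from a strict octet. For an atom with V valence electrons and total bond order B, a complete octet leaves 8 - 2B non-bonding electrons, so the formal charge is V - (8 - 2B) - B = V + B - 8.

```python
# Valence electrons of the neutral atom (illustrative subset only).
VALENCE_ELECTRONS = {"C": 4, "N": 5, "O": 6}


def octet_formal_charge(element: str, total_bond_order: int) -> int:
    """Formal charge of an atom assumed to hold a complete octet.

    charge = V - (8 - 2*B) - B = V + B - 8
    """
    return VALENCE_ELECTRONS[element] + total_bond_order - 8


print(octet_formal_charge("C", 3))  # carbon with valence 3 -> -1
print(octet_formal_charge("N", 4))  # nitrogen in CH3NH3+ -> 1
print(octet_formal_charge("O", 2))  # oxygen in CH3OH -> 0
```

The first case reproduces the -1 charge on 3-valence carbon described above.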
