Commit 8b6a0a7

merge devel to master (v1.0.0) (#909)
2 parents e0bc0f0 + c66e97a commit 8b6a0a7


48 files changed: 13,429 additions, 48 deletions

.github/workflows/benchmark.yml

Lines changed: 6 additions & 4 deletions
@@ -6,14 +6,15 @@ on:
 
 jobs:
   benchmark:
+    if: ${{ github.repository_owner == 'deepmodeling' }}
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v5
       - name: Set up Python
-        uses: actions/setup-python@v5
+        uses: actions/setup-python@v6
         with:
           python-version: 3.12
-      - uses: astral-sh/setup-uv@v6
+      - uses: astral-sh/setup-uv@v7
         with:
           enable-cache: true
           cache-dependency-glob: |
@@ -22,7 +23,8 @@ jobs:
       - name: Install dependencies
         run: uv pip install --system .[test,amber,ase,pymatgen,benchmark] rdkit openbabel-wheel
      - name: Run benchmarks
-        uses: CodSpeedHQ/action@v3
+        uses: CodSpeedHQ/action@v4
         with:
           token: ${{ secrets.CODSPEED_TOKEN }}
+          mode: walltime
           run: pytest benchmark/ --codspeed
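
The final step above hands the suite to CodSpeed via `pytest benchmark/ --codspeed`, and the new `mode: walltime` input switches the action to wall-clock measurement. A minimal sketch of the kind of test this collects, assuming pytest-codspeed's `@pytest.mark.benchmark` marker and a tiny, made-up POSCAR as input:

```python
import pytest

import dpdata


@pytest.mark.benchmark  # picked up when pytest runs with --codspeed
def test_parse_poscar(tmp_path):
    # Build a small two-atom Si cell and time how long dpdata takes to parse it.
    poscar = tmp_path / "POSCAR"
    poscar.write_text(
        "Si2\n"
        "1.0\n"
        "5.43 0.00 0.00\n"
        "0.00 5.43 0.00\n"
        "0.00 0.00 5.43\n"
        "Si\n"
        "2\n"
        "Direct\n"
        "0.00 0.00 0.00\n"
        "0.25 0.25 0.25\n"
    )
    system = dpdata.System(str(poscar), fmt="vasp/poscar")
    assert system.get_natoms() == 2
```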

.github/workflows/pyright.yml

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@master
-      - uses: actions/setup-python@v5
+      - uses: actions/setup-python@v6
         with:
           python-version: '3.12'
       - run: pip install uv

.github/workflows/test.yml

Lines changed: 3 additions & 3 deletions
@@ -12,13 +12,13 @@ jobs:
         python-version: ["3.8", "3.12"]
 
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v5
       # set up conda
       - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v5
+        uses: actions/setup-python@v6
         with:
           python-version: ${{ matrix.python-version }}
-      - uses: astral-sh/setup-uv@v6
+      - uses: astral-sh/setup-uv@v7
         with:
           enable-cache: true
           cache-dependency-glob: |

.github/workflows/test_import.yml

Lines changed: 2 additions & 2 deletions
@@ -8,8 +8,8 @@ jobs:
   build:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v4
-      - uses: actions/setup-python@v5
+      - uses: actions/checkout@v5
+      - uses: actions/setup-python@v6
         with:
           python-version: '3.9'
           architecture: 'x64'

.gitignore

Lines changed: 5 additions & 0 deletions
@@ -29,3 +29,8 @@ docs/minimizers.csv
 docs/api/
 docs/formats/
 .DS_Store
+# Test artifacts
+tests/data_*.h5
+tests/data_*/
+tests/tmp.*
+tests/.coverage

.pre-commit-config.yaml

Lines changed: 3 additions & 3 deletions
@@ -2,7 +2,7 @@
 # See https://pre-commit.com/hooks.html for more hooks
 repos:
   - repo: https://github.com/pre-commit/pre-commit-hooks
-    rev: v5.0.0
+    rev: v6.0.0
     hooks:
       # there are many log files in tests
       # TODO: seperate py files and log files
@@ -21,7 +21,7 @@ repos:
   # Python
   - repo: https://github.com/astral-sh/ruff-pre-commit
     # Ruff version.
-    rev: v0.12.5
+    rev: v0.14.1
     hooks:
       - id: ruff
         args: ["--fix"]
@@ -36,7 +36,7 @@ repos:
         args: ["--write"]
   # Python inside docs
   - repo: https://github.com/asottile/blacken-docs
-    rev: 1.19.1
+    rev: 1.20.0
     hooks:
       - id: blacken-docs
 ci:

AGENTS.md

Lines changed: 148 additions & 0 deletions
@@ -0,0 +1,148 @@
+# dpdata - Atomistic Data Format Manipulation
+
+dpdata is a Python package for manipulating atomistic data from computational science software. It supports format conversion between various atomistic simulation packages including VASP, DeePMD-kit, LAMMPS, GROMACS, Gaussian, ABACUS, and many others.
+
+Always reference these instructions first and fallback to search or bash commands only when you encounter unexpected information that does not match the info here.
+
+## Working Effectively
+
+- **Bootstrap and install the repository:**
+  - `cd /home/runner/work/dpdata/dpdata` (or wherever the repo is cloned)
+  - `uv pip install -e .` -- installs dpdata in development mode with core dependencies (numpy, scipy, h5py, monty, wcmatch)
+  - Test installation: `dpdata --version` -- should show version like "dpdata v0.1.dev2+..."
+
+- **Run tests:**
+  - `cd tests && python -m unittest discover` -- runs all 1826 tests in ~10 seconds. NEVER CANCEL.
+  - `cd tests && python -m unittest test_<module>.py` -- run specific test modules (individual modules take ~0.5 seconds)
+  - `cd tests && coverage run --source=../dpdata -m unittest discover && coverage report` -- run tests with coverage
+
+- **Linting and formatting:**
+  - Install ruff: `uv pip install ruff`
+  - `ruff check dpdata/` -- lint the main package (takes ~1 second)
+  - `ruff format dpdata/` -- format code according to project style
+  - `ruff check --fix dpdata/` -- auto-fix linting issues where possible
+
+- **Pre-commit hooks:**
+  - Install: `uv pip install pre-commit`
+  - `pre-commit run --all-files` -- run all hooks on all files
+  - Hooks include: ruff linting/formatting, trailing whitespace, end-of-file-fixer, yaml/json/toml checks
+
+## Validation
+
+- **Always test CLI functionality after making changes:**
+  - `dpdata --help` -- ensure CLI still works
+  - `dpdata --version` -- verify version is correct
+  - Test a basic conversion if sample data is available
+
+- **Always run linting before committing:**
+  - `ruff check dpdata/` -- ensure no new linting errors
+  - `ruff format dpdata/` -- ensure code is properly formatted
+
+- **Run relevant tests for your changes:**
+  - For format-specific changes: `cd tests && python -m unittest test_<format>*.py`
+  - For core system changes: `cd tests && python -m unittest test_system*.py test_multisystems.py`
+  - For CLI changes: `cd tests && python -m unittest test_cli.py` (if exists)
+
+## Build and Documentation
+
+- **Documentation:**
+  - `cd docs && make help` -- see all available build targets
+  - `cd docs && make html` -- build HTML documentation (requires additional dependencies)
+  - Documentation source is in `docs/` directory using Sphinx
+  - **NOTE:** Full docs build requires additional dependencies like `deepmodeling-sphinx` that may not be readily available
+
+- **Package building:**
+  - Uses setuptools with pyproject.toml configuration
+  - `uv pip install build && python -m build` -- create source and wheel distributions
+  - Version is managed by setuptools_scm from git tags
+
+## Common Tasks
+
+The following are outputs from frequently run commands. Reference them instead of re-running to save time.
+
+### Repository structure
+```
+/home/runner/work/dpdata/dpdata/
+├── dpdata/           # Main package code
+│   ├── __init__.py
+│   ├── cli.py        # Command-line interface
+│   ├── system.py     # Core System classes
+│   ├── format.py     # Format registry
+│   ├── abacus/       # ABACUS format support
+│   ├── amber/        # AMBER format support
+│   ├── deepmd/       # DeePMD format support
+│   ├── vasp/         # VASP format support
+│   ├── xyz/          # XYZ format support
+│   └── ...           # Other format modules
+├── tests/            # Test suite (91 test files)
+├── docs/             # Sphinx documentation
+├── plugin_example/   # Example plugin
+├── pyproject.toml    # Project configuration
+└── README.md
+```
+
+### Key dependencies
+- Core: numpy>=1.14.3, scipy, h5py, monty, wcmatch
+- Optional: ase (ASE integration), parmed (AMBER), pymatgen (Materials Project), rdkit (molecular analysis)
+- Testing: unittest (built-in), coverage
+- Linting: ruff
+- Docs: sphinx with various extensions
+
+### Test timing expectations
+- Full test suite: ~10 seconds (1826 tests). NEVER CANCEL.
+- Individual test modules: ~0.5 seconds
+- Linting with ruff: ~1 second
+- Documentation build: ~30 seconds
+
+### Common workflows
+1. **Adding a new format:**
+   - Create module in `dpdata/<format>/`
+   - Implement format classes inheriting from appropriate base classes
+   - Add tests in `tests/test_<format>*.py`
+   - Register format in the plugin system
+
+2. **Fixing bugs:**
+   - Write test that reproduces the bug first
+   - Make minimal fix to pass the test
+   - Run full test suite to ensure no regressions
+   - Run linting to ensure code style compliance
+
+3. **CLI changes:**
+   - Modify `dpdata/cli.py`
+   - Test with `dpdata --help` and specific commands
+   - Add/update tests if needed
+
+## Troubleshooting
+
+- **Installation timeouts:** Network timeouts during `uv pip install` are common. If this occurs, try:
+  - Individual package installation: `uv pip install numpy scipy h5py monty wcmatch`
+  - Use `--timeout` option: `uv pip install --timeout 300 -e .`
+  - Verify existing installation works: `dpdata --version` should work even if reinstall fails
+
+- **Optional dependency errors:** Many tests will skip or fail if optional dependencies (ase, parmed, pymatgen, rdkit) are not installed. This is expected. Core functionality will work with just the basic dependencies.
+
+- **Documentation build failures:** The docs build requires specific dependencies like `deepmodeling-sphinx` that may not be readily available. Use `make help` to see available targets, but expect build failures without full doc dependencies.
+
+- **Test artifacts:** The test suite generates temporary files (`tests/data_*`, `tests/tmp.*`, `tests/.coverage`). These are excluded by `.gitignore` and should not be committed.
+
+- **Import errors:** If you see import errors for specific modules, check if the corresponding optional dependency is installed. For example, ASE functionality requires `uv pip install ase`.
+
+## Critical Notes
+
+- **NEVER CANCEL** test runs or builds - they complete quickly (10 seconds for tests, 30 seconds for docs)
+- Always run `ruff check` and `ruff format` before committing
+- Test artifacts in `tests/` directory are excluded by `.gitignore` - don't commit them
+- Optional dependencies are required for some formats but core functionality works without them
+- The CLI tool `dpdata` is the main user interface for format conversion
+
+## Commit and PR Guidelines
+
+- **Use semantic commit messages** for all commits and PR titles following the format: `type(scope): description`
+- **Types:** `feat` (new feature), `fix` (bug fix), `docs` (documentation), `style` (formatting), `refactor` (code restructuring), `test` (testing), `chore` (maintenance)
+- **Examples:**
+  - `feat(vasp): add support for POSCAR format`
+  - `fix(cli): resolve parsing error for multi-frame files`
+  - `docs: update installation instructions`
+  - `test(amber): add tests for trajectory parsing`
+- **PR titles** must follow semantic commit format
+- **Commit messages** should be concise but descriptive of the actual changes made
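
The AGENTS.md validation step "test a basic conversion if sample data is available" can be done from Python as well as the CLI. A minimal sketch, assuming a VASP `OUTCAR` file is on hand (the file and output directory names are only examples):

```python
import dpdata

# Read a labeled system (coordinates plus energies/forces) from a VASP OUTCAR
# and write it back out in DeePMD-kit's NumPy format.
system = dpdata.LabeledSystem("OUTCAR", fmt="vasp/outcar")
system.to("deepmd/npy", "deepmd_data")
print(system.get_nframes(), "frames,", system.get_natoms(), "atoms")
```

Likewise, the "Adding a new format" workflow ends with registering the format in the plugin system. The sketch below assumes the `Format.register` decorator pattern used by dpdata's built-in plugins; the format key, class name, and toy parser are illustrative, not an existing dpdata format:

```python
import numpy as np

from dpdata.format import Format


@Format.register("myxyz")  # hypothetical format key
class MyXYZFormat(Format):
    """Toy single-frame XYZ reader, only to illustrate the plugin hook."""

    def from_system(self, file_name, **kwargs):
        with open(file_name) as f:
            lines = f.read().splitlines()
        natoms = int(lines[0])
        symbols, coords = [], []
        for line in lines[2 : 2 + natoms]:
            s, x, y, z = line.split()[:4]
            symbols.append(s)
            coords.append([float(x), float(y), float(z)])
        names = sorted(set(symbols))
        # Return the data dict expected by dpdata.System.
        return {
            "atom_names": names,
            "atom_numbs": [symbols.count(n) for n in names],
            "atom_types": np.array([names.index(s) for s in symbols]),
            "orig": np.zeros(3),
            "cells": 100.0 * np.eye(3).reshape(1, 3, 3),  # dummy box
            "coords": np.array(coords).reshape(1, natoms, 3),
            "nopbc": True,
        }
```

With the class registered, `dpdata.System("conf.xyz", fmt="myxyz")` would be routed to it.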

README.md

Lines changed: 7 additions & 0 deletions
@@ -1,11 +1,18 @@
 # dpdata
 
+[![DOI:10.1021/acs.jcim.5c01767](https://img.shields.io/badge/DOI-10.1021%2Facs.jcim.5c01767-blue)](https://doi.org/10.1021/acs.jcim.5c01767)
 [![conda-forge](https://img.shields.io/conda/dn/conda-forge/dpdata?color=red&label=conda-forge&logo=conda-forge)](https://anaconda.org/conda-forge/dpdata)
 [![pip install](https://img.shields.io/pypi/dm/dpdata?label=pip%20install&logo=pypi)](https://pypi.org/project/dpdata)
 [![Documentation Status](https://readthedocs.org/projects/dpdata/badge/)](https://dpdata.readthedocs.io/)
 
 **dpdata** is a Python package for manipulating atomistic data of software in computational science.
 
+## Credits
+
+If you use this software, please cite the following paper:
+
+- Jinzhe Zeng, Xingliang Peng, Yong-Bin Zhuang, Haidi Wang, Fengbo Yuan, Duo Zhang, Renxi Liu, Yingze Wang, Ping Tuo, Yuzhi Zhang, Yixiao Chen, Yifan Li, Cao Thang Nguyen, Jiameng Huang, Anyang Peng, Marián Rynik, Wei-Hong Xu, Zezhong Zhang, Xu-Yuan Zhou, Tao Chen, Jiahao Fan, Wanrun Jiang, Bowen Li, Denan Li, Haoxi Li, Wenshuo Liang, Ruihao Liao, Liping Liu, Chenxing Luo, Logan Ward, Kaiwei Wan, Junjie Wang, Pan Xiang, Chengqian Zhang, Jinchao Zhang, Rui Zhou, Jia-Xin Zhu, Linfeng Zhang, Han Wang, dpdata: A Scalable Python Toolkit for Atomistic Machine Learning Data Sets, *J. Chem. Inf. Model.*, 2025, DOI: [10.1021/acs.jcim.5c01767](https://doi.org/10.1021/acs.jcim.5c01767). [![Citations](https://citations.njzjz.win/10.1021/acs.jcim.5c01767)](https://badge.dimensions.ai/details/doi/10.1021/acs.jcim.5c01767)
+
 ## Installation
 
 dpdata only supports Python 3.8 and above. You can [setup a conda/pip environment](https://docs.deepmodeling.com/faq/conda.html), and then use one of the following methods to install dpdata:

docs/index.rst

Lines changed: 13 additions & 0 deletions
@@ -8,6 +8,19 @@ Welcome to dpdata's documentation!
 
 dpdata is a Python package for manipulating atomistic data of software in computational science.
 
+If you use this software, please cite the following paper:
+
+- Jinzhe Zeng, Xingliang Peng, Yong-Bin Zhuang, Haidi Wang, Fengbo
+  Yuan, Duo Zhang, Renxi Liu, Yingze Wang, Ping Tuo, Yuzhi Zhang,
+  Yixiao Chen, Yifan Li, Cao Thang Nguyen, Jiameng Huang, Anyang Peng,
+  Marián Rynik, Wei-Hong Xu, Zezhong Zhang, Xu-Yuan Zhou, Tao Chen,
+  Jiahao Fan, Wanrun Jiang, Bowen Li, Denan Li, Haoxi Li, Wenshuo
+  Liang, Ruihao Liao, Liping Liu, Chenxing Luo, Logan Ward, Kaiwei Wan,
+  Junjie Wang, Pan Xiang, Chengqian Zhang, Jinchao Zhang, Rui Zhou,
+  Jia-Xin Zhu, Linfeng Zhang, Han Wang, dpdata: A Scalable Python
+  Toolkit for Atomistic Machine Learning Data Sets, *J. Chem. Inf.
+  Model.*, 2025.
+
 .. toctree::
    :maxdepth: 2
    :caption: Contents:

dpdata/abacus/scf.py

Lines changed: 8 additions & 3 deletions
@@ -45,7 +45,10 @@ def get_path_out(fname, inlines):
 def get_energy(outlines):
     Etot = None
     for line in reversed(outlines):
-        if "final etot is" in line:
+        if "final etot is" in line:  # for LTS
+            Etot = float(line.split()[-2])  # in eV
+            return Etot, True
+        elif "TOTAL ENERGY" in line:  # for develop
             Etot = float(line.split()[-2])  # in eV
             return Etot, True
         elif "convergence has NOT been achieved!" in line:
@@ -59,7 +62,8 @@ def get_energy(outlines):
 def collect_force(outlines):
     force = []
     for i, line in enumerate(outlines):
-        if "TOTAL-FORCE (eV/Angstrom)" in line:
+        # if "TOTAL-FORCE (eV/Angstrom)" in line:
+        if "TOTAL-FORCE" in line:
             value_pattern = re.compile(
                 r"^\s*[A-Z][a-z]?[1-9][0-9]*\s+[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?\s+[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?\s+[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?\s*$"
             )
@@ -95,7 +99,8 @@ def get_force(outlines, natoms):
 def collect_stress(outlines):
     stress = []
     for i, line in enumerate(outlines):
-        if "TOTAL-STRESS (KBAR)" in line:
+        # if "TOTAL-STRESS (KBAR)" in line:
+        if "TOTAL-STRESS" in line:
             value_pattern = re.compile(
                 r"^\s*[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?\s+[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?\s+[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?\s*$"
             )
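
The `get_energy` change above accepts both the LTS (`final etot is`) and development (`TOTAL ENERGY`) flavors of the ABACUS log, taking the energy from the second-to-last token in eV. A minimal sketch of that convention; the sample log lines are hypothetical, and real ABACUS output may differ in decoration and spacing:

```python
def parse_etot(line):
    # Mirrors the patched get_energy convention: the energy is the
    # second-to-last whitespace-separated token and is already in eV.
    if "final etot is" in line:  # LTS-style output
        return float(line.split()[-2])
    if "TOTAL ENERGY" in line:  # develop-style output
        return float(line.split()[-2])
    return None


# Hypothetical log lines in the two styles the parser now accepts.
assert parse_etot(" !! final etot is      -6427.840971 eV") == -6427.840971
assert parse_etot(" TOTAL ENERGY          = -6427.840971 eV") == -6427.840971
```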
