Skip to content

Commit 6fdd67e

Browse files
Reorganize forcefield tests (#1388)
* first attempt reorganizing mlff ci * fix wf file * tweak ff workflow file * typos * ensure torch dftd is installed with mace * separate mlff neb tests * annotations in pyproject for future dev work; make phonon tutorial only use matgl since these dep groups are split * reorg phonon tests * revert to numpy<2 * try numpy>2 with numba pin * documentation on fairchem + test data should a user install it, correct APIs to some MLIPs, try to test nequip in CI * nequip broken api syntax yet again? * mattersim tests * bump deps * other mattersim tests * organize + expand ff docs + fix mattersim test tol * forcefield dev notes + add on notes * precommit
1 parent 7ce3c51 commit 6fdd67e

File tree

28 files changed

+634
-326
lines changed

28 files changed

+634
-326
lines changed

.github/workflows/testing.yml

Lines changed: 70 additions & 67 deletions
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,7 @@ jobs:
2323

2424
- uses: pre-commit/action@v3.0.0
2525

26-
test-non-ase:
26+
test-forcefields:
2727
# prevent this action from running on forks
2828
if: github.repository == 'materialsproject/atomate2'
2929

@@ -39,8 +39,8 @@ jobs:
3939
shell: bash -l {0} # enables conda/mamba env activation by reading bash profile
4040
strategy:
4141
matrix:
42-
python-version: ["3.11", "3.12"]
43-
split: [1, 2, 3]
42+
python-version: ["3.12"]
43+
dep-group: ["generic", "torch-limited", "e3nn-limited"]
4444

4545
steps:
4646
- name: Check out repo
@@ -59,17 +59,15 @@ jobs:
5959

6060
- name: Install conda dependencies
6161
run: |
62-
micromamba install -n a2 -c conda-forge enumlib packmol bader --yes
62+
micromamba install -n a2 -c conda-forge packmol --yes
6363
6464
- name: Install dependencies
65-
run: |
65+
run: | # TODO: remove setuptools pin once `mattersim` removes `pkg_resources` dependency
6666
micromamba activate a2
67-
python -m pip install --upgrade pip
67+
python -m pip install --upgrade pip 'setuptools<81'
6868
mkdir -p ~/.abinit/pseudos
6969
cp -r tests/test_data/abinit/pseudos/ONCVPSP-PBE-SR-PDv0.4 ~/.abinit/pseudos
70-
uv pip install .[strict,strict-forcefields,abinit,approxneb,aims] --group tests
71-
uv pip install torch-runstats torch_dftd
72-
uv pip install --no-deps nequip==0.5.6
70+
uv pip install .[strict,strict-forcefields-${{ matrix.dep-group }},abinit,approxneb,aims] --group tests
7371
7472
- name: Install pymatgen from master if triggered by pymatgen repo dispatch
7573
if: github.event_name == 'repository_dispatch' && github.event.action == 'pymatgen-ci-trigger'
@@ -81,15 +79,17 @@ jobs:
8179
env:
8280
MP_API_KEY: ${{ secrets.MP_API_KEY }}
8381

84-
# regenerate durations file with `pytest --store-durations --durations-path tests/.pytest-split-durations`
85-
# Note the use of `--splitting-algorithm least_duration`.
86-
# This helps prevent a test split having no tests to run, and then the GH action failing, see:
87-
# https://github.com/jerry-git/pytest-split/issues/95
88-
# However this `splitting-algorithm` means that tests cannot depend sensitively on the order they're executed in.
8982
run: |
9083
micromamba activate a2
91-
pytest -n auto --splits 3 --group ${{ matrix.split }} --durations-path tests/.pytest-split-durations --splitting-algorithm least_duration --ignore=tests/ase --ignore=tests/openff_md --ignore=tests/openmm_md --cov=atomate2 --cov-report=xml
84+
pytest -n auto --cov=atomate2 --cov-report=xml tests/forcefields
9285
86+
- name: Forcefield tutorial
87+
if: matrix.dep-group == 'torch-limited'
88+
env:
89+
MP_API_KEY: ${{ secrets.MP_API_KEY }}
90+
run: |
91+
micromamba activate a2
92+
pytest -n auto --nbmake ./tutorials/force_fields
9393
9494
- uses: codecov/codecov-action@v1
9595
if: matrix.python-version == '3.11' && github.repository == 'materialsproject/atomate2'
@@ -98,7 +98,7 @@ jobs:
9898
name: coverage${{ matrix.split }}
9999
file: ./coverage.xml
100100

101-
test-openff:
101+
test-non-ase:
102102
# prevent this action from running on forks
103103
if: github.repository == 'materialsproject/atomate2'
104104

@@ -114,7 +114,8 @@ jobs:
114114
shell: bash -l {0} # enables conda/mamba env activation by reading bash profile
115115
strategy:
116116
matrix:
117-
python-version: ["3.11","3.12"]
117+
python-version: ["3.11", "3.12"]
118+
split: [1, 2, 3]
118119

119120
steps:
120121
- name: Check out repo
@@ -132,14 +133,18 @@ jobs:
132133
run: micromamba run -n a2 pip install uv
133134

134135
- name: Install conda dependencies
135-
run: | # TODO: migrate openff tests to use non smirnoff99frosst forcefields - requires old setuptools / pkg_resources
136-
micromamba install -n a2 -c conda-forge enumlib packmol bader openbabel openff-toolkit==0.16.2 openff-interchange==0.3.22 'setuptools<81' --yes
136+
run: |
137+
micromamba install -n a2 -c conda-forge enumlib packmol bader --yes
137138
138139
- name: Install dependencies
139140
run: |
140141
micromamba activate a2
141142
python -m pip install --upgrade pip
142-
uv pip install .[strict-openff] --group tests
143+
mkdir -p ~/.abinit/pseudos
144+
cp -r tests/test_data/abinit/pseudos/ONCVPSP-PBE-SR-PDv0.4 ~/.abinit/pseudos
145+
uv pip install .[strict,abinit,approxneb,aims] --group tests
146+
uv pip install torch-runstats torch_dftd
147+
uv pip install --no-deps nequip==0.5.6
143148
144149
- name: Install pymatgen from master if triggered by pymatgen repo dispatch
145150
if: github.event_name == 'repository_dispatch' && github.event.action == 'pymatgen-ci-trigger'
@@ -151,19 +156,27 @@ jobs:
151156
env:
152157
MP_API_KEY: ${{ secrets.MP_API_KEY }}
153158

159+
# regenerate durations file with `pytest --store-durations --durations-path tests/.pytest-split-durations`
160+
# Note the use of `--splitting-algorithm least_duration`.
161+
# This helps prevent a test split having no tests to run, and then the GH action failing, see:
162+
# https://github.com/jerry-git/pytest-split/issues/95
163+
# However this `splitting-algorithm` means that tests cannot depend sensitively on the order they're executed in.
154164
run: |
155165
micromamba activate a2
156-
pytest -n auto tests/{openff_md,openmm_md}
166+
pytest -n auto --splits 3 --group ${{ matrix.split }} --durations-path tests/.pytest-split-durations --splitting-algorithm least_duration --ignore=tests/ase --ignore=tests/openff_md --ignore=tests/openmm_md --ignore=tests/forcefields --cov=atomate2 --cov-report=xml
157167
158-
test-notebooks-and-ase:
168+
169+
- uses: codecov/codecov-action@v1
170+
if: matrix.python-version == '3.11' && github.repository == 'materialsproject/atomate2'
171+
with:
172+
token: ${{ secrets.CODECOV_TOKEN }}
173+
name: coverage${{ matrix.split }}
174+
file: ./coverage.xml
175+
176+
test-openff:
159177
# prevent this action from running on forks
160178
if: github.repository == 'materialsproject/atomate2'
161179

162-
# It seems like anything torch-dependent and tblite can't be installed in the same environment
163-
# without the tblite tests failing in CI, see, e.g.:
164-
# https://github.com/tblite/tblite/issues/116
165-
# Outside of CI, having torch installed but not loaded seems not to affect tblite
166-
# Set off ASE tests here, where tblite-dependent tests live
167180
services:
168181
local_mongodb:
169182
image: mongo:4.0
@@ -176,7 +189,7 @@ jobs:
176189
shell: bash -l {0} # enables conda/mamba env activation by reading bash profile
177190
strategy:
178191
matrix:
179-
python-version: ["3.11", "3.12"]
192+
python-version: ["3.11","3.12"]
180193

181194
steps:
182195
- name: Check out repo
@@ -194,46 +207,38 @@ jobs:
194207
run: micromamba run -n a2 pip install uv
195208

196209
- name: Install conda dependencies
197-
run: |
198-
micromamba install -n a2 -c conda-forge enumlib packmol openbabel --yes
210+
run: | # TODO: migrate openff tests to use non smirnoff99frosst forcefields - requires old setuptools / pkg_resources
211+
micromamba install -n a2 -c conda-forge enumlib packmol bader openbabel openff-toolkit==0.16.2 openff-interchange==0.3.22 'setuptools<81' --yes
199212
200213
- name: Install dependencies
201214
run: |
202215
micromamba activate a2
203216
python -m pip install --upgrade pip
204-
uv pip install .[strict,aims] --group tests
205-
uv pip install tblite>=0.4.0
217+
uv pip install .[strict-openff] --group tests
206218
207219
- name: Install pymatgen from master if triggered by pymatgen repo dispatch
208220
if: github.event_name == 'repository_dispatch' && github.event.action == 'pymatgen-ci-trigger'
209221
run: |
210222
micromamba activate a2
211223
uv pip install --upgrade 'git+https://github.com/materialsproject/pymatgen@${{ github.event.client_payload.pymatgen_ref }}'
212224
213-
- name: Test Notebooks
225+
- name: Test split ${{ matrix.split }}
214226
env:
215227
MP_API_KEY: ${{ secrets.MP_API_KEY }}
216-
run: |
217-
micromamba activate a2
218-
pytest -n auto --nbmake ./tutorials --ignore=./tutorials/openmm_tutorial.ipynb --ignore=./tutorials/force_fields --ignore=./tutorials/torchsim_tutorial.ipynb
219228

220-
- name: Test ASE
221-
env:
222-
MP_API_KEY: ${{ secrets.MP_API_KEY }}
223229
run: |
224230
micromamba activate a2
225-
pytest -n auto --splits 1 --group 1 --cov=atomate2 --cov-report=xml tests/ase
226-
227-
- uses: codecov/codecov-action@v1
228-
if: matrix.python-version == '3.11' && github.repository == 'materialsproject/atomate2'
229-
with:
230-
token: ${{ secrets.CODECOV_TOKEN }}
231-
file: ./coverage.xml
231+
pytest -n auto tests/{openff_md,openmm_md}
232232
233-
test-force-field-notebook:
233+
test-notebooks-and-ase:
234234
# prevent this action from running on forks
235235
if: github.repository == 'materialsproject/atomate2'
236236

237+
# It seems like anything torch-dependent and tblite can't be installed in the same environment
238+
# without the tblite tests failing in CI, see, e.g.:
239+
# https://github.com/tblite/tblite/issues/116
240+
# Outside of CI, having torch installed but not loaded seems not to affect tblite
241+
# Set off ASE tests here, where tblite-dependent tests live
237242
services:
238243
local_mongodb:
239244
image: mongo:4.0
@@ -263,45 +268,43 @@ jobs:
263268
- name: Install uv
264269
run: micromamba run -n a2 pip install uv
265270

271+
- name: Install conda dependencies
272+
run: |
273+
micromamba install -n a2 -c conda-forge enumlib packmol openbabel --yes
274+
266275
- name: Install dependencies
267276
run: |
268277
micromamba activate a2
269278
python -m pip install --upgrade pip
270-
mkdir -p ~/.abinit/pseudos
271-
cp -r tests/test_data/abinit/pseudos/ONCVPSP-PBE-SR-PDv0.4 ~/.abinit/pseudos
272-
uv pip install .[strict,strict-forcefields,abinit,aims] --group tests
273-
uv pip install torch-runstats
274-
uv pip install --no-deps nequip==0.5.6
279+
uv pip install .[strict,aims] --group tests
280+
uv pip install tblite>=0.4.0
275281
276282
- name: Install pymatgen from master if triggered by pymatgen repo dispatch
277283
if: github.event_name == 'repository_dispatch' && github.event.action == 'pymatgen-ci-trigger'
278284
run: |
279285
micromamba activate a2
280286
uv pip install --upgrade 'git+https://github.com/materialsproject/pymatgen@${{ github.event.client_payload.pymatgen_ref }}'
281287
282-
- name: Forcefield tutorial
288+
- name: Test Notebooks
283289
env:
284290
MP_API_KEY: ${{ secrets.MP_API_KEY }}
285-
286-
# regenerate durations file with `pytest --store-durations --durations-path tests/.pytest-split-durations`
287-
# Note the use of `--splitting-algorithm least_duration`.
288-
# This helps prevent a test split having no tests to run, and then the GH action failing, see:
289-
# https://github.com/jerry-git/pytest-split/issues/95
290-
# However this `splitting-algorithm` means that tests cannot depend sensitively on the order they're executed in.
291291
run: |
292292
micromamba activate a2
293-
pytest -n auto --nbmake ./tutorials/force_fields
293+
pytest -n auto --nbmake ./tutorials --ignore=./tutorials/openmm_tutorial.ipynb --ignore=./tutorials/force_fields --ignore=./tutorials/torchsim_tutorial.ipynb
294294
295+
- name: Test ASE
296+
env:
297+
MP_API_KEY: ${{ secrets.MP_API_KEY }}
298+
run: |
299+
micromamba activate a2
300+
pytest -n auto --splits 1 --group 1 --cov=atomate2 --cov-report=xml tests/ase
295301
296302
- uses: codecov/codecov-action@v1
297303
if: matrix.python-version == '3.11' && github.repository == 'materialsproject/atomate2'
298304
with:
299305
token: ${{ secrets.CODECOV_TOKEN }}
300-
name: coverage
301306
file: ./coverage.xml
302307

303-
304-
305308
test-torchsim:
306309
# prevent this action from running on forks
307310
if: github.repository == 'materialsproject/atomate2'
@@ -373,15 +376,15 @@ jobs:
373376
cache-dependency-path: pyproject.toml
374377

375378
- name: Install dependencies
376-
run: |
377-
python -m pip install --upgrade pip
378-
pip install .[strict,strict-forcefields] --group docs
379+
run: | # TODO: remove setuptools pin once `mattersim` removes `pkg_resources` dependency
380+
python -m pip install --upgrade pip 'setuptools<81'
381+
pip install .[strict,strict-forcefields-generic] --group docs
379382
380383
- name: Build
381384
run: sphinx-build docs docs_build
382385

383386
automerge:
384-
needs: [docs, lint, test-force-field-notebook, test-non-ase, test-notebooks-and-ase, test-openff, test-torchsim]
387+
needs: [docs, lint, test-forcefields, test-non-ase, test-notebooks-and-ase, test-openff, test-torchsim]
385388
runs-on: ubuntu-latest
386389

387390
permissions:

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@ default_language_version:
33
exclude: ^(.github/|tests/test_data/abinit/)
44
repos:
55
- repo: https://github.com/charliermarsh/ruff-pre-commit
6-
rev: v0.15.0
6+
rev: v0.15.1
77
hooks:
88
- id: ruff
99
args: [--fix]

docs/dev/forcefields.md

Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,30 @@
1+
# Maintaining the machine learning forcefields module
2+
3+
Some of these points are already noted in the `pyproject.toml`. This goes into a bit more depth.
4+
5+
## Overview
6+
7+
`atomate2` contains a convenience interface to many common machine learning interatomic forcefields (MLFFs) via the atomic simulation environment (ASE). In the literature, these may also be known as machine learning interatomic potentials (MLIPs), or, when specifically referring to MLIPs with coverage of most of the periodic table, foundation potentials (FPs).
8+
9+
There is both an `ase` module in `atomate2`, based around general `ase` `Calculator`s, and a `forcefields`-specific module which has a higher number of workflows.
10+
11+
The `ase` module should be used to manage high-level tasks, such as geometry optimization, molecular dynamics, and nudged elastic band. Any further developments to these tools in `ase` should also warrant updates in this module in `atomate2`. For example, when `ase` rolled out the `MTKNPT` NPT MD barostat as a replacement for the default barostat, this was also made the default in `atomate2`.
12+
13+
The `forcefields` library should be used to develop concrete implementations of workflows, e.g., harmonic phonon, Grüneisen parameter, ApproxNEB.
14+
15+
## Dependency Chaos
16+
17+
The individual MLFFs in `atomate2` often have conflicting dependencies. This makes testing and managing a consistent, relatively up-to-date testing environment challenging.
18+
We want to avoid pinning MLFF libraries at older versions, because this may break their API within `atomate2`, or lead to drift in test data as models evolve.
19+
20+
Thus, it is likely that the `pyproject.toml` contains multiple optional dependencies under the header `strict-forcefields-*`. These groupings are used to ensure the most recent version of a MLFF library is installed in CI testing, with acceptable dependencies. The names of these groups can change over time, but the names should be chosen to be informative as to why they exist: ex., `strict-forcefields-e3nn-limited` to indicate that these MLFFs need an older version of `e3nn`, or `strict-forcefields-generic` to indicate that no strong dependency limitation is observed.
21+
22+
When updating these groupings, it is critical to ensure that you also update the `.github/workflows/testing.yml` testing workflow. You will see that the different forcefield dependency groups are tested separately.
23+
24+
When adding a new MLFF and tests for it (if possible), you must ensure that appropriate `pytest.mark.skipif` decorators are applied if that MLFF package is not installed. A `mlff_is_installed` boolean check is included in `tests/forcefields/conftest.py` for convenience in writing these skip test markers. See `tests/forcefields/test_jobs.py` for examples.
25+
26+
## Testing limitations
27+
28+
Some MLFFs, like FAIRChem, have access restrictions on them which prohibit running tests in CI. For these, we should also likely create tests for continuing development even if they are not run in CI.
29+
30+
Other MLFFs, like GAP or Nequip, are more generic architectures and require specific potential files to describe certain chemical spaces. Contributors adding new architectures which require these potential fields should submit minimal potential files (as small as possible to test, accuracy is not important here) to run tests for these.

docs/dev/input_sets.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@ Most ab-initio codes rely on input files read from disk. The `InputSet` class au
1111
- As a dictionary-like collection, the `InputSet` associates file names (the keys) with their content (which can be either simple strings or `InputFile` instances). The main methods of this class are `write_input()` for standardized file writing and `from_directory()` to read in pre-existing `InputSets`. The `validate()` method confirms the validity of the `InputSet`.
1212
- `InputGenerators` implement the `get_input_set()` method, which provides the recipe, i.e., logic to return a suitable `InputSet` for a `Molecule`, `Structure`, or another set of atomic coordinates. During the initialization of an `InputGenerator` additional inputs can be provided that adjust the recipe.
1313

14-
While generally the input classes are supposed to be part of `pymatgen` during development it is recommended to include them in `atomate2` at first to facilitate rapid iteration. Once mature, they can be moved to `pymatgen` or to a `pymatgen` [addon package](https://.org/addons). When implementing your own input classes take note of the recommendations and rules in the pymatgen documentation [[1](https://pymatgen.org/pymatgen.io.html#module-pymatgen.io.core), [2](https://pymatgen.org/pymatgen.io.vasp.html#module-pymatgen.io.vasp.sets)].
14+
While generally the input classes are supposed to be part of `pymatgen` during development it is recommended to include them in `atomate2` at first to facilitate rapid iteration. Once mature, they can be moved to `pymatgen` or to a `pymatgen` [addon package](https://github.com/materialsproject/pymatgen-addon-template). When implementing your own input classes take note of the recommendations and rules in the pymatgen documentation [[1](https://pymatgen.org/pymatgen.io.html#module-pymatgen.io.core), [2](https://pymatgen.org/pymatgen.io.vasp.html#module-pymatgen.io.vasp.sets)].
1515

1616
The `InputGenerators` interact with the atomate2 workflows through `Makers`. Each `Maker` for a code that requires input files, sets its `input_set_generator` parameter during initialization. For example, for the `RelaxMaker`, the default `input_set_generator` is the `RelaxSetGenerator`, but of course, users can provide or modify any `VaspInputGenerator` and provide it as input to a Maker.
1717

docs/index.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@ user/jobflow-remote
1010
user/fireworks
1111
user/atomate-1-vs-2
1212
user/codes/index
13+
user/addons
1314
tutorials/tutorials
1415
```
1516

@@ -26,6 +27,7 @@ dev/dev_install
2627
dev/workflow_tutorial
2728
dev/vasp_tests
2829
dev/abinit_tests
30+
dev/forcefields
2931
```
3032

3133
```{toctree}

0 commit comments

Comments
 (0)