Commit c0c46aa

sacostagutz and lwalew authored
docs: revision SAG (#49)
* docs: index page review
* docs: cli page review
* docs: new benchmark page review
* docs: category split modification
* docs: stability docs review
* docs: scaling docs review
* docs: typo benchmarks index
* docs: conformers new figures
* docs: dihedral-scan reviewed
* docs: non-covalent interactions reviewed
* docs: ring-planarity reviewed
* docs: tautomers reviewed
* docs: minimization reviewed, new figure pending
* docs: bond length reviewed
* docs: reactivity reviewed
* docs: rdf molecular liquids reviewed
* docs: folding stability review 1
* docs: folding stability review 2
* docs: sampling reviewed
* docs: folding-stability index.rst typo
* docs: folding-stability index.rst typo 2
* docs: changes requested by Christoph and Lucien
* docs: new minimization benchmark figure
* docs: slight improvements + grammar
* chore: linting

---------

Co-authored-by: lwalew <[email protected]>
1 parent ea9a3e7 commit c0c46aa

33 files changed

+139 -131 lines changed

docs/source/benchmarks/biomolecules/folding_stability/index.rst

Lines changed: 26 additions & 44 deletions
Original file line numberDiff line numberDiff line change
@@ -6,20 +6,36 @@ Folding Stability Benchmark
66
Purpose
77
-------
88

9-
This benchmark evaluates the performance of a machine-learned interatomic potential
10-
(**MLIP**) in preserving the structural integrity of experimentally determined protein
9+
This benchmark evaluates the ability of a machine-learned interatomic potential
10+
(**MLIP**) to maintain the structural integrity of experimentally determined protein
1111
conformations during molecular dynamics (**MD**) simulations.
12-
Starting from an experimentally derived X-ray or NMR structure, the benchmark assesses
13-
the **MLIP** model ability to maintain the native protein fold throughout the simulation.
1412

15-
Specifically, it evaluates the maintain of the original protein fold, retention of
16-
secondary structure elements, and overall compactness across a set of known protein
17-
structures. This is quantified using various metrics, including **RMSD** (Root Mean Square Deviation),
18-
**TM-score** (Template Modeling score), **Secondary Structure matching** (using DSSP),
19-
and **Compactness** (radius of gyration analysis).
13+
Description
14+
-----------
15+
16+
Starting from a solvated, experimentally derived X-ray or NMR structure, the benchmark performs an **MD** simulation using the **MLIP**
17+
model in the **NVT** ensemble at **300 K** for **100,000 steps** (100 ps), using `jax-md <https://github.com/google/jax-md>`_
18+
as integrated via the `mlip <https://github.com/instadeepai/mlip>`_ library.
2019

20+
Performance is quantified using the following metrics:
2121

22+
- Retention of the original protein fold, via **RMSD** and **TM-score**.
23+
- Retention of secondary structure elements, via **Secondary Structure matching** (using DSSP).
24+
- Overall compactness, via **Compactness** (radius of gyration analysis).
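
As a rough illustration of two of the metrics listed above, the following minimal sketch computes an RMSD and a radius of gyration for plain coordinate lists. It is not the benchmark's implementation: the function names are hypothetical, no structural superposition is performed before the RMSD (a real analysis would normally align the structures first), and equal masses are assumed unless masses are passed in.

```python
import math


def rmsd(coords, reference):
    """Root mean square deviation between two equally sized lists of 3D points.

    Assumption: the two structures are already superimposed; no alignment is done here.
    """
    if len(coords) != len(reference):
        raise ValueError("coords and reference must have the same number of atoms")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords, reference))
    return math.sqrt(total / len(coords))


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a list of 3D points.

    Assumption: equal masses are used when none are supplied.
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total_mass = sum(masses)
    # Mass-weighted centre of the structure.
    center = [
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    weighted = sum(m * math.dist(c, center) ** 2 for m, c in zip(masses, coords))
    return math.sqrt(weighted / total_mass)


# Toy two-atom example: every atom is shifted by 1 along z.
reference = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
print(rmsd(frame, reference))
print(radius_of_gyration(reference))
```

Evaluating these quantities for every trajectory frame against the starting structure gives the kind of time series that the fold-retention and compactness metrics summarise.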
2225

26+
For more information on each metric, please refer to the following pages:
27+
28+
.. toctree::
29+
:maxdepth: 1
30+
31+
RMSD & TM-score <folding_stability>
32+
Compactness (Radius of gyration) <compactness>
33+
Secondary Structure matching <secondary_structure>
34+
35+
Dataset
36+
-------
37+
The dataset is composed of a series of protein structures taken from the `PDB <https://www.rcsb.org/>`_.
38+
They have the following IDs:
2339

2440
.. list-table::
2541
:widths: 25 25 25 25
@@ -43,43 +59,9 @@ and **Compactness** (radius of gyration analysis).
4359
:figclass: align-center
4460

4561
Amyloid-beta (PDBid: 1BA6)
46-
- .. figure:: ../img/1cq0.png
62+
- .. figure:: ../img/hypocretin.png
4763
:width: 100%
4864
:align: center
4965
:figclass: align-center
5066

5167
Hypocretin-2 (PDBid: 1CQ0)
52-
53-
54-
Dataset
55-
-------
56-
57-
TRP-cage (PDBid: 2JOF)
58-
59-
Chignolin (PDBid: 1UAO)
60-
61-
Amyloid-beta (PDBid: 1BA6)
62-
63-
Hypocretin-2 (PDBid: 1CQ0)
64-
65-
Description
66-
-----------
67-
68-
The benchmark performs an **MD** simulation using the **MLIP** model in the **NVT**
69-
ensemble at **300 K** for **100,000 steps**,
70-
leveraging the `jax-md <https://github.com/google/jax-md>`_, as integrated via the
71-
`mlip <https://github.com/instadeepai/mlip>`_ library. The starting configuration is an
72-
experimentally determined structure (X-ray or NMR).
73-
74-
For each system, the benchmark compares the following metrics to the reference structure
75-
for each trajectory frame.
76-
77-
78-
.. toctree::
79-
:maxdepth: 1
80-
81-
Folding Stability (RMSD & TM-score) <folding_stability>
82-
Compactness (Radius of gyration) <compactness>
83-
Secondary Structure matching <secondary_structure>
84-
85-
(For detailed descriptions and implementations of each metric, please refer to the pages linked above)

docs/source/benchmarks/biomolecules/sampling.rst

Lines changed: 8 additions & 5 deletions
@@ -12,7 +12,7 @@ reference data \ [#f1]_ and outliers are detected.
1212

1313
Description
1414
-----------
15-
This benchmark performs an MD simulation using the **MLIP** model in the **NVT** ensemble at **350 K** for **150,000 steps**. The
15+
This benchmark performs an MD simulation using the **MLIP** model in the **NVT** ensemble at **350 K** for **150,000 steps** (150 ps). The
1616
sampled probability distribution of backbone and side chain dihedrals is compared to a reference distribution. The main metrics are
1717
the **RMSD** and the **Hellinger distance** between the sampled and reference distributions. We also compute the **outliers ratio**
1818
of the sampled dihedrals. An outlier is defined as a conformation that is far away from any point of the reference data.
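
For readers unfamiliar with the Hellinger distance mentioned above, here is a minimal sketch for two histograms defined over the same bins. It is only an illustration under stated assumptions: the function name is hypothetical, the inputs are assumed to be non-negative bin counts or probabilities, and the binning the benchmark applies to the dihedral angles is not shown here.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions on the same bins.

    Inputs may be raw counts; each is normalised to sum to 1 first.
    The result lies in [0, 1]: 0 for identical distributions, 1 for disjoint ones.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    p_total, q_total = sum(p), sum(q)
    if p_total <= 0 or q_total <= 0:
        raise ValueError("each distribution must have positive total weight")
    squared = sum(
        (math.sqrt(a / p_total) - math.sqrt(b / q_total)) ** 2
        for a, b in zip(p, q)
    )
    return math.sqrt(0.5 * squared)


# Hypothetical histograms of a dihedral angle (sampled vs. reference counts).
sampled_counts = [10, 30, 60]
reference_counts = [20, 30, 50]
print(hellinger_distance(sampled_counts, reference_counts))
```

Because the distance is bounded between 0 and 1, values can be compared directly across dihedrals with different numbers of samples.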
@@ -23,10 +23,9 @@ Dataset
2323

2424
Each sequence was prepared to have a neutral total charge.
2525

26-
Systems were prepared with tleap, using larger boxes to enable proper handeling of long range cutoffs
27-
and minimised and equilibrated with the NPT ensemble with the AMBER force field using openMM.
28-
29-
Boxes of 300 molecules of water were then extracted from the equilibrated systems.
26+
Systems were prepared with **AmberTools25** \ [#f2]_, using larger boxes of pre-equilibrated **TIP3P** \ [#f3]_ water to enable
27+
proper handling of long-range cutoffs, and then minimised and equilibrated in the NPT ensemble (1 atm, 350 K) with the AMBER99SB-ILDN \ [#f4]_
28+
force field using OpenMM \ [#f5]_. After equilibration, boxes of 300 molecules of water were extracted to optimise benchmark runtimes.
3029

3130
The sequences are as follows:
3231

@@ -52,3 +51,7 @@ MLIP with an outlier ratio higher than 0.3 should be considered as not sampling
5251
References
5352
----------
5453
.. [#f1] Lovell, S.C., Word, J.M., Richardson, J.S. and Richardson, D.C. (2000), The penultimate rotamer library. Proteins, 40: 389-408. https://doi.org/10.1002/1097-0134(20000815)40:3<389::AID-PROT50>3.0.CO;2-2
54+
.. [#f2] AmberTools25, Case, D.A.; et al. Amber 2025. University of California, San Francisco (2025).
55+
.. [#f3] TIP3P, Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. J. Chem. Phys. 79 (1983) 926–935. doi:10.1063/1.445869
56+
.. [#f4] Lindorff-Larsen, K.; Piana, S.; Palmo, K.; Maragakis, P.; Klepeis, J. L.; Dror, R. O.; Shaw, D. E. Proteins 78 (2010) 1950–1958. doi:10.1002/prot.22711
57+
.. [#f5] P. Eastman; et al. “OpenMM 8: Molecular Dynamics Simulation with Machine Learning Potentials.” J. Phys. Chem. B 128(1), pp. 109-116 (2023).

docs/source/benchmarks/general/scaling.rst

Lines changed: 7 additions & 6 deletions
@@ -17,16 +17,17 @@ Description
1717

1818
For each system in the dataset, the benchmark performs an **MD** simulation using the **MLIP** model in the **NVT** ensemble at **300 K**
1919
for **1000 steps** (1 ps), leveraging the `jax-md <https://github.com/google/jax-md>`_, as integrated via the
20-
`mlip <https://github.com/instadeepai/mlip>`_ library. During each simulation, a timer tracks the duration of each episode, and the average episode time (excluding the first episode)
21-
is recorded. After all simulations are complete, the benchmark reports the **average inference time per episode as a function of
22-
system size**, providing a direct measure of how the **MLIP** implementation's computational cost grows with increasing molecular
23-
complexity. This allows for the identification of scaling bottlenecks and informs optimization strategies for large-scale
24-
simulations.
20+
`mlip <https://github.com/instadeepai/mlip>`_ library. During each simulation, a timer tracks the duration of each episode,
21+
and the average episode time (excluding the first episode to ignore the compilation time) is recorded. After all simulations are complete, the benchmark reports
22+
the **average inference time per step as a function of system size**, providing a direct measure of how the **MLIP** implementation's
23+
computational cost grows with increasing molecular complexity. This allows for the identification of scaling bottlenecks and informs
24+
optimization strategies for large-scale simulations.
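
The timing scheme described above (time each episode, then average while skipping the first one so that compilation time is not counted) can be sketched as follows. This is a simplified illustration, not the benchmark's code: `time_episodes`, `mean_excluding_first` and the `run_episode` callable are hypothetical names, and a real JAX-based run would also need to wait for asynchronous device work to finish before stopping the timer.

```python
import time


def time_episodes(run_episode, num_episodes):
    """Run `run_episode` repeatedly and return the wall-clock duration of each call."""
    durations = []
    for _ in range(num_episodes):
        start = time.perf_counter()
        run_episode()
        durations.append(time.perf_counter() - start)
    return durations


def mean_excluding_first(durations):
    """Average the durations, dropping the first (warm-up/compilation) episode."""
    if len(durations) < 2:
        raise ValueError("need at least two episodes to exclude the first one")
    steady_state = durations[1:]
    return sum(steady_state) / len(steady_state)


# Hypothetical stand-in for one MD episode.
def dummy_episode():
    sum(i * i for i in range(10_000))


durations = time_episodes(dummy_episode, num_episodes=5)
print(mean_excluding_first(durations))
```

Repeating this measurement for systems of increasing size and plotting the averages against atom count gives the scaling curve the benchmark reports.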
2525

2626
Dataset
2727
-------
2828

29-
The structures that are tested for stability are a series of protein structures, RNA fragments, peptides and inhibitors taken from the PDB.
29+
The scaling dataset is composed of a series of experimental structures of proteins, RNA fragments,
30+
peptides and small molecules taken from the `PDB <https://www.rcsb.org/>`_.
3031
They have the following ids:
3132

3233
* 1JRS

docs/source/benchmarks/general/stability.rst

Lines changed: 5 additions & 4 deletions
@@ -7,7 +7,7 @@ Purpose
77
-------
88

99
To assess the long-term dynamical stability of a machine-learned interatomic potential (**MLIP**) during realistic
10-
production-scale molecular dynamics (**MD**) simulations.
10+
molecular dynamics (**MD**) simulations.
1111

1212
Description
1313
-----------
@@ -16,8 +16,8 @@ For each system in the dataset, the benchmark performs a **MD** simulation using
1616
**NVT** ensemble at **300 K** for **100,000 steps** (100 ps), leveraging the
1717
`jax-md <https://github.com/google/jax-md>`_, as integrated via the `mlip <https://github.com/instadeepai/mlip>`_
1818
library. The test monitors the system for signs of instability by detecting abrupt temperature spikes
19-
(**explosions**) and hydrogen atom drift. These indicators help determine whether the **MLIP** maintains
20-
stable and physically consistent dynamics over extended simulation times.
19+
(explosions) and hydrogen atom drift. These indicators help determine whether the **MLIP** maintains
20+
stable and physically consistent dynamics over the course of the simulation.
2121

2222
Our **stability score** is computed as:
2323

@@ -36,7 +36,8 @@ the frame at which the first H atom detaches.
3636
Dataset
3737
-------
3838

39-
The structures that are tested for stability are a series of protein structures, RNA fragments, peptides and inhibitors taken from the PDB.
39+
The stability dataset is composed of a series of experimental structures of proteins, RNA fragments,
40+
peptides and small molecules taken from the `PDB <https://www.rcsb.org/>`_.
4041
They have the following ids:
4142

4243
* 1JRS (Leupeptin)

docs/source/benchmarks/index.rst

Lines changed: 1 addition & 0 deletions
@@ -16,4 +16,5 @@ implement your own benchmarks.
1616

1717
General <general/index>
1818
Small Molecules <small_molecules/index>
19+
Molecular Liquids <molecular_liquids/index>
1920
Biomolecules <biomolecules/index>
File renamed without changes.
