Commit 22d121f

Merge pull request #1476 from haddocking/1475-automated-toppar-generation-for-unknown-ligands
add automated toppar generation for unknown small molecules
2 parents 7c05ea8 + 33de7e0 commit 22d121f

23 files changed, 793 additions and 83 deletions

.github/workflows/ci.yml

Lines changed: 64 additions & 20 deletions
@@ -17,36 +17,26 @@ on:
     - cron: "0 */6 * * *" # every 6 hours
 
 jobs:
-  ci:
-    runs-on: ${{ matrix.os }}
+  # Full suite on Linux + Python 3.14: unit tests, coverage, integration, e2e, docs.
+  test-linux-py314:
+    runs-on: ubuntu-latest
     permissions:
       contents: read
       actions: read
       checks: write
-    strategy:
-      matrix:
-        os: [ubuntu-latest, macos-latest]
-        python-version: ["3.9", "3.10", "3.11", "3.12", "3.13", "3.14"]
-      fail-fast: false
 
     steps:
       - uses: actions/checkout@v4
 
       - uses: actions/setup-python@v5
         with:
-          python-version: ${{ matrix.python-version }}
+          python-version: "3.14"
 
-      - name: install system dependencies (Linux)
-        if: runner.os == 'Linux'
+      - name: install system dependencies
         run: |
           sudo apt-get update
           sudo apt-get install -y openmpi-bin libopenmpi3 libopenmpi-dev
 
-      - name: install system dependencies (macOS)
-        if: runner.os == 'macOS'
-        run: |
-          brew install open-mpi
-
       - name: install haddock3 with extra dependencies
         run: pip install -e '.[mpi,dev,docs,notebooks]'
 
@@ -55,23 +45,77 @@ jobs:
           pytest -v --random-order tests/
           --cov --cov-report=
 
-      - name: run integration tests
-        run: >-
-          pytest -v --random-order integration_tests/
-
       - name: generate coverage report
         run: |
           coverage report
           coverage xml
 
       - uses: codacy/codacy-coverage-reporter-action@v1
-        if: matrix.python-version == '3.14'
         with:
           project-token: ${{ secrets.CODACY_PROJECT_TOKEN }}
           coverage-reports: ./coverage.xml
 
+      - name: run integration tests
+        run: >-
+          pytest -v --random-order integration_tests/
+
+      - name: run end to end tests
+        run: >-
+          pytest -v --random-order --no-cov end-to-end_tests/
+
       - name: check if docs are buildable
         continue-on-error: true
         run: |
           sphinx-apidoc -f -e -o docs/ src/haddock -d 1
           sphinx-build docs haddock3-docs
+
+  # Run the remaining OSxPython combinations in parallel
+  test-matrix:
+    runs-on: ${{ matrix.os }}
+    permissions:
+      contents: read
+      actions: read
+      checks: write
+    strategy:
+      matrix:
+        # ubuntu-latest = linux-x86_64
+        # ubuntu-24.04-arm = linux-aarch64
+        # macos-latest = macos-arm64
+        os: [ubuntu-latest, ubuntu-24.04-arm, macos-latest]
+        # Test only newest and oldest python version
+        # > versions in the middle are unlikely to have compatibility issues
+        python-version: ["3.9", "3.14"]
+        exclude:
+          # Already covered by test-linux-py314, don't run again
+          - os: ubuntu-latest
+            python-version: "3.14"
+      fail-fast: false
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: install system dependencies (Linux)
+        if: runner.os == 'Linux'
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y openmpi-bin libopenmpi3 libopenmpi-dev
+
+      - name: install system dependencies (macOS)
+        if: runner.os == 'macOS'
+        run: |
+          brew install open-mpi
+
+      - name: install haddock3 with extra dependencies
+        run: pip install -e '.[mpi,dev,docs,notebooks]'
+
+      - name: run unit tests
+        run: >-
+          pytest -v --random-order tests/
+
+      - name: run integration tests
+        run: >-
+          pytest -v --random-order integration_tests/
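
The effect of the `exclude` rule in the `test-matrix` job can be checked with a short sketch. It assumes the usual GitHub Actions behaviour that the matrix is the Cartesian product of the axes, minus any combination matching an `exclude` entry. The helper `expand_matrix` is hypothetical and written only for illustration; it is not part of haddock3 or of GitHub Actions.

```python
from itertools import product


def expand_matrix(axes, exclude=()):
    """Return the job combinations of a matrix (hypothetical helper).

    `axes` maps an axis name to its list of values. A combination is dropped
    when every key/value pair of some `exclude` rule matches it.
    """
    keys = list(axes)
    combos = [dict(zip(keys, values)) for values in product(*axes.values())]

    def is_excluded(combo):
        return any(
            all(combo.get(key) == value for key, value in rule.items())
            for rule in exclude
        )

    return [combo for combo in combos if not is_excluded(combo)]


# Axes and exclude rule copied from the test-matrix job in the diff above.
axes = {
    "os": ["ubuntu-latest", "ubuntu-24.04-arm", "macos-latest"],
    "python-version": ["3.9", "3.14"],
}
exclude = [{"os": "ubuntu-latest", "python-version": "3.14"}]

jobs = expand_matrix(axes, exclude)
for job in jobs:
    print(job["os"], job["python-version"])
```

Under that assumption the matrix yields five jobs (3 × 2 combinations minus the excluded one), so together with `test-linux-py314` the workflow runs six jobs, compared with the twelve (2 × 6) of the previous single `ci` matrix.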

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -1,5 +1,6 @@
 # Changelog
 
+- 2026-02-24: Implement automated toppar generation for unknown atoms with PRODRG
 - 2026-02-20: Add fallback routine to use `Scheduler` if the GRID is not available
 - 2025-12-15: Added missing NGA glycan parameters - Issue #1462
 - 2025-11-25: Simplify the use of multiple ambig archives

MANIFEST.in

Lines changed: 1 addition & 0 deletions
@@ -28,4 +28,5 @@ recursive-include varia *.lua
 recursive-include varia *.md
 include src/haddock/bin/*
 include src/haddock/cns/bin/*
+include src/haddock/prodrg/*
 include src/haddock/libs/assets/*

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+"""End-to-end test for the protein-ligand docking example."""
+
+import shutil
+import tempfile
+from pathlib import Path
+
+from haddock.clis.cli import main as cli_main
+
+EXAMPLE_DIR = (
+    Path(__file__).resolve().parents[1] / "examples" / "docking-protein-ligand"
+)
+
+
+def test_protein_ligand_autotoppar_workflow(monkeypatch):
+    """Test protein-ligand docking example with automated topology generation.
+
+    Uses docking-protein-ligand-autotoppar-test.cfg to run the full pipeline
+    (topoaa, rigidbody, caprieval, seletop, flexref, caprieval, rmsdmatrix,
+    clustrmsd, seletopclusts, caprieval x2) without any user-provided ligand
+    topology or parameter files. ``autotoppar=True`` instructs topoaa to
+    invoke prodrg automatically, and the generated files are propagated to all
+    downstream modules via ``_output_params``.
+    """
+    with tempfile.TemporaryDirectory() as tmpdir:
+        shutil.copytree(Path(EXAMPLE_DIR, "data"), Path(tmpdir, "data"))
+        cfg = Path(tmpdir, "docking-protein-ligand-autotoppar-test.cfg")
+        shutil.copy(
+            Path(EXAMPLE_DIR, "docking-protein-ligand-autotoppar-test.cfg"), cfg
+        )
+
+        monkeypatch.chdir(tmpdir)
+        cli_main(cfg)
+
+        run_dir = Path("run1-autotoppar-test")
+
+        # Check if the auto-generated prodrg topology files were generated
+        autotoppar_param = Path(
+            run_dir, "00_topoaa", "oseltamivir_zwitterion_prodrg.param"
+        )
+        autotoppar_top = Path(run_dir, "00_topoaa", "oseltamivir_zwitterion_prodrg.top")
+        assert autotoppar_param.exists(), f"{autotoppar_param} was not generated"
+        assert autotoppar_top.exists(), f"{autotoppar_top} was not generated"
+
+        # Verify all workflow steps produced output directories
+        assert Path(run_dir, "00_topoaa").exists()
+        assert Path(run_dir, "01_rigidbody").exists()
+        assert Path(run_dir, "02_caprieval").exists()
+        assert Path(run_dir, "03_seletop").exists()
+        assert Path(run_dir, "04_flexref").exists()
+        assert Path(run_dir, "05_caprieval").exists()
+        assert Path(run_dir, "06_rmsdmatrix").exists()
+        assert Path(run_dir, "07_clustrmsd").exists()
+        assert Path(run_dir, "08_seletopclusts").exists()
+        assert Path(run_dir, "09_caprieval").exists()
+        assert Path(run_dir, "10_caprieval").exists()

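
The eleven directory assertions in this test follow an `NN_module` pattern: a two-digit, zero-based step index followed by the module name, in the order the modules appear in the configuration file. As a minimal sketch of that naming rule, the helpers `expected_step_dirs` and `missing_step_dirs` below are hypothetical (they are not haddock3 functions), and the two-digit padding is inferred only from the directory names asserted in the test.

```python
import tempfile
from pathlib import Path

# Module order taken from the test docstring (caprieval runs twice at the end).
WORKFLOW = [
    "topoaa", "rigidbody", "caprieval", "seletop", "flexref", "caprieval",
    "rmsdmatrix", "clustrmsd", "seletopclusts", "caprieval", "caprieval",
]


def expected_step_dirs(modules):
    """Build the `NN_module` directory names (padding width is an assumption)."""
    return [f"{index:02d}_{name}" for index, name in enumerate(modules)]


def missing_step_dirs(run_dir, modules):
    """Return the expected step directories that do not exist under run_dir."""
    return [
        step_dir
        for step_dir in expected_step_dirs(modules)
        if not Path(run_dir, step_dir).is_dir()
    ]


# Demonstration on a throwaway directory: create every step except the last.
with tempfile.TemporaryDirectory() as tmpdir:
    for step_dir in expected_step_dirs(WORKFLOW)[:-1]:
        Path(tmpdir, step_dir).mkdir()
    missing = missing_step_dirs(tmpdir, WORKFLOW)

print(missing)
```

Reporting the list of missing directories in a single assertion message would show every failed step at once, whereas a chain of separate `assert` statements stops at the first missing directory.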
Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+# ====================================================================
+# Protein-ligand docking example
+
+# directory in which the scoring will be done
+run_dir = "run1-autotoppar-test"
+
+# execution mode
+mode = "local"
+ncores = 40
+
+# molecules to be docked
+molecules = [
+    "data/neuraminidase-2BAT.pdb",
+    "data/oseltamivir_zwitterion.pdb"
+    ]
+
+# ====================================================================
+# Parameters for each stage are defined below
+# ====================================================================
+[topoaa]
+autohis = true
+delenph = false
+autotoppar = true
+
+[rigidbody]
+tolerance = 20
+ambig_fname = "data/ambig-active-rigidbody.tbl"
+sampling = 20
+w_vdw = 1.0
+
+[caprieval]
+reference_fname = 'data/target.pdb'
+
+[seletop]
+select = 5
+
+[flexref]
+tolerance = 20
+ambig_fname = "data/ambig-passive.tbl"
+mdsteps_rigid = 0
+mdsteps_cool1 = 0
+
+[caprieval]
+reference_fname = 'data/target.pdb'
+
+[rmsdmatrix]
+resdic_A = [151,152,348,276,156,292,277,222,371,246,406,179,178,227,294,224,119,118]
+resdic_B = [500]
+
+[clustrmsd]
+criterion = "maxclust"
+n_clusters = 2
+
+[seletopclusts]
+top_models = 4
+
+[caprieval]
+reference_fname = 'data/target.pdb'
+
+# Running final caprieval with allatoms parameter set to true to also
+# include the evaluation of protein side chains
+# in both the alignment process and irmsd, ilrmsd computations
+# NOTE that all ligand atoms are always considered even without this option.
+[caprieval]
+allatoms = true
+reference_fname = "data/target.pdb"
+
+# ====================================================================

integration_tests/test_alascan.py

Lines changed: 13 additions & 11 deletions
@@ -8,7 +8,7 @@
 from haddock.modules.analysis.alascan import (
     DEFAULT_CONFIG as DEFAULT_ALASCAN_CONFIG,
     HaddockModule as AlascanModule,
-)
+)
 from haddock.modules.analysis.alascan.scan import RES_CODES
 from haddock.libs.libio import read_from_yaml
 from haddock.libs.libontology import PDBFile
@@ -40,11 +40,11 @@ def alascan_module_protlig():
         shutil.copy(
             Path(GOLDEN_DATA, "ligand.top"),
             Path(alascan.path, "ligand.top"),
-        )
+        )
         shutil.copy(
             Path(GOLDEN_DATA, "ligand.param"),
             Path(alascan.path, "ligand.param"),
-        )
+        )
         # Set the parameters to point the file
         alascan.params["ligand_param_fname"] = Path(alascan.path, "ligand.param")
         alascan.params["ligand_top_fname"] = Path(alascan.path, "ligand.top")
@@ -73,7 +73,7 @@ def retrieve_models(self, individualize: bool = False):
 
     def output(self):
         return None
-
+
 
 class MockPreviousIO_single_model:
     def __init__(self, path):
@@ -158,7 +158,7 @@ def test_alascan_single_model(alascan_module, mocker):
     # check single complex csv
     df = pd.read_csv(expected_csv, sep="\t", comment="#")
     assert df.shape == (12, 15), f"{expected_csv} has wrong shape"
-
+
     # there should be several mutants saved to file
     # for each mutation in df, check that the corresponding file exists
     from haddock.libs.libalign import PROT_SIDE_CHAINS_DICT
@@ -170,7 +170,7 @@ def test_alascan_single_model(alascan_module, mocker):
         mut_file = Path(alascan_module.path, f"2oob-{mut_file_identifier}.pdb")
         assert mut_file.exists(), f"{mut_file} does not exist"
         # now let's open the file and check that the mutation is correct
-
+
         heavy_atoms = []
         with open(mut_file, "r") as f:
             for ln in f:
@@ -179,9 +179,7 @@ def test_alascan_single_model(alascan_module, mocker):
                     if not atom_name.startswith("H"):
                         heavy_atoms.append(atom_name)
         # heavy_atoms should be = PROT_SIDE_CHAINS_DICT["LYS"] (order may vary)
-        assert set(heavy_atoms) == set(
-            PROT_SIDE_CHAINS_DICT["LYS"]
-        ), (
+        assert set(heavy_atoms) == set(PROT_SIDE_CHAINS_DICT["LYS"]), (
             f"Heavy atoms for {mut_file_identifier} are not correct: {heavy_atoms}"
         )
 
@@ -196,7 +194,9 @@ def test_alascan_mutation_resiudes():
 
 def test_alascan_with_ligand_topar(alascan_module_protlig, mocker):
     """Test the use of alascan in presence of a ligand."""
-    alascan_module_protlig.previous_io = MockPreviousIO_protlig(path=alascan_module_protlig.path)
+    alascan_module_protlig.previous_io = MockPreviousIO_protlig(
+        path=alascan_module_protlig.path
+    )
     alascan_module_protlig.run()
 
     expected_csv = Path(alascan_module_protlig.path, "scan_protlig_complex_1.tsv")
@@ -206,7 +206,9 @@ def test_alascan_with_ligand_topar(alascan_module_protlig, mocker):
     assert expected_clt_csv.exists(), f"{expected_clt_csv} does not exist"
 
     # List mutated files
-    mutated_filepaths = list(Path(alascan_module_protlig.path).glob("protlig_complex_1-*.pdb"))
+    mutated_filepaths = list(
+        Path(alascan_module_protlig.path).glob("protlig_complex_1-*.pdb")
+    )
     assert len(mutated_filepaths) >= 1
 
     # Loop over files
