Commit 239776a

Merge pull request #9 from wehs7661/remove_gmxapi
Remove the use of gmxapi
2 parents: c71ad05 + 156d70b (merge commit 239776a)

File tree

12 files changed: +222 additions, -372 deletions


.circleci/config.yml

Lines changed: 2 additions & 2 deletions
@@ -48,12 +48,12 @@ jobs:
       - run:
           name: Install the ensemble_md package
           command: |
-            export gmxapi_ROOT=$HOME/pkgs  # set the environment variable so gmxapi can be installed successfully
-            python3 -m pip install '.[gmxapi]'
+            python3 -m pip install .
 
       - run:
           name: Run unit tests
           command: |
+            source $HOME/pkgs/bin/GMXRC
             pip3 install pytest
             pip3 install pytest-cov
             pytest -vv --disable-pytest-warnings --cov=ensemble_md --cov-report=xml --color=yes ensemble_md/tests/

.codecov.yml

Lines changed: 0 additions & 15 deletions
This file was deleted.

.lgtm.yml

Lines changed: 0 additions & 12 deletions
This file was deleted.

docs/conf.py

Lines changed: 1 addition & 1 deletion
@@ -176,4 +176,4 @@
 # autoclass_content = 'both'
 autodoc_member_order = 'bysource'
 napoleon_attr_annotations = True
-autodoc_mock_imports = ["mpi4py", "gmxapi"]
+autodoc_mock_imports = ["mpi4py"]  # we originally included gmxapi in older versions of ensemble_md

docs/getting_started.rst

Lines changed: 3 additions & 14 deletions
@@ -4,21 +4,19 @@
 running, and analyzing GROMACS simulation ensembles. The current implementation is
 mainly for synchronous ensemble of expanded ensemble (EEXE), but we will develop
 methods like asynchronous EEXE, or ensemble of alchemical metadynamics in the future.
-In the current implementation, `gmxapi`_, which is a higher-level Python API of GROMACS,
+In the current implementation, the module :code:`subprocess`
 is used to launch GROMACS commands, but we will switch to `SCALE-MS`_ for this purpose
 in the future when possible.
 
 
-.. _`gmxapi`: https://manual.gromacs.org/current/gmxapi/
 .. _`SCALE-MS`: https://scale-ms.readthedocs.io/en/latest/
 
 
 2. Installation
 ===============
 2.1. Requirements
 -----------------
-Before installing :code:`ensemble_md`, one should have working versions of `GROMACS`_
-and `gmxapi`_. Please refer to the linked documentations for full installation instructions.
+Before installing :code:`ensemble_md`, one should have a working version of `GROMACS`_. Please refer to the linked documentation for full installation instructions.
 All the other pip-installable dependencies of :code:`ensemble_md` (specified in :code:`setup.py` of the package)
 will be automatically installed during the installation of the package.

@@ -31,14 +29,6 @@ will be automatically installed during the installation of the package.
 
     pip install ensemble-md
 
-By default, the command above does not install :code:`gmxapi`, so one needs to either
-follow the full installation instructions of :code:`gmxapi`, or install
-:code:`gmxapi` along with the package (after sourcing the GROMACS executable, e.g.
-:code:`/usr/local/gromacs/bin/GMXRC`) with the following command:
-::
-
-    pip install ensemble-md[gmxapi]
-
 2.3. Installation from source
 -----------------------------
 One can also install :code:`ensemble_md` from the source code, which is available in our

@@ -49,8 +39,7 @@ One can also install :code:`ensemble_md` from the source code, which is availabl
 
     cd ensemble_md/
     pip install .
 
-To install the package along with :code:`gmxapi`, replace the last command with
-:code:`pip install '.[gmxapi]'`. If you are interested in contributing to the project, append the
+If you are interested in contributing to the project, append the
 last command with the flag :code:`-e` to install the project in the editable mode
 so that changes you make in the source code will take effect without re-installation of the package.
 (Pull requests to the project repository are welcome!)
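With gmxapi removed, the only external requirement left by the instructions above is a working GROMACS build. A minimal pre-flight sketch for checking that a GROMACS executable is on the PATH before running anything (this helper is purely illustrative and is not part of ensemble_md):

```python
import shutil


def find_executable(candidates):
    """Return the first command name from `candidates` found on PATH, or None.

    A hypothetical pre-flight check: ensemble_md itself does not ship this
    helper; it simply assumes a working GROMACS installation is available.
    """
    for name in candidates:
        # shutil.which returns the full path of the command, or None
        if shutil.which(name) is not None:
            return name
    return None


# Typical GROMACS executable names, in order of preference
gmx = find_executable(["gmx_mpi", "gmx"])
```

If `find_executable` returns `None`, sourcing `GMXRC` (e.g. `/usr/local/gromacs/bin/GMXRC`) in the shell before launching usually fixes the problem.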

docs/simulations.rst

Lines changed: 44 additions & 19 deletions
@@ -4,7 +4,11 @@
 ===============================
 :code:`ensemble_md` provides three command-line interfaces (CLI), including :code:`explore_EEXE`, :code:`run_EEXE` and :code:`analyze_EEXE`.
 :code:`explore_EEXE` helps the user to figure out possible combinations of EEXE parameters, while :code:`run_EEXE` and :code:`analyze_EEXE`
-can be used to perform and analyze EEXE simulations, respectively. Here is the help message of :code:`explore_EEXE`:
+can be used to perform and analyze EEXE simulations, respectively. Below we provide more details about each of these CLIs.
+
+1.1. CLI `explore_EEXE`
+-----------------------
+Here is the help message of :code:`explore_EEXE`:
 
 ::
 
@@ -25,7 +29,9 @@ can be used to perform and analyze EEXE simulations, respectively.
     replicas.
 
 
-And here is the help message of :code:`run_EEXE`:
+1.2. CLI `run_EEXE`
+-------------------
+Here is the help message of :code:`run_EEXE`:
 
 ::
 
@@ -52,6 +58,18 @@ And here is the help message of :code:`run_EEXE`:
         The maximum number of warnings in parameter specification to be
         ignored.
 
+In our current implementation, it is assumed that all replicas of an EEXE simulation are performed in
+parallel using MPI. Naturally, performing an EEXE simulation using :code:`run_EEXE` requires a command-line interface
+to launch MPI processes, such as :code:`mpirun` or :code:`mpiexec`. For example, on a 128-core node
+in a cluster, one may use :code:`mpirun -np 4 run_EEXE` (or :code:`mpiexec -n 4 run_EEXE`) to run an EEXE simulation composed of 4
+replicas with 4 MPI processes. Note that in this case, it is often recommended to explicitly specify
+more details about the resources allocated to each replica. For example, one can specify :code:`{'-nt': 32}`
+for the EEXE parameter :code:`runtime_args` (specified in the input YAML file, see :ref:`doc_EEXE_parameters`),
+so that each of the 4 replicas will use 32 threads (assuming thread-MPI GROMACS), taking full advantage
+of the 128 cores.
+
+1.3. CLI `analyze_EEXE`
+-----------------------
 Finally, here is the help message of :code:`analyze_EEXE`:
 
 ::

@@ -119,11 +137,9 @@ other during the simulation ensemble. Check :ref:`doc_parameters` for more detai
 
 Step 2: Run the 1st iteration
 -----------------------------
-With all the input files/parameters set up in the previous run, one can use :obj:`.run_EEXE` to run the
-first iteration. Specifically, :obj:`.run_EEXE` uses :code:`gmxapi.commandline_operation` to launch a GROMACS
-:code:`grompp` command to generate the input MDP file. Then, if :code:`parallel` is specified as :code:`True`
-in the input YAML file, :code:`gmxapi.mdrun` will be used to run GROMACS :code:`mdrun` commands in parallel,
-otherwise :code:`gmxapi.commandline_operation` will be used to run simulations serially.
+With all the input files/parameters set up in the previous step, one can run the first iteration
+using :obj:`.run_EEXE`, which uses :code:`subprocess.run` to launch GROMACS :code:`grompp`
+and :code:`mdrun` commands in parallel.
 
 Step 3: Set up the new iteration
 --------------------------------

@@ -194,7 +210,15 @@ In the current implementation of the algorithm, 22 parameters can be specified i
 Note that the two CLIs :code:`run_EEXE` and :code:`analyze_EEXE` share the same input YAML file, so we also
 include parameters for data analysis here.
 
-3.1. Simulation inputs
+3.1. GROMACS executable
+-----------------------
+
+- :code:`gmx_executable`: (Required)
+    The GROMACS executable to be used to run the EEXE simulation. The value could be as simple as :code:`gmx`
+    or :code:`gmx_mpi` if the executable has been sourced. Otherwise, use the full path of the executable (e.g.
+    :code:`/usr/local/gromacs/bin/gmx`, the path returned by the command :code:`which gmx`).
+
+3.2. Simulation inputs
 ----------------------
 
 - :code:`gro`: (Required)

@@ -204,11 +228,11 @@ include parameters for data analysis here.
 - :code:`mdp`: (Required)
     The MDP template that has the whole range of :math:`λ` values.
 
-3.2. EEXE parameters
+.. _doc_EEXE_parameters:
+
+3.3. EEXE parameters
 --------------------
 
-- :code:`parallel`: (Required)
-    Whether the replicas of EEXE should be run in parallel or not.
 - :code:`n_sim`: (Required)
     The number of replica simulations.
 - :code:`n_iter`: (Required)

@@ -241,7 +265,7 @@ include parameters for data analysis here.
     Additional runtime arguments to be appended to the GROMACS :code:`mdrun` command, provided in a dictionary.
     For example, one could have :code:`{'-nt': 16}` to run the simulation using 16 threads.
 
-3.3. Output settings
+3.4. Output settings
 --------------------
 - :code:`verbose`: (Optional, Default: :code:`True`)
     Whether a verbose log is wanted.

@@ -250,7 +274,7 @@ include parameters for data analysis here.
 
 .. _doc_analysis_params:
 
-3.4. Data analysis
+3.5. Data analysis
 ------------------
 - :code:`msm`: (Optional, Default: :code:`False`)
     Whether to build Markov state models (MSMs) for the EEXE simulation and perform relevant analysis.

@@ -271,20 +295,21 @@ include parameters for data analysis here.
 - :code:`seed`: (Optional, Default: None)
     The random seed to use in bootstrapping.
 
-3.5. A template input YAML file
+3.6. A template input YAML file
 -------------------------------
 For convenience, here is a template of the input YAML file, with each optional parameter specified with its default and
 required parameters left blank. Note that specifying :code:`null` is the same as leaving the parameter unspecified (i.e. :code:`None`).
 
 ::
+
+    # Section 1: GROMACS executable
+    gmx_executable:
 
-    # Section 1: Simulation inputs
+    # Section 2: Simulation inputs
     gro:
     top:
     mdp:
 
-    # Section 2: EEXE parameters
-    parallel:
+    # Section 3: EEXE parameters
     n_sim:
     n_iter:
     s:

@@ -297,11 +322,11 @@ parameters left blank. Note that specifying :code:`null` is the same as l
     grompp_args: null
     runtime_args: null
 
-    # Section 3: Output settings
+    # Section 4: Output settings
     verbose: True
     n_ckpt: 100
 
-    # Section 4: Data analysis
+    # Section 5: Data analysis
     msm: False
     free_energy: False
     df_spacing: 1
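The switch from gmxapi to :code:`subprocess` described in the documentation changes above can be illustrated with a short sketch. The command layout, file names, and the helper itself are illustrative assumptions, not the exact calls made by ensemble_md:

```python
import subprocess


def launch_grompp(gmx_executable, sim_idx, iteration, extra_args=None):
    """Build and run a GROMACS grompp command for one replica.

    Hypothetical helper: paths and flags here are illustrative; the real
    CLI derives them from the input YAML file (gmx_executable, grompp_args, etc.).
    """
    workdir = f"sim_{sim_idx}/iteration_{iteration}"
    cmd = [
        gmx_executable, "grompp",
        "-f", f"{workdir}/expanded.mdp",
        "-c", f"{workdir}/sys.gro",
        "-p", f"{workdir}/sys.top",
        "-o", f"{workdir}/sys_EE.tpr",
    ]
    # Append user-specified flags, e.g. {'-maxwarn': 1}
    for flag, value in (extra_args or {}).items():
        cmd += [flag, str(value)]
    # capture_output=True lets the caller inspect stdout/stderr on failure
    return subprocess.run(cmd, capture_output=True, text=True)
```

Substituting a real :code:`gmx` path for the executable would run GROMACS; an analogous call with :code:`mdrun` and the YAML's :code:`runtime_args` would launch the simulation itself.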

ensemble_md/cli/run_EEXE.py

Lines changed: 19 additions & 42 deletions
@@ -9,7 +9,6 @@
 ####################################################################
 import os
 import sys
-import glob
 import time
 import copy
 import shutil

@@ -90,6 +89,8 @@ def main():
 
     # Step 2: If there is no checkpoint file found/provided, perform the 1st iteration (index 0)
     if os.path.isfile(args.ckpt) is False:
+        start_idx = 1
+
         # 2-1. Set up input files for all simulations with 1 rank
         if rank == 0:
             for i in range(EEXE.n_sim):

@@ -99,46 +100,29 @@ def main():
                 MDP.write(f"sim_{i}/iteration_0/{EEXE.mdp.split('/')[-1]}", skipempty=True)
 
         # 2-2. Run the first ensemble of simulations
-        md = EEXE.run_EEXE(0)
+        EEXE.run_EEXE(0)
 
-        # 2-3. Restructure the directory (move the files from mdrun_0_i0_* to sim_*/iteration_0)
-        if rank == 0:
-            work_dir = md.output.directory.result()
-            for i in range(EEXE.n_sim):
-                if EEXE.verbose is True:
-                    print(f'  Moving files from {work_dir[i].split("/")[-1]}/ to sim_{i}/iteration_0/ ...')
-                    print(f'  Removing the empty folder {work_dir[i].split("/")[-1]} ...')
-                for f in glob.glob(f'{work_dir[i]}/*'):
-                    shutil.move(f, f'sim_{i}/iteration_0/')
-                os.rmdir(work_dir[i])
-        start_idx = 1
     else:
         if rank == 0:
             # If there is a checkpoint file, we see the execution as an extension of an EEXE simulation
             ckpt_data = np.load(args.ckpt)
-            start_idx = len(ckpt_data[0])
+            start_idx = len(ckpt_data[0])  # The length should be the same for the same axis
             print(f'\nGetting prepared to extend the EEXE simulation from iteration {start_idx} ...')
 
-            print('Deleting corrupted data ...')
-            corrupted = glob.glob('gmxapi.commandline.cli*')   # corrupted iteration
-            corrupted.extend(glob.glob('mdrun*'))
-            for i in corrupted:
-                shutil.rmtree(i)
-            if len(corrupted) == 0:
-                corrupt_bool = False
-
-            for i in range(EEXE.n_sim):
-                n_finished = len(next(os.walk(f'sim_{i}'))[1])  # number of finished iterations (the last might be initialized but corrupted though)  # noqa: E501
-                if n_finished == EEXE.n_iter and corrupt_bool is False:
-                    print('Extension aborted: The expected number of iterations have been completed!')
-                    sys.exit()
-                else:
-                    print('Deleting data generated after the checkpoint ...')
+            if start_idx == EEXE.n_iter:
+                print('Extension aborted: The expected number of iterations have been completed!')
+                sys.exit()
+            else:
+                print('Deleting data generated after the checkpoint ...')
+                for i in range(EEXE.n_sim):
+                    n_finished = len(next(os.walk(f'sim_{i}'))[1])  # number of finished iterations
                     for j in range(start_idx, n_finished):
                         print(f'  Deleting the folder sim_{i}/iteration_{j}')
                         shutil.rmtree(f'sim_{i}/iteration_{j}')
 
             # Read g_vecs.npy and rep_trajs.npy so that new data can be appended, if any.
+            # Note that these two arrays are created in rank 0 and should always be operated in rank 0,
+            # or broadcasting is required.
             EEXE.rep_trajs = [list(i) for i in ckpt_data]
             if os.path.isfile(args.g_vecs) is True:
                 EEXE.g_vecs = [list(i) for i in np.load(args.g_vecs)]

@@ -209,7 +193,9 @@ def main():
                     MDP.write(f"sim_{j}/iteration_{i}/{EEXE.mdp.split('/')[-1]}", skipempty=True)
                 # In run_EEXE(i, swap_pattern), where the tpr files will be generated, we use the top file at the
                 # level of the simulation (the file that will be shared by all simulations). For the gro file, we pass
-                # swap_patter to the function to figure it out internally.
+                # swap_pattern to the function to figure it out internally.
+        else:
+            swap_pattern = None
 
         if -1 not in EEXE.equil and 0 not in EEXE.equil:
             # This is the case where the weights are equilibrated in a weight-updating simulation.

@@ -220,20 +206,11 @@ def main():
 
         # Step 4: Perform another iteration
        # 4-1. Run another ensemble of simulations
-        md = EEXE.run_EEXE(i, swap_pattern)
+        swap_pattern = comm.bcast(swap_pattern, root=0)
+        EEXE.run_EEXE(i, swap_pattern)
 
         if rank == 0:
-            # 4-2. Restructure the directory (move the files from mdrun_{i}_i0_* to sim_*/iteration_{i})
-            work_dir = md.output.directory.result()
-            for j in range(EEXE.n_sim):
-                if EEXE.verbose is True:
-                    print(f'  Moving files from {work_dir[j].split("/")[-1]}/ to sim_{j}/iteration_{i}/ ...')
-                    print(f'  Removing the empty folder {work_dir[j].split("/")[-1]} ...')
-                for f in glob.glob(f'{work_dir[j]}/*'):
-                    shutil.move(f, f'sim_{j}/iteration_{i}/')
-                os.rmdir(work_dir[j])
-
-            # 4-3. Save data
+            # 4-2. Save data
             if (i + 1) % EEXE.n_ckpt == 0:
                 if len(EEXE.g_vecs) != 0:
                     # Save g_vec as a function of time if weight combination was used.
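The simplified resume logic in the new run_EEXE.py (the checkpoint length alone now determines the next iteration index, replacing the gmxapi directory-cleanup heuristics) can be sketched as a standalone function. This mirrors the control flow in the diff above but is not the actual CLI code:

```python
def next_iteration_index(ckpt_data, n_iter):
    """Decide where to resume an EEXE run from checkpoint data.

    ckpt_data: a list of per-replica state trajectories (one list per replica).
    Every replica has recorded the same number of completed iterations, so the
    length of the first entry is the index of the next iteration to run.
    Raises RuntimeError when the planned number of iterations is already done
    (the CLI calls sys.exit() instead).
    """
    start_idx = len(ckpt_data[0])  # same length along every axis
    if start_idx == n_iter:
        raise RuntimeError('Extension aborted: The expected number of '
                           'iterations have been completed!')
    return start_idx
```

In the CLI this value is computed on rank 0 only; anything derived from it that other MPI ranks need (such as the swap pattern) is then broadcast with `comm.bcast(..., root=0)`, matching the comment in the diff about rank-0-only arrays.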
