Commit d7489b8

committed
update docs
1 parent 904df2f commit d7489b8

File tree

2 files changed

+63
-8
lines changed

docs/simulations.rst

Lines changed: 55 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,9 @@
44
===============================
55
:code:`ensemble_md` provides three command-line interfaces (CLI), including :code:`explore_REXEE`, :code:`run_REXEE` and :code:`analyze_REXEE`.
66
:code:`explore_REXEE` helps the user to figure out possible combinations of REXEE parameters, while :code:`run_REXEE` and :code:`analyze_REXEE`
7-
can be used to perform and analyze REXEE simulations, respectively. Below we provide more details about each of these CLIs.
7+
can be used to perform and analyze REXEE simulations, respectively. Both :code:`run_REXEE` and :code:`analyze_REXEE` also support MT-REXEE
8+
simulations given the appropriate input parameters, but :code:`explore_REXEE` only works for single-topology REXEE simulations. Below we provide
9+
more details about each of these CLIs.
810

911
.. _doc_explore_REXEE:
1012

@@ -39,7 +41,7 @@ Here is the help message of :code:`run_REXEE`:
3941

4042
::
4143

42-
usage: run_REXEE [-h] [-y YAML] [-c CKPT] [-g G_VECS] [-o OUTPUT] [-m MAXWARN]
44+
usage: run_REXEE [-h] [-y YAML] [-c CKPT] [-g G_VECS] [-e EQUIL] [-o OUTPUT] [-m MAXWARN]
4345

4446
This CLI runs a REXEE simulation given necessary inputs.
4547

@@ -133,8 +135,8 @@ least needs the following four files. (Check :ref:`doc_input_files` for more det
133135
* One TOP file of the system of interest, as specified in the input YAML file.
134136
* One MDP template for customizing MDP files for different replicas, as specified in the input YAML file.
135137

136-
Note that multiple GRO/TOP files can be provided to initiate different replicas with different configurations/topologies,
137-
in which case the number of GRO/TOP files must be equal to the number of replicas.
138+
Note that multiple GRO/TOP files can be provided to initiate different replicas with different configurations/topologies
139+
(e.g., for MT-REXEE), in which case the number of GRO/TOP files must be equal to the number of replicas.
138140
Also, the MDP template should contain parameters shared by all replicas and define the coupling parameters for all
139141
intermediate states. Moreover, some MDP parameters need additional care, which we describe in
140142
:ref:`doc_mdp_params`. Lastly, to extend a REXEE simulation, one needs to additionally provide the following
@@ -197,7 +199,34 @@ the CLI :code:`run_REXEE` applies the weight combination scheme using the functi
197199
and the histogram correction scheme using the function :obj:`.histogram_correction`.
198200
For more details about correction schemes, please refer to the section :ref:`doc_correction`.
199201

200-
Step 3-4: Set up the input files for the next iteration
202+
203+
Step 3-4: Perform necessary coordinate modification to enable swap (Only for MT-REXEE)
204+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
205+
When running with multiple topologies, which may differ in the number and identity of atoms in each simulation, one must perform an additional
206+
step to generate coordinates for the missing atoms. In this step, the CLI :code:`run_REXEE` runs the function :obj:`.modify_coords_fn`.
207+
The :obj:`.modify_coords_fn` can be either the default coordinate modification function (:obj:`.default_coords_fn`) or a user-defined
208+
function specified in the input parameters. If the default coordinate modification function is used, some automated
209+
topology processing is performed to parse atom connectivity and to identify which atoms differ between each swappable pair of simulations.
210+
This analysis is performed by the :code:`.process_top` function, which does the following:
211+
- Perform atom mapping between swappable simulations. If molecule A and molecule B both contain a benzene ring, the carbons in that ring may
212+
differ in atom name between the two molecules. If, during system preparation, you ensure that the only difference in atom naming is the addition of 'D' or 'V'
213+
to designate dummy or virtual atoms, then this mapping can be performed automatically using :obj:`coordinate_swap.create_atom_map`. Otherwise, a CSV file named 'atom_name_mapping.csv' should be provided.
214+
The syntax of this file is described below, and a sample file is provided in the MT-REXEE example.
215+
- The residue connectivity is parsed from the input topology files to extract the name and number of atoms in the residue of interest that have covalent bonds.
216+
This information is later used to fix breaks across the periodic boundary. This step produces a file named 'residue_connect.csv', which can be examined to ensure
217+
that no mistakes were made and modified as desired.
218+
- Lastly, a swap map is determined for all specified pairs of potentially swappable simulations. This swap map breaks up missing atoms into distinct R groups
219+
and determines which reference atoms will be used during alignment. This step produces a human-readable file, 'residue_swap_map.csv', which can also be examined for errors
220+
and modified as desired.
221+
222+
The default coordinate modification function does the following:
223+
224+
- Extract coordinates in high precision from the trajectory files.
225+
- Fix any breaks across the periodic boundaries in the residue of interest.
226+
- Align segments of the residue for which coordinates are being reconstructed in order to obtain locations for missing dummy atoms.
227+
- Write new GRO files containing the reconstructed coordinates for mismatched dummy atoms.
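
As an illustration of the periodic-boundary step listed above, here is a minimal sketch of one common way to make a bonded residue whole. It is not the implementation used by :obj:`.default_coords_fn`. The function name ``fix_pbc_breaks``, its arguments, and the restriction to an orthorhombic box are all assumptions made for this example; the bond list stands in for the connectivity parsed from the topology.

```python
from collections import deque

import numpy as np


def fix_pbc_breaks(coords, box, bonds, root=0):
    """Make a bonded residue whole in an orthorhombic box (hypothetical helper).

    coords: (N, 3) positions, box: (3,) box lengths, bonds: pairs of atom indices.
    Assumes every atom is reachable from `root` through the bond list.
    """
    coords = np.array(coords, dtype=float)  # work on a copy
    box = np.asarray(box, dtype=float)

    # Build an adjacency list from the covalent bonds.
    neighbors = {i: [] for i in range(len(coords))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Walk the bond graph outward from the root atom. Each newly visited atom
    # is shifted by whole box lengths so that it sits at the minimum-image
    # distance from the already-placed atom it is bonded to.
    visited = {root}
    queue = deque([root])
    while queue:
        i = queue.popleft()
        for j in neighbors[i]:
            if j in visited:
                continue
            delta = coords[j] - coords[i]
            coords[j] -= box * np.round(delta / box)
            visited.add(j)
            queue.append(j)
    return coords
```

Walking the bond graph, rather than wrapping each atom independently, keeps the whole residue contiguous even when it spans more than half of the box.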
228+
229+
Step 3-5: Set up the input files for the next iteration
201230
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
202231
After the final configuration has been figured out by :obj:`.get_swapping_pattern` (and the weights/counts have been adjusted by the specified correction schemes, if any),
203232
the CLI :code:`run_REXEE` sets up input files for the next iteration. In principle, the new iteration should inherit the final
@@ -268,6 +297,7 @@ include parameters for data analysis here.
268297
For the CLI :code:`run_REXEE` to work, here is the predefined contract for the module/function based on the assumptions :code:`run_REXEE` makes.
269298
Modules/functions not obeying the contract are unlikely to work.
270299

300+
- Selecting :code:`default` will utilize the built-in modification function.
271301
- Multiple functions can be defined in the module, but the function for coordinate manipulation must have the same name as the module itself.
272302
- The function must only have two compulsory arguments, which are the two GRO files to be modified. The function must not depend on the order of the input GRO files.
273303
- The function must return :code:`None` (i.e., no return value).
@@ -461,8 +491,8 @@ MDP parameters:
461491
- If you want to explicitly specify a reference distance (:code:`d`) to use for all iterations, simply use
462492
:code:`pull_coord1_start = no` with :code:`pull_coord1_init = d` in your input MDP template.
463493

464-
5. Some rules of thumb
465-
======================
494+
5. Some rules of thumb for REXEE
495+
================================
466496
Here are some rules of thumb for specifying some key YAML parameters, as discussed/concluded from our paper [Hsu2024]_.
467497

468498
- **Number of replicas** (:code:`n_sim`): Just like other replica exchange methods, it is generally recommended that the number of replicas be
@@ -522,3 +552,21 @@ Here are some rules of thumb for specifying some key YAML parameters, as discuss
522552
enabled by YAML parameters :code:`w_combine`, :code:`N_cutoff`, and :code:`hist_corr`, respectively. To converge alchemical weights, we recommend
523553
just using weight-updating EE simulations, or weight-updating REXEE simulations without any correction schemes, i.e., using default values for
524554
these parameters.
555+
556+
6. Some rules of thumb for MT-REXEE
557+
===================================
558+
Although many choices are system-specific, below are some recommendations:
559+
560+
- **Number of parallel simulations**: This is entirely system-specific and will likely be limited by your available computational resources. We
561+
provide some tips for allocating computational resources effectively on high-performance computing systems.
562+
- **Total number of states**: The total number of states should equal the number of states per simulation multiplied by the number of simulations.
563+
- **Number of states per replica**/**State shift**: We recommend optimizing the number of states per simulation using independent expanded ensemble simulations
564+
prior to running your MT-REXEE simulation to minimize use of computational resources.
565+
- **Exchange period** (:code:`nst_sim`): MT-REXEE is most effective for simulating groups of transformations for which a high energy barrier exists
566+
for only a subset of the simulations. We therefore generally recommend choosing a longer iteration length to allow for increased intra-iteration sampling before
567+
the next swap is performed. There is no lower limit on the exchange period; however, we have found that for many systems an exchange period shorter than 10 ps begins to
568+
increase round-trip times rather than decrease them.
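
The arithmetic behind two of the recommendations above can be sketched as follows. Both helper names are hypothetical, and the default time step of 0.002 ps is an assumption; use the :code:`dt` value from your own MDP template.

```python
def total_states(n_states_per_sim, n_sim):
    """Total number of states, following the rule stated above."""
    return n_states_per_sim * n_sim


def exchange_period_steps(period_ps, dt_ps=0.002):
    """Convert an exchange period in ps to a number of MD steps.

    dt_ps is the integration time step in ps (0.002 is only an assumed default).
    """
    steps = round(period_ps / dt_ps)
    if steps < 1:
        raise ValueError("The exchange period must be at least one MD step.")
    return steps
```

For example, with a 2 fs time step, the 10 ps threshold mentioned above corresponds to ``exchange_period_steps(10)``, i.e., 5000 steps.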
569+
570+
7. Additional optional input files for MT-REXEE
571+
===============================================
572+
If you perform a MT-REXEE simulation for

docs/theory.rst

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -458,7 +458,14 @@ calculated by the estimator. In :code:`ensemble_md`, this has been implemented i
458458

459459
7. MT-REXEE Coordinate Modification
460460
===================================
461-
The MT-REXEE method requires that coordinates for the non-matching dummy atoms between two transformations must be reconstructed in order to swap coordinates.
461+
The MT-REXEE method requires that coordinates for the non-matching dummy atoms between two transformations must be reconstructed in order to swap coordinates.
462+
The default coordinate modification function provided within this package can be selected by adding :code:`modify_coords: default` to your input YAML file.
463+
Alternatively, a custom function can be provided by the user. For this explanation, we use the default function to perform a swap between a simulation
464+
that features molecule A and a simulation that features molecule B. We first determine all atoms that are present
465+
in molecule A but not in B, and vice versa. By definition, these atoms must be dummy atoms in their fully non-coupled state when the swap is performed. These
466+
missing atoms are then broken up by functional group to create several missing R groups. The alignment is then performed individually for each missing R group,
467+
as shown in Figure ?. This provides coordinates for the R groups unique to molecule B that are consistent with the structure of the common atoms in molecule A, and vice
468+
versa. This allows new GRO files to be written and the next iteration to be performed.
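
To make the alignment step concrete, the sketch below uses the Kabsch algorithm, a standard way to find the rigid-body transformation that best superimposes one set of reference atoms onto another. Whether the default function uses exactly this algorithm is not stated here, so treat it as an illustrative assumption. The function names and arguments are hypothetical, and at least three non-collinear reference atoms are assumed.

```python
import numpy as np


def kabsch_transform(mobile_ref, target_ref):
    """Return (R, t) such that R @ x + t best maps mobile_ref onto target_ref."""
    mobile_ref = np.asarray(mobile_ref, dtype=float)
    target_ref = np.asarray(target_ref, dtype=float)
    mob_center = mobile_ref.mean(axis=0)
    tgt_center = target_ref.mean(axis=0)

    # Covariance matrix of the centered reference coordinates.
    H = (mobile_ref - mob_center).T @ (target_ref - tgt_center)
    U, _, Vt = np.linalg.svd(H)

    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_center - R @ mob_center
    return R, t


def place_missing_atoms(mobile_ref, target_ref, mobile_missing):
    """Carry the missing R-group atoms along with the reference-atom alignment.

    mobile_ref / mobile_missing: reference and R-group atoms from the simulation
    that contains the R group; target_ref: the same reference atoms in the
    configuration that lacks it.
    """
    R, t = kabsch_transform(mobile_ref, target_ref)
    return np.asarray(mobile_missing, dtype=float) @ R.T + t
```

Applying this once per missing R group, with reference atoms taken from the atoms common to both molecules, yields coordinates for the atoms unique to one molecule in the frame of the other.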
462469

463470
.. figure:: _static/explain_swap_method.png
464471
:name: Fig. 3
