|
4 | 4 | =============================== |
5 | 5 | :code:`ensemble_md` provides three command-line interfaces (CLIs): :code:`explore_REXEE`, :code:`run_REXEE`, and :code:`analyze_REXEE`. |
6 | 6 | :code:`explore_REXEE` helps the user to figure out possible combinations of REXEE parameters, while :code:`run_REXEE` and :code:`analyze_REXEE` |
7 | | -can be used to perform and analyze REXEE simulations, respectively. Below we provide more details about each of these CLIs. |
| 7 | +can be used to perform and analyze REXEE simulations, respectively. Both :code:`run_REXEE` and :code:`analyze_REXEE` also support MT-REXEE |
| 8 | +simulations given the appropriate input parameters, whereas :code:`explore_REXEE` only works for single-topology REXEE simulations. Below we provide |
| 9 | +more details about each of these CLIs. |
8 | 10 |
|
9 | 11 | .. _doc_explore_REXEE: |
10 | 12 |
|
@@ -39,7 +41,7 @@ Here is the help message of :code:`run_REXEE`: |
39 | 41 |
|
40 | 42 | :: |
41 | 43 |
|
42 | | - usage: run_REXEE [-h] [-y YAML] [-c CKPT] [-g G_VECS] [-o OUTPUT] [-m MAXWARN] |
| 44 | + usage: run_REXEE [-h] [-y YAML] [-c CKPT] [-g G_VECS] [-e EQUIL] [-o OUTPUT] [-m MAXWARN] |
43 | 45 |
|
44 | 46 | This CLI runs a REXEE simulation given necessary inputs. |
45 | 47 |
|
@@ -133,8 +135,8 @@ least needs the following four files. (Check :ref:`doc_input_files` for more det |
133 | 135 | * One TOP file of the system of interest, as specified in the input YAML file. |
134 | 136 | * One MDP template for customizing MDP files for different replicas, as specified in the input YAML file. |
135 | 137 |
|
136 | | -Note that multiple GRO/TOP files can be provided to initiate different replicas with different configurations/topologies, |
137 | | -in which case the number of GRO/TOP files must be equal to the number of replicas. |
| 138 | +Note that multiple GRO/TOP files can be provided to initialize different replicas with different configurations/topologies |
| 139 | +(as in MT-REXEE), in which case the number of GRO/TOP files must be equal to the number of replicas. |
138 | 140 | Also, the MDP template should contain parameters shared by all replicas and define the coupling parameters for all |
139 | 141 | intermediate states. Moreover, additional care needs to be taken when specifying some MDP parameters, which we describe in |
140 | 142 | :ref:`doc_mdp_params`. Lastly, to extend a REXEE simulation, one needs to additionally provide the following |
@@ -197,7 +199,34 @@ the CLI :code:`run_REXEE` applies the weight combination scheme using the functi |
197 | 199 | and the histogram correction scheme using the function :obj:`.histogram_correction`. |
198 | 200 | For more details about correction schemes, please refer to the section :ref:`doc_correction`. |
199 | 201 |
|
200 | | -Step 3-4: Set up the input files for the next iteration |
| 202 | + |
| 203 | +Step 3-4: Perform necessary coordinate modification to enable swap (Only for MT-REXEE) |
| 204 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 205 | +When running with multiple topologies, the simulations may differ in the number and identity of atoms, so an additional step is needed |
| 206 | +to generate coordinates for the missing atoms. In this step, the CLI :code:`run_REXEE` runs the function :obj:`.modify_coords_fn`, |
| 207 | +which can either be the default coordinate modification function (:obj:`.default_coords_fn`) or a user-defined function specified in the input parameters. |
| 208 | +If the default function is used, the topologies are first processed automatically to parse the atom connectivity and to identify which atoms differ |
| 209 | +between each swappable pair of simulations. This analysis is performed by the function :obj:`.process_top`, which does the following: |
| 210 | + |
| 211 | + - Perform atom mapping between swappable simulations. If molecule A and molecule B both contain a benzene ring, the carbons in that ring may |
| 212 | +   differ in atom name between the two molecules. If, during system preparation, you ensure that the only difference in atom naming is the addition of 'D' or 'V' |
| 213 | +   to designate dummy or virtual atoms, then this mapping can be performed automatically using :obj:`coordinate_swap.create_atom_map`. Otherwise, a CSV file named 'atom_name_mapping.csv' should be provided. |
| 214 | +   The syntax of this file is described below, and a sample file is provided in the MT-REXEE example. |
| 215 | + - Parse the residue connectivity from the input topology files to extract the names and numbers of the atoms in the residue of interest that are covalently bonded. |
| 216 | +   This information is later used to fix breaks across the periodic boundary. This step produces a file named 'residue_connect.csv', which can be examined to ensure |
| 217 | +   that no mistakes were made and modified as desired. |
| 218 | + - Determine a swap map for swaps between all simulations specified as potentially swappable. This swap map breaks up the missing atoms into distinct R groups |
| 219 | +   and determines which reference atoms will be used during alignment. This step produces a human-readable file 'residue_swap_map.csv', which can also be examined for errors |
| 220 | +   and modified as desired. |
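To make the naming convention in the first item concrete, here is a minimal sketch of how atom names could be paired once the dummy/virtual markers are ignored. It is not the implementation of :obj:`coordinate_swap.create_atom_map`: the helper names :code:`strip_marker` and :code:`sketch_atom_map` are hypothetical, and the sketch assumes that the 'D' or 'V' marker is added as a prefix to the atom name, which the text above does not specify.

```python
def strip_marker(name, markers=("D", "V")):
    """Drop one leading dummy/virtual marker (ASSUMPTION: the marker is a prefix)."""
    if len(name) > 1 and name[0] in markers:
        return name[1:]
    return name


def sketch_atom_map(names_a, names_b):
    """Pair atom names of molecule A with those of molecule B.

    Returns (mapping, only_in_a, only_in_b), where mapping is a dict from
    names in A to names in B that agree once the markers are removed.
    """
    # Index the atoms of B by their marker-free name.
    base_to_b = {strip_marker(n): n for n in names_b}

    mapping = {}
    for name in names_a:
        partner = base_to_b.get(strip_marker(name))
        if partner is not None:
            mapping[name] = partner

    # Atoms without a partner are the ones that need reconstructed coordinates.
    only_in_a = [n for n in names_a if n not in mapping]
    mapped_b = set(mapping.values())
    only_in_b = [n for n in names_b if n not in mapped_b]
    return mapping, only_in_a, only_in_b


# Hypothetical atom names for two molecules sharing part of a scaffold.
mapping, only_in_a, only_in_b = sketch_atom_map(
    ["C1", "C2", "H1", "DH2"],
    ["C1", "C2", "DH1", "H2", "DC3"],
)
```

If the names in your topologies do not follow such a convention, the sketch has nothing to match on, which is the case where 'atom_name_mapping.csv' is needed.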
| 221 | + |
| 222 | +The default coordinate modification function does the following: |
| 223 | + |
| 224 | + - Extract coordinates in high precision from the trajectory files. |
| 225 | + - Fix any breaks across the periodic boundaries in the residue of interest. |
| 226 | + - Align segments of the residue for which coordinates are being reconstructed in order to obtain locations for missing dummy atoms. |
| 227 | + - Write new GRO files containing the reconstructed coordinates for the mismatched dummy atoms. |
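The second step above (fixing breaks across the periodic boundaries) can be illustrated with a short, self-contained sketch. It is not the code used by :obj:`.default_coords_fn`: the function name :code:`make_whole` and its arguments are hypothetical, a rectangular box is assumed, and the bond list stands in for the connectivity stored in 'residue_connect.csv'.

```python
from collections import deque


def make_whole(coords, bonds, box):
    """Return a copy of coords in which no bond crosses a periodic boundary.

    coords: dict mapping atom name -> (x, y, z) in nm
    bonds:  list of (atom_a, atom_b) pairs that are covalently bonded
    box:    (lx, ly, lz) edge lengths of a rectangular box in nm (ASSUMPTION)
    """
    # Build an adjacency list from the bond list.
    neighbors = {name: [] for name in coords}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    fixed = {}
    for start in coords:  # the outer loop handles disconnected fragments
        if start in fixed:
            continue
        fixed[start] = tuple(coords[start])
        queue = deque([start])
        while queue:
            ref = queue.popleft()
            for nb in neighbors[ref]:
                if nb in fixed:
                    continue
                # Shift the neighbor by whole box lengths so that it becomes
                # the periodic image closest to the already-placed atom.
                fixed[nb] = tuple(
                    x - length * round((x - x_ref) / length)
                    for x, x_ref, length in zip(coords[nb], fixed[ref], box)
                )
                queue.append(nb)
    return fixed


# Two bonded atoms on opposite sides of a 3 nm box along x.
whole = make_whole(
    {"C1": (0.05, 1.0, 1.0), "C2": (2.95, 1.0, 1.0)},
    [("C1", "C2")],
    (3.0, 3.0, 3.0),
)
```

In this example, C2 is moved to x of about -0.05 nm so that it sits next to C1 instead of on the far side of the box. Walking the bond graph, rather than wrapping each atom independently, keeps the whole residue contiguous before any alignment is attempted.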
| 228 | + |
| 229 | +Step 3-5: Set up the input files for the next iteration |
201 | 230 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
202 | 231 | After the final configuration has been figured out by :obj:`.get_swapping_pattern` (and the weights/counts have been adjusted by the specified correction schemes, if any), |
203 | 232 | the CLI :code:`run_REXEE` sets up input files for the next iteration. In principle, the new iteration should inherit the final |
@@ -268,6 +297,7 @@ include parameters for data analysis here. |
268 | 297 | For the CLI :code:`run_REXEE` to work, here is the predefined contract for the module/function based on the assumptions :code:`run_REXEE` makes. |
269 | 298 | Modules/functions not obeying the contract are unlikely to work. |
270 | 299 |
|
| 300 | + - Selecting :code:`default` will utilize the built-in modification function. |
271 | 301 | - Multiple functions can be defined in the module, but the function for coordinate manipulation must have the same name as the module itself. |
272 | 302 | - The function must only have two compulsory arguments, which are the two GRO files to be modified. The function must not depend on the order of the input GRO files. |
273 | 303 | - The function must return :code:`None` (i.e., no return value). |
@@ -461,8 +491,8 @@ MDP parameters: |
461 | 491 | - If you want to explicitly specify a reference distance (:code:`d`) to use for all iterations, simply use |
462 | 492 | :code:`pull_coord1_start = no` with :code:`pull_coord1_init = d` in your input MDP template. |
463 | 493 |
|
464 | | -5. Some rules of thumb |
465 | | -====================== |
| 494 | +5. Some rules of thumb for REXEE |
| 495 | +================================ |
466 | 496 | Here are some rules of thumb for specifying some key YAML parameters, as discussed/concluded from our paper [Hsu2024]_. |
467 | 497 |
|
468 | 498 | - **Number of replicas** (:code:`n_sim`): Just like other replica exchange methods, it is generally recommended that the number of replicas be |
@@ -522,3 +552,21 @@ Here are some rules of thumb for specifying some key YAML parameters, as discuss |
522 | 552 | enabled by YAML parameters :code:`w_combine`, :code:`N_cutoff`, and :code:`hist_corr`, respectively. To converge alchemical weights, we recommend |
523 | 553 | just using weight-updating EE simulations, or weight-updating REXEE simulations without any correction schemes, i.e., using default values for |
524 | 554 | these parameters. |
| 555 | + |
| 556 | +6. Some rules of thumb for MT-REXEE |
| 557 | +=================================== |
| 558 | +Although many choices are system-specific, here are some recommendations: |
| 559 | + |
| 560 | + - **Number of parallel simulations**: This is entirely system-specific and will likely be limited by your available computational resources. We |
| 561 | +   provide some tips for allocating computational resources effectively on high-performance computing systems. |
| 562 | + - **Total number of states**: The total number of states should equal the number of states per simulation multiplied by the number of simulations. |
| 563 | + - **Number of states per replica**/**State shift**: We recommend optimizing the number of states per simulation using independent expanded ensemble simulations |
| 564 | + prior to running your MT-REXEE simulation to minimize use of computational resources. |
| 565 | + - **Exchange period** (:code:`nst_sim`): MT-REXEE is most effective for simulating groups of transformations in which a high energy barrier exists |
| 566 | +   for only a subset of the simulations. We therefore generally recommend choosing a longer iteration length to allow for increased intra-iteration sampling before |
| 567 | +   the next swap is performed. There is no strict lower limit on the exchange period; however, we have found that for many systems an exchange period shorter than 10 ps begins to |
| 568 | +   increase round-trip times rather than decrease them. |
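As a quick sanity check of the state-count recommendation above, the arithmetic can be written as a small helper. This is only an illustrative sketch: the function :code:`check_total_states` and the argument names :code:`n_sub` and :code:`n_tot` are hypothetical and are not taken from the package; only :code:`n_sim` appears in the text above.

```python
def check_total_states(n_sim, n_sub, n_tot):
    """Check that n_tot equals (states per simulation) x (number of simulations).

    n_sim: number of parallel simulations
    n_sub: number of states per simulation (hypothetical name)
    n_tot: total number of states (hypothetical name)
    """
    expected = n_sim * n_sub
    if n_tot != expected:
        raise ValueError(
            f"Expected {expected} states in total for {n_sim} simulations "
            f"with {n_sub} states each, but got {n_tot}."
        )
    return expected


# Four simulations with ten states each should span forty states in total.
n_states = check_total_states(n_sim=4, n_sub=10, n_tot=40)
```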
| 569 | + |
| 570 | +7. Additional optional input files for MT-REXEE |
| 571 | +=============================================== |
| 572 | +If you perform a MT-REXEE simulation for |