Skip to content

Commit 831df4c

Browse files
committed
Updated the section of weight combination in the documentation
1 parent 8043af1 commit 831df4c

File tree

4 files changed

+218
-56
lines changed

4 files changed

+218
-56
lines changed

docs/requirements.yaml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -9,4 +9,4 @@ dependencies:
99
# Pip-only installs
1010
- pip:
1111
- nbsphinx
12-
12+
- docutils=0.16

docs/theory_implementation.rst

Lines changed: 212 additions & 49 deletions
Original file line numberDiff line numberDiff line change
@@ -451,76 +451,239 @@ sampling different alchemical ranges would have different references. Therefore,
451451

452452
Weight combination
453453
==================
454+
455+
Basic idea
456+
----------
454457
As mentioned above, to leverage the stastics of the states collected from multiple replicas, we recommend
455-
combining the alchemical weights of these states across replicas to initialize the next iteration. Below
456-
we first describe how we shift weights to deal with the issue of different references of alchemical weights
457-
in GROMACS LOG files, then, we describe weight-combining methods available in our current implementation.
458+
combining the alchemical weights of these states across replicas to initialize the next iteration. Ideally,
459+
well-modified weights should facilitate the convergence of the alchemical weights in expanded ensemble, which
460+
in the limit of inifinite simulation time correspond to dimensionless free energies of the alchemical states.
461+
The modified weights also directly influence the the accpetance ratio, hence the convergence of the simulation
462+
ensemble. Potentially, there are various methods for combining weights across multiple replicas. One intuitive
463+
method to average the probabilities :math:`p_1` and :math:`p_1` that respectively correspond to weights :math:`g_1`
464+
and :math:`g_1`, i.e.
458465

459-
Weight shifting
460-
---------------
461-
In the log file of a GROMACS expanded ensemble simulation, the alchemical weight of the first alchemical intermediate state
462-
is always shifted to 0 to serve as the reference. In EEXE, different replicas have different ranges of alchemical
463-
states, i.e. different first states, hence difference references. For example, there could be 3 replicas having
464-
the following weights:
466+
.. math::
467+
g=\ln p = -\ln\left(\frac{p_1+p_2}{2}\right) = -\ln\left(\frac{\text{e}^{-g_1} + \text{e}^{-g_2}}{2}\right)
468+
469+
This exploits the fact that in expanded ensemble, the alchemical weight of a state is the dimensionless free energy
470+
of that state given an exactly flat histogram of state visitation. While this assumption of flat histograms is generally
471+
not true, espeically in cases where the free energy differen of interest is large, one can consider "correcting"
472+
the weights before combining them. (See :ref:`doc_histogram` for more details.)
473+
474+
While the method illustrated above is intuitive and easy to operate, it suffers from the issue of reference state selection.
475+
This issue comes from the fact that GROMACS always shifts the weight of the first state to 0 to make it as the reference state.
476+
Given that the first state of different replicas in EEXE are different, this essentially means that the vector of
477+
alchemical weights of different replicas have different references. Although it is possible to pick a reference
478+
state for all replicas before weight combination could solve this issue, different choices of references could lead to
479+
slightly different combined weights, hence probability ratios. As there is no real justification which state should be favored
480+
as the reference, instead of the method explained above, we implemented another method that exploits the average of "probability ratios"
481+
to circumvent the issue of reference selection.
482+
483+
Weight combinination based on probability ratios
484+
------------------------------------------------
485+
Generally, weight combination is performed after the final configurations have beeen figured out and it is just for
486+
the initialization of the MDP files for the next iteration. Now, to demonstrate the method implemented in
487+
:code:`ensemble_md` (or more specifically, :obj:`.combine_weights`, here we consider the following sets of weights
488+
as an example, with :code:`X` denoting a state not present in the alchemical range:
465489

466490
::
467491

468-
State 0 1 2 3 4 5 6 7 8 9
469-
Rep 1 0.0 2.1 4.0 3.7 4.8 6.5 X X X X
470-
Rep 2 X X 0.0 -0.4 0.7 2.3 2.8 3.9 X X
471-
Rep 3 X X X X 0.0 1.5 2.1 3.3 6.0 9.2
492+
State 0 1 2 3 4 5
493+
Rep 1 0.0 2.1 4.0 3.7 X X
494+
Rep 2 X 0.0 1.7 1.2 2.6 X
495+
Rep 3 X X 0.0 -0.4 0.9 1.9
472496

473-
Each of these replicas sample 6 alchemical states. There alchemical ranges are different but overlapping. Specifically,
474-
Replicas 1 and 2 have overlap at states 2 to 5 and replicas 2 and 3 have overlap at states 4 to 7. Notably, all
475-
three replicas sampled states 4 and 5. Therefore, what we are going to do is
497+
As shown above, the three replicas sample different but overlapping states. Now, our goal
498+
is to
476499

477-
* For states 2 and 3, combine weights across replicas 1 and 2.
478-
* For states 4 and 5, combine weights across replicas 1, 2 and 3.
479-
* For states 6 and 7, combine weights across replicas 2 and 3.
500+
* For state 1, combine the weights arcoss replicas 1 and 2.
501+
* For states 2 and 3, combine the weights across all three replicas.
502+
* For state 4, combine the weights across replicas 1 and 2.
480503

481-
However, before we combine the weights, we should make sure the weights of all replicas have the same reference because now
482-
the references of the 3 replicas are states 0, 2, and 4, respectively. Therefore could be
504+
That is, we combine weights arcoss all replicas that sample the state of interest regardless
505+
which replicas are swapping. The outcome of the whole process should be three vectors of modified
506+
alchemical weights, one for each replica, that should be specified in the MDP files for the next iteration.
507+
Below we elaborate the details of each step carried out by our method.
483508

509+
Step 1: Convert the weights into probabilities
510+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
511+
For weight :math:`g_ij` that corresponds to state :math:`j` in replica :math:`i`, we can calculate its
512+
corresopnding probability as follows:
484513

485-
Exponential averaging
486-
---------------------
487-
In the limit that all alchemical states are equally sampled, the alchemical weight of a state
488-
is equal to the dimensionless free energy of that state, i.e. in the units of kT, we have
489-
:math:`g(\lambda)=f(\lambda)=-\ln p(\lambda)`, or :math:`p(\lambda)=\exp(-g(\lambda))`. Given this,
490-
one intuitive way is to average the probability of the two simulations. Let :math:`g` be the weight combined from
491-
from the weights :math:`g_1` and :math:`g_2` and :math:`p`, :math:`p_1`, :math:`p_2` be the corresponding
492-
probabilities, then we have
514+
.. math::
515+
p_{ij}=\frac{\exp(-g_{ij})}{\sum_{j=1}^N \exp(-g_{ij})}
516+
517+
where :math:`N` is the number of states in replica :math:`i`. As a result, we have the following probabilities
518+
for each replica. Note that the sum of the probabilities of each row (replica) should be 1.
519+
520+
::
521+
522+
State 0 1 2 3 4 5
523+
Rep 1 0.858 0.105 0.016 0.021 X X
524+
Rep 2 X 0.642 0.117 0.193 0.048 X
525+
Rep 3 X X 0.328 0.489 0.133 0.049
526+
527+
528+
Step 2: Calculate the probability ratios between each pair of replicas
529+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
530+
Ideally (in the limit of inifinite simulation time), for the 3 states overlapped between replicas 1 and 2,
531+
we should have
493532

494533
.. math::
495-
p = \frac{p_1+p_2}{2} = \frac{\text{e}^{-g_1} + \text{e}^{-g_2}}{2}
534+
r_{2, 1} = \frac{p_{2i}}{p_{1i}} = \frac{p_{21}}{p_{11}} = \frac{p_{22}}{p_{12}}= \frac{p_{23}}{p_{13}}
496535
497-
Given that :math:`p=\exp(-g)`, we have
536+
where :math:`r_{2, 1}` is the "probability ratio" between replicas 2 and 1. However, the probability ratios
537+
corresopnding to different states will not be the same in practice, but will diverge with statistical noise
538+
for short timescales. For example, in our case we have
498539

499540
.. math::
500-
g = -\ln p = -\ln\left(\frac{\text{e}^{-g_1} + \text{e}^{-g_1}}{2} \right)
541+
\frac{p_{21}}{p_{11}}=6.1143, \; \frac{p_{22}}{p_{12}} = 7.3125, \; \frac{p_{23}}{p_{13}}=9.1905
501542
502-
Exponential averaging with histogram corrections
503-
------------------------------------------------
504-
During the simulation, the histogram of the state visitation is generally not flat, such that
505-
:math:`g` is no longer equal to the dimensionless free energy, i.e. :math:`g(\lambda)=-\ln p(\lambda)`
506-
is no longer true. However, the ratio of counts is equal to the ratio of probabilities, whose natural
507-
logarithm is equal to the free energy difference of the states of interest. With this, we can do
508-
the following corrections before combining the weights:
543+
Similarly, for states 2 to 4, we need to calculate the probability ratios between replicas 2 and 3:
509544

510545
.. math::
511-
g_k'=g_k + \ln\left(\frac{N_{k-1}}{N_k}\right)
546+
\frac{p_{32}}{p_{22}}=2.8034, \; \frac{p_{33}}{p_{23}} = 2.5337, \; \frac{p_{34}}{p_{24}}=2.7708
512547
513-
where :math`g_k'` is the corrected alchemical weight and :math:`N_{k-1}` and :math:`N_k` are the histogram
514-
counts of states :math:`k-1` and :math:`k`. Combining this correction with exponential averaging, we have
548+
Notably, in this case, there is no need to calculate :math:`r_{3, 1}` because :math:`r_{3, 1}` is already determined
549+
as :math:`r_{3, 1}=r_{3, 2} \times r_{2, 1}`. Also, there are only 2 overlapping states between replicas 1 and 3,
550+
but we want to maximize the overlap when combining weights. Therefore, the rule of thumb of calculating the
551+
probability ratios is that we only calculate the ones betwee adjacent replicas, i.e. :math:`r_{i+1, i}`.
552+
553+
554+
Step 3: Average the probability ratios
555+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
556+
Now, to determine an unifying probability ratio between a pair of replicas, we can choose to take simple averages
557+
or geometric averages.
558+
559+
- Method 1: Simple average
560+
561+
.. math::
562+
r_{2, 1} = \frac{1}{3}\left(\frac{p_{21}}{p_{11}} + \frac{p_{22}}{p_{12}} + \frac{p_{23}}{p_{13}} \right) \approx 7.5391, \;
563+
r_{3, 2} = \frac{1}{3}\left(\frac{p_{32}}{p_{22}} + \frac{p_{33}}{p_{23}} + \frac{p_{34}}{p_{24}} \right) \approx 2.7026
564+
565+
- Method 2: Geometric average
566+
567+
.. math::
568+
r_{2, 1}' = \left(\frac{p_{21}}{p_{11}} \times \frac{p_{22}}{p_{12}} \times \frac{p_{23}}{p_{13}} \right)^{\frac{1}{3}} \approx 7.4345, \;
569+
r_{3, 2}' = \left(\frac{p_{32}}{p_{22}} \times \frac{p_{33}}{p_{23}} \times \frac{p_{34}}{p_{24}} \right)^{\frac{1}{3}} \approx 2.6999
570+
571+
Notably, if we take the negative natural logarithm of a probability ratio, we will get a free energy difference. For example,
572+
:math:`-\ln (p_{21}/p_{11})=f_{21}-f_{11}`, i.e. the free energy difference between state 1 in replica 2 and state 1 in replica 1.
573+
(This value is generally not 0 because different replicas have different references.) Therefore, while Method 1 takes the simple
574+
average the probability ratios, Method 2 essentially averages such free energy differences. Both methods are valid in theory and
575+
should not make a big difference in the convergence speed of the simulations because we just need an estimate of free energy for each
576+
state better than the weight of a single simulation. In fact, the closer the probability ratios are from each other, the closer
577+
the simple average is from the geometric average.
578+
579+
Step 4: Scale the probabilities for each replica
580+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
581+
Using the simple averages of the probability ratios :math:`r_{21}` and :math:`r_{32}`, we can scale the probability vectors of
582+
replicas 2 and 3 as follows:
583+
584+
::
585+
586+
State 0 1 2 3 4 5
587+
Rep 1 0.858 0.105 0.016 0.021 X X
588+
Rep 2’ X 0.08529 0.01552 0.02560 0.00637 X
589+
Rep 3’ X X 0.01610 0.02400 0.00653 0.00240
590+
591+
As shown above, we keep the probability vector of replica 1 the same but scale that for the other two. Specifically, the probability
592+
vector of replica 2' is that of replica 2 divided by :math:`r_21` and the probability vector of replica 3' is that of replica 3
593+
divided by :math:`r_{21} \times r_{32}`.
594+
595+
Similarly, if we used the probability ratios :math:`r_{21}'` and :math:`r_{32}'`, we would have had:
596+
597+
::
598+
599+
State 0 1 2 3 4 5
600+
Rep 1 0.858 0.105 0.016 0.021 X X
601+
Rep 2’ X 0.08645 0.01573 0.02596 0.00646 X
602+
Rep 3’ X X 0.01634 0.02436 0.00663 0.00244
603+
604+
605+
Step 5: Average and convert the probabilities
606+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
607+
After we have the scaled probabilities, we need to average them for each state, with the averaging method we used
608+
to average the probability ratios. For example, for state 1, we need to calculate the simple average of 0.105 and 0.08529,
609+
or the geometric average of 0.105 nad 0.08529. As such, for the first case (where the probabilities were scaled with :math:`r_{21}`
610+
and :math:`r_{32}`), we have the following probability vector of full range:
611+
612+
::
613+
614+
Final p 0.858 0.9515 0.01587 0.02353 0.00645 0.00240
615+
616+
which can be converted to the following vector of alchemical weights (denoted as :math:`\vec{g}`) by taking the negative natural logarithm:
617+
618+
::
619+
620+
Final g 0.1532 0.0497 4.1433 3.7495 5.0437 6.0322
621+
622+
For the second case (scaled with :math:`r_{21}'` and :math:`r_{32}'`), we have
623+
624+
::
625+
626+
Final p 0.858 0.09527 0.01602 0.02368 0.00654 0.00244
627+
628+
which can be converted to the following vector of alchemical weights (denoted as :math:`\vec{g'}`):
629+
630+
::
631+
632+
Final g 0.1532 2.3510 4.1339 3.7431 5.0437 6.0158
633+
634+
635+
Step 6: Determine the vector of alchemical weights for each replica
636+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
637+
Lastly, with the the vector of alchemical weights of the full range, we can figure out the alchemical weights
638+
for each replica, by shifting the weight of the first state of each replica back to 0. That is, with :math:`\vec{g}`,
639+
we have:
640+
641+
::
642+
643+
State 0 1 2 3 4 5
644+
Rep 1 0.0 2.199 3.990 3.596 X X
645+
Rep 2 X 0.0 1.791 1.397 2.691 X
646+
Rep 3 X X 0.0 -0.393 0.900 1.889
647+
648+
Similarly, with :math:`\vec{g'}`, we have:
649+
650+
::
651+
652+
State 0 1 2 3 4 5
653+
Rep 1 0.0 2.198 3.981 3.590 X X
654+
Rep 2 X 0.0 1.783 1.392 2.693 X
655+
Rep 3 X X 0.0 -0.391 0.910 1.882
656+
657+
658+
As shown above, the results using Method 1 and Method 2 are pretty close to each other. Notably, regardless of
659+
which type of averages we took, in the case here we were assuming that each replica is equally weighted. In the
660+
future, we might want to assign different weights to different replicas such that the uncertainty of free energies
661+
can be minimized. For example, if we are combining probabilities :math:`p_1` and :math:`p_2` that respectively
662+
have uncertainties :math:`\sigma_1` and :math:`\sigma_2`, we can have
515663

516664
.. math::
517-
g = -\ln\left(\frac{\text{e}^{-g_1'} + \text{e}^{-g_2'}}{2} \right)
665+
p = \left(\frac{\sigma_2}{\sigma_1 + \sigma_2}\right)p_1 + \left(\frac{\sigma_1}{\sigma_1 + \sigma_2}\right)p_2
518666
519-
where :math:`g_1'` and :math:`g_2'` are weights corrected based on their histogram counts.
667+
However, calculating the uncertainties of the :math:`p_1` and :math:`p_2` on-the-fly is generally difficult, so
668+
this method has not been implemented.
669+
670+
.. _doc_histogram:
671+
672+
Histogram corrections
673+
---------------------
674+
In the weight-combining method shown above, we frequently exploted the relationship :math:`g(\lambda)=f(\lambda)=-\ln p(\lambda)`.
675+
However, this relationship is true only when the histogram of state vistation is exactly flat, which rarely happens in reality.
676+
To correct this deviation, we can convert the difference in the histogram counts into the difference in free energies. This is based
677+
on the fact that the ratio of histogram counts is equal to the ratio of probabilities, whose natural
678+
logarithm is equal to the free energy difference of the states of interest. Specifically, we have:
679+
680+
.. math::
681+
g_k'=g_k + \ln\left(\frac{N_{k-1}}{N_k}\right)
682+
683+
where :math`g_k'` is the corrected alchemical weight and :math:`N_{k-1}` and :math:`N_k` are the histogram
684+
counts of states :math:`k-1` and :math:`k`.
520685

521-
Notably, this histogram correction should be carried out before shifting the weights, so the workflow we be
522-
first correcting the weights, shifting the weights, and finally combining the weights. Also, this correction
523-
method can be overcorrect the weights when the histogram counts :math:`N_k` or :math:`N_{k-1}` are too low.
686+
Notably, this correction method can possibly overcorrect the weights when the histogram counts :math:`N_k` or :math:`N_{k-1}` are too low.
524687
To deal with this, the user can choose to specify :code:`N_cutoff` in the input YAML file, so that the the histogram
525-
correction will performed only when :math:`\text{argmin}(N_k, N_{k-1})` is larger than the cutoff, otherwise this method
526-
will reduce to the standard exponential averaging method.
688+
correction will performed only when :math:`\text{argmin}(N_k, N_{k-1})` is larger than the cutoff. Also, this histogram correction
689+
should always be carried out before weight combination. This method is implemented in :obj:`.histogram_correction`.

ensemble_md/ensemble_EXE.py

Lines changed: 3 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,9 @@
3434
class EnsembleEXE:
3535
"""
3636
This class helps set up input files of an ensemble of expanded ensemble.
37+
Note that for easier parsing of the mdp template file when initializing the class, please make sure that at least the following GROMACS mdp
38+
parameters specified using dashes instead of underscores: :code:`ref-t`, :code:`vdw-lambdas`,
39+
:code:`coul-lambdas`, :code:`restraint-lambdas`, and :code:`init-lambda-weights`.
3740
"""
3841

3942
def __init__(self, yml_file):
@@ -47,12 +50,6 @@ def __init__(self, yml_file):
4750
outfile : str
4851
The file name of the log file for documenting how different replicas interact
4952
during the process.
50-
51-
Notes
52-
-----
53-
For easier parsing of the mdp template file, please make sure that at least the following GROMACS mdp
54-
parameters specified using dashes instead of underscores: :code:`ref-t`, :code:`vdw-lambdas`,
55-
:code:`coul-lambdas`, :code:`restraint-lambdas`, and :code:`init-lambda-weights`.
5653
"""
5754
# Step 0: Set up constants
5855
k = 1.380649e-23

requirements.txt

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3,3 +3,5 @@ natsort
33
argparse
44
alchemlyb
55
pyyaml
6+
seaborn
7+
matplotlib

0 commit comments

Comments
 (0)