Commit 839f13c

committed
fix #186
1 parent a40b931 commit 839f13c

1 file changed, with 20 additions and 20 deletions

manuscript/manuscript.tex

Lines changed: 20 additions & 20 deletions
@@ -121,16 +121,16 @@ \subsection{Markov state models}
 and a transition matrix $\mathbf{P}(\tau) = [p_{ij}(\tau)]$ denoting the conditional probability of finding the system in state $j$ at time $t+\tau$ given that it was in state $i$ at time $t$.
 Let us make two remarks to avoid common misconceptions:
 \begin{enumerate}
-\item Equilibrium:
-While most analysis techniques require simulation trajectories to be long enough to sample from the equilibrium distribution, this is not required for MSMs.
-Because MSMs use the \emph{conditional} probability $p_{ij}(\tau)$,
-they are useful for the analysis of short simulation trajectories with arbitrary starting points---see Ref.~\cite{oom-feliks} for restrictions.
-\item Markovianity:
-An MSM is a memoryless model.
-Early MSM papers have argued that accurate MSMs can be found if a few states with high barriers are captured by the MSM states so as to achieve a Mori-Zwanzig projection with fast-decaying memory~\cite{swope-its,noe2007jcp,chodera2007jcp}.
-The modern view, however, is that MSMs can be highly accurate if the MSM states discretize the collective coordinates of the slowest processes well~\cite{msm-jhp}.
-This mainly requires that the system is characterized by only a few slow processes at lag time $\tau$,
-which is true for cooperative systems such as most proteins, but not for highly frustrated systems such as glasses.
+\item Equilibrium:
+While most analysis techniques require simulation trajectories to be long enough to sample from the equilibrium distribution, this is not required for MSMs.
+Because MSMs use the \emph{conditional} probability $p_{ij}(\tau)$,
+they are useful for the analysis of short simulation trajectories with arbitrary starting points---see Ref.~\cite{oom-feliks} for restrictions.
+\item Markovianity:
+An MSM is a memoryless model.
+Early MSM papers have argued that accurate MSMs can be found if a few states with high barriers are captured by the MSM states so as to achieve a Mori-Zwanzig projection with fast-decaying memory~\cite{swope-its,noe2007jcp,chodera2007jcp}.
+The modern view, however, is that MSMs can be highly accurate if the MSM states discretize the collective coordinates of the slowest processes well~\cite{msm-jhp}.
+This mainly requires that the system is characterized by only a few slow processes at lag time $\tau$,
+which is true for cooperative systems such as most proteins, but not for highly frustrated systems such as glasses.
 \end{enumerate}

 In order to create a Markov state model for a dynamical system, each data point in the time series is assigned to a state.
@@ -200,7 +200,7 @@ \subsection{Variational approach and TICA}
 More recently, the more general variational approach to Markov processes (VAMP) has been developed in order to facilitate the approximation and comparison of reversible models for basis sets that are continuous,
 as opposed to discrete states~\cite{vamp-preprint}.
 The VAMP can thus be used to perform model selection.
-Specifically, we use the VAMP-2 score, which captures the kinetic variance explained by the model.
+Specifically, we use the VAMP-2 score, which captures the kinetic variance explained by the model~\cite{kinetic-maps}.
 However, the MSM lag time cannot be optimized using VAMP,
 and must be chosen using a separate validation as described above~\cite{husic2017note}.

@@ -248,7 +248,7 @@ \subsection{Variational approach and TICA}
 \begin{equation}
 \mathbf{y}(t) = \mathbf{U}_d^\top \tilde{\mathbf{x}}(t),
 \end{equation}
-where, in practice, $d$ is chosen such that a specific fraction of kinetic variance $c_d$ is retained (e.g., \SI{95}{\percent}).
+where, in practice, $d$ is preferably chosen such that a specific fraction of kinetic variance $c_d$ is retained (e.g., \SI{95}{\percent}).

 \subsection{Hidden Markov state models}

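Side note on the hunk above: the rule "choose $d$ such that a fraction $c_d$ of kinetic variance is retained" can be sketched in a few lines of NumPy. This is a minimal illustration, not the manuscript's or PyEMMA's implementation. It assumes that $c_d$ is the cumulative sum of squared TICA eigenvalues divided by their total; the function name `choose_dimension` and the example eigenvalues are hypothetical.

```python
import numpy as np


def choose_dimension(eigenvalues, fraction=0.95):
    """Return the smallest d whose cumulative kinetic variance reaches `fraction`.

    Assumption: kinetic variance of component i is its squared eigenvalue,
    so c_d = sum_{i<=d} lambda_i^2 / sum_i lambda_i^2.
    """
    # Sort eigenvalue magnitudes in descending order (slowest processes first).
    lam = np.sort(np.abs(np.asarray(eigenvalues, dtype=float)))[::-1]
    cumvar = np.cumsum(lam ** 2) / np.sum(lam ** 2)
    # First index at which the cumulative fraction is >= the requested fraction.
    d = int(np.searchsorted(cumvar, fraction)) + 1
    # Guard against floating-point round-off when fraction is close to 1.
    d = min(d, len(lam))
    return d, cumvar


# Made-up eigenvalues, purely for illustration.
eigenvalues = [0.9, 0.8, 0.3, 0.1, 0.05]
d, cumvar = choose_dimension(eigenvalues, fraction=0.95)
print(d, cumvar)
```

With these example values, the first two components retain about 93% of the kinetic variance, so a third component is needed to pass the 95% threshold.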
@@ -261,13 +261,12 @@ \subsection{Hidden Markov state models}
 The estimation of an MSM requires the dynamics between microstates to be Markovian.
 However, in case of a poor dimension reduction and/or discretization or short trajectories,
 we cannot anticipate this to be the case.
-We illustrate this point in notebook~07.

 An alternative, which is much less sensitive to poor discretization,
 is to estimate a hidden Markov model (HMM)~\cite{hmm-baum-welch-alg,jhp-spectral-rate-theory,noe-proj-hid-msm,bhmm-preprint}.
 HMMs are less sensitive to the discretization error as they sidestep the assumption of Markovian dynamics in the discretized space (illustrated in Fig.~\ref{fig:hmm-scheme}).
 Instead, HMMs assume that there is an underlying (hidden) dynamic process that is Markovian
-and gives rise to our observed data, e.g., the ($n$~states) discretized trajectories $s(t)$.
+and gives rise to our observed data, i.e., the ($n$~states) discretized trajectories $s(t)$.
 This is a powerful principle as we know that there is indeed an underlying process that is Markovian:
 our molecular dynamics trajectories.

@@ -279,13 +278,13 @@ \subsection{Hidden Markov state models}

 An HMM estimation always yields a model with a small number of (hidden) states
 in which each state is considered to be metastable and,
-thus, the number of hidden states is a new hyper-parameter which needs to be chosen carefully (see notebook~07).
+thus, the number of hidden states is a new hyper-parameter which needs to be chosen carefully.
 As the HMMs---like MSMs---approximate the full phase-space dynamics,
 we can similarly compute the metastable kinetics, apply TPT, visualize the network, and obtain physical observables.

 For an extensive discussion of details about HMM properties and the estimation algorithm in general, we suggest Ref.~\cite{hmm-tutorial}.
 For its specific application to the discretization of MSMs using HMMs, we suggest Ref.~\cite{noe-proj-hid-msm}.
-A generalized extension for estimating this type of low dimensional projection from the data is given in Ref.~\cite{wu2015projected}.
+A generalized extension for estimating this type of low dimensional projection from the data is given in Ref.~\cite{wu2015projected}. One of our tutorial notebooks, to be discussed in the next section, provides an example of HMM analysis.

 \subsection{Software and installation}

@@ -570,7 +569,7 @@ \subsection{Connecting the MSM with experimental data}
 we can use PyEMMA to compute the fluorescence autocorrelation function (ACF) from our MSM (Fig.~\ref{fig:msm-exp-obs}a).
 Note how the computed ACF has a very small response (i.e., signal amplitude).

-Using PyEMMA, we can simulate the relaxation of an observable if we had prepared our molecular system in a nonequilibrium initial condition.
+Using PyEMMA, we can simulate the relaxation of an observable from a nonequilibrium initial condition.
 The experimental counterpart of such a prediction could be a temperature or pressure jump experiment or a stopped flow assay.
 To illustrate such an experiment, we initialize our molecular ensemble as the metastable distribution of~$\mathcal{S}_1$
 and follow the predicted fluorescence signal as it relaxes to equilibrium (Fig.~\ref{fig:msm-exp-obs}b).
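Side note on the hunk above: the relaxation experiment described in the changed sentence can be sketched with plain NumPy, without relying on any PyEMMA call. The sketch assumes a row-stochastic transition matrix $\mathbf{P}(\tau)$, an initial distribution $\mathbf{p}_0$, and a vector of per-state observable values $\mathbf{a}$, and evaluates the ensemble average $\mathbf{p}_0^\top \mathbf{P}^k \mathbf{a}$ at times $k\tau$. The function name `relaxation_signal` and the two-state example are hypothetical.

```python
import numpy as np


def relaxation_signal(P, p0, a, n_steps):
    """Ensemble average of observable `a` at times 0, tau, ..., n_steps * tau.

    P  : (n, n) row-stochastic transition matrix at lag time tau
    p0 : (n,) initial (nonequilibrium) state distribution
    a  : (n,) value of the observable in each state
    """
    P = np.asarray(P, dtype=float)
    p = np.asarray(p0, dtype=float)
    a = np.asarray(a, dtype=float)
    signal = np.empty(n_steps + 1)
    for k in range(n_steps + 1):
        signal[k] = p @ a  # expectation of a under the current distribution
        p = p @ P          # propagate the distribution by one lag time
    return signal


# Hypothetical two-state model; its stationary distribution is (2/3, 1/3).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
p0 = np.array([0.0, 1.0])  # ensemble prepared entirely in the second state
a = np.array([1.0, 0.0])   # observable that is "on" only in the first state
signal = relaxation_signal(P, p0, a, n_steps=200)
print(signal[:3], signal[-1])
```

In this toy example the signal starts at 0 and relaxes towards the equilibrium expectation of 2/3, which mirrors the qualitative behavior described for the fluorescence signal.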
@@ -579,7 +578,7 @@ \subsection{Connecting the MSM with experimental data}

 In addition to a detailed demonstration of the above, notebook~06 demonstrates how to compute J-couplings and dynamic fingerprints from MSMs.

-\subsection{Summary}
+\subsection{Summary of the showcase notebook}

 In this section, we have summarized how to conduct an MSM-based analysis of biomolecular dynamics data using PyEMMA.
 For the full analysis, please refer to the first notebook~(00).
@@ -681,10 +680,11 @@ \section{Funding Information}
 %%%%%%%
 % Authors should acknowledge funding sources here. Reference specific grants.
 %%%%%%%
+MKS acknowledges financial support from European Commission (ERC StG 307494 "pcCell").
 TH acknowledges financial support from Deutsche Forschungsgemeinschaft (SFB/TRR 186, Project A12).
-FN and BEH acknowledge funding from European Commission (ERC CoG 772230 "ScaleCell").
-FN acknowledges funding from Deutsche Forschungsgemeinschaft (SFB 1114, Projects A04 and C03, NO 825/2-2).
 SO acknowledges a postdoctoral fellowship from the Alexander von Humboldt Foundation.
+FN and MKS acknowledge funding from Deutsche Forschungsgemeinschaft (SFB 1114, Projects A04 and C03, NO 825/2-2).
+FN and BEH acknowledge funding from European Commission (ERC CoG 772230 "ScaleCell").

 \bibliography{literature}
