manuscript/manuscript.tex
20 additions & 20 deletions
@@ -121,16 +121,16 @@ \subsection{Markov state models}
 and a transition matrix $\mathbf{P}(\tau) = [p_{ij}(\tau)]$ denoting the conditional probability of finding the system in state $j$ at time $t+\tau$ given that it was in state $i$ at time $t$.
 Let us make two remarks to avoid common misconceptions:
 \begin{enumerate}
-\item Equilibrium:
-While most analysis techniques require simulation trajectories to be long enough to sample from the equilibrium distribution, this is not required for MSMs.
-Because MSMs use the \emph{conditional} probability $p_{ij}(\tau)$,
-they are useful for the analysis of short simulation trajectories with arbitrary starting points---see Ref.~\cite{oom-feliks} for restrictions.
-\item Markovianity:
-An MSM is a memoryless model.
-Early MSM papers have argued that accurate MSMs can be found if a few states with high barriers are captured by the MSM states so as to achieve a Mori-Zwanzig projection with fast-decaying memory~\cite{swope-its,noe2007jcp,chodera2007jcp}.
-The modern view, however, is that MSMs can be highly accurate if the MSM states discretize the collective coordinates of the slowest processes well~\cite{msm-jhp}.
-This mainly requires that the system is characterized by only a few slow processes at lag time $\tau$,
-which is true for cooperative systems such as most proteins, but not for highly frustrated systems such as glasses.
+\item Equilibrium:
+While most analysis techniques require simulation trajectories to be long enough to sample from the equilibrium distribution, this is not required for MSMs.
+Because MSMs use the \emph{conditional} probability $p_{ij}(\tau)$,
+they are useful for the analysis of short simulation trajectories with arbitrary starting points---see Ref.~\cite{oom-feliks} for restrictions.
+\item Markovianity:
+An MSM is a memoryless model.
+Early MSM papers have argued that accurate MSMs can be found if a few states with high barriers are captured by the MSM states so as to achieve a Mori-Zwanzig projection with fast-decaying memory~\cite{swope-its,noe2007jcp,chodera2007jcp}.
+The modern view, however, is that MSMs can be highly accurate if the MSM states discretize the collective coordinates of the slowest processes well~\cite{msm-jhp}.
+This mainly requires that the system is characterized by only a few slow processes at lag time $\tau$,
+which is true for cooperative systems such as most proteins, but not for highly frustrated systems such as glasses.
 \end{enumerate}

 In order to create a Markov state model for a dynamical system, each data point in the time series is assigned to a state.
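
As a rough illustration of the conditional probability $p_{ij}(\tau)$ described in this hunk, the following Python sketch counts transitions at lag $\tau$ in a discrete trajectory and row-normalizes the counts. It is a naive estimator written with NumPy for illustration only and is not meant to reproduce PyEMMA's estimator; the function name `estimate_transition_matrix` and the toy trajectory are assumptions made for this example.

```python
import numpy as np


def estimate_transition_matrix(dtraj, n_states, lag):
    """Naive estimate of P(tau): row-normalized transition counts at the given lag.

    Hypothetical helper for illustration; assumes lag >= 1 and integer state
    indices in the range [0, n_states).
    """
    dtraj = np.asarray(dtraj)
    counts = np.zeros((n_states, n_states))
    # Count every observed pair (state at time t, state at time t + lag).
    np.add.at(counts, (dtraj[:-lag], dtraj[lag:]), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows of states that were never left stay zero instead of dividing by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


# Toy discrete trajectory with two states (made-up data).
toy_dtraj = [0, 0, 1, 1, 0, 1]
P = estimate_transition_matrix(toy_dtraj, n_states=2, lag=1)
print(P)
```

Each row is conditioned on the starting state, so the counts do not depend on how often a state is visited overall. This mirrors the first remark in the list: the trajectory does not have to start from the equilibrium distribution.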
@@ -200,7 +200,7 @@ \subsection{Variational approach and TICA}
 More recently, the more general variational approach to Markov processes (VAMP) has been developed in order to facilitate the approximation and comparison of reversible models for basis sets that are continuous,
 as opposed to discrete states~\cite{vamp-preprint}.
 The VAMP can thus be used to perform model selection.
-Specifically, we use the VAMP-2 score, which captures the kinetic variance explained by the model.
+Specifically, we use the VAMP-2 score, which captures the kinetic variance explained by the model~\cite{kinetic-maps}.
 However, the MSM lag time cannot be optimized using VAMP,
 and must be chosen using a separate validation as described above~\cite{husic2017note}.

@@ -248,7 +248,7 @@ \subsection{Variational approach and TICA}
-where, in practice, $d$ is chosen such that a specific fraction of kinetic variance $c_d$ is retained (e.g., \SI{95}{\percent}).
+where, in practice, $d$ is preferably chosen such that a specific fraction of kinetic variance $c_d$ is retained (e.g., \SI{95}{\percent}).

 \subsection{Hidden Markov state models}

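
The choice of $d$ mentioned in this hunk can be sketched in a few lines of Python. The definition of $c_d$ is not visible in this excerpt, so the sketch assumes the common kinetic-map convention in which $c_d$ is the cumulative sum of the $d$ largest squared eigenvalues divided by the sum over all squared eigenvalues. The function name `dimension_for_kinetic_variance` and the toy eigenvalues are hypothetical.

```python
import numpy as np


def dimension_for_kinetic_variance(eigenvalues, fraction=0.95):
    """Return the smallest d whose cumulative kinetic variance c_d reaches `fraction`.

    Assumption: c_d = sum_{i<=d} lambda_i^2 / sum_i lambda_i^2, with the
    eigenvalues sorted by decreasing magnitude.
    """
    lam = np.sort(np.abs(np.asarray(eigenvalues, dtype=float)))[::-1]
    cumvar = np.cumsum(lam**2) / np.sum(lam**2)
    # First index at which the cumulative fraction reaches the threshold.
    d = int(np.searchsorted(cumvar, fraction)) + 1
    # Guard against round-off when fraction is (numerically) equal to 1.
    return min(d, lam.size), cumvar


# Made-up eigenvalues for illustration.
d, cumvar = dimension_for_kinetic_variance([0.9, 0.5, 0.1], fraction=0.95)
print(d, cumvar)
```

In this toy case the first component alone stays below the threshold and the first two exceed it, so two dimensions are kept.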
@@ -261,13 +261,12 @@ \subsection{Hidden Markov state models}
 The estimation of an MSM requires the dynamics between microstates to be Markovian.
 However, in case of a poor dimension reduction and/or discretization or short trajectories,
 we cannot anticipate this to be the case.
-We illustrate this point in notebook~07.

 An alternative, which is much less sensitive to poor discretization,
 is to estimate a hidden Markov model (HMM)~\cite{hmm-baum-welch-alg,jhp-spectral-rate-theory,noe-proj-hid-msm,bhmm-preprint}.
 HMMs are less sensitive to the discretization error as they sidestep the assumption of Markovian dynamics in the discretized space (illustrated in Fig.~\ref{fig:hmm-scheme}).
 Instead, HMMs assume that there is an underlying (hidden) dynamic process that is Markovian
-and gives rise to our observed data, e.g., the ($n$~states) discretized trajectories $s(t)$.
+and gives rise to our observed data, i.e., the ($n$~states) discretized trajectories $s(t)$.
 This is a powerful principle as we know that there is indeed an underlying process that is Markovian:
 our molecular dynamics trajectories.

@@ -279,13 +278,13 @@ \subsection{Hidden Markov state models}

 An HMM estimation always yields a model with a small number of (hidden) states
 in which each state is considered to be metastable and,
-thus, the number of hidden states is a new hyper-parameter which needs to be chosen carefully (see notebook~07).
+thus, the number of hidden states is a new hyper-parameter which needs to be chosen carefully.
 As the HMMs---like MSMs---approximate the full phase-space dynamics,
 we can similarly compute the metastable kinetics, apply TPT, visualize the network, and obtain physical observables.

 For an extensive discussion of details about HMM properties and the estimation algorithm in general, we suggest Ref.~\cite{hmm-tutorial}.
 For its specific application to the discretization of MSMs using HMMs, we suggest Ref.~\cite{noe-proj-hid-msm}.
-A generalized extension for estimating this type of low dimensional projection from the data is given in Ref.~\cite{wu2015projected}.
+A generalized extension for estimating this type of low dimensional projection from the data is given in Ref.~\cite{wu2015projected}. One of our tutorial notebooks, to be discussed in the next section, provides an example of HMM analysis.

 \subsection{Software and installation}

@@ -570,7 +569,7 @@ \subsection{Connecting the MSM with experimental data}
 we can use PyEMMA to compute the fluorescence autocorrelation function (ACF) from our MSM (Fig.~\ref{fig:msm-exp-obs}a).
 Note how the computed ACF has a very small response (i.e., signal amplitude).

-Using PyEMMA, we can simulate the relaxation of an observable if we had prepared our molecular system in a nonequilibrium initial condition.
+Using PyEMMA, we can simulate the relaxation of an observable from a nonequilibrium initial condition.
 The experimental counterpart of such a prediction could be a temperature or pressure jump experiment or a stopped flow assay.
 To illustrate such an experiment, we initialize our molecular ensemble as the metastable distribution of~$\mathcal{S}_1$
 and follow the predicted fluorescence signal as it relaxes to equilibrium (Fig.~\ref{fig:msm-exp-obs}b).
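
The relaxation experiment described in this hunk can be sketched with plain NumPy, assuming a row-stochastic transition matrix $\mathbf{P}(\tau)$, an initial distribution $\mathbf{p}_0$, and a vector $\mathbf{a}$ holding the observable value of each state. The expected signal after $k$ steps is then $\mathbf{p}_0^\top \mathbf{P}^k \mathbf{a}$. This is a minimal sketch and does not use PyEMMA's own routines; the function name `relax_observable` and the two-state toy model are assumptions made for this example.

```python
import numpy as np


def relax_observable(P, p0, a, n_steps):
    """Expected observable at times 0, tau, ..., n_steps * tau.

    Hypothetical helper: propagates the distribution p0 with the row-stochastic
    matrix P and evaluates the state-wise observable a at every step.
    """
    P = np.asarray(P, dtype=float)
    p = np.asarray(p0, dtype=float)
    a = np.asarray(a, dtype=float)
    signal = np.empty(n_steps + 1)
    for k in range(n_steps + 1):
        signal[k] = p @ a  # ensemble average of the observable at step k
        p = p @ P          # advance the distribution by one lag time
    return signal


# Toy two-state model (made-up numbers): start entirely in state 0,
# with an observable that is 1 in state 0 and 0 in state 1.
P_toy = [[0.9, 0.1],
         [0.2, 0.8]]
signal = relax_observable(P_toy, p0=[1.0, 0.0], a=[1.0, 0.0], n_steps=50)
print(signal[:3], signal[-1])
```

For long times the signal approaches the equilibrium average of the observable, which for this toy matrix is the stationary probability of state 0.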
@@ -579,7 +578,7 @@ \subsection{Connecting the MSM with experimental data}

 In addition to a detailed demonstration of the above, notebook~06 demonstrates how to compute J-couplings and dynamic fingerprints from MSMs.

-\subsection{Summary}
+\subsection{Summary of the showcase notebook}

 In this section, we have summarized how to conduct an MSM-based analysis of biomolecular dynamics data using PyEMMA.
 For the full analysis, please refer to the first notebook~(00).