Commit da5f72f

Merge pull request #184 from markovmodel/revision-cw
revision-cw
2 parents 6e3867e + 135efd3 commit da5f72f

2 files changed, +35 -28 lines changed

binder/environment.yml

Lines changed: 1 addition & 0 deletions
@@ -4,4 +4,5 @@ channels:
 - defaults
 dependencies:
 - pyemma_tutorials
+- nomkl

manuscript/manuscript.tex

Lines changed: 34 additions & 28 deletions
@@ -67,6 +67,9 @@
 %%% ARTICLE START
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

+\hyphenation{Mar-kov}
+\setlength{\emergencystretch}{3em}
+
 \begin{document}

 \begin{frontmatter}
@@ -152,13 +155,13 @@ \subsection{Markov state models}
 \end{equation}
 When the ITS become approximately constant with the lag time, we say that our timescales have converged and choose the smallest lag time with the converged timescales in order to maximize the model's temporal resolution.

-Once we have used the ITS to choose the lag time, we can check whether a given transition probability matrix $\mathbf{P}(\tau)$ is approximately Markovian using the Chapman-Kolmogorov (CK) test~\cite{noe-folding-pathways}.
+Once we have used the ITS to choose the lag time, we can check whether a given transition probability matrix $\mathbf{P}(\tau)$ is approximately Markovian using the Chapman-Kolmogorov (CK) test~\cite{noe-folding-pathways,msm-jhp}.
 The CK property for a Markovian matrix is,
 \begin{equation}
 \mathbf{P}(k \tau) = \mathbf{P}^k(\tau),
 \end{equation}
 where the left-hand side of the equation corresponds to an MSM estimated at lag time $k\tau$, where $k$ is an integer larger than~$1$, whereas the right-hand side of the equation is our estimated MSM transition probability matrix to the $k^\textrm{th}$ power.
-By assessing how well the approximated transition probability matrix adheres to the CK property, we can validate the appropriateness of the Markovian assumption for the model.
+By assessing how well the approximated transition probability matrix adheres to the CK property, we can validate the appropriateness of the Markovian assumption for the model (see Sec.~IV.F in~\cite{msm-jhp}).

 Once validated, the transition matrix can be decomposed into eigenvectors and eigenvalues.
 The highest eigenvalue, $\lambda_1(\tau)$, is unique and equal to $1$.
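
Illustrative note on the hunk above: the Chapman-Kolmogorov property $\mathbf{P}(k \tau) = \mathbf{P}^k(\tau)$ can be checked numerically on toy data. The following is a minimal sketch using only NumPy; it is not PyEMMA's implementation and does not use the PyEMMA API. The 3-state matrix `P_true`, the helper `estimate_transition_matrix`, the trajectory length, and the choice `k = 5` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def estimate_transition_matrix(dtraj, n_states, lag):
    """Row-normalized sliding-window count matrix at the given lag.

    Illustrative helper (hypothetical name); no reversibility constraint.
    """
    counts = np.zeros((n_states, n_states))
    # Count every observed transition dtraj[t] -> dtraj[t + lag].
    np.add.at(counts, (dtraj[:-lag], dtraj[lag:]), 1.0)
    return counts / counts.sum(axis=1, keepdims=True)


# Hypothetical 3-state transition matrix, used only to generate toy data.
P_true = np.array([
    [0.90, 0.10, 0.00],
    [0.05, 0.90, 0.05],
    [0.00, 0.10, 0.90],
])
n_states = P_true.shape[0]

# Simulate a discrete trajectory by inverse-CDF sampling of each row.
rng = np.random.default_rng(0)
n_steps = 200_000
cdf = np.cumsum(P_true, axis=1)
u = rng.random(n_steps)
dtraj = np.zeros(n_steps, dtype=int)
for t in range(1, n_steps):
    nxt = np.searchsorted(cdf[dtraj[t - 1]], u[t], side="right")
    dtraj[t] = min(int(nxt), n_states - 1)  # guard against round-off in cdf

# Left-hand side: model estimated directly at lag k*tau.
# Right-hand side: model estimated at lag tau, raised to the k-th power.
k = 5
P_tau = estimate_transition_matrix(dtraj, n_states, lag=1)
P_ktau = estimate_transition_matrix(dtraj, n_states, lag=k)
ck_error = np.abs(P_ktau - np.linalg.matrix_power(P_tau, k)).max()

# The leading eigenvalue of a row-stochastic matrix equals 1.
leading_eigenvalue = np.linalg.eigvals(P_tau).real.max()

print("max |P(k tau) - P(tau)^k| =", ck_error)
print("leading eigenvalue        =", leading_eigenvalue)
```

Because the toy trajectory is generated from a genuinely Markovian chain, the deviation should be small and limited by sampling noise; for real MD data, a large deviation would suggest that the chosen lag time or discretization does not support the Markovian assumption.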
@@ -248,7 +251,7 @@ \subsection{Variational approach and TICA}

 \subsection{Hidden Markov state models}

-\begin{figure}
+\begin{figure}[ht]
 \includegraphics[width=0.48\textwidth]{figure_1}
 \caption{The HMM transition matrix $\tilde{\mathbf{P}}(\tau)$ propagates the hidden state trajectory $\tilde{s}(t)$ (orange circles) and, at each time step $t$, the emission into the observable state $s(t)$ (cyan circles) is governed by the emission probabilities $\bm{\chi}\left( s(t) \middle| \tilde{s}(t) \right)$.}
 \label{fig:hmm-scheme}
@@ -285,7 +288,8 @@ \subsection{Hidden Markov state models}
 \subsection{Software and installation}

 We utilize Jupyter~\cite{jupyter} notebooks to show code examples along with figures and interactive widgets to display molecules.
-The user can install all necessary packages in one step using the \texttt{conda} command provided by the Anaconda Python stack (\url{https://anaconda.com}).
+The user can install all necessary packages in one step using the \texttt{conda} command provided by the Anaconda Python
+stack (\url{https://conda.io/miniconda.html}).
 We recommend Anaconda because it resolves and installs dependencies as well as provides pre-compiled versions of common packages.
 The tutorial installation contains a launcher command to start the Jupyter notebook server as well as the notebook files.

@@ -312,7 +316,9 @@ \subsection{Software and installation}
 The tutorial software is currently supported for Python versions~$3.5$ and~$3.6$ on the operating systems Linux, OSX, and Windows.

 Should the user prefer not to use Anaconda, a manual installation via the pip installer is possible.
-Alternatively, one can use the Binder service (\url{https://mybinder.org}) to view and run the tutorials online in any browser.
+Alternatively, one can use the Binder service
+(\href{https://mybinder.org/v2/gh/markovmodel/pyemma_tutorials/master?filepath=notebooks}{https://mybinder.org}) to view
+and run the tutorials online in any browser.

 \section{PyEMMA tutorials}

@@ -325,6 +331,15 @@ \section{PyEMMA tutorials}

 \subsection{The PyEMMA workflow}

+\begin{figure}[ht]
+\includegraphics[width=0.48\textwidth]{figure_2}
+\caption{The PyEMMA workflow: MD trajectories are processed and discretized (first row).
+A Markov state model is estimated from the resulting discrete trajectories and validated (middle row).
+By iterating between data processing and MSM estimation/validation,
+a dynamical model is obtained that can be analyzed (last row).}
+\label{fig:workflowchart}
+\end{figure}
+
 In short, the workflow (Fig.~\ref{fig:workflowchart}) for a full analysis of an MD dataset might consist of,
 \begin{itemize}
 \item extracting molecular features from the raw data (01),
@@ -351,17 +366,18 @@ \subsection{The PyEMMA workflow}
 we chose to adopt a sequential approach where only the hyper-parameters of the current stage are optimized.
 This approach is not only computationally cheaper but allows us to discuss the significance of the necessary modeling choices.

-\begin{figure}[bt]
-\includegraphics[width=0.48\textwidth]{figure_2}
-\caption{The PyEMMA workflow: MD trajectories are processed and discretized (first row).
-A Markov state model is estimated from the resulting discrete trajectories and validated (middle row).
-By iterating between data processing and MSM estimation/validation,
-a dynamical model is obtained that can be analyzed (last row).}
-\label{fig:workflowchart}
-\end{figure}
-
 \subsection{Feature selection}

+\begin{figure}[bht]
+\includegraphics{figure_3}
+\caption{Example analysis of the conformational dynamics of a pentapeptide backbone:
+(a)~The Trp-Leu-Ala-Leu-Leu pentapeptide in licorice representation~\cite{vmd}.
+(b)~The VAMP-2 score indicates which of the tested featurizations contains the highest kinetic variance.
+(c)~The sample free energy projected onto the first two time-lagged independent components (ICs) at lag time $\tau=0.5$~ns shows multiple minima and
+(d)~the time series of the first two ICs of the first trajectory show rare jumps.}
+\label{fig:io-to-tica}
+\end{figure}
+
 In Markov state modeling, our objective is to model the slow dynamics of a molecular process.
 In order to approximate the slow dynamics in a statistically efficient manner,
 a lower dimensional representation of our simulation data is necessary.
@@ -400,16 +416,6 @@ \subsection{Dimensionality reduction}

 We demonstrate how to apply TICA, suggest how to interpret the projected coordinates, and compare the results to other dimension reduction techniques in notebook~02.

-\begin{figure}
-\includegraphics{figure_3}
-\caption{Example analysis of the conformational dynamics of a pentapeptide backbone:
-(a)~The Trp-Leu-Ala-Leu-Leu pentapeptide in licorice representation~\cite{vmd}.
-(b)~The VAMP-2 score indicates which of the tested featurizations contains the highest kinetic variance.
-(c)~The sample free energy projected onto the first two time-lagged independent components (ICs) at lag time $\tau=0.5$~ns shows multiple minima and
-(d)~the time series of the first two ICs of the first trajectory show rare jumps.}
-\label{fig:io-to-tica}
-\end{figure}
-
 \subsection{Discretization}

 TICA yields a representation of our molecular simulation data with a reduced dimensionality,
@@ -421,7 +427,7 @@ \subsection{Discretization}

 \subsection{MSM estimation and validation}

-\begin{figure}
+\begin{figure}[ht]
 \includegraphics{figure_4}
 \caption{Example analysis of the conformational dynamics of a pentapeptide backbone:
 (a)~The convergence behavior of the implied timescales associated with the four slowest processes.
@@ -456,7 +462,7 @@ \subsection{MSM estimation and validation}

 \subsection{Analyzing the MSM}

-\begin{figure}
+\begin{figure}[ht]
 \includegraphics{figure_5}
 \caption{Example analysis of the conformational dynamics of a pentapeptide backbone:
 (a)~The reweighted free energy surface projected onto the first two independent components exhibits five minima which
@@ -467,7 +473,7 @@ \subsection{Analyzing the MSM}
 \label{fig:msm-analysis}
 \end{figure}

-\begin{figure}
+\begin{figure}[ht]
 \includegraphics{figure_6}
 \caption{Example analysis of the conformational dynamics of a pentapeptide backbone:
 visualization of the transition paths from $\mathcal{S}_2$ to $\mathcal{S}_4$.
@@ -537,7 +543,7 @@ \subsection{Analyzing the MSM}

 \subsection{Connecting the MSM with experimental data}

-\begin{figure}
+\begin{figure}[ht]
 \includegraphics{figure_7}
 \caption{Example analysis of the conformational dynamics of a pentapeptide backbone:
 (a)~the Trp-1 SASA autocorrelation function yields a weak signal which, however,