manuscript/manuscript.tex
17 additions & 17 deletions
Original file line number
Diff line number
Diff line change
@@ -318,11 +318,11 @@ \subsection{Software and installation}
318
318
\section{PyEMMA tutorials}
319
319
320
320
This tutorial consists of nine Jupyter notebooks which introduce the basic features of PyEMMA.
321
-
The first notebook (00), which we will summarize in the following, showcases the entire estimation,
321
+
The first notebook~00, which we will summarize in the following, showcases the entire estimation,
322
322
validation, and analysis workflow for a small example system.
323
-
The goal of this introductory notebook (00) is to provide the user with the typical steps required to obtain a validated MSM analysis of protein or peptide simulation data.
324
-
The seven subsequent notebooks (01--07) provide in-depth lessons on specific topics,
325
-
and the last notebook (08) contains guidelines on how to deal with common problems during MSM estimation.
323
+
The goal of this introductory notebook~00 is to provide the user with the typical steps required to obtain a validated MSM analysis of protein or peptide simulation data.
324
+
The seven subsequent notebooks~01--07 provide in-depth lessons on specific topics,
325
+
and the last notebook~08 contains guidelines on how to deal with common problems during MSM estimation.
and the more general variational approach for Markov processes (VAMP)~\cite{vamp-preprint}
372
372
provide a systematic means to quantitatively compare multiple representations of the simulation data.
373
373
In particular, we can use a scalar score obtained using VAMP to directly compare the ability of certain features to capture slow dynamical modes in a particular molecular system.
374
-
In Notebook (01), we present in detail how to extract features from MD datasets and how to systematically compare them.
374
+
In notebook~01, we present in detail how to extract features from MD datasets and how to systematically compare them.
375
375
376
376
Throughout this tutorial, we utilize the VAMP-2 score, which maximizes the kinetic variance contained in the features~\cite{kinetic-maps}.
377
377
We should always evaluate the score in a cross-validated manner to ensure that we include neither too few features (under-fitting) nor too many features (over-fitting)~\cite{gmrq,vamp-preprint}.
378
378
To choose among three different molecular features reflecting protein structure,
379
-
we compute the (cross-validated) VAMP-2 score (notebook00).
379
+
we compute the (cross-validated) VAMP-2 score (notebook~00).
380
380
Although we cannot optimize MSM lag times with a variational score such as VAMP-2~\cite{husic2017note},
381
381
it is important to ensure that properties that we optimize are robust as a function of lag time.
382
-
Consequently, we compute the VAMP-2 score at several lag times (notebook00).
382
+
Consequently, we compute the VAMP-2 score at several lag times (notebook~00).
383
383
We find that the relative rankings of the different molecular features are highly robust as a function of lag time.
384
384
We show one example of this ranking and the absolute VAMP-2 scores for lag time~$0.5$~ns in Fig.~\ref{fig:io-to-tica}b.
385
385
We find that backbone torsions contain more kinetic variance than the backbone heavy atom positions or the distances between them (Fig.~\ref{fig:io-to-tica}b).
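The idea of ranking features by their score at several lag times can be illustrated with a minimal sketch. It assumes a single scalar feature, for which the VAMP-2 score reduces to the squared time-lagged correlation. The helper names (`vamp2_scalar`, `ar1`) and the synthetic "slow" and "fast" signals are hypothetical and not part of PyEMMA or the notebooks. Library implementations handle multidimensional features and cross-validation, and may use a different convention such as an additive constant.

```python
import random


def vamp2_scalar(x, lag):
    """Squared time-lagged correlation of one scalar feature.

    For a single mean-free feature, C00^(-1/2) C0t Ctt^(-1/2) is a scalar,
    so its squared Frobenius norm is c0t**2 / (c00 * ctt).
    """
    x0, xt = x[:-lag], x[lag:]
    n = len(x0)
    m0, mt = sum(x0) / n, sum(xt) / n
    c00 = sum((a - m0) ** 2 for a in x0) / n
    ctt = sum((b - mt) ** 2 for b in xt) / n
    c0t = sum((a - m0) * (b - mt) for a, b in zip(x0, xt)) / n
    return c0t ** 2 / (c00 * ctt)


def ar1(phi, n, rng):
    """Synthetic autoregressive signal; phi close to 1 gives slow dynamics."""
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


rng = random.Random(0)
slow = ar1(0.99, 5000, rng)  # hypothetical slowly decorrelating feature
fast = ar1(0.50, 5000, rng)  # hypothetical quickly decorrelating feature

lags = [1, 5, 10]
scores = {lag: (vamp2_scalar(slow, lag), vamp2_scalar(fast, lag)) for lag in lags}
for lag in lags:
    print(f"lag={lag:2d}  slow={scores[lag][0]:.3f}  fast={scores[lag][1]:.3f}")
```

In this toy setting the slow feature should score higher at every lag time, which mirrors the kind of robustness check described above.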
Discrete jumps between the minima can be observed by visualizing the transformation of the first trajectory into these ICs (Fig.~\ref{fig:io-to-tica}d).
400
400
We thus assume that our TICA-transformed backbone torsion features describe one or more metastable processes.
401
401
402
-
We demonstrate how to apply TICA, suggest how to interpret the projected coordinates, and compare the results to other dimension reduction techniques in Notebook (02).
402
+
We demonstrate how to apply TICA, suggest how to interpret the projected coordinates, and compare the results to other dimension reduction techniques in notebook~02.
403
403
404
404
\begin{figure}
405
405
\includegraphics{figure_3}
@@ -417,8 +417,8 @@ \subsection{Discretization}
417
417
which can greatly facilitate the decomposition of our system into the discrete Markovian states necessary for MSM estimation.
418
418
Here, we use the $k$-means algorithm to segment the four-dimensional TICA space into $k=75$ cluster centers.
419
419
The number of cluster centers has been chosen to optimize the VAMP-2 score in a manner identical to how the feature selection was carried out above,
420
-
which is shown in the showcase Notebook (00).
421
-
A detailed comparison between different clustering techniques is provided in Notebook (02).
420
+
which is shown in the showcase notebook~00.
421
+
A detailed comparison between different clustering techniques is provided in notebook~02.
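As a rough illustration of the discretization step, the following sketch implements plain $k$-means (Lloyd's algorithm) on a handful of two-dimensional points. It is not the PyEMMA implementation: the function names, the toy points, and the fixed initial centers are assumptions chosen to keep the example deterministic, and it uses $k=2$ rather than the $k=75$ centers used in the text.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the nearest center for every point (the discrete trajectory)."""
    return [
        min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
        for p in points
    ]


def kmeans(points, centers, max_iter=100):
    """Alternate assignment and center updates until the centers stop moving."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(old)  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


# Toy data: two well-separated groups standing in for projected coordinates.
points = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (4.0, 4.0), (4.2, 4.0), (4.0, 4.2)]
centers, dtraj = kmeans(points, centers=[(0.0, 0.0), (1.0, 1.0)])
print(centers, dtraj)
```

The returned list of labels plays the role of a discrete trajectory, which is the input required for MSM estimation.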
422
422
423
423
\subsection{MSM estimation and validation}
424
424
@@ -451,7 +451,7 @@ \subsection{MSM estimation and validation}
451
451
and shows that the MSM we have estimated at lag time $\tau=0.5$~ns indeed predicts the
452
452
long-timescale behavior of our system within error (blue/shaded area).
453
453
454
-
In Notebook (03), we demonstrate in detail how to estimate and validate MSMs with PyEMMA.
454
+
In notebook~03, we demonstrate in detail how to estimate and validate MSMs with PyEMMA.
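The core of MSM estimation can be sketched in a few lines: count transitions in a discrete trajectory at a chosen lag time and normalize each row. The sketch below assumes a simple non-reversible maximum-likelihood estimate on a made-up two-state trajectory. The function names are hypothetical, and estimators in practice typically add reversibility constraints, connectivity checks, and error estimates. For two states, the second eigenvalue equals the trace minus one, which gives the implied timescale $t_2 = -\tau / \ln|\lambda_2|$ directly.

```python
import math


def count_matrix(dtraj, n_states, lag):
    """Sliding-window transition counts c_ij at the given lag (in steps)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i][j] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize the counts; assumes every state has outgoing counts."""
    return [[c / sum(row) for c in row] for row in counts]


# Made-up discrete trajectory that hops between two states.
dtraj = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
lag = 1

C = count_matrix(dtraj, n_states=2, lag=lag)
T = transition_matrix(C)

# Two-state case: eigenvalues are 1 and trace(T) - 1.
lambda_2 = T[0][0] + T[1][1] - 1.0
timescale = -lag / math.log(abs(lambda_2))
print(T, timescale)
```

Repeating this estimate at several lag times and checking that the implied timescale levels off is the usual basis for choosing the lag time before validating the model.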
455
455
456
456
\subsection{Analyzing the MSM}
457
457
@@ -532,7 +532,7 @@ \subsection{Analyzing the MSM}
532
532
The transition network can be additionally visualized by plotting representative structures of the five metastable states $\mathcal{S}_{(1-5)}$ according to their committor probability (Fig.~\ref{fig:tpt-network}).
533
533
It is easy to see from this depiction that the dominant pathway from $\mathcal{S}_2$ to $\mathcal{S}_4$ proceeds through $\mathcal{S}_5$.
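The committor used to order the states can be illustrated on a small example. The sketch below computes the forward committor $q^+_i$, the probability of reaching a target set before a source set when starting in state $i$, by fixed-point iteration of $q^+_i = \sum_j T_{ij} q^+_j$ on the intermediate states. The four-state transition matrix is hypothetical and unrelated to the five metastable states in the text, and the function name is an assumption.

```python
def forward_committor(T, source, target, tol=1e-12, max_iter=100000):
    """Forward committor by fixed-point iteration.

    Boundary conditions: q = 0 on the source set, q = 1 on the target set.
    Intermediate states are updated until the largest change drops below tol.
    """
    n = len(T)
    q = [1.0 if i in target else 0.0 for i in range(n)]
    for _ in range(max_iter):
        q_new = [
            q[i]
            if (i in source or i in target)
            else sum(T[i][j] * q[j] for j in range(n))
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(q, q_new)) < tol:
            return q_new
        q = q_new
    return q


# Hypothetical linear chain 0 <-> 1 <-> 2 <-> 3 with nearest-neighbor hops.
T = [
    [0.9, 0.1, 0.0, 0.0],
    [0.1, 0.8, 0.1, 0.0],
    [0.0, 0.1, 0.8, 0.1],
    [0.0, 0.0, 0.1, 0.9],
]
q = forward_committor(T, source={0}, target={3})
print(q)
```

Sorting states by their committor value places them along the reaction coordinate from source to target, which is how a transition network layout of this kind is typically arranged.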
534
534
535
-
More details about (spectral) properties of MSMs and how to analyze them with PyEMMA are discussed in Notebook (04) and Notebook (05).
535
+
More details about (spectral) properties of MSMs and how to analyze them with PyEMMA are discussed in notebook~04 and notebook~05.
536
536
537
537
\subsection{Connecting the MSM with experimental data}
538
538
@@ -568,12 +568,12 @@ \subsection{Connecting the MSM with experimental data}
568
568
We see that the predicted relaxation signal has a much larger amplitude for the nonequilibrium initialization,
569
569
making it more likely to be experimentally measurable.
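The relaxation part of such a prediction can be sketched as follows: an initial distribution $p_0$ is propagated with the transition matrix, and the expectation of an observable $a$ is evaluated at each step, $\langle a \rangle_k = p_0^\top T^k a$. The two-state matrix, the observable values, and the function names below are assumptions for illustration only. The example covers only relaxation of an ensemble average, not equilibrium correlation functions.

```python
def propagate(p, T):
    """One step of p_{k+1} = p_k T for a row-stochastic matrix T."""
    n = len(T)
    return [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]


def relaxation(p0, T, observable, n_steps):
    """Expectation of the observable after 0, 1, ..., n_steps propagation steps."""
    p, signal = list(p0), []
    for _ in range(n_steps + 1):
        signal.append(sum(pi * a for pi, a in zip(p, observable)))
        p = propagate(p, T)
    return signal


# Hypothetical two-state model with stationary distribution (2/3, 1/3).
T = [[0.9, 0.1], [0.2, 0.8]]
observable = [0.0, 1.0]  # made-up per-state value of an experimental observable

from_equilibrium = relaxation([2.0 / 3.0, 1.0 / 3.0], T, observable, n_steps=50)
from_state_0 = relaxation([1.0, 0.0], T, observable, n_steps=50)

amplitude_eq = max(from_equilibrium) - min(from_equilibrium)
amplitude_noneq = max(from_state_0) - min(from_state_0)
print(amplitude_eq, amplitude_noneq)
```

Starting from the stationary distribution, the ensemble average stays constant, whereas starting far from equilibrium produces a decaying signal whose amplitude depends on how different the initial and stationary averages are.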
570
570
571
-
In addition to a detailed demonstration of the above, Notebook (06) demonstrates how to compute J-couplings and dynamic fingerprints from MSMs.
571
+
In addition to a detailed demonstration of the above, notebook~06 demonstrates how to compute J-couplings and dynamic fingerprints from MSMs.
572
572
573
573
\subsection{Summary}
574
574
575
575
In this section, we have summarized how to conduct an MSM-based analysis of biomolecular dynamics data using PyEMMA.
576
-
For the full analysis, please refer to the first notebook (00).
576
+
For the full analysis, please refer to the first notebook~00.
577
577
All notebooks as well as detailed installation instructions are available on \githubrepository{}.
578
578
579
579
\subsection{Modeling large systems}
@@ -597,7 +597,7 @@ \subsection{Modeling large systems}
597
597
we explain how to deal with those in the tutorials (notebook~01).
598
598
599
599
More details on how to model complex systems with the techniques presented here are described, e.g., by~\cite{plattner_protein_2015,plattner_complete_2017}.
600
-
We further examine some symptoms that may indicate problematic or difficult datasets, and demonstrate how to deal with them in Notebook (08).
600
+
We further examine some symptoms that may indicate problematic or difficult datasets, and demonstrate how to deal with them in notebook~08.