Commit b6d3fbe

adding TICA text (#136)
1 parent a1488d0 commit b6d3fbe

File tree

1 file changed: +32 −7 lines changed

manuscript/manuscript.tex

Lines changed: 32 additions & 7 deletions
@@ -112,7 +112,7 @@ \subsection{Markov state models}
 
 Markov state modeling is a mathematical framework for the analysis of time-series data, often but not limited to high-dimensional MD simulation datasets.
 In its standard formulation, the creation of a Markov state model involves decomposing the phase or configuration space occupied by a system into a set of disjoint, discrete states,
-and a transition matrix $P(\tau) = [p_{ij}(\tau)]$ denoting the conditional probability of finding the system in state $j$ at time $t+\tau$ given that it was in state $i$ at time $t$.
+and a transition matrix $\mathbf{P}(\tau) = [p_{ij}(\tau)]$ denoting the conditional probability of finding the system in state $j$ at time $t+\tau$ given that it was in state $i$ at time $t$.
 Let us make two remarks to avoid common misconceptions:
 \begin{enumerate}
 \item Equilibrium:
@@ -129,8 +129,8 @@ \subsection{Markov state models}
 
 In order to create a Markov state model for a dynamical system, each data point in the time series is assigned to a state.
 Given an appropriate lag time, every pairwise transition at that lag time is counted and stored in a count matrix.
-Then, the count matrix is converted to a row-stochastic transition probability matrix $P$, which is defined for the specified lag time.
-For MD simulations in equilibrium, $P$ should obey detailed balance which is enforced by constraining the estimation of $P$ to the following equations:
+Then, the count matrix is converted to a row-stochastic transition probability matrix $\mathbf{P}$, which is defined for the specified lag time.
+For MD simulations in equilibrium, $\mathbf{P}$ should obey detailed balance, which is enforced by constraining the estimation of $\mathbf{P}$ to the following equations:
 \begin{equation}
 \label{eq:balance}
 \pi_i p_{ij} = \pi_j p_{ji},
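The estimation procedure described in this hunk (count pairwise transitions at a lag time, then row-normalize) can be sketched in a few lines of numpy. This is a minimal illustration, assuming a single discrete trajectory; the function name `estimate_msm` and the count-symmetrization shortcut for detailed balance are ours, not the manuscript's maximum-likelihood reversible estimator:

```python
import numpy as np

def estimate_msm(dtraj, n_states, lag):
    """Count transitions at the given lag time and row-normalize
    into a transition probability matrix P (a naive sketch)."""
    C = np.zeros((n_states, n_states))
    for t in range(len(dtraj) - lag):
        C[dtraj[t], dtraj[t + lag]] += 1.0
    # Symmetrizing the counts makes the resulting P obey detailed
    # balance (a crude stand-in for the maximum-likelihood
    # reversible estimators used in practice).
    C = 0.5 * (C + C.T)
    return C / C.sum(axis=1, keepdims=True)

# Toy discrete trajectory over 3 states (ours, for illustration).
dtraj = np.array([0, 0, 1, 1, 0, 2, 2, 1, 0, 0])
P = estimate_msm(dtraj, n_states=3, lag=1)
print(P.sum(axis=1))  # each row sums to 1
```

With symmetrized counts, the stationary distribution is proportional to the row sums of the count matrix, so $\pi_i p_{ij} = \pi_j p_{ji}$ holds by construction.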
@@ -149,10 +149,10 @@ \subsection{Markov state models}
 \end{equation}
 When the ITS become approximately constant with the lag time, we say that our timescales have converged and choose the smallest lag time with the converged timescales in order to maximize the model's temporal resolution.
 
-Once we have used the ITS to choose the lag time, we can check whether a given transition probability matrix $T(\tau)$ is approximately Markovian using the Chapman-Kolmogorov (CK) test~\cite{noe-folding-pathways}.
+Once we have used the ITS to choose the lag time, we can check whether a given transition probability matrix $\mathbf{P}(\tau)$ is approximately Markovian using the Chapman-Kolmogorov (CK) test~\cite{noe-folding-pathways}.
 The CK property for a Markovian matrix is,
 \begin{equation}
-P(k \tau) = P^k(\tau),
+\mathbf{P}(k \tau) = \mathbf{P}^k(\tau),
 \end{equation}
 where the left-hand side of the equation corresponds to an MSM estimated at lag time $k\tau$, where $k$ is an integer larger than~$1$, whereas the right-hand side of the equation is our estimated MSM transition probability matrix to the $k^\textrm{th}$ power.
 By assessing how well the approximated transition probability matrix adheres to the CK property, we can validate the appropriateness of the Markovian assumption for the model.
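The ITS and CK-test formulas in this hunk translate directly into numpy; a minimal sketch with helper names of our choosing (`implied_timescales`, `ck_deviation`) and a toy matrix of ours:

```python
import numpy as np

def implied_timescales(P, lag):
    """ITS t_i = -lag / ln(lambda_i) for the eigenvalues below 1."""
    evals = np.sort(np.linalg.eigvals(P).real)[::-1]
    return -lag / np.log(evals[1:])

def ck_deviation(P_tau, P_ktau, k):
    """Largest elementwise deviation between P(k*tau) and P(tau)^k."""
    return np.abs(np.linalg.matrix_power(P_tau, k) - P_ktau).max()

# Toy 2-state transition matrix with eigenvalues 1 and 0.7.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
print(implied_timescales(P, lag=1))  # [-1/ln(0.7)] ~ [2.8]
# For an exactly Markovian matrix, the CK deviation vanishes.
print(ck_deviation(P, np.linalg.matrix_power(P, 3), 3))  # 0.0
```

In practice `P_ktau` would come from a second MSM estimated at lag $k\tau$, and the deviation would be small but nonzero.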
@@ -161,13 +161,13 @@ \subsection{Markov state models}
 The highest eigenvalue, $\lambda_1(\tau)$, is unique and equal to $1$.
 Its corresponding left eigenvector is the stationary distribution, $\bm{\pi}$:
 \begin{equation}
-\bm{\pi}^\top P(\tau) = \bm{\pi}^\top.
+\bm{\pi}^\top \mathbf{P}(\tau) = \bm{\pi}^\top.
 \end{equation}
 
 The subsequent eigenvalues $\lambda_{i>1}(\tau)$ are real with absolute values less than~$1$ and are related to the \emph{characteristic} or \emph{implied} timescales of dynamical processes within the system (eq.~\ref{eq:its}).
 The dynamical processes themselves (for $i>1$) are encoded by the right eigenvectors $\bm{\psi}_i$,
 \begin{equation}
-P(\tau)\bm{\psi}_i = \lambda_i(\tau) \bm{\psi}_i,
+\mathbf{P}(\tau)\bm{\psi}_i = \lambda_i(\tau) \bm{\psi}_i,
 \end{equation}
 where the eigenvalue-eigenvector pairs are indexed in decreasing order.
 The coefficients of the eigenvectors represent the flux into and out of the Markov states that characterize the corresponding process.
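The eigenvalue relations in this hunk can be checked numerically with a plain eigendecomposition; a small sketch using a 3-state example matrix of ours:

```python
import numpy as np

# Small row-stochastic, reversible example matrix (ours, for illustration).
P = np.array([[0.90, 0.10, 0.00],
              [0.10, 0.80, 0.10],
              [0.00, 0.20, 0.80]])

# Left eigenvectors of P are right eigenvectors of P^T; the eigenvector
# for eigenvalue 1, normalized to sum to 1, is the stationary distribution.
evals, left = np.linalg.eig(P.T)
order = np.argsort(evals.real)[::-1]
left = left.real[:, order]
pi = left[:, 0] / left[:, 0].sum()
print(np.allclose(pi @ P, pi))  # True: pi^T P = pi^T

# The right eigenvectors psi_i encode the dynamical processes (i > 1).
evals_r, psi = np.linalg.eig(P)
```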
@@ -197,6 +197,31 @@ \subsection{Variational approach and TICA}
 However, the MSM lag time cannot be optimized using VAMP,
 and must be chosen using a separate validation as described above~\cite{husic2017note}.
 
+Our recommended method for dimensionality reduction, TICA, is a particular implementation of the VAC.
+To apply TICA, we need to compute instantaneous ($\mathbf{C}(0)$) and time-lagged ($\mathbf{C}(\tau)$) covariance matrices with elements
+\begin{eqnarray}
+c_{ij}(0) & = & \left\langle \tilde{x}_i(t) \; \tilde{x}_j(t) \right\rangle_t \\
+c_{ij}(\tau) & = & \left\langle \tilde{x}_i(t) \; \tilde{x}_j(t + \tau) \right\rangle_t,
+\end{eqnarray}
+where $\tilde{x}_i(t)$ denotes the $i^\textrm{th}$ feature at time $t$ after the mean has been removed.
+Then, we can solve the generalized eigenvalue problem
+\begin{equation}
+\mathbf{C}(\tau) \, \mathbf{u}_i = \mathbf{C}(0) \, \lambda_i(\tau) \, \mathbf{u}_i
+\end{equation}
+to obtain independent component directions $\mathbf{u}_i$, which approximate the reaction coordinates of the system,
+where the eigenvalue--independent-component pairs are ordered by decreasing eigenvalue.
+
+Dimensionality reduction is achieved by projecting the (mean-free) features $\tilde{\mathbf{x}}(t)$
+onto the leading $d$ independent components $\mathbf{U}_d=[\mathbf{u}_1 \dots \mathbf{u}_d]$,
+\begin{equation}
+\mathbf{y}(t) = \mathbf{U}_d^\top \tilde{\mathbf{x}}(t),
+\end{equation}
+while retaining the kinetic variance
+\begin{equation}
+\textrm{KV}_d = \sum\limits_{i=1}^d \lambda_i^2(\tau);
+\end{equation}
+the total kinetic variance is the sum of the squares of all eigenvalues.
+
 \subsection{Software and installation}
 
 \begin{figure}
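The TICA recipe added in this commit (mean removal, covariance matrices, generalized eigenvalue problem, projection, kinetic variance) maps onto a short numpy/scipy sketch. The function name `tica` and the toy data are ours, and symmetrizing $\mathbf{C}(\tau)$ is an assumption appropriate for reversible dynamics, not necessarily the manuscript's production implementation:

```python
import numpy as np
from scipy.linalg import eigh

def tica(X, lag, dim):
    """TICA sketch: covariances of mean-free features, generalized
    eigenvalue problem C(tau) u = lambda C(0) u, and projection onto
    the leading `dim` independent components."""
    Xc = X - X.mean(axis=0)                  # remove the mean
    n = len(Xc) - lag
    C0 = Xc[:n].T @ Xc[:n] / n               # instantaneous covariance C(0)
    Ct = Xc[:n].T @ Xc[lag:lag + n] / n      # time-lagged covariance C(tau)
    Ct = 0.5 * (Ct + Ct.T)                   # symmetrize (reversible estimate)
    evals, U = eigh(Ct, C0)                  # generalized eigenvalue problem
    order = np.argsort(evals)[::-1]          # sort eigenvalues descending
    evals, U = evals[order], U[:, order]
    Y = Xc @ U[:, :dim]                      # projected coordinates y(t)
    kv = np.sum(evals[:dim] ** 2)            # retained kinetic variance KV_d
    return Y, evals, kv

# Toy feature trajectory: a 3-dimensional random walk (ours, for illustration).
rng = np.random.default_rng(0)
X = np.cumsum(rng.standard_normal((500, 3)), axis=0)
Y, evals, kv = tica(X, lag=5, dim=2)
print(Y.shape)  # (500, 2)
```

`scipy.linalg.eigh` returns eigenvalues in ascending order, hence the explicit re-sorting so that the leading components carry the slowest processes.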
