Commit 16580bc

Added citation on osprey, discuss linear approach
fixes #148 [ci skip]
1 parent 7a94307 commit 16580bc

2 files changed: 28 additions, 0 deletions

manuscript/literature.bib

Lines changed: 11 additions & 0 deletions
@@ -677,3 +677,14 @@ @article{banushkina_nonparametric_2015
 year = {2015},
 pages = {184108}
 }
+
+@article{husic-optimized,
+title={Optimized parameter selection reveals trends in Markov state models for protein folding},
+author={Husic, Brooke E and McGibbon, Robert T and Sultan, Mohammad M and Pande, Vijay S},
+journal={The Journal of chemical physics},
+volume={145},
+number={19},
+pages={194103},
+year={2016},
+publisher={AIP Publishing}
+}

manuscript/manuscript.tex

Lines changed: 17 additions & 0 deletions
@@ -243,6 +243,23 @@ \subsection{The PyEMMA workflow}
 
 \subsection{Feature selection}
 
+In the workflow there are multiple hyperparameters to be chosen by the modeler. In our approach we optimize one
+parameter at the current stage of the pipeline and continue to the next stage once a good choice has been found. This
+requires the researcher to understand the consequences of suboptimal decisions for the final result. For instance,
+a non-converged clustering could lump together states that should be separated from each other.
+
+There also exist automated approaches that optimize all hyperparameters of the pipeline using a cross-validation
+scheme~\cite{husic-optimized}. In these approaches the researcher is still required to understand modeling choices, such as
+sane ranges for parameters, to avoid wasting computational time on exploring meaningless regions of the
+hyperparameter space.
+In the sequential approach, one can fall back to the previous step if one finds a bad result at any following stage.
+This greatly reduces the computational effort and leads to a better understanding of the final model.
+
+%However one will not be able to find a good model based on partially bad modeling choices. E.g. a hidden Markov state
+%model could partially correct bad clusterings, but
+
+\subsection{Feature selection}
+
 \begin{figure}
 \includegraphics{figure_2}
 \caption{Example analysis of the conformational dynamics of a pentapeptide backbone: (a)~The Trp-Leu-Ala-Leu-Leu pentapeptide in licorice representation~\cite{vmd}.
