resolved conflicts

JIMonroe · JIMonroe · commit 177ec5c363d3 · 2018-07-31T17:20:14.000-07:00
diff --git a/paper/basic_training.tex b/paper/basic_training.tex
@@ -176,7 +176,7 @@ \subsubsection{Key concepts}
 
 Classical molecular models typically consist of point particles carrying mass and electric charge, as well as potentially additional interactions such as van der Waals interactions and bonded interactions of various types.
 Sometimes it is much more efficient to freeze the internal degrees of freedoms and treat the molecule as a rigid body where the particles do not change their relative orientation as the whole body moves; this is commonly done, for example, for rigid models of the water molecule.
-Due to the high frequency of the O-H vibrations, accurately treating water classically would require a solving the equations of motion with a very small timestep, so for computational efficiency water is often instead treated as a rigid body.
+Due to the high frequency of the O-H vibrations, accurately treating water classically would require solving the equations of motion with a very small timestep, so for computational efficiency water is often instead treated as a rigid body.
 Keeping specified objects rigid in a simulation involves applying holonomic constraints, where the rigidity is defined by imposing a minimal set of fixed bond lengths and angles through iterative procedures during the numerical integration of the equation of motion (see Section~\ref{sec:integrators} for more on constraints and integrators).
 It is important to understand the concept of point particles, rigid bodies and constraints.
 
@@ -225,9 +225,10 @@ \subsubsection{Key concepts}
 In this sense, the laws of thermodynamics provide us rigorous sanity checks in addition to many useful mathematical relations for computing properties.
 Basic thermodynamic principles thus also dictate proper simulation protocols and associated best practices.
 
-The concept of the thermodynamic limit is particularly important.
+The concept of the thermodynamic limit is particularly important here.
 Specifically, as the size of a finite system is increased, keeping the particle number density roughly constant, at some point it is said to reach the thermodynamic limit where its behavior is bulk-like and no longer depends on the extent of the system.
-Thus, small systems will exhibit unique behaviors that reflect their microscopic size, but sufficient large systems are said to have reached the thermodynamic limit and macroscopic thermodynamics applied.
+Thus, small systems will exhibit unique behaviors that reflect their microscopic size, but sufficiently large systems are said to have reached the thermodynamic limit and macroscopic thermodynamics applied.
+This is due to the fact that the effect of interfaces or boundaries have largely been removed, and, more importantly, that averages of system properties are now over a sufficiently large number of molecules that any instantaneous snapshot of the system roughly corresponds to average behavior (i.e. fluctuations in properties become negligible with increasing system size).
 
 Although we usually think of thermodynamics applying macroscopically and statistical mechanics applying on the microsopic level, it is important to remember that the laws of thermodynamics still hold \textit{on average} regardless of the length scale.
 That is, a molecule in contact with a thermal bath will exchange energy with the bath, but its average energy is a well-defined constant.
@@ -236,8 +237,6 @@ \subsubsection{Key concepts}
 Importantly, as long as we have carefully defined our ensemble and thermodynamic path, we can apply the powerful relationships of thermodynamics to more easily calculate many properties of interest.
 For instance, one may use molecular dynamics to efficiently numerically integrate the Clapeyron equation and construct equations of state along phase coexistence curves~\cite{Kofke1993, GonzalezSalgado2010}.
 
-\todo[inline, color={green!20}]{JIM: I also want to mention that some thermodynamic properties are only truly well-defined for many molecules, not for single molecules or isolated molecules. This is a tricky discussion, though. DLM: What did you have in mind? } 
-
 \subsubsection{Books}
 Equilibrium thermodynamics is taught in most undergraduate programs in physics, chemistry, biochemistry and various engineering disciplines.
 Depending on the background, the practitioner can choose one or more of the following books to either learn or refresh their basic knowledge of thermodynamics. 
@@ -256,19 +255,12 @@ \subsection{Classical statistical mechanics}
 \subsubsection{Key concepts}
 Key concepts from statistical mechanics are particularly important and prevalent in molecular simulations:
 \begin{itemize}
-\item Ensembles, distribution functions for different ensembles. Equivalence of ensembles
-\item What equilibrium means and the difference between equilibrium and non-equilibrium.
-    For instance, what is usually called an ``equilibrium trajectory'' generally will not embody a good sample of the equilibrium ensemble due to insufficient sampling.  
-        On the other hand, truly non-equilibrium conditions such as driven transitions and relaxation are fundamentally different. 
-        Note that relaxation can occur to the equilibrium ensemble or a non-equilibrium condition (e.g., steady state).
-\item Clarify differences between nonequilibrium ensembles: driven
-    nonequilibrium steady-state, systems driven out of equilibrium by a time-dependent field, systems initially out of equilibrium but relaxing
-        to equilibrium
-\item Time averages and ensemble averages
 \item Fluctuations
-\item Correlation functions
+\item Definitions of various ensembles
+\item Time averages and ensemble averages
+\item Equilibrium versus non-equilibrium
 \end{itemize}
-\todo[inline, color={yellow!20}]{DLM: This list is bothering me because it is longer than the others, has more statements in it, and doesn't totally connect with what's in this section. Not sure what to do with it.}
+\todo[inline, color={yellow!20}]{DLM: This list is bothering me because it is longer than the others, has more statements in it, and doesn't totally connect with what's in this section. Not sure what to do with it. JIM: I've proposed changes. Helpful or not?}
 
 Traditional discussions of classical statistical mechanics, especially concise ones, tend to focus first or primarily on macroscopic thermodynamics and microscopic \emph{equilibrium} behavior based on the Boltzmann factor, which tells us that configurations $\conf$ occur with (relative) probability $\exp[-U(\conf)/k_B T]$, based on potential energy function $U$ and temperature $T$ in Kelvin units.  
 Dynamical phenomena and their connection to equilibrium tend to be treated later in discussion, if at all.  
@@ -545,7 +537,7 @@ \subsection{Periodic boundary conditions}
 This is most easily seen by imagining the placement of a solute in a periodic simulation box.
 The solute will be replicated in all of the surrounding periodic images.
 The concentration of solute is thus exactly one per the volume of the box.
-Although proper selection of non-bonded cutoffs will guarantee that these solutes do not directly interact, they may indirectly interact through their perturbation of nearby solvent.
+Although proper selection of non-bonded cutoffs will guarantee that these solutes do not directly interact (hence the common claim that such systems are at infinite dilution), they may indirectly interact through their perturbation of nearby solvent.
 If the solvent does not reach a bulk-like state between solutes, the simulation will still suffer from obvious finite-size effects.
 
 Macroscopic, lab-scale systems, or bulk systems, typically consist of multiple moles of atoms/molecules and thus from a simulation perspective are effectively infinite systems.
@@ -555,7 +547,7 @@ \subsection{Periodic boundary conditions}
 One should always check that unexpected long-range correlations (i.e. on the length-scale of the simulation box) do not exist in molecular structure, spatial position, or orientation.
 It should also be recognized that periodic boundary conditions innately change the definition of the system and the properties calculated from it.
 Many derivations, especially those involving transport properties, such as diffusivity \citep{Yeh2004}, assume infinite and not periodic boundary conditions. 
-\todo[inline, color={green!20}]{JIM: Other examples to cite? I think there's some examples also involving calculations of entropy, but I'll have to check this}
+%\todo[inline, color={green!20}]{JIM: Other examples to cite? I think there's some examples also involving calculations of entropy, but I'll have to check this. Let this go for current version, maybe update later.}
 The resulting differences in seemingly well-known expressions for computing properties of interest are often subtle, yet may have a large impact on results.
 Such considerations should be kept in mind when comparing results between simulations and with experiment.
 
@@ -581,6 +573,7 @@ \subsection{Main steps of a molecular dynamics simulation}
 If such documents do not exist, considerable care should be exercised to determine best practices from the literature.
 In some cases, freely available tools are constructing systems are available and can be a reasonable option (though their mention here should not be taken as an endorsement that they necessarily encapsulate best practices).
 Examples include tools for constructing specific crystal structures, proteins, and lipid membranes, such as Moltemplate, Packmol, and Atomsk.
+\todo[inline, color={green!20}]{JIM: Need to cite these packages!}
 The goal of all of these tools, and system preparation in general, is to create an accurate representation of the system of interest that can be interpreted by the desired simulation package.
 It is further desirable that this starting structure resemble the equilibrium structure of the system at the thermodynamic state point of interest.
 For instance, highly energetically unfavorable configurations of the system, such as blatant atomic overlaps, should be avoided.
@@ -606,6 +599,7 @@ \subsection{Main steps of a molecular dynamics simulation}
 Once the kinetic and potential energies fluctuate around constant values, the thermostat may be removed (if an NVE simulation is desired) and a snapshot selected that is simultaneously as close to the average kinetic and potential energies as possible.
 This snapshot, containing both positions and velocities may be used to then start an NVE simulation that will correspond to a temperature close to that which is desired.
 This is necessary due to the fact that only the average temperature is obtained through coupling to a thermostat (see Section~\ref{sec:thermostats}), and the temperature fluctuates with the kinetic energy at each timestep.
+
 Similarly, equilibration in the NPT ensemble is necessary before production in the NVT if an average density consistent with a specific pressure is desired.
 In this case, the system may be scaled to the desired average volume before the production simulation.
 In the above example, the NPT simulation is said to have equilibrated to a specific volume when the dimensions of the simulation box fluctuate around constant values with minimal drift. 
@@ -622,10 +616,15 @@ \subsection{Main steps of a molecular dynamics simulation}
 \begin{figure}[h]
 \centering
 \includegraphics[width=\linewidth]{Equilibration_Workflow.pdf}
-\caption{Common equilibration work-flows}
+\caption{Common equilibration work-flows are shown.
+Many times, the desired state variables of a simulation, such as energy or density, are defined by their averages in an ensemble in which they may fluctuate, such as at fixed temperature or pressure. 
+Such cases follow the first or third equilibration routes.
+When we wish to simulate at a known, fixed density, such a procedure is not necessary, as shown by the second route.
+The last route emphasizes that generally temperature and pressure should be equilibrated separately, even if the production ensemble is NPT.
+} 
 \label{eqworkflow}
 \end{figure}
-\todo[inline, color={yellow!20}]{DLM: Caption needs updating to make clear why you would choose different options here, especially the two different NVT options.}
+\todo[inline, color={yellow!20}]{DLM: Caption needs updating to make clear why you would choose different options here, especially the two different NVT options. JIM: Better?}
 
 Once equilibration is complete, production data may be collected.
 The production simulation is that from which specific properties of the system of interest will be calculated.
@@ -896,6 +895,10 @@ \subsection{Integrators}
 Algorithms to perform this integration take many forms and are usually called integrators.
 Here, we explain the need for integrators, discuss key criteria like energy conservation, and highlight a number of commonly used integrators.
 
+\subsubsection{Desireable integrator properties}
+
+So-called ``good'' integrators contain certain features that are appealing for molecule simulations.
+We start with the most obvious feature, which is that the integrator induces little error in the dynamics.
 Since integration is fundamentally about taking discrete steps to approximate continuous dynamics, this discretization process introduces errors (as can be observed by comparison to analytically soluble problems, like the harmonic oscillator).
 These errors are termed discretization errors, whereas additional errors called truncation errors are also accumulated through loss of precision during computer calculations.
 As will be discussed shortly, there are many strategies for avoiding discretization errors.
@@ -907,7 +910,7 @@ \subsection{Integrators}
 Luckily, this issue may be avoided simply by guaranteeing that the integrator is reversible~\cite{Frenkel:2001:}.
 More details may be found in ~\citet{Tuckerman:1992:}, but basically if the mathematical operator representing the integrator preserves phase space volume, it also satisfies the definition of reversibility: if the operator is applied to propagate forward by $\Delta t$, the starting condition may be recovered by in turn applying the operator to the result using $- \Delta t$ as the timestep.
 
-Energy conservation is imperative in simulating the microcanonical (NVE) ensemble.
+Energy conservation is also a desirable integrator property and is imperative in simulating the microcanonical (NVE) ensemble.
 This is a much trickier property to examine, and varies with different integrators.
 For instance, some classes of integrators better-preserve energy over short times, while others better-preserve energy at long times.
 The latter is generally preferred, though it may necessitate other sacrifices such as greater energy fluctuations away from the desired, exact system energy.
@@ -934,14 +937,22 @@ \subsection{Integrators}
 This has the added benefit of also reducing truncation error, which is proportional to the number of timesteps taken.
 It is worth noting that the issue of integrator choice versus timestep is not always simple; in some cases, a ``better'' integrator might allow longer timesteps but also carry an additional computational cost that outweighs the benefits of an increased timestep.
 
-The most commonly used integrators are variants of the Verlet algorithm (e.g. Velocity Verlet or Leapfrog). 
-Detailed discussion and derivation of such integrators may be found in section 7.3 of \citet{LeachBook} and 4.3 of \citet{Frenkel:2001:}.
-Such integrators are not applicable, however, for simulations involving stochastic dynamics.
-These simulations include application of a random force to each particle, and represent discretizations of either Langevin or Brownian dynamics.
+\subsubsection{Deterministic integrators}
+
+The most commonly used integrators are variants of the Verlet algorithm (e.g. Velocity Verlet or Leapfrog).
+Such integrators include terms for updating particle positions up to the order of the square of the timestep (i.e. they include forces).
+Inclusion of higher-order terms is favored in other families of algorithms, but generally leads to greater complexity and reduced computational efficiency at only marginal improvement in accuracy.
+Detailed discussion and derivation of many common integrators may be found in section 7.3 of \citet{LeachBook} and 4.3 of \citet{Frenkel:2001:}.
+Such integrators are not applicable, however, for simulations involving stochastic dynamics, as discussed below.
+
+\subsubsection{Stochastic integrators}
+
+Stochastic dynamics simulations include application of a random force to each particle, and represent discretizations of either Langevin or Brownian dynamics.
 A detailed description of such stochastic dynamics may be found in McQuarrie~\cite{McQuarrieStatMechBook}, Chapter 20.
 As detailed in section \ref{sec:thermostats}, it is common to apply temperature control through the use of Langevin dynamics.
 As a brief aside, this highlights the fact that the choice of integrator is often tightly coupled to the choice of thermostat and/or barostat.
 Different combinations may demonstrate better performance and for expanded ensemble methods it is absolutely necessary to utilize an integrator specific to the selected temperature- or pressure-control algorithm.
+
 For simulating Langevin or other stochastic dynamics, the presence of random forces usually prevents the integrator from preserving phase-space volume.
 However, through fortuitous cancellation of error, some stochastic integration schemes may achieve preservation of \textit{part} of the full phase-space (i.e. configurations \textit{or} velocities are preserved)~\cite{Fass2018}.
 Though this may sound dire, in practice this is easily remedied through an appropriate choice of timestep - this just might need to be shorter or longer depending on the integration scheme.
@@ -950,11 +961,11 @@ \subsection{Integrators}
 If dynamics are of interest, the dependence of these properties on the integrator parameters (e.g. friction factor) should be assessed~\cite{Basconi:2013:J.Chem.TheoryComput.}.
 \todo[inline, color={yellow!20}]{DLM: I need to review the paragraphing here; some of these are rather long and cover a lot. }
 
-\todo[inline, color={green!20}]{JIM: Happy to introduce Trotter decompositions, but is it really necessary? Also, we need to add information on constrained dynamics.  Anything else?  Needs more details, or just send people to citations?}
-\todo[inline, color={yellow!20}]{DLM: I don't think necessary to introduce, but in favor of adding citations to useful work/additional resources.}
+%\todo[inline, color={green!20}]{JIM: Happy to introduce Trotter decompositions, but is it really necessary? Also, we need to add information on constrained dynamics.  Anything else?  Needs more details, or just send people to citations?}
+%\todo[inline, color={yellow!20}]{DLM: I don't think necessary to introduce, but in favor of adding citations to useful work/additional resources.}
 
 \subsubsection{How to choose an appropriate timestep?}
-\todo[inline, color={yellow!20}]{DLM: Above should be broken into subsubsections for consistency with thermostats/barostats and because a subsection with only one subsubsection doesn't make sense.}
+%\todo[inline, color={yellow!20}]{DLM: Above should be broken into subsubsections for consistency with thermostats/barostats and because a subsection with only one subsubsection doesn't make sense.}
 
 The maximum timestep for a molecular dynamics simulation is dependent on the choice of integrator and the assumptions used in the integrator's derivation.
 For the commonly-used second order integrators such as the Verlet and Leapfrog algorithms, the velocities and accelerations should be approximately constant over the timestep.