You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: paper/basic_training.tex
+39-28Lines changed: 39 additions & 28 deletions
Original file line number
Diff line number
Diff line change
@@ -176,7 +176,7 @@ \subsubsection{Key concepts}
176
176
177
177
Classical molecular models typically consist of point particles carrying mass and electric charge, as well as potentially additional interactions such as van der Waals interactions and bonded interactions of various types.
178
178
Sometimes it is much more efficient to freeze the internal degrees of freedoms and treat the molecule as a rigid body where the particles do not change their relative orientation as the whole body moves; this is commonly done, for example, for rigid models of the water molecule.
179
-
Due to the high frequency of the O-H vibrations, accurately treating water classically would require a solving the equations of motion with a very small timestep, so for computational efficiency water is often instead treated as a rigid body.
179
+
Due to the high frequency of the O-H vibrations, accurately treating water classically would require solving the equations of motion with a very small timestep, so for computational efficiency water is often instead treated as a rigid body.
180
180
Keeping specified objects rigid in a simulation involves applying holonomic constraints, where the rigidity is defined by imposing a minimal set of fixed bond lengths and angles through iterative procedures during the numerical integration of the equation of motion (see Section~\ref{sec:integrators} for more on constraints and integrators).
181
181
It is important to understand the concept of point particles, rigid bodies and constraints.
182
182
@@ -225,9 +225,10 @@ \subsubsection{Key concepts}
225
225
In this sense, the laws of thermodynamics provide us rigorous sanity checks in addition to many useful mathematical relations for computing properties.
226
226
Basic thermodynamic principles thus also dictate proper simulation protocols and associated best practices.
227
227
228
-
The concept of the thermodynamic limit is particularly important.
228
+
The concept of the thermodynamic limit is particularly important here.
229
229
Specifically, as the size of a finite system is increased, keeping the particle number density roughly constant, at some point it is said to reach the thermodynamic limit where its behavior is bulk-like and no longer depends on the extent of the system.
230
-
Thus, small systems will exhibit unique behaviors that reflect their microscopic size, but sufficient large systems are said to have reached the thermodynamic limit and macroscopic thermodynamics applied.
230
+
Thus, small systems will exhibit unique behaviors that reflect their microscopic size, but sufficiently large systems are said to have reached the thermodynamic limit and macroscopic thermodynamics applied.
231
+
This is due to the fact that the effect of interfaces or boundaries have largely been removed, and, more importantly, that averages of system properties are now over a sufficiently large number of molecules that any instantaneous snapshot of the system roughly corresponds to average behavior (i.e. fluctuations in properties become negligible with increasing system size).
231
232
232
233
Although we usually think of thermodynamics applying macroscopically and statistical mechanics applying on the microsopic level, it is important to remember that the laws of thermodynamics still hold \textit{on average} regardless of the length scale.
233
234
That is, a molecule in contact with a thermal bath will exchange energy with the bath, but its average energy is a well-defined constant.
@@ -236,8 +237,6 @@ \subsubsection{Key concepts}
236
237
Importantly, as long as we have carefully defined our ensemble and thermodynamic path, we can apply the powerful relationships of thermodynamics to more easily calculate many properties of interest.
237
238
For instance, one may use molecular dynamics to efficiently numerically integrate the Clapeyron equation and construct equations of state along phase coexistence curves~\cite{Kofke1993, GonzalezSalgado2010}.
238
239
239
-
\todo[inline, color={green!20}]{JIM: I also want to mention that some thermodynamic properties are only truly well-defined for many molecules, not for single molecules or isolated molecules. This is a tricky discussion, though. DLM: What did you have in mind? }
240
-
241
240
\subsubsection{Books}
242
241
Equilibrium thermodynamics is taught in most undergraduate programs in physics, chemistry, biochemistry and various engineering disciplines.
243
242
Depending on the background, the practitioner can choose one or more of the following books to either learn or refresh their basic knowledge of thermodynamics.
Key concepts from statistical mechanics are particularly important and prevalent in molecular simulations:
258
257
\begin{itemize}
259
-
\item Ensembles, distribution functions for different ensembles. Equivalence of ensembles
260
-
\item What equilibrium means and the difference between equilibrium and non-equilibrium.
261
-
For instance, what is usually called an ``equilibrium trajectory'' generally will not embody a good sample of the equilibrium ensemble due to insufficient sampling.
262
-
On the other hand, truly non-equilibrium conditions such as driven transitions and relaxation are fundamentally different.
263
-
Note that relaxation can occur to the equilibrium ensemble or a non-equilibrium condition (e.g., steady state).
264
-
\item Clarify differences between nonequilibrium ensembles: driven
265
-
nonequilibrium steady-state, systems driven out of equilibrium by a time-dependent field, systems initially out of equilibrium but relaxing
266
-
to equilibrium
267
-
\item Time averages and ensemble averages
268
258
\item Fluctuations
269
-
\item Correlation functions
259
+
\item Definitions of various ensembles
260
+
\item Time averages and ensemble averages
261
+
\item Equilibrium versus non-equilibrium
270
262
\end{itemize}
271
-
\todo[inline, color={yellow!20}]{DLM: This list is bothering me because it is longer than the others, has more statements in it, and doesn't totally connect with what's in this section. Not sure what to do with it.}
263
+
\todo[inline, color={yellow!20}]{DLM: This list is bothering me because it is longer than the others, has more statements in it, and doesn't totally connect with what's in this section. Not sure what to do with it. JIM: I've proposed changes. Helpful or not?}
272
264
273
265
Traditional discussions of classical statistical mechanics, especially concise ones, tend to focus first or primarily on macroscopic thermodynamics and microscopic \emph{equilibrium} behavior based on the Boltzmann factor, which tells us that configurations $\conf$ occur with (relative) probability $\exp[-U(\conf)/k_B T]$, based on potential energy function $U$ and temperature $T$ in Kelvin units.
274
266
Dynamical phenomena and their connection to equilibrium tend to be treated later in discussion, if at all.
This is most easily seen by imagining the placement of a solute in a periodic simulation box.
546
538
The solute will be replicated in all of the surrounding periodic images.
547
539
The concentration of solute is thus exactly one per the volume of the box.
548
-
Although proper selection of non-bonded cutoffs will guarantee that these solutes do not directly interact, they may indirectly interact through their perturbation of nearby solvent.
540
+
Although proper selection of non-bonded cutoffs will guarantee that these solutes do not directly interact (hence the common claim that such systems are at infinite dilution), they may indirectly interact through their perturbation of nearby solvent.
549
541
If the solvent does not reach a bulk-like state between solutes, the simulation will still suffer from obvious finite-size effects.
550
542
551
543
Macroscopic, lab-scale systems, or bulk systems, typically consist of multiple moles of atoms/molecules and thus from a simulation perspective are effectively infinite systems.
One should always check that unexpected long-range correlations (i.e. on the length-scale of the simulation box) do not exist in molecular structure, spatial position, or orientation.
556
548
It should also be recognized that periodic boundary conditions innately change the definition of the system and the properties calculated from it.
557
549
Many derivations, especially those involving transport properties, such as diffusivity \citep{Yeh2004}, assume infinite and not periodic boundary conditions.
558
-
\todo[inline, color={green!20}]{JIM: Other examples to cite? I think there's some examples also involving calculations of entropy, but I'll have to check this}
550
+
%\todo[inline, color={green!20}]{JIM: Other examples to cite? I think there's some examples also involving calculations of entropy, but I'll have to check this. Let this go for current version, maybe update later.}
559
551
The resulting differences in seemingly well-known expressions for computing properties of interest are often subtle, yet may have a large impact on results.
560
552
Such considerations should be kept in mind when comparing results between simulations and with experiment.
561
553
@@ -581,6 +573,7 @@ \subsection{Main steps of a molecular dynamics simulation}
581
573
If such documents do not exist, considerable care should be exercised to determine best practices from the literature.
582
574
In some cases, freely available tools are constructing systems are available and can be a reasonable option (though their mention here should not be taken as an endorsement that they necessarily encapsulate best practices).
583
575
Examples include tools for constructing specific crystal structures, proteins, and lipid membranes, such as Moltemplate, Packmol, and Atomsk.
576
+
\todo[inline, color={green!20}]{JIM: Need to cite these packages!}
584
577
The goal of all of these tools, and system preparation in general, is to create an accurate representation of the system of interest that can be interpreted by the desired simulation package.
585
578
It is further desirable that this starting structure resemble the equilibrium structure of the system at the thermodynamic state point of interest.
586
579
For instance, highly energetically unfavorable configurations of the system, such as blatant atomic overlaps, should be avoided.
@@ -606,6 +599,7 @@ \subsection{Main steps of a molecular dynamics simulation}
606
599
Once the kinetic and potential energies fluctuate around constant values, the thermostat may be removed (if an NVE simulation is desired) and a snapshot selected that is simultaneously as close to the average kinetic and potential energies as possible.
607
600
This snapshot, containing both positions and velocities may be used to then start an NVE simulation that will correspond to a temperature close to that which is desired.
608
601
This is necessary due to the fact that only the average temperature is obtained through coupling to a thermostat (see Section~\ref{sec:thermostats}), and the temperature fluctuates with the kinetic energy at each timestep.
602
+
609
603
Similarly, equilibration in the NPT ensemble is necessary before production in the NVT if an average density consistent with a specific pressure is desired.
610
604
In this case, the system may be scaled to the desired average volume before the production simulation.
611
605
In the above example, the NPT simulation is said to have equilibrated to a specific volume when the dimensions of the simulation box fluctuate around constant values with minimal drift.
@@ -622,10 +616,15 @@ \subsection{Main steps of a molecular dynamics simulation}
\caption{Common equilibration work-flows are shown.
620
+
Many times, the desired state variables of a simulation, such as energy or density, are defined by their averages in an ensemble in which they may fluctuate, such as at fixed temperature or pressure.
621
+
Such cases follow the first or third equilibration routes.
622
+
When we wish to simulate at a known, fixed density, such a procedure is not necessary, as shown by the second route.
623
+
The last route emphasizes that generally temperature and pressure should be equilibrated separately, even if the production ensemble is NPT.
624
+
}
626
625
\label{eqworkflow}
627
626
\end{figure}
628
-
\todo[inline, color={yellow!20}]{DLM: Caption needs updating to make clear why you would choose different options here, especially the two different NVT options.}
627
+
\todo[inline, color={yellow!20}]{DLM: Caption needs updating to make clear why you would choose different options here, especially the two different NVT options. JIM: Better?}
629
628
630
629
Once equilibration is complete, production data may be collected.
631
630
The production simulation is that from which specific properties of the system of interest will be calculated.
@@ -896,6 +895,10 @@ \subsection{Integrators}
896
895
Algorithms to perform this integration take many forms and are usually called integrators.
897
896
Here, we explain the need for integrators, discuss key criteria like energy conservation, and highlight a number of commonly used integrators.
898
897
898
+
\subsubsection{Desireable integrator properties}
899
+
900
+
So-called ``good'' integrators contain certain features that are appealing for molecule simulations.
901
+
We start with the most obvious feature, which is that the integrator induces little error in the dynamics.
899
902
Since integration is fundamentally about taking discrete steps to approximate continuous dynamics, this discretization process introduces errors (as can be observed by comparison to analytically soluble problems, like the harmonic oscillator).
900
903
These errors are termed discretization errors, whereas additional errors called truncation errors are also accumulated through loss of precision during computer calculations.
901
904
As will be discussed shortly, there are many strategies for avoiding discretization errors.
@@ -907,7 +910,7 @@ \subsection{Integrators}
907
910
Luckily, this issue may be avoided simply by guaranteeing that the integrator is reversible~\cite{Frenkel:2001:}.
908
911
More details may be found in ~\citet{Tuckerman:1992:}, but basically if the mathematical operator representing the integrator preserves phase space volume, it also satisfies the definition of reversibility: if the operator is applied to propagate forward by $\Delta t$, the starting condition may be recovered by in turn applying the operator to the result using $- \Delta t$ as the timestep.
909
912
910
-
Energy conservation is imperative in simulating the microcanonical (NVE) ensemble.
913
+
Energy conservation is also a desirable integrator property and is imperative in simulating the microcanonical (NVE) ensemble.
911
914
This is a much trickier property to examine, and varies with different integrators.
912
915
For instance, some classes of integrators better-preserve energy over short times, while others better-preserve energy at long times.
913
916
The latter is generally preferred, though it may necessitate other sacrifices such as greater energy fluctuations away from the desired, exact system energy.
@@ -934,14 +937,22 @@ \subsection{Integrators}
934
937
This has the added benefit of also reducing truncation error, which is proportional to the number of timesteps taken.
935
938
It is worth noting that the issue of integrator choice versus timestep is not always simple; in some cases, a ``better'' integrator might allow longer timesteps but also carry an additional computational cost that outweighs the benefits of an increased timestep.
936
939
937
-
The most commonly used integrators are variants of the Verlet algorithm (e.g. Velocity Verlet or Leapfrog).
938
-
Detailed discussion and derivation of such integrators may be found in section 7.3 of \citet{LeachBook} and 4.3 of \citet{Frenkel:2001:}.
939
-
Such integrators are not applicable, however, for simulations involving stochastic dynamics.
940
-
These simulations include application of a random force to each particle, and represent discretizations of either Langevin or Brownian dynamics.
940
+
\subsubsection{Deterministic integrators}
941
+
942
+
The most commonly used integrators are variants of the Verlet algorithm (e.g. Velocity Verlet or Leapfrog).
943
+
Such integrators include terms for updating particle positions up to the order of the square of the timestep (i.e. they include forces).
944
+
Inclusion of higher-order terms is favored in other families of algorithms, but generally leads to greater complexity and reduced computational efficiency at only marginal improvement in accuracy.
945
+
Detailed discussion and derivation of many common integrators may be found in section 7.3 of \citet{LeachBook} and 4.3 of \citet{Frenkel:2001:}.
946
+
Such integrators are not applicable, however, for simulations involving stochastic dynamics, as discussed below.
947
+
948
+
\subsubsection{Stochastic integrators}
949
+
950
+
Stochastic dynamics simulations include application of a random force to each particle, and represent discretizations of either Langevin or Brownian dynamics.
941
951
A detailed description of such stochastic dynamics may be found in McQuarrie~\cite{McQuarrieStatMechBook}, Chapter 20.
942
952
As detailed in section \ref{sec:thermostats}, it is common to apply temperature control through the use of Langevin dynamics.
943
953
As a brief aside, this highlights the fact that the choice of integrator is often tightly coupled to the choice of thermostat and/or barostat.
944
954
Different combinations may demonstrate better performance and for expanded ensemble methods it is absolutely necessary to utilize an integrator specific to the selected temperature- or pressure-control algorithm.
955
+
945
956
For simulating Langevin or other stochastic dynamics, the presence of random forces usually prevents the integrator from preserving phase-space volume.
946
957
However, through fortuitous cancellation of error, some stochastic integration schemes may achieve preservation of \textit{part} of the full phase-space (i.e. configurations \textit{or} velocities are preserved)~\cite{Fass2018}.
947
958
Though this may sound dire, in practice this is easily remedied through an appropriate choice of timestep - this just might need to be shorter or longer depending on the integration scheme.
@@ -950,11 +961,11 @@ \subsection{Integrators}
950
961
If dynamics are of interest, the dependence of these properties on the integrator parameters (e.g. friction factor) should be assessed~\cite{Basconi:2013:J.Chem.TheoryComput.}.
951
962
\todo[inline, color={yellow!20}]{DLM: I need to review the paragraphing here; some of these are rather long and cover a lot. }
952
963
953
-
\todo[inline, color={green!20}]{JIM: Happy to introduce Trotter decompositions, but is it really necessary? Also, we need to add information on constrained dynamics. Anything else? Needs more details, or just send people to citations?}
954
-
\todo[inline, color={yellow!20}]{DLM: I don't think necessary to introduce, but in favor of adding citations to useful work/additional resources.}
964
+
%\todo[inline, color={green!20}]{JIM: Happy to introduce Trotter decompositions, but is it really necessary? Also, we need to add information on constrained dynamics. Anything else? Needs more details, or just send people to citations?}
965
+
%\todo[inline, color={yellow!20}]{DLM: I don't think necessary to introduce, but in favor of adding citations to useful work/additional resources.}
955
966
956
967
\subsubsection{How to choose an appropriate timestep?}
957
-
\todo[inline, color={yellow!20}]{DLM: Above should be broken into subsubsections for consistency with thermostats/barostats and because a subsection with only one subsubsection doesn't make sense.}
968
+
%\todo[inline, color={yellow!20}]{DLM: Above should be broken into subsubsections for consistency with thermostats/barostats and because a subsection with only one subsubsection doesn't make sense.}
958
969
959
970
The maximum timestep for a molecular dynamics simulation is dependent on the choice of integrator and the assumptions used in the integrator's derivation.
960
971
For the commonly-used second order integrators such as the Verlet and Leapfrog algorithms, the velocities and accelerations should be approximately constant over the timestep.
0 commit comments