For a spontaneous process
and at equilibrium
the most stable configuration of a molecule can be found by minimizing G
Note that we typically minimize the energy E, rather than the free energy G. In
other words, we neglect the entropy effects. We are assuming that most
configurations are similar in entropy, so this effect can be neglected. In
some cases, entropy effects can be calculated and included.
At a minimum of the potential energy surface, the net force on each atom vanishes
stable configurations
Uses
-structural analysis
-model energetics of small conformational changes
-estimate relative enthalpies of binding or strain energy upon binding
-simplify dynamics calculations since minimized structures represent the underlying configurations around which fluctuations occur during dynamics
Calculated energy is relative
-the energy zero is arbitrary
-the total potential energy of different molecules cannot be compared directly
-it is meaningful to compare energies calculated for different configurations
of chemically identical systems
Constraints and restraints can be used to control and direct minimization
Problems
1. typically several different points exist where the forces are zero, and
it is difficult to determine if a particular minimum is the global minimum.
local minima: net zero forces and positive definite second derivative matrix
global minimum: the lowest energy point
saddle point: net zero forces and at least one negative eigenvalue of the second derivative matrix
2. convergence criteria
It is difficult to judge when you are close enough (i.e. have converged) to
the true minimum (which is where forces are zero and second derivative
matrix is positive definite)
target function:
-quantity (function of coordinates) to be minimized
-includes the energy and any external restraining terms to bias the minimization
General strategy for minimization
1. evaluate target function at a given conformation
2. adjust conformation to lower the target function
Efficiency
1. time needed to evaluate target function
2. number of iterations (structural adjustments) needed to converge
Minimizer must determine
1. direction to minimum
2. distance to mininum in that direction
initial direction: negative derivatives of the target function at the current point
-negative derivatives point downhill but not necessarily toward the mininum
-include information on how the derivatives change . greater efficiency
Line search:
1-dimensional minimization along a given direction (i.e. quadratic interpolation)
-independent of algorithm that generates the direction vector
-bracket 1-dimensional minimum and iterate to approach minimum
-costly in number of function evaluations
-the derivative at the minimum of each line search is perpendicular to
the previous direction
starting with initial point
and a specified direction
, a
line search involves the minimization of
as a function of
where
For a 2-dimensional system with target function E(x,y), a line search
minimizes E(x',y') as a function of
where
For a typical line search
3-10 function evaluations must be performed
computationally expensive
A. Steepest descents
Line search direction is the negative of the current derivative or gradient
-each line search produces a new direction perpendicular (orthogonal) to the previous gradient
-each line search deviates from the ideal direction to the mininum, and successive
line searches correct for this deviation, but since each direction must be orthogonal
to the previous direction they cannot efficiently correct
directions oscillate
-can use truncated line search:
update position any time trial point along gradient has a lower energy
fewer functional evaluations per iteration
faster
Disadvantges:
-the directions oscillate along the way to the minimum
requires many line searches
inefficient
-convergence slow near minimum because gradient approaches zero
Advantages
-robust even far from harmonicity
useful when starting far from minimum
-can utilize truncated line search, which is much faster
B. Conjugate Gradients
produces a set of mutually conjugate directions so that each successive step continually refines the direction toward the minimum (without oscillations)
The new direction vector
leading from point i+1 is computed as follows:
: the gradient at point i+1
: the previous direction vector
The next gradient
is orthogonal to all previous gradients
:
and the next direction
is conjugate to all previous directions
where
This ensures that motion along
does not spoil the minimization along
.
Note: an initial direction
must be chosen that is equal to the initial gradient
Disadvantages:
-requires full convergence for each line search . more functional evaluations per iteration
time per iteration longer than for steepest descents
-for nonharmonic systems can get stuck (i.e. if start far from minimum)
Advantage:
-more efficient convergence to minimum
number of iterations smaller than for steepest descents
C. Newton-Raphson
Uses the gradient to identify a direction AND uses the curvature (2nd derivative)
to predict where along the gradient the function will pass through a minimum
The second derivative matrix
(Hessian matrix) defines the curvature in each gradient direction
: the predicted minimum
: an arbitrary starting point
: the Hessian matrix at
Advantages:
-for a quadratic function the minimum can be found without line searches and
with only a single evaluation of the gradient and Hessian matrix.
-for a non-harmonic function, the algorithm must be applied iteratively, but
convergence is very rapid near minimum
Disadvantages:
-terms in Hessian matrix dificult to derive
-unstable when structure is far from minimum
-calculating, inverting, and storing an NxN matrix for large systems is difficult
D. Comparison of methods
What determines which method is best?
1. size of system
2. current state of optimization
Robustness (i.e. ability to reach minimum regardless of initial conditions)
steepest descents > conjugate gradient and Newton-Raphson
conjugate gradient and Newton-Raphson can get stuck if start far
from minimum (i.e. far from quadratic)
Storage
Typically,
steepest descents < conjugate gradient < Newton-Raphson
conjugate gradient: N-dimensional vector for old gradient
Newton-Raphson: N
N Hessian
Number of iterations
Typically,
steepest descents > conjugate gradient > Newton-Raphson
steepest descents: oscillates
conjugate gradient: each step does not spoil previous minimizations so no oscillations
Newton-Raphson: requires only 1 step if quadratic
Time per iteration
Typically,
steepest descents < conjugate gradient < Newton-Raphson
steepest descents: does not require full convergence of line search
conjugate gradient: requires full convergence of line search
Newton-Raphson: no line search but must derive and manipulate Hessian
Strategy
-Use steepest descents for first 10-100 steps
-Then use Newton-Raphson or conjugate gradients to complete minimization to convergence
E. Dynamic quenching and simulated annealing
-Use molecular dynamics for minimization
-Mimic natural process of heating and cooling
-Decrease temperature gradually during simulation
both kinetic and potential energy are reduced
-Potential energy is allowed to increase due to fluctuations
system can overcome small barriers during relaxation
Advantage
-Decrease probability of getting stuck in local minima
F. Vibrational frequencies
Calculating harmonic vibrational frequencies
-expand potential energy surface as a Taylor series around equilibrium geometry
-truncate after the second term
-consider infinitesimal displacements
Uses
-characterize stable points (minima, transition states, etc.)
-compare frequencies to experiment (quantitative accuracy difficult with most forcefields)
Procedure for calculating vibrational frequencies
1. start with minimized structure (gradients must be zero)
2. calculate second derivative matrix
3. mass weight it
4. diagonalize it to obtain eigenvalues
5. convert eigenvalues to vibrational frequencies in wavenumbers
negative eigenvalues
imaginary frequencies
minimum: all frequencies real
transition state: at least one frequency imaginary
|
|
|
|
|
|
|
|
|
|