Fast exploration of an optimal path on the multidimensional free energy surface

In a reaction, determination of an optimal path with a high reaction rate (or a low free energy barrier) is important for the study of the reaction mechanism. This is a complicated problem that involves lots of degrees of freedom. For simple models, one can build an initial path in the collective variable space by the interpolation method first and then update the whole path constantly in the optimization. However, such interpolation method could be risky in the high dimensional space for large molecules. On the path, steric clashes between neighboring atoms could cause extremely high energy barriers and thus fail the optimization. Moreover, performing simulations for all the snapshots on the path is also time-consuming. In this paper, we build and optimize the path by a growing method on the free energy surface. The method grows a path from the reactant and extends its length in the collective variable space step by step. The growing direction is determined by both the free energy gradient at the end of the path and the direction vector pointing at the product. With fewer snapshots on the path, this strategy can let the path avoid the high energy states in the growing process and save the precious simulation time at each iteration step. Applications show that the presented method is efficient enough to produce optimal paths on either the two-dimensional or the twelve-dimensional free energy surfaces of different small molecules.


Introduction
In computational biophysics and biochemistry, people are always interested in finding out how the molecule transfers from the reactant state to the product state [1,2]. From the most optimal path, one can know the detail information about the reaction, like the movements of the different components in the molecule, the direction and the rate of the reaction. Although being important, finding an optimal path is not easy in the simulation. There are three problems should be considered. One is the determination of the possible collective variables. Large number of collective variables could characterize the important states of the molecule precisely. Efficient sampling on the collective variables allows one to study the reaction from the reactant to the product reliably. However, increasing the number of the collective variables also expands the sampling scope in the collective variable space. The sampling speed is also slowed down. So, one must choose the collective variables carefully in the simulation. In theory, a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 good and reasonable collective variables should be independent of each other as much as possible and change smoothly in the conformations between the reactant and the product.
The second is the path construction problem. One must build an initial path first for the subsequent optimization. This seems easy. Since both the reactant state and the product state are characterized by the collective variables, a simple linear or higher order interpolation between the states in the collective variable space can satisfy the requirement [3,4]. However, in some situations, the interpolated path may pass a lot of high energy states, which cause the further free energy calculation to be unreliable. For example, for a small pentapeptide in Ref. [4], only six straight paths between its seven stable states can be utilized in the free energy calculations. Some paths are just abandoned because of the steric clashes between the atoms. Unlike the pure geometric method, a few years ago we proposed a potential-based method to build a feasible path quickly [5]. By a quasi-Newton optimization method that updates the Hessian matrix iteratively with the limited memory Broyden-Fletcher-Goldfarb-Shanno algorithm (L-BFGS) [6,7], the molecule in the reactant state is slowly pulled to the product state. All the snapshots sampled in the optimization are connected together to form the complete path. The method is almost as fast as the above interpolation method because no MD simulations are required in the construction of the path [5]. But we must note that, this method only deduces an initial path on the potential energy surface, not the free energy surface. Additional optimization of path on the free energy surface must be performed subsequently.
Compared with the above path construction problem, the path optimization problem is more difficult. This is because that the path connecting the reactant and the product exists in a high dimensional space spanned by the collective variables. It totally has n×m degrees of freedom. Here n and m are the number of the collective variables and the snapshots on the path, respectively. Such a large number of degrees of freedom increase the complexity of the system. Improper movements of the snapshots in the collective variable space will cause the path to form kinks. These kinks are fatal to the optimization. In the past, many methods have been proposed to circumvent this issue. They have one common objective: optimizing the energy gradients or the free energy gradients of the snapshots orthogonal to the path as much as possible and remaining the smoothness of the path at the same time. But they use different strategies to reach the objective. In the simulation, some methods try to separate the successive snapshots on the path by additional restrained potentials, like elastic band (EB) [8], nudged elastic band (NEB) [9] and minimum Hamiltonian path (MHP) [10]. Only the normal components of the forces on the snapshots (perpendicular to the path) are used to update the positions of the snapshots in the collective variable space. Other methods try to make the free energy gradients to be parallel to the tangent vectors of the path iteratively, like the string method [11]. String method constructs the initial path in the collective variable space and optimizes its shape by two evolution and reparameterization operations [11][12][13][14]. Besides, as two useful variants of the string method, the climbing string method is proposed for searching the transition states [15] and the freezing string method is proposed for decreasing the gradients calculations [16,17]. Recently, we apply the constrained dynamics method [18][19][20][21][22][23][24][25] in the calculation of the free energy gradients of the snapshots on the path [26] and show that the constrained dynamics method [18][19][20][21][22][23][24][25] is also a useful technique for the path optimize problem [27].
Unlike the above methods, there are other methods that do not use the free energy gradients. For example, string method [11] has a modified version [28]. In the simulation, all the snapshots on the path are updated by the average drifts of swarms of short and unbiased trajectories from their positions in the collective variables space. After a few iterations, the method finally optimizes the initial path to the most probable transition path [28]. Besides, path metadynamics [29] and tomographic method [30,31] update the path in a more direct way. They calculate the average positions of the snapshots (as in path metadynamics [29]) or the free energy minima (as in tomographic method [30,31]) on the perpendicular hyperplanes by different strategies and move the snapshots towards the targets subsequently. All the methods discussed above are valid for the optimization of the path. In practice, they can all decrease the free energy barriers on the path effectively. However, these methods require the MD simulations for all the snapshots at every iteration step. Considering so many degrees of freedom in the system, the optimization will be a very time-consuming process.
To save the calculation quantity in the optimization, Chakraborty's group proposed a growing string method in 2004 [32]. This method does not require a guess of the initial path. It builds two path fragments from the reactant and the product simultaneously in the simulation. When the ends of the two path fragments merge to each other, one complete path between the reactant and the product is formed [32]. Since the path is also optimized repeatedly in the growing process and its gradient of the potential is forced to be parallel to the tangent vector of the path, the final path will automatically converge to the minimum energy path (MEP) [32]. Some applications of the method show that it is more efficient than the original string method [33][34][35]. However, this double-ended growing string method [32] and other related singleended versions [36,37] are designed to find the optimal path on the energy surface [32][33][34][35][36][37][38][39]. As far as we know, they have not been applied to the free energy surface till now. As comparison, Ref. [40] provides a method that traces the minimum energy path on the free energy surface. But it requires a prior knowledge of the transition state. It goes downhill on the free energy surface step by step from the transition state to the reactant or the product independently. No further optimization of the path is performed in the simulation [40]. Recently, an anharmonic downward distortion following method (ADDF) is proposed in Ref. [41] for the search of the reaction path from the reactant only. It searches, records and removes the neighboring local minima (called anharmonic downward distortion, ADD) from the reactant successively in the calculation. Combination of the ADDs produces the possible reaction paths and the final products. Ref. [42] performs metadynamic simulation on the path collective variables to search the reaction paths. Unlike the normal expression, it combines the coordination numbers of the atoms with the path collective variables.
In this paper, we use the idea of the single-ended growing string method [36,37] to build an optimal path on the free energy surface. At the beginning, the path starts from the reactant and extends its length constantly to the product. Different from the growing string method [32], the location of the new snapshot on the path is determined by the free energy gradient (not the energy gradient) of the snapshot and the vector from the snapshot to the product. When each new snapshot is grown on the path, the incomplete path will be optimized in the collective variable space by the constrained dynamics method [18][19][20][21][22][23][24][25]. These two growing and optimizing processes are carried out repeatedly one after the other. They will not be stopped until the end of the path reaches the product.
Compared with the previous full-path optimization methods on the free energy surface, our method has two advantages. First, it does not need an initial path at the beginning. The method starts the path construction from one single state (could be any free energy minimum). Second, it does not have to optimize all the snapshots on the whole path from the reactant to the product at every iteration step. The method reduces the size of the system. This could not only decrease the calculation quantity but also accelerate the convergence of the optimization under certain situations.
It must be noted that this idea has also been used in some other path sampling methods. For example, transition path sampling method (TPS) [43,44] obtains an ensemble of transition paths by a shooting strategy. It randomly changes the momenta of atoms of the snapshots on the existed path to generate lots of new paths. From the transition probabilities, it is capable of finding the most optimal path. And steered molecular dynamics (SMD) [45,46] or targeted molecular dynamics (TMD) [47,48] puts an artificial potential on the molecule at the reactant and pulls it gradually to the product. With proper initial conditions, SMD/TMD can generate an reasonable path [49] or path ensemble [50] between the states. The presented method in this paper has similarities with TPS and SMD/TMD. It can start the optimization from an initial guess path as TPS or from one single state as SMD/TMD. But the optimization is performed in a different way. It lets the free energy gradient determine the growing direction of the path. And in TPS, SMD or TMD, the direction is determined by the potential energy gradient in the standard MD simulation.

Theoretical methods
To characterize some important states of a molecule and describe its chemical reaction or state transition process, one must use a set of collective variables (χ 1 , χ 2 , . . ., χ n ). These collective variables are the functions of the Cartesian coordinates of the atoms, like bonds, angles or dihedrals. How to find a proper collective variable for a transition process is difficult. In general, the values of the collective variables should change greatly in the process. This makes the sampling in the space spanned by the collective variables more efficient than the sampling with the Cartesian coordinates of the atoms. Therefore, it is convenient to define and optimize the transition path in the collective variable space. In the simulation of any snapshot on the path, all the collective variables should be fixed simultaneously and the conformation of the molecule is restricted to a small structural ensemble. This is very critical for the calculation of the averages of the thermodynamic variables in a limited time period. Since these collective variables are only related to a small number of atoms in solute, normally they can be constrained in the simulation by solving the constraint equation [51] at every step, no matter there are other solvents or electrons in the system or not.
Traditionally the optimization should start from a complete path that connects the reactant (represented by a vector χ (r) ) and the product (represented by a vector χ (p) ). But as we mentioned before, such a complete path is not easy to construct in a high-dimensional space. The more collective variables are in use, the more high energy states could exist on the interpolated path. In the simulation of any snapshot on the path, the molecule has to satisfy many constraint equations with the collective variables. Any conflict of the collective variables may cause the optimization to fail.
Following the idea of the single-ended growing string method on the energy surface [36,37], here we apply a growing strategy to resolve this issue on the free energy surface. The path is initialized from the reactant state. Then a series of snapshots are grown out of the path one by one with a predefined growth step size s g . Specifically, for a growing path with i−1 snapshots, if the distance between the end of the current path (state vector χ (i−1) ) and the product (state vector χ (p) ) is smaller than the growth step size (|χ (i−1) −χ (p) |<s g ), the product state is directly set as the ith snapshot on the path (χ (i) = χ (p) ). Then the whole path is built out successively. Otherwise, the ith snapshot on the path is determined by the growing direction vector P(i) g with a step size s g Here the growing direction vector P(i) g is calculated by In the formula, rF(χ (i−1) ) is the derivative of the free energy function F with respect to the collective variables of the (i−1)th snapshot (free energy gradient or thermodynamic mean force).
w is the weighting factor, which determines the curvature of the path. The determination of the parameters in the equations depends on our demands on the final path. If one wants to insert more snapshots to the path, the growth step size s g should be small. If one wants to make the path more flexible so that it can pass more intermediate states, the weighting factor w should be small too. When a new snapshot is grown, we do a similar reparametrization step as in the string method [12]. The length of the current path in the collective variable space is calculated by Simpson's rule and then the path is divided into pieces evenly according to the number of the snapshots. Then all the snapshots are moved to their new locations on the same path by SHAKE algorithm [51]. To be distinguished from the interpolation method, from now on, this path construction method will be called the growing method.
In the growing process, the free energy barrier on the path is required to be decreased. One can use the formula of the string method [11]. It is derived from the condition of the minimum energy path (MEP) in the Cartesian coordinate space [11]. By some transformations from the Cartesian coordinates to the collective variables, it gives the following optimization formula on the free energy surface (see Eq. (46) in Ref. [11]) Here χ(i) old and χ(i) new are the current and the new position of the ith snapshot in the collective variable space, respectively. s op is the step size. P(i) op is the optimization direction for the ith snapshot, which depends on three quantities: the unit tangent vector of the path s (i) , the free energy gradient rF(χ (i) ) and the metric tensor matrix M. The first quantity can be easily determined by the interpolation on the positions of the snapshots in the collective variable space. The second quantity is the derived result from the free energy calculation. We will discuss this issue later. The last quantity, metric tensor M, contains the averages of the derivatives of the collective variables with respect to the related Cartesian coordinates in the simulation. Briefly speaking, the formula in the string method requires the projected free energy gradient MÁrF to be parallel to the tangent of the path in the collective variable space [11]. The above formula works for the path optimization. But in this work, we want to compare the results with the optimization on the complete free energy surface. Generally the surface is constructed by the efficient full-space sampling methods like Metadynamics [52][53][54][55] or adaptively biased molecular dynamics (ABMD) [56][57][58]. On the existed surface, an optimal path can also be obtained. But in the construction of the complete free energy surface, it is difficult to perform the sampling in the whole collective variable space by the repulsive biasing potentials and the calculation of the metric tensor M at the same time. This is because the Hamiltonian of the system is modified by the biasing potentials. To make a comparison between the optimizations, we assume the metric tensor M to be diagonal [27] P ðiÞ op ¼ À rFðχ ðiÞ Þ þ ðrFðχ ðiÞ Þ Á s ðiÞ Þs ðiÞ ð5Þ So it forces the free energy gradient rF to be parallel to the tangent of the path (not MÁrF).
Actually, such kind of formula has been used by Ulitsky and Elber in 1990 [3]. They use the formula to search the steepest descent path (SDP) from the transition state to the reactant or the product on the potential energy surface. Here the formula in Ref. [3] is applied to the free energy surface.
As we can see, one important point in the optimization of the path is calculating the free energy gradients. In practice, these gradients can be calculated by the constrained dynamics method [18][19][20][21][22][23][24][25]. The number of the constraints equals the number of the collective variables. In 2003, Schlitter and Klahn proved that the free energy function in the constrained simulation can be described by a concise formula [19] Here the variable ξ is the reaction coordinate that indicates the fraction of the snapshots encompassed by the path. It is also a function of the collective variables. ξ = 0 and ξ = 1 represent the initial and the final state respectively. H c is the Hamiltonian function of the constrained system. T is temperature and k B is the Boltzmann factor. <λ> ξ represents the ensemble average of the Lagrange multiplier at ξ in the constrained simulation [51]. The last quantity Z is a n-dimensional matrix with the following element σ α and σ β are the constraint equations and m i is the mass of the ith atom in the related constraint [19]. By the formula, one can calculate the complete free energy profile on a path or a multidimensional surface that is defined by a set of collective variables [59]. From Eq 6, the gradient of the free energy function with respect to the reaction coordinate or the collective variables can also be calculated. It is clear that the free energy function in Eq 6 has two terms, and so does its derivative. However, as we tested before [26,27], the second term is much smaller than the first term. So in this work, only the first term, i.e., the ensemble average of the Lagrange multiplier <λ>, is used in the calculation of the free energy gradient in Eqs 2 and 5. But, when calculating the free energy profile of the path, both terms are in use. With the free energy gradient, we can build a path gradually from the reactant to the product on the free energy landscape by the growing method (Eq 1) and optimize the path by the constrained dynamics method [19] at the same time (Eqs 3 and 5).
To monitor the convergence of the optimization, an error function is defined as follows. It is the average size of the free energy gradients of all the snapshots perpendicular to the tangent of the path Here n and m are the number of the collective variables and the snapshots on the path, respectively. Smaller σ ? means the free energy gradients on the path are more parallel to the tangent of the path. And thus the path is more optimal on the free energy surface.
To show the performance of the presented method, we select two models in the calculation. The first is the ALA dipeptide with the sequence ACE-ALA-NME. Many simulations find that the peptide has two stable states in vacuum, named C7 eq and C7 ax [11,29,[60][61][62]. In this work, the state C7 eq is set as the reactant and the state C7 ax is set as the product. Their backbone angles, F and C, are chosen to be the collective variables. So the path exists in a twodimensional collective variable space. Low complexity of the system allows us to do a sufficient sampling in the space for the test of the growing method in depth.
The second model is the 10-ALA peptide. It is well known that Alanine-rich peptides prefer the helical conformations in the aqueous environment [63][64][65][66][67], especially the α-helix. So here we set the π-helix as the reactant and the α-helix as the product. The main difference between these two secondary structures is the way to form hydrogen bonds. In π-helix, the hydrogen bonds are formed between the N-H group of an amino acid and the C = O group of the other at five residues earlier (i+5!i). And in α-helix, the interval between the hydrogen-bonded residues is 4 (i+4!i). In this application, we build a path from the π-helix to the α-helix by the growing method. Totally 12 collective variables are defined for the molecule, including one distance between the C α -atoms in the first and the last residue, five i+5!i hydrogen bonds in the standard π-helix structure, and six i+4!i hydrogen bonds in the standard α-helix structure. So many collective variables increase the complexity of the optimization problem. We want to find out if the presented method can work well in such a high-dimensional collective variable space.
The presented method has been written in Fortran code and inserted into the Tinker software [68]. All the simulations are performed at 298 K with AMBER PARM96 force field [69]. The positions and velocities of the atoms are updated by the Beeman's algorithm [70,71] with the 1.0 fs time step. Berendsen method [72] controls the temperature. For the two models, ALA dipeptide is simulated in vacuum, 10-ALA peptide is simulated in the Generalized Born/ Surface Area (GB/SA) implicit solvent [73,74]. To accelerate the optimization, our code is parallelized by MPICH2 [75]. According to the number of the CPUs in the computer, all the snapshots on the path are evenly divided into parts for independent simulations.

Results and discussion
The simulation results of the ALA dipeptide are shown in Fig 1. For a better illustration, the complete free energy landscape of the molecule in the whole collective variable space is also shown in the figure, which is calculated by adaptively biased molecular dynamics method (ABMD) [56][57][58] in a 500 ns trajectory (flooding time is 500 ps). The two free energy minima, reactant state C7 eq and product state C7 ax , are marked in the figure too. The straight dashed line from C7 eq to C7 ax is the initial path built by the linear interpolation method. It is constituted of 40 evenly distributed snapshots. The free energy landscape in the figure shows that the interpolated path passes a high free energy barrier at the center, which requires to be optimized further. The green dashed lines from up to down in the figure are the optimized paths from the initial straight path by Eqs 3 and 5 at 0, 50, 100 and 200 iterations (step size s op = 0.002). To calculate the free energy gradient (ensemble average of the Lagrange multiplier λ in Eq 6), each snapshot is simulated for 3 ps at each iteration step. From now on, the optimization method that starts from the interpolated path will be referred to as the "interpolation method" for short. As shown in the figure, the path moves away from the central free energy barrier gradually in the optimization and converges to the valley below. This reflects the ability of the interpolation method. However, performing simulations for all the snapshots on the path at every iteration step is time consuming. The whole optimization takes 200×40×0.003 ns = 24 ns (6.0×10 5 MD steps per snapshot). If the number of the collective variables and the number of the snapshots increase further, this optimization process will be more slower. Now we use the growing method to build and optimize the path (Eq 1). Different from the above interpolation method, our method starts from the single reactant state, not the whole interpolated path. The growth step size s g in Eq 1 is 0.102. The weighing factor w in Eq 2 is 10.0 (in the test, the optimization with these parameters produces the growth path with the same number of snapshots as previous straight interpolated path). To ensure the smoothness of the path, all the existing snapshots are optimized for 2 steps by Eqs 3 and 5 when a new snapshot is grown. The optimizing step size s op is 0.002, the same as before. Finally 40 snapshots are grown out of the path. The growth path is plotted by the yellow solid line in Fig 1. The shape of the path is in agreement with the previous optimized path by the interpolation method. And more importantly, the total simulation time is only 2×(40+40×(40-1)/2)×0.003 ns = 4.92 ns (1.23×10 5 MD steps per snapshot). This is much shorter than the above interpolation method (6.0×10 5 MD steps per snapshot). Clearly it is because that the growth path is shorter than the interpolation path in most time of the simulation. Fewer snapshots on the path will not only decrease the required simulation time at every iteration step, but also decrease the probability to pass the high free energy states at the early stage of the optimization. Fig 2(a) shows the free energy profiles on the optimized paths by the interpolation method at 0, 50, 100 and 200 iterations (black dashed lines from up to down) and the path by the growing method (blue solid line). The free energies are calculated by Eq 6 [26]. The reaction coordinate ξ on the x-axis represents the location on the path. ξ = 0 is the reactant state (C7 eq ) and ξ = 1 is the product state (C7 ax ). Fig 2(b) gives the changes of the average free energy gradients perpendicular to the tangent of the path (σ ? in Eq 8) in the two optimizations. This is a parameter to show the convergence of the optimization. From the figure, we find that the end-to-end free energy differences on the path by the two methods are identical to each other. The product state is about 2.0 kcal/mol higher than the reactant state in both optimizations. This confirms the fact that the computed free energy is a state function, as it should be. The free energy difference between the states is independent of the transition path in the collective variable space. Moreover, the free energy profile of the path from the growing method is the same as the optimized path by the interpolation method at 200 iterations. This is reasonable because these two paths have the same shapes in Fig 1. Fast convergence speed indicates that the growing method is quite efficient for the optimization of the path on the free energy surface. To compare the methods in depth, we perform different optimizations with different number of snapshots (from 20 to 40) by the two methods (the calculation quantities are the same). The final average perpendicular free energy gradients of the paths (σ ? in Eq 8) in the two optimizations are shown in Fig 3(a), and their per-snapshot simulation time are in Fig 3(b). The results show that the growing method has good performances for different paths with different lengths. The fast convergence speed of the method does not depend on the length of the path.
In the following, we will discuss four important aspects on the path optimization problem. The first is the calculation of the free energy gradient for the path. Eq 6 shows that the free energy of each snapshot in the collective variable space has two terms. So the free energy gradient also has two terms. The first term comes from the constraint force in the simulation (ensemble average of the Lagrange multiplier) and the second is the correction term for the entropy loss from the unconstrained system to the constrained system [18,19]. Due to the determinant of the metric tensor in Eq 6, analytic derivation of the second term is very complicated. For simple molecules like ALA dipeptide in the low dimensional collective variable space, it can be approximately calculated by the numerical central difference method [27]. However, for large molecules with lots of constraints, like 10-ALA in this work, even the numerical method becomes impractical. But we want to note that it is still possible to get the two components of the free energy gradient along the tangent of the path. The data can help us to evaluate the different weights of the two components. In Fig 4(a) and 4(b), we plot the two free energy gradient terms along the straight path from the reactant to the product for ALA dipeptide (blue solid lines) and 10-ALA peptide (black dashed lines), respectively. The results show that the second term is much smaller than the first term. For these two molecules, it is Path exploration on the free energy surface safe to use the first free energy gradient to update the shape of the path in the collective variable space.
The second aspect is the choice of the way to perform the path optimization. With some kinds of artificial potentials, a complete free energy surface in the whole collective variable space can be built by many well-known free energy calculation methods like ABMD [56][57][58] or Metadynamics [52][53][54]. So, before the path optimization, one can first prepare a complete free energy surface in a long simulation and then do the path optimization on this surface continuously. Otherwise, as in this work, one can calculate the free energy gradients during the optimization at every iteration step. When the path is updated, all the free energy gradients on the path are required to be calculated again in the new constrained simulations [18,19]. In Fig  5, we show that these two ways of optimization are equivalent to each other for ALA dipeptide. The blue arrows on the straight path are the selected free energy gradients rF from the complete free energy surface by a 500 ns ABMD simulation [56][57][58] (multiplied by 0.06). The magenta arrows are the gradients rF on the current path obtained from a few independent constrained simulations [18,19] (100 ps for each snapshot, multiplied by 0.06). It shows that the former free energy gradients are rather close to the latter. The final paths of the two optimizations are also in good agreement with each other (the blue dotted curve is from the former optimization on the complete surface and the magenta solid curve is from the latter, see the descriptions on the optimizations below). It is known that the optimization of the path on the free energy surface requires some MD simulations. But, performing the simulations before or during the optimization does not change the final results.
The third aspect we want to discuss is the acceleration methods in the path optimization. As introduced before, steepest descent method in Eq 3 is simple and practical. But the optimization is sensitive to the step size s op . Large s op could cause the path to form kinks and small s op will slow down the convergence speed. So, a variable step size may be useful. Initially one can set a relatively large step size. At every iteration step, if the sum of the free energies of all the snapshots on the path is smaller than the previous iteration step, then the step size is reduced by half. Otherwise, it is reset to the initial value. Besides the steepest descent method, quasi-Newton method is also an efficient optimization method. It updates the Hessian matrix iteratively by a method like L-BFGS [6,7]. Applications shows the method performs well in the structural optimizations of the molecules [6,7] with the all-atom force field [68]. However, in the test of the path optimization problem, we find that L-BFGS [6,7] moves the snapshots too far in the collective variable space at every iteration step. This could twist the path and fail the optimization. So, two more operations are executed to solve this problem. The first is to reduce the movement distance of each snapshot by a step size control parameter s step Here χ(i) old and χ(i) new are the current and the new position of the ith snapshot in the collective variable space. And χ(i) bfgs is the temporary position updated by L-BFGS [6,7]. To reduce the movement distance, the control parameter s step must be smaller than 1.0. The second operation is to smooth the updated path at every iteration by a positive objective function [30].
As we said before, all the three above optimization methods calculate the free energy gradients by some independent constrained simulations [18,19] at every iteration step. Alternatively, one can also prepare a complete free energy surface in the collective variable space by ABMD [56][57][58] before the optimization. The surface data will be used later in the free energy gradient calculation and the path optimization.
To compare the efficiencies, we perform four optimizations for ALA dipeptide from the straight path by the four different methods discussed above. The first is a steepest descent optimization with a fixed step size s op = 0.002 (Eqs 3 and 5). The second is also a steepest descent optimization but with a variable step size (default step size s op = 0.004). The third optimization is performed by Jorge Nocedal's L-BFGS code [6,7] (step size control parameter s step in Eq 9 is 0.2). To calculate the free energy gradients at every iteration step, one 3 ps constrained simulation [19] is launched for each snapshot at every iteration step. The last optimization is similar to the first one. It also uses the steepest descent method with a fixed step size 0.002. But the gradients are simply calculated from the complete free energy surface from a 500 ns ABMD simulation [56][57][58] (full-space optimization). The changes of the average perpendicular free energy gradients (σ ? in Eq 8) in all the four optimizations are shown in Fig 6. As we can see, quasi-Newton method with L-BFGS formula [6,7] has the fastest convergence speed. The shape of the optimized path by the quasi-Newton method [6,7] after 200 iterations is plotted as magenta solid curve in Fig 5. The path is perfectly overlapped with that from the last optimization on the complete free energy surface (blue dotted curve in Fig 5).
The last aspect is the choice of the driving force in the optimization. As shown in Eq 5, all the snapshots on the path are moved by their free energy forces (negative free energy gradients -rF) in the collective variable space. Alternatively, as in ref. [3], they can also be moved by the energy forces -rV too. Calculation of the free energy force (also called "thermodynamic mean force") demands more calculation time than the energy force. But the free energy force is corresponding to the driving force on the molecule from the free energy surface. With the free energy force, the molecule can search and locate the thermodynamically stable states reliably. And moreover, only the free energy force could provide the free energy profile on the path. The free energy profile is the key to the study of the reactions. So, unlike some previous methods that optimize the path by the energy force, in this work the presented method utilizes the free energy force in the optimization.
Simulation results of ALA dipeptide show that the growing method works well in the low dimensional space. Now we turn to a more complicated model: 10-ALA peptide. It has ten residues and 103 atoms. 12 collective variables are defined for this model, including one end-toend distance and 11 hydrogen bonds (five i+5!i hydrogen bonds in the standard π-helix structure, and six i+4!i hydrogen bonds in the standard α-helix structure). We select these collective variables is because that they are greatly different in the predefined reactant state (π-helix) and the product state (α-helix). Now we build a path by the growing method first. The growth step size s g in Eq 1 is 0.1 and the weighting factor w in Eq 2 is 5.0. The path is optimized for 2 steps with the step size s op = 0.001 after each new snapshot is grown. To calculate the free energy gradient, one 30 ps MD simulation is performed for each snapshot on the path. Finally the growth path has 80 snapshots. To show the path, we reduce the 12-dimensional collective variable space to a 2-dimensional space and project the growth path into the space (red dotted line in Fig 7). nHB α on the x-axis is the effective number of the hydrogen bonds in the standard α-helix structure (i+4!i). And similarly, nHB π is the effective number of the hydrogen bonds in the standard π-helix structure (i+5!i). Both collective variables are calculated by the formula in the users' manual of the AMBER software [76]. It defines one type of collective variable, N_OF_STRUCTURES, in the ABMD simulation [56][57][58]: Here the sum in the formula includes all the hydrogen bonds in the standard helical structure. The black dashed line is the data in the optimization by the steepest descent method with a fixed step size. The red dash-dotted line is the data of the steepest descent method with a variable step size. The magenta solid line is the data by the quasi-Newton method with L-BFGS formula [6,7]. The blue dotted line is the data of the optimization on a prepared complete free energy surface from a ABMD simulation [56][57][58]. The shapes of the final optimized paths in the last two optimizations are plotted as the magenta solid curve and the blue dotted curve in Fig 5,  So k = 6 for the α-helix and k = 5 for the π-helix. d is the actual bond length in the current structure and d 0 is the reference bond length which is set to be 3.5 Å. For comparison, the projection of the linear interpolated path on the two-dimensional plane between the reactant and the product is also plotted in the figure (blue dash-dotted line). It can be found that the two paths are greatly different from each other. Although sharing the same reactant and the product, the growth path more prefers the intermediate snapshots with fewer hydrogen bonds than the interpolated path. To find out which one is more optimal on the free energy surface, we perform the optimizations for the paths and calculate the free energy profiles on them.
Both of the two paths of 10-ALA peptide are further optimized for 600 iterations by quasi-Newton method with L-BFGS formula [6,7]. Here the step size control parameter s step in Eq 9 is 0.06. In the optimization from the interpolated path, all the evolved paths at the iterations are plotted in Fig 7. The colors in the figure represent the free energy values on the paths. They approximately reveal the original free energy landscape in the high dimensional collective variable space, from the initial interpolated path (blue dash-dotted line) to the final optimized path The blue dash-dotted line, red dotted line and magenta dashed line represent the interpolated path, the growth path and the optimized path, respectively. The two collective variables, nHB α and nHB π , are the effective numbers of the hydrogen bonds in the standard α-helix (i+4!i) and the standard π-helix (i+5!i) (Eq 11). The colors in the figure show the free energy profiles on the paths in the optimization from the interpolated path (unit: kcal/mol). Note that all the paths are originally defined and optimized in a 12-dimensional collective variable space.
https://doi.org/10.1371/journal.pone.0177740.g007 at 600 iterations (magenta dashed line). The free energy profiles of the paths at 0, 100, 200 and 600 iterations are plotted by black dashed lines from up to down in Fig 8(a). The same data of the optimization from the growth path are shown by blue solid lines. We find that the two optimizations produce the same results. Fig 8(b) gives the changes of the average perpendicular free energy gradients (σ ? in Eq 8) in the two optimizations. It indicates that the optimization from the path generated by the growing method converges much faster than that from the interpolated path. This confirms the result in Fig 7 that the initial position of the growth path on the free energy landscape (red dotted line) is closer to the final optimal path (magenta dashed line) than the interpolation path (blue dash-dotted line).
Moreover, there is a free energy minimum exists on the final optimized path. By checking the hydrogen bonds in the structure (the distance between backbone oxygen atom and hydrogen atom is smaller than 2.5 Å), we find that the snapshot at this minimum maintains the first five hydrogen bonds in the standard α-helix and the last hydrogen bond in the standard πhelix. So the π-to-α reaction can be divided into two stages approximately. At the first stage, the peptide breaks the first four i+5!i hydrogen bonds at the N-terminus of the π-helix and forms the first five i+4!i hydrogen bonds. Then, at the second stage, it breaks the last i+5!i Path exploration on the free energy surface hydrogen bond at the C-terminus and forms the remaining one i+4!i hydrogen bond at the same time. So, based on the findings from the simulation with AMBER PARM96 force field [69], it may be concluded that the π-to-α transition process of 10-ALA peptide starts from Nterminus and ends at C-terminus.

Conclusions
Optimization of the path from the reactant to the product on the free energy surface is an interesting but also challenging problem. From the optimal path, one can know the reaction process in detail and conclude the reaction mechanism. In the study, building a good initial path is an important part of a successful optimization. The path can be built by a linear or higher order interpolation in the collective variable space [3,4]. But such a geometric method does not consider the potential energy or the free energy of the molecule. If any part of the interpolated path passes the high free energy states at the beginning, it will take many additional iteration steps to move the path away from the states. Furthermore, at every iteration step, all the snapshots on the path must be simulated by the molecular dynamics method [70] for a period of time. Of course, this is also a time consuming process.
In this paper, we present a growing method to accelerate the construction and the optimization process of the path. This method uses the free energy gradients from the constrained simulations [18][19][20][21][22][23][24][25]. It grows the path gradually from the reactant to the product and optimizes the existing path constantly in the growing time. So the path can adaptively avoid the high free energy states at an early stage of the optimization and save much simulation time in the later iterations. Applications on two peptides show that the presented method is quite faster than the standard interpolation method. Actually, the growing strategy is not only useful for the construction of the path over the barriers (transition states), but also useful for the path that passes the basins on the free energy surface (intermediate states).
We want to note that the growing method in this work is a deterministic method. It always evolves to one final optimal path at the end of the optimization. For simple free energy landscape with one unique transition path between the reactant and the product, this method should work well. However, in a case of a molecule with multiple independent transition paths, it may not be so efficient. In such situations, one can utilize the "self-avoiding walk" strategy in discrete state model method [77,78] or the "shooting" strategy in TPS method [43,44]. These methods are able to generate lots of possible conformations or paths with weights in a quick way. Performing the optimization from these guess paths may save the simulation time. Moreover, the presented method calculates the free energy gradient of the snapshot on the path one by one. Each free energy gradient is determined by a structure ensemble corresponding to a set of collective variables. If there are some hidden barriers in the ensemble, insufficient conformational sampling in the simulation may produce the biased results. To improve the sampling, one can simulate a set of successive snapshots on the path simultaneously as in Ref. [79]. By a proper exchange between the snapshots according to their Hamiltonians, the molecule should have a high chance to pass the hidden barriers in the ensemble.
Supporting information S1 File. It contains the molecular structures, the simulation parameters and the results in this paper. (ZIP)