Transient helicity in intrinsically disordered Axin-1 studied by NMR spectroscopy and molecular dynamics simulations

Many natural proteins are, as a whole or in part, intrinsically disordered. Frequently, such intrinsically disordered regions (IDRs) undergo a transition to a defined and often helical conformation upon binding to partner molecules. The intrinsic propensity of an IDR sequence to fold into a helical conformation already in the absence of a binding partner can have a decisive influence on the binding process and affinity. Using a combination of NMR spectroscopy and molecular dynamics (MD) simulations, we have investigated the tendency of regions of Axin-1, an intrinsically disordered scaffolding protein of the WNT signaling pathway, to form helices in segments interacting with binding partners. Secondary chemical shifts from NMR measurements show an increased helical population in these regions. Systematic application of MD advanced sampling approaches to peptide segments of Axin-1 reproduces the experimentally observed tendency and allows insights into the distribution of segment conformations and free energies of helix formation. The results, however, were found to depend on the force field water model. Recent water models specifically designed for IDRs significantly reduce the predicted helical content and do not improve the agreement with experiment.


1 Implementation of the dRMSD

Distance Deviations as a Reaction Coordinate
Intrinsically disordered proteins can remain in a stable conformation for microseconds. Classic continuous MD simulations can access this timescale for the desired system size and thus could capture the transition from one conformation to another. Reliable statistical averages and population probabilities, however, cannot be extracted if transitions between the states are rare. To gain insight into the statistics of disordered proteins, advanced sampling techniques such as Hamiltonian replica exchange (H-REMD) [1] have to be used. The crucial choice for any H-REMD method is the selection of a reaction coordinate. In the case of unfolding proteins or peptides, the coordinate should reliably distinguish between the folded state and unfolded states. The RMSD from the folded state, as used in the method of Woo & Roux [2], is a possible choice. It does, however, require a fit to the reference structure for each frame. Instead of taking the RMSD of the atomic coordinates, here we used the RMSD of a chosen set of distances with respect to their reference distances, which can be taken from the reference structure (exemplary bonds in Figure  D). This dRMSD $R$ is defined as
\[ R = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(d_i - d_{i0}\right)^2}, \]
where $d_i$ is the current distance of atom pair $i$, $d_{i0}$ is its reference distance, and $N$ is the number of pairs. Using distances avoids any fitting, as the intramolecular distances are invariant under rotation and translation. In addition, the region where conformational freedom is of interest can be freely chosen through the definition of the dRMSD pairs. With the same mechanism, a single helical fragment of a protein can be unfolded or entire domains can be moved with respect to one another.
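The invariance argument above can be illustrated with a minimal NumPy sketch. The coordinates, the pair list, and the function names `pair_distances` and `drmsd` are hypothetical and chosen only for illustration; they are not taken from the GROMACS implementation.

```python
import numpy as np

def pair_distances(coords, pairs):
    """Distances d_i between the two atoms of every selected pair."""
    coords = np.asarray(coords, dtype=float)
    pairs = np.asarray(pairs, dtype=int)
    return np.linalg.norm(coords[pairs[:, 0]] - coords[pairs[:, 1]], axis=1)

def drmsd(coords, pairs, d0):
    """Root mean square deviation of the pair distances from the references d0."""
    d = pair_distances(coords, pairs)
    return np.sqrt(np.mean((d - np.asarray(d0, dtype=float)) ** 2))

# Hypothetical four-atom reference structure and two distance pairs (0-based indices).
reference = np.array([[0.0, 0.0, 0.0],
                      [1.5, 0.0, 0.0],
                      [1.5, 1.5, 0.0],
                      [0.0, 1.5, 1.0]])
pairs = np.array([[0, 2], [1, 3]])
d0 = pair_distances(reference, pairs)

# Rigid-body motion: rotation about z by 40 degrees plus a translation.
theta = np.deg2rad(40.0)
rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta),  np.cos(theta), 0.0],
                [0.0,            0.0,           1.0]])
moved = reference @ rot.T + np.array([1.0, -2.0, 0.5])
r_moved = drmsd(moved, pairs, d0)          # stays at zero up to rounding, no fit needed

# Uniform stretch by 10 %: every distance grows by 0.1 * d_i0.
r_stretched = drmsd(1.1 * reference, pairs, d0)
```

No superposition step appears anywhere in the sketch, which is the practical advantage over a coordinate RMSD.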
To enhance sampling along the dRMSD, harmonic potentials force the system to sample specific regions of $R$ around a reference value $R_0$. The potentials with a force constant $k_0$ are of the form
\[ V(R) = \frac{k_0}{2}\left(R - R_0\right)^2, \]
which for pair $i$ between atoms $a$ and $b$ creates a force on atom $a$ of
\[ \mathbf{F}_a = -k_0\left(R - R_0\right)\frac{d_i - d_{i0}}{N R}\,\frac{\mathbf{r}_a - \mathbf{r}_b}{d_i}, \]
pointing, depending on the signs of the deviations $R - R_0$ and $d_i - d_{i0}$, towards or away from the bond partner. Replica exchange between Hamiltonians with different positions of the umbrella minimum further enhances sampling of different conformations and allows the system to overcome artificial barriers introduced by the additional potentials. Note that the phase space volume is not constant with respect to $R$.
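The exchange step between two umbrella windows can be sketched with the standard Metropolis criterion for Hamiltonian replica exchange at a single temperature. Because the two Hamiltonians are assumed to differ only in the umbrella term, the unbiased energy cancels in the energy difference. The force constant, temperature, and function names below are hypothetical illustration values, not settings from the simulations described here.

```python
import math

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def umbrella_energy(r, r0, k0):
    """Harmonic bias 0.5 * k0 * (R - R0)^2."""
    return 0.5 * k0 * (r - r0) ** 2

def exchange_probability(r_i, r_j, r0_i, r0_j, k0, temperature):
    """Metropolis probability of swapping the configurations of windows i and j.

    r_i, r_j   : current dRMSD values of the configurations in windows i and j
    r0_i, r0_j : umbrella minima of the two windows
    """
    beta = 1.0 / (K_B * temperature)
    before = umbrella_energy(r_i, r0_i, k0) + umbrella_energy(r_j, r0_j, k0)
    after = umbrella_energy(r_j, r0_i, k0) + umbrella_energy(r_i, r0_j, k0)
    delta = beta * (after - before)
    return 1.0 if delta <= 0.0 else math.exp(-delta)

# Hypothetical windows at R0 = 0.1 nm and 0.2 nm, k0 = 1000 kJ/(mol nm^2), 300 K.
# Each configuration sits closer to the other window's minimum: always accepted.
p_favourable = exchange_probability(0.16, 0.14, 0.1, 0.2, 1000.0, 300.0)
# Each configuration sits exactly at its own minimum: the swap costs 10 kJ/mol.
p_unfavourable = exchange_probability(0.10, 0.20, 0.1, 0.2, 1000.0, 300.0)
```

The second case shows why neighbouring windows need overlapping dRMSD distributions: without overlap, the acceptance probability drops quickly.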

Equations of the dRMSD potential
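The dRMSD is defined as
\[ R = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(d_i - d_{i0}\right)^2}, \]
where the index $i$ runs over all distances between the $N$ atom pairs that contribute to the dRMSD. The harmonic potentials along $R$ are of the form
\[ V(R) = \frac{k_0}{2}\left(R - R_0\right)^2 \]
with a specific reference dRMSD $R_0$ and the distance $d_i$ being a function of the coordinates $\mathbf{r}_{i1}$, $\mathbf{r}_{i2}$ of the two atoms of the pair:
\[ d_i = \left|\mathbf{r}_{i1} - \mathbf{r}_{i2}\right| = \sqrt{(x_{i1} - x_{i2})^2 + (y_{i1} - y_{i2})^2 + (z_{i1} - z_{i2})^2}. \]
The force on coordinate $x$ of atom $i1$ due to pair $i$ is then calculated from
\[ F_{x,i1} = -\frac{\partial V}{\partial x_{i1}} = -\frac{\partial V}{\partial R}\,\frac{\partial R}{\partial d_i}\,\frac{\partial d_i}{\partial x_{i1}} = -k_0\left(R - R_0\right)\frac{d_i - d_{i0}}{N R}\,\frac{x_{i1} - x_{i2}}{d_i}. \]
Thus the vectorial force is given by
\[ \mathbf{F}_{i1} = -k_0\left(R - R_0\right)\frac{d_i - d_{i0}}{N R}\,\frac{\mathbf{r}_{i1} - \mathbf{r}_{i2}}{d_i}, \]
with $\mathbf{F}_{i2} = -\mathbf{F}_{i1}$ acting on the second atom of the pair. Finally, the contribution to the Hamiltonian of this distance RMSD potential is
\[ H_{\mathrm{dRMSD}} = \frac{k_0}{2}\left(\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(d_i - d_{i0}\right)^2} - R_0\right)^2. \]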
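The chain-rule force can be checked numerically. The following sketch assumes a harmonic bias of the form 0.5 * k0 * (R - R0)^2 and compares the analytic forces with central finite differences of the energy. The geometry, the parameter values, and the function names are hypothetical; this is not the GROMACS code itself.

```python
import numpy as np

def drmsd_energy_forces(coords, pairs, d0, r0, k0):
    """Return dRMSD R, bias energy V(R), and the forces -dV/dr on all atoms."""
    coords = np.asarray(coords, dtype=float)
    pairs = np.asarray(pairs, dtype=int)
    d0 = np.asarray(d0, dtype=float)
    diff = coords[pairs[:, 0]] - coords[pairs[:, 1]]     # r_i1 - r_i2
    d = np.linalg.norm(diff, axis=1)                      # d_i
    n = len(pairs)
    r = np.sqrt(np.sum((d - d0) ** 2) / n)
    energy = 0.5 * k0 * (r - r0) ** 2
    forces = np.zeros_like(coords)
    if r > 0.0:  # the gradient of R is undefined at exactly R = 0
        # -dV/dR * dR/dd_i * 1/d_i, one scalar per pair
        scale = -k0 * (r - r0) * (d - d0) / (n * r * d)
        pair_force = scale[:, None] * diff                # force on atom i1
        np.add.at(forces, pairs[:, 0], pair_force)        # accumulate per atom
        np.add.at(forces, pairs[:, 1], -pair_force)       # opposite force on atom i2
    return r, energy, forces

def numerical_forces(coords, pairs, d0, r0, k0, h=1e-6):
    """Central finite-difference estimate of -dV/dr for comparison."""
    coords = np.asarray(coords, dtype=float)
    num = np.zeros_like(coords)
    for idx in np.ndindex(coords.shape):
        plus, minus = coords.copy(), coords.copy()
        plus[idx] += h
        minus[idx] -= h
        e_plus = drmsd_energy_forces(plus, pairs, d0, r0, k0)[1]
        e_minus = drmsd_energy_forces(minus, pairs, d0, r0, k0)[1]
        num[idx] = -(e_plus - e_minus) / (2.0 * h)
    return num

# Hypothetical reference geometry, stretched by 10 % to obtain a nonzero dRMSD.
reference = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0],
                      [1.5, 1.5, 0.0], [0.0, 1.5, 1.0]])
pairs = np.array([[0, 2], [1, 3]])
d0 = np.linalg.norm(reference[pairs[:, 0]] - reference[pairs[:, 1]], axis=1)
coords = 1.1 * reference

r, energy, forces = drmsd_energy_forces(coords, pairs, d0, r0=0.1, k0=1000.0)
max_err = np.max(np.abs(forces - numerical_forces(coords, pairs, d0, 0.1, 1000.0)))
```

Because every pair contributes equal and opposite forces, the total force on the system sums to zero, which serves as a second quick consistency check.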

Lambda Scaling Along the dRMSD
For the application of umbrella sampling (US), several windows along the reaction coordinate are typically defined via a transition coordinate λ. We defined $R$ as a function of λ, which gives the distance RMSD potential a λ-dependent form. This allows transitions from one state with reference distances $d^A_{i0}$ to another state with reference distances $d^B_{i0}$. Also, with no $d^B_{i0}$ defined, a continuous sampling of λ in the range [0, 1] allows sampling from the structure defined with distances $d^A_{i0}$ to unfolded structures up to a dRMSD deviation of $R^A_0$. The derivative of the potential with respect to λ, the force in direction $x$ on atom $i1$, and the vectorial force follow from the λ-dependent potential by the chain rule, in analogy to the λ-independent case.
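As an illustration of how a λ-dependent dRMSD potential and its λ-derivative can be evaluated, the sketch below assumes a linear interpolation of the reference distances between states A and B and a fixed reference value R0. Both are assumptions made only for this example and need not match the λ-dependence used in the implementation. All names and numbers are hypothetical.

```python
import numpy as np

def lambda_potential(d, d0_a, d0_b, lam, r0, k0):
    """Bias energy and dV/dlambda for linearly interpolated reference distances.

    Assumed form (illustration only): d_i0(lam) = (1 - lam) * d0_a + lam * d0_b.
    Requires R(lam) > 0.
    """
    d = np.asarray(d, dtype=float)
    d0_a = np.asarray(d0_a, dtype=float)
    d0_b = np.asarray(d0_b, dtype=float)
    dev = d - ((1.0 - lam) * d0_a + lam * d0_b)           # d_i - d_i0(lam)
    n = len(d)
    r = np.sqrt(np.sum(dev ** 2) / n)
    energy = 0.5 * k0 * (r - r0) ** 2
    # dR/dlam = -(1 / (N R)) * sum_i dev_i * (d0_b_i - d0_a_i)
    dr_dlam = -np.sum(dev * (d0_b - d0_a)) / (n * r)
    return energy, k0 * (r - r0) * dr_dlam

# Hypothetical current distances and reference distances of states A and B (nm).
d = [2.0, 2.5, 3.0]
d0_a = [1.9, 2.4, 2.8]
d0_b = [2.6, 3.1, 3.5]
lam, r0, k0, h = 0.3, 0.05, 1000.0, 1e-6

energy, dv_dlam = lambda_potential(d, d0_a, d0_b, lam, r0, k0)
# Central finite difference in lambda as a consistency check of the derivative.
e_plus = lambda_potential(d, d0_a, d0_b, lam + h, r0, k0)[0]
e_minus = lambda_potential(d, d0_a, d0_b, lam - h, r0, k0)[0]
dv_dlam_numeric = (e_plus - e_minus) / (2.0 * h)
```

A derivative of this kind is the quantity needed for free energy estimates along λ, whatever interpolation is actually chosen.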

GROMACS Implementation
The dRMSD as a reaction coordinate for umbrella sampling with replica exchange has been implemented in GROMACS [4] 4.6.2 and uploaded to GitHub (https://github.com/enzyx/gromacs-4.6-drmsd). To run a simulation with the adapted GROMACS code, the following settings are required. Each distance pair entry lists the atom pair ai, aj that forms the distance pair, the function type (default 1), and d0, the reference distance of this atom pair. Note that ai and aj are molecule-internal indices; the first atom of each molecule has index 1.

mdrun Parameters
The potentially long-distance bonds that have to be calculated each step clash with the domain decomposition principle of GROMACS. Simulations will not start without an explicit request for particle decomposition. The implementation of the distance restraint can write out the dRMSD calculated during the simulation. Additionally, the tool g_drmsd can be used to calculate dRMSDs, as well as the distances and applied forces, from a given trajectory. g_drmsd has to be given a trajectory and a run input file with all the settings for the dRMSD method. For each frame of the trajectory, the tool then writes the dRMSD and the resulting potential to an output file.
-f Input, trajectory: .xtc, .trr etc.
-s Input, run input file: .tpr
-o Output file (drmsd.xvg), optional
If g_drmsd is given a list of trajectories and tpr files, it will calculate the dRMSD and potential for the first given trajectory with the first tpr, and so forth. Non-matching numbers after the last underscore, e.g. traj_1.xtc and topol_2.tpr, will give an error. Output for each trajectory will be written to files with matching number.
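The pairing rule for multiple inputs can be mimicked in a few lines of Python, for example to validate file lists before starting an analysis. This is only a sketch of the described behaviour, not the tool's code; the function names and the output naming pattern `drmsd_<number>.xvg` are assumptions for illustration.

```python
import re
from pathlib import Path

def trailing_number(filename):
    """Integer after the last underscore of the file stem, e.g. 'traj_1.xtc' -> 1."""
    match = re.search(r"_(\d+)$", Path(filename).stem)
    if match is None:
        raise ValueError(f"no number after the last underscore in {filename!r}")
    return int(match.group(1))

def pair_inputs(trajectories, run_inputs):
    """Pair the n-th trajectory with the n-th tpr file and check their numbers."""
    if len(trajectories) != len(run_inputs):
        raise ValueError("need the same number of trajectories and tpr files")
    jobs = []
    for traj, tpr in zip(trajectories, run_inputs):
        n_traj, n_tpr = trailing_number(traj), trailing_number(tpr)
        if n_traj != n_tpr:
            raise ValueError(f"non-matching numbers: {traj!r} and {tpr!r}")
        # Hypothetical output name carrying the matching number.
        jobs.append((traj, tpr, f"drmsd_{n_traj}.xvg"))
    return jobs

jobs = pair_inputs(["traj_1.xtc", "traj_2.xtc"], ["topol_1.tpr", "topol_2.tpr"])

# A mismatch such as traj_1.xtc with topol_2.tpr is rejected.
try:
    pair_inputs(["traj_1.xtc"], ["topol_2.tpr"])
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```

Checking the lists up front avoids discovering a mismatched pair only after part of a long batch has already been processed.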