The Role of Conserved Waters in Conformational Transitions of Q61H K-ras

To investigate the stability and functional role of long-residence water molecules in the Q61H variant of the signaling protein K-ras, we analyzed all available Ras crystal structures and conformers derived from a series of independent explicit solvent molecular dynamics (MD) simulations totaling 1.76 µs. We show that the protein samples a different region of phase space in the presence and absence of several crystallographically conserved and buried water molecules. The dynamics of these waters is coupled with the local as well as the global motions of the protein, in contrast to less buried waters whose exchange with bulk is only loosely coupled with the motion of loops in their vicinity. Aided by two novel reaction coordinates involving the distance (d) between the Cα atoms of G60 at switch 2 and G10 at the P-loop and the N-Cα-C-O dihedral (ξ) of G60, we further show that three water molecules located in lobe1, at the interface between the lobes and at lobe2, are involved in the relative motion of residues at the two lobes of Q61H K-ras. Moreover, a d/ξ plot classifies the available Ras x-ray structures and MD-derived K-ras conformers into active GTP-, intermediate GTP-, inactive GDP-bound, and nucleotide-free conformational states. The population of these states and the transition between them is modulated by water-mediated correlated motions involving the functionally critical switch 2, P-loop and helix 3. These results suggest that water molecules act as allosteric ligands to induce a population shift among distinct switch 2 conformations that differ in effector recognition.


Introduction
The role of solvent on the structure and function of proteins has been the subject of numerous previous studies [1][2][3][4][5][6]. For instance, it has been shown that waters in the dimer interface of hemoglobin play an important role in its transition between deoxy and oxy forms [7,8]. Water molecules serve as ''lubricants'' to facilitate conformational inter-conversion [9,10] or as ''adhesives'' in binding interfaces [11]. Dehydrated biomolecules therefore lose their biological activity and have suppressed dynamics. Experiments by Frauenfelder and colleagues using Mossbauer and neutron scattering techniques demonstrated that internal fluctuations of proteins are linked to the dynamics of the surrounding solvent [12,13], leading to the idea that protein dynamics is 'slaved' by [13][14][15][16][17] (or coupled to [18]) solvent dynamics. Many other experimental and theoretical studies arrived at similar conclusions [19][20][21][22][23][24]. However, it is less clear how buried solvent molecules might modulate an allosteric coupling between spatially distant regions in multi-domain or monomeric proteins [25][26][27][28]. A better understanding of how protein-bound waters modulate coupled motions and allostery can lead to new strategies for ligand and protein design.
Water molecules reported in crystallographic structures, particularly those located at domain boundaries, interfacial regions or near active sites, often have a structural or functional role [29][30][31][32][33][34]. In contrast, the role of buried waters located far away from the active or ligand-binding site of a monomeric protein is not always obvious. In some cases, these waters may participate in long-range hydrogen bond networks that connect distal protein segments to functionally important regions. Recent reports indicate that this applies to H-ras [35][36], a close homologue of the subject of this study, K-ras. Both H-and K-ras belong to the Ras family of molecular switches that play a central role in a variety of signaling pathways [37][38][39]. Ras is turned on when bound to guanosine triphosphate (GTP) and off when bound to guanosine diphosphate (GDP) [40]. The population of the 'on' and 'off' states is regulated by guanine nucleotide exchange factors (GEFs), which increase the dissociation rate of GDP, and GTPase-activating proteins (GAPs), which accelerate the slow intrinsic rate of GTP hydrolysis [39,41,42]. Oncogenic mutations that impair intrinsic Ras function and/or GAP action are found in ,15% of all human tumors and in up to 90% of cases in specific tumor types [38,43,44].
The catalytic domain of K-ras is bi-lobal, with lobe1 (residues 1-86), but not lobe2 (residues 87-167), being evolutionarily conserved among the Ras family [45][46][47][48]. The major structural difference between the GDP-and GTP-bound forms involves two switch regions located in lobe1 (S 1 : residues 25-40 and S 2 : 57-75; see Fig. 1). Among several key residues known to have distinct interactions in the inactive and active forms are T35 on S 1 and G60 on S 2 (See Fig. 1B), which interact with Mg 2+ and the cphosphate of GTP; GTP hydrolysis leads to the loss of these interactions and relaxation of the switches to an open conformation [39,49]. Furthermore, a recent 31 P-NMR [50,51] study indicated the presence of two conformational states in GTP-bound H-ras, state 1 and state 2 [52]. The two states are characterized by different chemical shifts in the aand c-phosphorous atoms of the GTP and by the ability of state 2 to interact with effector proteins while state 1 is stabilized by GEFs [51].
To investigate the role of buried water molecules on the distribution of functionally relevant conformational states of K-ras, we analyzed all available Ras x-ray structures and conformers derived from seven sets of twenty explicit solvent multi-copy MD simulations of Q61H K-ras. The simulations were performed with and without restraints on selected water molecules and protein segments, as well as in the presence and absence of two conserved protein-bound water molecules. Using a pair of simple, previously uncharacterized, reaction coordinates we grouped both the experimental and simulated structures into active GTP-bound, intermediate-like GTP-bound, inactive GDP-bound and nucleotide-free conformational states. We describe the effect of selected protein-bound waters on the distribution of these states.

Results/Discussion
Analysis of available Ras crystal structures (see Methods) yielded a number of conserved water molecules around the switch regions and the P-loop (Fig. 1A). To evaluate the structural and functional significance of these waters, we carried out a systematic study based on seven sets of multi-copy MD simulations (Table 1). In these independent simulations, crystal waters W 3 and W 4 were either (i) unaffected, (ii-iii) individually restrained, (iv) simultaneously replaced by dummy molecules, (v) both removed at the start of the simulations or (vi-vii) their gating protein segments S 2 and H 3 harmonically restrained in separate simulations.

Identification of conserved crystallographic water molecules
As of 2010, the protein data bank has 53 entries of Ras crystal structures comprised of 65 chains (see supplementary Table S1). Of these, 45 chains with a resolution of 2.8 Å or better contain crystallographic waters. Five water molecules were found conserved in over 70% of these structures. Fig. 1A shows the location of these waters in the protein based on their nearest neighbor, which is defined as the nearest of any backbone nitrogen or oxygen atom that is within 3.5 Å of a water oxygen atom. Two of the five water molecules are well-known for their interaction with the bound nucleotide [53]. However, little attention has been given to the remaining three waters, despite their apparent role in stabilizing the functionally critical nucleotide binding switches and the P-loop (Fig. 1A). These include W 1 coordinated by H27 and V29 at S 1 and A18 at helix H 1 , W 3 interacting with G10 at the P-loop as well as side-chain atoms of T58 and R68 at S 2 , and W 4 which interacts with backbone atoms of A11 at the P-loop, V81 at b 4 and S89 at H 3 . The average reported B-factors for W 3 and W 4 are 22 and 19 Å 2 , suggesting limited thermal fluctuation. The average backbone solvent accessible surface areas (SASA) of their nearest neighbors G10 and A11 are 2.2 and 1.2 Å 2 . In contrast, W 1 is relatively dynamic with a B-factor of 24 Å 2 , and Backbone nitrogen atoms of the Q61H K-ras x-ray structure were used to highlight protein sites occupied by one or more water molecules in .70% (red) and 50-70% (blue) of the available Ras crystal structures (the analyzed structures are listed in Table S1). P-loop (residues 10-17), switch1 (residues 25-40) and switch2 (residues 57-75) are colored in red, green and orange, respectively. The rest of the protein is in yellow cartoon and the molecular surface in transparent light green. (B) Schematic illustration of some of the key residues in the functionally important P-loop and switch regions. doi: 10 Mutations that affect the population of these states can cause cancer or developmental disorder. Using simulation approaches, here we show that a number of water molecules buried within the structure of an oncogenic Kras protein modulate the distribution of its conformational states. Moreover, a detailed analysis based on two novel structural parameters revealed the existence of long-range water-mediated interactions that facilitate a dynamic coupling between the two lobes of the protein. These findings pave the way for a dynamics-guided strategy to inhibit abnormal Ras signaling.
surface exposed with a backbone SASA of 2.9 Å 2 for its nearest neighbor A18. W 3 and W 4 are curiously missing in several Ras structures with mutations at position 12 (P-loop, e.g., PDB: 621P), 38 (effector binding loop, e.g., PDB: 221P), and 61 (S 2 , PDB: 721P). In addition, W 3 (but not W 4 ) is missing in a G12D variant (PDB: 1AGP) and a Ras-Sos complex (PDB: 1NVW), whereas W 4 is missing in G12R (PDB: 421P), another Ras-Sos complex (PDB: 1HE8) and a Ras-PI3K complex (PDB: 1BKD). It is important to note that oncogenic mutations frequently occur at positions 12 and 61, and mutations at position 38 often impede effector binding. Moreover, W 4 is located at the interface between the two lobes of Ras, connecting the P-loop in lobe1 with H 3 in lobe2 via the interfacial strand b 4 (Fig. 1A). These observations prompted us to undertake a systematic MD analysis of the structural and functional roles of these waters.

Hydration waters in unrestrained MD trajectories of Q61H K-ras
The time-and ensemble-averaged water diffusion coefficient, D, (see Methods) calculated as a function of distance from the protein surface, r, shows that water molecules within 2.5-5.0 Å of the protein diffuse very slowly (D<0.12-0.33 Å 2 /ps; see Fig. 2A). D progressively increases and stabilizes after r<10 Å to ,0.4 Å 2 /ps, which corresponds to the bulk-water diffusion coefficient of the TIP3P water model used in this work [54]. The reduction in translational motion of the protein-bound waters is coupled with a change in their orientation, as ,cosh. ranges between 20.3 and 0.1 in the region 2.4#r#6 Å before stabilizing near zero (Fig. 2B). Further information about the influence of the protein-solvent interaction on the diffusive property of waters in different hydration shells can be obtained from the autocorrelation function of the electrical dipole moment, m. Plots of this function for r values of 3-11 Å at an increment of 2 Å clearly show that the reorientation dynamics of the water dipoles in the first hydration shell (r = 3.0 Å ) is much slower than those in the second hydration shell (r = 5.0 Å ), which in turn is much slower than those in bulk ( Fig. 2C). Previous studies have shown that protein-solvent interaction, especially hydrogen bonding, is the primary reason for the retarded dynamics of protein-bound waters [55]. The affinity of these hydration waters for the protein can be quantified by their survival probability, N w (t). A plot of N w (t) versus time indicates that up to 6 waters have a survival probability of at least 8-10 ns in the three unrestrained trajectories (Fig. 2D). Of these, 2-3 waters survived for the whole 100 ns duration of the simulations. This indicates that some of the conserved crystal waters described in the previous section are stably bound to the protein in solution and can be considered an integral part of the protein structure. The stability of these water molecules is remarkable given the higher diffusive property of TIP3P waters compared with some of the other more popular water models [56][57][58][59]. In sum, our results about the hydration behavior of Q61H Kras are consistent with many previous reports on the hydration of other proteins, such as lysozyme [55] and plastocyanins [60].

Long-residence waters
Six water molecules were found to have t mean (a)$1 ns (see eq. 2 and Table 2, Fig. 3A). Each of these waters interacts with 2-3 backbone or side-chain atoms of nearby residues (Fig. 3B). Nearest-neighbor analysis (see Methods) identified A18:O, G10:N, and A11:N as the primary interaction partners of W 1 , W 3 , and W 4 , respectively. These are the same water-binding sites found in the majority of the x-ray structures (Fig. 1A), indicating their preservation during the simulations. The remaining three waters W 2 , W 5 and W 6 at sites K16:O, A83:O and D126:O were not present in the starting structure or the majority of the Ras xray structures. W 1 has a mean residence time at site A18:O (t mean (A18:O)) of 17 ns ( Table 2) and occupancy of 30% (see Methods). The other interaction partners of W 1 are the backbone polar atoms of H27 in lobe1 and occasionally A146 in lobe2. The latter interaction often occurs through another water molecule and appears to facilitate hydrogen bonding between A146 carbonyl and the nucleotide base. In this context, it is important to mention that A146T K-ras mutation has been recently implicated in colorectal cancer [61]. W 2 , which bridges the backbone of K16 and the side-chain of D57, has a mean residence time (t mean (K16:O)) and occupancy of  Table 2. Mean and maximum residence times, and mean square deviation of selected protein-bound waters. Only water molecules that are protein-bound in simulations F and W { 3,4 are shown. The mean residence time, t mean (a) was calculated using eq.2. t max is the maximum mean residence time at a particular hydration site. The mean square displacement of a water molecule with t max (MSD tmax ) was calculated using eq. 3.
The relatively small t mean (A11:N) is because W 4 has exchanged and eventually escaped in one of the three simulations, though it was stable for the whole 100 ns in the other two. doi:10.1371/journal.pcbi.1002394.t002 11 ns and 20%, respectively. K16 stabilizes the phosphate of the nucleotide [39], while D57 is part of the conserved D 57 TAGQ 61 motif [62,63]. W 3 and W 4 have occupancies of 72 and 84% at their respective sites G10:N and A11:N. They are tightly bound to these atoms with a mean square displacement (MSD) of 0.48 and 0.39 Å 2 . W 3 is located near the highly flexible S 2 and is further stabilized by interactions with the side chain functional groups of T58 and R68. As a result of S 2 's dynamics, W 3 frequently exchanges with bulk water (t mean (G10:N)<3 ns), suggesting that water binding at this site is entropically favored. In contrast, the more buried W 4 rarely exchanges with bulk water (t mean (A11:N)<41 ns), suggesting a strong enthalpic binding that involves accepting a hydrogen bond from A11:N-H and donating to V81:O and S89:O. This internal water thus couples the P-loop to H 3 . Note that the smaller t mean (A11:N) is due to the exchange and eventual loss of W 4 in one of the three trajectories (t res,j = 100 ns in the other two). The non-crystallographic waters W 5 and W 6 also have significant t mean (2.0 and 1.5 ns) at their respective sites A83:O and D126:O.
Overall, these six long-residence waters stabilize functionally critical regions, such as the P-loop and the nucleotide binding switches, or physically connect the evolutionarily conserved lobe1 with the variable lobe2. The latter is similar to previous reports on protein-kinase A where several water molecules participate in an extended network of inter-domain interactions [19,33].

Novel reaction coordinates to characterize Ras conformational states
To further investigate the functional role of the long-residence waters discussed above, it is important to identify features of the protein that may change with the presence and absence of these waters. In previous reports, we used principal component analysis (PCA)-based collective coordinates to discriminate between active and inactive conformations of Ras [45,46,64,65]. We found that the wild-type GDP-and GTP-bound structures form distinct clusters while mutant structures group into separate clusters that are intermediate between the two major clusters [45,46]. Though dominated by the switch regions, PCA contains contribution from the overall dynamics of the protein. Since the effect of individual water molecules may be localized near their binding site, we searched for reaction coordinates that only involve protein segments in the vicinity of W 3 and W 4 . Ideally, such reaction coordinates should also be able to discriminate between the fully active GTP-, less active GTP-, inactive GDP-bound and nucleotide-free conformations. (For brevity, we will refer to these states as the GTP, intermediate, GDP and free state.) The GTP and GDP states are very well characterized and covered in numerous excellent reviews (e.g., [26,39,66,67]). Nucleotide-free conformations, often found complexed with Sos [68], are characterized by wide-open switch conformations. The intermediate state is comparatively less well-characterized, but several structural and biochemical studies have shown that some mutations on S 1 (e.g., T35S) and S 2 (e.g., G60A) lead to intermediate GTP-bound structures [62,63,69,70] that are defective in effector recognition [51,52].
The adjacent P-loop residues G10 and A11 interact in two different directions via W 3 and W 4 : with T58 & R68 involving the conserved D 57 TAGQ 61 motif in lobe1 and with V81 & S89 in lobe2, respectively. Considering the role of G60 in stabilizing the c-phosphate of GTP via its amide group (Fig. 1B) and the displacement of S 2 away from the P-loop in the GDP state [39], we reasoned that the distance between the C a atoms of G10 and G60 (d) and the G60 N-C a -C-O dihedral (j) may prove useful (Fig. 1B).
Indeed, Fig. 4A shows that analysis of the x-ray structures in our data set in terms of d led to two groups populated by GTP-bound (d#7.5 Å ) and GDP-bound plus nucleotide-free conformers (d.7.5 Å ). A similar analysis in terms of j produced two other groupings populated by GDP plus intermediate conformers (j#0.0u) on the one hand, and GTP plus nucleotide-free conformers (j.0.0u) on the other. A scatter plot of d versus j yielded four distinct clusters populated by structures in the GTP, intermediate, GDP and nucleotide-free state (Fig. 4A). It is remarkable that these simple reaction coordinates enable such a neat discrimination among states, including the intermediate GTP-bound sub-state populated by mutant structures (e.g., A59G, PDB: 1FL0) and few Ras-Sos complexes (e.g., PDB: 1NVX) (see Supplementary Material). The active and inactive clusters largely coincide with the previous PCA results [45,46,64,65], apart from the GDP-bound G12V structure (PDB: 2Q21) that, unlike in the PCA analysis, now clusters with the rest of the inactive structures. This is to be expected because the current classification emphasizes S 2 while it was the uniquely open S 1 conformation that separated 2Q21 from the main inactive cluster [45,46]. Overall, our d/j plot centered on S 2 and the P-loop effectively captures the nucleotide dependent conformational dynamics of Ras.

Protein-bound waters modulate the distribution of Q61H K-ras conformational states
Projection of the MD-derived d/j values into the d/j space defined by the crystal structures shows that only 45% (see Methods) of the MD conformers sample from the GTP state (Fig. 4B). Another 25% went to the intermediate state while the remaining conformers are distributed in the GDP and nucleotidefree states. Clearly, Q61H K-ras samples all four conformational states at room temperature and in the presence of GTP, with the active state being the most favored, followed by the intermediate state. This result thus re-enforces our previous finding that Ras exists in different conformational sub-states in solution and predominantly operates via a conformational selection mechanism [64].
Projection of conformers with and without W 3 at G10:N (Fig.  S1, A&B) indicates no correlation with the water-occupancy of this site and the protein conformation. This implies that dynamics of the flexible loop of S 2 is not directly dependent on W 3 , as one would expect from the interaction of this water with S 2 residues T58 and R68. Interestingly, however, escape of the distal W 4 , which occurred during one of the three F series of trajectories and represents about 17% of the conformers, shifted the equilibrium towards the more open inactive state (Fig. 4B). This is surprising because, unlike W 3 , W 4 does not directly interact with S 2 whose motion dominates the inter-state dynamics captured by the d/j plot. This can be understood by noting that the S 2 loop around G60 moved away from H 3 in conformations lacking W 4 , which allowed for W 4 to escape from its deeply buried position between the two lobes. This resulted in an increased solvation of G10, as shown by the water coordination number of its carbonyl oxygen from 1-2 in the presence of W 4 to 3-4 in its absence (Fig. 4C). Thus, W 4 influences the solvation behavior of W 3 's nearest neighbor even if the dynamics of the two waters is not strictly coupled.
While waters W 1 and W 6 do not seem to affect the dynamics of S 2 , entry of W 2 and W 5 in the latter half of the simulations resulted in significant changes in the conformational distribution (Fig. S1, C-F). In the presence of W 2 the intermediate state is favored over the GTP state. In contrast, conformers containing W 5 preferentially sample the nucleotide-free region.
Taken together, the decoupled dynamics of the protein with some waters, such as W 3 , and its strong coupling with others, such as W 4 , implies that a global master-slave relationship of water and protein dynamics is not always applicable at the level of individual water molecules. The relationship between water and protein motion appears to be modulated by subtle differences in binding sites, such as whether a given structural water is entirely coordinated by the less flexible backbone atoms instead of side chains. Our results also underscore that the dynamics of individual water molecules can influence protein fluctuation locally or at a distance.

Protein controlled water dynamics and the role of protein-water hydrogen bonding
To assess the role of backbone dynamics on the fluctuation of individual protein-bound waters and evaluate the extent of their coupling, we ran four independent sets of additional simulations.

1)
Restraining protein segments controlling the dynamics of structural waters (S r 2 and H r 3 ): In two sets of three H r 3 independent runs (each 100 ns-long), a harmonic force constant of 10 kcal/ mol/Å 2 was applied on the C a atoms of S 2 and H 3 that gate W 3 and W 4 , respectively. A d/j analysis of the S r 2 trajectories (Fig. 5A) indicates a restricted sampling in a region confined to the starting structure. W 3 did not exchange in any of the three S r 2 trajectories, consistent with our initial expectation that lack of S 2 dynamics would abolish exchange of W 3 with bulk. Thus S 2 gates the entry and exit of W 3 even if their dynamics is not strictly concordant (see previous section). W 4 also remained stable during these simulations, suggesting an indirect stabilization of W 4 by S 2 . In the H r 3 trajectories, the vast majority of the conformers populated the GTP and intermediate states (Fig. 5B). This is consistent with the results of the unrestrained simulations in which these two states were favored by conformers with intact W 4 (Fig. 4B).

2)
Restraining selected protein-bound waters (W r 3 and W r 4 ): In another two sets of three 60 ns runs, W 3 and W 4 were positionally restrained to evaluate the specific role of protein-water hydrogen bonds for the stability and dynamics of the protein and the waters themselves. As expected, the restriction of W 3 in each of the W r 3 simulations abolished its exchange with bulk even if S 2 was still dynamic. The inability of W 3 to undergo rotational dynamics did not allow it to remain hydrogen bonded with the fluctuating protein, which led to the early reorientation of its principal interaction partner G10:N (j,0, see Fig. 5C). As a result, the intermediate and GDP states were favored by an overwhelming majority (,87%) of the conformers. W 4 also remained tightly bound in all of the W r 3 trajectories. Unlike W 3 whose restriction in the initial position was tolerated by the protein, restraining W 4 led to an enhanced global displacement of the protein away from W 4 . This happened within 0.7-2.2 ns in each of the W r 4 simulations, but it took 2-35 ns for a new water molecule to come in. This lag time allowed the protein to spend a considerable amount of time in the GDP state (Fig. 5D). Taken together, these results show that the dynamics of deeply buried waters that exclusively interact with backbone atoms is intimately linked to the dynamics of the protein, whereas the dynamics of less buried waters partially stabilized by hydrogen bonds with side chains, such as W 3 , is less coupled with the protein motion.

3)
Removal of partial charges in protein-bound waters (W 0 3,4 ): Yet another set of three 100 ns simulations were carried out to further evaluate the apparent link between the rotational dynamics of structural waters, such as W 4 , and their ability to remain hydrogen bonded with the fluctuating protein. In these simulations W 3 and W 4 were simultaneously replaced by dummy molecules (i.e., with mass and structure of a water molecule but lack partial charges). During each simulation the uncharged initial waters were quickly replaced (within few nanoseconds) by other water molecules. The quick replacement of the hydrogen bond-deficient waters by those capable of forming hydrogen bonds led to conformational sampling similar to the unrestrained simulations, apart from the absence of conformers that resemble the nucleotide free form (Fig. 5E). These results further  highlight the importance of solvating backbone atoms with unsatisfied hydrogen bonds, and emphasize how compounds with defective hydrogen bonding capabilities may not be tolerated at water-binding sites.

4)
Removal of selected protein-bound waters (W { 3,4 ): In each of the two 100 ns-long W { 3,4 simulations, bulk waters quickly re-entered (during equilibration or within few nanoseconds of the production phase) to take the place of the removed W 3 and W 4 . However, the vast majority of the resulting conformers (81%) are in the intermediate state, with ,17% wandering off to the GDP state and a negligible proportion (2%) remaining in the GTP state (Fig. 5F). The trigger for the redistribution of the conformational states is hard to pinpoint, but it appears that absence of the two waters, especially W 4 , in the beginning of the simulations resulted in the reorientation of G60 in the S 2 loop. Another unique feature of these simulations is that the non-crystallographic W 5 and W 6 have larger t mean (12 and 11 ns, Table 2) and occupancy (58 and 66%) at sites A83:O and D126:O. Of the two, W 5 appears to play a particularly important role in conformational sampling, as described below.

Water-mediated allosteric pathways
In the unrestrained simulations, W 5 -containing conformers (representing 4% of the total) are characterized by j.0u/ d.7.5 Å and preferentially populate the nucleotide-free region (Fig. 4B). In contrast, W 5 -containing conformers derived from the W { 3,4 simulations are characterized by j,0u/d,7.5 Å and populate the intermediate state (Fig. 5F). The reason for this discrepancy is not clear, but it may be related to the recently proposed multi-pathway nature of the allosteric effect [71] where an ensemble of states (instead of a single propagation pathway) defines a particular allosteric interaction. In both cases, however, the GTP state is disfavored, and entrance of W 5 to the Nterminus of H 3 coincides with a large conformational change of S 2 , as shown by superposing the average structures with and without W 5 (Fig. 6). As an example, A83:O and S89:OH were ,3.6 Å apart in the absence of W 5 but a partial loss of a helical turn at the N-terminus of H 3 upon the entry of W 5 led to increased solvation of S89. That this local conformational change at the N-terminus of H 3 is felt by S 2 suggests their allosteric coupling, with W 5 acting as a ligand. Though we are not aware of a report linking S 2 dynamics with the N-terminus of H 3 , previous studies have found a water-mediated interaction between Y32 at S 1 and N86 at the N-terminus of H 3 [35], as well as a correlated motion between S 2 and the C-terminus of H 3 [46]. Although the full implication of this result to Ras function warrants further investigation, it underscores the potential of multiple/long simulations to uncover mechanistic insights hidden in crystallographic average structures. Nonetheless, the following observations highlight how water-mediated allostery might underlie the differential sampling of phase space by Q61H K-ras in the presence and absence of specific water molecules.
Recent studies by Buhrman et al. [35,36] showed that a conformational change at H 3 /loop7 allosterically modulates S 2 and thereby positions Q61 for a direct action in GTP hydrolysis. One of the two hydrogen bond networks responsible for the allosteric coupling involves Y96, Q99 and R102. In our simulations Y96 underwent major displacement away from S 2 , thereby increasing the solvent accessibility of key P-loop and S 2 backbone atoms (Fig. 7). The distribution of the distance between G60:O and Y96:OH (Fig. 7A) shows that the phenoxy group of Y96 is hydrogen bonded, often through a water molecule, with the carbonyl of G60 in the GTP-like conformers (with a mean distance of 3.5 Å for the active and intermediate states). This hydrogen bond is broken as Y96 reorients away from S 2 in the GDP and nucleotide-free states (average G60:O-Y96:OH distance of 6.5 Å ). These mean values closely match the average G60:O-Y96:OH distances obtained from GDP-and GTP-bound x-ray structures (ca. 4.0 and 7.6 Å , Fig. 7A). Consistent with the previously observed positive correlation between the dynamics of S 2 and H 3 [46], the reorientation of Y96 is coupled with the displacement of S 2 away from the P-loop, as suggested by a strong correlation (R = 0.78) between distances d and Y96:OH-G60:O. The reorientation of Y96 side chain toward H 3 was analyzed by the angle between a vector connecting the C a atoms of S89 and Y96 that approximately traces the helical axis, and a vector from the C c to the OH atoms of Y96 along the plane of the ring (Fig. 7B). The angle decreased from ,60u in the GTP/intermediate states to as low as 30u in the GDP state. The decrease is accompanied by the expulsion of W 4 and a significant increase in the water coordination number of G10, i.e., the site of W 3 binding (Fig. 4C  & 7C). Entry of waters such as W 5 at the N-terminus of H 3 did not result in a similar Y96 reorientation, but is associated with a larger opening of S 2 and entry of other solvent molecules.
These observations led us to conclude that the way in which Q61H K-ras samples conformational states, and the population of those states, is modulated in various ways by protein bound water molecules. The observed water-mediated correlated motion among S 2 , the P-loop and H 3 is very sensitive to perturbation and may be altered by ligands designed to interfere with the interlobe communication of Ras proteins.

Conclusion
The aim of the current work was to investigate the solution stability and role of buried water molecules on the overall dynamics of Q61H K-ras, as well as to assess if crystallographic waters have a role in the inter-lobe communication and allosteric behavior of Ras proteins. To this end, extensive restrained and free MD simulations, involving seven sets of 20 multi-copy runs for a total of 1.76 ms, were carried out in the presence and absence of selected crystallographic water molecules. Analysis of hydration waters indicated that the dynamics of K-ras-bound waters is dramatically different from the bulk behavior. Moreover, the dynamics of several deeply buried long-residence water molecules, such as W 4 , is coupled with the local as well as global motions of the protein. As a result, the protein was able to sample different regions of phase space in the presence and absence of some of these waters. Three water molecules, namely, W 1 (located in lobe1), W 4 (at the interface between the lobes) and W 5 (at lobe2) facilitate the relative motion of residues located at the two lobes of Q61H K-ras. On the other hand, the fluctuation of less buried waters is only loosely coupled with the global motion of the protein even in cases where their exchange is gated by the fluctuation of functionally important protein segments, such as the S 2 loop.
Our analysis was facilitated by two simple reaction coordinates: distance (d) between the C a atoms of G60 at S 2 and G10 at the Ploop and the N-C a -C-O dihedral (j) of G60. A d/j scatter plot allowed us to classify the available Ras x-ray structures into active GTP-bound, intermediate GTP-bound, inactive GDP-bound, and nucleotide-free states. We emphasize that the structures classified here as intermediates are deficient in effector binding and contain mutations or are derived from complexes with GEFs. Projection of the MD-derived conformers on the d/j space derived from the xray structures enabled us to associate functionally distinct conformational states with the presence and absence of conserved high-residence water molecules. Moreover, we found a watermediated correlated motion involving S 2 , P-loop and H 3 . This motion is mediated by specific residues that received little attention in previous studies. Notably, the reorientation of Y96 side chain away from S 2 in GDP-like structures leads to increased solvation of G10 at the P-loop. The presence and absence of this interaction led to differences in the conformation of S 2 , suggesting a watermediated modulation of S 2 by H 3 . Thus, water molecules act as allosteric ligands to induce a population shift in the conformational states of the canonical switches.

Methods
Crystal structure analysis 53 x-ray structures (65 chains) of Ras were downloaded from the Protein Data Bank (PDB [72]) and analyzed for their water content, focusing only on waters within the first hydration shell and interacting with the protein backbone, defined by a distance cutoff of 3.5 Å between water oxygen and backbone oxygen or nitrogen atoms. Analysis of the fraction of crystal structures containing one or more water molecules bound to a given protein site identified two particularly interesting waters, W 3 and W 4 (see Results and Discussion). These waters were selected for a more detailed analysis by MD.

Molecular dynamics simulations
The 2.27 Å resolution crystal structure of GTP-bound Q61H K-ras (PDB id: 3GFT) was used to perform seven sets of multicopy MD simulations (Table 1). An oncogenic mutant of H-ras G12V has been shown to sample a wide range of conformational space [45,46]. We therefore reasoned that the oncogenic Q61H K-ras, the only available K-ras structure, may also sample a large conformational space during classical MD simulation. In the three simulations named F (i.e., free), all crystal waters were kept and no restraints were applied. In simulations S r 2 and H r 3 , restraints were applied (force constant k = 10 kcal/mol/Å 2 ) on the C a atoms of switch 2 (S 2 : residues 57-75) or part of helix 3 (H 3 : residues 87-95). Using the same force constant, conserved waters W 3 and W 4 were positionally restrained in simulations W r 3 and W r 4 , their partial charges were removed in simulations W 0 3,4 and they were excluded at the start of the W { 3,4 simulations (Table 1). In each case, the system setup involved assignment of charges assuming neutral pH (D, E and C-terminus de-protonated and K, R and Nterminus protonated). The protein was solvated in a cubic box of side 60.4 Å containing TIP3 waters, allowing a minimum of 10 Å distance between the edge of the box and the protein. The system was neutralized by adding 12 Na + ions, and an additional 30 Na + and 30 Cl 2 ions were added to achieve a 150 mM ionic strength. Energy minimization was then carried out for 2000 steps with the protein heavy atoms fixed and for another 5000 steps with all atoms set free. During the initial 200 ps equilibration, a harmonic restraint of k = 4 kcal/mol/Å 2 was applied on the C a atoms, which was then progressively reduced by 1 kcal/mol/Å 2 every 100 ps. All equilibration steps used a 1 fs time step, which was increased to 2 fs in the production phase with SHAKE [73] applied to covalent bonds involving hydrogens. The temperature of the system was maintained at the physiological value of 310 K using Langevin dynamics with a damping coefficient of 2 ps 21 . The Nose-Hoover Langevin piston method was used to maintain constant pressure at 1atm with a piston period of 100 fs and decay time of 50 fs. The short-range van der Waals interactions were switched off gradually between 8.5 and 10 Å with a 12 Å cutoff used for non-bonded list updates. Long-range electrostatic interactions were calculated using the Particle Mesh Ewald (PME) method [74]. Each simulation was run for either 60 or 100 ns, yielding an aggregate simulation time of 1.76 ms. All simulations were performed with the NAMD program [75] and the CHARMM27 force field [76].

Identification and characterization of hydration waters
For the purpose of this work, hydration waters are defined as waters whose oxygen atom is within 3.5 Å of any non-hydrogen protein atom.
Water dynamics. The dynamics of the water molecules that hydrate the protein was characterized in terms of their survival probability (N w (t)), diffusion coefficient (D), dipole (m) orientation order parameter (,cosh.), and dipole autocorrelation function (Sm m(t) :m m(0)T). For each water molecule j, N w (t) was calculated based on the conditional probability P j (t n ,t), where P j (t n ,t) is 1 if the j th water molecule is within 3.5 Å of any protein backbone atom between time t n and t n +t, and 0 otherwise (see ref. [55]). N w (t) is the sum over P j (t n ,t), which is maximal at the start of the simulation (small t) and gives the number of waters permanently attached to the protein at t equal to simulation length. The profile of D as a function of distance r from the protein surface was calculated by a finite-difference method previously described by Makarov et al. [77]. h is the angle between a vector along a water molecule's electrical dipole and a vector connecting the center of mass of the protein to the oxygen atom of the same water molecule. The resulting cosh values were averaged over the number of water molecules within a distance r from the protein surface, which ranged between 2.5 and 10 Å with an increment of 0.5 Å . m was defined by the vector between a water oxygen atom and the mid point of the distance between its hydrogen atoms. The autocorrelation function Sm m(t) :m m(0)T was computed as described in refs. [60,78].

Identification of waters with long-residence times.
Garcia and colleagues [79,80] suggested that the probability that a bulk water molecule has a coordination number (CN) less than 4 for longer than a few picoseconds is negligible. Following this approach, the residence time of water j, t res,j , was determined by summing over the entire number of frames (N F ) in which its CN is consecutively less than 4 for a period t n,j greater than 1 ns: t res,j~P N F n~1 t n,j , t n,j §1ns 0, t n,j v1ns Water molecules with t res,j $10 ns were defined to be longresident. Note that these waters may interact with different regions of the protein at different times. To identify protein sites preferred by these waters, we defined the nearest neighbor, a, that most frequently interacts with long-residence water j based on a distance criterion of 3.5 Å between the water oxygen and a protein backbone nitrogen or oxygen atom. Then, considering those waters that continuously occupy site a for at least 1 ns (t j (a)$1 ns), the mean residence time of waters at that site, t mean (a), was quantified as, where N w is the total number of water molecules occupying site a during the simulations. The maximum residence time, t max , is the longest uninterrupted period of time the j th water molecule occupies site a. The mean square displacement (MSD) of a water molecule with t max (W tmax ) is, where r(0) corresponds to the initial position and r(t) the instantaneous position of W tmax . Finally, the percent water-occupancy of a given protein site was obtained from the fraction of frames (sampled every 10 ps) that have a water molecule bound to that site.

Author Contributions
Conceived and designed the experiments: AAG. Performed the experiments: PP ASA AAG. Analyzed the data: PP ASA AAG. Contributed reagents/materials/analysis tools: PP ASA AAG. Wrote the paper: PP AAG.