Conformational Sampling and Nucleotide-Dependent Transitions of the GroEL Subunit Probed by Unbiased Molecular Dynamics Simulations

GroEL is an ATP dependent molecular chaperone that promotes the folding of a large number of substrate proteins in E. coli. Large-scale conformational transitions occurring during the reaction cycle have been characterized from extensive crystallographic studies. However, the link between the observed conformations and the mechanisms involved in the allosteric response to ATP and the nucleotide-driven reaction cycle are not completely established. Here we describe extensive (in total long) unbiased molecular dynamics (MD) simulations that probe the response of GroEL subunits to ATP binding. We observe nucleotide dependent conformational transitions, and show with multiple 100 ns long simulations that the ligand-induced shift in the conformational populations are intrinsically coded in the structure-dynamics relationship of the protein subunit. Thus, these simulations reveal a stabilization of the equatorial domain upon nucleotide binding and a concomitant “opening” of the subunit, which reaches a conformation close to that observed in the crystal structure of the subunits within the ADP-bound oligomer. Moreover, we identify changes in a set of unique intrasubunit interactions potentially important for the conformational transition.


Introduction
GroEL participates in the folding of 5-10% of cellular Escherichia coli proteins by providing an isolated chamber for non-native substrate proteins together with a heptameric ring shaped cochaperonin, denoted GroES [1][2][3][4][5]. It has also been suggested that GroEL might forcefully unfold kinetically trapped misfolded intermediates [6,7]. GroEL is composed of two heptameric rings made up of identical subunits, each of 57 kDa. The two separate rings, denoted cis (active) and trans (inactive), are stacked back-toback to form two folding environments (or cages) working offphase, analogous to a two-stroke engine. Each subunit is divided into three domains; equatorial (residues 1-133, 409-548), intermediate , and apical (191-376) domain, separated by two hinges that facilitate large conformational transitions in the complex [8][9][10].
Substantial mechanistic insight has been obtained through comparison of the large amount of structural data available for the GroEL complex. The first high-resolution X-ray structure was released in 1994 by Braig et al. [8], and since then, numerous structural studies, including X-ray crystallography, cryo-EM, and NMR have been published of different functional states of GroEL (for reviews see [11,12]). On this structural background it has been possible to make predictions and educated hypothesis about the transition pathways during the protein functional cycle. This includes the ATP dependent opening of the cis cavity with the concomitant increase of its volume from 85,000Å 3 to 175,000Å 3 [10]. The conformational transitions occurring on the subunit level are substantial, and its trajectory is generally explained in a sequential manner.
Binding of ATP to each of the seven equatorial domains in the cis ring, together with a non-native polypeptide produces a *15 0 counterclockwise twist of the apical domain, and a downward rotational movement of the intermediate domain [13,14]. This structural state of the ring is denoted R (where each of the 7 subunits are in the r state) while the initial closed state is denoted T (each of the 7 subunits are in the t state). Reaching the R state facilitates GroES association to the apical domains of the cis ring promoting much larger conformational changes and resulting in the fully open R9 conformation (all seven subunits in the r9 state). This r9 conformer is characterized by a *60 0 elevation and *90 0 clockwise twist of the apical domains (opposite direction to that seen upon ATP binding) [10]. Non-native polypeptide folding takes place within the cis ring of the R9 state of the reaction cycle, which is the longest lived (about 8-10 s) [15], and continues until ATP hydrolysis induce the R0 conformation permiting ATP binding to the opposite trans ring [16,17]. This final rearrangement result in a conformer very similar to the r9 form (RMSd of 1.46 Å ).
Despite this extensive structural insight, the mechanisms involved in allosteric signaling are not yet fully understood at atomic and residue level. In general, X-ray crystallography provides invaluable snapshots of different states of the protein reaction cycle, but not of the transitions between them. In this context, computational approaches have the potential to nicely complement the experimental techniques. In particular, molecular dynamics (MD) simulations provide important insight into protein dynamics at the atomic level, and allow following subsequent individual atomic interactions and fluctuations as a function of time [18]. Additionally, normal mode analysis (NMA) has proven to be efficient and accurate in the task of predicting and describing large scale conformational transitions in proteins [19][20][21][22]. NMA analytically characterizes all possible deformations of a protein around a stable equilibria with respect to their energetic cost. Although the utilization of NMA involves a loss of time-dependent fluctuations and resolution, i.e. by the use of coarse-grained models, it has been shown that it provides functionally relevant motions, and information on allosteric mechanisms [23][24][25][26][27][28].
Various computational methods have indeed provided essential information on the GroEL subunit dynamics. Notably, the transitions between the main functional states (T, R, and R0) have been further refined by utilizing NMA [27][28][29], targeted MD (TMD) [30], brownian dynamics [31], principal component analysis (PCA) [32], and MD simulations [33,34]. Moreover, a number of computational studies have been dedicated to find pathways for both intra-and inter-ring communication in GroEL [35][36][37][38][39][40]. From these, several residues have been pointed out as important for the allosteric signaling thus increasing the understanding of the involved mechanisms.
The concept of the two-stage transition has been strengthened by the TMD simulation of the GroEL subunit which pulls the t form to the fully open r0 form along a 500 ps long MD simulation [30]. This computational study suggested that the transition begins with a downward tilt of helix M, and the subsequent counterclockwise twist of the apical domain. Moreover, biasing the MD simulation by employing temperature acceleration to increase the conformational sampling recently showed the ability of the isolated GroEL subunit to undergo the t to the r0 state transition [34]. However, this simulation was not able to sample the closing of the binding pocket as seen in the Xray structures of the r0 conformers. Finally, unbiased MD simulation of the GroEL subunit has been performed in order to investigate the ATP-driven conformational changes [33]. This simulation samples the transition from the t to the semi-relaxed r conformation during 20 ns, nicely illustrating formation and rupture of hydrogen bonds.
A full scale monitoring of individual particle motions to probe the mechanistic basis for the transitions requires extensive unbiased conformational sampling at atomistic resolution. The 20 ns long MD simulations of Sliozberg and Abrams are too short to observe the full scale transitions of the subunit [33]. Moreover, significant variations between individual MD simulations have been detected [23,41,42], thus highlighting the importance of multiple simulations to extract statistically relevant information on the conformational changes. Investigating the positive (intra-ring) and negative cooperativity (inter-ring) of ATP-binding would require MD simulations of the entire GroEL oligomer. Such simulations on relevant timescales for conformational transitions are beyond the capabilities of present computational resources. It has also been reported the surprising facility to obtain stable folded monomers of GroEL by a variety of means [43][44][45][46][47], which can be explained by the small area buried by the monomers at subunit interfaces in the X-ray structure of the GroEL complex [8]. Furthermore, under some particular experimental conditions it seems that monomeric GroEL exhibits a weak chaperone activity [46]. Despite the fact that the chaperone activity of monomeric GroEL is a controversial issue [48], the ability of the monomer to fold into a native-like conformation that can bind nucleotide, points to the subunit as a relevant structural unit to be investigated by MD simulations.
In the current work we present extensive (in total 2:2 ms long) unbiased MD simulations of the GroEL subunit starting from the closed (t) and open (r0) conformations, with and without bound nucleotide. We use PCA of 287 experimentally obtained GroEL subunits to interpret the conformational sampling of our simulations. We observe nucleotide dependent shifts in the conformational ensembles, and show that the subunit response, as observed in the oligomeric structure, is intrinsically coded in the structure-dynamics of the isolated subunit. These simulations provide so far unexplored sampling from t all the way to a structure close to r0. A nearly complete transition in the opposite direction is also sampled by removing ADP from the r0 form. Another interesting outcome of our simulations is that the inherent motion of unliganded GroEL subunit is biased along the transition pathway towards both the r and r0 states. Furthermore, the MD simulations reveal a weak stabilization of the equatorial domain upon ATP binding, resulting in a modest decrease of the configurational entropy of this domain. Conversely, a larger increase of entropy is found for the whole GroEL subunit. Finally, we decipher the underlying mechanisms for the conformational transitions by investigating the atomic interactions unique to the unliganded and nucleotide bound structures obtained from the X-ray structures and MD simulations. Several of the interactions that characterize the conformational intrasubunit effects brought about by ATP binding were not revealed in previous studies.

Author Summary
Molecular machines convert chemical energy to mechanical work in the process of carrying out their specific tasks. Often these proteins are fueled by ATP binding and hydrolysis, enabling switching between different conformations. The ATP-dependent chaperone GroEL is a molecular machine that opens and closes its barrel-like structure in order to provide a folding cage for unfolded proteins. The quest to fully understand and control GroEL and other molecular machines is enhanced by complementing experimental work with computational approaches. Here, we provide a description of the molecular basis for the conformational changes in the GroEL subunit by performing extensive molecular dynamics simulations. The simulations sample the conformational population for the different nucleotide-free and bound states in the isolated subunit. The results reveal that the conformations of the subunit when isolated resemble those of the subunit integrated in the GroEL complex. Moreover, the molecular dynamics simulations allow following detailed changes in individual interatomic interactions brought about by ATPbinding.

Analysis of GroEL structures and dynamics
27 crystal structures of the GroEL complex, with a total of 364 subunits, were collected from the RCSB protein databank. Six of the crystal structures were not considered due to missing coordinates, which leaves 21 GroEL complexes with 287 subunits for the following analysis. These include ATP and ADP bound conformers as well as apo forms from wild type and mutant structures. The 287 GroEL subunits were superimposed onto the invariant 'core' defined as the area with least structural variation [49], and PCA was performed to investigate the major conformational differences between the collected structures. As much as 91.5% of the total variance of the atomic fluctuations was captured along the first principal component (PC), while 2 and 3 dimensions were necessary to capture 95.3% and 97.5%, respectively (see inset in Figure 1A).
The GroEL subunits can be divided into three major groups along the two first PCs; the closed cis/trans t forms, open cis r0 forms, and the semi-relaxed r forms ( Figure 1A). The first PC is shown in Figure 1B and describes the main differences between the r0 and t conformers. The motions described by PC1 consist of (1) an upward movement of the apical domain away from the equatorial domain;  [8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27] and C (residues 65-85) show a rotation relative to helices D (residues 88-107), E (residue 113-134), and R (residues 496-514). Similar internal motions in the equatorial domain are also observed in PC2. In particular, helix A (residues 8-27) undergoes a translational motion along its longitudinal axis, while both helices D (residues 88-107) and R (residues 496-514) rotate along their axis. PC2 represents the largest variation between the t and r conformers, consisting of a counterclockwise twist and a modest elevation of the apical domain. Similarly, PC3 describes a rotational motion of the apical domain together with a translational motion of helix A along its longitudinal axis.
The internal fluctuations within the equatorial domain reveal the presence of two sub-domains (helices A+C, and helices D+E+R) consistent with a previous study utilizing 35 subunit structures [32]. Moreover, it is interesting to notice that the invariant core is a relatively small part of the equatorial domain (helices E, O, and R), thus pointing out the structural variability within this domain.

Conformational population shift as a response to nucleotide binding
To investigate the intrinsic response of nucleotide binding to GroEL we conducted 14 MD simulations of the isolated GroEL subunit starting from both ends of the reaction cycle; the closed (t) and open (r0) states, with (holo) and without (apo) bound nucleotide (ATP or ADP). Of these, we performed four 300 ns Each conformer obtained from the MD simulations was projected onto the first two PCs determined from the X-ray structures ( Figure 2). These projections display the relationship between the MD conformers in terms of the conformational differences described by the two first PCs, thus enabling interpretation of the conformational space sampled in each of the simulations.
Remarkably, the simulations of the closed subunit all sample the space along PC2, which describes the transition between the t and r form, independently of whether ATP is bound or not. The main difference between the unliganded and ATP-bound simulations lay in the sampling along PC1; the ATP bound simulations shift the ensemble of conformations along PC1, which thus comes closer to the fully open r0 form ( Figure 2A+C). This difference is most apparent in the 300 ns simulations where the ensemble samples closest to the fully open structure (see Video S1). The difference in sampling along PC1 is also significant for the 100 ns simulations ( Figure S10), however they are too short to fully observe this effect. Admittedly, 100 ns is not long enough simulation time to bring the ATP bound simulation (N) away from the t form ( Figure 2N).
A similar shift of the ensemble along PC1 and PC2 is also observed for the open simulations. Interestingly, the ADP-bound simulation samples the space in the proximity of the r0 form ( Figure 2D). Conversely, removing ADP yields a sampling of the conformational space which is significantly shifted, in particular along PC1, towards the t conformers ( Figure 2B). Clustering analysis on each of the four 300 ns long trajectories was performed to identify the predominant conformations throughout the simulations. Each frame in the trajectories is attributed to one particular cluster depicted as color bars in Figure 3 along with the resulting RMSd values. The conformations sampled in the closed apo simulation ( Figure 3A) resembles its initial starting structure (t). Conversely, the closed holo simulation, shown in Figure 3C, comes much closer to the open r0 form than its apo counterpart. The RMSd values for the average conformations in each of the clusters with respect to both the closed and open X-ray structures are shown in Table 1. The predominant clusters of the closed apo simulation are clusters A1+A2 which have a relatively close similarity to the closed X-ray structure, with an RMSd value of 2.56 and 3.45 Å , respectively. These low RMSd values of the closed apo simulation stand in contrast to higher RMSd values of the closed holo simulation, i.e. 5.97 and 4.09 Å for the two dominating clusters C2+C3. Cluster C4 of the closed holo simulation, which consists of 92 member conformations out of a total of 1500 MD conformers, has a higher average similarity to the open (RMSd 6.86 Å ) than the closed X-ray structure (9.21 Å ). We thus sample relatively close to the experimental r0 conformer (minimal deviation of 5.8 Å ) with the major difference being an open nucleotide pocket, similar to what a recent temperature biased MD simulation by Abrams and Vanden-Eijnden showed [34]. Interestingly, both the unbound and the ATP bound simulations also show a tendency to mimic the r form (minimal deviation of 2.6 Å ) characterized by the counterclockwise rotation of the apical domain (i.e. clusters A3 and C1).
The RMSd values and clustering of the open (r0) apo and holo simulations are shown in Figure 3B+D. The apo simulation initially samples conformations around its starting point; close to the open X-ray structure. The predominant clusters of this simulation are B2 and B4 with a total of 933 member conformations, and the average conformations of these clusters have RMSd values much closer to the open than to the closed Xray structure as shown in Table 1. After about 240 ns of simulation the structure undergoes a remarkable conformational change resulting in higher similarities towards the closed (t) form. These structures are assigned to cluster B6, and the member conformations have an average RMSd of 7.23 Å towards the closed X-ray, and 8.11 Å towards the open X-ray structure. We thus sample relatively close to the experimental t conformer starting from the r0 conformer (minimal deviation of 5.1 Å ) with the major difference being a small twist of the apical domain.
Conversely, the open holo simulation consistently shows an open structure with RMSd values closer to the open r0 than to the closed X-ray structure ( Figure 3D and Table 1).

Stabilization of equatorial domain upon nucleotide binding
To probe the differences between the unliganded and ATPbound simulations we investigated the residue fluctuations based on the 10-60 ns interval of the subunit simulations in the closed form. As expected, most residues show a similar fluctuation pattern in the holo and apo simulations. In the equatorial domain, residues adjacent to ATP show the most evident differences between the apo and holo simulations (see Figure 4A-B). In particular, residues Lys28, Lys34, Asp87, Asn457, Glu461, Tyr478-Glu483, and Tyr485, and its immediate neighbors are significantly (a~0:05) stabilized in the presence of ATP. Conversely, Glu61, Glu63, Arg421, and Asn475, show increased fluctuations in the holo simulations indicating rearrangement of these residues upon ligand binding. Modest differences in the fluctuation profiles are also observed within the apical ( Figure S11A) and intermediate ( Figure  S11B-C) domains. Only two residues are shown to be significantly altered (a~0:05); Arg350 (apical) and Lys390 (intermediate).
Since the equatorial domain possesses the ATP binding site it is thus the epicenter for the initiation of the conformational changes. In order to relate the differences in fluctuations to binding of ATP we calculated the potential hydrogen bonds between ATP and the protein along the simulations. Figure 5 shows the occupancy of each of the ATP hydrogen bonds in the closed holo simulations. Three hydrogen bonds are shown to be particularly strong with an occupancy of more than 90% of the simulation; Thr89 and Thr91 bind to the beta and gamma phosphate, respectively. The fifth single strongest bond is Asn479 which binds to the H 2 N group of the adenine ring, with a occupancy of about 80% of the simulation. The negatively charged carboxyl group of Asp495 forms 4 alternating hydrogen bonds with the ribose OH, which have occupancies of approximately 40% each. Weaker bonds also exist, such as Gly32, Lys51, Thr90, and Ala480. Moreover, Asp87 is held in close contact with ATP through the tight binding to Mg 2z , resulting in a mean distance of 3.58 Å (+0:001Å) between the oxygens of Asp87 and the three adjacent phosphate oxygens of ATP.
The identified hydrogen bonds between ATP and specific protein residues can be directly linked to the change in the fluctuation pattern observed in Figure 4. In particular hydrogen bonds involving Gly32, Thr89-Thr91, and Asn479-Ala480 might cause the stabilization of these areas. Moreover, the tight binding between Asp87 and Mg 2z might be a stabilizing factor for Asp87 and its neighbors.
Configurational entropies of the closed simulations were estimated to investigate the entropic penalty upon ATP binding. Entropy values calculated from quasi-harmonic analysis of MD simulations are sensitive to simulation length and the number of frames in which the calculation is based on [50]. Moreover, significant variation between individual MD has been reported [41,51]. We thus performed entropy calculations on the multiple closed MD simulations for the equatorial domain and for the whole subunit ( Figure 6). We observe only small changes in entropy within the equatorial domain upon nucleotide binding. Moreover, during the first 60 ns of the simulations, no entropy difference is found for the whole subunit, while after 80-100 ns the entropy of the ATP-bound simulation is slightly higher than that of the apo simulations.

Differential atomic interactions
Comparing distance maps between different conformations has the potential of revealing unique atomic interactions potentially important for the conformational transitions within the GroEL subunit. We used difference contact maps (DCMs) to identify atomic interactions (all heavy atoms) unique to the t, r, and r0 Xray conformers ( Figure S12A-E) complemented with DCMs obtained from the MD simulations of the closed form (Figure 7). Of particular interest are those residues which change interaction partner during the transition from t to r0.
A large number of contacts are found to be unique for either the t or the r0 forms ( Figure S12A). High density of differential atomic interactions are observed within the equatorial domain, and in particular the interaction pattern between helices B, C, and D is altered upon the t-r0 transition. Contacts between helices B-C and C-D are found to be unique for the r0 conformers. Conversely, the t conformers show tighter interactions between helix C and the loop connecting helix C and D. Here, Asn82 contacts with Asp87 in the t conformers but with Arg58 in the r0. Moreover, Asp87 changes its contact partners from Asn82 in the t state to Ser151 and Asp398 in the r0 state, which is the main interaction between helix M of the intermediate domain and the equatorial domain. Unique for the r0 forms is also a set of hydrophobic interactions between helices C and D (Val74-Ile100, Val77-Ala96, Ala81-Ala92, Val77-Ala92). Conversely, the t forms have unique contacts between helix C and R (Val74-Val510, Val77-Tyr506, Ala85-Val499).
The intermediate domain also undergoes conformational rearrangements during the transition. Residues downstream of helix G initially interact with residues of helix M in the t form, but change interaction with helix L and K during the transition, e.g. the interaction Glu172-Arg404 in the t forms is changed to Glu172-Arg350 in the r0 forms. Moreover, Asp179 contacts Lys390 at the intermediate domain in the t forms, and the Thr48 at the equatorial domain in the r0 forms.   While the ensemble of static X-ray structures provides the differential atomic interactions between the two end points of the reaction cycle, the MD simulations can reinforce the findings based on X-ray and further complement this by providing the differential atomic interactions at an earlier phase in the transition. Consistent with the findings from the X-ray-based DCM, the areas of high density differential contacts are situated in the equatorial and intermediate domains (Figure 7). Twenty differential contacts are consistently identified (Table 2). Perhaps the most interesting of these are the interactions Val412-Asn475 and Leu134-Asn475 located near the lower hinge ( Figure 8A). These interactions show similar average distance difference in the X-ray and MD distance matrices. Moreover, they are located in the equatorial domain, close to the lower hinge, making them potentially important for the interaction between the equatorial and intermediate domain.
In the intermediate domain Asp179 and Lys390 connect helix M to three sheets (b4, b5, b14) close to the apical domain in the t forms. These residues are 0.9 Å closer in the apo simulations than in the holo ones. This is consistent with the X-ray DCMs, which shows that Lys390 interacts with Thr48 in the equatorial domain.
The distance matrices can also be helpful to probe the underlying mechanisms for the changes in fluctuation pattern of the equatorial domain ( Figure 4A-B). Table 3 summarizes the average difference in atomic distances for residues identified to have altered fluctuations. Of particular interest is Arg421 and Asn475 which show higher RMSF values in the holo simulations.
Arg421 shows a tighter binding to Gly471 in the apo simulations than in the holo ones (Dd~1:5Å). This is consistent with X-ray data, although the difference is about 1 Å smaller (Dd~0:5 and Dd~0:4Å). Perhaps the most evident difference affects Asn475 which in the MD simulations appears 2.2 Å closer to Val412, 2.0 Å closer to Leu134, and 1 Å closer to Ala413. These findings are again consistent with the differences found in the X-ray structures; i.e. 1.8, 1.8, and 0.8 Å for the t-r0 difference, and 1.4, 2.5, and 0.7 Å for the t-r difference.
Interactions between b2-loop-b3 and b16-loop-b17, close to the ATP-binding site, are strengthened upon nucleotide binding which might explain the smaller RMSF values in this area ( Figure 8B). The interaction Lys34-Glu483 is particularly affected; the average distance difference is 2.71 Å for the MD structures, while 0.8 and 1 Å for the X-ray structures. A significant difference is also found for Val54-A78 which are 1.4 Å closer in the holo simulation than in the apo simulations (3.5 and 1.7 Å for the X-ray differences).
Determining the residue cross correlation to investigate whether the motions of one residue are related to the motions of another can aid in deciphering the underlying mechanisms for the observed conformational transitions. We have previously reported the cross-correlation map for the entire GroEL subunit revealing global correlations [23]. In the present work we focus on the equatorial domain which has the advantage of revealing more detailed correlations around the binding site. The corresponding residue-residue correlation map for the equatorial domain is shown in Figure 9A, and highlights 7 off-diagonal correlated areas. These areas are consistently described as correlated throughout all subsets of the simulation (10, 20, and 50 ns intervals), and are the following: (1) residues Lys16-Gly20 (helix A) and Phe66-Asn68 (helix C), including two positively charged residues in helix A, and one negatively charged residue in helix C; (2) residues Gln454+I-le455+Asn458 (helix P) and Thr31-Gly33 (loop region between helix A and b2); (3) The latter region is also correlated to the phosphates of ATP; (4) the phosphates of ATP and residues Gly89-Thr92 (helix D); (5) residues Asp115-Ile119+Ala123 and Asp435-Asn437+Ile440+Ala443; (6) Val411-Val412 (b15) and Leu494-Thr497 (b18) close to the lower hinge; and (7) the adenine ring of ATP and residues Asn479-Thr482 (b16-loop-b17 region).

Extensive unbiased MD simulations
Several computational structural biology attempts have focused on the investigation of the ligand-induced conformational changes in GroEL [28,30,31,33,34]. Comparison of crystal structures at different stages of the functional cycle, NMA analysis and targeted MD simulations provide important information on the conformational space and motions associated to ligand-binding. Nevertheless, all-atom unbiased MD simulations have a superior potential to reveal the mechanisms involved in conformational transitions in proteins since using targeted MD simulations applying fictitious driving forces bias the motions toward the target and may drive the transitions along unrealistic deformations. However, GroEL has been considered to be beyond the analysis by unbiased MD simulations which have been held back due to the large size of the protein. Nevertheless, Sliozberg and Abrams have previously performed a 20 ns long unbiased MD simulation of one subunit, investigating the ATP-driven conformational changes [33]. This simulation manages to sample the transition from the t unbound, tense conformation to r, defined as a semi-relaxed conformation intermediate towards the open r0 conformation. This transition  includes a *20 0 counterclockwise rotation of the apical domain along with closing of the nucleotide binding pocket, and was associated to the initial transition in a 2-stage activation by ATP [13,30]. The T to R transition has been associated to the intra-ring cooperative allosteric response to ATP binding, since functionally the R conformer shows high affinity for ATP [13]. Following this reasoning Sliozberg and Abrams interpreted their results on the way the r structure would be concertedly transmitted to all seven subunits in the cis ring. In our present simulations of the isolated subunit, we sample close to the r structure (minimum deviation of 2.6 Å ), and in addition, these long MD simulations (300 ns) provide so far unexplored insight into the ATP-induced transition from r all the way to a structure that is, according to the RMSd values, close to r0 (minimum deviation of 5.8 Å ). An almost full transition in the opposite direction, from r0 to t, is revealed from a similar simulation of the ligand free apo state starting from the r0 structure (minimum deviation of 5.1 Å ). This transition would correspond to the conversion occurring in the trans ring, which functionally is accompanied by ejection of the hydrolysis product ADP and the co-chaperonin GroES. We thus show for the first time with unbiased MD simulations that the conformational transitions in the GroEL oligomer are favored by the intrinsic behavior of the isolated GroEL subunit. The predictive power of the present study is reinforced by the multiple 100 ns long MD simulations.
The conformer plots and the fluctations along the simulations also reveal that the r conformation is sampled both in the presence and absence of ATP. Thus, our results further add to the accumulating experimental and theoretical proofs of the preexisting equilibrium between inactive and active states (see e.g. [52][53][54][55][56]). A similar trend was also detected in a recent fluorescence study where the authors were able to measure the fraction of molecules in the T and R state of GroEL with increasing ATP concentrations [57]. They determined that about 50% of the molecules were in the T state even with high ATP concentrations, indicating a constant cycling between the two conformations. Moreover, Chaudhry et al. showed that the inherent motions of the unliganded GroEL were ''biased along the transition pathway that leads to the folding-active state'' [58]. X-ray structures of the nonhydrolyzable ATP analog ATPcS, in which the domain rotations are not observed [9], give means to further support a t to r cycling.
These unbiased simulations also show that the open structure, similar to the r0 structure, is attained for the isolated subunit, just in the presence of ADP, and without the need of the cochaperonin GroES. It is accepted that binding of GroES to the heptameric cis ring with the subunits in the r conformation is the main functional determinant for the induction of the r0 conformer, triggering the encapsulation of bound protein substrate into the cavity. Our results thus support that the subunit conformational changes along the functional cycle of GroEL, as revealed by X-ray crystallography and cryoelectron microscopy, are largely intrinsic to the 3D structure (sequence-structure-dynamics) of the subunit (intramolecular), though stabilization of intermediate conformations should be associated to intersubunit or interprotein (intermolecular) interactions in the GroEL-GroES oligomers.
Isolating a monomer from its natural biological assemblage may result in an altered dynamical behavior. However our results indicate that the dominant motion of a single GroEL subunit resembles that occurring in the ensemble of GroEL complex crystal structures. Thus, the dynamics of the subunit alone seems to be reflected in the larger biological assembly. This is seen by the RMSd analysis of the trajectories (Figure 3), as well as in the comparison between the X-ray and MD derived PCs (root mean squared inner product = 0.73) ( Figure S13). The dominance of this intrinsic motion might be related to the relatively small intersubunit contact area (6.6%, or 1,400 A 2 ) [8], which is comparatively low for an oligomeric protein [59,60]. Nevertheless, each of the 7 apical domains interacts with two neighboring apical domains, as well as with one intermediate domain in the initial closed T form [8]. This arrangement imposes constraints on the motion of the subunit, though at present to an unknown degree. It is therefore expected that, for similar timescales, simulating the entire GroEL assembly would lead to a more restricted conformational sampling of the subunit which would most likely be more consistent with the timescale upon which these conformational transitions are thought to occur [15].

Conformational changes
Sliozberg and Abrams described that MgATP binding to the subunit in the t conformation initiates a conformational change mostly due to the strong interaction of Mg 2z and Asp398 [33]. This interaction induces a series of H-bond ruptures (such as the Asp155-Arg395 salt bridge) and H-bond formations that were described in an induced-fit, cascade-like way [61]. We were however unable to reproduce this pattern of interactions. As seen in the conformer plots ( Figure 2) the effect of ATP binding is rather described by causing a shift of conformational equilibria towards the r0 conformation, with the peculiarity that in this case the ATP-bound conformation corresponds to the open conformation of the subunit, while usually ligand binding leads to a closed structure for the majority of proteins. In fact, in GroEL, ATP binding leads to a closed and less dynamic equatorial-domain structure but open and more dynamic (r and r0) subunit structure.
In the case of preexisting equilibria where the ligand binds selectively to an ''active'' conformation the energy barrier between the conformations in equilibrium should be low [52], and binding should not bring a high entropic penalty, as certainly seems to be the case for the GroEL subunit ( Figure 6). Further experimental support for low energy barriers is also found for Hsp70 proteins [62]. Moreover, a thermodynamic study on nucleotide binding to GroEL provided a positive entropy change for the binding of ATP at temperatures higher than 15 0 C [63]. Notwithstanding the fact that these measurements were performed on the complete oligomer, it is noteworthy to relate the trend in entropy change to that found for the isolated monomer.

Detailed changes
Computational studies employing advanced procedures in conjunction with NMA and brownian dynamics have probed allosteric networks in the GroEL system [31,37,39,40]. They have been able to determine and highlight a large set of residues responsible for the transitions in the GroEL complex. Lys80-Asp359, Asp83-Lys327, Arg58-Glu209, Pro33-Asn153, and Gly257-Arg268 are among several intra-subunit interactions that have been identified to be important during the conformational transitions in GroEL [31,39]. Of these, Arg58, Asp83, Gly209, and Lys327 were also highlighted by Tehver et al. [40], and shown to be highly conserved [64].
Our study of the MD trajectories and the large amount of Xray data reinforces these studies as we identify many of the same interactions and residues. Moreover, we identify several other atomic differential interactions brought about by ATP-binding which have not previously been detected. Of particular interest is the tight binding between Lys34 and Glu483 in the presence of ligand ( Figure 8A), which is captured in both the MD simulations and the X-ray crystallographic data. This effect appears as an important rearrangement induced by ATP-binding. These charged residues are situated close to the nucleotide binding site in two separate loops (b16-loop+b17 and aA-loop-b2), which movements interestingly are highly correlated to ATP; residues Asn479-Thr482 to the adenine ring, and Thr31-Gly33 to the phosphates of ATP. The correlation between these areas and ATP is likely to contribute to the stabilization of these loops as observed by fluctuation calculations (Figure 4), which in turn might aid in the salt-bridge formation between Lys34 and Glu483 ( Figure 8B).
More peripheral to the ATP-binding site in the equatorial domain, close to the lower hinge, we observe weaker contacts between Leu134-Asn475, Arg421-Gly471, and Val412-Asn475 ( Figure 8A). This is possibly due to rearrangements of helix N upon nucleotide binding which disrupts the initial interactions found in the t conformers and along our MD simulations of the unbound t forms. Moreover, the DCM analysis pinpoint an altered interaction pattern for helix C in the equatorial domain, possibly explaining the movements of helices A+C in the PCA analysis of X-ray structures. Among these interactions, hydrophobic amino acids are a common denominator; Ala84-Tyr506, Val77-Ala92, Val54-Ala78, Val77-Ala507 all show a tighter binding in the nucleotide bound conformers. Experimentally, this core of hydrophobic amino acids has been shown to be important, and e.g. the Ala92Thr protein vaiant shows a low ATPase activity [65]. Interestingly, the MD simulations also capture the rupture of the Asp179-Lys390 contact within the intermediate domain, and the formation of Arg231-Glu310 contact in the apical domain upon ATP-binding, both consistent with X-ray data.
Realizing that a deep understanding of the function of GroEL as a molecular machine certainly requires analysis of inter-subunit and inter-ring interactions in the oligomer, MD simulations of the complete GroEL complex would be required to probe the full effects of ATP-binding. While these are expected to reveal essential details on the allosteric regulation of GroEL function, w100 ns multisubunit simulations of a system of *600.000 atoms are a very computationally-demanding task. Regardless of the capacity of GroEL to assemble into oligomeric structures, various studies have also shown the surprising facility to obtain stable folded monomers of GroEL by a variety of means [43][44][45][46][47], thay might display a weak chaperone activity of the GroEL monomer [46].
In the present work we have attained detailed and statistically proven results from MD simulations of the isolated GroEL subunit, which have highlighted the importance of considering the proper dynamics and response of the subunit in the context of the large scale transitions in the GroEL complex. Since the equatorial domain holds the ATP binding site, it constitutes the epicenter of the conformational changes in the GroEL complex. By paying extra attention to the detailed mechanisms in this region of the protein during the conformational transitions observed along the simulations we have been able to map important intramolecular rearrangements which potentially are a prerequisit for the intermolecular transitions to occur.

Principal component analysis
The calculation of the PCA modes involves two main steps; (1) the calculation of the covariance matrix, C, of the positional deviations, and (2) the diagonalization of this matrix [67,68]. The 3N dimensional covariance matrix is calculated based on an ensemble of protein structures, and the elements of C are defined as where x i and x j are atomic coordinates and the brackets denote the ensemble average. The diagonalization of the symmetric matrix C involves the eigenvalue problem where A is the eigenvectors and l the associated eigenvalues.
Our PCA calculations were based on the Ca coordinates of the ensemble of the 287 GroEL crystal structures and was performed with the Bio3D package [49]. The X-ray and MD conformers were projected into the sub-space defined by PC1 and PC2, where the maximum variation of the conformational distribution was observed. Plotting these projections results in 'conformer plots' which displays a low dimensional representation of the conformational change in terms of the two principal components.

Molecular dynamics simulations
All-atom MD simulations of the closed and open forms of the isolated GroEL subunit were performed both with and without bound nucleotide (designated holo and apo, respectively). The atomic models were prepared from the high-resolution crystal structures with PDB codes 1XCK chain A [69] and 1SVT chain A [58], for the closed and open forms, respectively. MgATP coordinates for the closed holo (with nucleotide) simulation was collected from the crystal structure with PDB id 1KP8 chain A [70]. All atomic models were prepared with Amber10 [71] and the corresponding Amber03 forcefield [72,73]. ATP and ADP parameters were obtained from Meagher et al. [74]. For each of the simulations, the protein was solvated in a periodic truncated octhahedron box with TIP3 water molecules [75], providing 16 Å of water between the protein surface and the periodic box edge. The solute was minimized for 10,000 steps, followed by 10,000 steps of minimization of the entire system. The protein was then heated to 100 K with weak restraints for 100 ps, and to 300 K in 200 ps. 2 ns of equilibration with constant pressure and temperature (NPT) of the system was performed prior to the production run in order to ensure correct density. The production runs were performed with constant volume and energy (NVE) with a 1 fs time step, using SHAKE constraints on hydrogen-heavy atom bonds.
A total of 14 MD simulations of the GroEL subunit were carried out. Of these, we performed four 300 ns long simulations:

Difference contact maps
Distance matrices were calculated between all 3856 pairs of atoms in order to monitor the atomic interactions (between residues at least four residues apart in sequence) [49]. The distance matrix was issued to residue grouping by only considering the minimal atomic distance between the residue pairs. Residue pairs closer than 4 Å are assumed to be in contact, and constitute the contact matrix for one particular conformation. Contact matrices were calculated for 28 closed (T) apo subunits (PDB code: 1XCK, 1SS8, 1OEL), 28 closed (T) ATP bound subunits (PDB code: 1KP8, 1SX3) and 28 open (R0) ADP bound subunits (PDB codes: 1AON, 1SVT, 1SX4, and 1PF9), and 6000 snapshots obtained from the last 50 ns of the 12 independent MD simulations on the closed GroEL subunit (6 apo and 6 holo). Only contacts with at least 50% occupancy and an average distance difference of 0.5 Å were considered. The difference of two contact maps (DCM), i.e. difference between apo and holo, then defines side-chain contacts which exist in one form, but not in the other.

Additional analysis of simulated data
Clustering analysis, correlation maps, entropy calculations, and hydrogen bond analysis were performed with the the ptraj module of AmberTools [71]. Clustering was performed on the MD conformers using the average-linkage clustering algorithm [76]. Cross-correlation calculations were performed on several subsets of the closed ATP bound simulations (10, 20, and 50 ns intervals) in order to obtain areas of consistent correlations. All figures of GroEL are made in Pymol [77].  Figure S3 Difference contact maps between t, r, and r0 conformers. Difference in side chain contacts are mapped for the X-ray structures (A-E), and MD structures (F). Two residues are assumed to be in contact if the minimal distance between them is v4Å. Only contacts existing in at least half of the structures are considered. Green triangles represent contacts present in the unliganded forms and not in the liganded forms. While blue dots represents contacts which are present in the liganded forms, but not in the unliganded forms. Orange lines display residues in which a change in contact partner has been detected. Red circles depict contacts which are found in both the X-ray and MD analysis. Secondary structure elements are indicated schematically with helices in black and strands in gray. (EPS) Figure S4 Overlap map of the X-ray and MD PCs. Squared overlap between pairs of eigenvectors obtained from X-ray and MD PCA are shown. Dark grey corresponds to values close to 1 (high similarity) while white depict overlap close to 0 (orthogonal). Root mean squared inner product (RMSIP) is provided according to ref. [78].

(EPS)
Video S1 Time-dependent conformer plot of the 300 ns ATPbound simulation. Projection of X-ray (contoured grey dots) and MD (blue) structures (along the simulation time) onto the two most significant principal planes defined by 287 GroEL experimentally obtained structures. The distribution of MD structures is depicted with density shaded blue points. Orange, red, and green dots represent the closed (t), semi-relaxed (r), and open (r0) X-ray structures, respectively. Gray dashed line depicts the average pathway of the conformational sampling. (AVI)