Modulation of Global Low-Frequency Motions Underlies Allosteric Regulation: Demonstration in CRP/FNR Family Transcription Factors

Allostery in bacterial transcription factors arises from changes in global low-frequency protein dynamics. Amino acids that regulate low-frequency dynamics are identified and seen to be evolutionarily conserved.


Introduction
Small regulatory molecules frequently bind proteins at regions remote from the active site. These allosteric events can switch proteins between inactive and active states [1]. Knowledge of the molecular basis of allostery derives from a wealth of theoretical and experimental studies and traditionally describes the process in terms of conformational change within the protein [2,3]. Combinations of X-ray crystallography and NMR have permitted analysis of the ligand binding sites, intermolecular interactions, and conformational fluctuations that underpin diverse allosteric systems [4,5]. There is also considerable evidence that allosteric cooperativity can be communicated between distant sites on proteins through modulation of their dynamic properties, even in cases where there are no structural changes between the ligand-bound (holo) and unbound (apo) forms [6][7][8][9][10][11][12]. Since the original identification, by Cooper and Dryden [4], of this alternative route of "allostery without conformational change," there has been considerable debate over the mechanisms by which dynamic fluctuations are communicated between allosterically coupled sites on proteins.
One hypothesis for fluctuation-induced allostery is that binding modifies the structure of the thermally excited global normal modes and thence the coupling interaction between cooperative elements. This in turn affects the structural ensemble of the distant sites and so the free energy of binding [13][14][15]. Another view maintains that physically connected pathways of excited or repressed dynamics, coupled along their trajectories, connect allosteric sites [16][17][18]. Here we propose the hypothesis that the normal modes of protein structural motion, large-scale motions dispersed across the entire protein, are important carriers of the allosteric signal and act without requiring structural change. Previous studies of the normal modes have demonstrated that conformational transitions in proteins, including those that underpin allosteric regulation dependent on conformational change, are well described by one or a few low-frequency modes [19][20][21][22][23][24][25]. The normal modes, however, can also be used to describe the whole spectrum of internal fluctuations of a protein around a mean structure. The low-frequency global modes, in particular, can involve entire protein domains. Alteration of the normal modes might therefore be communicated to distant sites of a protein as a change in the degree of motion around a mean structure without overall conformational change. Global low-frequency fluctuation therefore represents an alternative theoretical approach to allosteric communication that does not depend upon conformational change. An important consequence of this alternative mechanism of allosteric communication is that it can be captured by coarse-grained representations and models, such as the elastic network model (ENM).
Here we develop this theory, and the validity of a coarse-grained model approach, through a computational and experimental study of the homodimeric CRP/FNR family transcription factors Catabolite Activator Protein (CAP) of Escherichia coli and GlxR of Corynebacterium glutamicum.
CAP is a 210-amino-acid transcription factor that binds cAMP generated by adenylyl cyclase in response to the phosphorylated form of Enzyme IIA^Glc (phosphorylated in response to the phosphoenolpyruvate-carbohydrate phosphotransferase system) [26,27]. cAMP-bound CAP regulates the transcription of over 100 genes crucial for carbon utilization through its binding to a specific promoter region and recruitment of RNA polymerase [28]. Previous studies of the ligand binding domain of CAP demonstrated negative cooperativity between cAMP binding sites in the absence of structural change within this domain [10]. The observed negative cooperativity in this isolated domain occurs through a conformational entropic penalty for binding the second molecule of cAMP, but there is no mechanistic description for how such a phenomenon can occur in the full-length protein. Seven of eight CAP mutants previously examined showed a direct correlation between ΔΔG and the adiabatic compressibility (β°s), where proteins with a higher β°s (reflecting increased structural flexibility in solution) demonstrated enhanced negative cooperativity [29]. While it is therefore reasonable to hypothesize a role for protein dynamics in allostery in CAP, there is no conceptual framework to describe how these changes in motion might arise, how they contribute to allostery, and how a resulting theory might translate to related molecules. CAP is therefore a suitable model system for a theoretical and experimental investigation of the contribution of the normal modes to allostery.
Here we propose that changes to global low-frequency protein backbone fluctuations are carriers of an allosteric signal in CAP and present this in the context of a significant new quantitative theory for allosteric coupling. We produce coarse-grained models that describe global low-frequency protein backbone motions of CAP and show a strong correlation between negative cooperativity for cAMP and modulation of the delocalised normal modes on ligand binding without a requirement for a spatially distinct physical pathway or conformational change. We demonstrate experimentally that altered connectivity between backbone elements in CAP can give predictable alterations to cooperativity for cAMP binding through altered mode amplitudes. We further demonstrate a broader applicability for this theory using an additional CRP/FNR family transcription factor, GlxR of C. glutamicum. We unite our findings for CAP and GlxR to determine the extent to which key inter- and intramolecular parameters contribute to negative cooperativity in CRP/FNR family transcription factors. We further demonstrate that amino acids that contribute significantly to allosteric control are more likely to be conserved in variant proteins from diverse species. The theoretical and experimental work and associated data analysis provide both a significant advance in our understanding of the mechanisms that underpin the dynamic regulation of allostery and also a means for informed rational engineering of cooperativity in proteins.

Author Summary
Allostery is a process by which a molecule binding to one site of a protein alters the activity of the protein at another site. Allostery is typically thought to occur through a change in protein structure, but there is now clear evidence that the dynamic properties of a protein can also regulate allostery without a change in overall conformation. Here we examine two members of a large family of bacterial transcription factors and provide a mechanism to describe the allosteric binding of their activating ligands. We demonstrate, in these systems, that allostery arises as a natural consequence of changes in global low-frequency protein fluctuations on ligand binding. We further demonstrate that the higher dimensional parameter space that describes all potential variant transcription factors can be reduced to a two-dimensional free energy landscape that determines the key molecular parameters that predominantly regulate allostery. We additionally show that the amino acids we determine as contributing sensitively to allosteric control tend to be conserved in diverse bacteria; thus we identify a link between residues that contribute to low-frequency fluctuations and evolutionary selection pressures.

An ENM for CAP Correctly Predicts Negative Allostery
Addressing computationally those cases of allostery that arise from fluctuation modification, without conformational change, requires a very different approach from that used for the classic Monod-Wyman-Changeux case of conformational switching. On the one hand, fully atomistic simulations are not capable of attaining, in most cases, the long dynamical time scales explored by the slow, global dynamic modes whose thermodynamics are essential for the effect. On the other hand, because these modes by their nature integrate many local interactions into their effective geometries and potentials, coarser-grained models of protein structure can possibly provide sufficiently accurate calculations of the relevant dynamics, while allowing the computation of dynamics to the necessary timescales. Models that represent protein structures by Cα-atom positions alone reproduce low-frequency modes well in comparison to experimental data [21,30]. We therefore used the co-ordinates from a high-resolution crystal structure determination of the full-length cAMP-bound CAP homodimer to construct an ENM [31] for the apoprotein as well as single and double ligand-bound holoprotein states (Figure S1). Free energies, ΔG, were calculated using the full harmonic solution, and the negatively cooperative binding of cAMP to wild-type full-length CAP was confirmed by calculating a positive value for ΔΔG = (ΔG_holo2 − ΔG_holo1) − (ΔG_holo1 − ΔG_apo) = 179 cal mol⁻¹, consistent with experimentally obtained values (Table S2) [32][33][34][35]. To confirm that the total motion within the ENM is not an artefact of coarse-graining, we also carried out molecular dynamics simulations [36] with full atomistic detail, including an explicit water model, and performed principal component analysis (PCA) on the generated trajectories [37]. B factors represent the convolution of static and dynamic disorder in the crystal. Dynamic disorder can be attributed to local motions of individual atoms, whereas static disorder represents different atomic positions in the individual protein molecules. The experimental B factors, albeit constrained by crystal packing, therefore represent a reasonable approximation of the local motions in solution [38]. ENMs and atomistic PCAs represent overall unconstrained dynamic motions and hence show much larger deviations in the termini and the flexible loop regions (for example, residues 150-175 of Figure S2). The crystallographic B factor data were qualitatively well represented at either scale of coarse-graining (Figure S2a) and the distribution of the normal mode frequencies agreed well between ENM and PCA (Figure S2b). The total predicted motion within the ENM, at least at the level of B factors and low-frequency mode structure, is therefore similar to other methods of analysis and not an arbitrary feature of the model. Since the fluctuation-induced allosteric effect arises from the low-frequency structure of the protein dynamics, the ENM level of analysis applies to the experimental phenomena studied here.
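The free-energy bookkeeping behind ΔΔG (apo, singly bound, and doubly bound networks) can be illustrated with a deliberately small harmonic network. The sketch below is not the CAP ENM: the 12-node antiparallel "ladder" dimer, the unit spring constants, the weak tether that removes the rigid-body mode, and the representation of a bound ligand as one extra spring per site are all assumptions made for illustration. It relies on the classical harmonic result that, up to constants that cancel in ΔΔG, G/kT = ½ ln det K. The sign and size of the toy ΔΔG should not be read as a prediction for CAP.

```python
import math

N_PER_MONOMER = 6  # hypothetical toy size, not the 210 residues of CAP


def build_matrix(springs, n, tether=1e-3):
    """Scalar (Kirchhoff-style) stiffness matrix from (i, j, k) springs.

    The small on-site tether is an assumption that removes the zero-frequency
    rigid-body mode so that the determinant is finite.
    """
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        K[i][i] = tether
    for i, j, k in springs:
        K[i][i] += k
        K[j][j] += k
        K[i][j] -= k
        K[j][i] -= k
    return K


def log_det(K):
    """log(det K) of a symmetric positive-definite matrix via Cholesky."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                L[i][i] = math.sqrt(s)
                total += 2.0 * math.log(L[i][i])
            else:
                L[i][j] = s / L[j][j]
    return total


def toy_dimer(k_chain=1.0, k_interface=1.0):
    """Two 6-node chains (monomers A and B) joined antiparallel at an interface."""
    m = N_PER_MONOMER
    springs = []
    for offset in (0, m):
        springs += [(offset + i, offset + i + 1, k_chain) for i in range(m - 1)]
    springs += [(i, m + (m - 1 - i), k_interface) for i in range(m)]
    return springs


def free_energy(springs, n, bound_sites, k_ligand=2.0):
    """Harmonic free energy in units of kT, up to an additive constant."""
    extra = [(a, b, k_ligand) for a, b in bound_sites]
    return 0.5 * log_det(build_matrix(springs + extra, n))


def ddg(springs, n, site_a, site_b):
    """ΔΔG/kT = (G_holo2 - G_holo1) - (G_holo1 - G_apo), written symmetrically."""
    g_apo = free_energy(springs, n, [])
    g_a = free_energy(springs, n, [site_a])
    g_b = free_energy(springs, n, [site_b])
    g_ab = free_energy(springs, n, [site_a, site_b])
    return (g_ab - g_a) - (g_b - g_apo)


SITE_A = (1, 3)                                   # nodes bridged by ligand, monomer A
SITE_B = (N_PER_MONOMER + 1, N_PER_MONOMER + 3)   # the same nodes in monomer B
wild_type = ddg(toy_dimer(), 2 * N_PER_MONOMER, SITE_A, SITE_B)
print(f"toy ΔΔG = {wild_type:.4f} kT")
```

Because the toy dimer is symmetric, the two singly bound states have the same free energy and the symmetric expression reduces to the two-term definition of ΔΔG used in the text.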
We hypothesized that if side-chain replacement on amino acids at sites distinct from the cAMP binding site of CAP does not cause conformational rearrangement, yet increases or decreases amino acid side chain hydrophobic or electrostatic forces in the local environment, the normal modes of protein motion would be altered without significant structural changes. If these changes to the normal modes have sufficiently global effects, they will in turn modify cooperativity between the cAMP binding sites through an entropic contribution to the binding free energy. Amino acid side chain replacement can therefore act as a sensitive probe of the contribution of side chain connectivity to cooperativity and the underlying mechanism for allostery within the elastic structure of the protein. The change in allosteric free energy (ΔΔG) as a function of altering the entire primary amino acid sequence (one residue at a time) can therefore be viewed as a quantitative map of the contribution of the normal modes to cooperativity. Such a quantitative map can be constructed either by simulation or experiment; in practice, it is convenient, as we demonstrate below, to use simulation of the entire allosteric map to guide mutagenesis for experimental study. We therefore performed a scanning computational mutagenesis of the entire CAP protein to investigate the influence of side chain connectivity on cooperativity via its influence on the normal modes.
The scanning computational mutagenesis map was determined by changing the effective elastic potential between protein backbone carbon atoms in the neighbourhood of each residue of the ENM in turn and calculating the effect on ΔΔG. The increase and decrease in elastic potential in the ENM was hypothesized to simulate the effects of local strengthening and weakening of side chain interactions in CAP. A color-coded map corresponding to altered cooperativity with changing local interaction strength is plotted graphically by amino acid residue (Figure 1a) and in real space (Figure 1b). The global map for the ENM (Figure 1a) demonstrates large regions where cooperativity is susceptible to control by altering side chain connectivity. It is important to note that these control regions are not necessarily adjacent to the cAMP-binding site. For example, regions corresponding to amino acids 127-137 (at the interface between the two monomers) and 150-162 (within the DNA binding domain, far from both the dimer interface and cAMP binding regions) appear to exercise considerable control over cooperativity without contributing to a spatially distinct dynamic pathway and without direct interference with the cAMP binding site.
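The scanning procedure described above can be sketched on a toy network: for each residue in turn, every spring touching that residue (in both monomers of the homodimer) is multiplied by a factor k_R/k, and ΔΔG is recomputed. Everything concrete here is an assumption for illustration: the 12-node antiparallel dimer, the ligand modelled as one extra spring per site that is left unscaled, the weak tether, and the classical harmonic free energy G/kT = ½ ln det K. The output is the change in ΔΔG/kT relative to the unmutated toy network, not a reproduction of Figure 1a.

```python
import math

M = 6                               # nodes per monomer (hypothetical toy size)
N = 2 * M
SITES = [(1, 3), (M + 1, M + 3)]    # node pairs bridged by a bound ligand
K_LIGAND = 2.0                      # assumed ligand spring, untouched by mutation


def springs(residue=None, factor=1.0):
    """Toy antiparallel dimer; springs touching `residue` in both monomers
    are multiplied by `factor`, playing the role of k_R/k."""
    net = []
    for off in (0, M):
        net += [(off + i, off + i + 1, 1.0) for i in range(M - 1)]
    net += [(i, M + (M - 1 - i), 1.0) for i in range(M)]
    if residue is None:
        return net
    hit = {residue, M + residue}
    return [(i, j, k * factor if (i in hit or j in hit) else k) for i, j, k in net]


def half_log_det(net):
    """0.5 * log det of the tethered stiffness matrix (G/kT up to a constant)."""
    K = [[0.0] * N for _ in range(N)]
    for i in range(N):
        K[i][i] = 1e-3              # weak tether removes the rigid-body mode
    for i, j, k in net:
        K[i][i] += k
        K[j][j] += k
        K[i][j] -= k
        K[j][i] -= k
    L = [[0.0] * N for _ in range(N)]
    total = 0.0
    for i in range(N):              # Cholesky factorisation, row by row
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                L[i][i] = math.sqrt(s)
                total += math.log(L[i][i])   # sum of log L_ii = 0.5 * log det K
            else:
                L[i][j] = s / L[j][j]
    return total


def ddg(net):
    """ΔΔG/kT for binding at the two sites of network `net`."""
    def g(bound):
        return half_log_det(net + [(a, b, K_LIGAND) for a, b in bound])
    a, b = SITES
    return g([a, b]) - g([a]) - g([b]) + g([])


def scan(factors=(0.25, 4.0)):
    """Change in ΔΔG/kT relative to wild type for every residue and factor."""
    wild_type = ddg(springs())
    return {f: [ddg(springs(r, f)) - wild_type for r in range(M)] for f in factors}


for factor, row in scan().items():
    print(f"k_R/k = {factor}: " + " ".join(f"{v:+.4f}" for v in row))
```

Each row of the printed output corresponds to one horizontal slice of a map like Figure 1a: one value per residue at a fixed k_R/k.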

Residues of CAP That Modify the Normal Modes Predictably Alter Allostery
To experimentally test the model and demonstrate rational engineering and control of allostery, we selected the residues of CAP highlighted in Figure 1b. We examined amino acids predicted to show altered (V132, H160) or neutral (V140) responses to altered amino acid side chain interactions (Table 1). The removal (V132A) or addition (V132L) of a side chain methyl group of V132 was engineered to decrease and increase, respectively, the strength of hydrophobic interaction across the dimer interface. Computation predicted that these mutations would result in more negative and positive cooperativity in CAP, respectively (Figure 2a) and that the most important contacts contributing to this effect were with L62 and V132 of the opposing monomer (Figure S3b). High-resolution X-ray crystal structures of CAP mutants V132A and V132L demonstrated that these variants possessed decreased and increased hydrophobic interactions across the dimer interface, respectively (Figure 2b). Comparison of variant crystal structures with wild-type demonstrated that there was no statistically significant change in structure (Figure S4, Table S1). Cooperativity for cAMP binding was studied by isothermal titration calorimetry (ITC) for wild-type, V132A, and V132L proteins to examine whether the experimentally observed changes in cooperativity matched computational predictions (Figure 2c-e, Table 1). The ITC data were well described by a three-site model, with two major and one minor cAMP binding site (Figure S5) [39], and allowed derivation of the thermodynamic parameters for all proteins (Table S2). The qualitative computational prediction for altered cAMP cooperativity was tested experimentally, including a significant controlled inversion of the sign of the cAMP cooperativity (V132L). The thermodynamic parameters for wild-type CAP demonstrated an overall favourable entropy change and unfavourable enthalpy change on binding the second molecule of cAMP, consistent with a previous report [39].
A previous study of the truncated CAP ligand-binding domain demonstrated that binding of the second molecule of cAMP was entropically unfavoured [10]. The difference in thermodynamics between our experiments (Table S2) and previous experiments using the ligand-binding domain alone [10] is therefore likely due to the contribution of motions of the DNA binding domain [40]. This interpretation is supported by previous analysis that has calculated the thermodynamic contribution of the DNA binding domains in the switch to the active conformation [41]. Previous calculations and experiments anticipate that, while the contribution of the normal modes to allostery is entropically controlled (in terms of the net allosteric free energy), coupling of the low-frequency modes to side-chain motion generically gives rise to additional, but compensating, contributions to enthalpy and entropy, and this is observed in our thermodynamic data (Table S2) [9]. It is notable that, due to this self-cancelling of the contribution of local fast modes within the total free energy, the entropically driven ENM is able to predict qualitative changes to experimental cooperativity despite the local mode contribution of enthalpy to overall thermodynamics.
The ENM calculations predicted a reduction in the negative cooperativity of CAP in response to a reduction in the strength of the local interactions of residue H160 (Figure 3a). In particular, H160 was predicted to form interactions that contribute to allostery with D162 and Q165 (Figure S3a). The mutation H160L was predicted to break these interactions while maintaining side chain bulk; this was confirmed by X-ray crystallography of the H160L CAP protein (Figure 3b). No overall change in H160L protein structure was evident compared to wild-type (Figure S4, Table S1). ITC experiments (Figure 3c) demonstrated that cooperativity for cAMP became less negative, as predicted by computation (Table 1). This crucial experiment demonstrates that altering low-frequency motions at a site distant from both the ligand binding site and the dimer interface, and from any presumed physical pathway of structural change connecting these sites, can nonetheless give predictable effects on cooperativity.
Altering local interactions associated with V140 was predicted by the ENM to have minimal effects on cooperativity (Figure 4a) despite significant local hydrophobic interactions; we therefore examined the effect of decreased and increased local hydrophobic interactions in V140A and V140L variants as a control experiment. The V140L mutant protein had no discernible effect on protein structure (Figure S4). As predicted by the ENM mutagenesis, measurement of cooperativity for cAMP in V140L by ITC (Figure 4c) showed no differences when compared to wild-type (Table 1). Interestingly, although V140A protein showed no global change in structure (Figure S4), there is, in this mutation, a significant local conformational change evident in the crystal structure, where the mutated V140A residue formed a new hydrophobic contact with the rotated side chain of C179 that is not present in the wild-type or V140L proteins (Figure 4b). When included in the model, simulated as k_C179/k = 4, this new contact revealed new interactions within the monomer (Figure S3a) that drove CAP towards positive cooperativity on simulation (Table 1). ITC experiments (Figure 4d) demonstrated that this CAP variant with the identified side chain rearrangement was positively cooperative, thus supporting the qualitative prediction of the model.
A bar graph for the calculated and observed values for K2/K1 revealed the agreement in the direction of the change of cooperativity on simulation and experiment (Figure S6a). A plot of the experimentally observed value for K2/K1 against that predicted from the ENM demonstrated a correlated relationship where observed increases to K2/K1 are associated with similar changes to K2/K1 by the ENM (Figure S6b). The consistency in prediction by the ENM and the quantitative correlation between predicted and observed changes do not support the notion that the agreement between experiment and the ENM is due to a chance occurrence.
The ENM can provide further insight into the mechanism by which allosteric control is associated with alterations to the normal modes. No global structural changes were induced in the ENM simulations or were evident from crystal structures of variant proteins; only the pattern of coupled low-frequency fluctuations was modified by the simulated side-chain mutations. This appearance of "control at a distance" in the CAP homodimer is explained, through contributions to binding entropy, if there are correlations in the low-frequency motions between cAMP binding sites and if ligand binding or side chain mutation modifies this correlation [42]. As all fluctuating systems dominated by locally harmonic interactions possess a structure of normal modes, with just such distant correlations, they suggest the mechanism for allostery in CAP. To examine whether the mutations studied here can have such distant effects, we calculated the change to local Cα flexibility in the case of tightening and loosening side chain interactions at V132 at the dimer interface (Figure 5a). Modifications to simulated backbone flexibility are present throughout CAP with varying amplitude and furthermore follow opposite signs at k_V132/k = 0.25 (V132A) and k_V132/k = 4 (V132L). For example, k_V132/k = 4 shows significant tightening of the protein (compare Figure 5a and Figure S3b). An examination of the effect of simulated mutations at V140 and H160 on nonlocal Cα flexibility reinforces this finding (Figure S7). The predominantly neutral mutation, V140L, simulated as k_V140/k = 4, has little effect on protein backbone flexibility, except at sites where V140 has calculated interactions, consistent with the absence of any effect on allostery on both simulation and experiment.
In the case of H160 (k_H160/k = 0.25; at a surface loop distant from both the cAMP binding site and dimer interface) and V140A (k_C179/k = 4, k_V140/k = 0.25), the simulated mutations create a uniform decrease in flexibility throughout the monomer except for the straightforward loosening/tightening at the site of the mutations. There is a general trend, therefore, for those simulated mutations that decrease negative cooperativity to be associated with decreased protein backbone motion nonlocally.

Figure 1. (A) Global map for the ENM plotting amino acid number for the CAP monomer and dimensionless change in spring constant (k_R/k; corresponds to k_(amino acid number)/relative spring strength). The colour chart represents changes in the ratio of the second to first dissociation constants for cAMP. White corresponds to values of K2/K1 predicted by the wild-type ENM. Red corresponds to increased values of K2/K1 (increased negative cooperativity) and blue corresponds to decreased values of K2/K1 (decreased negative cooperativity and positive cooperativity). (B) The global map plotted in real space onto the wild-type CAP homodimer structure at k_R/k = 0.25. The specific residues investigated in this study are indicated. doi:10.1371/journal.pbio.1001651.g001
A specific requirement of global low-frequency motion as an underpinning mechanism for allostery at a distance is a coupling between protein motion and the behaviour of the cAMP-binding site. We find that the loosening and tightening effects of simulated mutations are correlated with significant modulation of backbone flexibility in the region of the cAMP-binding site (amino acids 71-74, 83-85, and 121) (Figure 5b). The figure shows that, in general, changes in root-mean-square deviation (rmsd) at the ligand-binding site induced by mutation (in this case, k_R/k = 0.25) correlate with cooperativity. Mutations that increase motion at the ligand-binding site are associated with an increase in the extent of negative cooperativity and vice versa. This is entirely consistent with the controlling entropic allosteric mechanism in these cases, providing that cAMP binding has the effect of increasing local rigidity. This interaction between the heightened local motions following the first cAMP-binding event creates an entropic contribution to negative cooperativity in ΔΔG [9]. Heightened fluctuation at the second binding site (on binding the first molecule of cAMP) is a general mechanism for negative cooperativity without conformational change [6]. Positive cooperativity without conformational change can be induced by reducing the fluctuation amplitude (for example, the MetJ transcription factor of E. coli [9]).

Global Low-Frequency Dynamics Regulates Allostery in the CRP/FNR Family Transcription Factor GlxR
Studies using CAP have successfully demonstrated that changes to global low-frequency protein dynamics are associated with allostery. We investigated another protein to explore the more general applicability of the mechanism. GlxR of C. glutamicum is a cAMP-binding homodimeric transcription factor of the CRP/FNR family that activates genes required for aerobic respiration, glycolysis, and ATP synthesis [43,44]. We solved the X-ray crystal structure of the GlxR apoprotein to produce an ENM for the non-cAMP-bound state [45]. Coordinates from an available crystal structure determination of the full-length cAMP-bound GlxR homodimer allowed us to construct an ENM for the single and double ligand-bound holoprotein states. Examination of the structures for GlxR in the apo and holo forms revealed no significant difference in structure. GlxR therefore represents a new exemplar for allostery in the absence of conformational change. Free energies, calculated from ENMs for GlxR, predicted considerably greater negative cooperative binding of cAMP (K2/K1 = 2.37; ΔΔG = 513 cal mol⁻¹) than for CAP (K2/K1 = 1.35; ΔΔG = 179 cal mol⁻¹). This prediction of enhanced negative cooperativity was confirmed by experiment, with an observed value for K2/K1 of 19.47 (Table 2). A computational scanning mutagenesis map was produced for GlxR, as done previously for CAP, and altered cooperativity with changing local interaction strength is plotted graphically by amino acid residue (Figure 6a) and in real space (Figure 6b). Both local tightening and loosening across the dimer interface, depending on the residue, were predicted to reduce negative cooperativity and therefore provide a robust experimental test of the model. We generated dimer interface loosening (k_L134/k = 0.25; L134V; Figure 7a) and tightening (k_A131/k = 4; A131V; Figure 7b) GlxR variants and compared simulated and experimental values for cooperativity in these proteins.
Both L134V and A131V showed a clear reduction in negative cooperativity, as predicted, when compared to wild-type (Table 2) by ITC (Figure 7c-e), despite the fact that the mutants have opposing effects on hydrophobic interactions across the dimer interface. Allostery is therefore correlated with global low-frequency dynamics in an additional CRP/FNR family transcription factor.
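The paired values of K2/K1 and ΔΔG quoted above are consistent with the standard relation K2/K1 = exp(ΔΔG/RT). The short sketch below makes that conversion explicit. The temperature of 298.15 K is an assumption (the temperature is not stated in this section), and the function names are illustrative only.

```python
import math

R_CAL = 1.987204  # gas constant in cal mol^-1 K^-1


def k2_over_k1(ddg_cal_per_mol, temperature_k=298.15):
    """Ratio of second to first dissociation constants from ΔΔG (cal/mol)."""
    return math.exp(ddg_cal_per_mol / (R_CAL * temperature_k))


def ddg_from_ratio(ratio, temperature_k=298.15):
    """ΔΔG (cal/mol) corresponding to a given K2/K1 ratio."""
    return R_CAL * temperature_k * math.log(ratio)


# ENM free energies quoted in the text for CAP and GlxR
for name, ddg in (("CAP ENM", 179.0), ("GlxR ENM", 513.0)):
    print(f"{name}: ΔΔG = {ddg:.0f} cal/mol -> K2/K1 = {k2_over_k1(ddg):.2f}")

# Observed GlxR ratio from Table 2, converted back to a free energy
print(f"GlxR observed: K2/K1 = 19.47 -> ΔΔG = {ddg_from_ratio(19.47):.0f} cal/mol")
```

Under this assumed temperature the two ENM free energies give ratios close to the quoted 1.35 and 2.37, and a ratio greater than 1 (positive ΔΔG) corresponds to negative cooperativity.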

Determining Design Parameters for Mapping Dynamically Driven Allostery
Our findings indicate general biophysical principles that describe the emergence of negative cooperativity in CRP/FNR family transcription factors through the allosteric modulation of normal modes. The property that allosteric effects are carried in general by the more globally distributed, and so typically longer wavelength, normal modes motivated the exploration of the underlying physics by coarse-graining the CAP and GlxR representations even further into rotational-translational block representations [46]. Two coarse-grained blocks per monomer (one is the entire DNA-binding region, coupled only to the other block of its own monomer) emerged naturally from the many residue-residue couplings internal to and between monomers at the molecular level. Figures 8a and 8b display the block structure and the corresponding "super-coarse-grained" model. A single representative internal mode within each dynamically tight block and the coupling strengths between the blocks (including coupling across the dimer interface) were investigated as "design parameters" for a general class of cooperative homodimer. Figures 8c (CAP) and 8d (GlxR) show allosteric cooperativity, calculated at this high level of coarse-graining, as a function of the integrated coupling strengths within the ligand binding domain (k_1) and between monomers (k_12). Points below and above the z = 0 plane correspond to positive and negative cooperativity, respectively. The wild-type proteins for both CAP and GlxR are offset from the maxima of anti-cooperative ridges in the two-dimensional free energy landscapes that emerge. At this position, loosening coupling internal to monomers (k_1) moves the system into a basin of less negative cooperativity (GlxR) or positive cooperativity (CAP), while loosening in the coupling region (k_12) moves the system for both CAP and GlxR to the top of the ridge (red) to increase negative cooperativity.
Further analysis demonstrated consistency in the negative cooperativity arising through the normal modes in the ENM and in the super-coarse-grained model. For example, the simulated loosening (k_V132/k = 0.25; V132A) and tightening (k_V132/k = 4; V132L) mutations of the CAP ENM and the tightening (k_A131/k = 4; A131V) mutation of GlxR alter cooperativity through generating effective changes in k_12 at the super-coarse-grained level. The super-coarse-grained model therefore effectively reveals the critical intra- and intermolecular parameters that associate with cooperativity and how these parameters can be altered to move within the allosteric free energy landscape.

Amino Acid Residues That Contribute to Allostery in CAP Are Evolutionarily Conserved
If cooperativity confers a selective advantage on the organism, then the allosteric free energy landscape can also be viewed as an evolutionary landscape. In this case, the position of a protein within the landscape depends upon selection pressures that impact upon k_1 and k_12. This general hypothesis can be used to make an additional significant experimental prediction. If the similar position of CAP and GlxR within their respective free energy landscapes is the result of a selection pressure, then we predict that amino acids that contribute significantly to quantitative allosteric control (Figure 1a and 6d) will be more invariant in related proteins from different species. We therefore examined 163 CAP variants from diverse bacterial species and plotted the frequency of mutation of each amino acid residue against the contribution of that amino acid to allostery (defined as the absolute change (Δ) in K2/K1 for that amino acid in the canonical CAP ENM at k_R/k = 0.25). We found evidence that the rate at which an amino acid mutates is negatively related to ΔK2/K1 (LRT, G2 = 33.7, p<0.001; Figure 9). The coefficient quantifying this decrease, β1, was significantly different from zero [95% CI = (−3.34, −1.49)]. Amino acids of CAP that contribute to allostery through regulation of low-frequency protein dynamics are therefore more likely to be conserved in CAP variants through their contribution to protein function. Note that a test for overdispersion was significant, even after allostery had been accounted for (LRT, G1 = 1,663.9, p<0.001), suggesting that other variables also have an influence on mutation rates.
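The kind of test reported above (a likelihood-ratio test for a negative slope, with a confidence interval on the coefficient) can be sketched as follows. The statistical model is not specified in this section, so the Poisson log-linear form used here is an assumption, as are the Wald-type confidence interval and the toy data, which are invented purely for illustration and are not the 163-variant CAP data set.

```python
import math


def log_likelihood(y, mu):
    """Poisson log-likelihood of counts y given means mu."""
    return sum(yi * math.log(mi) - mi - math.lgamma(yi + 1) for yi, mi in zip(y, mu))


def fit_poisson(x, y, iterations=50):
    """Newton-Raphson fit of log(mu) = b0 + b1 * x.

    Returns (b0, b1, standard error of b1, log-likelihood).
    """
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for step in range(iterations + 1):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        if step == iterations:          # final pass: report, do not update
            break
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se_b1 = math.sqrt(h00 / det)        # from the inverse information matrix
    return b0, b1, se_b1, log_likelihood(y, mu)


# Hypothetical data: allosteric contribution per residue and substitution counts
x = [0.00, 0.05, 0.10, 0.20, 0.30, 0.40]
y = [30, 28, 20, 15, 9, 6]

b0, b1, se_b1, ll_full = fit_poisson(x, y)
mean_y = sum(y) / len(y)
ll_null = log_likelihood(y, [mean_y] * len(y))   # intercept-only model

G = 2.0 * (ll_full - ll_null)                    # likelihood-ratio statistic
p_value = math.erfc(math.sqrt(G / 2.0))          # chi-square tail, 1 d.f.
ci = (b1 - 1.96 * se_b1, b1 + 1.96 * se_b1)      # Wald 95% interval

print(f"b1 = {b1:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), G = {G:.1f}, p = {p_value:.2g}")
```

A negative slope with a confidence interval that excludes zero is the pattern the text reports for CAP; an overdispersion check would need an additional model (for example a negative binomial), which is not attempted here.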

Discussion
Here we demonstrate that negative allostery in CRP/FNR family transcription factors is correlated with modulation of the normal modes of protein motion on ligand binding in the absence of conformational change. The model makes key predictions that we test at selected sites of the CAP and GlxR proteins, the latter identified as an important new exemplar for allostery in the absence of conformational change. The alterations in protein flexibility that are a signature for allostery in CRP/FNR family transcription factors are a consequence of the global nature of the normal modes responsible, and mutations that predictably alter cooperativity do so by influencing protein backbone flexibility. Our theory describes how allostery can arise from changes to low-frequency dynamics in the absence of any mean structural change. The theory is particularly significant as it describes allostery as a natural consequence of the dynamic properties of a protein without a requirement for spatially localised dynamic pathways between allosteric sites. The allostery observed is unlikely to have microheterogeneity as an alternative explanation, as all CAP proteins crystallised as a single superimposable structure. Any form of heterogeneity reduces the likelihood of forming ordered crystals [47]. Microheterogeneity is therefore not supported as a molecular cause for allostery in CAP.
The possibility of a direct interaction between cAMP binding sites might also be considered as a mechanism to explain the allostery observed. The closest distance between the two cAMP molecules in the CAP dimer is 9.5 Å (the distance between the N6 atoms of the adenine ring). Although it is impossible to conclusively eliminate small local changes caused at the second site by binding of the first molecule of cAMP, no conformational changes have been reported in this region in previous NMR studies, making this explanation unlikely. A direct interaction is made even less likely because, as described above, any invoked direct interaction between cAMP binding sites would have to consistently match not only the qualitative aspects of the computational predictions for the role of the global modes, but also their quantitative correlation with the observed experimental values. Analysis of the relationship between Cartesian distance and protein motions demonstrated strongly correlated motions between allosteric sites at distances of approximately 10-20 Å [48], and the global normal modes are a suitable candidate to mediate such correlations in CRP/FNR family transcription factors.
The range of available sites for side chain mutagenesis of CRP/FNR family transcription factors does not constitute as large a set of separate and independent control parameters as at first seems but, to a good approximation, explores a lower dimensional space (i.e., reducing the very high dimensional parameter space of the entire number of residues, just one slice of which is represented in Figures 1a and 6d, to the two-dimensional parameter spaces of Figure 8c-d). We hypothesize that this two-dimensional parameter space is, in turn, related to an evolutionary landscape for a protein.
In the case of CAP and GlxR, our analysis reveals that evolutionary selection has resulted in the location of the proteins in a region close to maximizing negative cooperativity. The extent [...] 6 kcal mol⁻¹). The values of ΔΔG observed and manipulated experimentally are those that modulate the concentration range of cAMP to which the system is sensitive by about one order of magnitude. Engineering of cooperativity is therefore possible by manipulating ΔΔG, as described here, with the caveat that it is likely only possible over a thermodynamic range to which the protein is responsive.
We find that there is a selection pressure against mutation of residues that contribute to allostery in CAP variants. A significant question that arises, therefore, is that of the selective advantage provided through negative cooperativity in CAP. In general, the advantages conferred by negative cooperativity in biological systems are not well resolved [49]. It is proposed that negative cooperativity reduces the sensitivity of a system and extends the concentration range over which a response can be observed [50]. In metabolism, recent modelling suggests that there is a significant overall advantage for metabolic pathway flux with components showing negative cooperativity [51,52]. In transcriptional regulation, negative cooperativity in the binding of D-camphor to the CamR repressor of Pseudomonas putida is proposed to enable coupling of high specificity for D-camphor with a physiological response to high concentrations of the metabolite [53]. Against this framework, it is reasonable to conjecture that negative cooperativity in CAP offers a selective advantage by increasing the concentration range over which a transcriptional response can be generated [54]. The decreased sensitivity of the response to cAMP in negative cooperativity might result in a selective advantage through resource conservation when compared to the amplifying effect of a signal response in positive cooperativity [50]. The position within the effective parameter space can also allow CAP variants to further tune cooperativity in either direction without a potentially disastrous influence on protein structure and therefore function. Future experiments to validate the selective advantage provided by negative cooperativity will therefore be crucial and might typically combine high-throughput sequencing of extensive mutational libraries of CAP, after selection in E. coli, with the simulated mutational map of this study [55].
The super-coarse-graining and finer-grained tools we have developed and tested in this work suggest a route to artificial protein design through modification of protein low-frequency fluctuations without compromise of structure. The mechanism also reflects an important balance between phenomena at different length scales within molecular biology. The role of the global normal modes in conveying allosteric signals requires a similarly coarse-grained picture of the protein to identify and discuss the mechanism. On the other hand, the exquisite specificity to local biochemistry is preserved in the mechanism; a set of single residues, themselves spatially distant from either binding site, exercise significant control on the size (and sign) of the underlying allosteric signal. The delicate interactions of effects at different length scales are missed without such a multiscale approach to the physics of protein dynamics. Changes to the normal modes are presented as an important new theory to describe how allostery can arise in the absence of structural change and provide an important theoretical context within which to frame global issues of allostery in proteins.
Table 2 note. The ratio of the second to first dissociation constants for cAMP (K_2/K_1) for wild-type and mutant GlxR proteins was calculated from the ENM or obtained by ITC. The p value is for a comparison of means to the wild-type. doi:10.1371/journal.pbio.1001651.t002

Protein Preparation
The open reading frame corresponding to the full-length CAP protein was cloned into the BamHI and HindIII sites of pQE30 and mutant variants constructed by site-directed mutagenesis. Wild-type and mutant recombinant protein was expressed from E. coli M182 ΔCAP F⁻ Δ(lacIPOZY)X74 galE15 galK16 rpsL thi⁺ lambda⁻ [pREP4] for 2 h at 37 °C with 1 mM IPTG. Protein was purified using sequential nickel-chelated sepharose affinity and Superdex 75 16/60 size exclusion columns (GE Healthcare). Protein concentration was calculated using the Beer-Lambert Law and a molar extinction coefficient of 20,065 M⁻¹ cm⁻¹ at 280 nm. Full-length GlxR protein was expressed and purified as previously described [56].

ITC
Protein was dialyzed against 100 mM KPO₄ pH 7.8, 200 mM KCl, 2 mM 1-thioglycerol at 4 °C. Protein and buffer were degassed under vacuum and degassed buffer used to dilute cAMP ligand. cAMP concentration was calculated using the Beer-Lambert Law and a molar extinction coefficient of 14,650 M⁻¹ cm⁻¹ at 259 nm. Data were generated using an iTC200 (MicroCal) by typically 40 sequential 1 µL injections of 4-6 mM cAMP into 202 µL of 130-400 µM protein. Data for the first injection were routinely discarded as this is affected by diffusion between the syringe and the protein solution during equilibration prior to data collection.
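The Beer-Lambert calculation used for both protein and ligand can be sketched as follows. This is a minimal illustration only: the extinction coefficients are those quoted in the text, whereas the absorbance readings, the 1 cm path length, and the dilution factor are hypothetical values chosen for the example.

```python
def concentration_molar(absorbance, epsilon, path_cm=1.0, dilution=1.0):
    """Beer-Lambert law, A = epsilon * c * l, solved for c (mol/L).

    `dilution` scales the result back to the undiluted stock
    (e.g. 100.0 if the sample was diluted 1:100 before measurement).
    """
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return dilution * absorbance / (epsilon * path_cm)


EPS_CAP_280 = 20065.0   # M^-1 cm^-1 at 280 nm (protein, from the text)
EPS_CAMP_259 = 14650.0  # M^-1 cm^-1 at 259 nm (cAMP, from the text)

# Hypothetical readings, for illustration only.
protein_molar = concentration_molar(0.45, EPS_CAP_280, dilution=10.0)
camp_molar = concentration_molar(0.73, EPS_CAMP_259, dilution=100.0)

print(f"protein: {protein_molar * 1e6:.0f} uM")
print(f"cAMP:    {camp_molar * 1e3:.2f} mM")
```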

Data Fitting for ITC
Ligand binding for cAMP to CAP was described by a sequential three-site model (two major and one minor binding site [39]). The presence of three cAMP binding sites in CAP was further confirmed in the crystal structures from this study (Figure S5). A sequential two-site model described ligand binding for cAMP to GlxR. The free ligand concentration, [L], was calculated for each injection using the bisection method, which allowed calculation of the fraction of the protein in each bound state, F_i. Comparing the calculated heat content, Q, to the experimental value allowed calculation of the best fit of the binding constants, K_i, and the binding enthalpies, ΔH_i, using the Solver plug-in for Excel.
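As an illustration of the procedure above, the following sketch solves a sequential two-site model for the free ligand concentration by bisection and then evaluates the bound fractions and the heat content. It is not the authors' spreadsheet: the expressions for F_i and Q are the standard sequential-sites forms and are assumed here, the constants are taken to be dissociation constants, and all function names and numerical values are hypothetical.

```python
def bound_fractions(free_l, kd1, kd2):
    """Fractions of protein with 0, 1, and 2 ligands bound.

    Assumes sequential dissociation constants kd1 and kd2 (mol/L).
    """
    w1 = free_l / kd1
    w2 = free_l ** 2 / (kd1 * kd2)
    z = 1.0 + w1 + w2
    return 1.0 / z, w1 / z, w2 / z


def free_ligand(l_total, p_total, kd1, kd2, tol=1e-15, max_iter=200):
    """Solve L_total = [L] + P_total * (F1 + 2 * F2) for [L] by bisection."""
    def residual(free_l):
        _, f1, f2 = bound_fractions(free_l, kd1, kd2)
        return free_l + p_total * (f1 + 2.0 * f2) - l_total

    lo, hi = 0.0, l_total  # the residual is <= 0 at lo and >= 0 at hi
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def heat_content(l_total, p_total, kd1, kd2, dh1, dh2, cell_volume):
    """Assumed form: Q = P_total * V * (F1 * dH1 + F2 * (dH1 + dH2))."""
    free_l = free_ligand(l_total, p_total, kd1, kd2)
    _, f1, f2 = bound_fractions(free_l, kd1, kd2)
    return p_total * cell_volume * (f1 * dh1 + f2 * (dh1 + dh2))


# Hypothetical parameters: 200 uM protein in a 202 uL cell, 100 uM total ligand.
q = heat_content(1e-4, 2e-4, 1e-5, 1e-4, -2000.0, -6000.0, 202e-6)
print(f"Q = {q:.3e} cal")
```

In a fit, Q would be evaluated after every injection and the differences between successive values compared with the measured heats while kd1, kd2, dh1, and dh2 are varied.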

Statistics
ITC and ENM data for mutant proteins were compared to the wild-type by a comparison of means by one-way ANOVA. Normal distribution of the data was confirmed by the Shapiro-Wilk test. Homogeneity of variances was rejected for ITC data and confirmed for ENM data using the Levene test. ITC data were therefore examined using a Dunnett's T3 post hoc test for pairwise comparisons with unequal variances and ENM data examined using a two-sided Dunnett's post hoc test for pairwise comparisons with equal variances.

Crystal Structure Determinations
CAP crystals were produced at pH 6.5 with 7-10% (w/v) polyethylene glycol 3350 and 15-20% (v/v) 2-methyl-2,4-pentanediol with 2 mM cAMP in 24-well hanging-drop vapour diffusion plates. Crystals were cryoprotected using mother liquor containing 30% (v/v) glycerol and flash cooled in liquid nitrogen [57]. Diffraction data for the wild-type protein were collected in-house using a Bruker MicroStar rotating anode and processed with SAINT [58]. All CAP mutant data were collected at the Diamond Light Source beamlines I-04 and I-24 and processed using Mosflm [59] and Scala [60]. CAP structures were solved using molecular replacement with Phaser [61] using CAP (PDB 1I5Z). Model building and refinement were accomplished iteratively using COOT [62] and Refmac5 [63] in CCP4 [59]. CAP structures from crystals produced at pH 6.5 were indistinguishable from those previously produced at pH 7.5 [64]. Structural and refinement statistics are provided in Table S4. Full details of GlxR crystallography and analysis of the structures will be reported elsewhere [45]. Members of the CAP family often crystallise with more than one protein chain in the asymmetric unit. In these cases the functional protein dimer is generated either by the crystallographic 2-fold axis on each of the protein chains or by noncrystallographic symmetry, leading to a varying degree of asymmetry [65,66]. Significantly different conformations for each monomer have been observed in some homodimeric bacterial regulator proteins, most notably Mt-CRP [67]. The structures presented here contain one dimer (wild-type CAP in space group P2₁), two dimers (wild-type in space group P1), and three dimers (V140A CAP in space group I2) (see Table S4). In all cases the dimers are symmetric, with no significant differences between the two protein chains of the functional dimer.

Coarse-Grained Simulations
ENM simulations were performed using our own code based on the regular implementation [31,68]. The spring constants were set to a constant value of 1 kcal mol⁻¹ Å⁻² with a cutoff radius of 8 Å, and only the Cα atoms in the protein were considered. The presence of cAMP effector at the binding site was treated by the addition of one node at the mass-weighted average coordinate for each ligand. Varying the spring constant of any springs attached to a single residue of the protein was used to represent side chain mutations. The allosteric free energy was calculated by summing over modes 1 to n. n was determined by examining where values of K_2/K_1 converged (Figure S8). The final results quoted used the converged value of K_2/K_1. PDB files for constructing CAP ENMs were 1CGP, 1G6N, 1HW5, 1I5Z, 1I6X, 1J59, 1O3T, 1RUN, 1RUO, 1ZRC, 1ZRD, 1ZRF, 2GZW, 4HZF (this work), and an additional in-house file isostructural to 2GZW. The PDB file for constructing the GlxR ENM was 3R6S.
Figure 9. CAP residues that contribute to allostery are conserved in variants from diverse bacterial species. Relation between the minimum number of times an amino acid needs to exhibit a mutation to explain the variation in the set of sequence data (n = 165 proteins) and the contribution of that amino acid to allostery (ΔK_2/K_1 for that amino acid at k_R/k = 0.25). Each circle corresponds to one of 210 amino acids. The best fitting model predicts that mean mutation rate declines with allostery (solid line). Note that absolute allosteric values have been used and increased by 0.000001 to allow plotting of zeros. doi:10.1371/journal.pbio.1001651.g009
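The network construction described for the ENM can be sketched as below. This is an illustrative fragment rather than the authors' code: it only builds the spring table (uniform spring constant, 8 Å cutoff), places a ligand node at a mass-weighted centre, and rescales the springs attached to one residue by k_R/k. The function names and the toy coordinates are hypothetical, and the normal mode calculation itself is not shown.

```python
import math


def mass_weighted_centre(coords, masses):
    """Mass-weighted average coordinate, used here to place one ligand node."""
    total = sum(masses)
    return tuple(
        sum(m * c[axis] for c, m in zip(coords, masses)) / total
        for axis in range(3)
    )


def build_springs(nodes, cutoff=8.0, k=1.0):
    """Connect every pair of nodes closer than `cutoff` (Å) with spring constant k."""
    springs = {}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            if math.dist(nodes[i], nodes[j]) <= cutoff:
                springs[(i, j)] = k
    return springs


def mutate_residue(springs, residue, k_ratio):
    """Scale all springs attached to `residue` by k_R/k to mimic a side chain mutation."""
    return {
        pair: (kc * k_ratio if residue in pair else kc)
        for pair, kc in springs.items()
    }


# Toy example: four C-alpha nodes on a line (Å); real input would come from a PDB file.
ca_nodes = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
wild_type = build_springs(ca_nodes)
mutant = mutate_residue(wild_type, residue=1, k_ratio=0.25)
print(wild_type)
print(mutant)
```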

Super-Coarse-Grained Model
The CAP and GlxR proteins were modelled as two blocks for each monomer, one for the ligand binding domain and one for the DNA binding domain. We assigned one internal breathing mode to each subunit and allowed each subunit to move, producing seven degrees of freedom. For the apo-protein the internal subunit coupling strengths are characterized by k_1 through k_4 and the intersubunit couplings by k_12, k_13, and k_24 (Figure 4b). The effect of one ligand binding was included by modifying k_1 by a factor b, k_12 by a, and k_13 by c. The second ligand binding was therefore represented by further modifying k_2 by b, k_12 by a further factor of a, and k_24 by c. The allosteric free energy was determined from the determinant of the interaction matrix [69]. The couplings were defined from PCA of 300 ns molecular dynamics simulations for the three states. In each case the protein was divided into the four zones by performing a rotational-translational-block approximation (Figure 8a) [46,70].
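The idea of obtaining an allosteric free energy from the determinant of an interaction matrix can be illustrated with a deliberately reduced toy model. The sketch below is an assumption-laden simplification, not the seven-degree-of-freedom model of the paper: it keeps only two internal modes (k_1, k_2) coupled by k_12, applies the factors b and a on ligand binding, and omits k_3, k_4, k_13, k_24, and c. Free energies are in units of k_B T with additive constants dropped.

```python
import math


def free_energy(k1, k2, k12):
    """0.5 * ln det(K) for E = 0.5*k1*x1^2 + 0.5*k2*x2^2 + 0.5*k12*(x1 - x2)^2."""
    det = (k1 + k12) * (k2 + k12) - k12 ** 2
    return 0.5 * math.log(det)


def allosteric_ddg(k1, k2, k12, b, a):
    """ddG = (G2 - G1) - (G1 - G0) for apo (0), singly (1), and doubly (2) bound states."""
    g0 = free_energy(k1, k2, k12)
    g1 = free_energy(b * k1, k2, a * k12)          # first ligand: k1 -> b*k1, k12 -> a*k12
    g2 = free_energy(b * k1, b * k2, a * a * k12)  # second ligand: k2 -> b*k2, k12 -> a^2*k12
    return g2 - 2.0 * g1 + g0


def k2_over_k1(ddg):
    """Ratio of second to first dissociation constants, with ddG in units of k_B T."""
    return math.exp(ddg)


# Hypothetical parameter values, for illustration only.
for b, a in [(2.0, 1.0), (1.0, 2.0)]:
    ddg = allosteric_ddg(1.0, 1.0, 1.0, b, a)
    print(f"b={b}, a={a}: ddG={ddg:+.4f} kT, K2/K1={k2_over_k1(ddg):.4f}")
```

With no coupling (k_12 = 0) the toy model gives ΔΔG = 0, which reflects the point that the allosteric signal is carried by modes shared between the two sites.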

Atomistic Simulation
Molecular dynamics (MD) simulations employed the harmonic force field equations used in the ff99SB and GAFF force fields within the AMBER simulation program [71].
The simulations employed the ff99SB force field for the CAP protein and the GAFF force field (v. 1.4) for cAMP. The ff99SB force field was used because it represents the energetic interactions of side chains reasonably well [72] and outperforms the ff03 force field [73]. MD calculations used a short-range cutoff of 10 Å, with the long-range portion of the Coulomb potential represented by an Ewald summation, and employed a time step of 2 fs. The bond lengths were constrained by the SHAKE algorithm. The initial starting structures were obtained directly from X-ray diffraction. These structures were then solvated in TIP3P water and energy minimized prior to simulation [74]. The system was heated to 300 K over a period of 20 ps and further equilibrated for 40 ns. Production runs at 300 K were carried out over 200 ns. PCA was performed by diagonalising the mass-weighted covariance matrix of the atomistic simulations. The eigenvectors represent the shape of the atomistic motion and the corresponding eigenvalues the extent of the motion.

Analysis of CAP Variants
To determine if ΔK_2/K_1, hereon denoted x, is associated with the mutation rate of amino acids, we first estimated the relative amino acid mutation rate using the sequence data for CAP variants and then statistically tested for an effect of x on this rate. Relative mutation rate was estimated by finding the minimum number of amino acid mutations needed to generate the observed variations in the sequence data, which we denote N. For each of the 165 proteins we found the protein having the smallest number of amino acid differences. The sum of these differences gave N. When summing differences, if more than one protein had the minimum difference, we included all the proteins having the minimum. We then determined the number of these mutations that were associated with each of the 210 amino acids, which we denote n_i. Thus, n_i estimates the relative mutation rate of amino acid i, and these estimates account for the evolutionary history of the proteins. If all amino acids had an equal mutation rate, then we would expect the n_i to all be approximated by N/210. We assumed that the true relative rate of mutation was related to x according to a logistic function in which b_0, b_1, and b_2 are constants. To account for overdispersion among the n_i, which might be due to unmeasured covariates associated with the proteins, we assumed that the variation between the n_i could be described by the beta-binomial distribution. Under these assumptions, the log-likelihood of the model described by the set of parameters θ = {b_0, b_1, b_2, φ} is expressed in terms of BB(n|N,μ,φ), the beta-binomial distribution, which describes the probability of observing n successes from N trials when, on average, successes occur with probability μ and variation in this probability among replicates is described by the beta distribution with variance μ(1−μ)φ/(1+φ) [75].
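The beta-binomial likelihood described above can be sketched as follows. The parameterisation is derived from the stated variance μ(1−μ)φ/(1+φ), which corresponds to beta shape parameters μ/φ and (1−μ)/φ. The summed form of the log-likelihood over amino acids is an assumption (independent counts), and the per-residue probabilities are passed in directly because the explicit logistic expression linking them to x is not reproduced here. Function names and numerical values are hypothetical.

```python
import math


def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_logpmf(n, n_trials, mu, phi):
    """ln BB(n | N, mu, phi), where the success probability is beta-distributed
    with mean mu and variance mu * (1 - mu) * phi / (1 + phi)."""
    a = mu / phi
    b = (1.0 - mu) / phi
    log_choose = (
        math.lgamma(n_trials + 1)
        - math.lgamma(n + 1)
        - math.lgamma(n_trials - n + 1)
    )
    return log_choose + log_beta(n + a, n_trials - n + b) - log_beta(a, b)


def log_likelihood(counts, n_trials, mus, phi):
    """Assumed form: sum over amino acids i of ln BB(n_i | N, mu_i, phi)."""
    return sum(
        beta_binomial_logpmf(n, n_trials, mu, phi) for n, mu in zip(counts, mus)
    )


# Hypothetical data: three residues, N = 20 mutations in total.
print(log_likelihood([1, 4, 0], 20, [0.05, 0.2, 0.01], phi=0.1))
```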
Evidence that mutation rate was related to x was found by applying a likelihood ratio test (LRT) comparing the fit of the full model with the model that ignored x (i.e., when b_1 = b_2 = 0). Let LL_1 and LL_0 be the maximum log-likelihood of the full model and the simpler model, respectively. Under the null hypothesis that x is not associated with mutation rate, the test statistic G = 2[LL_1 − LL_0] is chi-square distributed with two degrees of freedom, as the more complex model has two additional free parameters: b_1 and b_2. An LRT was also used to test for overdispersion by comparing the fit from the full model described above with the model that assumed variation had a binomial distribution (φ is vanishingly small). This latter test, if significant, justifies the use of the beta-binomial distribution rather than the binomial. Confidence intervals for model parameters were estimated using the likelihood profile approach.
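The likelihood ratio test can be illustrated with a short sketch. The chi-square tail probability has closed forms for one and two degrees of freedom, which are used here to avoid external libraries. The function names are hypothetical, and the only value taken from the text is the reported statistic G = 33.7 with two degrees of freedom.

```python
import math


def lrt_statistic(ll_full, ll_reduced):
    """G = 2 * (LL_1 - LL_0) for nested models."""
    return 2.0 * (ll_full - ll_reduced)


def chi2_sf(g, df):
    """Upper-tail probability of the chi-square distribution (df = 1 or 2 only)."""
    if g < 0.0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(g / 2.0))
    if df == 2:
        return math.exp(-g / 2.0)
    raise ValueError("closed form implemented only for df = 1 or 2")


# Reported statistic for the effect of x on mutation rate (two extra parameters).
p_value = chi2_sf(33.7, df=2)
print(f"p = {p_value:.2e}")
```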
The genome accession numbers analysed are: NP_232242.
Figure S3 ENM predicted residue interactions that impact on cooperativity. (A) The change in cooperativity that occurs when k_R/k is varied at the indicated residue (legend) against every amino acid within the same monomer (within an 8 Å cutoff). (B) The change in cooperativity that occurs when k_R/k is varied at the indicated residue (legend) against every amino acid within the opposing monomer (within an 8 Å cutoff). (TIF)
Figure S4 Least-squares superposition of one representative chain of each of the seven doubly cAMP-bound crystal structures treating the two domains (dimerization/cAMP-binding domain and DNA-binding domain) as rigid bodies with a flexible linker (wild-type, green; V132A, cyan; V132L, dark cyan; V140A, magenta; V140L, orange; H160L, red). The transformation matrices were obtained using RAPIDO [76]. (TIF)
Figure S5 Fitting of ITC data. Binding isotherm for a representative data set for the calorimetric titration of cAMP to wild-type CAP protein showing experimental data and fitted curves for two and three molecules of ligand cAMP. The inset shows the structure of CAP (green) with three bound molecules of cAMP (blue). (TIF)
Figure S6 Calculated and observed values for cooperativity in CAP. (A) The ratio of the second to first dissociation constants for cAMP (K_2/K_1) for wild-type and mutant CAP proteins were calculated from the ENMs (calculated) or obtained by ITC (observed). The coloured lines correspond to the value for K_2/K_1 in the wild-type to enable comparison of the direction of change. (B) Values for K_2/K_1 obtained by ITC plotted against values for K_2/K_1 predicted by the ENM demonstrating the correlation between the extents of experimentally observed and predicted values for K_2/K_1. Dotted line represents the 95% confidence interval for the linear regression (R² = 0.85). (TIF)
Figure S7 Mapping local dynamics in CAP. (A) The effect of mutation of V140 and H160 on local dynamics over the CAP monomer. The chart represents the percentage variation in B-factor from the wild-type ENM plotted against amino acid number. Inset shows the same chart with an expansion of the y-axis. (B) The chart is identical to that shown in panel C except with the y-axis expanded. (TIF)
Figure S8 The dependence of K_2/K_1 on the number of summed modes. The chart represents the calculated value for K_2/K_1 from the ENM plotted against the total number of summed modes. (TIF)
Table S1 Least-squares superposition of all independent protein chains in each of the doubly cAMP-bound CAP crystal structures. (PDF)