Electrostatically Accelerated Encounter and Folding for Facile Recognition of Intrinsically Disordered Proteins

Achieving facile specific recognition is essential for intrinsically disordered proteins (IDPs) that are involved in cellular signaling and regulation. Consideration of the physical time scales of protein folding and diffusion-limited protein-protein encounter has suggested that the frequent requirement of protein folding for specific IDP recognition could lead to kinetic bottlenecks. How IDPs overcome such potential kinetic bottlenecks to viably function in signaling and regulation in general is poorly understood. Our recent computational and experimental study of cell-cycle regulator p27 (Ganguly et al., J. Mol. Biol. (2012)) demonstrated that long-range electrostatic forces exerted on enriched charges of IDPs could accelerate protein-protein encounter via “electrostatic steering” and at the same time promote “folding-competent” encounter topologies to enhance the efficiency of IDP folding upon encounter. Here, we further investigated the coupled binding and folding mechanisms and the roles of electrostatic forces in the formation of three IDP complexes with more complex folded topologies. The surface electrostatic potentials of these complexes lack prominent features like those observed for the p27/Cdk2/cyclin A complex to directly suggest the ability of electrostatic forces to facilitate folding upon encounter. Nonetheless, similar electrostatically accelerated encounter and folding mechanisms were consistently predicted for all three complexes using topology-based coarse-grained simulations. Together with our previous analysis of charge distributions in known IDP complexes, our results support a prevalent role of electrostatic interactions in promoting efficient coupled binding and folding for facile specific recognition. 
These results also suggest that there is likely a co-evolution of IDP folded topology, charge characteristics, and coupled binding and folding mechanisms, driven at least partially by the need to achieve fast association kinetics for cellular signaling and regulation.


Introduction
Cellular signaling and regulation are frequently mediated by proteins that, in part or as a whole, lack stable structures under physiological conditions [1][2][3]. Such intrinsically disordered proteins (IDPs) are highly prevalent in proteomes [4] and overrepresented in disease pathways [5,6]. For example, nearly one-third of eukaryotic proteins have been predicted to contain extended disordered regions [7], and about 25% of disease-associated missense mutations can be mapped into predicted disordered regions [8] (although cancer mutations appear to prefer ordered regions [9]). The prevalence of intrinsic disorder suggests that protein conformational heterogeneity could provide crucial functional advantages, for which many concepts have been proposed [10][11][12][13][14]. Understanding the physical basis of how intrinsic disorder mediates protein function (and how such functional mechanisms may fail in human diseases [15]) is of fundamental significance and has attracted intense interest in recent years [16]. Important progress has been made in characterizing the conformational properties of unbound IDPs and determining how these conformational properties contribute to efficient and reliable interactions [16][17][18][19][20][21][22].
A key recent recognition is that the frequent requirement of protein folding for specific recognition of IDPs could lead to kinetic bottlenecks [23][24][25]. As predicted by the dual-transition-state theory [23], the diffusion-limited encounter rate constant represents the upper bound for that of a coupled binding and folding interaction. Importantly, the upper bound can be achieved only if the IDP readily folds upon encounter, which requires folding rates on the order of 10 μs⁻¹ or greater [23]. That is, IDPs need to achieve folding rates beyond the typical μs⁻¹ “speed limit” estimated for folding of isolated proteins [26] to maximize association kinetics. Therefore, the putative functional advantages of intrinsic disorder, especially structural plasticity for specific interactions with numerous partners [27], come with a potential cost of slow binding kinetics. Such kinetic bottlenecks must be resolved for IDPs to be viable in cellular signaling and regulation. Interestingly, a recent survey of binding kinetic data revealed that IDP binding was not systematically slower than that of globular proteins [28]. The implication is that most IDPs do manage to fold rapidly upon nonspecific binding, and this is apparently consistent with the accumulating observations that IDP coupled binding and folding tends to follow induced folding-like baseline mechanisms (i.e., bind then fold) [16,19]. Several factors could contribute to efficient folding of IDPs upon binding, in particular small interacting (and folding) domains and simple folded topologies with low contact orders. There also appears to be a delicate balance between pre-folding and conformational flexibility that allows an IDP to quickly fluctuate among accessible conformational states, especially upon encounter [16,29,30]. Nonetheless, it is not yet clear how IDPs in general may achieve fast folding at rates beyond the traditional μs⁻¹ folding “speed limit” upon encountering their specific targets.
An important characteristic of IDPs is that they are enriched with charged and polar residues [31]. Electrostatics can thus be expected to play key roles in IDP structure and function. For example, the charge content can modulate compaction and other conformational properties of free IDPs [32,33], and DNA search efficiency is controlled by charge composition and distribution in disordered tails of DNA-binding proteins [34,35]. It has also been observed or speculated in a few cases that electrostatics might be important for fast IDP recognition [36][37][38][39]. However, these discussions have often been based on the classic electrostatic steering effects [40], and the actual underlying mechanisms of putative electrostatic acceleration were not known. Our recent computational and experimental study of the p27-Cdk2/cyclin A interaction revealed that long-range electrostatic forces could promote facile IDP recognition via an “electrostatically accelerated encounter and folding mechanism” [24]. Specifically, the measured p27/Cdk2/cyclin A association rate constants showed a strong salt dependence, increasing ~12-fold when the ionic strength was reduced from 0.6 to 0.075 M. However, the salt dependence is poorly described by an approximate Debye-Hückel relation [41] that mainly captures the electrostatic steering effects. Instead, simulations using a series of topology-based coarse-grained models suggested that long-range electrostatic forces exerted on a large number of charges on p27 not only accelerated the encounter rate (via the classical electrostatic steering effect [40]), but also enhanced the efficiency of p27 folding upon encounter by promoting native-like encounter topologies.
Analysis of surface charges in a set of existing IDP complexes further revealed that the vicinity of IDP binding sites tended to be enriched with charges to complement those on IDPs [24] (even though the IDP binding interface itself is more hydrophobic than the rest of the protein surface, as previously observed [42]). Electrostatic forces are known to be a dominant long-range force that can guide protein orientation in protein-DNA interactions [43,44] and/or modulate early stages of protein folding [45][46][47]. One implication of enriched charges near IDP binding sites is thus that the electrostatically accelerated encounter and folding mechanism observed for p27 may be prevalent in signaling and regulatory IDPs. Nonetheless, the ability of long-range electrostatic forces to enhance folding upon binding can be surprising, as nonspecific interactions (electrostatic or van der Waals) have generally been expected to accelerate binding but slow down folding [48,49]. It has also been predicted that, while inter-chain electrostatic interactions facilitate binding of the disordered chaperone Chz1 to histone variant H2A.Z-H2B, intra-chain electrostatic interactions could lead to premature collapse of Chz1 under low salt conditions and hinder the overall rate of forming the specific complex [50].
In the present work, we investigated the recognition mechanisms and the roles of long-range electrostatic interactions in the formation of three IDP complexes, namely, p53-TAD1/TAZ2, HIF-1α/TAZ1, and NCBD/ACTR (Table 1). All these complexes have important biological functions. For example, tumor suppressor p53 is considered one of the most important proteins in cancer [51]; NCBD and TAZ1/2 are key regulatory domains of CBP, a key component of the general transcriptional machinery that plays critical roles in cell fate regulation [52]. For understanding IDP recognition, these systems involve more complex folded topologies than that of p27 in the p27/Cdk2/cyclin A complex. As shown in Fig. 1, both HIF-1α/TAZ1 and NCBD/ACTR possess extensive binding interfaces, whereas the binding interface in p53-TAD1/TAZ2 is more localized. Importantly, while strong charge complementarity exists near the binding interface (as expected), the surface electrostatic potentials of the folded substrates do not show prominent features like those observed on Cdk2/cyclin A (e.g., see Fig. 1 of reference [24]) to directly suggest that long-range electrostatic forces could promote native-like (and thus more folding-competent) encounter complexes. The NCBD/ACTR complex involves synergistic folding of two IDPs and thus offers a particularly interesting opportunity to understand whether and how electrostatic interactions may modulate the formation of nontrivial folded topologies. Remarkably, all three complexes associate with on-rates in excess of 10⁷ M⁻¹ s⁻¹ (see Table 1), a regime that is typically considered “diffusion-limited” and can only be accessed in the limit of ultrafast conformational transitions [40].

Topology-based modeling of IDP coupled binding and folding
Series of topology-based coarse-grained models were first derived from the complex structures to allow direct simulation of reversible binding and folding at tractable computational cost. Topology-based modeling is based on the theoretical framework of minimally frustrated energy landscapes for natural proteins [53], and has been highly successful in predicting essential features of protein folding mechanisms [53][54][55]. Formation of stable IDP complexes such as those studied in this work should also satisfy minimal frustration, and thus topology-based modeling is applicable. Indeed, it has been successfully applied to several IDP complexes [56][57][58][59][60], with many key predictions substantiated by independent experimental studies. Nonetheless, important differences do exist between IDPs and structured proteins in sequence compositions and binding interface characteristics [42]. We have previously demonstrated that traditional topology-based models need to be carefully calibrated to ensure proper balance among competing intramolecular and intermolecular interactions (see Methods for details on the calibration protocol) [61]. We note that the importance of model calibration was also illustrated in a recent study of the HIF-1α/TAZ1 complex [59]. Table 2 summarizes the final calibrated models for all three complexes. The calculated residual helicity distributions of the unbound states are shown in Fig. S1.
Three independent models were constructed for each complex: one without explicit charges (mimicking high salt concentration with fully screened long-range electrostatic interactions), one with explicit charges (mimicking low salt concentration with unscreened long-range electrostatic interactions), and a third with explicit charges and 0.05 M salt (mimicking physiological conditions). All models reproduce the experimental K_D to the same order of magnitude, except that the no-charge model for HIF-1α/TAZ1 yields a K_D value about one order of magnitude too large. We note that calculated K_D values can be very sensitive to small changes in the scaling of intermolecular interactions during model calibration (see Methods). It is computationally expensive to use REX simulations to systematically search the parameter space, especially for models without explicit charges due to slower transitions. Nonetheless, by performing production simulations at the corresponding melting temperatures, remaining imperfections in the balance of various interactions should be further suppressed, allowing reliable comparative studies of the mechanistic roles of electrostatic interactions in coupled binding and folding.

Author Summary
Intrinsically disordered proteins (IDPs) are key components of regulatory networks that dictate various aspects of cellular decision-making. They are over-represented in major disease pathways, and are considered novel albeit currently difficult drug targets. Recognition of IDPs has extended the traditional protein structure-function paradigm, and various concepts have been proposed on how intrinsic disorder may confer crucial functional advantages. However, the physical basis of these concepts remains poorly established. In particular, while IDPs alone exist as ensembles of fluctuating structures, they frequently fold upon specific binding. Analysis of the physical timescales of protein folding and protein-protein encounter predicts that the requirement of peptide folding for specific binding could lead to a major kinetic bottleneck. In this work, carefully calibrated topology-based coarse-grained models were applied to directly simulate reversible folding and binding and investigate the recognition mechanisms of three IDP complexes. The results strongly support an electrostatically accelerated encounter and folding mechanism, where long-range electrostatic forces not only accelerate protein-protein encounter via “electrostatic steering” but also promote “folding-competent” encounter topologies to enhance the efficiency of IDP folding upon encounter.

Table 1 footnotes. Abbreviations: ACTR: the activation domain of p160 steroid receptor co-activator; HIF-1α: hypoxia-inducible factor 1 α subunit; NCBD: the nuclear-receptor co-activator binding domain of CREB binding protein (CBP); p53-TAD1: the transactivation domain 1 of tumor suppressor p53; TAZ1/2: the TAZ domains of CBP. The sequences of all IDPs involved (highlighted in bold font) are provided in the Supporting Information (Text S1). b The experimental K_D values were measured at 308 K for p53-TAD1/TAZ2, 298 K for HIF-1α/TAZ1, and 304 K for NCBD/ACTR. Note that K_D only weakly depends on temperature for p53-TAD1/TAZ2 (doubled when the temperature is increased from 288 K to 308 K [77]). c Numbers of charged residues and the net charges (in parentheses) of the IDP, its binding site, and the vicinity of the binding site. Residues at the IDP binding interface are identified as those with greater than 1.0 Å² solvent accessible surface area changes upon complex formation. Surface residues are identified as those with >5% solvent accessibility. All surface residues within 15 Å Cα-Cα distance from the bound IDP but not directly involved in intermolecular contacts are considered to be within the vicinity of the IDP binding site. d Estimated based on the association rate constant of p53-TAD2/TAZ2 (~10¹⁰ M⁻¹ s⁻¹ [38]), assuming that TAD1 and TAD2 have similar off rates. TAD2 binds to the TAZ2 primary site with K_D ~32 nM [38], about two orders of magnitude stronger than TAD1. doi:10.1371/journal.pcbi.1003363.t001

Baseline mechanisms of coupled binding and folding: Effects of electrostatic forces
Free energy surfaces were constructed using various combinations of folding and binding order parameters to understand the baseline mechanisms of coupled binding and folding and to dissect the effects of long-range electrostatic forces. In particular, the fractions of native contacts formed have been shown to provide natural reaction coordinates for such mechanistic analysis [62]. Fig. 2 compares the free energy surfaces as a function of intra- and intermolecular native contact fractions for all three complexes, calculated using calibrated Gō-like models with and without explicit charges and/or salt (see Table 2). Both p53-TAD1 and HIF-1α recognition follow induced folding-like mechanisms, where the peptides gain structure only after forming significant numbers of native intermolecular contacts. For example, Fig. 2A shows that p53-TAD1 does not start to fold until Q_inter reaches ~0.5. Free NCBD is a molten globule with folded-like secondary structures [63], and its synergistic folding with ACTR has been previously shown to involve multiple stages of selection and induced folding [25,60], reminiscent of the “extended conformational selection” mechanism [30]. Nonetheless, neither protein gains significant secondary (for ACTR) or tertiary (for NCBD) structures until over 20% of native intermolecular contacts are formed (Fig. 2G and 2J).
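The contact-fraction order parameters discussed above can be computed directly from Cα coordinates. The following is a minimal Python sketch and not the analysis code used in this work. The function name, the input layout (a list of Cα coordinates plus a list of native contacts with their native distances), and the reading of the Methods criterion as "distance no more than 1.0 Å beyond the native distance" are assumptions made for illustration.

```python
import math


def fraction_native_contacts(coords, native_contacts, tolerance=1.0):
    """Fraction of native contacts formed in one conformation.

    coords          -- list of (x, y, z) C-alpha positions in Angstrom
    native_contacts -- list of (i, j, d_native) tuples (hypothetical layout)
    tolerance       -- a contact counts as formed if d <= d_native + tolerance
                       (assumed interpretation of the 1.0 Angstrom criterion)
    """
    if not native_contacts:
        raise ValueError("native contact list is empty")
    formed = 0
    for i, j, d_native in native_contacts:
        distance = math.dist(coords[i], coords[j])
        if distance <= d_native + tolerance:
            formed += 1
    return formed / len(native_contacts)
```

Applying the same function separately to the intramolecular and the intermolecular native contact lists would give the two coordinates of a surface such as those in Fig. 2.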
Interestingly, formation of all three complexes involves intermediates, even though the intermediate in the p53-TAD/TAZ2 interaction only becomes pronounced in the presence of nonspecific electrostatic forces (see Fig. 2A vs 2C). Detailed examination of the simulation trajectories and various free energy surfaces using fractions of native contacts formed by different IDP segments (e.g., see Figs. S2, S3, S4) revealed the existence of multiple parallel pathways for forming HIF-1α/TAZ1 and NCBD/ACTR. While these mechanistic details are not the focus of the current work, they appear to be highly consistent with previous experimental and computational studies. For example, as shown in Fig. S2, both the first and third helices of HIF-1α could initiate recognition, with the pathway initiated by binding of the third helix being much more prevalent. Similar observations were also made in a separate computational study [59]. Specific recognition of NCBD/ACTR appears to be primarily initiated by the C-terminal segments of these two peptides (Figs. S3, S4), which form a key intermediate that was also suggested by an H/D exchange mass spectrometry study [64]. Kinetic data from a recent stopped-flow study of the NCBD/ACTR interaction [65] are consistent with the prediction of induced folding as a baseline mechanism and have further confirmed the existence of parallel pathways and multiple folding intermediates. Representative snapshots along the dominant binding and folding pathways of p53-TAD1/TAZ2 and HIF-1α/TAZ1 are shown in Figs. S5, S6.

Table 2 footnotes: [...] Table 1 for the experimental values); k_TS was calculated from the production Langevin simulations at the corresponding T_m, as k_TS = N_TS/t_tot, where N_TS is the number of reversible binding and folding transitions observed during the total simulation time span t_tot. As all simulations were performed at T_m, k_TS as defined is half of the binding and unbinding rates. k_cap, k_esc and k_evo are defined in Eqns.
Explicit inclusion of charges does not significantly perturb the baseline mechanisms of coupled binding and folding. As shown in Fig. 2 and Figs. S2, S3, S4, long-range electrostatic forces do not lead to fundamental changes in any of the free energy surfaces examined. The baseline mechanisms for the formation of all three complexes remain induced folding-like. Furthermore, nonspecific electrostatic interactions do not change the relative prevalence of the parallel pathways that exist. For example, HIF-1α still initiates binding mainly through the third helix (Fig. S2); synergistic folding of NCBD and ACTR is still mainly initiated through their C-terminal segments (Figs. S3, S4). The key effect of electrostatic forces appears to be substantial reductions in the free energy barriers that separate various basins. That is, even under the no salt condition, strong nonspecific electrostatic interactions do not appear to add to the ruggedness of coupled binding and folding free energy surfaces. An implication is that there exists a level of self-consistency between the charge distribution and folded topology in the bound states, despite a lack of apparent complementarity between folding topologies and surface electrostatic potentials for these IDP complexes (see Fig. 1).

Figure 2. Free-energy surfaces at T_m as a function of the fractions of intra- and intermolecular contacts formed, computed using various Gō-like models with and without explicit charges and/or 50 mM salt (see Table 2). Rows A-C, D-F and G-L are for the p53-TAD1/TAZ2, HIF-1α/TAZ1 and NCBD/ACTR complexes, respectively. Q_inter is the fraction of intermolecular contacts formed; Q_p53, Q_HIF-1α and Q_ACTR are the fractions of intramolecular contacts formed by p53-TAD1, HIF-1α and ACTR, respectively; Q_NCBD-tert is the fraction of tertiary intramolecular contacts formed by NCBD (the helical content of NCBD remains similar during coupled binding and folding). Contours are drawn every kT, where k is the Boltzmann constant and T is the absolute temperature. doi:10.1371/journal.pcbi.1003363.g002

Kinetic effects of long-range and nonspecific electrostatic forces
Kinetics of coupled binding and folding were derived directly from production Langevin dynamics simulations performed using the calibrated Gō-like models at their corresponding T_m. The results, summarized in Table 2, show that long-range electrostatic forces accelerate the reversible binding/unbinding transition rates for all three complexes. The overall electrostatic acceleration, estimated by comparing the average transition rates (k_TS) calculated using models with and without explicit charges, ranges from ~5-fold for HIF-1α to 10-fold for NCBD/ACTR. The magnitude of acceleration is similar to what was previously measured for other IDPs including p27 [24] and PUMA [39] (both ~10-fold). The presence of 0.05 M salt significantly attenuates the predicted electrostatic acceleration, to only about two-fold. However, the effect of salt screening on electrostatic acceleration is likely over-predicted [24], which is due to the Cα-only model used in this work and may be corrected with more detailed protein models [45]. Consistent with the kinetic analysis, there are significant reductions in the free energy barriers along Q_inter (see Fig. 3), which has been shown to be a good binding reaction coordinate [61]. In addition, the magnitude of barrier reduction correlates well with the degree of rate acceleration calculated directly from Langevin dynamics simulations, with the largest barrier reduction observed for NCBD/ACTR and the smallest reduction observed for HIF-1α/TAZ1.
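To make the connection between barrier heights and rate acceleration concrete, the sketch below builds a one-dimensional free energy profile from sampled values of a reaction coordinate and converts a rate ratio into an equivalent barrier change. It is an illustrative Python sketch only: the function names and the binning scheme are assumptions, the profile is computed from plain (unweighted) samples rather than the WHAM-reweighted distributions used in this work, and the rate-to-barrier conversion assumes a simple Arrhenius-like relation with an unchanged prefactor.

```python
import math
from collections import Counter


def free_energy_profile(q_samples, n_bins=20):
    """Return {bin_index: F/kT} along a coordinate in [0, 1], minimum set to 0.

    F(q) = -kT ln P(q), with P estimated from a histogram of the samples.
    Bins that were never visited are omitted.
    """
    counts = Counter(min(int(q * n_bins), n_bins - 1) for q in q_samples)
    total = sum(counts.values())
    profile = {b: -math.log(c / total) for b, c in counts.items()}
    f_min = min(profile.values())
    return {b: f - f_min for b, f in sorted(profile.items())}


def barrier_change_in_kT(rate_ratio):
    """Barrier reduction (in kT) equivalent to a given fold acceleration,
    assuming rate ~ exp(-barrier/kT) with a constant prefactor."""
    return math.log(rate_ratio)
```

Under these assumptions, the ~5-fold and 10-fold accelerations quoted above would correspond to barrier reductions of about 1.6 kT and 2.3 kT, respectively.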
To further analyze the effects of electrostatic interactions on different stages of coupled binding and folding, the recognition process was divided into two generic steps: an encounter step followed by an evolution (folding) step to the final bound and folded state (Eq. 1 in Methods). Such a generic decomposition ignores the details of IDP-specific folding pathways, allowing one to focus on the net effects of electrostatic forces on the overall efficiency of IDP folding upon encounter. For this, three general states were identified during production simulations, including the unbound (U), collision complex (CC), and bound (B) states (see Methods for specific criteria for state assignment). The mean first passage times (MFPT) and numbers of transitions (N_tran) among these states were then calculated. The results, summarized in Tables S1, S2, S3, show that long-range electrostatic forces greatly reduce the average encounter time, from 0.72 to 0.03 ns for p53-TAD, from 0.37 to 0.20 ns for HIF-1α, and from 7.71 to 1.26 ns for NCBD. At the same time, long-range electrostatic forces also significantly enhance the efficiency of IDP folding upon encounter, allowing much larger fractions of the encounter complexes to eventually evolve to the bound states. For example, for NCBD/ACTR, only 16 out of ~2300 encounter events evolved to the bound state in the absence of long-range electrostatic forces (0.7%), whereas with explicit charges, there was a ~37% probability (108 out of 288) of forming the specific complex once the proteins were captured into the collision complex state (Table S3). For the HIF-1α/TAZ1 complex, the percentages of collision to specific complex transitions are 0.4% without and 7% with explicit charges (Table S2); for p53-TAD1/TAZ2, the corresponding percentages are 0.6% without and 60% with explicit charges (Table S1).
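The three-state bookkeeping described above can be sketched as follows. This is a hedged Python illustration, not the analysis scripts used in this work: the function names are hypothetical, frames that contain some native intermolecular contacts but fewer than the bound cutoff are simply left unassigned, and each visit to the collision complex is scored by the next assigned state (B or U) that follows it.

```python
def assign_state(n_native, n_nonspecific, bound_cutoff):
    """Map the contact counts of one frame to 'U', 'CC', 'B', or None."""
    if n_native >= bound_cutoff:
        return "B"
    if n_native == 0:
        # No specific contacts: collision complex if anything touches, else unbound.
        return "CC" if n_nonspecific > 0 else "U"
    return None  # partially bound frame, not one of the three generic states


def collision_outcomes(states):
    """Count how visits to the collision complex end: evolve to B or escape to U."""
    outcomes = {"B": 0, "U": 0}
    in_collision = False
    for state in states:
        if state == "CC":
            in_collision = True
        elif state in outcomes:
            if in_collision:
                outcomes[state] += 1
            in_collision = False
    return outcomes


def folding_efficiency(n_evolved, n_encounters):
    """Fraction of encounter events that reach the bound state."""
    return n_evolved / n_encounters
```

With the NCBD/ACTR counts quoted above, `folding_efficiency(16, 2300)` gives about 0.7% and `folding_efficiency(108, 288)` gives 37.5%.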
It should be emphasized that nonspecific electrostatic interactions significantly stabilize the collision complexes, due to the large and complementary net charges of the interacting proteins (see Table 1). As such, far fewer full unbinding events were observed during production simulations using the charged models. This effect also led to more reversible transitions between the bound and collision complex states and thus an overestimation of the true folding efficiency of IDPs upon collision as estimated above. We also note that the collision complexes as defined in our analysis were not intended to represent the so-called “encounter complexes” that have often been considered key intermediates of protein-protein association [66], although encounter complexes are also believed to be mainly stabilized by nonspecific electrostatic interactions.
The enhanced apparent efficiency of folding upon encounter appears to be frequently achieved at the cost of longer folding times. For example, the MFPTs of transitions from the collision complexes to the bound states increase from 0.26 to 3.94 ns for the p53-TAD1/TAZ2 complex (Table S1) and from 8.14 to 44.56 ns for the NCBD/ACTR complex (Table S3). The net effects on the kinetics of encounter and folding stages can be quantified by calculating three effective rate constants as defined in Eqns. 2-4 (see Methods) [28]. The results, summarized in Table 2 and plotted in Fig. 4, clearly demonstrate that nonspecific electrostatic interactions enhance the encounter rates and reduce the escape rates of the collision complexes. Importantly, the effective evolution rates are always faster, by about three-fold, in the presence of long-range electrostatic forces, despite longer MFPTs for the transitions from the collision complexes to the bound state observed for the p53-TAD1/TAZ2 and NCBD/ACTR complexes. The magnitude of electrostatic acceleration of folding upon encounter is similar to what was previously observed for folding and binding of p27 to the Cdk2/cyclin A complex [24].

Mechanism of electrostatically accelerated folding upon encounter
Inspection of the conformational properties of the collision complexes provides further insights into the molecular basis for enhanced efficiency of IDP folding upon encounter due to long-range electrostatic forces. As shown in Fig. 5, without nonspecific electrostatic interactions (models without explicit charges), the initial contacts between two binding partners are largely random, and the distributions of IDP initial contact points on the substrate surface in the collision complexes are relatively uniform (left column). In contrast, with the inclusion of explicit charges, the probabilities of the IDP encountering the substrate near the native binding interface are dramatically increased. Coupled with reduced escape rates, this allows much higher efficiency of IDP folding upon encounter to achieve higher overall association rate constants (Table 2). The ability of long-range electrostatic forces to guide the recognition process is also reflected in the free energy surfaces as a function of binding RMSD of the IDP and center of mass separation between two peptides. As shown in Fig. 6, long-range electrostatic forces generate a strong free energy gradient that extends over 10-15 Å away from the native bound positions, without creating overstabilized misfolded states at short separation distances. It is intriguing that, even though both NCBD and ACTR are disordered in the unbound state, nonspecific long-range electrostatic forces between complementary charges on these two proteins can still manage to promote native-like topologies in the collision complexes. In particular, there is a much higher probability of NCBD and ACTR initiating contacts via the C-terminal helix of NCBD and the second helix of ACTR (Fig. 5E-F). This is part of a key pathway of synergistic folding inherent to the NCBD/ACTR complex that was predicted by coarse-grained and atomistic simulations [25,60] and later substantiated by H/D exchange mass spectrometry [64].
Therefore, nonspecific electrostatic interactions appear to mainly augment existing folding pathways inherent to the folded topologies to facilitate efficient folding of IDPs upon encounter. Coupled with the previous observation that the vicinity of the IDP binding site tends to be enriched with charges to complement those on IDPs [24], the current results suggest that there is likely a co-evolution of IDP folded topology, charge characteristics, and coupled binding and folding mechanisms. Furthermore, the co-evolution is likely driven by the important need to achieve facile IDP recognition for cellular signaling and regulation.

Discussion
While fulfilling important functional constraints such as structural plasticity for binding numerous specific targets, protein intrinsic disorder can lead to potential kinetic bottlenecks that must be overcome for IDPs to be viable in cellular signaling and regulation. Our previous work on the p27/Cdk2/cyclin A complex has revealed a mechanism where nonspecific electrostatic interactions not only enhance the protein-protein encounter kinetics but also promote folding-competent encounter topologies to increase the efficiency of IDP folding upon encounter [24]. Using carefully calibrated topology-based coarse-grained models, we have now further demonstrated that similar electrostatically accelerated encounter and folding mechanisms also underlie the formation of three IDP complexes with more complex folded structures, namely, p53-TAD1/TAZ2, HIF-1α/TAZ1, and NCBD/ACTR. Importantly, these complexes lack apparent features on the electrostatic surface potentials to directly suggest the ability of nonspecific long-range electrostatic forces to promote native-like encounter topologies to enhance the IDP folding efficiency upon encounter. Nonetheless, there seems to exist a sufficient level of self-consistency between the charge distributions and folded topologies in the bound state to allow accelerated recognition in the presence of nonspecific electrostatic interactions. Therefore, enriched charges on IDPs not only play key roles in modulating the conformational properties of the unbound state, but also likely play general and important roles in regulating efficient interactions of IDPs with specific partners. We note that IDPs are frequently regulated by post-translational modifications that add or remove charges. Improved mechanistic understanding of electrostatic forces in IDP recognition derived from the current work will thus help to dissect the profound impacts of post-translational modifications and disease-related mutations on IDP structure and interaction.
Methods Calibration of topology-based coarse-grained models with and without explicit charges C a -only sequence-flavored Gō-like models [67] were first derived from the complex structures of p53-TAD1/TAZ2, HIF1-a/TAZ1 and NCBD/ACTR (see Table 1) using the Multiscale Modeling Tools for Structural Biology (MMTSB) Gō-Model Builder (http://www.mmtsb.org) [68]. The 3 zinc ions bound to TAZ1 in the HIF1-a/TAZ1 complex were modeled explicitly with distance restraints to the coordinating residues. All three models were then calibrated to balance the intrinsic folding propensity and the strength of intermolecular interactions using a previously described protocol [61]. Briefly, the strengths of intramolecular native contact were uniformly scaled to reproduce the experimentally measured residual helicity of unbound IDPs, which are mainly based on NMR secondary chemical shift and/or circular dichroism analysis (p53-TAD1 [69], NCBD/ACTR [63], and HIF1-a [70]). The residual helicity distributions calculated using the final models listed in Table 2 are provided in Fig. S1. Then, the strengths of intermolecular contacts were adjusted, such that binding affinities calculated from replica exchange molecular dynamics (REX-MD) simulations approximately match the experimental values (see Table 1). Following the previously described procedure [24], the calibrated sequence-flavored Gōlike models were then further modified by assigning proper explicit charges to all charged residues (Lys, Arg, Glu and Asp) as well as zinc ions in the HIF1-a/TAZ1 complex. The charged models were then re-calibrated to reproduce the experimental residual structure level (Fig. S1) and binding affinity ( Table 2). Such calibration is critical to avoid inherent bias for particular types of interactions, e.g., intra-vs. inter-molecular or native vs. nonspecific electrostatic. Nonspecific electrostatic interactions were modeled using the Debye-Hückel potential to account for ionic screening. 
The dielectric constant was set at 80.
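As an illustration of the screened electrostatic term described above, the following minimal sketch evaluates a textbook Debye-Hückel pair energy with a dielectric constant of 80. The function names, the kcal/mol unit convention, and the treatment of zero ionic strength as unscreened Coulomb are assumptions made for this example; the exact functional form and parameters used in the CHARMM setup are not given in the text.

```python
import math

# Coulomb prefactor in kcal*Angstrom/(mol*e^2); a common MD convention (assumed here).
COULOMB_KCAL = 332.0637


def debye_length_angstrom(ionic_strength_molar, temperature_k=300.0, dielectric=80.0):
    """Debye screening length (Angstrom) for a 1:1 electrolyte.

    Returns math.inf when the ionic strength is zero (no screening).
    """
    if ionic_strength_molar <= 0.0:
        return math.inf
    eps0 = 8.8541878128e-12      # vacuum permittivity, F/m
    k_b = 1.380649e-23           # Boltzmann constant, J/K
    e_charge = 1.602176634e-19   # elementary charge, C
    n_a = 6.02214076e23          # Avogadro constant, 1/mol
    ions_per_m3 = ionic_strength_molar * 1000.0 * n_a
    lam_m = math.sqrt(dielectric * eps0 * k_b * temperature_k
                      / (2.0 * e_charge ** 2 * ions_per_m3))
    return lam_m * 1.0e10


def debye_huckel_energy(q_i, q_j, r_angstrom, ionic_strength_molar=0.0,
                        temperature_k=300.0, dielectric=80.0):
    """Screened Coulomb energy (kcal/mol) between two point charges (in units of e)."""
    lam = debye_length_angstrom(ionic_strength_molar, temperature_k, dielectric)
    coulomb = COULOMB_KCAL * q_i * q_j / (dielectric * r_angstrom)
    return coulomb * math.exp(-r_angstrom / lam)


if __name__ == "__main__":
    # Opposite unit charges 10 Angstrom apart, without salt and with 0.05 M salt.
    print(debye_huckel_energy(+1, -1, 10.0))
    print(debye_huckel_energy(+1, -1, 10.0, ionic_strength_molar=0.05))
```

In this form, adding salt shortens the screening length and weakens the long-range attraction between oppositely charged beads, which is the effect probed by the models with and without salt.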

Simulation protocols
The complexes were simulated in CHARMM [71,72] in cubic boxes with periodic boundary conditions. The box sizes were 100, 100 and 105 Å for p53-TAD1/TAZ2, HIF-1α/TAZ1 and NCBD/ACTR, respectively. Langevin dynamics was performed with a 15 fs time step and a friction coefficient of 0.1 ps⁻¹. SHAKE was used to fix all virtual bond lengths [73]. Non-bonded interactions were cut off at 25 Å. Unbound IDPs were simulated at 300 K for 750 ns to calibrate the intramolecular interactions. REX-MD was performed using the MMTSB Toolset [68] to calibrate the intermolecular interactions, with eight replicas spanning 270 to 400 K. The lengths of the REX calibration simulations ranged from 1.05 μs (for p53-TAD1/TAZ2) up to 10 μs (for NCBD/ACTR), as needed to achieve sufficient convergence. The temperature weighted histogram analysis method (WHAM) [74] was used to compute the heat capacity (C_V) curves and to generate unbiased probability distributions for free energy and thermodynamic analysis. In particular, the dissociation constants (K_D) were calculated from the bound and unbound probabilities at 300 K [61], where the unbound state was defined as the state without any native intermolecular contacts formed. For the NCBD/ACTR complex, the 1D free energy profile lacks significant barriers between the unbound and partially bound intermediate states (Fig. 3C, red trace). Therefore, the unbound probability was calculated as 1 − P_bound, where P_bound is the bound probability (see below for the specific criteria of state assignments). Once the models were calibrated, production simulations of 30-40 μs in length were performed using all models at the corresponding T_m values (see Table 2). The T_m value was first identified from the C_V curve and then fine-tuned to ensure that the bound and unbound states were sampled with similar probabilities in the production simulation.
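To make the step from state probabilities to K_D concrete, the sketch below uses the standard relation for a single 1:1 pair in a periodic box, K_D = C_box · P_unbound² / P_bound, where C_box is the concentration of one molecule in the box volume. This relation and the function names are assumptions for illustration; the protocol actually used is the one described in [61] and may differ in detail.

```python
AVOGADRO = 6.02214076e23             # 1/mol
LITERS_PER_CUBIC_ANGSTROM = 1.0e-27  # 1 Angstrom^3 = 1e-27 L


def box_concentration_molar(box_length_angstrom):
    """Molar concentration of a single molecule in a cubic box."""
    volume_liters = box_length_angstrom ** 3 * LITERS_PER_CUBIC_ANGSTROM
    return 1.0 / (AVOGADRO * volume_liters)


def dissociation_constant_molar(p_bound, box_length_angstrom, p_unbound=None):
    """Estimate K_D (molar) for one IDP and one partner in a cubic box.

    If p_unbound is omitted, it is taken as 1 - p_bound, as done in the
    text for NCBD/ACTR. This two-state formula is an illustrative assumption.
    """
    if not 0.0 < p_bound < 1.0:
        raise ValueError("p_bound must lie strictly between 0 and 1")
    if p_unbound is None:
        p_unbound = 1.0 - p_bound
    c_box = box_concentration_molar(box_length_angstrom)
    return c_box * p_unbound ** 2 / p_bound


if __name__ == "__main__":
    # Hypothetical probabilities, not values from the paper.
    print(dissociation_constant_molar(p_bound=0.9, box_length_angstrom=100.0))
```

A 100 Å box corresponds to an effective concentration of roughly 1.7 mM per molecule, so the box size enters the estimate directly and must be consistent with the simulation setup.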

Free energy and kinetic analysis
All free energy profiles were calculated from the REX simulations, and the kinetic analysis was based on the production simulations, unless otherwise stated. For the calculation of contact fractions, a given native contact was considered formed if the inter-Cα distance was within 1.0 Å of the distance in the native complex. Nonspecific intermolecular contacts were considered formed when the inter-Cα distance was within a 10 Å cutoff. Three general conformational states were defined for each complex, namely the unbound (U), collision complex (CC) and bound (B) states, to understand the effects of electrostatic forces on protein-protein encounter and on subsequent folding upon encounter. The unbound state includes conformations with no specific or nonspecific contacts formed between the IDP and the substrate, and the collision complex state includes conformations with at least one nonspecific but no specific intermolecular contact formed. The bound states were defined as follows: 1) for p53-TAD1/TAZ2, N_inter ≥ 11; 2) for HIF-1α/TAZ1, N_inter ≥ 26 for the no-charge model, N_inter ≥ 23 for the charged model, and N_inter ≥ 24 for the charged model with 0.05 M salt; 3) for NCBD/ACTR, N_inter ≥ 30. Here N_inter is the total number of native intermolecular contacts formed. Note that slightly different criteria were used to define the bound state of HIF-1α/TAZ1 because of small shifts of the bound free energy basins calculated using different models (see Fig. 3). 15-ps running averages were used for assigning states, to avoid including fictitious transitions due to rapid small fluctuations in the calculated contact counts (especially between the U and CC states). The overall on and off rates were calculated directly from the average lifetimes of the bound and unbound states (see Table S4). In addition, MFPTs and numbers of transitions among all three states were derived from the production simulation trajectories, and various rates were calculated as defined in Eqns. 1 and 2. Here, k_cap, k_esc, and k_evo are the capture, escape (to the unbound state) and evolution (to the bound state) rates of the collision complex, respectively; N_esc and N_evo are the numbers of escape and evolution transitions. Note that the MFPTs calculated correspond to the average times spent in an initial state before a transition to the final state. Ideally, for a true three-state model as shown in Eq. 1, the average lifetime of CC should be independent of whether the trajectory ends up in the U or the B state. However, the actual transitions between the CC and B states involve several intermediates that are not represented in Eq. 1, and the effective MFPTs as calculated thus depend on both the initial and final states (e.g., see Tables S1, S2, S3). Analytical expressions for similar MFPTs involved in amyloid fibril templating can be found in a recent theoretical analysis by Schmit [75]. All molecular visualizations were prepared using VMD [76].

Figure 6. Free-energy surfaces at T_m as a function of the binding RMSD of the IDP and the center-of-mass separation between the two peptides (R_CM), computed using various Gō-like models with and without explicit charges and/or 50 mM salt (see Table 2). The binding RMSD (of the IDP) was calculated by first aligning the snapshot with respect to the folded structure using only the folded substrate. For NCBD/ACTR, both proteins are IDPs and the (regular) RMSD was calculated using the whole complex. Rows A-C, D-F and G-I are for the p53-TAD1/TAZ2, HIF-1α/TAZ1 and NCBD/ACTR complexes, respectively. Contours are drawn every kT. doi:10.1371/journal.pcbi.1003363.g006

Figure S1 Residual helicities of (a) p53-TAD1, (b) HIF-1α, and (c) ACTR in the unbound states calculated using different Gō-like models. The solid traces correspond to models without explicit charges and the dashed traces are from the charged models.
The black traces were computed from models with no adjustment of the intramolecular interaction strengths (i.e., scale = 1.0), which significantly over-stabilized the helices. The red traces were calculated using the final calibrated models with optimal scaling of intramolecular interactions (see Table 2 of the main text). The residual helicity showed minimal dependence on the salt concentration for all peptides and the corresponding profiles are thus not shown. (TIF)
Figure S2 2D free energy surfaces at T_m calculated using models with (panels A, C, and E) and without (panels B, D, and F) explicit charges (see Table 2 of the main text). Q_inter^(HIF-1αA) and Q_inter^(HIF-1αC) are the fractions of native intermolecular contacts formed by the first and third helices of HIF-1α, respectively. R_CM is the distance between the centers of mass of HIF-1α and TAZ1. Contours are drawn every kT. (TIF)
Figure S3 2D free energy surfaces at T_m calculated using models with (panels A, C, and E) and without (panels B, D, and F) explicit charges (see Table 2 of the main text). Q_inter^(ACTR-H1), Q_inter^(ACTR-H2) and Q_inter^(ACTR-H3) are the fractions of native intermolecular contacts formed by the first, second and third helices of ACTR, respectively. Contours are drawn every kT. (TIF)
Figure S4 2D free energy surfaces at T_m calculated using models with (panels A, C, and E) and without (panels B, D, and F) explicit charges (see Table 2 of the main text). Q_inter^(NCBD-H1), Q_inter^(NCBD-H2) and Q_inter^(NCBD-H3) are the fractions of native intermolecular contacts formed by the first, second and third helices of NCBD, respectively. Contours are drawn every kT. (TIF)
Figure S5 Representative snapshots along the binding and folding pathways of p53-TAD1/TAZ2 extracted from the production simulation using the calibration model without explicit charges. (TIF)
Figure S6 Representative snapshots along the binding and folding pathways for HIF-1α/TAZ1 extracted from the production simulation using the calibration model without explicit charges. (TIF)
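The state assignment and lifetime-based rates described under "Free energy and kinetic analysis" can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names, the extra label "I" for frames with some native contacts but fewer than the bound cutoff, the 0.5 threshold applied to averaged contact counts, and the window expressed in frames are all assumptions introduced here.

```python
from itertools import groupby


def running_average(values, window):
    """Trailing running mean over `window` frames (shorter at the start)."""
    averaged = []
    total = 0.0
    for i, value in enumerate(values):
        total += value
        if i >= window:
            total -= values[i - window]
        averaged.append(total / min(i + 1, window))
    return averaged


def assign_states(n_native, n_nonspecific, bound_cutoff, window=1):
    """Label each frame as U, CC, I (assumed intermediate) or B.

    Averaged counts below 0.5 are treated as "no contacts" (an assumption).
    """
    native = running_average(n_native, window)
    nonspecific = running_average(n_nonspecific, window)
    states = []
    for n_nat, n_non in zip(native, nonspecific):
        if n_nat >= bound_cutoff:
            states.append("B")
        elif n_nat >= 0.5:
            states.append("I")
        elif n_non >= 0.5:
            states.append("CC")
        else:
            states.append("U")
    return states


def mean_lifetime(states, target, frame_spacing):
    """Average length of uninterrupted visits to `target`, in time units.

    Visits truncated by the trajectory ends are included, which biases
    short trajectories.
    """
    runs = [sum(1 for _ in group) for key, group in groupby(states) if key == target]
    if not runs:
        return float("nan")
    return frame_spacing * sum(runs) / len(runs)


def rate_from_lifetime(lifetime):
    """First-order rate taken as the inverse of a mean lifetime."""
    return 1.0 / lifetime
```

For example, with a bound cutoff of 11 native contacts (the p53-TAD1/TAZ2 criterion), `assign_states([0, 0, 2, 12, 12], [0, 3, 5, 9, 9], 11)` labels the frames U, CC, I, B, B. Converting the inverse unbound lifetime into a bimolecular on rate additionally requires the effective concentration set by the box volume, which is not handled in this sketch.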

Supporting Information
Table S1 MFPTs and numbers of transitions (in parentheses) between conformational sub-states of the p53-TAD1/TAZ2 complex computed from the production Langevin simulations. (DOC)

Table S4 Averaged on and off rates (k_on and k_off), as calculated from the mean residence times in either the unbound or the bound state during the production Langevin simulations at the corresponding T_m (as estimated from short replica exchange simulations). (DOC)
Text S1 Amino acid sequences of all four IDPs simulated. (DOC)