Molecular recognition and packing frustration in a helical protein

Biomolecular recognition entails attractive forces for the functional native states and discrimination against potential nonnative interactions that favor alternate stable configurations. The challenge posed by the competition of nonnative stabilization against native-centric forces is conceptualized as frustration. Experiment indicates that frustration is often minimal in evolved biological systems although nonnative possibilities are intuitively abundant. Much of the physical basis of minimal frustration in protein folding thus remains to be elucidated. Here we make progress by studying the colicin immunity protein Im9. To assess the energetic favorability of nonnative versus native interactions, we compute free energies of association of various combinations of the four helices in Im9 (referred to as H1, H2, H3, and H4) by extensive explicit-water molecular dynamics simulations (total simulated time > 300 μs), focusing primarily on the pairs with the largest native contact surfaces, H1-H2 and H1-H4. Frustration is detected in H1-H2 packing in that a nonnative packing orientation is significantly stabilized relative to native, whereas such a prominent nonnative effect is not observed for H1-H4 packing. However, in contrast to the favored nonnative H1-H2 packing in isolation, the native H1-H2 packing orientation is stabilized by H3 and loop residues surrounding H4. Taken together, these results showcase the contextual nature of molecular recognition, and suggest further that nonnative effects in H1-H2 packing may be largely avoided by the experimentally inferred Im9 folding transition state with native packing most developed at the H1-H4 rather than the H1-H2 interface.


Introduction
Molecular recognition is the basis of biological function.For different parts of the same molecule or different molecules to recognize one another, a target set of interactions need to be favored while other potential interactions are disfavored.Biomolecules accomplish these simultaneous tasks via the heterogeneous interactions encoded by their sequences.For proteins, such energetic heterogeneity is enabled but also constrained by a finite alphabet of twenty amino acids.Thus the degree to which non-target interactions can be avoided through evolutionary optimization is limited [1,2].Conflicting favorable interactions, referred to as frustration, are often present in biological systems.From a physical standpoint, it is almost certain that some of the frustration is a manifestation of the fundamental molecular constraint on adaptation, although under certain circumstances frustration can be exploited to serve biological function [3,4].
Protein folding entails intra-molecular recognition.Early simulations suggested that nonnative contacts can be common during folding [5].This predicted behavior applies particularly to models embodying a simple notion of hydrophobicity as the main driving force [6,7].Experimentally, however, protein folding is thermodynamically cooperative [7,8].Folding of many single-domain proteins does not encounter much frustration from nonnative interactions in the form of kinetic traps [9].Celebrated by the consistency principle [10] and the principle of minimal frustration [11], these empirical trends have inspired Gōlike modeling, wherein native-centric interactions are used in lieu of a physics-based transferable potential [12][13][14].Extensions of this approach allow nonnative interactions to be treated as perturbations in a largely native-centric framework [15][16][17].The success of these models poses a fundamental challenge to our physical understanding as to why, rather nonintuitively, natural proteins are so apt at avoiding nonnative interactions.Solvation effects must be an important part of the answer [18], as has been evident from the fact that coarsegrained protein models incorporating rudimentary desolvation barriers exhibit less frustration and higher folding cooperativity than models lacking desolvation barriers [7,19,20].More recently, and most notably, folding of several small proteins has been achieved in molecular dynamics studies with explicit water [21,22].Nonnative contacts are not significantly populated within sections of the simulated trajectories identified as folding transition paths [23] though they do impede conformational diffusion [24].These advances suggest that certain important aspects of protein physics are captured by current atomic force fields, although they still need to be improved to reproduce the high degrees of folding cooperativity observed experimentally [22,[25][26][27][28].
In this context, it is instructive to ascertain how atomic force fields, as they stand, disfavor nonnative interactions, so as to help decipher molecular recognition mechanisms in real proteins.We take a step toward this goal by comparing the stabilities of native and nonnative configurations of fully formed helices from a natural protein.By construction, this approach covers only a fraction of all possible nonnative configurations and therefore only provides, albeit not unimportantly, a lower bound on the full extent of frustration.Nonetheless, because of its focus on tractable systems, we obtain a wealth of reliable simulation data from which physical insights are gleaned.We do so by applying explicit-water molecular dynamics simulations to compute potentials of mean force (PMFs) between various helices [29] of the E. coli colicin immunity protein Im9 [30].Im9 is a small single-domain protein that undergoes twostate-like folding [31,32] to a native structure with four helices packed around a hydrophobic core [33].Its folding mechanism and that of its homolog Im7 have been extensively characterized experimentally [30][31][32][33][34][35][36][37][38][39][40] and theoretically [41][42][43][44][45][46].Of particular relevance to our study are experimental F-value analyses suggesting that the Im9 folding transition state has a partially formed hydrophobic core stabilized by interactions between helix 1 (H1) and helix 4 (H4), whereas helix 3 (H3) adopts its native conformation only after the rate-limiting step of folding [32].These experimental inferences have since been rationalized by simulations showing that H1 and H4 are formed whereas about one half of helix 2 (H2) remains unstructured in the Im9 transition state [41], and that, unlike Im7, there is no significant kinetic trap along the Im9 folding pathway [45,46].
Building on these advances, our systematic PMF analysis provides a hitherto unknown perspective on these hallmarks of Im9 folding.Notably, we found significant packing frustration between H1 and H2, viz., a nonnative packing orientation can achieve a lower free energy than that afforded by the native packing of these two helices in isolation.Superficially, this simulation result seems at odds with experiments indicating little frustration in Im9 folding.On closer examination, however, our discovery provides an unexpected rationalization for experiments indicating that folding is initiated by the more stabilizing H1-H4 interactions rather than by H1-H2 packing.Because the H1-H2 packing frustration can be circumvented by following such a kinetic order, our finding suggests that the Im9 folding pathway might have evolved to avoid a potential H1-H2 kinetic trap.This example underscores that the inner workings of molecular recognition can be rather subtle and deserves further exploration, as will be elaborated below.

Results
With the above rationale in mind, we apply the technique described in Methods and S1 Text for extensive molecular dynamics simulations to study the 86-residue helical protein Im9 [47], focusing primarily on the interactions among various sets of fragment(s) comprising one or more helices.For terminological simplicity, each fragment set in an interacting pair-including a single helix-is referred to as a bundle below.PMFs of nine pairs of bundles (Fig 1 and Table 1) are computed to ascertain whether native or nonnative associations are preferred.Although intra-bundle conformational variations are restricted in most of our model systems (Methods), the studied configurations are all physically realizable.It follows logically that the observation of favorable nonnative packing in our simulations is sufficient to demonstrate, at least for the atomic force field used here, that favorable nonnative interactions do exist in Im9.

Helices 1 and 2 favor nonnative packing in isolation
We begin by investigating the free energy landscape for the association of H1 with H2, a packing interaction that accounts for the largest two-helix interface in the native state of Im9, burying 5.3 nm 2 or 17% of the total surface area of H1 and H2.Throughout this study, surface areas of helical bundles are computed as the solvent-accessible surface areas of the given bundles in isolation, irrespective of the solvent exposure of the configurations in the complete Im9 folded structure.Using an enhanced sampling technique known as umbrella sampling with virtual replica exchange (US-VREX, see Methods) for restrained helical configurations at systematically varied target packing angles, we compute PMFs for H1-H2 association in the  The two interacting bundles in each system are separated by an arrow.Superscripts "N" and "C" represent, respectively, the three residues N-terminal to H4 and the eight residues C-terminal to H4. Superscript "L" represents the loop residues connecting two consecutive helices (e.g., in H2 L H3 L H4 C , the first "L" stands for residues 24-29, and the second "L" stands for residues [45][46][47][48][49] whereas a slash between two helixcontaining fragments (blocks of residues) indicates that the chain segment between the fragments is not part of the bundle.
b Helical residue selection is based on DSSP [48], except H3, which is extended from 51-54 to 50-55 based on Friel et al. [32].c Note that Asp62, Ser63, and Pro64 are part of this extended H4 fragment.
absence of their intervening loop (the H1!H2 system in Fig 1B and Table 1).The PMFs are determined for the native orientation as well as for nonnative orientations and nonnative crossing angles entailed by the imposed rotational preferences (Methods and S1 Text).Our technique allows these simulations to converge rapidly (S1 Fig) .Each PMF is then integrated over a free-energy basin to provide a binding free energy, ΔG bind , for a specific inter-helix geometry.
Unexpectedly, H1-H2 association is favored by a 20-30˚positive rotation of H1 against H2.Binding in this nonnative orientation is 10-12 kJ/mol more stable than that in the native orientation (black circles in Fig 2A and Table 2) provides the relative favorability of packing orientations resulting from simultaneous rotations of H1 and H2.This two-dimensional PMF is generated by combining sampling data for H1 and H2 rotations under harmonic biasing potentials (S1 Text).It is clear from this two-dimensional landscape that native packing [(H1, H2) rotations equal (0˚, 0˚)] is less favored than the free energy minimum at (+19˚, +4˚).Indeed, this minimum is situated in a rather broad basin encompassing many nonnative orientations with simultaneous H1 rotation from approximately +5˚to +25˚and H2 rotation from approximately -3˚to +15˚that are energetically more favorable than the native H1-H2 orientation (0˚, 0˚).Fig 2C reveals further that there exists another basin of favorable nonnative H1-H2 packing for which both helices rotate by approximately -20˚.In short, our systematic analysis in Fig 2 demonstrates unequivocally that packing frustration exists in Im9, in that when H1 and H2 are considered in isolation, nonnative packing is favored over native packing.
To assess the prospect that intervening loop residues may provide additional guidance for native packing of H1 against H2, we also simulate this helix-loop-helix as a single chain (H1 L H2 system; Table 1).Because the covalent connection of H1 to H2 is incompatible with the large helical separations used in our importance sampling, we study the H1 L H2 system without inter-helical distance bias in simulations initiated in either the native state or one of 20 different nonnative orientations in which H1 or H2 is rotated by ±10-50˚.[Because the actual rotations sampled during simulations are close to those targeted by the restraining potentials (S4 Fig) , we do not distinguish between target and actual rotations hereafter].Although these simulations do not converge to a single conformational distribution, they show broad sampling of H1 rotation with a stable or metastable state near +20˚rotation of H1, even when simulation is initiated at the native packing angle (S6 Fig).

But helix 1 is favored to pack natively against the rest of the protein
To explore how the H1-H2 packing frustration might be overcome in Im9 folding, we next investigate the impact of the rest of the protein on the packing between H1 and H2 by computing binding free energies for the association of H1 and H2 not in isolation but in the presence of additional protein fragments involving the other two helices H3 and H4 as well as loop and ΔG bind is computed from the total Boltzmann-weighted H1-H2-distance-dependent population of the entire free-energy basin (thus it correlates with but is not necessarily equal to the minimum PMF value; see S1 terminal residues.The conformations of the loop and terminal residues in our simulations are restrained to those in the Im9 PDB structure.
We first consider the association of H1 with a bundle comprising helices 2, 3, and 4 connected by their intervening loops and extending to the protein's C-terminus (H1!H2 L H3 L H4 C ; Table 1).Interestingly, for this system, native packing is found to be 13 ± 3 kJ/mol more favorable than the nonnative packing resulting from a +30˚rotation of H1 (Table 2).The very fact that a nonnative rotation of H1 is substantially favored in H1!H2 (Fig 2A and Table 2) but disfavored in H1!H2 L H3 L H4 C (Table 2) demonstrates clearly that some components of the H2 L H3 L H4 C bundle besides H2 are crucial for overcoming the H1-H2 packing frustration and guiding H1 to pack natively.Furthermore, because native packing is favored in H1!H2 L H3 L H4 C despite the residues N-terminal to H1 (including a Text).(B) PMFs here are distance-dependent free energies for the association of H1 and H2 in native (black curve) and nonnative orientations with H1 targeted to be rotated by +30˚(blue curve) or −30˚(red curve).Actual rotation angles sampled during the computations of these PMFs are close to the targets (S3  Each row represents a pair of interacting bundles.One bundle is the reference.The other, i.e., those along the "Rotated helix" column, is rotated.The relative positions of all C α atoms within any given bundle-which include C α atoms in the loop and/or terminal regions if they are part of the bundle-are maintained at the corresponding relative positions in the PDB structure of the entire protein.The absolute position of the reference bundle is fixed in a global Cartesian coordinate system by harmonic position restraints on all of its C α atoms along all three-x, y, and z-axes of the coordinate system.For the rotated bundle, the position restraints are applied only along the y and z axes.This serves to fix the relative angular orientation of the two bundles but allow for a variable distance between them.Center-of-mass distance between the reference and rotated bundles is varied during simulations by changing the favored x-value of the one-dimensional harmonic restraint on the rotated bundle.See Methods. b ΔΔG bind = ΔG bind (+30˚rotation) -ΔG bind (native); negative values of ΔΔG bind indicate that +30˚rotation is more favorable than the native orientation.https://doi.org/10.1371/journal.pcbi.1005909.t002short 3-10 helix) being excluded in this model system, these N-terminal residues are likely not necessary for ensuring native packing of H1 against the rest of the Im9 protein.

H3 and loop residues surrounding H4 assist native packing of H1 in varying degrees
We now dissect the H2 L H3 L H4 C bundle to ascertain the contributions from different parts of this bundle to native H1 packing.To this end, binding free energies for the association of H1 with a variety of subsets of H2 L H3 L H4 C are computed.We first consider a bundle comprising helices 2 and 4 (H1!H2/H4; Table 1).Somewhat surprisingly, native packing in the H1!H2/ H4 system is disfavored by as much as 22 ± 1 kJ/mol when compared against nonnative packing with H1 rotated by +30˚, even more than the corresponding nonnative preference of 10 ± 1 kJ/mol for H1!H2 (Table 2).This observation implies that H4 by itself is not promoting H1-H2 native packing and therefore H3, loops, and/or the C-terminus must be responsible for driving native packing of H1 with H2 L H3 L H4 C .Indeed, when compared against H2/H4, the presence of these other elements in H2 L H3 L H4 C results in a 26 ± 1 kJ/mol preference for native H1 packing and a 9 ± 3 kJ/mol discrimination against nonnative H1 packing with a +30˚rotation (Table 3).
To better pinpoint the role of H3 in this intra-molecular recognition process, we compute binding free energies for the association of H1 and a bundle comprising helices 2, 3 and 4 but without the intervening loops and the C-terminus (H1!H2/H3/H4; Fig 1D and Table 1).For this model system, native packing is less favorable than +30˚rotation of H1 by 11 ± 3 kJ/mol (Table 2).Nonetheless, in comparison to H1!H2/H4, the inclusion of H3 favors native packing more than it favors nonnative packing with a +30˚rotation of H1 (Table 2).This observation indicates that H3 is capable of correcting part of the nonnative tendencies of H1 imparted by its interactions with a bundle comprising only of H2 and H4; but H3 is insufficient to ensure native packing in the absence of the connecting loops and/or the C-terminus.
To explore whether inclusion of residues neighboring H4 may alter its effect on H1-H2 packing, we consider three residues immediately N-terminal to H4 (Asp62, Ser63, and Pro64).These residues are chosen because they are known to associate directly with H1 in the NMR structure [47] and thus they may contribute positively to native intra-molecular recognition.Consistent with this expectation, once these three residues are included, the H1-binding free energies in the resulting H1!H2/ N H4 system (Table 1) for native packing and nonnative +30˚rotation of H1 become essentially energetically equivalent (ΔΔG bind = 2 ± 6 kJ/mol; Table 2).Inasmuch as promoting native H1-binding is concerned, this represents a significant improvement over H1!H2/H4 that favors the +30˚-rotated nonnative packing by 22 ± 1 kJ/ mol (Table 2).Indeed, in the context of H1!H2/H4, addition of these N-terminal flanking residues assists native packing by 31 ± 5 kJ/mol, much more than the 7 ± 3 kJ/mol increase in stability they also impart on the nonnative packing of H1 with a +30˚rotation (Table 3).These numbers underscore the important role of Asp62, Ser63, and Pro64 in discriminating against nonnative packing of H1.
Another set of helix-flanking residues that may assist native packing in Im9 is its C-terminus.Such an effect is expected because a +30˚rotation of H1 would likely place its constituent residue Phe15 into a steric clash with the C-terminal residue Phe83 (S7 Fig) and thus existence of the C-terminus should discriminate against such a rotation of H1.To evaluate this hypothesis, we compute H1-binding free energies with a bundle comprising H2 and H4 as well as the protein's C-terminus (H1!H2/H4 C ; Table 1).Similar to the addition of Asp62, Ser63, and Pro64 N-terminal to H4 in H2/ N H4 bundle, inclusion of the C-terminus in H2/H4 C eliminates the strong nonnative bias in H1!H2/H4, resulting in essentially no discrimination between the native orientation and a +30˚rotation of H1 (ΔΔG bind = 1 ± 3 kJ/mol; Table 2).Relative to H1!H2/H4, addition of the C-terminus not only favors native packing by 6 ± 2 kJ/mol but also directly disfavors +30˚rotation of H1 by 17 ± 2 kJ/mol (Table 3).The latter penalization of nonnative packing (which does not occur in H1!H2/ N H4) is consistent with the aforementioned steric consideration (S7 Fig).
Interestingly, the native-promoting effects of N-and C-terminal extensions to H4 are essentially additive.When both extensions are added to H4, the H2/ N H4 C system (Table 1) is sufficient to favor native packing of H1 by 14 ± 6 kJ/mol over the nonnative packing with +30r otation of H1 (Table 2).

Native packing between H1 and H4 is assisted by flanking loop residues
After analyzing systems involving H2, we now turn to the intra-molecular recognition between H1 and H4 without involving H2.Native H1-H4 packing constitutes the second largest twohelix interface in the Im9 folded structure, burying 3.7 nm 2 which amounts to 13% of the sum of individual surface areas of H1 and H4.PMFs for helices 1 and 4 in isolation (H1!H4; Fig 1C and Table 1) are computed in the native orientation as well as nonnative orientations resulting from rotations of H1 or H4.When H1 is rotated while H4 is fixed, native packing is favored (Fig 3A, black circles); however, when H4 is rotated with H1 fixed, a +30˚nonnative rotation of H4 leads to 5 ± 1 kJ/mol stabilization (decrease in ΔG bind ) relative to native (red squares in Fig 3A and Table 2).Distance-dependent PMFs for the native orientation and ±30r otations of H4 are shown in Fig 3B, indicating that the favored nonnative packing at +30˚is attained at an H1-H4 separation slightly larger than native by about 0.1 nm.The two-dimensional PMF (Fig 3C) as a function of H1 and H4 rotation angles shows further that native H1-H4 packing (0˚, 0˚) is situated at the periphery of a broad basin of favored orientations centered roughly around (+10˚, +10˚).The same two-dimensional landscape suggests that H1 rotations of !+50˚or -50˚can also be favored with little or no H4 rotation.
We noted earlier that a 3-residue N-terminal extension to H4 directly contacts H1 in the native state and that the inclusion of these residues assisted the native packing of H1 against a bundle comprising helices H2 and H4.Consistent with that observation, these three residues -Asp62, Ser63, and Pro64-likewise assist the native packing of H1 against H4, viz., their inclusion in the H1! N H4 system (Table 1) makes native packing (ΔG bind = -44 ± 1 kJ/mol)  2).We conclude from these results that helices H1 and H4 are nearly capable of associating in native-like conformations by themselves in isolation; and that they can certainly achieve native packing with the assistance from the 3-residue N-terminal extension to H4.These results suggest that Im9 residues 12-23 and 62-78 may serve as major components of a native-like folding nucleus.

Certain specific interactions are particularly favorable to nonnative packing
To better understand the driving force for nonnative H1-H2 packing, the potential energies between specific pairs of amino acid residues on the H1-H2 interface in the native orientation are compared against those in the nonnative orientation with a +30˚H1 rotation.We make this comparison for helix-helix center of mass distance d 0 i = 1.10 nm in both the native and non-native configurations, wherefore each pair of helices in question is in close spatial contact (Fig 4).The analysis indicates a prominent role by the more favorable Lennard-Jones interactions between interfacial residue pairs Glu14-Met43, Leu18-Phe40, and Ile22-Phe40 in favoring the nonnative packing, whereas electrostatic interactions between these residue pairs are of similar strengths for the native and nonnative packing orientations.In contrast, the interaction between Ile22 and Leu33 favors native packing, but its effect is more than compensated by the aforementioned multiple residue-residue interactions that drive nonnative packing such that a +30˚rotation of H1 is favored over the native orientation for H1-H2 packing in isolation.It is noteworthy, however, that while these residue-residue energetic effects can be significant individually (Fig 4) and collectively (Table 2), they are not accompanied by obvious, drastic structural changes at the level of residue-residue contacts.When contacts between residues on different helices at a helix-packing interface are identified by a commonly used proximity threshold, contact probabilities between the helices are seen to remain essentially unchanged upon a +30˚H1 native-to-nonnative rotation in both the H1!H2 and H1! H2 L H3 L H4 C systems (S8 Fig).
Seeking physical reasons for favoring native packing in H1! H2 L H3 L H4 C but not in H1!H2, we compare the potential energies of these systems in the native and the +30H 1-rotated nonnative configurations (Fig 5).When potential energies are analyzed by the molecular species involved in the interactions, for H1!H2, solvent-protein (solvent-helix) interactions are more unfavorable with nonnative rotation of H1 by +30˚, but this effect is overwhelmed by larger, favorable changes in solvent-solvent and intra-and inter-helix interactions (Fig 5A).More specifically, this nonnative H1 rotation favors inter-helix Lennard-Jones interactions (as exemplified by the three residue pairs circled in red in Fig 4A) as well as intrahelix and solvent-solvent electrostatic interactions (Fig 5A), netting an overall favorable (more negative) potential energy for the nonnative orientation (Fig 5A, "sum").In contrast, the corresponding analysis for H1!H2 L H3 L H4 C yields a set of average potential energies that favors the native state overall (Fig 5B, native "sum" more negative than nonnative).This potential energy (enthalpic) trend is consistent with the above PMF/binding free energy prediction that the native orientation is favored for H1!H2 L H3 L H4 C (Table 2), though entropic effects may make additional contribution to the stability of native packing of H1 against H2 L H3 L H4 C (see

Entropic stabilization of native packing
To gain further insight into the differential effects of H2 and H2 L H3 L H4 C on the favorability of the native orientation upon H1 binding, we resolve the distance-dependent H1!H2 and H1!H2 L H3 L H4 C PMFs (Fig 6A and 6B, respectively) into their enthalpic (Fig 6C and 6D) and entropic (Fig 6E and 6F) components.Since the backbones of the helical elements in our simulation systems are restrained to be essentially rigid, the entropic contributions computed

Enthalpic and volume barriers in helix-helix binding
Every helix-helix association in Fig 6 entails an enthalpic barrier at separation %1.5 nm (Fig 6C and 6D).As implied by the absence of PMF barriers at these positions (Fig 6A and 6B), the enthalpic barriers here are compensated by a larger decrease in entropic free energy at the same positions (Fig 6E and 6F).Further examples of enthalpic barriers and entropic  protein folding [51,52].For the systems studied in Fig 6, it is not surprising that the enthalpic and volume barriers are higher for H1!H2 L H3 L H4 C than for H1!H2 because the former binding process buries a significantly larger protein surface area.Therefore, we expect a larger transient void volume between the approaching helices before close packing is achieved for H1!H2 L H3 L H4 C than for H1!H2.It is interesting to note that, perhaps because void volumes are largely a consequence of geometry and less of energetics, the volume barrier heights in

Discussion
To recapitulate, we have conducted a systematic analysis of the relative stability of native versus nonnative packing of helices in the Im9 protein as a means to address the physical basis of biomolecular recognition.These results are summarized schematically in Fig 7: Relative to native packing, three nonnative configurations (H1!H2, H1!H2/H4, and H1!H2/H3/H4, each with H1 rotated) are significantly stabilized whereas one other nonnative packing orientation (H1!H4 with H4 rotated) is mildly stabilized.Other Im9 systems that we have simulated either favor the native configuration or essentially do not discriminate between native and nonnative packing.As emphasized at the outset, our method is designed to characterize packing frustration of constrained, locally native protein substructures by varying the orientation between interacting substructures that are rigid by construction, viz., the secondary structure (main-chain conformation) of each of the helices is essentially fixed.It follows that while our substantial computational effort has succeeded in gaining structurally and energetically high-resolution information about frustration that is novel and complementary to that obtained from our previous coarse-grained chain model study of Im7 and Im9 [45], the present investigation-unlike our coarse-grained modeling [45]-cannot by itself address certain general questions regarding folding pathways such as the viability of nucleation-condensation mechanisms [53] because backbone conformational freedom is not treated.For the same reason, the present method does not tackle frustration involving disordered, flexible main-chain segments that may adopt locally nonnative conformations.A notable example in this regard is the second helix of Im7.Among the four respective helices in Im7 and Im9, the amino acid sequence of the second helix varies the most between the homologs [30].The second Im7 helix has been identified as a part of the protein which is disordered and participates in nonnative interactions that stabilize a kinetically trapped folding intermediate during the process of nontwo-state folding of Im7 [45].However, revealingly, the significant role of a disordered H2 in frustrating Im7 folding is not reflected by its behavior as an ordered helix: Unlike the H1!H2 system of Im9, the H1!H2 system of Im7 exhibits no favorable nonnative packing (S10 Fig) .This finding underscores the importance of disordered conformations to frustration in globular protein folding, an effect that the present analysis has not addressed.From a broader perspective, such effects have to be even more critical for molecular recognition among intrinsically disordered proteins [54,55].
Notwithstanding aforementioned limitations of the present approach, several important lessons can already be learned from our extensive computational investigation.First, a majority of the helical systems that we consider favor native packing, indicating that the Im9 amino acid sequence encodes a sufficiently strong native bias such that the native structure can be recognized by the folding protein.Second, frustration exists, manifested most notably by-but not necessarily limited to-the significantly stabilized nonnative H1-H2 packing.Although the conformational space accessible to an 86-residue polypeptide is vast compared to what is accessible via contemporary simulation and thus our ability to identify all possible sources of frustration is limited, the systematic approach taken in the present study does pinpoint one class of frustrated configurations.Third, the native fold is favored overall despite frustration, at least within the class of configurations we tested, because nonnative H1-H2 packing is destabilized when other parts of the protein, especially H4 and its flanking residues, are involved in the interaction.
A logical inference from our results is that favorable nonnative interactions can be largely suppressed during Im9 folding by favoring trajectories that assemble H1 and H2 not in isolation but only in the presence of H4 plus flanking residues.Such preference would help avoid kinetic traps to facilitate known two-state folding behaviors of Im9 [32,36].This expectation is consistent with the Im9 folding mechanism deduced from experimental phi-values (F F ) by Radford and coworkers, who determined that residues in H2 have the lowest F F -values among H1, H2, and H4; but F F -values are higher for the hydrophobic residues in H1 and H4.This  1. Helices are depicted as circles, covalent linkages are indicated by black bars, and the positions of native helix-helix interfaces are highlighted by red arcs.For each system, the configurations on the left and right are native and nonnative, respectively, with the nonnative rotation indicated by a red arrow.Black arrows point toward the orientation with a more favorable binding free energy, with bidirectional arrows indicating energetic equivalence and arrow thickness representing free energy differences with absolute values that are mild ( 6kJ/mol) or significant (! 10 kJ/mol).Black or grey boxes enclose, respectively, nonnative packing configurations that are significantly or mildly favored over native packing.https://doi.org/10.1371/journal.pcbi.1005909.g007and other findings led them to conclude that the H1-H4 interface "is the most structured region in the transition state ensemble", and that the native configuration of H1, H2, and H4 is partially formed in the transition state whereas H3 is formed after the rate-limiting step [32].Since our simulation results also suggest that H1-H2 interactions should be weaker than those between H1-H4 to minimize kinetic trapping, our data offer a physical rationale as to why the Im9 folding pathways might have evolved.
A general theoretical formalism due to Wolynes and coworkers provides quantitative estimates of local frustration [3,42,56].Of relevance here is their configurational frustration index, which quantifies the likelihood of a pair of residues that are in contact in a protein's native structure to be engaged in favorable nonnative interactions in alternate conformations.Their web-based "Protein Frustratometer" algorithm [56] predicts a high configurational frustration region in Im9 encompassing residues 25-38, which overlaps substantially with H2 (residues 30-44, Table 1).In contrast, H1 and H4 are predicted to be situated in lower configurational frustration regions on average (S11 Fig).These predictions are consistent with, and therefore lend further support to the aforementioned perspective emerging from our simulation results.It is noteworthy, however, that the Frustratometer-computed configurational frustration F c of Im7 is not noticeably higher on average than that of Im9 (S11 Fig) , notwithstanding the fact that folding is significantly more frustrated for Im7 than for Im9 experimentally [30][31][32][33][34][35][36][37][38][39][40].In particular, while the predicted frustration of H4 is higher for Im7 than for Im9 (which is consistent with H4's involvement in nonnative interactions with H2 in Im7 folding), the predicted configurational frustration of H2 of Im7 is similar to, or even slightly lower than that of Im9.It would be instructive to investigate whether this apparent inability of the algorithm to clearly delineate the key experimental difference in Im7 and Im9 folding kinetics is because the decoy inter-residue contact distances used to compute configurational frustration F c [56] are insufficient to fully capture the conformational possibilities of a disordered H2 that make strong nonnative interactions in Im7 possible [45].Intuitively, this limitation might be similar or even related to the impossibility of discerning Im7 frustration from the packing of fully formed H1 and H2 alone (S10 Fig) despite the fact that many of the favorable nonnative interactions in Im7 folding are between residues in H1 and H2.This question deserves further attention.
Owing to the high computational cost of the present approach, applications have been confined to the commonly used OPLS-AA/L force field.While useful insights are gained as reported above, it should be noted that current molecular dynamics force fields can be limited in their ability to accurately model disordered protein states (reviewed in [57,58]) and to capture subtle effects such as conformational switches [59].It is important, and would be instructive, to assess how discrimination against nonnative interactions is affected by ongoing efforts to improve current force fields [57,58].Much work remains to be done before the physical basis of biomolecular recognition can be fully deciphered.

Methods
We use molecular dynamics (MD) simulations to systematically study helix packing in Im9 (PDB ID: 1IMQ [47]) by constructing pairs of various Im9 fragments (bundles) comprising one or more helices (Fig 1 and Table 1) and computing their PMFs of association (Figs 2-4).A more limited set of Im7 bundles is also studied for comparison.Helical residues (Table 1) are defined by DSSP [48].Helical rotations with positive and negative angles indicate clockwise and counter-clockwise angular displacements, respectively, around the helix's long axis in the N-to C-terminal direction (Fig 1) relative to the native orientation.Positive and negative changes in helix-helix H1-H2 crossing angles are, respectively, rigid rotations of H1 in the clockwise and counter-clockwise directions with respect to the vector directed from the center of mass of H1 to the center of mass of H2, the angular changes being relative to the native H1-H2 crossing angle.

Umbrella sampling with virtual replica exchange (US-VREX)
Umbrella sampling (US) [60] simulations are employed to quantify the extent to which the residues in two pre-folded regions of the protein are sufficient to drive native-like association.Specifically, we compute orientation-specific free energies for the binding of H1 to a systematic selection of helices from other parts of the protein with and without connecting loops.To enhance computational tractability, the latter helical bundles are prevented from unfolding or changing their relative orientations by imposing harmonic restraints on the positions of all C α atoms with force constants of 1000 kJ/mol/nm 2 .Unfolding of H1 is disallowed by C α position restraints that are enforced only in the Cartesian y and z dimensions, using the same force constant.The US order parameter is the magnitude of the Cartesian x component of the vector connecting the centers of mass of C α atoms in the two bundles.This linear displacement, d, is harmonically restrained at a specified target value, d 0 i , in each umbrella i, with a force constant of 2000 kJ/mol/nm 2 .For each system, 39 umbrellas span 0.7 nm d 0 i 2.6 nm in 0.05 nm increments.To further enhance the rate of convergence in these US simulations, we allow equilibrium exchange of umbrellas using the virtual replica exchange (VREX) approach [61,62].Further details of the US-VREX approach are provided in S1 Text.US-VREX simulations are conducted for the H1!H2, H1!H4, H1! N H4, H1!H2/H4, H1!H2/ N H4, H1!H2/ H4 C , H1!H2/ N H4 C , H1!H2/H3/H4, and H1!H2 L H3 L H4 C systems of Im9 (Table 1), where the arrow separates the two interacting fragments (bundles) under consideration.The H1!H2 system of Im7 is also simulated using the same method.The two bundles in any given system are on equal footing because their association with each other is mutual.The arrow in our notation serves merely to indicate their spatial association without regard to the arrow's direction.Each system is simulated for 100 ns/umbrella, except for H1!H2 and H1!H2 L H3 L H4 C in the native orientation and with nonnative H1 rotation by +30˚, which are simulated for 500 ns/umbrella.In total, these US-VREX simulations comprise >300 μs of simulated time.Despite the application of position restraints to prevent the helices from unfolding or changing their relative orientations during PMF computations, the rotation angle of helices varies within ±10˚of the target packing angle.We identify the simulated systems by the angles to which they are targeted.

Single-chain simulations
The H1 L H2 system comprising H1, H2, and their connecting loop is simulated from the native [47] and twenty different nonnative initial conformations generated by removing inter-helical loop residues 24-29, rotating H1 or H2 about its long axis by ±10˚, ±20˚, ±30˚, ±40˚and ±50˚, and then modeling loop residues using the prediction program Loopy [63].Secondary structure is maintained while allowing changes in helical rotation and separation by applying intrahelical distance restraints on all backbone atom pairs with force constants of 1000 kJ/mol/nm 2 .Each simulation covers 1 μs, with the first 125 ns discarded in subsequent analysis.

Simulation protocol
MD simulations are conducted with version 4.5.5 of the GROMACS simulation package [64].The water model is TIP3P [65].Protein is modeled by the OPLS-AA/L parameters [66,67].Simulation systems are neutralized and excess NaCl is added at 0.4 M, mimicking experimental conditions [31,68].Water molecules are rigidified with SETTLE [69] and protein bond-lengths are constrained with P-LINCS [70].Lennard-Jones interactions are evaluated using a group-based cutoff and truncated at 1 nm without a smoothing function.Coulomb interactions are calculated using the smooth particle-mesh Ewald method [71,72] with a Fourier grid spacing of 0.12 nm.Simulations are in NPT ensembles by isotropic coupling to a Berendsen barostat [73] at 1 bar with a coupling constant of 4 ps and temperature-coupling the simulation system using velocity Langevin dynamics [74] at 300 K with a coupling constant of 1 ps.The integration time step is 2 fs.The nonbonded pair-list is updated every 20 fs.Further details are provided in S1 Text and S1 Table.Here a contact is said to exist between two residues if at least two heavy atoms, one from each residue, are separated by 0.45 nm.(B, C) Corresponding contact probabilities for H1! H2 L H3 L H4 C between residues in H2 and those in H3 and H4 (B), and between residues in H1 and those in H2, H3, and H4 (C).Color scale (top right) indicates a range from no contact (white for probability zero) to constant contact (blue for probability of one).In each of these cases (A, B, and C), results shown are for native (left panel) and nonnative rotation of H1 by +30˚(right panel).For the H1! H2 L H3 L H4 C results in (B) and (C), residues of the helices are marked by color bars to the right of each set of contact maps (H2: blue, H3: grey, H4: green).(TIF)

S9 Fig. Changes in local water density upon Im9 helix-helix binding.
Colors (top scale) indicate densities that are greater (blue) or less (red) than bulk water at 300 K for a 0.4 nm slice passing through the center of mass of H1 and H2 (A, B, C) or H2 L H3 L H4 C (D, E, F).Data shown depict three representative separations between the approaching helix bundles (cf.Fig 6 of the main text): (A, D) the position corresponding to the solvent-separated enthalpy minimum at d = 1.90 nm, (B, E) the desolvation enthalpic barrier at d = 1.45 nm, and (C, F) the free energy minimum at d = 1.15 nm.Note that the sidechains of the approaching helix bundles are farther apart at the desolvation enthalpic barrier (B, E) than at contact (C, F).However, unlike the situation in (A, B), there is no water between the helix bundles in (B, E).Thus the total system volume is larger for (B, E) than for either (A, B) or (C, F).In other words, a volume barrier develops around d = 1.45 nm for both H1!H2 and H1!H2 L H3 L H4 C systems (see native state is from PDB 1AYI (ref.[1] of S1 Text), with H1 and H2 comprising residues 12-connection of H1 to H2 by its intervening 6-residue loop although the loop itself is not present in the simulations.
The H1H4 system is constructed similarly, with analogous rotation of H4 in place of H2 but crossing angle changes are not considered.
Because binding free energies at +20° and +30° H1 rotation are not significantly different in the H1H2 system (Fig. 2A of the main text), we expect a larger rotation of H1 (+30° instead of +20°) to provide binding free energy differences that are easier to interpret given our use of harmonic restraints rather than holonomic constraints to maintain helical orientations.Loop conformations from the native state are used for the H1 H2 L H3 L H4 C system.Number of water molecules and initial box dimensions are indicated for all simulation systems in S1 Table .Simulations of H1H2 systems for Im7 are also conducted for comparison (S10 Fig) .Umbrella sampling with exchange of umbrellas (US-VREX).Virtual replica exchange (VREX) is a generalized ensemble sampling algorithm.Similar to replica exchange (RE) [3][4][5][6][7], VREX comprises a set of simulations (replicas) each of which samples a target ensemble that is restrained at a different position along a globally defined order parameter.In RE, pairs of simulations are periodically subjected to a reciprocal exchange attempt based on their potential energies and their instantaneous positions along the order parameter using a Metropolis Monte Carlo criterion that preserves equilibrium populations within each ensemble.VREX improves upon the computational efficiency of RE by removing the need for synchronicity of the two simulations in the exchange attempt.This asynchronicity is accomplished by storing the instantaneous value on which exchange is based, in a list that is associated with the target sampling ensemble.These stored values are used subsequently in virtual reverse exchange attempts while evaluating the probability of a transition for a replica in a neighboring ensemble [8].In this work, we use the approach of Neale et al. to conduct a random walk in helical displacement by exchanging umbrellas, i.e., umbrella sampling-VREX or US-VREX in short [9].US-VREX simulations employ 39 umbrellas using the GROMACS pull code.Following 5,000 steps of steepest-descent energy minimization, 3 ns of molecular dynamics (MD) simulation are performed in the NPT ensemble to generate initial conformations.In US-VREX simulations, the helical displacement, d, is harmonically restrained at a specified target value, 0 i d , in each umbrella i, with a force constant, k u = 2000 kJ/mol/nm 2 .Stochastic jumps in 0 i d are attempted every 4 ps for a total of 100 ns per umbrella (510 4 exchange attempts per replica) or a total of 500 ns per umbrella (2.510 5 exchange attempts per replica).No values are stored on virtual-exchange lists [9] in the first 250 simulation segments of any replicas and no exchanges are attempted for the first 300 simulation segments.Sampled values are stored every 0.2 ps.
Free energies.The values of d sampled in US-VREX simulations are converted to PMFs by using Alan Grossfield's implementation [10] of the weighted histogram analysis method (WHAM) [11].Accordingly, recorded values of helical separation in the range 0.7 nm ≤ d ≤ 2.6 nm are distributed among 500 histogram bins and the WHAM calculation is performed with a tolerance of 4.18410 -5 kJ/mol.Each resulting PMF describes the free energy as a function of separation distance, ΔG d , from a maximal separation of 0 i d =2.6 nm to closest (contact) interaction at 0 i d = 0.7 nm.Each PMF is then shifted such that the average value of ΔG d in the range 2.0 nm ≤ d ≤ 2.6 nm, denoted as ΔG bulk , is equal to zero.The standard deviation of the mean values of PMFs are calculated using the block averaging technique [12] by computing PMFs from blocks of 20 ns and 120 ns, respectively, for simulations with 100 ns and 500 ns/umbrella.The binding free energy, ΔG bind , used in the present analysis is obtained by a procedure of trapezoid integration of the PMF similar to one employed previously [13] over a free energy basin defined to be in the range 0.7 nm ≤ d ≤ 2.0 nm, viz.,   Here β=(k B T) -1 where k B is Boltzmann constant and T is absolute temperature; d(d) is the differential element for the helix-helix distance d.Effectively, the quantity ΔG bind is defined relative to an unbound ΔG d = 0 reference state that has a d-value range of 1.0 nm [13].The ΔG bind so defined by Eq.S1 is not dependent upon concentration although concentration effects can be incorporated in a derived quantity of ΔG bind that takes into account translational entropy of the bundles.Because the free energy cost of conformational restraints should be independent of the packing angle when the bundles are sufficiently far from one another (taken to be d > 2.0 nm in the present calculation), we may quantify the relative binding affinities of simulated systems with different compositions and/or orientations by comparing their ΔG bind values without regard to the particulars of their common reference state.
To assess systematic sampling errors in our free energy computation, we use WHAM to compute free energy profiles based on data sets comprising sampling from time t to t + Δt from each umbrella independently for Δt = 20 ns and increasing values of t.We then use visual inspection to determine the amount of initial non-equilibrium sampling (20 ns per umbrella).
After discarding this initial equilibrating sampling, the uncertainty in ΔG bind is estimated by its deviation, bind G   , which is calculated after dividing the remaining trajectory into three blocks (i.e., Δt = 27 and 160 ns, respectively, for the simulations with 100 and 500 ns/umbrella).
When evaluating relative free energy differences, ΔΔG bind = ΔG bind (i) -ΔG bind (j), for orientations i (corresponding to +30° helical rotation) and j (native; see Tables 1 and 2  Every Δ-quantity is the value of the given quantity at distance d minus the average of the quantity in the region 2.0 nm ≤ d ≤ 2.6 nm that we use to define the ΔG baseline (see above).
Volume V is obtained by averaging over NPT simulation box sizes and P = 1 bar, as has been specified in the main text.This analysis is applied to H1H2 and H1H2 L H3 L H4 C in Fig. 6 of the main text and S2 Fig.
While statistical error can probably be reduced by computing the entropic component via the temperature-derivative of the entropy, i.e., by conducting simulations at multiple temperatures, average internal energies should provide adequate results [17] for our present purposes.
Solvent-accessible surface area.Solvent accessible surface area is computed using a GROMACS analysis tool that implements the algorithm outlined by Eisenhaber et al [18].For terminological simplicity, it is referred to as "surface area" in the main text.a The two interacting bundles in each system are separated by an arrow.Superscripts "N" and "C" represent, respectively, the three residues N-terminal to H4 and the eight residues C-terminal to H4. Superscript "L" represents the loop residues connecting two consecutive helices, whereas a slash between two helix-containing blocks of residues indicates that the chain segment between the blocks is not part of the bundle of interest.Deviations between actual and targeted rotations arise from effects of many potential energy terms in the simulated system in addition to the imposed angle-restraining potential.The differences between actual and target angles shown here are relative to baselines defined by the behavior of the system at 0 i d > 2.0 nm for which the interactions between the two bundles are expected to be sufficiently weak such that they may be considered to be independent.

Fig 1 .
Fig 1. Schematics of Im9 simulation systems.(A) Full-length Im9 (PDB ID: 1IMQ [47]).Helices are represented as cylinders.(B-D) Combined helical wheel and cylinder representations of systems wherein H1 packs against (B) H2, (C) H4, or (D) H2, H3, and H4.For each helical wheel, the red arrow indicates the residue closest to the viewer.Energetic effects of translating H1 in the directions of the solid blue arrows are determined with the position(s) and orientation(s) of the opposing helix or helices (cylinders) fixed.To evaluate the energetic consequences of helical rotation and nonnative packing, the fragment depicted by the helical wheel is rotated (dashed arrows) to nonnative orientations with positive (+) and negative (-) rotation angles.Residues on the helical wheels are colored differently depending on the type of amino acid: charged residues in grey, nonpolar residues in yellow, and polar residues in white.https://doi.org/10.1371/journal.pcbi.1005909.g001 , a free energy difference equivalent to a ~50-fold increase in bound population (S1 Text).In contrast, the binding free energy profiles for rotating H2 against H1 (Fig 2A, red squares) or changing the H1-H2 crossing angle (Fig 2A, blue triangles) indicate that the state corresponding to native packing (0˚angle in Fig 2A) is situated well within the basin of lowest free energy with respect to these degrees of freedom, although a 50˚positive change in H1-H2 crossing or a 20˚negative rotation of H2 against H1 would leave the system approximately iso-energetic with the native packing (Fig 2A).As mentioned, these binding energies are computed from PMFs such as those in Fig 2B and S2 Fig.A broader view of the orientation-dependent H1-H2 packing free energy landscape can be seen in Fig 2C.Instead of fixing either H1 or H2 in its native orientation (as in Fig 2A), Fig 2C

Fig 2 .
Fig 2. Im9 binding free energies for H1!H2.(A) Binding free energies, ΔG bind , for the association of H1 and H2 with native and nonnative packing angles.Nonnative configurations are generated by rotating H1 (filled black circles), or H2 (open red squares), or changing the H1-H2 crossing angle (filled blue triangles).ΔG bind is computed from the total Boltzmann-weighted H1-H2-distance-dependent population of the entire free-energy basin (thus it correlates with but is not necessarily equal to the minimum PMF value; see S1 Fig).Standard deviations of the mean from block averaging are shown as vertical bars in (A) or shaded regions in (B).(C) Two-dimensional PMF of the H1, H2 packing angles in simulations with helical rotation but no change in H1-H2 crossing angle.Data are drawn from multiple simulations, including one started and restrained to the native orientation and twenty others with preferred nonnative packing angles in which one helix is rotated by ±10-50˚.Each free energy value (bottom color scale) plotted is the minimum of the distance-dependent PMF for a given inter-helix geometry (S1 Text).White regions have no sampling.By construction, the H1-H2 distance at the minimum of PMF can be different for different rotation angles (see example in S4 Fig).It is noteworthy that the two free-energy basins exhibited here are nonetheless robustly observed at essentially the same packing angles in multiple restrained simulations wherein inter-helical distances are targeted at a given d 0 i ranging from 1.0 nm to 1.3 nm (S5 Fig).https://doi.org/10.1371/journal.pcbi.1005909.g002

Fig 3 .
Fig 3. Im9 binding free energies for H1!H4.(A) Binding free energies, ΔG bind , for the association of H1 and H4 with native and nonnative packing angles generated by rotating H1 (filled black circles), or H4 (open red squares).ΔG bind is computed from multiple PMF values as in Fig 2 (S1 Text).(B) PMFs describing distance-dependent free energies for the association of H1 and H4 in native (black curve) and nonnative orientations of H4 rotated by +30˚(blue curve) or −30˚(red curve).Standard deviations of the mean from block averaging are shown as vertical bars (A) or shaded regions (B).(C) Two-dimensional PMF of H1 and H4 packing angles, constructed by the same procedure as that in Fig 2C (S1 Text).White regions have no sampling.https://doi.org/10.1371/journal.pcbi.1005909.g003below).Because nonnative +30˚H1 rotation has opposite effects on intra-H2 (Fig 5A) versus intra-H2 L H3 L H4 C (Fig 5B)Coulomb energies, one of the reasons for disfavoring nonnative +30˚H1 rotation in H1!H2 L H3 L H4 C is that this rotation of H1 induces energetic strain within H2 L H3 L H4 C , resulting in a destabilizing increase in intra-H2 L H3 L H4 C Coulomb energy collectively, whereas the same +30˚H1 rotation leads to an overall stabilizing decrease in intra-H2 Coulomb energy.The atomic basis of this difference remains to be analyzed.

Fig 4 .
Fig 4. Residue-specific potential energies for Im9 H1!H2 at d 0 i = 1.10 nm.(A) Lennard-Jones (filled) and electrostatic (hashed) potential energies for direct interaction of selected interfacial residue pairs from the native configuration (red) and a nonnative (blue) configuration with H1 rotated by +30˚.Pairs with |ΔE| > 1 kJ/mol are circled (S1 Text).(B) Snapshot of H1 (orange) packed against H2 (blue) in the native orientation, superposed with the configuration with a nonnative +30˚rotation of H1 (both helices in grey).Sidechains involved in residue pairs with |ΔE| > 1 kJ/mol, identified in (A), are shown as sticks.(C, D) Helical wheels show (C) native and (D) nonnative interactions between residues that contribute to the more favorable (red dashed lines) and more unfavorable (green dashed lines) component binding energies for nonnative than for native packing (see part A).As in Fig 1, amino acid residues on the helical wheels are color coded: grey for charged, yellow for nonpolar, and white for polar residues.https://doi.org/10.1371/journal.pcbi.1005909.g004

Fig 6 .
Fig 6.Energetic profiles of Im9 H1 binding.Change in free energy (PMF, ΔG), its enthalpic (ΔH) and entropic (−TΔS) components, and system volume (ΔV) for H1 binding to other molecular fragments at 300 K as a function of the displacement, d (distance), between H1 and the opposing helical bundle: H1!H2 (A, C, E, G) and H1!H2 L H3 L H4 C (B, D, F, H).Simulation data are shown for the native orientation (red curves) and the nonnative orientation with +30˚rotation of H1 (blue curves).Vertical dashed lines mark the positions of the native (red) and nonnative (blue) PMF minima.Error bars show standard deviations of the mean estimated by block averaging (S1 Text).https://doi.org/10.1371/journal.pcbi.1005909.g006 Fig 6G and 6H are essentially insensitive to the difference between native and nonnative packing.

Fig 7 .
Fig 7. Schematics of relative preference for native versus nonnative binding in Im9 fragments.System identifiers are defined in Table1.Helices are depicted as circles, covalent linkages are indicated by black bars, and the positions of native helix-helix interfaces are highlighted by red arcs.For each system, the configurations on the left and right are native and nonnative, respectively, with the nonnative rotation indicated by a red arrow.Black arrows point toward the orientation with a more favorable binding free energy, with bidirectional arrows indicating energetic equivalence and arrow thickness representing free energy differences with absolute values that are mild ( 6kJ/mol) or significant (! 10 kJ/mol).Black or grey boxes enclose, respectively, nonnative packing configurations that are significantly or mildly favored over native packing.

S6Fig.
Population density maps of Im9 H1 and H2 packing angles obtained from the single-chain H1 L H2 system.Each subplot represents an independent simulation that was initiated in either the native state (NS) or with the indicated helix rotated (R) by the specified angle.(TIF) S7 Fig.A potential steric clash.Positive rotation of H1 brings Im9 H1 residue F15 into closer contact with C-terminal residue F83, leading to a likely steric clash if the C-terminal region retains its structure in the native state.Helices in the Im9 NMR structure (PDB ID: 1IMQ; see ref. [2] of S1 Text) are colored as follows: H1, orange; H2, blue; H3, black; and H4, green; whereas intervening loops and C-terminus are in grey.Enlarged view (right): F15 and F83 side chains are shown as cyan sticks in the native configuration and the F15 side chain is shown in red after rotation of H1 by +30˚.(TIF) S8 Fig. Contact probability maps of Im9 helical association.(A) Contact probabilities for H1!H2 between residues in H1 and those in H2.
Fig 6G and 6H of the main text).(TIF) S10 Fig. Im7 binding free energies for H1!H2.(A) Binding free energies, ΔG bind , for the association of H1 and H2 with native and nonnative packing angles.Nonnative configurations are generated by rotating H1 (filled black circles), or H2 (open red squares), or changing the H1-H2 crossing angle (filled blue triangles).ΔG bind is computed by integrating the PMF over a free-energy basin as in Fig 2A and Fig 3A of the main text.(B) PMFs shown are distancedependent free energies for the association of H1 and H2 in native (black curve) and nonnative orientations with H1 rotated by +30˚(blue curve) or −30˚(red curve).Standard deviations of the mean from block averaging are shown as vertical bars in (A) or shaded regions in (B).Im7 Fig. 2C and Fig. 3C of the main text, respectively.During simulation, helical rotations are enforced by way of position restraints for C α atoms (see main text Methods) rather than bona-fide helical orientation restraints.Therefore, we assume that the C α spatial restraints can be collectively approximated by a single rotational restraint and estimate the effective force constants for helical rotation.These effective force constants are then used to combine the sampled ensembles so as to generate a PMF via WHAM.To this end, we use the sampled values of the helical rotation angle, θ, and the average sampled rotation angle,  , to compute histograms of the deviation of the sampled rotation,       , for d > 2.0 nm, where the PMFs are flat.The average rotation angle at d > 2.0 nm is used as a reporter on the center of restraint because of uncertainties associated with defining the helical rotation angle based on the positions of fluctuating C α atoms.The probability histogram of Δθ, P Δθ , is converted to a profile of the rotational free energy, ΔG Δθ , according to ΔG Δθ = k B T ln[P Δθ ].A harmonic biasing potential of the form

S2Fig.
Im9 energetic profiles for H1H2.Inter-helical PMFs (A, C, E) and distancedependent enthalpies (B, D, F) are shown for rotation of H1 (A, B), rotation of H2 (C, D), and changing of the H1-H2 crossing angle (E, F), while leaving the backbone native configuration of the opposing helix unchanged in the spatial coordinates of the simulation system.In each plot, data for native and nonnative packing angles are shown as black and colored curves, respectively.Colors for rotation or crossing angles are listed at the top of this figure, where negative and positive angular changes are indicated, respectively, by solid and dashed lines.Error bars show standard deviations of the mean estimated by block averaging.S3 Fig. Average helical rotation sampled during US-VREX simulation of the H1H2 system.Data show actual rotation of (A) H1 and (B) H2 for native (black curve) and nonnative orientations with H1 rotation targeted to +30° (blue curve) or −30° (red curve).

S4Fig.
Representative structures corresponding to free energy minima for the H1H2 system.The structures are restrained to the native orientation (solid color) and a nonnative orientation with H1 rotated by +30° (translucent color).Free energy minima are located at helical separation distance d = 1.14 nm for the native orientation and d = 1.09 nm for the nonnative orientation with H1 rotated by +30°.S5 Fig. Two-dimensional PMFs of the H1, H2 packing angles in H1H2 simulations with helical rotation but no change in H1-H2 crossing angle.The format is similar to that of Fig. 2 in the main text.Data are for restrained inter-helical distances, 0 i d , from 2.0 nm to 1.0 nm as indicated above each plot.The color scale on the right is for the relative free energy at any given value of 0 i d , but the scale does not apply across different values of 0 i d .White regions have no sampling.S6 Fig. Population density maps of Im9 H1 and H2 packing angles obtained from the single-chain H1 L H2 system.Each subplot represents an independent simulation that was initiated in either the native state (NS) or with the indicated helix rotated (R) by the specified angle.S7 Fig.A potential steric clash.Positive rotation of H1 brings Im9 H1 residue F15 into closer contact with C-terminal residue F83, leading to a likely steric clash if the C-terminal region retains its structure in the native state.Helices in the Im9 NMR structure (PDB ID: 1IMQ; see [2] of S1 Text) are colored as follows: H1, orange; H2, blue; H3, black; and H4, green; whereas intervening loops and C-terminus are in grey.Enlarged view (right): F15 and F83 side chains are shown as cyan sticks in the native configuration and the F15 side chain is shown in red after rotation of H1 by +30°.

F15F83S8Fig.
Contact probability maps of Im9 helical association.(A) Contact probabilities for H1H2 between residues in H1 and those in H2.Here a contact is said to exist between two residues if at least two heavy atoms, one from each residue, are separated by ≤ 0.45 nm.(B, C) Corresponding contact probabilities for H1 H2 L H3 L H4 C between residues in H2 and those in H3 and H4 (B), and between residues in H1 and those in H2, H3, and H4 (C).Color scale (top right) indicates a range from no contact (white for probability zero) to constant contact (blue for probability of one).In each of these cases (A, B, and C), results shown are for native (left panel) and nonnative rotation of H1 by +30° (right panel).For the H1 H2 L H3 L H4 C results in (B) and (C), residues of the helices are marked by color bars to the right of each set of contact maps (H2: blue, H3: grey, H4: green).S9 Fig. Changes in local water density upon Im9 helix-helix binding.Colors (top scale) indicate densities that are greater (blue) or less (red) than bulk water at 300 K for a 0.4 nm slice passing through the center of mass of H1 and H2 (A, B, C) or H2 L H3 L H4 C (D, E, F).Data shown depict three representative separations between the approaching helix bundles (cf.Fig. 6 of the main text): (A, D) the position corresponding to the solvent-separated enthalpy minimum at d = 1.90 nm, (B, E) the desolvation enthalpic barrier at d = 1.45 nm, and (C, F) the free energy minimum at d = 1.15 nm.Note that the sidechains of the approaching helix bundles are farther apart at the desolvation enthalpic barrier (B, E) than at contact (C, F).However, unlike the situation in (A, B), there is no water between the helix bundles in (B, E).Thus the total system volume is larger for (B, E) than for either (A, B) or (C, F).In other words, a volume barrier develops around d = 1.45 nm for both the H1H2 and H1H2 L H3 L H4 C systems (see Fig. 6G, H of the main text).S10 Fig. Im7 binding free energies for H1H2.(A) Binding free energies, ΔG bind , for the association of H1 and H2 with native and nonnative packing angles.Nonnative configurations are generated by rotating H1 (filled black circles), or H2 (open red squares), or changing the H1-H2 crossing angle (filled blue triangles).ΔG bind is computed by integrating the PMF over a freeenergy basin as in Fig. 2A and Fig. 3A of the main text.(B) PMFs shown are distance-dependent free energies for the association of H1 and H2 in native (black curve) and nonnative orientations with H1 rotated by +30° (blue curve) or −30° (red curve).Standard deviations of the mean from block averaging are shown as vertical bars in (A) or shaded regions in (B).Im7 native state is from PDB 1AYI (ref.[1] of S1 Text), with H1 and H2 comprising residues 12-26 and 32-45, respectively, as determined by DSSP (ref.[20] of S1 Text).