Binding of Two Intrinsically Disordered Peptides to a Multi-Specific Protein: A Combined Monte Carlo and Molecular Dynamics Study

The unique ability of intrinsically disordered proteins (IDPs) to fold upon binding to partner molecules makes them functionally well-suited for cellular communication networks. For example, the folding-binding of different IDP sequences onto the same surface of an ordered protein provides a mechanism for signaling in a many-to-one manner. Here, we study the molecular details of this signaling mechanism by applying both Molecular Dynamics and Monte Carlo methods to S100B, a calcium-modulated homodimeric protein, and two of its IDP targets, p53 and TRTK-12. Despite adopting somewhat different conformations in complex with S100B and showing no apparent sequence similarity, the two IDP targets associate in virtually the same manner. As free chains, both target sequences remain flexible and sample their respective bound, natively -helical states to a small extent. Association occurs through an intermediate state in the periphery of the S100B binding pocket, stabilized by nonnative interactions which are either hydrophobic or electrostatic in nature. Our results highlight the importance of overall physical properties of IDP segments, such as net charge or presence of strongly hydrophobic amino acids, for molecular recognition via coupled folding-binding.


Introduction
It has become clear that many functional proteins do not fold into unique three-dimensional structures, as expected from the classical view of proteins, but remain highly conformationally dynamic under native conditions. These intrinsically disordered proteins (IDPs) are now estimated to represent a significant fraction of many genomes. For example, roughly half of the proteins in mammals are predicted to contain disordered segments of more than 30 amino acids in length, and a fourth to be fully disordered [1]. Interestingly, the prevalence of disorder is far from uniform among different functional classes, suggesting IDPs carry special and biologically relevant properties. In particular, the prevalence is high among proteins performing important regulatory and signaling functions [2].
The perhaps most striking IDP property is their ability to undergo a conformational disorder-order transition upon contact with a target molecule. Various biological advantages have been suggested to stem directly from this coupled folding-binding process [3][4][5][6], including the ability to bind specifically to multiple and structurally diverse partners [7]. Indeed, intrinsic disorder has been found to be a common feature of ''hub'' proteins with especially large numbers of links in protein interaction networks [8,9]. An important example is given by the tumor suppressor p53, a transcription factor and regulatory protein that can induce cell cycle arrest and apoptosis [10]. The C terminal domain of p53 (henceforth simply p53) binds to at least four different globular proteins adopting four different conformations in the process [11][12][13]. How this binding diversity is achieved at the molecular level is not well understood and, in particular, cannot be rationalized by the classical lock-and-key model of protein-protein interactions.
The binding diversity of IDPs can in principle lead to two different types of signaling modes, one-to-many, in which one disordered region binds to multiple folded proteins, and many-toone, in which many disordered segments bind to the same folded protein [14]. An example of the latter is given by S100B, a Ca 2zmodulated homodimeric protein with several disordered protein partners [15][16][17], one of which is p53. Each S100B monomer consists of two EF-hand motifs which upon Ca 2z -binding reorients one a-helix (helix III), thereby exposing a hydrophobic binding pocket. In order to better understand this multiple specificity phenomenon, we study the coupled folding-binding of two disordered peptide sequences to S100B. As target peptides we choose p53 and a fragment from the protein CapZ, denoted TRTK-12, which both have available experimental structural data for the complexes [16,[18][19][20]. Molecular Dynamics (MD) simulation studies have previously been performed on the disordered N and C terminal regions of p53, with either positional constraints on the peptide [21] or by focusing on the characteristics of the bound and free states separately [22][23][24]. There has, however, not been a study comparing the processes of different IDP segments binding to the same ordered target.
MD simulations of biomolecular systems are typically hampered by limitations in conformational sampling. Large dynamical transitions such as coupled folding-binding of proteins is particularly challenging. We circumvent the problem by using a computationally efficient Monte Carlo-based implicit-solvent model, which has been extensively tested on the folding of peptides and small proteins [25][26][27]. Our basic approach has previously been tested on PDZ domain-peptide binding, with a variation of the potential energy function [28,29]. One advantage of this approach over recent protein-peptide docking methods [30][31][32] is that an equilibrium picture of the association is obtained. Additionally, we perform explicit-water MD simulations on IDP segments as free chains and in complex with S100B to further validate the MC result. Our work suggests striking similarities in the association mechanism of p53 and TRTK-12, despite their different amino acid sequences. In particular, we find that both sequences populate a transient intermediate state which is stabilized by either hydrophobic or electrostatic interactions.

Results/Discussion
Using an all-atom MC procedure, described in detail in Methods, we perform extensive fixed-temperature simulations to characterize the interaction between the Ca 2z -loaded form of the S100B homodimer and the two disordered peptides p53 (positions 374-388 of the full-length p53) and TRTK-12 (positions 265-276 of CapZ). In one-letter code, the amino acid sequences are GQSTSRHKKLMFKTE and TRTKIDWNKILS for p53 and TRTK-12, respectively. These peptides were chosen because they are well-characterized targets of S100B, and the interaction with p53 has previously been studied computationally [22,23]. The S100B structural forms used in the MC simulations are taken from the respective p53 and TRTK-12 bound structures but no specific knowledge of the binding pocket is built in. Free energy surfaces of the folding-binding processes are constructed and used to define bound and intermediate states which are then characterized in detail. Additional explicit-water MD simulations are also performed on free peptides, as well as on selected bound structures, to test the validity of our approach.

Conformational Preferences of Free Peptides
We start by examining the conformational behavior of our two peptide sequences as free chains using both MC and MD simulations, respectively (see Methods). Despite no discernable sequence similarity between the p53 and TRTK-12 peptides, we find that the two chains behave qualitatively similarly, as seen from Figure 1. Both peptides remain rather flexible at temperatures T&298 K and sample a wide range of conformations. A slight preference for a-helical states can be seen. Consequently, the conformational fluctuations of the free peptide chains include structures resembling the bound state. These bound-like populations are nonetheless quite low (see the low value tails of the RMSD distributions), in agreement with previous MD simulations on the p53 peptide [21].
We also find that the results of the two different simulation approaches are in good qualitative agreement. This is important as a validation of the full-scale protein-peptide simulations to be described below. The largest deviation appears for the p53 sequence, for which the MD results produce a peak in the RMSD probability distribution at around 6 Å . By inspection, we see that this peak represents peptide chains folded into a b-hairpin-like structure. The MC simulations of p53 also produce a small population of b-hairpins, albeit typically more fully formed, resulting in a shift of the peak to around 7.5 Å (see Figure 1A). We note that the tendency for the free p53 peptide to form b-hairpin-

Author Summary
A substantial fraction of our proteins are believed to be partly or completely disordered, meaning that they contain regions that lack a stable folded structure under typical physiological conditions. This is a feature which plays a key role in their functions. For example, it allows them to have many structurally different binding partners which in turn permits the construction of the intricate signaling and regulatory networks necessary to sustain complex biological organisms such as ourselves. Whereas measuring the binding strengths of associations involving disordered proteins is routine, the binding process itself is today still not fully understood. We use two different computational models to study the interactions of a folded protein, S100B, which can bind various disordered peptides. In particular, we compare two peptides whose structures are known when in complex with S100B. Our results suggest that, although the peptides assume different structures in the bound state, there are similarities in how they associate with S100B. The possibility to computationally model the interplay between proteins is an important complement to experiments, by identifying crucial steps in the binding process. This is essential to understand, e.g., how single mutations sometimes lead to serious diseases.
like structures was also observed in the MD study of Allen et al. [22]. Overall, we find that both the free p53 and TRTK-12 peptides sample multiple conformations, including transient ahelical structures.

Binding Free Energy Surfaces
We turn now to the binding processes of S100B and the p53 and TRTK-12 peptides, respectively. To this end, we rely on the computational convenience of the MC approach. The basic procedure follows earlier work [28,29] and operates in the following way. The S100B dimer structure is maintained in its native state by constraints, allowing side-chain and small backbone motions, while the peptide chains are free to explore the protein surface. Simulations are performed at fixed temperatures and are long enough to produce multiple binding and unbinding events in each trajectory such that an equilibrium picture of the interaction is obtained. Representative MC trajectories are shown in Figure  S1 in Supporting Information.
To monitor the progress of binding we use two different observables, DR CM and N c . The distance DR CM is taken between two points, the center-of-mass of the simulated peptide and the center-of-mass obtained from the peptide coordinates in the experimental complex structure. Hence, a small DR CM value indicates binding close to the experimentally determined binding site. The other observable, N c , is the number of amino acid contacts between the peptide chain and the S100B peptide binding pocket, which we have defined as a set of amino acid surface positions (see Methods). Because of the symmetry of the S100B homodimer, with its two identical binding pockets, we determine the center-of-mass distance using where DR CM1 and DR CM2 are obtained for the two binding pockets, respectively. For N c , we count the total contacts made with either binding pocket residues, thus assuming that a peptide can not contact both sites at the same time. This way, high N c and low DR CM is a signature of a peptide bound tightly at either of the two binding pockets. The overall features of the obtained folding and binding processes can be seen from the free energy surfaces in Figure 2. Both sequences exhibit major free energy minima representing bound states, characterized by DR CM 10 Å and N c &20-40. This indicates that binding occurs mainly at the experimentally determined S100B peptide binding pockets. We emphasize that this result is obtained in unbiased simulations, in which the peptides are left free to explore the entire protein surface.
We find additional, more detailed similarities between the binding processes of the p53 and TRTK-12 sequences. In the bound state, both p53 and TRTK-12 exhibit fluctuating helical contents. Hence, the peptides display a significant population of bound but not yet folded structures. This population is apparent in the one-dimensional free energy profiles in Figure 3, which reveal a plateau-like behavior in N c representing an intermediate state between the fully bound and free states. The term ''encounter complex'' has been used to indicate such a metastable state in protein-protein interactions [33]. For coupled folding-binding of disordered chains, there are two extreme mechanistic possibilities for association corresponding to either folding-before-binding (conformational selection) or binding-before-folding (induced folding) of the peptide [34]. The significant population of ''nonnative'' bound peptide conformations we observe for p53 and TRTK-12 indicate a binding mechanism dominated by inducedfolding for both sequences. Similar observations have been made in the studies of Chen [21] and Pirolli et al. [23] for the p53 sequence, as well as for associations involving other IDPs [35][36][37].
In previous studies on PDZ domain-peptide interactions we found a two-state-like binding mechanism [28,29], i.e., where the bound state was reached via a single well-defined binding transition state. The difference with the current systems may be a length effect, as PDZ domains typically bind much shorter  Figure 1). Free energies are obtained from extensive fixed-T equilibrium MC simulations of the interaction between peptide and S100B, in which several reversible binding and unbinding events occur, where T~290 K and 270 K for for p53 and TRTK-12, respectively. doi:10.1371/journal.pcbi.1002682.g002 peptides (around 4-6 amino acids). These short chains may not have sufficient chain flexibility to promote the formation of intermediate states [38].
The Bound State Ensemble: Structure and Protein-Peptide Contacts X-ray and NMR structural studies on p53 [16] and TRTK-12 [19,20] show that both peptides bind the hydrophobic surface on S100B, exposed upon Ca 2z -binding, but with somewhat different configurations. In complex with S100B, the p53 peptide is almost fully helical and docked parallel with the S100B helix III while TRTK-12 has only a single a-helical turn and is oriented perpendicular to helix III. The two minimum-energy conformations obtained from our MC binding simulations of p53 and TRTK-12, respectively, manage to capture many of these characteristics as shown in Figure 4. The TRTK-12 peptide in particular has in effect assumed the correct structure, being only partly helical and its tryptophan (Trp271) sidechain directed into the hydrophobic binding pocket. The min-E conformation of p53 deviates slightly from the experimental structure. As depicted in Figure 4A, its phenylalanine (Phe385) is contacting the hydrophobic pocket rather than being solvent exposed, as in the experimental structure [16]. This hinders the p53 peptide from being fully a-helical. However, the overall chain orientation relative to helix III is essentially correct.
It is clear from the binding free energy surfaces in Figure 2 that both peptides exhibit a significant structural diversity in the bound state. Minimum-energy structures may therefore not always provide a complete picture. The p53 peptide has a single, smooth bound state free-energy minimum spanning DR CM &2-10 Å . It turns out that this state is structurally quite well represented by the p53 min-E conformation in Figure 4A, with contacts between the S100B binding pocket and the p53 Phe385 as a prominent feature. By contrast, TRTK-12 has two distinct bound state minima characterized by DR CM &2-5 Å and &10 Å , respectively. The underlying structures of the small-DR CM (global) free energy minimum are similar to the min-E conformation. Interestingly, the local minimum at DR CM &10 Å exhibited by TRTK-12 corresponds to an a-helix parallel to the S100B helix III, reminiscent of the experimental binding pose of p53 [16].
To make a more detailed comparison with experimental data, we define the bound state (BS) by N c §11 and N c §17 for the p53 and TRTK-12 systems, respectively, based on the free energy profiles in Figure 3A. The populations of all possible amino acid contacts in the BS between the peptide and the S100B binding pocket are illustrated in Figure 5. p53 mainly binds to the binding pocket via its hydrophobic residues Leu383, Met384 and Phe385, in contrast to TRTK-12 which interacts with all its amino acids. This difference aside, the simulations suggest that the largest hydrophobic residue in each peptide fulfills a similar function. For p53, Phe385 is involved in all of the most populated contacts, equivalent to Trp271 in TRTK-12. These residues interact mainly with the hydrophobic amino acids of the binding pocket, especially with Ile36, Leu44, Val56 and Phe76. Our simulations suggest more involvement of the p53 Phe385 than what at first glance is apparent from the NMR structure. However, the importance of this position for binding is implied in the study of Rustandi et al. [39], who found that mutation of the Phe to a Trp, a larger hydrophobic amino acid, increases the binding affinity by 3-4 fold. As demonstrated in Figure 6, most S100B binding-pocket residues make on average the same number of contacts with p53 as with TRTK-12. There are however some positions, in particular Lys55 and Cys84, which are mostly involved in binding TRTK-12. These two residues indeed form contacts with TRTK-12 in the experimental structures [19,20], which we further find to be stable also in MD simulations of the S100B-TRTK-12 complex, performed as described below (see Figure S2).
While there is an overlap between the obtained contact maps in Figure 5 and the set of interactions derived from experiments, our BS ensembles do not perfectly fit with the experimental NMR structures. Some residue-residue contacts are only present in the simulations and, conversely, some contacts are found in the experimental structures but not in the simulations. The latter mainly emerges in the p53 simulations, where the two intermolecular salt bridges (Arg379-Glu45 and Lys386-Glu86), as well as contacts involving the N-terminal part of the p53 chain, are mostly absent in our simulations. The reason is probably the rather frequent interaction of Phe385 with the hydrophobic S100B binding pocket, which lead to a disrupted folding of the p53 ahelix. Our BS conformational ensembles are also more diverse than the set of NMR derived model structures. This can be seen by calculating the average RMSD between pairs of structures ij in the NMR and BS ensembles, respectively. For p53, we obtain vRMSD ij w~1:9+1:3 Å for the NMR ensemble of S100B-p53 structures and 7:0+1:4 Å for the simulation-derived BS ensemble. The corresponding values for TRTK-12 are 3:2+0:7 Å and 4:6+1:3 Å , respectively. It is possible therefore that our model is not able to capture the folding of the peptides in full, at least for p53. There is some uncertainty, however, in the interpretation of the nuclear Overhauser enhancements (NOEs) intensities which underly NMR structure calculations, especially with regard to the diversity of the protein ensemble [40,41]. The differences may also be due to the relatively weak binding conditions used in our simulations, where the BS is populated only at around 50%. At lower temperatures, the BS ensemble will likely become structurally more well-defined.

Testing Proposed Structures of the S100B-TRTK-12 Complex
In the analysis of our binding simulations we rely on available experimental structures of the protein-peptide complexes. For S100B and TRTK-12, three structures have been proposed [18][19][20]. One of these involves a novel coil-like conformation of TRTK-12 (PDB ID 1MQ1) [18] while the NMR and X-ray structures of Weber et al. both indicate a partially a-helical conformation (PDB ID 1MWN and 3IQQ) [19,20]. Our MC results are in line with the presence of a short a-helix in TRTK-12, Experimental structures of the p53-S100B (1D7T) [16] and TRTK-12-S100B (1MWN) [19] complexes are displayed in gray and pink. The ''nonoptimized'' root-mean-square deviation between the minimum-energy and experimental peptide structures are RMSD no~9 :6 and 7.3 Å for p53 and TRTK-12, respectively (see Methods). The figure was prepared with PyMOL [59] as demonstrated by the minimum-energy conformation in Figure 4B.
To further inspect the experimental disagreement in TRTK-12 structure, we employed our MD approach to test the stability of two of the proposed structures, 1MQ1 [18] and 1MWN [19]. Three simulations of 50 ns were performed for each complex, both with and without Ca 2z loaded in S100B, as shown in Figure 7. The calculations consistently indicate a higher stability for 1MWN than for 1MQ1, at least in terms of RMSD fluctuation measured over the TRTK-12 chain. The peptide secondary structure in the 1MWN trajectories remains intact over the full MD trajectories (data not shown). These MD trajectories thus also support the partly a-helical structural models of Weber et al. [19,20] and, moreover, provide further support for an agreement between our two different computational approaches.

Characterizing the Intermediate State
How do the peptide chains reach their respective bound states or, in other words, what is the peptide binding mechanism? Key to answering this question is partly bound peptide conformations populated transiently during the binding process. Although we cannot directly assess kinetic aspects of the binding process with our MC approach, we can examine the structural characteristics of peptide conformations which are neither entirely bound nor in the unbound state. As shown above, both sequences exhibit similar plateau-like regions in the N c free energy profiles (see Figure 3).
Many of these conformations must be populated during binding and it is therefore of interest to characterize and compare these states. We define this intermediate state, IS, as conformations which are neither unbound (N c~0 ) nor part of the bound state, BS.
Contrary to the BS ensembles, the IS of the two peptide sequences exhibit strong similarities. To show this, we performed a clustering procedure on all IS conformations with the aim of visualizing typical structures (see Methods for details). The four largest clusters for p53 and TRTK-12, respectively, are displayed in Figure 8, representing more than half of the IS ensemble in both cases. As depicted in Figures 8A and C, the IS peptide conformations are located primarily in two different regions in the periphery of the S100B binding pocket. In the first group, represented by the violet and the green cluster centroid structures in Figure 8, the peptides are located in the vicinity of a hydrophobic surface composed primarily of Phe43 and Phe87 on the S100B domain. Contacts between these amino acids and the peptides are also evident from the IS contact probabilities in Figure 5, which moreover indicate that they involve mainly the peptide residues Phe385 and Trp271 on p53 and TRTK-12, respectively. Interestingly, while these contacts are among the most frequent in IS, they are much less populated in the BS. In the second IS group, indicated by the red and the yellow centroids in Figure 8, the peptides are situated between helix III and helix IV of S100B. Here the stabilizing interactions are instead primarily  electrostatic. The possible existence of binding spots in the periphery of the binding pocket is in part supported by the work of Arendt et al. [42], who found that small molecules from a fragment library mainly bound to two sites of the S100B surface, one of which overlaps with the first region described here.
The similar binding mechanism of p53 and TRTK-12 is especially interesting in light of their low sequence similarity and the significant structural differences in their bound states. The S100B interaction with disordered peptides can be compared with so-called peptide recognition domains (PRDs), which are common modules in signaling proteins and include 14-3-3, PDZ, and SH3 [43]. These domains typically bind sets of peptide sequences conforming to simple linear sequence motifs. For example, SH3 domains bind peptides comprising a Pro-X-X-Pro motif, where Pro is proline and X any amino acid type. A structural analysis of the many-to-one binding exhibited by a particular 14-3-3 domain revealed a relatively high structural similarity in the proteinpeptide complexes involving five different target peptides [14]. Similarly, PDZ domains typically bind their peptides in a structurally specific way involving the peptide C terminus [44]. The multiple specificity exhibited by S100B thus differ from PRDs in at least two ways. First, the peptide sequences do not conform to a simple linear motif and, second, their bound structures differ significantly. A possible biological benefit of these two properties is that cross-reactivity among different S100B target peptides may be minimized. In other words, the relatively high sequence disparity among the IDP targets keep them from acquiring each others general functions. For example, the difference in sequence between CapZ/TRTK-12 and p53 may keep CapZ/TRTK-12 from binding to p53 partner domains other than S100B.
Despite the lack of a simple sequence motif for known S100B targets, we note some interesting common characteristics. Both p53 and TRTK-12 include a single aromatic amino acid (Phe385 and Trp271, respectively), one or more aliphatic amino acids, as well as several Lys and Arg resulting in a net positive charge of the peptide. These characteristics holds also for a third disordered S100B target for which a structure of the complex is available [45], taken from the N terminal regulatory domain of NDR kinase. It is possible that this basic compositional similarity of target peptides underpins the similarity in the binding mechanism, such that interactions made early in the binding process, being diverse and nonspecific in nature, are largely determined by the overall physical properties of the chain segment. An issue which remains to be addressed is to what extent a common binding mechanism is a general characteristics of many-to-one protein signaling. For this comprehensive additional simulations beyond the scope of the present investigation are required. However, we note that preliminary simulations for the interaction between S100B and NDR, performed using our all-atom MC approach, produced a minimum-energy conformation where the (sole) aromatic residue of the NDR peptide is involved in binding in a similar way as for p53 and TRTK-12 (see Figure S3).

Summary and Conclusion
We have used a combination of MD and MC all-atom simulations to understand the coupled folding and binding of two disordered peptides, p53 and TRTK-12, to the Ca 2z -loaded form of S100B, as an example of a many-to-one signaling mechanism. Despite the significant difference in sequence, we find  remarkable and unexpected similarities in their association behavior with S100B. First, as free peptides, the two sequences have similar predispositions towards helical conformations, but overall remain rather flexible chains. Only to a small extent do they sample their respective S100B-bound structures. Second, both p53 and TRTK-12 populate an intermediate, metastable state during the folding-binding process, which may serve as an initial encounter complex of the interaction. The intermediate state is divided structurally between two different binding surfaces on the periphery of the binding pocket where stabilization occur by either hydrophobic or electrostatic interactions and includes a significant fraction of a-helical structure (see Figure 8). Interestingly, major disparities between the two S100B targets become apparent only in the final bound state, where the patterns of contacts with the S100B surface are significantly dissimilar.

Molecular Dynamics Simulations
MD simulations of the free p53 and TRTK-12 peptides were started from random conformations which were obtained from high temperature (400 K) MD simulations. 20 simulations were performed for p53 and TRTK-12, respectively. The duration for each simulation was 200 ns.
The simulations of the Ca 2z -S100B-TRTK-12 complex were initialized from the two available NMR structures (PDB IDs 1MWN and 1MQ1). For each version three runs of 50 ns, with different initial velocity distributions, were performed. In 1MQ1, the coordinates for the four Ca 2z ions are missing. They were therefore added by superimposing the structure of 1MWN on 1MQ1 and transferring the coordinates of the Ca 2z ions from 1MWN to 1MQ1. A final manual adjustment of the ion positions was made in order to avoid steric repulsions with the proteins. We also performed three simulations of each complex variant without the Ca 2z ions present. In order to maintain the structure of S100B, position restraints on the backbone atoms were added.
All simulations were performed using the program GROMACS 4.07 [46,47] with the OPLS-AA/L force field [48] and the SPC/E water model [49]. The starting conformation was placed in the center of a cubic water box with at least 10 Å from the box edge. Periodic boundary conditions were used and counter ions (Na + or Cl 2 ) were added to neutralize the net charges. The long-range electrostatic interactions were treated with the particle mesh Ewald method [50]. The cutoff distances were set to 10 Å for short-range Coulomb and van der Waals interactions. The bond lengths were fixed by the LINCS algorithm [51], and a time step of 2 fs was used. Each system was first relaxed by 1000 steps of steepest-descent energy minimization. After the minimization, the system was equilibrated at 298 K for 100 ps under an NVT ensemble and further equilibrated for 200 ps at constant pressure (1 bar). V-rescale [52] and Parrinello-Rahman [53] were used to couple the system to the simulation temperature and pressure with coupling constants of 0.1 ps and 2.0 ps, respectively. Production simulations were performed at constant temperature (298 K) and pressure (1 bar). Coordinates were saved every 5 ps.

Monte Carlo Simulations
For the free peptides, 10 simulations were performed with the PROFASI program package [25][26][27] using a simulated tempering algorithm [54,55]. Each run consisted of 10 9 MC steps and visited 6 temperatures in the range 280-330 K for p53 and 201-301 K for TRTK-12. The temperature intervals were chosen to envelop the protein-peptide simulation temperatures. 10 simulations were also performed for each S100B-peptide system, where every run consisted of 3:7|10 9 MC steps. All binding simulations were performed in a cube with side length 150 Å and periodic boundary conditions. The temperature T was held fixed, chosen such that the peptide would be bound roughly half of the time (T~290 K for p53 and 270 K for TRTK-12). The protein had flexible backbone and sidechains, but was kept close to its native bound conformation (taken from the NMR structures with PDB IDs 1DT7 and 1MWN, where the best representative conformers were chosen). This was done by using a total energy function E~E 0 z40 : RMSD no , where E 0 is the original physical energy function defined in Ref. [27]. Sampling of protein conformations close to the S100B native state was ensured by the non-optimized root-mean-square deviation penalty term given by where r i and r nat i are the positions of the C a atom of amino acid i in the simulated and experimental protein structures, respectively, and N is the number of amino acids in the protein. The additional energy term constitutes typically less than 25% of the total energy in the system. Four flexible N-terminal residues, Phe88-Glu91, of each S100B monomer were removed.

Peptide Folding
The average amount of secondary structure in the free peptides was determined using STRIDE assignments [56]. The similarity to the natively bound structure was analyzed by calculations of the peptide backbone RMSD (translationally and rotationally optimized) with respect to the best representative conformer of the NMR structures (1DT7 for p53 and 1MWN for TRTK-12).

Binding Observables
As an indication of whether the peptide is bound to the correct binding site, we used the observables DR CM and N c . DR CM is the distance between the centers-of-mass of the simulated and the experimental peptide in the complex, respectively. N c is the number of contacts between the peptide and the protein binding pocket, where two residues are defined to be in contact if they exhibit at least two pairs of heavy atoms separated by less than 4.5 Å . The residues defining the protein binding pocket are listed in Figure 5.

Clustering Procedure
In order to characterize the most populated structural states of the intermediate, we used a Complete Linkage Hierarchical clustering method [57]. The aim of this procedure is to group protein-peptide conformations that are similar to each other into clusters. To determine similarity we used the standard root-meansquare-deviation measure, RMSD. The cutoff value in the clustering procedure was taken to be 6 Å and 8 Å for TRTK-12 and p53, respectively, which resulted in roughly the same number of clusters for both sequences. The clustering procedure guarantees that within each cluster, the RMSD between all pairs of conformations is less than the chosen cutoff value. Figure S1 Representative MC trajectories of the binding and unbinding of S100B and its peptide targets.

Supporting Information
Shown is the MC evolution of DR CM for two of the total 20 independent simulations performed for (A) p53 and (B) TRTK-12. (EPS) Figure S2 Time evolution of residue-residue distances in MD trajectories of the S100B-TRTK-12 complex. Distances are calculated between the sidechain center-of-mass points of the involved residues, Lys55 and Cys84 for S100B and Ile5 and Leu11 for TRTK-12. showing two independent binding/unbinding events, and (B) the minimum-energy conformation found during the trajectory where the simulated peptide is shown in green and the experimental protein-peptide structure is shown in gray and pink (PDB ID 1PSB) [45]. The NDR peptide sequence used in the simulation is KETEFLRLKRTRLGLE.