Tolerance of Protein Folding to a Circular Permutation in a PDZ Domain

Circular permutation is a common molecular mechanism for evolution of proteins. However, such re-arrangement of secondary structure connectivity may interfere with the folding mechanism causing accumulation of folding intermediates, which in turn can lead to misfolding. We solved the crystal structure and investigated the folding pathway of a circularly permuted variant of a PDZ domain, SAP97 PDZ2. Our data illustrate how well circular permutation may work as a mechanism for molecular evolution. The circular permutant retains the overall structure and function of the native protein domain. Further, unlike most examples in the literature, this circular permutant displays a folding mechanism that is virtually identical to that of the wild type. This observation contrasts with previous data on the circularly permuted PDZ2 domain from PTP-BL, for which the folding pathway was remarkably affected by the same mutation in sequence connectivity. The different effects of this circular permutation in two homologous proteins show the strong influence of sequence as compared to topology. Circular permutation, when peripheral to the major folding nucleus, may have little effect on folding pathways and could explain why, despite the dramatic change in primary structure, it is frequently tolerated by different protein folds.


Introduction
Processes such as point mutation, gene duplication and fusion, recombination and circular permutation drive evolution. Circular permutations, where the old termini are sealed and new termini are created at a different site, were long considered rare events because they are difficult to detect from primary structure alone. With the growing number of available 3D structures and structure comparison tools, the estimated number of circular permutants has increased drastically. Jung and Lee showed that 14% of the domains in the structural SCOP domain database have at least one circularly permuted "homolog", i.e., one member of each such pair is likely to have arisen through circular permutation [1].
Among the handful of published studies on isolated protein domains, circular permutation often results in more complex kinetic folding mechanisms than for the wild type [2-5] and sometimes in population of low-energy intermediates [6-8]. The only exception is the two-state folder chymotrypsin inhibitor 2, for which the folding pathway remained the same upon circular permutation [9]. In general, more complex folding mechanisms result in accumulation of intermediates and misfolding, which in turn may cause disease and will therefore be disfavoured by evolution [10]. Why then is circular permutation so frequent? Otzen and Fersht suggested that the folding of protein domains with diffuse folding nuclei is more likely to be unaffected by circular permutation. Another study showed that if the cleavage site is within the "folding elements", stretches of amino acids important for early folding events, the protein will not fold, whereas if it is located elsewhere the protein will fold with conserved early folding events [11]. To learn more about how circular permutation affects folding pathways, we analyzed a protein domain with a relatively complex folding pathway, namely the second Postsynaptic density protein-95/Discs large/Zonula Occludens-1 (PDZ) domain from synapse associated protein 97 (SAP97). SAP97 is a member of the membrane-associated guanylate kinase family and is involved in establishing cell polarity [12] and synaptic potentiation [13]. We also compare our results to those from another PDZ domain, PDZ2 from protein tyrosine phosphatase-BL (PTP-BL), an enzyme that is involved in signal transduction and carries a number of recognition domains in addition to its catalytic domain [14]. PDZ domains are usually part of such multidomain proteins and have important roles in molecular recognition.
PDZ domains are well-characterized globular protein domains of around 90 amino acids with a conserved fold but with substantially different primary structure [15,16]. In the case of SAP97 PDZ2 and PTP-BL PDZ2 the sequence identity is only 43% but their 3D structures are superimposable. PDZ domains consist of six β-strands and two α-helices ordered in the following way: β1-β2-β3-α1-β4-β5-α2-β6 (Figure 1A). There is also a naturally occurring circularly permuted variant of the canonical PDZ domain, where β-strand 1 is placed after β-strand 6 [17,18]. In the case of PDZ2 from PTP-BL, this circular permutation was engineered and resulted in accumulation of a low-energy intermediate in the folding reaction [6,7]. Indeed, this permutation stabilized the β-sheet formed by strands β1 and β6 in a region where the early nucleus is formed in the folding reaction of PTP-BL PDZ2 [19,20].
Wild type PTP-BL PDZ2 is known to fold without any low-energy intermediates. In contrast, folding of SAP97 PDZ2 involves a low-energy intermediate, which can be either on- or off-pathway [21,22]. This protein domain therefore offers a good experimental system to probe the effect of circular permutation on a complex folding energy landscape. We determined the crystal structure and studied the folding pathway of the β6-β1 circular permutant of SAP97 PDZ2 (Figure 1). In contrast to PTP-BL PDZ2, we found that the folding mechanisms for the canonical and circularly permuted SAP97 PDZ2 are remarkably similar.

Design of a Circularly Permuted Protein
A circularly permuted (cp) SAP97 PDZ2 domain was generated by fusing the N- and C-termini of a pseudo wild-type (pwt) SAP97 PDZ2 [23] with a glycine and serine linker (GSG) between E315 and P405. The new N- and C-termini, K327 and P326, respectively, were selected because they correspond to the termini of the naturally occurring circular permutant of a PDZ domain from a green alga [17,18] (see Figure 1A). The same permutation had a severe effect on the folding of PTP-BL PDZ2 [6,7].

The Circularly Permuted and Canonical SAP97 PDZ2 Share the Same Fold
We solved the crystal structure of the cpSAP97 PDZ2 to ensure that the overall structure was not altered by the permutation. The cpSAP97 PDZ2 protein crystallized in the space group C2 with two molecules in the asymmetric unit. The structure was solved by molecular replacement and refined to a resolution of 2.3 Å. In the deposited pdb entry (4AMH), residues Lys13-Pro91 and Glu95-Pro106 correspond to the residues Lys327-Pro405 and Glu315-Pro326, respectively, in wild-type SAP97 PDZ2. Below, we will refer to the residue numbering of the wild-type protein. In both molecules in the asymmetric unit the residues from Lys327 to Pro405, via the linker Gly-Ser-Gly, and from the next residue Glu315 to Lys324 are ordered. The N-terminal 12 residues, comprising the histidine tag and the thrombin cleavage site, and the C-terminal residues (Pro326 in molecule A and Gly325-Pro326 in molecule B) are disordered. The data collection and refinement statistics are shown in Table 1. The cpSAP97 PDZ2 protein structure has the typical PDZ domain fold with six β-strands (β1 to β6) and two α-helices (α1 and α2) (Figure 1B). Superposition of molecules A and B (Figure 1C) shows that the overall root mean square deviation (r.m.s.d.) between the A and B molecules is 1.34 Å over 91 Cα atoms (Lys327-Pro405-Gly-Ser-Gly-Glu315-Lys324). Minor conformational changes are observed in α2 and the preceding loop as well as in the β2-β3 loop. A larger conformational change is observed in the engineered β6-β1 loop, with a maximum deviation of 6.5 Å between the Cα atoms of the second glycine in the linker region of the two molecules. The weak electron density and high B-factors for the Gly-Ser-Gly linker and the neighbouring residues in both molecules suggest that this loop is highly flexible.
Examination of the crystal contacts shows that the β2-β3 loop and β6-β1 loop in both molecules are involved in crystal packing interactions. Superimposing the cpSAP97 PDZ2 (molecule A) onto the pwtSAP97 PDZ2 (pdb 2X7Z) shows that the structures are very similar (Figure 1D), except for the different N- and C-termini, the break in the β1-β2 loop and the new loop connecting β6 to β1. The overall r.m.s.d. for 91 aligned Cα atoms (Lys327-Lys324) is 0.88 Å. Interestingly, even after moving β1 from the N-terminus to the C-terminus, the orientations of the side-chains in the Ile317-Ile323 region of the cpSAP97 PDZ2 structure are similar to those in the pwtSAP97 PDZ2 structure. As previously observed in the pwtSAP97 PDZ2 structure, Lys324 in the cpSAP97 PDZ2 does not form a salt bridge with Asp396 but forms a hydrogen bond with the side-chain of Thr394.
Figure 1. Structure of the circularly permuted SAP97 PDZ2 (cpSAP97 PDZ2). A. Schematic picture of the rearrangement of secondary structural elements in cpSAP97 PDZ2. This secondary structure arrangement occurs naturally in a PDZ domain in green alga [17,18] and, even though it seems modest, had a significant effect on the folding of PTP-BL PDZ2 [6,7]. B. Ribbon representation of the cpSAP97 PDZ2 structure showing the new N- and C-termini. C. Superposition of the two cpSAP97 PDZ2 molecules in the crystal structure, A (green) and B (blue), shown as Cα trace. D. Superposition of cpSAP97 PDZ2 (green) and pwtSAP97 PDZ2 (pink) shown as Cα trace. doi:10.1371/journal.pone.0050055.g001
Circular dichroism experiments confirmed that the circularly permuted protein is folded and contains similar ratios of secondary structural elements as the pwtSAP97 PDZ2 in our experimental conditions ( Figure 2A).

The Circular Permutation Reduces the Stability of the Protein
To compare the thermodynamic stability of the pwt- and cpSAP97 PDZ2 we performed equilibrium denaturation experiments in 50 mM potassium phosphate, pH 7.5, by varying the urea concentration and measuring the fluorescence of the single tryptophan (Trp342) present in each protein (Figure 2B). The equilibrium constants were obtained by fitting the data to the general equation for solvent denaturation of a protein according to a two-state mechanism [24], since the denaturation curves displayed a simple sigmoidal transition with no evidence for intermediate states populated at equilibrium. Since the two proteins have very similar sequence and overall fold, the m_D-N value, which reflects the change in solvent accessible surface area on denaturation, can be shared in the fitting process. The data show that the circular permutation has decreased the stability of the protein from 4.7 ± 0.03 to 2.4 ± 0.09 kcal/mol (Table 2).
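The two-state analysis can be illustrated with a minimal sketch, assuming the common linear extrapolation model in which the free energy of unfolding decreases linearly with urea concentration. The stabilities (4.7 and 2.4 kcal/mol) are taken from the text; the shared m-value of 1.3 kcal/mol/M and the function names are hypothetical and serve only as an illustration, not as the fitted parameters of Table 2.

```python
import math

R = 1.987e-3  # gas constant, kcal mol^-1 K^-1
T = 298.15    # 25 degrees C in kelvin

def fraction_native(urea_molar, dg_water, m_value):
    """Fraction of native protein for a two-state D <-> N equilibrium.

    Assumes the linear extrapolation model: dG(urea) = dg_water - m_value * urea.
    """
    dg = dg_water - m_value * urea_molar
    k_fold = math.exp(dg / (R * T))  # equilibrium constant [N]/[D]
    return k_fold / (1.0 + k_fold)

def midpoint(dg_water, m_value):
    """Urea concentration at which half of the molecules are denatured."""
    return dg_water / m_value

M_SHARED = 1.3  # hypothetical shared m-value, kcal mol^-1 M^-1 (not from the paper)
for name, dg in (("pwt", 4.7), ("cp", 2.4)):
    print(name, "midpoint (M):", round(midpoint(dg, M_SHARED), 2),
          "fraction native in buffer:", round(fraction_native(0.0, dg, M_SHARED), 4))
```

Sharing the m-value between the two proteins means that the lower stability of the permutant appears purely as a shift of the transition midpoint to lower urea concentration.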

The cpSAP97 PDZ2 Retains Binding to the Wild Type Ligand
From kinetic binding experiments, SAP97 PDZ2 is known to bind to a peptide, LQRRRETQV, derived from the C-terminus of the human papillomavirus 18 (HPV-18) E6 protein [23]. We showed that cpSAP97 PDZ2 retains its binding to this peptide (Figure 2C and D) by measuring the fluorescence change from Trp342 upon binding, using the stopped-flow technique. The association rate constant, k_on, was one-third of that of the pwtSAP97 PDZ2, whereas the dissociation rate constant, k_off, remained essentially unchanged, as determined from a separate displacement experiment [23] (Table 2).
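As a rough illustration of how k_on is read from the slope of such a plot, the sketch below uses the pseudo-first-order limit, k_obs = k_on[peptide] + k_off, which holds when peptide is in large excess over protein. This is a simplification of the general second-order equation used for the actual fit. The k_on values are those quoted in the legend to Figure 2D (read here as µM^-1 s^-1); the k_off value, the peptide concentrations and the function names are hypothetical.

```python
def k_obs(peptide_um, k_on, k_off):
    """Observed rate constant (s^-1) in the pseudo-first-order limit."""
    return k_on * peptide_um + k_off

def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

K_OFF = 5.0  # hypothetical dissociation rate constant, s^-1 (not from the paper)
peptide = [2.0, 4.0, 6.0, 8.0, 10.0]  # micromolar, illustrative concentrations
for name, k_on in (("pwt", 8.7), ("cp", 2.9)):  # k_on in uM^-1 s^-1
    rates = [k_obs(c, k_on, K_OFF) for c in peptide]
    slope, intercept = fit_line(peptide, rates)
    print(name, "k_on =", round(slope, 2), "k_off =", round(intercept, 2),
          "Kd (uM) =", round(intercept / slope, 2))
```

With an unchanged k_off, a three-fold lower k_on translates directly into a three-fold higher dissociation constant, K_d = k_off/k_on.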

Observed Folding Rate Constants are Consistent with a Multi-step Mechanism
Having demonstrated that the structure and function of the cpSAP97 PDZ2 were intact, we investigated its folding pathway. Folding kinetics were studied by urea-induced unfolding experiments and buffer-induced refolding experiments in a stopped-flow fluorimeter, using the same tryptophan probe (Trp342) as in the binding experiments. Kinetic traces were biphasic, both for the refolding and unfolding reactions, resulting in two observed rate constants (k_obs). The observed refolding phases were independent of protein concentration at low urea concentration within a range of 0.2-5 µM, thus excluding protein aggregation events. Plotting the k_obs value against the urea concentration on a semi-logarithmic scale generates a chevron plot. Figure 3 shows the chevron plots from the experiments in 50 mM potassium phosphate, pH 7.5, at various urea concentrations, for both pwt- and cpSAP97 PDZ2. When double exponential (un)folding kinetics is observed, the folding mechanism is more complex than a two-state reaction. However, distinguishing among different reaction schemes is very difficult. The three simplest reaction schemes that result in double exponential folding kinetics are (i) a two-step folding with an on-pathway intermediate, (ii) a two-step folding with an off-pathway intermediate, and (iii) a triangular scheme with an on-pathway intermediate as well as a direct formation of the native state from the denatured state [25-28]. Haq et al. [18] suggested that the data for pwtSAP97 PDZ2 were best fitted to a two-step folding scenario with an off-pathway intermediate at 25 °C, or an on-pathway or triangular scheme at 37 °C [21]. The (un)folding rate constants for cpSAP97 PDZ2 can be fitted equally well to all three of the reaction schemes suggested above (on-pathway fit shown in Figure 3). Therefore, this data set was not sufficient to distinguish between different potential folding pathways for the cpSAP97 PDZ2.
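For orientation, the shape of a chevron plot can be sketched for the simplest possible case, a two-state reaction, in which k_obs is the sum of a refolding term that decreases and an unfolding term that increases exponentially with urea concentration. This is deliberately simpler than the biphasic kinetics described above, and all parameter values and names below are hypothetical, chosen only to draw a V-shaped curve.

```python
import math

RT = 1.987e-3 * 298.15  # kcal/mol at 25 degrees C

def chevron_k_obs(urea, kf_water, ku_water, m_f, m_u):
    """Two-state observed rate constant: sum of refolding and unfolding terms."""
    k_f = kf_water * math.exp(-m_f * urea / RT)
    k_u = ku_water * math.exp(m_u * urea / RT)
    return k_f + k_u

# Hypothetical parameters (s^-1 and kcal mol^-1 M^-1); not fitted values.
params = dict(kf_water=10.0, ku_water=0.01, m_f=0.8, m_u=0.4)

# Sample 0 to 8 M urea in 0.5 M steps and take log10 of k_obs.
curve = [(u / 2, math.log10(chevron_k_obs(u / 2, **params))) for u in range(0, 17)]
minimum = min(curve, key=lambda point: point[1])

for urea, log_k in curve:
    print(f"{urea:4.1f} M  log10(k_obs) = {log_k:6.3f}")
print("chevron minimum near", minimum[0], "M urea")
```

Deviations from this simple V shape, such as a second phase or curvature (rollover) in either arm, are what indicate additional states or changes in the rate-limiting transition state.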

The cpSAP97 PDZ2 Folds with Two Compact and Two Denatured Like Species
More information on the folding pathway of cpSAP97 PDZ2 was obtained by performing interrupted refolding and interrupted unfolding experiments. In these experiments refolding or unfolding is interrupted after various delay times and the protein is then unfolded or refolded again. This powerful technique allows detection of the populations of individual species as a function of time after mixing. We observed two kinetic phases at long delay times in the interrupted refolding experiment (thus, at equilibrium), showing that there are two distinct states at equilibrium (Figures 4A and 4D), which are denoted I and N in the scheme in Figure 5. The state that unfolds faster (I) is formed in a clear double exponential fashion. Similarly, the interrupted unfolding experiments revealed two distinct states at high urea concentration (Figure 4E), denoted D and D_cis-P in Figure 5. Here, the rapidly refolding species unfolds in a clearly double exponential way, but the rise from 0 to maximum amplitude is faster than the dead-time of the stopped-flow instrument in the sequential mix setup (the minimum delay time between the first and the second mix being in the order of 10 ms).
Figure 2. [...] Table 2 for fitted parameters. C. cpSAP97 PDZ2 binding trace at 10 µM peptide fitted to a single exponential function. Residuals are shown below the trace. D. Observed rate constants for the binding of the peptide LQRRRETQV to cp- and pwtSAP97 PDZ2 in 50 mM potassium phosphate, pH 7.5, plotted against peptide concentration. Fitting was done with the general equation for a second order bimolecular association [50]. The association rate constant, k_on (slope of the linear region of the curve), decreased from 8.7 ± 0.3 to 2.9 ± 0.02 µM^-1 s^-1 upon circular permutation, while the dissociation rate constant remained essentially the same. doi:10.1371/journal.pone.0050055.g002
Together, these experiments illustrate that at least four states are involved in the folding of cpSAP97 PDZ2. The simplest reaction scheme to describe such folding data is a square model with two more compact states (I and N) and two denatured, expanded species (D and D_cis-P). Our suggested folding model for cpSAP97 PDZ2 is shown in Figure 5. In the interrupted refolding experiment the fast phase would be represented by the transition from the denatured state D to the native state N (illustrated by the first phase in Figure 4A) but also by the transition between D and D_cis-P. Because of the low rate constants, as discussed below, we postulate that this heterogeneity in denatured states arises from a denatured state with at least one proline in cis conformation (hence D_cis-P). The slow phase in Figure 4A would then represent the transition from D_cis-P to the equilibrium intermediate I. In Figure 4C, we demonstrate that our data on cpSAP97 PDZ2 can be fitted to the square model by using the program Copasi [29], which simulates how the concentrations of the different species change with time in the folding reaction. Normal curve fitting was difficult to employ since the equation describing the square model is very complex.
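The kind of time-course simulation described above can be sketched in a few lines, assuming a four-state square scheme (D_cis-P, D, I, N) with first-order interconversions along the four edges. This is not the Copasi model used for Figure 4C: the rate constants below are hypothetical values chosen to obey detailed balance around the cycle (the fitted values are in Table S2), and only the starting fractions of D and D_cis-P (78% and 22%) are taken from the text. Integration uses a plain fourth-order Runge-Kutta step.

```python
RATES = {  # hypothetical rate constants in s^-1; chosen to obey detailed balance
    ("D", "N"): 10.0, ("N", "D"): 0.1,
    ("D", "Dcis"): 0.01, ("Dcis", "D"): 0.03,
    ("Dcis", "I"): 1.0, ("I", "Dcis"): 1.0 / 30.0,
    ("I", "N"): 0.02, ("N", "I"): 0.002,
}
STATES = ("Dcis", "D", "I", "N")

def derivatives(pop):
    """Net rate of change of each population for the square scheme."""
    flux = dict.fromkeys(STATES, 0.0)
    for (src, dst), k in RATES.items():
        flux[src] -= k * pop[src]
        flux[dst] += k * pop[src]
    return flux

def rk4_step(pop, dt):
    """Advance all populations by one fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return {s: base[s] + h * slope[s] for s in STATES}
    k1 = derivatives(pop)
    k2 = derivatives(shifted(pop, k1, dt / 2))
    k3 = derivatives(shifted(pop, k2, dt / 2))
    k4 = derivatives(shifted(pop, k3, dt))
    return {s: pop[s] + dt / 6 * (k1[s] + 2 * k2[s] + 2 * k3[s] + k4[s]) for s in STATES}

def simulate(start, total_time, dt=0.02):
    """Return the list of population dictionaries from t = 0 to total_time."""
    pop, trajectory = dict(start), [dict(start)]
    for _ in range(int(round(total_time / dt))):
        pop = rk4_step(pop, dt)
        trajectory.append(pop)
    return trajectory

start = {"Dcis": 0.22, "D": 0.78, "I": 0.0, "N": 0.0}  # fractions quoted in the text
trajectory = simulate(start, total_time=500.0)
final = trajectory[-1]
peak_i = max(p["I"] for p in trajectory)
print({s: round(final[s], 3) for s in STATES}, "peak I =", round(peak_i, 3))
```

In this toy parameter set the intermediate I is transiently overpopulated relative to its equilibrium level, because D_cis-P collapses to I faster than I converts to N; the qualitative behaviour, not the numbers, is the point of the sketch.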

Proline Isomerization is the Likely Cause of the Slow Phase
The folding of some proline-containing proteins is slowed down by proline cis-trans isomerization, which gives rise to an additional folding phase [30,31]. Some of these proteins have been reported to fold according to a square scheme [32]. The cpSAP97 PDZ2 has three prolines, located at positions 326, 343 and 405. Hence, it is possible that one of the phases in our suggested square model is a proline phase, as outlined below. From the interrupted unfolding experiments we found that the fractions of D and D_cis-P at 4 M urea, 12.5 mM HCl, 2.5 mM potassium phosphate, were 78% and 22%, respectively. These numbers were used when fitting the data from the interrupted un/refolding experiments with Copasi (Figure 4C). The observed ratio is similar to those previously reported for prolines in cis and trans position in small peptides and other proteins [33,34]. Furthermore, from our interrupted refolding experiment, the rate of interconversion between D and D_cis-P was also similar to that previously reported for proline isomerization [35]. These results, together with the proposed square folding scheme for the cpSAP97 PDZ2, suggest that proline cis-trans isomerization is the likely cause of the slow kinetic phase. While addition of Pro cis-trans isomerases has been employed to confirm Pro phases, results from these experiments may sometimes be inconclusive due to, for example, the specificity of the enzyme and isomerization of non-Pro peptide bonds [36]. Given the complexity of the observed kinetics for cpSAP97 PDZ2, we chose not to perform such experiments.

The Canonical pwtSAP97 PDZ2 also Folds According to a Square Reaction Scheme
Results from the interrupted refolding experiments for pwtSAP97 PDZ2 [21] (replotted in Figure 4B) and cpSAP97 PDZ2 initially appear to be different, because pwtSAP97 PDZ2 lacks an obvious transition corresponding to the main folding phase. However, a possible explanation is that the rates of the D_cis-P to D and D to N transitions are similar. In fact, since we argue that the transition between D and D_cis-P is a proline phase, this phase is likely to have the same rate constants independently of protein and conditions. Since the observed folding rate constant of the main phase in the conditions used in the two interrupted refolding experiments is ten-fold lower for the canonical pwtSAP97 PDZ2, it becomes similar to the rate constants for the proline isomerization. We therefore hypothesized that there is a proline phase also in the folding of pwtSAP97 PDZ2 that previously escaped detection, since an interrupted unfolding experiment was not included in the previous analysis [21]. To compare the pwt- and cpSAP97 PDZ2 folding pathways, we performed an interrupted unfolding experiment for pwtSAP97 PDZ2. This experiment clearly confirmed that pwtSAP97 PDZ2, similarly to cpSAP97 PDZ2, has two distinct states at high urea (Figure 4F). Thus, the simplest folding scheme for the pwtSAP97 PDZ2 would also be a square model (Figure 5).

The Kinetics for Direct Formation of Native from Denatured Protein is Conserved between cpSAP97 PDZ2 and Other PDZ Domains
In the kinetic folding experiments of cpSAP97 PDZ2 the main phase (with the highest amplitude) is the phase reflecting direct formation of native protein. The observed rate constant for this phase shows a refolding rollover if measured in stabilizing conditions (0.6 M sodium sulfate) and an unfolding rollover when analysed in destabilizing conditions (50 mM sodium acetate, pH 5.6) (Figure 6). Such non-linearities in chevron plots are indications of changes in rate-limiting transition states for (un)folding [28]. The degree of solvent accessible surface for these transition states is described by the β_T value, obtained by curve fitting to three-state models. The native state N has a β_T value of 1 and the denatured state D a value of 0. We have previously shown that PDZ domains fold via three conserved transition states, and hence can be fitted with three shared β_T values [22]. Our data for the cpSAP97 PDZ2 fitted well to the β_T values (0.17, 0.65 and 0.86, respectively) found in Hultqvist et al. [19], suggesting that the formation of native cpSAP97 PDZ2 follows a similar path to that of pwtSAP97 PDZ2. The data from the curve fitting can be found in Table S1.
Figure 4. [...] B. Plot of the amplitudes for the two observed rate constants in an interrupted refolding experiment for pwtSAP97 PDZ2. The data are from a previous publication [21]. C. The experimental data from Figure 4A together with simulated traces for the square model. The simulation was done using Copasi [29] and the rate constants in Table S2. The initial distribution of the D states, 72% D and 28% D_cis-P, was calculated from the ratio of D and D_cis-P at equilibrium in the interrupted unfolding experiment. The excellent fit illustrates that the square model can explain our experimental data. D. Examples of experimental traces from interrupted refolding of cpSAP97 PDZ2 after various delay times. The traces were fitted to a double exponential curve (black) with shared rate constants, and the kinetic amplitudes were plotted versus delay time (panel A). E. Plot of the amplitudes for the two observed rate constants in an interrupted unfolding experiment for cpSAP97 PDZ2. F. Plot of the amplitudes for the two observed rate constants in an interrupted unfolding experiment for pwtSAP97 PDZ2. The delay time plotted on the x-axis is the incubation time of the first mix. For example, in an interrupted refolding experiment it is the time the protein is allowed to refold before unfolding is initiated by the second mixing event. doi:10.1371/journal.pone.0050055.g004
Figure 5. Unifying folding reaction scheme for pwt- and cpSAP97 PDZ2. We have illustrated that the folding of cpSAP97 PDZ2 and its natural canonical version pwtSAP97 PDZ2 can be described by this unifying folding scheme with four states, D_cis-P, D, I and N. D_cis-P and D are denatured states, N is the native state, whereas I is a compact state with native-like burial of hydrophobic residues. The high-energy intermediates I1* and I2* seem to be a conserved feature among most PDZ domains [22,44] and give rise to the observed non-linearities for the observed (un)folding rate constants for the main phase. doi:10.1371/journal.pone.0050055.g005
The chevron plots for pwt- and cpSAP97 PDZ2 measured under identical conditions are similar, but with a general shift to lower stability of the main phase due to increased unfolding rate constants for the circularly permuted protein (Figures 3 and 6). Φ-values of 0 or 1 are not associated with the same caveats as intermediate values [37]. The change in k_u and the identical k_f therefore suggest a Φ-value for circular permutation close to 0 or, in structural terms, that the site of circular permutation has not formed native contacts in the transition state for folding of pwtSAP97 PDZ2.
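The Φ-value argument can be made concrete with a minimal sketch, assuming the standard definition Φ = ΔΔG(transition state)/ΔΔG(equilibrium), where the numerator is obtained from the ratio of folding rate constants. The stabilities (4.7 and 2.4 kcal/mol) are the equilibrium values from the text, used here as a stand-in for the stability of the main phase; the folding rate constants are placeholders that only encode the observation that k_f is unchanged, and the function name is hypothetical.

```python
import math

RT = 1.987e-3 * 298.15  # kcal/mol at 25 degrees C

def phi_value(kf_wt, kf_mut, dg_wt, dg_mut):
    """Phi = ddG(transition state) / ddG(equilibrium) for a perturbation."""
    ddg_ts = RT * math.log(kf_wt / kf_mut)  # change in folding activation energy
    ddg_eq = dg_wt - dg_mut                 # change in overall stability
    return ddg_ts / ddg_eq

# Identical folding rate constants (placeholder value of 5 s^-1 for both proteins).
phi = phi_value(kf_wt=5.0, kf_mut=5.0, dg_wt=4.7, dg_mut=2.4)

# With k_f unchanged, the whole destabilization must appear in k_u.
ku_fold_change = math.exp((4.7 - 2.4) / RT)

print("phi =", phi)
print("expected increase in k_u (fold):", round(ku_fold_change, 1))
```

A Φ-value of 0 means the perturbed region is as unstructured in the transition state as in the denatured state, which is the structural interpretation given above.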

Discussion
Folding pathways of circularly permuted proteins have been studied in a limited number of cases [2,4-9,38,39] and in only one of these has the folding pathway remained the same as for the native protein [9]. It has been argued that changes in folding pathway due to circular permutation depend on the folding nucleus; a diffuse folding nucleus covering most of the protein is less likely to change the folding pathway compared to a regional compact nucleus [9,40]. In agreement with this notion, Gianni and co-workers demonstrated that circular permutation of PTP-BL PDZ2 resulted in stabilization of an intermediate [7].
The folding mechanism of PTP-BL PDZ2 has been thoroughly investigated by Φ-value analysis and constrained molecular dynamics simulations [19], to estimate the extent of formation of native contacts in the transition state for folding. PTP-BL PDZ2 folds with an early, rather compact regional nucleus and a late, very native-like transition state [19,20]. Its early folding nucleus consists of β-strands 1, 4 and 6. For PTP-BL PDZ2, the same circular permutation was made as the one in the present study (i.e., based on the naturally occurring circularly permuted PDZ domain D1pPDZ [17]), but with a different outcome. Thus, by linking β1 and β6 in PTP-BL PDZ2, this early nucleus is stabilized, which is reflected in a higher folding rate constant but also significant stabilization of an intermediate, which is likely to be off-pathway [6]. It is believed that such intermediates are disfavoured by natural selection because of the increased risk for misfolding [10].
It was recently suggested that the relation between the position of the cleavage site and the active site in circular permutants is important for whether the folding pathways change due to the permutation [41]. The site of our permutation is one amino acid away from the GLGF site, which is conserved among all PDZ domains and involved in binding of the backbone and C-terminus of the protein ligand [42]. While our data do not directly address the effect of permutation in the binding site, we note that the folding of SAP97 PDZ2 is not affected by the circular permutation whereas its homolog PTP-BL PDZ2 displays a dramatic change in kinetic folding mechanism.
For SAP97 PDZ2, circular permutation increased the unfolding rate constant but the folding rate constant (D to N transition, Figure 5) remained unchanged. Effectively, this corresponds to a Φ-value of zero, both at the site of linkage (new turn between β1 and β6) and at the sites of the new N- and C-termini (loop between β1 and β2). In other words, these structural elements have not started to form native contacts in the rate-limiting transition state for folding. Furthermore, the rate constant for formation of the intermediate (D_cis-P to I in Figure 5) was decreased upon circular permutation, resulting in a lower maximum concentration of intermediate during the folding reaction. Thus, the result of the circular permutation is very different for the structurally very similar domains, PTP-BL PDZ2 and SAP97 PDZ2, and the basis for the difference is found in their early folding events.
Figure 6. Chevron plots of the main phases of cp- and pwtSAP97 PDZ2 under different conditions. The main phase is the k_obs value with the largest amplitude. Rollovers in the refolding and unfolding arms of the chevron plots can be detected when switching between stabilizing and destabilizing buffers, respectively. These rollovers illustrate switches between the rate-limiting transition states of the (un)folding reaction. Fitting was done using β_T values obtained from a curve fit with 6 different PDZ domains in a previous study [19], and the good fit to the data for the circular permutant illustrates that the positions of the folding transition states along the reaction coordinate are similar for all PDZ domains, including the circular permutant. See Table S1 for the best fit parameters. The 0.6 M Na2SO4 buffer also contained 50 mM potassium phosphate, pH 7.5, while the 50 mM potassium acetate buffer, pH 5.6, contained KCl to keep the ionic strength at the same value for all experiments. doi:10.1371/journal.pone.0050055.g006
To sum up, our results show that a circular permutation neither alters the structure (Figure 1) nor significantly affects the function (Figure 2) of the protein, SAP97 PDZ2. We further demonstrate that the canonical protein and the circular permutant fold via a similar mechanism (Figure 5), and that the rate of formation of the low-energy intermediate has decreased in the circular permutant. These data illustrate the general feasibility of circular permutation as a mechanism for molecular evolution and, as suggested earlier [9], show that such events are most likely to be successful in regions of the protein that are not part of a folding nucleus.

Cloning, Expression and Purification
Cloning. The cDNA for the circular permutant of human SAP97 PDZ2, residues 327-405 connected to residues 315-326 via a GSG linker (see Figure 1A), was ordered from Geneart. Two additional mutations as compared to wild type SAP97 PDZ2 were present in the circularly permuted construct: I342W, as a probe for fluorescence, and C378A, to avoid formation of disulfide bridges. Both mutations have been shown to only have minimal effects on the wild type SAP97 PDZ2 [23]. The cDNA construct was cloned into the EcoRI/BamHI sites of a modified pRSET vector (Invitrogen), which added an N-terminal MHHHHHLVPRGS tag to the expressed protein. This His tag has previously been shown not to affect the stability nor binding of PDZ domains [23,43,44]. The expressed product is hereafter referred to as cpSAP97 PDZ2. The canonical variant, pwtSAP97 PDZ2, refers to amino acids 311-407 of the same protein and with the same mutations (I342W, C378A) as used in previous studies [21,23].
Expression. The vector was transformed into Escherichia coli BL21(DE3) pLysS cells, which were grown on LB-agar plates under selection with ampicillin (100 µg/ml) and chloramphenicol (35 µg/ml) at 37 °C overnight. Colonies from the plates were transferred to liquid LB culture at 37 °C under selection with 50 µg/ml ampicillin. At an A600 of ~0.6, protein expression was induced with 1 mM isopropyl-β-D-1-thiogalactopyranoside (IPTG) and the cells were grown for 3 more hours before harvesting by centrifugation.
Purification for kinetic experiments. The cell pellet was resuspended and frozen in 50 mM potassium phosphate, pH 7.0. After thawing, the cells were disrupted by ultrasonication followed by centrifugation (35 000 g) for 1 hour. The resulting supernatant was filtered through a 0.2 µm filter and incubated with Ni-NTA agarose (Qiagen) for 30 minutes. The agarose was then washed with 50 mM potassium phosphate, 25 mM imidazole, pH 7.5, until the A280 was close to 0. The protein was eluted with 50 mM potassium phosphate, 250 mM imidazole, pH 7.5. The eluate was dialysed against 50 mM potassium phosphate, pH 7.0, loaded onto an S-column (GE Healthcare), and eluted with a 0-500 mM NaCl gradient. The resulting protein sample was concentrated and dialysed against 50 mM potassium phosphate, pH 7.5. The mass and purity of the protein were analysed by matrix-assisted laser desorption ionization time-of-flight mass spectrometry and SDS-PAGE, respectively.
Purification for crystallization. The cell pellet was thawed and resuspended in 50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.1% Triton X-100, 5 mg lysozyme, containing complete protease inhibitor (Roche, Germany). The cells were lysed by sonication and the lysate was clarified by centrifugation (23 500 g) for 45 min at 4 °C. The supernatant was filtered and loaded onto a Bio-Rad Econo-Pac gravity flow column containing Ni-Sepharose™ High Performance (GE Healthcare, Sweden) pre-equilibrated with 50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 10 mM imidazole and incubated for 1 h at 4 °C. The column was washed with 50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 40 mM imidazole and the His-tagged cpSAP97 PDZ2 was eluted with 50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 250 mM imidazole. The fractions containing His-tagged cpSAP97 PDZ2 protein were pooled and concentrated to a final volume of 5.0 ml using a Vivaspin 5 kDa cut-off concentrator (Sartorius Stedim Biotech, Germany). The protein was further purified on a HiLoad™ 16/60 Superdex™ 75 prep grade column (GE Healthcare, Sweden) using gel filtration buffer (50 mM Tris-HCl, pH 7.5, 100 mM NaCl). The peak fractions containing cpSAP97 PDZ2 protein were pooled and concentrated to 20 mg/ml using a Vivaspin 5 kDa cut-off concentrator (Sartorius Stedim Biotech, Germany).

Structure Determination
Crystallization. A single crystal of cpSAP97 PDZ2 protein grew after several weeks at 4 °C by vapour diffusion in a sitting drop composed of 1.5 µl protein (20 mg/ml His-tagged cpSAP97 PDZ2 in gel filtration buffer) and 1.5 µl reservoir solution (0.1 M MES, pH 6.0, 2.4 M ammonium sulfate) (Grid Screen Ammonium Sulfate, Hampton Research). For data collection, the crystal was cryoprotected by soaking in the reservoir solution supplemented with 20% glycerol for 1 min and flash frozen in liquid nitrogen.
Data collection and processing. X-ray diffraction data were collected on beam line ID23-2 at ESRF, Grenoble, France. Data were processed in space group C2 using XDS [45]. Initial phases were obtained by molecular replacement with the program Phaser [46] using the pwtSAP97 PDZ2 structure (PDB 2X7Z) [21] as a search model. The final complete model, obtained by iterative rounds of model building using Coot [47] and refinement using PHENIX [48], has an Rwork of 21.9% and an Rfree of 26.8%. The quality of the structure was assessed using MolProbity [49]. Refinement statistics can be found in Table 1. The refined coordinates have been deposited in the PDB with accession number 4AMH. Structure figures were prepared using PyMOL (The PyMOL Molecular Graphics System, Version 1.2r1, Schrödinger, LLC).
Circular dichroism. Far-UV circular dichroism was measured using a Jasco J-810 spectropolarimeter for wavelengths between 200 and 260 nm, with 20 µM protein in 50 mM potassium phosphate, pH 7.5.

Stability Experiments
Urea-induced equilibrium denaturation experiments were carried out with 5 µM protein in 50 mM potassium phosphate, pH 7.5, at 25 °C and varying urea concentrations. After excitation at 280 nm, the emission between 300 and 400 nm from Trp342, Tyr349 and Tyr399 was monitored with an SLM 4800 spectrofluorimeter (SLM Aminco, Urbana, IL). The curve obtained by plotting fluorescence at 350 nm against urea concentration was fitted to the standard equation for solvent denaturation [24].
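As an illustration of the kind of equation meant here, the sketch below evaluates a two-state solvent denaturation curve with linear baselines and a linear dependence of the unfolding free energy on urea concentration. It assumes that this is the form given in ref. [24]; the function name, parameter names and example values are hypothetical and are not fitted values from this study.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def two_state_signal(urea, f_native, f_denatured, m_value, midpoint,
                     temp_k=298.15, slope_native=0.0, slope_denatured=0.0):
    """Fluorescence expected for a two-state N <-> D equilibrium.

    urea      : denaturant concentration (M)
    m_value   : dependence of the unfolding free energy on urea (kcal mol^-1 M^-1)
    midpoint  : urea concentration at which half the protein is denatured (M)
    The baseline slopes are optional (assumed zero by default).
    """
    # Equilibrium constant for unfolding: K = exp(m * ([urea] - midpoint) / RT)
    k_unfold = math.exp(m_value * (urea - midpoint) / (R_KCAL * temp_k))
    native = f_native + slope_native * urea
    denatured = f_denatured + slope_denatured * urea
    return (native + denatured * k_unfold) / (1.0 + k_unfold)


def stability_in_water(m_value, midpoint):
    """Unfolding free energy extrapolated to 0 M urea (kcal mol^-1)."""
    return m_value * midpoint


# Hypothetical example values, for illustration only
curve = [two_state_signal(u, 1.0, 0.2, 1.3, 4.0) for u in range(0, 9)]
```

In a fit, the midpoint, m-value and baselines would be adjusted by a least-squares routine until the computed curve matches the fluorescence measured at 350 nm.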

Binding Experiments
Ligand-binding kinetic experiments were carried out with 1 or 3 µM protein in 50 mM potassium phosphate, pH 7.5, at 10 °C and at various peptide (LQRRRETQV) concentrations. All kinetic experiments were carried out on an SX-20MV stopped flow instrument (Applied Photophysics, Leatherhead, UK). After excitation (280 nm) of Trp342, the change in emission above 320 nm upon binding was followed using a cut-off filter. The observed rate constants, kobs, were obtained by fitting the resulting trace to a single exponential equation [23]. These rate constants were plotted against peptide concentration and the data were fitted to the general equation for a second-order bimolecular association [50] to obtain the association rate constant, kon, and the dissociation rate constant, koff. Experiments with the His-tag cleaved off by thrombin were performed to confirm that the His-tag did not affect binding.
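The dependence of the observed rate constant on peptide concentration can be sketched as follows. The expression is one commonly used general form for a reversible bimolecular association that does not require pseudo-first-order conditions; it is assumed here to correspond to the equation of ref. [50]. Function names and the numerical values are hypothetical.

```python
import math


def kobs_bimolecular(peptide, protein, k_on, k_off):
    """Observed rate constant for A + B <-> AB at arbitrary concentrations.

    peptide, protein : total concentrations after mixing (same units, e.g. uM)
    k_on             : association rate constant (uM^-1 s^-1 for uM input)
    k_off            : dissociation rate constant (s^-1)
    """
    return math.sqrt(
        k_on ** 2 * (peptide - protein) ** 2
        + k_off ** 2
        + 2.0 * k_on * k_off * (peptide + protein)
    )


def kobs_pseudo_first_order(peptide, k_on, k_off):
    """Limiting linear form, valid when peptide is in large excess."""
    return k_on * peptide + k_off


# Hypothetical rate constants: the two forms converge at high peptide excess
general = kobs_bimolecular(100.0, 1.0, 10.0, 5.0)
linear = kobs_pseudo_first_order(100.0, 10.0, 5.0)
```

The general form matters at the low end of the titration, where the peptide concentration is comparable to the 1 or 3 µM protein and the linear approximation overestimates kobs.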

Folding Experiments
Single-mixing experiments. Kinetic folding and unfolding rate constants were measured in the SX-20MV stopped flow spectrometer by 1:10 mixing of protein solution and urea-buffer solutions of varying urea concentration, at 25 °C. The final protein concentrations in these single-jump kinetic experiments were 3 µM when measured in 50 mM potassium phosphate, 1.2 µM when measured in 50 mM potassium acetate, 8.6 mM KCl, pH 5.6, and 0.6 µM when measured in 0.4 M or 0.6 M sulfate, 50 mM potassium phosphate, pH 7.5. Individual points with protein concentrations between 0.2 and 10 µM were tested to ensure that the results were not dependent on protein concentration.
For the refolding experiments the protein stock solutions were in 5.5 or 8 M urea and the corresponding final buffer. No refolding traces were measured at pH 5.6. The samples for all folding experiments were excited at 280 nm and the fluorescence emission through a 320 nm cut-off filter was recorded. The measured time courses in all of the refolding and unfolding experiments were fitted to either a single or a double exponential equation to obtain the observed rate constants. The rate constants were plotted on a semi-logarithmic plot against urea concentration to obtain a chevron plot. The logarithms of the microscopic rate constants were assumed to depend linearly on the urea concentration [24]. The chevron plot obtained in 50 mM potassium phosphate, pH 7.5, was fitted to equations for a sequential three-state mechanism with an on-pathway or off-pathway intermediate, or to a triangular scheme [21]. The main phases of the chevron plots were also analysed individually assuming a sequential pathway with two high-energy intermediates and three transition states, where the switch from TS1 to TS2 causes the refolding arm rollover and the switch from TS2 to TS3 causes the unfolding arm rollover for the cpSAP97 PDZ2. When no change in the rate-limiting step (rollover) occurred for the main phase in the chevron plot, it was fitted to a two-state equation with transition state 2 (TS2) as rate limiting. When a change in the rate-limiting step could be detected, as a rollover in one of the arms, the chevron curve was fitted to a three-state equation. All equations can be found in ref. [22].
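For the simplest case described above, a chevron without rollover, the two-state equation can be sketched as below. It assumes the usual form in which the observed rate constant is the sum of the folding and unfolding rate constants, each with a logarithm that is linear in urea concentration; the exact parameterisation used in ref. [22] may differ. All names and numbers are hypothetical, and the three-state equations are not reproduced here.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def chevron_kobs(urea, kf_water, ku_water, m_f, m_u, temp_k=298.15):
    """Two-state observed rate constant (s^-1) at a given urea concentration.

    kf_water, ku_water : folding and unfolding rate constants in 0 M urea
    m_f, m_u           : kinetic m-values (kcal mol^-1 M^-1), both positive
    """
    rt = R_KCAL * temp_k
    k_fold = kf_water * math.exp(-m_f * urea / rt)   # slows down with urea
    k_unfold = ku_water * math.exp(m_u * urea / rt)  # speeds up with urea
    return k_fold + k_unfold


def beta_tanford(m_f, m_u):
    """Relative position of the transition state along the reaction coordinate."""
    return m_f / (m_f + m_u)


def chevron_minimum(kf_water, ku_water, m_f, m_u, temp_k=298.15):
    """Urea concentration (M) at which kobs is lowest (bottom of the chevron)."""
    rt = R_KCAL * temp_k
    return rt * math.log(kf_water * m_f / (ku_water * m_u)) / (m_f + m_u)


# Hypothetical parameters, for illustration only
params = (100.0, 0.1, 0.9, 0.4)
x_min = chevron_minimum(*params)
log_kobs = [math.log10(chevron_kobs(u * 0.5, *params)) for u in range(0, 17)]
```

Plotting `log_kobs` against urea concentration gives the characteristic V shape; a rollover in either arm is the deviation from this curve that calls for the three-state treatment.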
Interrupted refolding. In interrupted refolding experiments 2.4 µM of protein in 4 M urea, 10 mM HCl (no buffer) was refolded by mixing 1:1 with 0.8 M Na2SO4, 100 mM potassium phosphate, pH 7.5. After different delay times the protein was unfolded by mixing 1:1 with 9.2 M urea, 0.4 M Na2SO4, 50 mM potassium phosphate, pH 7.5, and the resulting kinetic trace was recorded. Thus, the refolding was done in 50 mM potassium phosphate, pH 7.5, 2 M urea, 0.4 M Na2SO4, and the subsequent unfolding in the same buffer but with a final urea concentration of 6.6 M. The resulting kinetic traces could be fitted to a double exponential equation. Since all the points were measured under the same experimental conditions, differing only in delay time, the observed rate constants should be identical. Hence, in one double-jump experiment, we fitted all the obtained kinetic traces to shared rate constants to get the amplitudes at different delay times. These amplitudes were plotted against delay time and fitted to a single or double exponential equation.
Interrupted unfolding. In interrupted unfolding experiments of cpSAP97 PDZ2, 2.4 µM of protein in 5 mM potassium phosphate, pH 7.5, was unfolded by mixing 1:1 with 8 M urea, 25 mM HCl. After different delay times the protein was refolded by mixing 1:1 with 0.8 M Na2SO4, 100 mM potassium phosphate, pH 7.5, and the resulting kinetic trace was recorded. For pwtSAP97 PDZ2, 2.4 µM of protein in 2 M urea, 5 mM potassium phosphate, pH 7.5, was unfolded by mixing 1:1 with 8 M urea, 25 mM HCl. After different delay times the protein was refolded by mixing 1:1 with 100 mM potassium phosphate, pH 7.5, and the resulting kinetic trace was recorded. The resulting traces were analysed as previously described for the interrupted refolding experiments.
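The amplitude extraction can be sketched as follows, under the simplifying assumption that the shared rate constants have already been determined. With k1 and k2 fixed, a double exponential is linear in its two amplitudes and its offset, so each trace reduces to a small linear least-squares problem. This is an illustration rather than the fitting procedure actually used: a full global fit would also optimise k1 and k2 across all traces, and every name and number below is hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def amplitudes_for_trace(times, signal, k1, k2):
    """Least-squares (A1, A2, offset) for one trace at fixed shared k1, k2.

    Model: signal(t) = A1 * exp(-k1 t) + A2 * exp(-k2 t) + offset
    """
    basis = [[math.exp(-k1 * t), math.exp(-k2 * t), 1.0] for t in times]
    # Normal equations: (B^T B) p = B^T y
    btb = [[sum(row[i] * row[j] for row in basis) for j in range(3)]
           for i in range(3)]
    bty = [sum(row[i] * y for row, y in zip(basis, signal)) for i in range(3)]
    return solve_linear(btb, bty)


# Synthetic, noise-free trace with hypothetical parameters
times = [i * 0.005 for i in range(400)]
trace = [1.0 * math.exp(-50.0 * t) + 0.5 * math.exp(-2.0 * t) + 0.2
         for t in times]
a1, a2, offset = amplitudes_for_trace(times, trace, 50.0, 2.0)
```

Repeating the call for the trace recorded at each delay time yields the amplitude-versus-delay series that is then fitted to a single or double exponential.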

Supporting Information
Table S1 Best-fit folding parameters from chevron plots of the main phase of cpSAP97 PDZ2 and pwtSAP97 PDZ2 under different conditions. Fitting was done using the βT-values obtained in a previous study (ref. [22] in the paper), where six PDZ domains were found to fold via a unifying mechanism. See Fig. 6 for experimental data and fitted curves. (DOCX)

Table S2 Rate constants used in the Copasi simulation in Figure 4C of experimental data (Figure 4A) to the square model. (DOCX)

Author Contributions