Fig 1.
Misfolding mechanism of tandem domains.
The schematic shows the native-like stable intermediates populated en route to native folding (upper) or misfolding (lower), which have been used to explain single-molecule and ensemble folding kinetics [12]. The correctly folded dimer (c) is formed from the unfolded chain (a) via an intermediate (b) in which either of the domains folds natively. The misfolded dimers (e) form via initial formation of a domain-swapped “central domain” (d) formed by the central regions of the sequence, followed by a “terminal domain” formed by the terminal regions of the sequence. The blue and red dots indicate the N- and C-termini, respectively, in each case. The N- and C-terminal halves of the chain are also coloured in blue and red, respectively.
Fig 2.
Native states of the single domains.
The experimentally determined structure of a single domain of each of the proteins studied here: (a) SH3, (b) SH2, (c) TNfn3, (d) PDZ, (e) Titin I27, (f) Ubiquitin and (g) Protein G. The PDB accession codes are 1SHG, 1TZE, 1TEN, 2VWR, 1TIT, 1UBQ and 1GB1, respectively.
Fig 3.
Folded and misfolded topologies of Src SH3.
(a) Schematic of the Src SH3 fold, in which the three-dimensional β sheet structure (shown in (b)) is unrolled into two dimensions, for each domain (N-terminal and C-terminal in blue and red, respectively). The sequence positions K ∈ {0, 18, 37, 46} characterizing the possible circularly permuted “central domains” are indicated on the N-terminal domain, with K = 0 corresponding to the native fold and K > 0 indicating the approximate starting residue for the “central domain” misfold. (c), (e), (g): two-dimensional representations of the observed misfolded topologies of Src SH3. In each case, the residue K characterizing the misfold is indicated by the bullet point and the central domain is enclosed by a broken rectangle. (d), (f), (h): three-dimensional representations of the misfolds shown in (c), (e), (g), respectively.
Table 1.
Summary of misfolding statistics and central domain properties.
K labels the type of fold/misfold (see text; K = 0 is native); RCO is relative contact order [49]. ΔGf and ΔGs are the folding barrier and stability, respectively, of a single folded/misfolded domain. Population is the frequency of each state at the end of the 1024 trajectories. The maximum standard error on populations is 1.6% for a sample size of 1024. Numbers in brackets are rank correlations with folded/misfolded populations.
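The quoted maximum standard error can be checked with a short calculation. The sketch below assumes that the 1024 trajectories are independent and that each population is a binomial proportion; the caption does not state how the error was obtained, so this is only one plausible reading. The function name is illustrative.

```python
import math


def proportion_standard_error(p: float, n: int) -> float:
    """Standard error of a fraction p estimated from n independent trials.

    Assumes binomial sampling (an assumption, not stated in the caption).
    """
    return math.sqrt(p * (1.0 - p) / n)


N_TRAJ = 1024  # number of trajectories quoted in the caption

# The binomial standard error p(1-p)/n is largest at p = 0.5.
max_se = proportion_standard_error(0.5, N_TRAJ)
print(f"maximum standard error: {100 * max_se:.1f}%")  # prints 1.6%
```

Under these assumptions the bound is sqrt(0.25/1024) = 1/64, or about 1.6%, consistent with the value given above.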
Fig 4.
Comparison of domain-swapped misfolds with experimental structures.
Selected misfolded dimeric tandems obtained from the simulations (right column) are compared with corresponding experimental structures (solved by crystallography or NMR) of domain-swapped dimers involving two separate protein chains (left column). The proteins are, from top to bottom, (a),(b): SH3, (c),(d): SH2, (e),(f): TNfn3 and (g),(h): PDZ domains. The PDB accession codes are 1I07, 1FYR, 2RBL and 2OSG, respectively.
Fig 5.
Free energy profiles of WT domains and their circular permutants.
The structures of SH3 and Ubiquitin are shown in (a) and (c), with the “cut” positions K used to form each circular permutant from the WT labeled with crosses. (b) shows the free energy surfaces F(Q) of WT SH3 and its circular permutants at 300 K. (d) shows F(Q) for WT Ubiquitin and its circular permutants at 290 K. The labels K indicate the residue index of the cut position. The free energy curves of the circular permutants are shifted vertically for visual clarity, and coloured using the colours corresponding to the crosses in (a) and (c). The free energy plots of the other systems (GB1, SH2, TNfn3 and PDZ) are shown in Fig A in S1 Text.
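At the sequence level, a circular permutant with cut position K can be pictured as joining the original termini and reopening the chain at residue K. The sketch below illustrates this on a toy sequence; the 0-based indexing convention, the function name and the example sequence are all assumptions made for illustration, and any linker inserted between the original termini is ignored.

```python
def circular_permutant(sequence: str, cut: int) -> str:
    """Return the sequence reopened at index `cut` (0-based, assumed convention).

    The original N- and C-termini are treated as joined directly,
    with no linker residues (a simplification).
    """
    if not 0 <= cut < len(sequence):
        raise ValueError("cut position must lie within the sequence")
    return sequence[cut:] + sequence[:cut]


toy = "ABCDEFGHIJ"  # hypothetical 10-residue chain, not a real protein
print(circular_permutant(toy, 0))  # ABCDEFGHIJ (K = 0 leaves the WT unchanged)
print(circular_permutant(toy, 3))  # DEFGHIJABC
```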
Fig 6.
Illustration of the relation between folding funnels for native and domain-swapped domains.
(A) Example native contact map, highly coarse-grained for simplicity. (B) Map of all possible native-like contacts for a two-domain protein, showing native contacts in black and domain-swapped contacts in blue. (C) In the context of the two-domain sequence, the folding funnels for a single native domain (green broken line) and domain-swapped domain (red broken line) are interconnected, forming part of a single global funnel (black line). States are considered part of the native funnel if all contacts formed belong to the native state, and part of the domain-swap funnel if all contacts formed belong to the domain-swapped structure. Note that only a subset of possible states is shown, for clarity (e.g. other domain-swapped species are possible). Only states with a single native-like stretch of residues, whose length does not exceed that of a single folded domain, are considered. Arrows connect states differing by a single coarse-grained residue flipping between native and non-native.
Fig 7.
Transition paths for the formation of the first (folded or misfolded) domain in tandem SH3 dimers.
Folding (a) and misfolding (c) kinetics are projected along the reaction coordinate Qin, whose definition depends on K: it refers to the native fold when K = 0 and to the circularly permuted misfold when K = 18. Transition-path segments are defined as lying between QK = 0.1 and QK = 0.9. In the right panels, the same trajectories are projected onto the coordinates QK for K = 0 and K = 18 (panel (b) for the trajectories in (a) and panel (d) for those in (c)).
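The transition-path definition above can be illustrated with a small routine that scans a one-dimensional Q time series. This is a minimal sketch under assumptions: it only extracts paths in the folding direction (from Q ≤ 0.1 to Q ≥ 0.9), it takes each path to start at the last frame at or below the lower boundary, and the function name and example data are hypothetical rather than taken from the simulations.

```python
from typing import List, Sequence, Tuple


def transition_paths(
    q: Sequence[float], q_lo: float = 0.1, q_hi: float = 0.9
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices of low-to-high transition paths.

    A path starts at the last frame with q <= q_lo and ends at the first
    subsequent frame with q >= q_hi (one common convention; assumed here).
    """
    paths = []
    last_lo = None  # index of the most recent frame with q <= q_lo
    for i, value in enumerate(q):
        if value <= q_lo:
            last_lo = i
        elif value >= q_hi and last_lo is not None:
            paths.append((last_lo, i))
            last_lo = None  # require a return to low Q before the next path
    return paths


# Hypothetical trajectory with two folding events.
traj = [0.05, 0.3, 0.08, 0.4, 0.7, 0.95, 0.92, 0.5, 0.05, 0.6, 0.93]
print(transition_paths(traj))  # [(2, 5), (8, 10)]
```

Excursions that leave the low-Q state but fall back before reaching Q ≥ 0.9 (frames 0 to 2 in the example) are not counted as transition paths.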
Fig 8.
Distribution of the “folding nucleus” location from the tandem dimer simulations (Table 1).
The distributions for (a) (SH3)2, (b) (SH2)2, (c) (TNfn3)2 and (d) (PDZ)2 are extracted at two different values of Q along the folding pathway (see individual figure legends for Q values). Note that Q ∼ 0.5 corresponds to the structure with the first domain fully formed. The spread of contacts in sequence, within a given conformation, also becomes narrower with increasing Q (Fig B of S1 Text).
Fig 9.
a) Monte Carlo simulation trajectory segment for Ep = 4.0 kcal/mol. The free energy profile as a function of Qres changes as the difference in stability between the WT and the circular permutant increases; the Ep values are b) 0.0, c) 4.0 and d) 6.0, respectively. All the free energy plots are at the temperature T = 525 K.
Fig 10.
Alchemical transformation from native to circular permutant.
(a) Native structure; (b) cyclized structure; (c) circular permutant after cutting another loop.
Fig 11.
ΔΔGtot from the alchemical model vs ΔGs (Table 1).
Fig 12.
Stability of different circular permutants (CP) of Ubiquitin (a) and TNfn3 (b) with different linker lengths.
The linker sequence compositions are (GS)2-S, (GS)5, (GS)7-S and (GS)10, giving linker lengths of 5, 10, 15 and 20 residues, respectively.
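The linker lengths follow directly from expanding the repeat notation. The short check below writes out each linker explicitly, reading "(GS)n" as n repeats of Gly-Ser and "-S" as one additional serine; the dictionary name is illustrative.

```python
# Expand the linker notation from the caption into explicit sequences.
linkers = {
    "(GS)2-S": "GS" * 2 + "S",
    "(GS)5": "GS" * 5,
    "(GS)7-S": "GS" * 7 + "S",
    "(GS)10": "GS" * 10,
}

for name, seq in linkers.items():
    print(f"{name:8s} {seq:20s} length = {len(seq)}")
```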