Bacterial Inclusion Bodies Contain Amyloid-Like Structure

Protein aggregation is a process in which identical proteins self-associate into imperfectly ordered macroscopic entities. Such aggregates are generally classified as amorphous, lacking any long-range order, or highly ordered fibrils. Protein fibrils can be composed of native globular molecules, such as the hemoglobin molecules in sickle-cell fibrils, or can be reorganized β-sheet–rich aggregates, termed amyloid-like fibrils. Amyloid fibrils are associated with several pathological conditions in humans, including Alzheimer disease and diabetes type II. We studied the structure of bacterial inclusion bodies, which have been believed to belong to the amorphous class of aggregates. We demonstrate that all three in vivo-derived inclusion bodies studied are amyloid-like and comprised of amino-acid sequence-specific cross-β structure. These findings suggest that inclusion bodies are structured, that amyloid formation is an omnipresent process both in eukaryotes and prokaryotes, and that amino acid sequences evolve to avoid the amyloid conformation.


Introduction
The conversion of peptides and proteins into aggregates is associated with several dozen pathological conditions in humans, including Alzheimer disease, Parkinson disease, and diabetes type II [1][2][3], and it is also of major concern in biotechnology [4]. Numerous disease-associated aggregates are composed of elongated filaments, termed amyloid-like fibrils, often comprising intermolecular, in-register β-sheets parallel to the fibril axis [5][6][7][8][9][10][11]. Amyloid fibrils bind thioflavin T (Thio T) [12] and show birefringence upon Congo red (CR) staining, which is indicative of repetitive structure [13]. Amyloid fibrils, as highly ordered aggregates, have traditionally been distinguished from amorphous aggregates, such as protein precipitates and bacterial inclusion bodies [14,15]. The latter present widespread problems in biotechnology [16].
The deposition of protein products into electron-dense, apparently amorphous inclusion bodies occurs often during high-level expression of heterologous and some endogenous genes in Escherichia coli [17,18]. The aggregation is probably caused by the high local concentration of nascent polypeptides emerging from ribosomes [4], which are insufficiently protected from aggregation because too few chaperones are present during recombinant overexpression, or because eukaryotic chaperones are lacking [19]. Although inclusion bodies have been classified as amorphous aggregates, they are not just clusters of misfolded proteins that stick to each other through nonspecific hydrophobic contacts [14][15][16]. Rather, they are often enriched in β-sheet structure [20][21][22], bind amyloid-tropic dyes [21], have a seeding capacity reminiscent of amyloids [21], form homogeneous aggregates without cross-aggregation [23], and display aggregation propensities strongly affected by mutations [24]. Here, we studied in detail the structure of in vivo-derived bacterial inclusion bodies of three proteins and show that all three inclusion bodies studied are amyloid-like, comprising amino acid sequence-specific cross-β structure.

Results
To elucidate structural details of inclusion bodies of E. coli (Figures 1-4) and compare them with the structure of amyloid fibrils, three inclusion body-forming proteins with distinctive native folds (i.e., β-sheet, α-helix, and mixed α-helix/β-sheet, with and without disulfide bridges) were chosen to cover the fold universe to a certain degree: (1) The α-helical early secreted antigen 6-kDa protein (ESAT-6, with Protein Data Bank [pdb] accession code 1wa8; Figure 2B) from Mycobacterium tuberculosis, which folds only in complex with its protein partner CFP-10 (culture filtrate protein, 10 kDa), both in vitro and when overexpressed in E. coli [25]. (2) The mixed α-helical and β-sheet fragment comprising residues 13-74 of the secretory human bone morphogenetic protein-2 (BMP2, with pdb accession code 3bmp) (Figure 3B) [26,27]. BMP2(13-74) is unable to fold in E. coli because it lacks the essential structural segment of residues 75-114, as well as the ability to form disulfide links in the cytoplasm of E. coli. (3) The β-sheet extracellular domain (ECD) of the human membrane protein myelin oligodendrocyte glycoprotein (MOG(ECD), with pdb accession code 1pko), which has one disulfide bridge (Figure 4B) [28,29]. The inclusion body formation of MOG(ECD) is attributed to the reducing environment of the E. coli cytoplasm.

Inclusion Bodies of ESAT-6 Are Composed of Ordered β-Sheet Structure
All three proteins studied are present in the insoluble fraction of E. coli upon overexpression (Figures 2E, 3E, and 4E) and form homogeneous inclusion bodies as expected (unpublished data). The purified inclusion bodies of all three proteins bind Thio T to a similar or greater extent than aged amyloid fibrils of α-synuclein (Figure 1) and produce strong birefringence upon CR staining (Figures 1 and S1). These findings indicate that inclusion bodies of all three proteins studied comprise highly ordered structures reminiscent of amyloid-like fibrils.
We found, furthermore, that inclusion bodies of all three proteins contain sequence-specific positions of regular secondary structure, by measuring quenched hydrogen/deuterium (H/D) exchange with solution nuclear magnetic resonance (NMR) [9,30] (see also Materials and Methods). This technique allows the identification of solvent-protected backbone amide protons involved in hydrogen bonds. In the case of ESAT-6, the [15N,1H]-correlation NMR spectrum of Figure 2A (left) contains one assigned cross peak for approximately 90% of all its backbone amides, enabling a residue-specific determination of their hydrogen exchange rates in inclusion bodies. Upon exchange in D2O buffer for 311 h (Figure 2A, right), many cross peaks show a virtually complete loss of intensity, which is indicative of fast exchange. In contrast, a set of cross peaks of residues 8-25 and 36-43 is still present, which is indicative of slow exchange. The hydrogen exchange was followed over time to gain insight into conformational heterogeneity (Figure S2). Individual amides of all residues display a biphasic behavior comprising a very fast and a slow exchanging component (Figure S2). However, for most of the residues, only one of the two components, either the fast or the slow, is highly populated (Figure 2C; P > 2/3, where P is the population).
The detailed sequence-resolved analysis of the hydrogen exchange data of the major population (Figure 2C) shows that most of the residues (i.e., 2-6 and 24-95) are only weakly protected or not protected at all and are therefore considered to be conformationally disordered. This is indicated by exchange rates faster than 10¹ h⁻¹. In contrast, residues 7-23 display slow exchange rates of 10⁻³ to 10⁻⁴ h⁻¹ and are therefore considered to be involved in hydrogen bonds (Figure 2C). Because the circular dichroism (CD) spectrum is indicative of β-sheet and "random coil" conformation (Figure S3), it is likely that the solvent-protected residues 7-23 contain mainly β-sheet secondary structure. The presence of abundant β-structure is strongly supported by the fiber diffraction data of inclusion bodies of ESAT-6 (Figure 2D). The sharp reflection observed at 4.7 Å is interpreted as the spacing between strands in a β-sheet. The diffuse reflection at approximately 10 Å is interpreted as the spacing between β-sheets. These two reflections are typically observed for amyloid-like fibrils. The circular profiles of these reflections, rather than orthogonal positions for the two reflections, show that the amyloid-like entities in inclusion bodies are not strongly aligned. To confirm the potential of residues 7-23 to adopt a cross-β-sheet amyloid conformation, a synthetic peptide E20 corresponding to residues 6-25 of ESAT-6 was tested for in vitro aggregation.

Figure 1. [...] Larger images together with the corresponding pictures from the brightfield microscope are shown in Figure S1. The histogram shows the relative Thio T binding of BMP2(13-74), ESAT-6, and MOG(ECD) in comparison to aged α-synuclein (α-Syn) fibrils. The intensity of the Thio T fluorescence is shown in arbitrary units (AU).
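The biphasic exchange behavior described above can be illustrated with a small two-population model. This is only a sketch under stated assumptions, not the fitting procedure used in the study: each amide is assumed to lose signal as the sum of two single-exponential decays, and the function names and the example numbers (other than the 311 h exchange time and the P > 2/3 criterion taken from the text) are hypothetical.

```python
import math

def remaining_fraction(t_hours, p_fast, k_fast, k_slow):
    """Fraction of amide 1H signal left after t_hours in D2O.

    Assumed model: a fast-exchanging population p_fast and a
    slow-exchanging population (1 - p_fast), each decaying
    as a single exponential with rate constants in h^-1.
    """
    p_slow = 1.0 - p_fast
    return (p_fast * math.exp(-k_fast * t_hours)
            + p_slow * math.exp(-k_slow * t_hours))

def major_population(p_fast, threshold=2.0 / 3.0):
    """Classify a residue by its dominant population (P > 2/3)."""
    if p_fast > threshold:
        return "fast"
    if (1.0 - p_fast) > threshold:
        return "slow"
    return "mixed"

def half_life(k):
    """Half-life in hours for a first-order rate constant k (h^-1)."""
    return math.log(2.0) / k

# Hypothetical residue: 10% fast (10 h^-1), 90% slow (1e-3 h^-1).
signal_311h = remaining_fraction(311.0, 0.1, 10.0, 1e-3)
label = major_population(0.1)
```

Under these assumptions, a residue dominated by a slow component with a rate on the order of 10⁻³ h⁻¹ retains most of its signal after 311 h, whereas a component exchanging at 10 h⁻¹ is lost almost immediately, which is the qualitative contrast visible in the two spectra of Figure 2A.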

Author Summary
Protein aggregation is a process by which identical proteins self-associate into imperfectly ordered macroscopic entities. Such aggregates are associated with several pathological conditions in humans, including Alzheimer disease, Parkinson disease, and diabetes type II. Furthermore, protein aggregation is a major concern in the biotechnological production of recombinant proteins and the storage of proteins, and is a central mechanism of protein folding. In general, two classes of protein aggregates are distinguished: first, highly ordered aggregates, which can be composed of native globular molecules, such as the hemoglobin molecules in sickle-cell fibrils, or of molecules reorganized into β-sheet-rich aggregates, termed amyloid-like fibrils; and second, amorphous aggregates that lack any long-range order. Here, we demonstrate that bacterial inclusion bodies, which have been believed to be made up of amorphous aggregates, are in fact amyloid-like, comprising cross-β structure that is dependent on amino-acid sequence. These findings suggest that inclusion bodies are structured, that amyloid formation is a process present in both eukaryotes and prokaryotes, that amino acid sequences can evolve to avoid the amyloid conformation, and that there might be no amorphous state of a protein aggregate.
Upon incubation for 3 d at physiological conditions, E20 forms amyloid fibrils as evidenced both by electron microscopy (EM) (Figure S5) and Thio T binding (unpublished data).
To verify that the cross-β-sheet-forming segment of residues 7-23 is the dominant component in the formation of inclusion bodies of ESAT-6, the following mutagenesis experiments were carried out: Within the sequence of ESAT-6, aggregation-prone hydrophobic residues were replaced by the aggregation-interfering Arg [31]. The substitution of L36R, V54R, or I76R did not substantially alter the formation of inclusion bodies. This finding is expected because these residues are located within the disordered region of the inclusion bodies. In contrast, the substitution of F8R, I11R, I18R, or V22R within the β-sheet-forming segment abolished the formation of inclusion bodies (Figure 2E). Similarly, the deletion of the cross-β-sheet-forming segment of residues 7-23, or its replacement by the non-aggregation-prone segment of residues 25-41 of ESAT-6, which forms a helix in the soluble structure of ESAT-6, both resulted in a loss of inclusion body formation (Figure 2E). The lack of soluble expressed ESAT-6 variants comprising the latter substitutions and deletions is attributed to their fast degradation [15]. To confirm that the plasmids of the latter point mutants were functionally conserved, they were mutated back to wild type, which restored the formation of inclusion bodies upon overexpression (Figure S4). Consistent with all our findings, one of the two applied algorithms that predict aggregation-prone segments [32,33] indicates residues approximately 7-23 to be the most aggregation-prone segment of ESAT-6 (Figure 2C). In summary, residues 7-23 of ESAT-6 in bacterial inclusion bodies form a cross-β-sheet structure characteristic of amyloid-like fibrils [5,10], with the remainder of the amino acid sequence disordered.
Most residues of BMP2(13-74) (Figure 3C) and MOG(ECD) (Figure 4C) are not protected or only weakly protected and may therefore be conformationally disordered.
In contrast, residues 62-67 of BMP2(13-74) (Figure 3C) and residues 85-95, 101-108, and 111-118 of MOG(ECD) (Figure 4C) display slow exchange rates of 10⁻² to 10⁻⁴ h⁻¹ and are therefore considered to be involved in hydrogen bonds. Since the diffraction patterns of both inclusion bodies are indicative of cross-β structure (Figures 3D and 4D), it is likely that the solvent-protected segments contain mainly β-sheet secondary structure of cross-β-sheet nature. These findings are further confirmed by the amyloid-like fibril formation of the synthetic peptides B13, corresponding to residues 59-71 of BMP2(13-74), and M11 and M18, corresponding to residues 85-95 and 101-118 of MOG(ECD) (Figure S5). Consistent with these data, the combination of the applied algorithms that predict aggregation-prone segments [32,33] indicates residues approximately 62-67 and residues approximately 100-118 to be the most aggregation-prone segments of BMP2(13-74) (Figure 3C) and MOG(ECD) (Figure 4C), respectively.

Two Other Inclusion Bodies Also Contain Ordered β-Sheet Structure
A mutagenesis experiment similar to the one described above was carried out to verify that the cross-β-forming segment of residues 62-67 of BMP2(13-74) and the segments 85-95, 101-108, and 111-118 of MOG(ECD) are the dominant components in the formation of inclusion bodies (Figures 3E and 4E). As expected, the substitutions I32R and L51R of BMP2(13-74) did not alter the formation of inclusion bodies, because the substitutions are located within the disordered segment of the inclusion bodies. In contrast, the substitutions I62R, V63G, V63E, V63R, and L66R of BMP2(13-74) within the β-sheet-forming segment abolished or diminished the formation of inclusion bodies (Figure 3E). Moreover, these variants are present in the soluble fraction of the E. coli lysate upon overexpression (Figure 3E). In particular, the replacement series at position 63 indicates that all kinds of aggregation-interfering residues [31] (i.e., with or without charge, small or large side chains, positively or negatively charged) are able to perturb inclusion body formation, which results in the presence of soluble protein (Figure 3E).

Figure 2. [...] chemical-shift assignments of protein backbone amide cross peaks are indicated by a single-letter amino acid code and the corresponding residue number. (B) Ribbon representation of the 3-D structure of soluble ESAT-6 in complex with CFP-10 [25]. The green-colored segment corresponds to residues 7-23, which experience slow exchange in inclusion bodies as shown in (C). (C) Plots of the observed exchange rates k_ex/h, the relative population P(F) of the two exchange regimes observed (with P = 1 for 100% occupancy and P = 0 for 0% occupancy), and the predictions of aggregation-prone segments against the amino acid sequence of ESAT-6. The exchange rates of the major population are colored green. If the population of the minor component exceeds 1/3, the corresponding exchange rates are shown in grey. Although some of the residues 36-43 show slow exchange in the HMQC spectrum in (A), the population of their slow-exchanging component is less than 1/3 and hence not shown in (C). In the third plot of (C), labeled with an "A," predicted aggregation-prone segments of ESAT-6 are shown using two distinct algorithms: 3DPROFILE [33] in gray, and TANGO [32] (the latter is not shown, since no aggregation-prone segment was predicted). For 3DPROFILE, predictions are shown for segments having energies ≤ −23 kcal/mol. For both algorithms, outstanding relative values (≤ −23 in 3DPROFILE, and >0 in TANGO) within a segment of several amino acid residues are indicative of an aggregation-prone segment. The secondary structures of the soluble conformation shown in (B) are highlighted in red for helix and blue for β-sheet, respectively. The secondary structural elements predicted by the software Jpred [53] are highlighted by cyan arrows for β-sheet conformation and a yellow helix for helical structure, respectively. An amino-acid sequence-resolved hydrophobicity score plot calculated by the software ProtScale [54,55] is shown in the lowest panel, labeled with "H," with positive values indicative of hydrophobicity. (D) X-ray diffraction of inclusion bodies of ESAT-6. The two reflections at 4.7 Å and approximately 10 Å, consistent with cross-β-sheet structure, are labeled. (E) Mutagenesis of ESAT-6 and the influence of amino acid substitutions on the formation of inclusion bodies. Coomassie-stained SDS-polyacrylamide gels were obtained from soluble (s) and insoluble (i) fractions of lysates of E. coli cells expressing wild-type ESAT-6 (WT) or ESAT-6 variants as indicated (i.e., point mutations are indicated by a one-letter code, Δ7-23 stands for the deletion variant, which lacks the slow-exchanging residues 7-23, and "replaced" stands for the mutant in which residues 7-23 were replaced with the helical segment of residues 25-41 of ESAT-6). The 10-kDa molecular weight standard is labeled. The variants that are present in the insoluble fraction are colored blue in the amino acid sequence of (C), and the single amino acid residue variants that are absent from the insoluble fractions are colored red in the amino acid sequence of (C). doi:10.1371/journal.pbio.0060195.g002

Figure 3. [...] (C) Plots of the observed exchange rates k_ex/h, the relative population P(F) of the two exchange regimes observed (Figure S2), and the predictions of aggregation-prone segments against the amino acid sequence of BMP2(13-74). The exchange rates of the major population are colored green. If the population of the minor component exceeds 1/3, the corresponding exchange rates are shown in grey. In the third plot of (C), labeled with "A," predicted aggregation-prone segments of BMP2(13-74) are shown using two algorithms: 3DPROFILE [33] in gray and TANGO [32] in blue. Predictions of aggregation are shown for segments having energies ≤ −23 kcal/mol from 3DPROFILE, and values >0 from TANGO. The secondary structures of the soluble conformation shown in (B) are highlighted in red for helix and blue for β-sheet, respectively. The secondary structural elements predicted by the software Jpred [53] are highlighted by cyan arrows for β-sheet conformation and a yellow helix for helical structure, respectively. An amino acid sequence-resolved hydrophobicity score plot calculated by the software ProtScale [54,55] is shown at the bottom, labeled with "H," with positive values indicative of hydrophobicity. [...]
To further strengthen the finding that the hydrogen-protected residues 62-67 of BMP2(13-74) are responsible for the formation of inclusion bodies, a plasmid was constructed that codes for residues 61-70 of BMP2(13-74) fused C-terminally to the protein domain PDZ1 of SAP90, which on its own is expressed in highly soluble form in E. coli (Figure S6). Since the BMP2-tag, denoted as B10, is highly aggregation prone, the fusion protein PDZ1-B10 is present mainly as inclusion bodies upon overexpression in E. coli (Figure S6). Hence, the hydrogen-protected residues 62-67 appear to be the dominant components in the formation of inclusion bodies.
In the mutagenesis experiment of MOG(ECD), substitutions within the disordered segments did not alter the formation of inclusion bodies, as expected. However, single point substitutions within one β-sheet-forming segment are not sufficient to abolish inclusion body formation, indicating that the presence of one aggregation-prone segment of β-sheet secondary structure is sufficient for inclusion body formation (Figure 4E). Diminished inclusion body formation is observed only with the deletion variants MOG(1-73) and MOG(1-83), which lack all the hydrogen exchange-protected segments (Figure 4E).

Inclusion Bodies May Contain Amyloid-Like Protofibrils
Although the inclusion bodies studied here are amyloid-like, they do not display fibrils under EM (Figure 5A). The absence of visible fibrils may be attributed to the high protein concentration in inclusion bodies and to the abundant, stained amorphous segments of the protein (Figures 2, 3, and 4) obscuring the fibrils. An alternative explanation could be that inclusion bodies form too rapidly for fibrils to form from the more immature and flexible protofibrils that have less rigidly linear profiles (see also below). We favor this latter hypothesis for the following reasons: Varying the amount or time of staining, as well as grinding inclusion bodies or treating inclusion bodies with chemical denaturant, did not reveal reproducible and significant fibril formation under EM (unpublished data). In contrast, upon incubation of purified inclusion bodies of BMP2(13-74) at 37 °C for 12 h, fibrils appear and grow more prominent with time (Figure 5B). The qualitative analysis of EM thereby shows two scenarios: (1) the fibrils appear to grow from the inclusion bodies (Figure 5A), and (2) the inclusion bodies comprise highly dense bundles of amyloid fibrils (Figure 5B). The apparent growth of fibrils from inclusion bodies can be accelerated by higher temperature (tested up to 50 °C; unpublished data), but is not present in a sample stored for 10 mo at 4 °C (unpublished data). Furthermore, protein overexpression of BMP2(13-74) in E. coli over a period of 12 h under semi-controlled physiological conditions yields inclusion bodies with fibril-like structural entities (Figure 5C). Similarly, upon incubation of purified inclusion bodies of ESAT-6 at room temperature (~21 °C) for 14 d, amyloid-like fibrils appear under EM (Figure S7). We therefore suggest that inclusion bodies are reminiscent of amyloidogenic protofibrils.

Common Structural Properties of Inclusion Bodies
Our structural studies of three bacterial inclusion bodies reveal the presence of distinct sequence-specific cross-β-sheet structures surrounded by amorphous disordered segments. The number of cross-β segments is variable (i.e., one to three β-segments in the three proteins studied here). They are typically around seven to ten residues long, if it is assumed that the 16-residue-long solvent-protected segment of ESAT-6 inclusion bodies may be interrupted in the middle. The β-strands in inclusion bodies do not necessarily have a β-sheet conformation in the corresponding soluble fold of the protein (Figures 2B, 3B, and 4B). In particular, the segments comprising a β-sheet structure in the inclusion bodies of ESAT-6 and BMP2(13-74) are helical in the corresponding soluble folded conformations (Figures 2B and 3B). However, although the amino acid sequence composition of the various β-strands involved in aggregation is quite heterogeneous, these segments share common properties when compared with nonaggregating sequences. First, they display a low energy when complementing themselves in a pair of β-sheets, as required by the 3DPROFILE algorithm [33], and second, they have a high propensity to form a β-aggregate, as required by the TANGO algorithm [32]. Neither algorithm makes predictions in perfect agreement with the NMR results, but the agreement is good considering that the energy models of the two algorithms are entirely different (Figures 2C, 3C, and 4C).
Hence, in striking contrast to the traditional view of bacterial inclusion bodies as amorphous aggregates, they may contain ordered structural segments of cross-β structure, of which the number, length, and form of the β-sheets are determined by the particular amino acid sequence of the protein. The structures of these regions are thereby probably restricted to one of the possible topologies of cross-β structures [10]. Adjacent to the cross-β-sheet core of the inclusion body aggregates, there may be amorphously disordered segments (Figures 2, 3, and 4), or also folded domains [34,35].

Evolutionary Pressure against Amyloid Aggregates
Under the assumption that the observed amyloid-like nature of inclusion bodies (Figures 1-4 and [21]) holds for most of the hundreds of documented bacterial inclusion bodies [36], amyloid aggregation appears to be a common property of protein segments and consequently is observed in both eukaryotes and prokaryotes [2]. Within the framework of this hypothesis, it has been suggested that there must be evolved strategies against amyloid formation, which include both quality control mechanisms through molecular chaperones and sequence-based prevention of amyloid aggregation [31,32,37,38]. The suggested evolutionary pressure on the protein's amino acid sequence composition against amyloid-like aggregation is evident from our present inclusion body study. Over 80% of the amino acid sequence of ESAT-6, 90% of that of BMP2(13-74), and approximately 75% of that of MOG(ECD) are disordered in inclusion bodies (Figures 2C, 3C, and 4C) and hence not aggregation prone. This finding indicates that the majority of residues in the amino acid sequences of all three proteins studied have evolved to disfavor aggregation.
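The disordered fractions quoted above can be checked with simple arithmetic from the solvent-protected segments reported in the Results. The sketch below is illustrative only: the helper names are hypothetical, the length of ESAT-6 (95 residues) is inferred from the residue range 24-95 given in the text, and the length of the BMP2 fragment follows from its residue range 13-74. The length of MOG(ECD) is not stated in this text, so only its number of protected residues is computed.

```python
def protected_count(segments):
    """Total number of residues in inclusive (start, end) segments."""
    return sum(end - start + 1 for start, end in segments)

def disordered_fraction(first, last, segments):
    """Fraction of the residues first..last lying outside the protected segments."""
    length = last - first + 1
    return 1.0 - protected_count(segments) / length

# ESAT-6: residues 1-95 (length inferred from the text), protected 7-23.
esat6 = disordered_fraction(1, 95, [(7, 23)])

# BMP2(13-74): fragment of residues 13-74, protected 62-67.
bmp2 = disordered_fraction(13, 74, [(62, 67)])

# MOG(ECD): protected segments only; total length is not given here.
mog_protected = protected_count([(85, 95), (101, 108), (111, 118)])

print(f"ESAT-6 disordered: {esat6:.1%}")
print(f"BMP2(13-74) disordered: {bmp2:.1%}")
print(f"MOG(ECD) protected residues: {mog_protected}")
```

With these inputs, ESAT-6 comes out at roughly 82% disordered and BMP2(13-74) at roughly 90%, in line with the figures stated in the text.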

Bacterial Inclusion Bodies and Human Amyloid Diseases
The formation of amyloid-like inclusion bodies of E. coli may also provide insights into fundamental mechanisms of amyloid aggregation-related pathologies such as Alzheimer disease and prion diseases [17]. In these so-called amyloid diseases, a soluble protein or peptide alters its state via conformational intermediates into amyloid fibrils [1]. Some of the intermediates, rather than the amyloid fibrils themselves, have been proposed to be the most toxic amyloid entities [2,39]. These intermediates might include protofibrils. The data presented here and by others suggest that inclusion bodies exhibit structural and some biological properties of protofibrillar amyloid aggregates [40,41] in that inclusion bodies are amyloid-like in structure, are able to mature into amyloid-like fibrils with time (Figure 5), and show amyloid-like "cytotoxicity" when administered to a eukaryotic cell line [42]. Furthermore, we show in Figure S8 that the amyloidogenic mouse prion protein and fragments thereof form amyloid-like inclusion bodies when expressed in E. coli. Despite the similar biological and structural properties of inclusion bodies and protofibrils or fibrils associated with pathological conditions, inclusion bodies are not toxic to their host E. coli and are believed rather to be a detoxification mechanism [43]. The lack of toxicity of inclusion bodies is in agreement with the lack of toxicity of amyloid β-protein (Aβ) amyloids and other amyloid fibrils when administered to a dermal cell line [44]. These findings may support the hypothesis that the toxic species in amyloid diseases is a small oligomer rather than the amyloid fibrils [45][46][47].
Our studies suggest a tight structural link between amyloid aggregates associated with human amyloid diseases, inclusion bodies of the aggresome [48], and inclusion body formation in E. coli. In addition, these structural studies of bacterial inclusion bodies extend the possible structural landscape of every protein: each protein may exist, not only in an unfolded or folded state, but, by containing at least one amino acid segment that is capable of participating in a sequence-specific, ordered, cross-β-sheet aggregated state, may also exist in an amyloid-like aggregate. The process of protein aggregation can thus be viewed as a primitive folding mechanism, resulting in a defined, aggregated conformation with each aggregated protein having its own distinctive properties.

Materials and Methods
Cloning and expression of proteins and variants thereof. The plasmid of ESAT-6 (M. tuberculosis) was a kind gift of Dr. Jeffery S. Cox (University of California, San Francisco), the plasmid of BMP2 (human) was a kind gift of Dr. Senyon Choe (Salk Institute), and the plasmid of MOG (rat) was a kind gift of Dr. Nancy Ruddle (Yale University) and Dr. Christopher Linington (University of Aberdeen). The DNAs encoding ESAT-6 and BMP2(13-74) were cloned into the pET21a(+) vector. The DNA encoding the extracellular domain of MOG (MOG(ECD)) was cloned into the pQE-12 vector with and without a C-terminal His-tag [28]. All the experiments throughout this work were done with His-tagged MOG(ECD), since both His-tagged and His-tag-free MOG(ECD) form inclusion bodies with a very similar H/D exchange pattern (unpublished data). All site-directed mutants were generated with the appropriate nucleotide modification by PCR and sequenced for accuracy. All proteins were expressed in BL21(DE3) cells grown at 37 °C, and expression was induced at an optical density at 600 nm (OD600) = 1. [...] washed three times with the same buffer. See Figure S9, which highlights the individual steps of the purification protocol.

Figure 4. [...] [28]. The green-colored segments correspond to residues 85-95, 101-108, and 111-118, which experience slow exchange in inclusion bodies as shown in (C). (C) Plots of the observed exchange rates k_ex/h, the relative population P(F) of the two exchange regimes observed, and the predictions of aggregation-prone segments against the amino acid sequence of MOG(ECD). The exchange rates of the major population are colored green. If the population of the minor component exceeds 1/3, the corresponding exchange rates are shown in grey. Because of the size of the protein, considerable overlap is observed in the DMSO spectrum (see [A]), making the analysis of the exchange rates of some residues difficult. However, most of these overlap problems could be resolved by the assumption that sequentially neighboring residues show a similar extent of exchange. The exchange rates that have been extracted following this procedure are colored in light green. In the third plot of (C), predicted aggregation-prone segments of MOG(ECD) are shown using two algorithms: 3DPROFILE [33] in gray and TANGO [32] in blue. Predictions of aggregation are shown for segments having energies ≤ −19.5 kcal/mol from 3DPROFILE, and values >0 from TANGO. The secondary structures of the soluble conformation shown in (B) are highlighted in red for helix and blue for β-sheet, respectively. The secondary structural elements predicted by the software Jpred [53] are highlighted by cyan arrows for β-sheet conformation and a yellow helix for helical structure, respectively. An amino acid sequence-resolved hydrophobicity score plot calculated by the software ProtScale [54,55] [...]
Thioflavin T binding. The inclusion body sample was diluted in 500 μl of buffer A to a final concentration of 20 μM. The solution was mixed with 5 μl of 2 mM Thio T prepared in the same buffer. Fluorescence was measured immediately after addition of Thio T. The measurement was performed on a spectrofluorimeter (Photon Technology International) with excitation at 450 nm and emission at 485 nm. A rectangular 10-mm quartz microcuvette was used. As a reference, α-synuclein fibrils were measured as follows: a 14-μl aliquot of aged α-synuclein sample (10 mg/ml protein, PBS [pH 7.4]) was diluted in 500 μl of buffer A to a final concentration of 20 μM. The solution was mixed with 5 μl of 2 mM Thio T prepared in the same buffer.
Congo Red birefringence. The CR staining was performed using the diagnostic amyloid stain kit HT60 from Sigma. Briefly, purified inclusion bodies in buffer A were pelleted by centrifugation at 16,000g and washed once with distilled water. The inclusion bodies were then placed in 500 μl of alkaline sodium chloride solution for 20 min, with continuous vortexing. The continuous vortexing promotes uniform mixing of all inclusion body particles in solution. The mixture was then centrifuged briefly at 16,000g, and the pellets were stained with alkaline CR solution for 20 min, with continuous vortexing. The mixtures were centrifuged briefly at 16,000g, and the pellets were washed twice with 500 μl of 20% ethanol. The pellets were resuspended in PBS and then spread evenly onto glass slides and air dried at room temperature. The slides were analyzed using a microscope equipped with two polarizers and a charge-coupled device (CCD) camera.
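As a rough check of the dilutions described above, the final concentrations can be estimated with the usual C1·V1 = C2·V2 relation. The sketch below rests on assumptions that are not in the text: a molecular weight of about 14,460 g/mol for α-synuclein, and final volumes equal to the sum of the mixed volumes. The function and variable names are hypothetical.

```python
def diluted_concentration(stock_conc, stock_volume, final_volume):
    """C2 = C1 * V1 / V2; concentration units carry through unchanged."""
    return stock_conc * stock_volume / final_volume

# Assumption: alpha-synuclein molecular weight of ~14,460 g/mol.
ALPHA_SYN_MW = 14460.0

# 10 mg/ml equals 10 g/l; divide by MW for mol/l, then convert to uM.
stock_uM = 10.0 / ALPHA_SYN_MW * 1e6

# 14 ul of stock added to 500 ul of buffer (assumed total: 514 ul).
asyn_uM = diluted_concentration(stock_uM, 14.0, 514.0)

# 5 ul of 2 mM (2000 uM) Thio T added to 500 ul of sample (assumed total: 505 ul).
thio_t_uM = diluted_concentration(2000.0, 5.0, 505.0)

print(f"alpha-synuclein: about {asyn_uM:.1f} uM")
print(f"Thio T: about {thio_t_uM:.1f} uM")
```

Under these assumptions both the reference protein and the dye end up close to the nominal 20 μM, so protein and Thio T are present at approximately equimolar concentrations in the measurement.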
Sequential NMR assignment. Nearly complete sequence-specific assignments of the backbone HN/N cross peaks in the [15N,1H]-HMQC spectra were obtained for all three proteins using the triple-resonance experiments HNCACB [49] and (H)N(COCA)NH [50] applied to 13 [...]
Residues that display high intrinsic exchange rates in DMSO were determined by the addition of H2O followed by the measurement of a series of 2-D spectra. Using this control measurement, some residues were excluded from the H/D-exchange data analysis. To confirm that the inclusion bodies preserved their structural properties during the H/D-exchange measurement, EM and Thio T binding studies were performed at day 0 and day 15 (Figures 5, S7, and S10). In addition, an H/D-exchange experiment with an exchange time of 1 h was measured on 15-d-old inclusion bodies of ESAT-6, which showed no qualitative difference when compared with freshly prepared inclusion bodies thereof (unpublished data). Since EM, Thio T, and H/D-exchange measurements of fresh (0 day) and 2-wk-old inclusion bodies show close resemblance, their integrity over the entire H/D-exchange experiment is assumed. The spectra were analyzed using the programs PROSA [51] and CARA [52]. At least two independent experiments were carried out for each protein system.
X-ray diffraction. Pelleted inclusion bodies were resuspended in a minimal amount of water. Then 5 μl of the suspension was pipetted between two fire-polished glass rods and left to dry for 2-3 d. The dried material was placed in an X-ray beam at room temperature for a 5-min exposure. A rotating anode generator (Rigaku FR-E) and imaging plate detector (R-AXIS IV++) were used for the data collection. Two independent experiments were carried out for each protein system.
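For orientation, the cross-β spacings of 4.7 Å and approximately 10 Å reported in the Results can be converted into scattering angles with Bragg's law, λ = 2d·sin θ. This is a sketch under an assumption not stated in the text: that the rotating anode delivers Cu Kα radiation with a wavelength of about 1.5418 Å. The function names and the detector distance in the usage line are hypothetical.

```python
import math

# Assumption: Cu K-alpha radiation; the wavelength is not given in the text.
CU_K_ALPHA_ANGSTROM = 1.5418

def bragg_two_theta(d_spacing, wavelength=CU_K_ALPHA_ANGSTROM):
    """Scattering angle 2*theta in degrees for a d-spacing in angstroms."""
    sin_theta = wavelength / (2.0 * d_spacing)
    if sin_theta > 1.0:
        raise ValueError("d-spacing too small for this wavelength")
    return 2.0 * math.degrees(math.asin(sin_theta))

def ring_radius_mm(d_spacing, detector_distance_mm,
                   wavelength=CU_K_ALPHA_ANGSTROM):
    """Radius of the diffraction ring on a flat detector normal to the beam."""
    two_theta = math.radians(bragg_two_theta(d_spacing, wavelength))
    return detector_distance_mm * math.tan(two_theta)

inter_strand = bragg_two_theta(4.7)    # spacing between strands in a sheet
inter_sheet = bragg_two_theta(10.0)    # spacing between sheets

# Hypothetical detector distance of 100 mm, for illustration only.
radius_4_7 = ring_radius_mm(4.7, 100.0)
```

Under this assumption, the inter-strand reflection appears at a scattering angle of roughly 19° and the inter-sheet reflection at roughly 9°, so the sharp 4.7 Å ring lies outside the diffuse 10 Å ring on the detector.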
SDS-PAGE. After cell lysis and centrifugation, soluble and insoluble fractions were separated. The insoluble fraction was resuspended in buffer A to the same volume as the soluble fraction. To compare the protein amount in each fraction, 5 µl of solution was taken from either the soluble or the resuspended insoluble fraction and mixed with 10 µl of 10 M urea and 5 µl of NuPAGE LDS sample buffer (Invitrogen). Each sample was boiled for 3 min before being loaded onto a NuPAGE 12% Bis-Tris gel (Invitrogen). Proteins were visualized by Coomassie Blue staining (0.1% Coomassie Blue, 10% acetic acid, 40% methanol). At least two independent experiments were carried out for each protein system.
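As a quick check of the sample composition described above, the following minimal Python sketch computes the final urea concentration and the dilution of the protein fraction in the loaded sample. The volumes and the 10 M urea stock come from the text; the function and variable names are illustrative assumptions, and the calculation assumes the volumes are simply additive.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Concentration of a stock component after dilution into total_volume."""
    return stock_conc * stock_volume / total_volume

# Volumes in microliters, as stated in the SDS-PAGE protocol.
sample_ul = 5.0   # soluble or resuspended insoluble fraction
urea_ul = 10.0    # 10 M urea stock
lds_ul = 5.0      # NuPAGE LDS sample buffer

# Assumption: volumes are additive on mixing.
total_ul = sample_ul + urea_ul + lds_ul

urea_final_molar = final_concentration(10.0, urea_ul, total_ul)
sample_dilution = total_ul / sample_ul  # fold dilution of the protein fraction

print(f"total volume: {total_ul} ul")
print(f"final urea: {urea_final_molar} M")
print(f"sample dilution: {sample_dilution}-fold")
```

Because both fractions are treated identically, the dilution factor cancels when band intensities of the soluble and insoluble fractions are compared.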
To verify the identity of the bands on the SDS gel that were present only upon overexpression of wild-type or mutant BMP2 fragments, the bands of interest were excised and digested with trypsin. After digestion, peptides were extracted and analyzed by matrix-assisted laser desorption/ionization (MALDI)-mass spectrometry (MS). Mass spectra were measured on an Applied Biosystems Voyager DE-STR instrument in linear mode (expected mass accuracy 0.5% or better). Alpha-cyano-hydroxybenzoic acid was used as the matrix. An average of 100 spectra were summed for each measurement. For both sets of bands, from the inclusion body samples and from the soluble samples, masses corresponding to the expected size of the wild-type and mutant BMP2(13-74) were observed. Observation of these masses not only verified the identity of the bands but also confirmed the presence of the mutations.
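The stated mass accuracy of 0.5% can be turned into a simple acceptance window for matching observed to expected masses. The sketch below is illustrative only: the function name and the example masses are hypothetical and are not values from this study; only the 0.5% tolerance is taken from the text.

```python
def within_mass_tolerance(observed_da, expected_da, tolerance_fraction=0.005):
    """True if the observed mass lies within the fractional tolerance
    (default 0.5%, the accuracy quoted for linear mode) of the expected mass."""
    return abs(observed_da - expected_da) <= tolerance_fraction * expected_da

# Hypothetical example masses in daltons (not measured data).
expected_mass = 7000.0
window = 0.005 * expected_mass  # half-width of the acceptance window

close_match = within_mass_tolerance(7020.0, expected_mass)
poor_match = within_mass_tolerance(7040.0, expected_mass)

print(f"acceptance window: +/- {window:.1f} Da")
print(f"7020.0 Da accepted: {close_match}")
print(f"7040.0 Da accepted: {poor_match}")
```

Under these assumptions, a mutation is only distinguishable from wild-type by mass when the shift it introduces exceeds the acceptance window at that mass.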
Circular dichroism. The inclusion body preparation of ESAT-6 was diluted into buffer A to a final concentration of 20 µM. The CD spectrum was recorded on a CD spectrophotometer (BioLogic) from 200 nm to 260 nm using a 1-mm-thick quartz microcuvette.
Electron microscopy. The inclusion body or peptide sample was diluted into buffer A to a final concentration of 50 µM, spotted on a glow-discharged, carbon-coated Formvar grid (Electron Microscopy Sciences), incubated for 5 min, washed with distilled water, and then stained with 1% (w/v) aqueous uranyl formate solution. Uranyl formate solutions were filtered through 0.2-µm sterile syringe filters (Corning) before use. EM analysis was performed using a JEOL JEM-100CXII electron microscope at 80 kV with a nominal magnification of 48,000×. Images were recorded digitally using the SIS Megaview III imaging system. At least two independent experiments were carried out for each sample.
Peptide fibril formation. Peptides were synthesized either by the Salk Institute (E20) or by the Tufts University peptide facility (B13, M11, and M18) using solid-phase peptide synthesis. Solid peptides were dissolved in buffer A to a final concentration of 500 µM, and the pH was adjusted to 7.5. The peptide solution was centrifuged at 13,000g for 10 min, and the supernatant was incubated at 37 °C with stirring for 3 d before observation. Fibril formation was monitored by EM and Thio T binding.
Smooth solid lines represent the monoexponential fits of the raw data after an initial drop of the intensities within the first measurement points. Data for the residues S77 (yellow), K57 (green), A9 (red), and T23 (cyan) of ESAT-6 are shown. In addition, the exchange data for residues F41 (yellow), V21 (green), V67 (red), and I62 (cyan) of BMP2(13-74) are shown. Furthermore, A24 (yellow), G42 (green), A86 (red), and F104 (cyan) of MOG(ECD) are shown. All exchange curves show biphasic behavior: after an initial drop of the intensities within the first measurement points, a slow exponential decay is observed. The relative population P(F) of the two exchange species is residue-dependent and listed in Figures 2C, 3C, and 4C. The initial drop is attributed to fast H/D exchange reminiscent of an amorphous structure, whereas the slow exponential decay is attributed to slow H/D exchange reminiscent of a homogeneous conformational species with hydrogen bond formation. Found at doi:10.1371/journal.pbio.0060195.sg002 (1.14 MB TIF).
Coomassie-stained SDS-polyacrylamide gels were obtained from soluble (s) and insoluble (i) fractions of lysates of E. coli cells expressing wild-type ESAT-6 constructs that have been back-mutated from the variants F8R, I11R, I18R, and V22R to wild-type. Since all back-mutated plasmid constructs show wild-type-like inclusion body formation, the plasmids of F8R, I11R, I18R, and V22R used in Figure 2E are assumed to be functional. Hence, these controls show that the amino acid substitutions F8R, I11R, I18R, and V22R abolish inclusion body formation. Found at doi:10.1371/journal.pbio.0060195.sg004 (125 KB TIF).
Funding. RR is a Pew scholar and DE is an Investigator of the Howard Hughes Medical Institute. This research was supported in part by grants from the National Institutes of Health and the Swiss National Science Foundation.
Competing interests. The authors have declared that no competing interests exist.
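The biphasic H/D-exchange behavior described in the supporting figure caption above (an initial intensity drop followed by a slow monoexponential decay) can be illustrated with a small fitting sketch. This is not the analysis pipeline used in the study, which relied on PROSA and CARA; it is a minimal, standard-library-only example under several assumptions: the slow phase decays to a zero baseline, intensities are positive and normalized to the first point, the first `skip` points are treated as the fast phase, and the data, time units, and names (`fit_slow_phase`, `slow_fraction`) are all hypothetical.

```python
import math

def fit_slow_phase(times, intensities, skip=2):
    """Log-linear least-squares fit of I(t) = A * exp(-k * t) to the points
    after the first `skip` measurements (assumed to contain the fast drop).
    Returns (A, k). Assumes positive intensities and a zero baseline."""
    t = times[skip:]
    y = [math.log(v) for v in intensities[skip:]]
    n = len(t)
    if n < 2:
        raise ValueError("need at least two points in the slow phase")
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), -slope

# Synthetic, noise-free data (arbitrary time units): a fast drop in the
# first two points, then a slow decay with amplitude 0.6 and rate 0.05.
times = [0.0, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
true_amplitude, true_rate = 0.6, 0.05
intensities = [1.0, 0.75] + [
    true_amplitude * math.exp(-true_rate * t) for t in times[2:]
]

amplitude, rate = fit_slow_phase(times, intensities, skip=2)

# Illustrative estimate of the slowly exchanging fraction: the slow-phase
# amplitude extrapolated to t = 0, relative to the initial intensity.
slow_fraction = amplitude / intensities[0]

print(f"slow-phase amplitude: {amplitude:.3f}")
print(f"slow-phase rate: {rate:.3f} per time unit")
print(f"slowly exchanging fraction: {slow_fraction:.3f}")
```

With real, noisy data a nonlinear least-squares fit that includes a baseline term would usually be preferable to the log-linear shortcut used here, which weights low-intensity points more heavily.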