Exploring the Mechanism Responsible for Cellulase Thermostability by Structure-Guided Recombination

Cellulases from Bacillus and Geobacillus bacteria are potentially useful in the biofuel and animal feed industries. One of the unique characteristics of these enzymes is that they are usually quite thermostable. We previously identified a cellulase, GsCelA, from thermophilic Geobacillus sp. 70PC53, which is much more thermostable than its Bacillus homolog, BsCel5A. Thus, these two cellulases provide a pair of structures ideal for investigating how these cellulases retain activity at high temperature. In the present study, we applied the SCHEMA non-contiguous recombination algorithm as a novel tool, which assigns protein sequences into blocks for domain swapping in a way that lessens structural disruption, to generate a set of chimeric proteins derived from the recombination of GsCelA and BsCel5A. Analyzing the activity and thermostability of this designed library, for which SCHEMA calculations require only a limited number of chimeras, revealed that one of the blocks may contribute to the higher thermostability of GsCelA. When tested against swollen Avicel, the highly thermostable chimeric cellulase C10 containing this block showed significantly higher activity (22%-43%) and higher thermostability compared to the parental enzymes. With further structural determinations and mutagenesis analyses, a 3₁₀ helix was identified as being responsible for the improved thermostability of this block. Furthermore, in the presence of ionic calcium and crown ether (CR), the chimeric C10 was found to retain 40% residual activity even after heat treatment at 90°C. Combining crystal structure determinations and structure-guided SCHEMA recombination, we have determined the mechanism responsible for the high thermostability of GsCelA, and generated a novel recombinant enzyme with significantly higher activity.

Cellulases are usually more stable than general enzymes in functioning at relatively high temperatures. Thermophilic bacteria belonging to the strains Bacillus, Geobacillus, Caldibacillus, Acidothermus, Caldocellum and Clostridium are known to produce thermostable cellulases [8,9]. Bacillus and Geobacillus strains are industrial thermophilic bacterial strains widely used in the production of value-added vitamins, enzymes and proteins [8,9]. The GsCelA enzyme considered in this study belongs to a particular group of Geobacillus. The recombinant GsCelA expressed in E. coli exhibited ten-fold greater specific activity than the commercially available endo-glucanase from Trichoderma reesei and uniquely retained its activity after long-term heating and low pH treatments [7]. The amino acid sequence of GsCelA indicates it is a member of the glycoside hydrolase GH5 family of cellulases, but shares only 53.1% similarity with other members in this group [7]. In contrast to its full-length sequence, the catalytic core of GsCelA has 60% homology with that of BsCel5A from Bacillus subtilis 168.
BsCel5A, another cellulase belonging to the GH5 enzymes, is the major endoglucanase in Bacillus. BsCel5A from different Bacillus subtilis strains have been cloned and characterized for their application in biofuel production [10,11]. BsCel5A is also a thermostable enzyme, though it is not as tolerant at high temperatures as GsCelA, retaining 70% of its optimal activity after incubation at 75°C for 30 minutes or less. A TIM-barrel (α/β) 8 catalytic domain and a β-sheet cellulose binding module (CBM3) were shown to be present in the cellulase BsCel5A [11].
In contrast to BsCel5A, the structure and mechanism of GsCelA has not been investigated. Therefore, comparisons of GsCelA and BsCel5A provide a unique opportunity for us to explore the mechanism contributing to the better stability of GsCelA at higher temperatures. Here, we have applied SCHEMA structure-guided protein recombination technology to address the mechanism by which GsCelA structure contributes to better thermostability.
The SCHEMA algorithm uses structural information to select boundary locations that minimize disruption of favorable residue-residue contacts in the resulting chimeras [12]. Non-conserved sequence elements, or sequence 'blocks', are then shuffled among homologous proteins (parental proteins) to generate functional chimeras. Because these blocks contribute to chimera stability with a high degree of additivity, stabilized chimeras can be predicted using models built by sampling a small set of chimeras [13]. In addition, the residues that contribute to stabilizing protein structure may also be identified in the process [14-16].
In this study, BsCel5A, a much less thermostable endoglucanase from Bacillus subtilis 168 [17,18], which shares 60% sequence homology with GsCelA, was selected as the second parental protein for SCHEMA recombination. As a result, a highly thermostable chimeric cellulase C10 with increased activity was developed. Through crystal structure determination of GsCelA and C10, we uncovered a 3₁₀ helix structure that is responsible for the higher thermostability of the enzyme. Moreover, we discovered that addition of crown ether (CR), a crystallization additive and a modulator of protein surface behavior [19], could further improve the thermostability of this cellulase. These newly-discovered structural features have contributed to our knowledge of the mechanisms for endowing better thermostability and activity amongst bacterial GH5 cellulases. These mechanisms may also be applied to improving the thermostability and/or activity of other proteins or enzymes.

Materials and Methods
Protein expression and purification of chimeras and mutants
Thirteen chimeric GH5 genes were optimized for expression in E. coli, and the gene sequences were synthesized by GenScript, U.S.A. The PCR fragments and synthesized gene sequences were cloned into CloneJET™ PCR cloning vector (Thermo Scientific) and subsequently subcloned into the pET21b expression vector (Novagen) using the restriction sites NdeI and XhoI. Site-directed mutagenesis and short-fragment substitutions were performed using the QuikChange Lightning Site-Directed Mutagenesis Kit (Agilent Technologies) and In-Fusion™ HD Cloning system (Clontech). The proteins were expressed in BL21(DE3) pLysS at 37°C for 4-6 hrs under the induction of 0.5 mM IPTG (isopropyl β-D-thiogalactopyranoside). Harvested cells were resuspended in binding buffer (50 mM sodium phosphate, pH 8, 300 mM NaCl) and disrupted by sonication. Lysates were centrifuged at 10,000 × g for 60 min and the supernatants were loaded onto nickel-charged 1 ml HiTrap FF crude columns (GE Healthcare) at a flow rate of 0.5 ml/min. The proteins were eluted with 150-300 mM imidazole. All chromatographic steps were performed using an AKTA FPLC system (GE Healthcare). The purified protein samples were confirmed by SDS-PAGE under denaturing conditions and their concentrations were determined by Bradford assay.

Noncontiguous SCHEMA recombination
SCHEMA is a structure-guided computational approach to creating chimeric proteins that retain proper folding and functionality, but explore other properties linked to sequence, such as stability. SCHEMA algorithms identify sites for recombining homologous proteins that minimize structural disruption by maximizing the retention of parental residue-residue contacts in their folded structures [20]. Noncontiguous recombination identifies blocks of sequence that are contiguous in the 3-D structure, but are not necessarily contiguous in the primary sequence. Contacts (residues that are < 4.5 Å apart) are identified from one or more of the crystal structures, and the SCHEMA energy E for a given chimera is calculated by counting the number of residue-residue contacts that are disrupted by recombination. Partition sites of the aligned homologous proteins are chosen to minimize the average of SCHEMA energy <E> of all possible chimeras made by recombining those sequence fragments.
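The contact-counting definition of the SCHEMA energy can be illustrated with a minimal sketch. It assumes the commonly used rule that a contact counts as disrupted when the residue pair found in the chimera occurs in none of the parents at the same two positions. The function names, the four-residue toy sequences, the contact list and the block assignment are all hypothetical and are not taken from the GsCelA/BsCel5A library.

```python
from itertools import product

def build_chimera(parents, block_of, choice):
    """Position k takes its residue from the parent chosen for the block of k."""
    return "".join(parents[choice[block_of[k]]][k] for k in range(len(block_of)))

def schema_energy(parents, contacts, block_of, choice):
    """Count contacts (i, j) whose residue pair is absent from every parent."""
    chimera = build_chimera(parents, block_of, choice)
    broken = 0
    for i, j in contacts:
        pair = (chimera[i], chimera[j])
        # The contact is retained if any parent shows the same pair at (i, j).
        if all((p[i], p[j]) != pair for p in parents):
            broken += 1
    return broken

def mutations_from_closest_parent(parents, block_of, choice):
    """m: smallest number of substitutions separating the chimera from a parent."""
    chimera = build_chimera(parents, block_of, choice)
    return min(sum(a != b for a, b in zip(chimera, p)) for p in parents)

def average_energy(parents, contacts, block_of, n_blocks):
    """<E>: mean SCHEMA energy over every chimera the block design can produce."""
    choices = list(product(range(len(parents)), repeat=n_blocks))
    total = sum(schema_energy(parents, contacts, block_of, c) for c in choices)
    return total / len(choices)

# Hypothetical toy data: two aligned parents, two blocks, two contacts.
PARENTS = ["AKLE", "ARLD"]
CONTACTS = [(0, 2), (1, 3)]   # pairs of positions closer than the distance cutoff
BLOCK_OF = [0, 0, 1, 1]       # block index of each aligned position

if __name__ == "__main__":
    print(schema_energy(PARENTS, CONTACTS, BLOCK_OF, (0, 1)))
    print(average_energy(PARENTS, CONTACTS, BLOCK_OF, n_blocks=2))
```

In this toy case the chimera built from block 0 of the first parent and block 1 of the second loses the contact between positions 1 and 3, because the pair K/D exists in neither parent. A real design would take the contact list from the crystal structures and search over block assignments to minimize <E>.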
In this study, noncontiguous SCHEMA recombination was designed as previously described [12]. The SCHEMA algorithm uses sequence alignment and structure data to create a SCHEMA contact map for proper chimera design. In the generated structures, the algorithm was set to consider any two amino acids to be in contact if any of their atoms, excluding hydrogen, are within 4.5 Å of each other. A SCHEMA contact map was first generated for each parent. During recombination, the contacts that are not conserved among the parental proteins were considered broken, and so a final 'average' contact map could be built by weighting the retention of each parental contact (0.5 for a single parent, 1 for both parents). The SCHEMA contact map can be abstracted as a graph in which every node represents a non-conserved residue and every edge represents the average weighted SCHEMA contact between two residues. The problem of finding crossover locations that minimize the number of disrupted SCHEMA contacts, and thereby yield low-disruption chimeras, can therefore be reformulated as the problem of minimizing the weight of the cut edges during graph partitioning, which was solved with the hMETIS graph partitioning suite [21,22].
Here, the amino acid sequence alignment of the parental enzymes BsCel5A [11] and GsCelA [7] was created using PROMALS3D.24. Crystal structures 3PZT [11] and 4XZB were used to create the BsCel5A and GsCelA SCHEMA contact maps, respectively. As the catalytic cores of BsCel5A and GsCelA share 58% sequence identity, an eight-block chimera design was selected with an average <E> as 16.25 and average <m> as 47 compared to the closest parent.

Measurement of thermostability
In the T50 thermostability assay, 1 μg of purified cellulase was mixed with 50 mM sodium acetate, pH 5.0, in a final reaction volume of 300 μl. The tubes were incubated in a gradient thermocycler for 10 min with or without 0.05 mM additional calcium ion (Ca²⁺) and/or 6.25×10⁻³ mM crown ether (18-crown-6, CR [19]) at a range of temperatures, and then cooled to 4°C. To each tube, 3 μg of phosphoric acid swollen cellulose (PASC) [23] was added, and the heat-treated cellulases were allowed to react at their optimal temperatures for 6 hrs in a thermal cycler. The amount of reducing sugars was measured and quantified by the DNS method [24]. The parameter T50 is the temperature at which an enzyme loses 50% of its optimal activity after a 10-min heat treatment [13].
The TA50 thermostability assay was performed in an 8-well PCR strip, into which 1 μg purified cellulase was mixed with 50 mM sodium acetate, pH 5.0, and 3 μg PASC in a final reaction volume of 300 μl. The tubes were incubated at a range of temperatures for 6 hrs in a gradient thermocycler and then cooled to 4°C. The reducing sugars were measured and quantified by the DNS method [24]. The parameter TA50 is the temperature at which a cellulase exhibits 50% of its optimal activity in a 6-hr hydrolysis assay [13].
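As an illustration of how a 50% point can be read from such a temperature series, the sketch below linearly interpolates between the two measured temperatures that bracket 50% relative activity. Linear interpolation is an assumption made here for simplicity, since the text does not state how the crossing point was extracted from the curves. The function name and the activity values are hypothetical.

```python
def half_activity_temperature(temps, rel_activity, threshold=50.0):
    """Return the temperature at which relative activity (%) falls to threshold.

    temps must be in ascending order; rel_activity holds the matching
    percentages of optimal activity. Returns None if no crossing is found.
    """
    points = list(zip(temps, rel_activity))
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        if a0 >= threshold > a1:
            # Straight-line interpolation between the two bracketing points.
            return t0 + (a0 - threshold) * (t1 - t0) / (a0 - a1)
    return None

# Hypothetical residual activities after a 10-min heat treatment.
example_temps = [60, 70, 80, 90]
example_activity = [100.0, 90.0, 40.0, 5.0]

if __name__ == "__main__":
    print(half_activity_temperature(example_temps, example_activity))
```

For the hypothetical series above, the crossing falls between 70°C and 80°C and interpolates to 78°C. The same function applies to either parameter, with residual activities after pre-incubation for T50 or activities during the 6-hr assay for TA50.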

pH tolerance measurement
To test for pH tolerance, the purified enzymes were incubated in 50 mM buffer at different pH values between 2.0 and 10.0 for 12 hrs. Residual activities were then measured with 1% PASC in 50 mM sodium acetate buffer, pH 5.0, at optimal temperatures using the DNS method [24].

Long-term cellulase activity assay
All cellulase activity measurements were conducted in 50 mM sodium acetate buffer, pH 5.0. For PASC (1% w/v) hydrolysis assays, 300 μl of reaction volume containing 1 μg of purified parental or chimeric GH5 cellulases and 0.5 μg Novo-188 (Novozyme) were constituted, and then the reaction was allowed to proceed at 50°C for 6-60 hrs. After hydrolysis, the reaction supernatants were collected, and the reducing sugar concentration was measured by the DNS method [24].

Circular dichroism measurement
Far-UV CD spectra (190-260 nm) were recorded on an AVIV model 202 CD spectrometer using a 1-mm quartz cuvette. Proteins were used at a concentration of 10 μM in 50 mM sodium phosphate buffer, pH 8. Data collection parameters were set to a scan rate of 50 nm/min, response time of 4 s, sensitivity of 100 mdeg, accumulation of 10, heating rate of 1°C/min and 60-s delay time for spectrum collection. Results were expressed as mean residue ellipticity (deg·cm²·dmol⁻¹·residue⁻¹). All thermal unfolding experiments were monitored at 222 nm.
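For readers unfamiliar with the unit, the sketch below shows the standard conversion from an observed ellipticity in millidegrees to mean residue ellipticity. The conversion itself is not spelled out in the text. The function name, the observed signal of -30 mdeg and the residue count of 300 are hypothetical; only the 10 μM concentration and the 1-mm path length come from the method described above.

```python
def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert observed ellipticity (mdeg) to deg cm^2 dmol^-1 residue^-1.

    Molar ellipticity is 100 * theta(deg) / (c * l) with c in mol/L and l in cm;
    with theta in mdeg this becomes theta / (10 * c * l). Dividing by the
    number of residues gives the per-residue value.
    """
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)

if __name__ == "__main__":
    # 10 uM protein, 0.1 cm cuvette, hypothetical -30 mdeg signal and 300 residues.
    print(mean_residue_ellipticity(-30.0, 10e-6, 0.1, 300))
```

With these hypothetical inputs the result is close to -10,000 deg·cm²·dmol⁻¹·residue⁻¹, a magnitude typical of a protein with substantial helical content at 222 nm.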

Crystallization and data collection
The crystals of GsCelA P1 and C10 were grown by mixing 1 μl protein with 1 μL reservoir solution using the sitting-drop vapor diffusion method at 18°C. P1 crystals were obtained in a reservoir solution of 16% (w/v) PEG 4000, 0.2 M imidazole malate, pH 6.0. C10 crystals were grown in a reservoir solution of 25 mM 18-crown-6, 30% (w/v) PEG 4000, 0.1 M HEPES sodium salt at pH 7.4, and 0.2 M calcium chloride. Both crystals were flash-cooled with 22% glycerol (v/v) as a cryo-protectant. Diffraction data for P1 crystals were collected at cryogenic temperatures at a wavelength of 1.5418 Å using a Rigaku FR-E+ SuperBright generator equipped with an R-AXIS HTC image-plate detector. Data for the C10 crystals were collected at a wavelength of 1.00 Å on beam line BL12B2 of the Spring-8 synchrotron in Japan using a Quantum-210 CCD detector. All diffraction data were processed and scaled using the HKL2000 program [25].

Structural determination and refinement
Both crystal structures (P1 and C10) were determined by molecular replacement using the MOLREP program of the CCP4 program suite, using the crystal structure of endo-1,4-beta-glucanase (PDB: 3PZT) from Bacillus subtilis [11] as a search model. P1 and C10 crystals belong to space groups P2₁2₁2₁ and C222₁, respectively. Throughout the refinement, 5% of randomly-selected data were set aside for cross-validation with Rfree values. Manual modifications of the models were performed using the program Coot [26]. Difference Fourier (Fo-Fc) maps were calculated to locate the solvent molecules. Both crystal structures were refined using Refmac5 [27]. Data collection and final model statistics are shown in S1 Table. The atomic coordinates and structure factors of P1 and C10 have been deposited in the Protein Data Bank with accession codes 4XZB and 4XZW, respectively. The molecular figures were produced using PyMOL (http://www.pymol.org) and UCSF Chimera [28].

Results
Library design of noncontiguous recombination between endo-β-1,4-glucanases from the thermophiles Geobacillus and Bacillus
Two bacterial GH5-family enzymes were chosen as parents for noncontiguous SCHEMA recombination: BsCel5A [11] from the thermophile Bacillus subtilis 168, and GsCelA from the highly thermophilic Geobacillus sp. 70PC53 [7] (Fig 1A). The crystal structure of the Geobacillus sp. 70PC53 GsCelA catalytic domain (GsCelA P1) was first determined at 1.62 Å resolution. GsCelA P1 crystallized in the P2₁2₁2₁ space group, with one protein molecule per asymmetric unit (S1 Table). As shown in Fig 1B, GsCelA P1 adopts a TIM-barrel fold, similar to that of the BsCel5A catalytic domain. The 60% sequence similarity and 58% identity in the catalytic domain, together with the close structural superposition, suggested that these two enzymes could be recombined to yield functional catalytic domain chimeras.
We designed a noncontiguous SCHEMA recombination library of GH5-family catalytic domains from GsCelA (P1) and BsCel5A (P2) that yielded an <E> of 16.25 and an average of 47 mutations (<m>) from the closest parent. The individual structural elements, or blocks, for this design are shown in Fig 1A. The designed chimeras were assembled from 16 gene fragments of two parents, with each containing approximately 16 non-conserved residues, representing the eight blocks from each of the parents.
Expression and functional assessment of chimeric GH5 cellulases
From a library of 2⁸ = 256 chimeric sequences, we selected a subset of eight chimeras for construction and characterization (Table 1, chimeras C1 to C8). These chimeras were chosen to maximize mutual interaction among the sequences (m values) and to minimize SCHEMA disruptions (E values), as has previously been described for library designs of chimeric argininases and cellobiohydrolases [29,30]. Therefore, chimeras with higher m and lower E values were chosen. The parents (P1 and P2) and all eight chosen chimeras (C1-C8) were expressed and purified from E. coli. We then evaluated the activity and thermostability of the eight sample chimeras and their parents based on two measurement parameters: T50 and TA50. In short, T50 describes the tolerance of an enzyme to thermal stress, and TA50 describes its ability to function at elevated temperature. The T50 of six chimeras (C1, C2, C3, C4, C6 and C7 in Table 1) suggested they were less stable than the parents. Two chimeras (C5 and C8) were more stable than one parent, but only the T50 of C8 was higher than that of both parents, including the most stable parent, P1.
Table 1 footnotes. 1: TA50 is the temperature at which an enzyme has 50% of its optimal activity. 2: T50 is the temperature at which an enzyme loses 50% of its activity after 10 min pre-incubation. 3: A re is the cellulase's specific activity at its optimal temperature measured in a 10 min assay with 1 μg purified enzyme and 1% PASC. The values are normalized relative to the specific activity of BsCel5A catalytic core. *: All of these enzymes were assayed at their optimal temperature of 60°C, except C2 and C3 (55°C) and P2, C5, C6, C7, and C12 (50°C).

Model prediction of thermostability and thermoactivity
Several reports have shown that the SCHEMA recombined contiguous blocks of sequences contribute additively to chimera stabilities and that these stabilities are predictable with simple additive block models trained on a sample set of a library [31]. Thus, our small but informative 8-chimera GH5 endoglucanase library could be used to construct a linear model assuming that individual blocks of the structure contribute additively to the stabilities of recombined enzymes. We constructed predictive models of T50 and TA50 based on the sequences of our eight functional chimeras and two parental cellulases by linear regression. The T50 model predicts the stabilities of the library sample (r² = 0.83) and provides the predicted contributions of each structural block to T50 (Fig 2A). Similarly, we plotted a model that fits the TA50 stability data (r² = 0.88). The predicted block contributions to TA50 are shown in Fig 2B. According to the two thermostability models and their block predictions, we synthesized an additional five chimeras (C9 to C13, Table 1) to better identify key elements contributing to thermostability. The C9 and C10 chimeras were predicted to be more stable than the parents according to the T50 model. Of these, C10 was significantly more stable: its T50 was enhanced by 4°C compared with P1 and 9°C compared with P2. The C9, C10, and C11 chimeras were used to assess the contributions of blocks F and E (Table 1). Following swapping between chimeras, we found that block E of P1 and block F of P2 enhanced the thermostability of C10.
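An additive block model of this kind can be sketched as an ordinary least-squares fit in which each block is encoded as 0 (taken from one parent) or 1 (taken from the other), so that each fitted coefficient is that block's contribution relative to the reference parent. The encoding, the function names, the three-block design and all stability values below are hypothetical, and the fitting software actually used is not specified in the text. The solver is written in plain Python to keep the example free of dependencies.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_block_model(chimeras, values):
    """Least-squares fit of value = intercept + sum of block effects.

    chimeras: tuples of 0/1, one entry per block (0 = reference parent).
    Returns [intercept, effect_block_1, effect_block_2, ...].
    """
    x = [[1.0] + [float(b) for b in ch] for ch in chimeras]
    p = len(x[0])
    # Normal equations: (X^T X) w = X^T y.
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(x, values)) for i in range(p)]
    return solve_linear(xtx, xty)

def predict(weights, chimera):
    """Predicted stability of a chimera from the fitted block effects."""
    return weights[0] + sum(w * b for w, b in zip(weights[1:], chimera))

# Hypothetical three-block library with noise-free stabilities (degrees C).
sample = [(0, 0, 0), (1, 1, 1), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]
measured = [70.0, 66.0, 65.0, 72.0, 69.0, 67.0]
weights = fit_block_model(sample, measured)

if __name__ == "__main__":
    print([round(w, 3) for w in weights])
    print(round(predict(weights, (0, 1, 1)), 3))
```

Because the hypothetical data were generated from an exactly additive rule, the fit recovers an intercept of 70 and block effects of -5, +2 and -1, and it predicts 71 for the unsampled chimera (0, 1, 1). With real measurements the fit is approximate, and its quality is summarized by a statistic such as r².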

Characterization of the most active and stable chimera, C10
The C10 chimera was predicted and synthesized based on the experimental thermostability results of the eight sampled active chimeras. Our results showed that C10 was more catalytically active and more stable than the parent GsCelA (P1), which was already more thermostable than P2. In addition to C10, we also chose two other chimeras (C5 and C8), both of which are less stable than P1 but more stable than P2, to characterize their optimal temperature, thermostability, pH tolerance, and long-term activity (Fig 3). C10 exhibited both broader and higher thermoactivity than P1 in the range of 50°C to 70°C (Fig 3A), with C10 retaining 70% residual activity after 10 min of 80°C heat-treatment while P1 only retained 40% residual activity ( Fig 3B). Interestingly, similar to P1, stable chimeras C8 and C10 had higher acid-stability than C5 and P2 (Fig 3C). In long-term hydrolysis of amorphous cellulose, the most stable chimera, C10, maintained activity longer than the GH5 parents (Fig 3D).

A new 3₁₀ helix contributes to thermostability of bacterial GH5 cellulases
Since block E of GsCelA had a negative value in the T50 model but a positive value in the TA50 model (Fig 2), we studied the contribution of block E to overall thermostability in the protein GsCelA. Block E of GsCelA exhibits a 3₁₀ helical structure different from the corresponding sequence in BsCel5A (Fig 1B), and it is located in the catalytic face of the TIM-barrel. As shown in Fig 4A, adding block E of P1 to P2 (see chimera C13) resulted in an increase by 7°C of TA50 but did not significantly change T50, compared to P2. To further analyze the 3₁₀ helical structure of block E, we examined the single-residue deletion mutants P1ΔP69 and P1ΔD68, in which P69 or D68 of GsCelA was deleted, respectively. Both mutants exhibited a significant drop in both TA50 and T50 compared with wild-type GsCelA (P1, Fig 4A).
Fig 3. (B) Residual activities (thermostability) of GH5 parents and chimeras after heat-treatment. One microgram of purified cellulases was pre-incubated in 50 mM sodium acetate buffer (pH 5.0) at different temperatures for 10 min. The residual enzyme activities were then determined on 1% PASC at pH 5.0 at their optimal temperature for 6 hrs. Relative activities were compared with the optimal activity of each enzyme without a pre-incubating treatment at 60°C. Some of the enzymes, e.g. C5, do not have an initial point (60°C) of 100% because their optimal activities are not at this reaction temperature; they therefore have slightly lower starting points in the analyses with a 60°C pre-incubation temperature. (C) Acid stability of the parents and chosen chimeras. One microgram of purified cellulases was pre-incubated in 50 mM buffer with variable pH for 12 hrs. The residual enzyme activity was determined. Relative activities were compared with the optimal activity of each enzyme without a pre-incubating treatment at pH 5. (D) Long-term activity of the most stable chimera C10 compared with its parents in PASC hydrolysis. One microgram of purified cellulases plus 0.5 μg Novo-188 (β-glucosidase, Novozyme) was added into 1% PASC at 50°C, pH 5.0 for 6, 12, 24, 36, 48, and 60 hrs.
As shown in Fig 5A and 5D, the determined C10 structure exhibited side-chain bonding of D68, forming a 3₁₀ helix with amino acids P69, N70 and A71. The 3₁₀ helix is inserted between strand β4 and helix α3, and is packed into the concave cavity between two loops, the β4α3 loop and the β6α4 loop. These two loops are stabilized by the hydrogen bond formed between the side chains of D68 and N70. Furthermore, the hydrophobic side chain of P69 undergoes a nonpolar interaction with residue I103, further contributing to the stabilization of the β4α3 loop. We speculate that the additional 3₁₀ helix may contribute either to increased protein rigidity or to stabilized enzyme-substrate interactions in the transition state. To test whether this was the cause of the enhanced thermoactivity, we measured the thermal denaturation of these parental and mutated proteins in the presence or absence of substrate (Fig 4B and 4C). Interestingly, when P2 and C13 were incubated with the soluble substrate carboxymethylcellulose (CMC), the C13 chimera exhibited a higher melting temperature than P2 (Fig 4C). The addition of CMC may mimic the actual interaction between enzyme and substrate in the thermal unfolding process, and C13 is further stabilized by substrate binding.

Addition of crown ether enhances thermostability
As reported previously, crown ether (CR) is a powerful crystallization additive and can be used to modulate protein surface behaviors [19]. It has also been found that CR interacts with lysines and protein hydrophobic patches. Therefore, in an attempt to further enhance the thermostability of the C10 chimera, we used CR as an additive for crystallization and in the thermostability assay.
The complex crystals of chimera C10/CR belong to the C222₁ space group with one protein molecule per asymmetric unit, plus HEPES, a calcium ion (Ca²⁺), and two CR-bound molecules. As shown in Fig 5, the structure of C10 adopts the same classical TIM barrel fold as the parental structures of GsCelA core (P1) and BsCel5A (P2) (PDB: 3PZT). Chimera C10 superimposes well on P1 and P2 (Fig 5C), yielding rmsd values of 0.486 Å for 300 atoms and 0.581 Å for 282 atoms, respectively. Interestingly, the chimera C10/CR complex structure revealed a dimeric packing mode with two CR molecules. The first CR adopts a KC-crown binding mode [19], interacting with K257 and V15 from two different chimera C10 monomers, while the second CR is in hydrophobic contact with the first CR molecule (Fig 5B). With CR molecules mediating interactions between two C10 chimera molecules, the CR-binding motif was created by the combination of V15 from block H of P1 and K257 from block D of P2. In thermostability assays with CR, P2 and C10 retained 20% residual activity after heat treatment at 90°C (Fig 6C), but the parental GsCelA exhibited no activity (Fig 6A). These data indicate that K257 on parental BsCel5A plays an important role in the interaction with CR for thermostability improvements.
In addition, a Ca²⁺ ion-binding site was identified (Fig 5B). The Ca²⁺ ion is bound to the backbone oxygen of G130 and to the side chains of D168, D170 and N171 with an average distance of 2.3 Å. Two water molecules were also coordinated with distances of 2.3 Å. The Ca²⁺ ion-binding site is similar to the manganese ion (Mn²⁺)-binding site of the BsCel5A catalytic core (P2) (Fig 5E) [11], and both Ca²⁺ and Mn²⁺ ions coordinated to the protein surface with the same octahedral configuration. The presence of the Ca²⁺ ion enhanced the thermostabilities of P1, P2, and C10 (Fig 6A and 6B). When a Ca²⁺ ion and CR were added, C10 retained 40% residual activity even after heat treatment at 90°C.
Fig 5. (A) The overall structure of C10 with HEPES, two CR molecules, and a calcium ion (ball-and-stick models). (B) Two C10 monomers (light green and light blue) from different unit cells are in contact with 2-fold symmetry. Two CR molecules interacting across different unit cells, CRa and CRb, are shown as purple and green carbon atoms, respectively. Fo-Fc omit maps (orange) were calculated for the Ca²⁺ and CR molecules at the 2σ level. The calcium ion is bound to two water molecules, the backbone oxygen of G130 and the side chains of D168, D170 and N171, with an octahedral configuration. A CR molecule (CRa) is bound to K257 and V15 from two different C10 monomers. The second CR molecule (CRb) has a hydrophobic interaction with CRa. (C) Comparison of GsCelA P1, BsCel5A P2 (PDB: 3PZT) and chimera C10 structures. The catalytic domains are superimposed and colored in green, blue and orange, respectively. The N and C termini are also indicated. (D) β4α3 loop region comparison. GsCelA P1, BsCel5A P2 and chimera C10 are colored as before. The D68 side chain atoms are hydrogen bonded to N70 and N107, while P69 has a hydrophobic interaction with I103. The chimera C10 protein residues are shown as stick models and carbon atoms are colored orange. (E) The Ca²⁺ binding sites in GsCelA P1, BsCel5A P2 and chimera C10 are superimposed. The carbon atoms of GsCelA P1, BsCel5A P2 and chimera C10 are colored in light green, light blue and orange, respectively. The Ca²⁺ atom from GsCelA P1 is depicted as a light magenta sphere.

Discussion
To determine the mechanism for thermostability of GsCelA cellulase from Geobacillus and to enhance stability improvements for bacterial GH5-family endoglucanases, we first determined the crystal structure of thermostable GsCelA and then used SCHEMA structure-guided protein recombination technology to expand the stability profiles of the GH5 enzymes using its less thermostable homolog, BsCel5A. This information enabled us to efficiently generate a more stable and active chimera, C10. Chimera C10 is more thermostable and acid-stable than its parental GH5 cellulases (Fig 3) and can maintain its higher activity for a longer time, i.e. over 60 hrs at 50°C (Fig 3D). The C10 chimera has the potential to act as a supplement or as part of a thermostable cellulase cocktail [32,33], or to be used in biomaterial conversion by Bacillus transformants [34], as it is similar to both the Bacillus and Geobacillus parental enzymes.
The determined crystal structures of GsCelA have enabled us to explore the structural mechanism of the improved thermostability. The SCHEMA block E displays a 3₁₀ helix conformation that stabilizes the substrate binding loop and by which it can increase the range of cellulase activity at higher reaction temperatures (Fig 3A) rather than retaining residual activity after heat-treatment (Fig 3B). The deletion mutants (P69 or D68) on block E (Fig 4A) also caused GsCelA to become unstable as the hydrogen bonds were broken.
Other than utilizing the SCHEMA protein engineering method, our crystallographic investigation of C10/CR prompted us to adopt the novel approach of supplementing the C10 chimera with crown ether, thereby further enhancing its stability (Fig 5). A previous report indicated that crown ethers can modulate the protein surface by interacting with surface lysine residues [19]. The tolerance of the C10 chimera to extreme heat can be enhanced (Fig 6A and 6C) by CR because lysine (K257, Fig 5) resides on the surface of P2 (SCHEMA block D, Fig 1A) instead of arginine as found in GsCelA (P1). Furthermore, we added Ca²⁺ ions together with CR to C10, and there is a degree of additivity between the metal ion and the crown ether (Fig 6A and 6D), which are located on different sites of the C10 surface (Fig 5B).
Fig 6. Purified cellulases (1 μg) were pre-incubated in 50 mM sodium acetate buffer (pH 5.0) at different temperatures for 10 min with or without additional Ca²⁺ (0.05 mM) and/or CR (6.25×10⁻³ mM). The residual enzyme activity was then determined on 1% PASC at pH 5.0 at their optimal temperature for 6 hrs. The 100% value of relative activity refers to the optimal activity of each enzyme without thermal treatment. doi:10.1371/journal.pone.0147485.g006
Based on the thermostability models established in this study, we generated a highly stable and active chimera, C10. Superposition of C10 and P1 shows additional interactions occur in block F. R22 in the β2β3 loop of C10 interacts with the backbone oxygen molecules of helix α1 (residues 6-8), and the side chain of K8 forms a hydrogen bond with the backbone oxygen of K3 (Fig 7). These interactions likely stabilize the N-terminal region. Interestingly, the eight residues of the N-terminal region are flexible in P2. After SCHEMA recombination, helix α1 of C10 becomes rigid and new interactions further stabilize this region. This observation is consistent with increased thermostability of C10 compared to C11. In summary, we have identified a novel, stabilizing loop of GsCelA and present a new strategy to maintain high enzymatic activity as well as enhance protein thermostability by additives through SCHEMA engineering.
Supporting Information
S1 Table. Data collection and refinement statistics.