Engineering Proteins for Thermostability with iRDP Web Server

Engineering protein molecules with desired structure and biological functions has been an elusive goal. Development of industrially viable proteins with improved properties such as stability, catalytic activity and altered specificity by modifying the structure of an existing protein has widely been targeted through rational protein engineering. Although a range of factors contributing to thermal stability have been identified and widely researched, the in silico implementation of these as strategies directed towards enhancement of protein stability has not yet been explored extensively. A wide range of structural analysis tools is currently available for in silico protein engineering. However these tools concentrate on only a limited number of factors or individual protein structures, resulting in cumbersome and time-consuming analysis. The iRDP web server presented here provides a unified platform comprising of iCAPS, iStability and iMutants modules. Each module addresses different facets of effective rational engineering of proteins aiming towards enhanced stability. While iCAPS aids in selection of target protein based on factors contributing to structural stability, iStability uniquely offers in silico implementation of known thermostabilization strategies in proteins for identification and stability prediction of potential stabilizing mutation sites. iMutants aims to assess mutants based on changes in local interaction network and degree of residue conservation at the mutation sites. Each module was validated using an extensively diverse dataset. The server is freely accessible at http://irdp.ncl.res.in and has no login requirements.


Introduction
Thermophiles and hyperthermophiles are organisms that grow at extreme temperatures.Enzymes from these organisms are inherently stable and active at high temperatures, offering a major industrial advantage over their mesophilic homologues with respect to their storage, resistance against chemical denaturants and risk of microbial contaminations.Thermal stability is an important parameter that determines economic feasibility of applying an enzyme in any industrial process.Understanding the molecular determinants of thermostability can not only provide useful insights into evolution of such enzymes but the application of these through rational protein engineering to existing mesophilic proteins can also lead to development of more efficient and thermally stable biocatalysts for various industrial applications [1].

Molecular determinants of protein thermostability
Studies have revealed several trends of residue preference towards thermostabilization of thermophilic proteins, such as lower content of uncharged polar residues, preference of arginine over lysine residues and higher charged residue contents [2].Shortening of loop regions is a known mechanism of protein thermostabilization in hyperthermophilic proteins [3][4][5].Various non-covalent interactions are known to play a vital role in thermostabilization of proteins [6][7][8].Disulfide bridges are covalent interactions, which are known to provide stability to protein by entropic effect [9][10][11].The presence of thermolabile residues and bonds involving asparagine and glutamine are known to introduce instability to the protein backbone by undergoing deamidation at elevated temperatures [12].Residues in left-handed helical conformation upon mutation to glycine are known to contribute favorably to protein thermal stability [13,14].Marshall et al., 2002 have studied the interactions of α-helix dipole with side chains of sequentially charged residues and found it to contribute favorably to stability [15,16].Proline residues being conformationally most rigid are thought to provide stability to proteins by entropic effects [17,18].The hydrophobic effect is understood to be one of the primary driving forces of protein folding [19].Decrease in hydrophobic surface area, as a stabilization mechanism has been studied in superoxide dismutase from S. acidocaldarius [20].Bound metal is also vital for stability and functioning of many proteins [21][22][23].

Target identification through comparative structural analysis
The rapid addition of protein structures to the PDB [24], has made manual analysis of combination of such a large number of factors extremely time-consuming and sometimes errorprone.Although a variety of computational tools such as WHAT IF [25], PIC [26] and Capture [27] are available for structural analysis, most of these are limited by their ability to analyse only a single structure at a time (S1 Table ).These tools primarily focus on analysis of non-covalent interactions as stabilizing mechanisms ignoring most molecular determinants listed above.However, most protein engineering studies necessitate simultaneous analysis of several structural mechanisms amongst a vast set of protein structures for improved selection of target protein and potential mutation sites (Fig 1).In view of this, the iCAPS (in silico Comparative Analysis of Protein Structures) module was developed to simplify the comparison process for a large number of protein structures in terms of the features listed above known to affect protein stability (Fig 2).iCAPS aims to help the user compare a series of proteins in order to select a target protein most suitable for initiation of protein engineering studies.

Identification of potential sites for structural stabilization
Once a target protein is selected for engineering, the next task is to identify potential target mutation sites.The Suzuki group has conclusively proved entropic stabilization by proline insertion as a protein thermostabilization mechanism through their work on oligo 1,6 glucosidase from Bacillus cereus.Through this work, they were able to identify that proline insertion at second position of β-turns and N-cap of helix enhanced thermostability of the protein.This mechanism has been defined as the "The Proline Rule" [28,29].Loop stabilization by proline has been successfully implemented for thermostabilization of proteins such as cold shock protein, ubiquitin, ribonuclease Sa2 and guanyl specific ribonuclease Sa3, bacteriophage T4 lysozyme and human lysozyme [30][31][32].Proline insertion at helix N-cap has also been used to enhance the stability of proteins like alcohol dehydrogenase, α-parvalbumin and triosephosphate isomerase [18,33,34].Studies on ribonuclease HI show that release of conformational strain caused due to left-handed helical residue Lys95 upon mutation to Gly, results in considerable increase in thermostability of the protein [13].This strategy has been used to improve the stability of proteins such as Drosophila adapter protein Drk, barnase, lysozyme and Pin1 WW [35][36][37][38][39].In their classic work on bacteriophage T4 lysozyme, Matsumura et al., 1989, have not only elucidated the role of disulfide bridges in protein stability but have also shown that the effect of introduction of a combination of disulfide bridges on protein stability is additive in nature [9].This mechanism has been used successfully for improving the stability of proteins like T4 lysozyme [40], subtilisin BPN [41], xylanase [42], lipase [43], lipase B [44] and glucose 1-dehydrogenase [45].Currently computational tools such as CUPSAT [46], SDM [47], PopMusic [48], FoldX [49] and Rosetta Design [50] are available for predicting the effect of mutation on protein stability.However these tools require the user to input the mutations and do not suggest potential stabilizing mutations using specific strategies.Hence the iStability (in silico Analysis of Stability Change in Protein Structures) module was developed which not only aids in identification of stabilizing mutation sites through the application of protein design strategies described above for improvement of thermal stability but also assesses the stability of any mutant (Fig 2).

Evaluating potential thermostabilization sites using molecular interactions
Once a stabilizing mutation site is selected, it is important to evaluate the mutation in terms of its effects on neighboring residues, which is vital to any protein engineering experiment.Serrano et al., 1992, in their work with barnase enzyme have revealed that loss of buried salt bridges and hydrogen bonds due to mutations affects protein stability significantly [37].Studies carried out on the arc repressor protein of bacteriophage P22 have shown deleterious effects of mutations on protein stability due to disruptions in hydrogen bonds and salt bridges [51].Computational tools such as CUPSAT, SDM, PopMusic and Rosetta Design (S2 Table ) are mainly of predictive nature.Although, users are usually informed of the effects of mutations on protein stability by these tools in form of stability scores, underlying details of interaction rearrangements at the mutation site are currently not provided.Along with the stability scores, the information regarding the change in interactions could provide a better evaluation of the mutations being considered.Understanding this, we have developed a mutation evaluation tool called iMutants (in silico Comparative Analysis of Interactions in Protein Mutants), which assesses the change in local interactions at mutation sites through comparison of wild-type and mutant proteins (Fig 2 ).Evolutionary conservation analysis is crucial to successful protein design since highly conserved positions typically play important structural or functional roles, i.e. mutation could adversely affect protein function [52].Therefore iMutants also evaluates the conserved nature of the mutation sites.
The three modules, iCAPS, iStability and iMutants, addressing these aspects of protein engineering problems, are integrated on a single platform in the form of iRDP (in silico Rational Designing of Proteins) web server presented in this work.

Material and Methods
The iRDP server is built on a Linux platform using R, Perl, HTML and PHP.The Bio3d [53] and iGraph [54] packages form the core of all iRDP modules.The vast analysis carried out by modules of iRDP server uses, both in-house developed scripts and established tools (S3 Table ).Described below is the detailed workflow of each module in iRDP web server (S1 Fig).

In silico Comparative Analysis of Protein Structures (iCAPS)
Multiple structures serve as input to iCAPS.Input can be a list of PDB entries separated by commas, or files in valid PDB format can be uploaded.The user can select the structural features to be analyzed, modify cutoff values for calculation of non-covalent interactions and relative accessible surface area (ASA) before submitting the job.The relative ASA threshold is used to decide whether a residue is buried or exposed to solvent.The results page contains a unique job identification number for each job being submitted.Users can bookmark this page and return to view and retrieve the results later.
Analysis begins with primary structural features like amino acid composition, secondary structure content, information such as helix/strand/turn/coil composition and then proceeds to calculation of non-covalent interactions (Fig 2).The program DSSP [55] is used to detect secondary structures in the input proteins while NACCESS is used for estimation of residue solvent accessibility and accessible surface areas (ASA) [56].The non-covalent interactions and disulfide bonds are identified using the standard criteria reported in literature (S4 Table ).Users are provided with options to change the criteria of interaction calculations.In-house scripts are written for estimation of parameters such as proline residue distribution profile, thermolabile bond profile and helix dipole stabilization profile.Conformationally strained residues are identified by using Procheck [57] while β-turn and N-cap proline residues are identified using Promotif [58] and DSSP respectively.The program FindGeo has been employed for analysis of metal binding sites and geometry [59].Gibbs free energy of unfolding is calculated using FoldX [49].

In silico Analysis of Stability Change in Protein Structures (iStability)
The input to this module consists of providing a PDB code or uploading a valid PDB formatted file.The user must select any of the pre-defined protein design strategies (Fig 2 ) or provide their own mutations in the specified format.Once the design strategy is selected, the user has the choice of trying out four different stability prediction tools (FoldX [49], Auto-Mute [61], I-Mutant [62] and MUpro [63]), which are based on empirical potential energy functions or machine learning methods with provision to modify input.In case of FoldX, mutations are carried out and the total energy difference between the mutant and the corresponding wild-type is estimated.Positive values of energy difference reflects less stability.Auto-mute uses machine learning approach to carry out computational mutagenesis.A 3D-1D profile is generated for both the mutant and wild-type, which is further used in the generation of a vector difference profile.Here the environmental change (EC) scores quantify the differences generated for the mutant in terms of confidence in stability classification.I-Mutant implements support vector machines based on potential energy function to determine the stability.The ΔΔG value calculated uses the difference of Gibbs free energy (ΔG) of the mutant and wild-type.If the ΔΔG value is < 0 kcal/mol, the mutant is considered to be less stable than wild-type, while if the ΔΔG value is > 0 kcal/mol, the mutant is said to be more stable then wild-type.MUpro predicts how single-site amino acid mutations affect protein stability using the ΔΔG value, which is computed using support vector machines and neural networks.The confidence score generated measures the confidence of the prediction.If the score is < 0, it indicates decreased stability, whereas, score > 0 shows that the mutation increases stability compared to wild-type.The user can choose residue conservation analysis if required.If input is a PDB entry, then the evolutionary residue conservation score is derived from ConSurf-DB [64], on a scale of 1-9 (1 is an indication of least conserved/highly variable and 9 highly conserved/least variable).If input is a structure uploaded by the user, then the extracted sequence is used to search for homologs against the UniRef90 database using PSI-BLAST (2 iterations and e-value cutoff of 1) [65].Weighted observed percentages from generated Position Specific Scoring Matrix (PSSM) are scaled from 1 to 9 as already defined and are presented as conservation scores.
For validation of iStability module, a total of 17, 6, 10 and 15 proteins were selected for the analysis of beta-turn proline insertion, N-cap proline insertion, conformational strain release and disulfide bond insertion strategies respectively (Table 2 and S7 Table ).Protein structures were analyzed using iStability module by selecting each strategy and predictions obtained were compared with experimental observations.

in silico Comparative Analysis of Interactions in Protein Mutants (iMutants)
iMutants takes a single structure as input similar to iStability.The user must also provide mutations in the specified format.A large number of mutations can be analyzed simultaneously in iMutants.Each mutation must be provided in a separate line.For double or multiple mutants, mutations should be provided in a comma-separated format on a single line (S2 Fig) .Users can modify interaction cutoffs and relative ASA value before submitting a job.Similar to iStability, an option is provided for residue conservation analysis at the mutation site.
For validation of iMutants module, a total of 51 mutations were analyzed (Table 3 and S8 Table) in arc repressor protein of bacteriophage P22 (PDB ID: 1ARR).Mutants were generated using MODELLER [66], energy minimized using steepest descent and finally the structure thus generated was compared with wild-type structure to analyze the change in interactions at the mutation site.

Results and Discussion
The generalized workflow that can be implemented for most rational protein engineering problems has been described in Fig 1 .The features and in silico implementation of this workflow in iRDP web server has been depicted in Fig 2.

Description and Validation of the Modules
Analysis of structure stabilization mechanisms using iCAPS.iCAPS has been designed to carry out a comparative analysis of protein structures in terms of structural features and interactions that are known to contribute to thermodynamic stability.iCAPS supports investigation of 20 different stabilization mechanisms, as described below, estimating *The labels in local interaction profile correspond to the number of IP: Ion-pair, IPNet: Ion-pair networks, AP: Aromatic-aromatic interaction, APNet: Aromatic-aromatic interaction network, AS: Aromatic-sulphur interactions, ASNet: Aromatic-sulphur interaction network, HB: Hydrogen bonds, Disul: Disulfide bonds, Catpi: Cation-pi interactions, CatpiNet: Cation-pi interaction networks and Hphob: Hydrophobic interactions, formed by wild-type (wild) and mutant (mut) residues.The-(hyphen) corresponds to no interaction or interaction networks detected.Please refer S8 more than 250 parameters (S4 Table ) analyzed simultaneously for a maximum of 100 structures.
1. Amino acid composition: iCAPS generates a comparative summary for a set of proteins in terms of their amino acid composition and its property-wise classification into different categories like positively charged, negatively charged, uncharged polar and aromatic residues.
2. Secondary structure information: Comparative summary generated for overall secondary structure (SS) content as well as the residue composition of each type of SS (Helix/Strand/ Turn/Coil) of proteins, provides better understanding of the contribution of SS to protein thermostabilization.
4. Disulfide bridges: iCAPS identifies disulfide bridges in input protein structures and provides useful insights by classifying them in terms of their expected entropic effect (i.e. based on the number of residues between bridged Cys residues) while providing other details such as solvent accessibility and SS preference of Cys residues, revealing the contribution of these bonds to structural stabilization.
5. Thermolabile bond profile: iCAPS studies spatial distributions of thermolabile bonds as potential target sites for stability enhancement in input structures involving asparagine and glutamine along with additional details such as SS preference and solvent accessible nature of the residues forming these bonds.
6. Conformationally strained residue profile: Conformationally strained residue detection feature in iCAPS identifies conformationally strained residues in input structures, which could be considered for mutation to glycine for improving thermal stability of proteins.The module also provides additional information such as conformational geometry, SS preference and strain distance (distance between the C β and main chain oxygen atom) of the strained residues.While mutation of such residues to Gly remain the most established strategy, the void generated due to the lack of side chain in the Gly residue remains a viable concern [67].In such cases it is advisable to explore non-Glycine substitutions using the customized mutation option in iStability.

Helix dipole stabilization profile:
The helix dipole stabilization feature of iCAPS identifies dipole-stabilized helices in input protein structures along with position-wise classification of dipole stabilizing charged residues.
8. Proline residue profile: iCAPS reports the distribution of proline residues in various secondary structures along with specific identification of prolines occurring at the second position of beta-turns and at N-terminus of helices.Since the solvent exposed loops and intrinsically disordered regions of proteins are often found to be proline-rich [68], the secondary structure wise proline distribution provided warrants careful analysis.9. Accessible Surface Area analysis: iCAPS measures the total, main-chain, side-chain, polar and non-polar accessible surface areas of proteins.ASA analysis also classifies all 20 amino acids by their solvent accessible nature as buried or exposed.

Metal binding analysis:
The module identifies residues involved in metal binding sites along-with determination of metal coordination geometries.
11. Estimation of Gibbs free energy of unfolding: Protein stabilization energies for input structures are computed by iCAPS using the FoldX energy function [49].The total energy is considered as an approximation of overall stability of the protein.This comparative report gives a comprehensive overview of energies involving various structure stabilization mechanisms amongst proteins under study.
The above results are presented in a formatted web page.Besides this, for further downstream analysis the user can download a zipped file containing all results in the form of tabdelimited text files.An extensive help file has been prepared and provided in the website which explains the importance of every parameter generated, along with relevant references.
Validation of iCAPS module.For validation of iCAPS, a diverse non-redundant dataset of thermophilic-mesophilic (TS-MS) protein pairs, from organisms that are moderately thermophilic to hyperthermophilic as well as their mesophilic counterparts, were investigated (Table 1).The pairs comprising of structures having resolution 2.5 Å were selected from a diverse set of families (S5 Table ).The selected TS-MS pairs were observed to be highly similar to each other with RMSD of the structures in the range of 0.69-1.68Å while sequence identity in the range of 24-73% (S3 Fig) .The thermophilic and mesophilic protein set among themselves were found to be highly dissimilar with the sequence identity ranging from 1-12% and 2-13% respectively (S3 Fig) , demonstrating the diverse families considered for the analysis.In terms of structural diversity, 2 families were found to belong to all-alpha class, 3 to all beta class, 1 belonging to small proteins while the rest belonged to alpha-beta class according to the SCOP classification [69].In most cases the oligomeric state of the pairs selected was found to be the same.Table 1 shows the percentage change values between TS-MS pairs with respect to various structural parameters estimated by iCAPS.The percentage change values can be correlated to the extent of the contribution of each of the factor towards thermostability of the proteins in the dataset.A positive value indicates higher occurrence of a particular parameter in thermophilic proteins while a negative value corresponds to higher occurrence in their mesophilic counterparts.
Comparative amino acid composition analysis revealed 13 families showing higher preference of charged (CHG) amino acids while 14 families displayed a lesser content of uncharged polar amino acids (UP) in the thermophilic proteins (Table 1).Of the 16 families, Ferredoxin from Bacillus thermoproteolyticus (1IQZ) was observed to show highest preference for charged residue content compared to its mesophilic partner from Clostridium acidurici.Similarly 3-Phosphoglycerate kinase from Geobacillus stearothermophilus (1PHP) showed lowest preference for UP content compared to its mesophilic homolog from Saccharomyces cerevisiae.This preference for charged residues compared to uncharged polar residues, a thermostabilization trend [70], also affected other parameters such as ion-pairs (IP), and the R/K ratio.It was found that 11 families had higher numbers of ion-pairs.The percentage change of ion-pairs was observed to be highest in case of Rubredoxin, a 53-residue protein.Rubredoxin from thermophilic Pyrococcus furiosus (1CAA), has 7 ion-pairs in its structure (5 ion-pairs form a network) while its mesophilic counterpart from Desulfovibrio vulgaris showed only one ion-pair.Similarly, 9 thermophilic proteins showed higher preference for Arginine than Lysine with highest preference observed in case of L-Lactate dehydrogenase enzyme family (1LDN).In terms of hydrogen bonding interactions (HB), 13 families contained a higher number of hydrogen bonds in the thermophilic set as compared to their mesophilic counterparts, thereby revealing hydrogen bonds as a contributing factor towards better protein stability.For this dataset aromatic residue content (Aro) and interactions involving aromatic amino acids (AAI, ASI and CPI) showed lower contribution towards stability.Hydrophobic residue content (ALI) and hydrophobic interactions (HP) were observed to be higher in 13 and 10 families of thermophilic proteins respectively.Among all pairs in the dataset, Methionine aminopeptidase from Pyrococcus furiosus (1XGS) showed highest hydrophobic interactions compared to its mesophilic counterpart from Escherichia coli.Reduction in hydrophobic surface area of a protein is a known thermostabilization mechanism.It was seen that in 11 cases the change in NP/P ratio was found to be negative (highest in case of Adenylate kinase family).While 7 families in the thermophilic set showed higher Pro content, in the current dataset only 6 and 7 thermophilic proteins respectively show beta-turn (Bt2P), NCap proline insertion parameters to be a contributing factor.Contribution by shortening of loop regions (Loop) in proteins towards thermostability was observed in 10 thermophilic proteins.In case of CheY protein family, loop percentage was observed to be lowest in case of thermophilic protein (1TMY) than its mesophilic partner.It was similarly observed that 12 thermophilic proteins contained fewer conformationally strained residues (CS), a factor contributing positively towards thermostability.
Although it was difficult to observe a generalized rule for protein thermostabilization, the analysis highlighted few parameters such as charged residue preference, increased ion-pairs and hydrogen bonding interactions, decreased non-polar accessible surface area, conformationally strained residues, and shortening of loops to contribute positively to thermostability of proteins in this dataset.
Identification of potential stabilizing mutations in a protein using iStability.iStability offers in silico implementation of four experimentally established protein engineering strategies (Fig 2 ) for identification of possible stabilizing mutation sites.iStability also allows for analysis of any other structure stabilization mechanism based on user-defined mutations.Evolutionary conservation scores further help to decide the mutability of mutation sites, highly conserved sites should be considered for mutation with caution.For further downstream analysis, generated mutant structures can be downloaded.The identification of mutation sites along with stability predictions and residue conservation makes iStability a unique in silico proteinengineering tool.
For the insertion of disulfide bonds strategy, iStability invokes SSBOND software [71], which identifies and ranks residue pairs that on mutation to cysteines could form stable disulfide bridges.Based on identified residue pairs, disulfide bonds are inserted and their effect on stability is predicted.In case of entropic stabilization strategy by insertion of proline residues, iStability identifies the beta-turns in the protein containing non-proline residues at second position and helices containing non-proline residues at the N-cap position.Based on the stability prediction tool (Fig 2 ) selected, residues in the identified positions are mutated to proline.The mutant structures are then compared with the wild-type by the tool in order to generate a stability score.This stability score is then used to report if the stability of the mutant has been found to increase (I) or decrease (D) in comparison to the wild type.Similarly for the release of conformational strain strategy, iStability identifies conformationally strained residues, mutates them to glycine to release the strain and predicts their effects on stability.Apart from implementation of these strategies, iStability reads user-defined mutations through the customized mutation feature and predicts the mutant stability.The results of iStability constitute the stability score, stability prediction (I: increasing stability and D: decreasing stability) and conservation score of the residue being mutated.
Validation of iStability module.For the validation of iStability module, a total of 81 mutations exploring the strategies implemented, from a diverse set of 40 structures (S6 Table ) were analyzed for their stability effect, by selecting FoldX as stability prediction tool.
i. Improvement of protein thermostability by entropic reduction due to Proline introduction Using iStability with default parameters, we have studied 28, second position β-turn proline insertions in 17 proteins from 13 organisms for which experimental stability results were available (Table 2).Of 28, iStability could accurately predict the stabilities for 22 β-turn insertions.In 20 cases, upon proline insertion an increase in stability was observed both experimentally and in iStability results.In the case of Protein G from Streptococcus sp.GX7805, the mutation K10P was predicted to decrease the stability by iStability which correlated with experimental results showing a decrease of 8.4°C in the T m value of the mutant.For the G68P mutation in 2IMM, experimentally a significant decrease in stability was observed which iStability also predicted accurately.For three cases (A93P in 2RN2, G13P and A206P in 3MBP) showing near wild-type stability experimentally, iStability predicted an increase in stability.For 3 other cases (A48P in 1PGA, L15P in 1LVE and A21P in 1RTP), the module was unable to predict the stability of the mutants correctly as experimentally they were observed to have decreased stability whereas iStability predicted them to have increased stability.
11 proline insertions at the N-cap position of helices were analysed in 6 proteins from 5 organisms (S7 Table ) for which experimentally determined stability results were available.Of the 11 mutations only 1 mutation, namely L316P carried out for Alcohol dehydrogenase was predicted inaccurately by iStability.The experimental results [18] for this mutant indicate the mutant to have higher stability (ΔT m : +10.8°C) than wild-type while iStability predicts a decreased stability for the mutant.
ii. Increasing protein thermostability through release of conformational strain by mutation to Glycine A total 14 conformational strained residues in 10 proteins from 5 organisms (Table 2) were studied using iStability and the predictions were compared with experimentally validated results.The stability of 11 mutants was predicted accurately by iStability.Only 3 cases (R21G in 1LZ1, N30G in 1PIN and K136G in 1STN) were predicted incorrectly by the iStability module.Experimentally these three mutants were found to be thermostable [36,39,79] while iStability prediction shows a decreased stability.
iii.Reducing entropy for enhancement of thermostability by introduction of disulfide bridges A set of 28 double Cysteine mutations (S7 Table ) was studied for insertion of disulfide bonds for enhancement of protein stability in 15 proteins from 9 organisms.For this strategy, though iStability detected all the 28 residues pairs as potential insertion sites, stability was predicted correctly in 11 cases in comparison with experimentally determined stabilities.In one case (T72C, A471C in 3GLY), experimental evidence showed near wild-type stability while iStability predicted increased stability.
Of the total 81 predictions studied, 47 were true-positives (Both experiment and predictions showed increase of stability), 8 were true-negatives (Both experiment and predictions showed decrease of stability), 9 were false-positives (Experiment showed decrease while prediction showed increase of stability) and 17 were false-negative (Experiment showed increase while prediction showed decrease of stability).Thus the true-positive rate calculated was 0.84 while false-positive rate was 0.53.By using different FoldX energy cut-offs, neutral-states were incorporated in the prediction and ROC curve was generated (S9 Table and S4 Fig) .While the truepositive rate and false-positive rates calculated actually test the accuracy of the underlying stability prediction program selected (FoldX), the values shown above also reflect the importance of using protein design strategies for better prediction of mutation sites.In those cases where iStability prediction differed from that observed experimentally, further analysis was carried out using iMutants.In case of A48P, the beta-turn proline insertion strategy in 1PGA, iMutants analysis revealed the loss of A48 (N)-(OD1) 46D hydrogen bond in mutant protein.This loss of hydrogen bond could result in decrease in stability observed experimentally.Similar changes in interactions near the mutation site were also noted in other cases, suggesting the need for further evaluation of the identified mutants using iMutants.
iStability currently relies on use of stability prediction tools that are based either on empirical potential energy functions or machine learning methods.Since these tools are based on defined training dataset, predicting mutations that are distant to the training dataset are a cause for concern.
Evaluating mutations through interaction framework and evolutionary residue conservation at mutation sites using iMutants.A comparative local interaction profile generated from the detailed molecular interactions in a model is unique to iMutants.It offers a quantitative measure of structural changes in mutants, through loss or gain of interactions at the mutation site.The module provides a comparative interaction analysis of wild-type and mutant residues summarized in the form of a local interaction profile comprising a number of interactions and their networks.Hyperlinks provide details of the calculated interactions.iMutants also supplements the interaction profile with estimated evolutionary conservation scores of the wild-type residues being mutated.In addition, mutant structures generated can also be downloaded for further downstream analysis.
Validation of iMutants module.The equilibrium stabilities of 51 mutants for the arc repressor protein of bacteriophage P22 (PDB ID: 1ARR) have been studied experimentally by Milla et al., 1994, using thermal and urea denaturation [51].These 51 mutations were analysed using the iMutants module and the change in various non-bonded interactions was recorded.The mutations were divided into four groups as established by Milla et al., 1994 for analysis purposes.The first group consisted of 5 mutants (Table 3; V22A, I37A, V41A, F45A, E36A) that were experimentally determined to be highly unstable as they were unable to form dimers, and remained in an unfolded state.Of the five, the first four mutations showed a dramatic decrease of 5, 6, 3 and 4 hydrophobic interactions, respectively.These interactions affect the hydrophobic core of the protein and the loss of interactions coincides with the experimental instability observed.Since the E36A mutation involves alteration of a buried polar residue, iMutants recorded drastic loss of two ionic interactions, one ionic network along with one hydrogen bond, which was established by Milla et al., 1994 as a possible cause of instability of the mutant.
The next set analysed comprised 20 mutants (S8 Table ), which experimentally exhibited reduced stability with t m values ranging from 30-50°C as compared to the wild-type protein (T m : 57.9°C).Since three mutations R31A, R40A and R50A involved polar residues, iMutants recorded a loss of one ionic interaction and one ionic interaction network for R31A, two ionic interactions, two hydrogen bonds and one ionic interaction network for R40A and finally one ionic interaction for R50A mutants that could explain the instability of these mutants (T m : 37.1°C, 31.2°C and 47.9°C, respectively).For mutants W14A, L21A, N29A, V33A and Y38A, changes in hydrogen bonding interactions were observed (S8 Table ).The mutation W14A also showed a loss of two aromatic pair interactions, one aromatic pair network as well as five hydrophobic interactions.This change in interactions could thus explain the decrease of T m to 31.5°C observed for this mutant.Mutants F10A, L12A, P15A, L19A, L21A, Y38A and M42A showed a loss of 6, 4, 2, 4, 1, 3, and 2 hydrophobic interactions respectively which could contribute to the instability observed.
The set of 25 mutants (S8 Table ) analysed next displayed near wild-type stability experimentally with their t m ranging between 55-63°C.Most mutations in this set showed marginal or no change in their hydrogen bonding, ionic and hydrophobic interactions.P8A, the only mutant with increased stability (T m : 74.1°C), showed a change in just one hydrophobic interaction in the iMutants analysis (S8 Table ).The stabilization of this particular mutant could be due to the extension of β-sheets or relief of unfavourable packing interactions as postulated by Milla et al, 1994.In summary, hydrogen bonds and hydrophobic interactions were observed to provide a sizable contribution towards stability of arc repressor protein.Complete interaction profiles and details of interactions for mutants along with their T m values have been provided in S8 Table .Although the experimental evidence by Milla et al., 1994 focus solely on mutation of the residues to Ala, exploring mutations to non-Ala residues could also yield additional useful information.
iATMs (in silico Analysis of Thermally stable Mutants): An information resource.The local interaction analysis approach of iMutants was extended to analyse experimentally validated mutations listed in the ProTherm database and is provided in the form of iATMs (in silico Analysis of Thermally stable Mutants), as a supplementary information resource to ProTherm [84].Although ProTherm contains a vast resource of experimental information, no information is available describing the changes in structure and atomic interactions due to the mutations carried out.iATMs is organized in three sections based on the type of mutation as single, double or multiple.Within these, the sections are further classified into those containing crystal structures for both wild-type and mutant proteins and those where only wild-type crystal structures are available.Wherever wild-type and mutants structures were known, interaction profiles were generated using those structures.In cases where crystal structures for mutants were absent, generation of local interaction profiles was carried out using known wildtype and modelled mutant structure.Information provided in iATMs could provide a better understanding of correlation between experimental observations and interaction rearrangements due to mutations, leading to better application of derived knowledge towards efficient protein engineering.

Conclusion
Despite the availability of a large number of structural analysis tools, to the best of our knowledge there is currently no unified platform addressing the rational protein design problem.The web platform iRDP uniquely offers investigators a multi-faceted approach for carrying out rational protein engineering by integrating protein structure and mutation analysis tools.The modules of iRDP server can either be used separately for various independent analyses or as a systematic directed strategy encompassing the steps involved in rational protein engineering.Applications of modules do not limit themselves to the protein stability problem since the information generated comprises of diverse structural features, which can correlate with a wide range of properties in proteins.Investigations carried out using iRDP act as a guide for analysing varied structural features that relate to problems such as pH stability, protein active site analysis, crystallizability, analysis of frames from molecular dynamics simulations and protein structure-function relationships.
The future direction of the iRDP web server aspires towards implementation of sequencebased inputs complementing the existing structure-based input followed by visualization of interaction networks and mutation sites, thereby providing a better structural perspective.We welcome comments and corrections from users to further improve the iRDP server.

Fig 2 .
Fig 2. The three working modules of the iRDP web server implemented for rational engineering of proteins.doi:10.1371/journal.pone.0139486.g002

0.15 0.10 6 *
The PDB IDs of Thermophilic (TS), Mesophilic (MS) pair, starting from Column 2, belong to family Citrate Synthase, Malate dehydrogenase, Rubredoxin, Cyclodextrin, Glutamate dehydrogenase, L-Lactate dehydrogenase, Thermolysin, 3-Phosphoglycerate kinase, Chey protein, Methionine aminopeptidase, Endo-1,4-Beta-Xylanase, Adenylate kinase, Ferredoxin, Pyrophosphate phosphohydrolase, Manganese superoxide dismutase and Phosphofructokinase.The parameters listed in Column 1 correspond to aromatic (Aro: residues FWY), uncharged polar (UP: residues NQST), proline (Pro), hydrophobic or aliphatic (ALI: residues VILM), charged (CHG: residues DERKH) residue contents, Arg to Lys ratio (R/K), total percentage of ion-pairs (IP), aromaticaromatic (AAI), aromatic-sulphur (ASI), cation-pi (CPI), hydrogen bonding (HB), hydrophobic (HP) interactions, proline residue percentages occurring at 2 nd position of beta turns (Bt2P) and Ncap helix positions (NCap), percentage of dipole stabilized helices (Hdip), thermolabile bonds (TL) and conformationally strained residues (CS), ratio of nonpolar to polar accessible surface areas (NP/P) and percentage of loop region (Loop).**The value shown in # represents the case in which both MS and TS proteins show absence of the corresponding parameters while the values shown in ~represents the case in which only the MS protein shows absence of the corresponding features.*** Total number of positive values calculated for each parameter; indicates number of families in which thermophilic proteins have higher preference for the parameter than their mesophilic counterparts.Numbers of ~values are also considered while counting total number of positive values.Detailed results can be found at http://irdp.ncl.res.in/cgi-bin/result_fetch.php?ID=iCAPScase.doi:10.1371/journal.pone.0139486.t001

Table 2 .
Validation of iStability using beta-turn proline insertion and conformational strain release strategy.

Table 2 .
(Continued) The labels I, D and N correspond to an increase, decrease and no change in stability respectively for the mutations as inferred from experiment.The values with unit kcal/mol represent ddG value (Change in free energy of unfolding, Mutant-Wild-type) while those with unit°C represent dTm value (Change in midpoint temperature of the thermal unfolding, Mutant-Wild-type) as inferred from the experiment.A positive value represents an increase in stability.**Two states of iStability predictions considered are: I (FoldX energy < 0) representing increased stability and D (FoldX energy > 0) representing decreased stability.

Table 3 .
iMutant analysis for the five highly unstable mutations in Arc repressor protein.