Computational Analysis of Functional Single Nucleotide Polymorphisms Associated with the CYP11B2 Gene

Single nucleotide polymorphisms (SNPs) are the most common type of genetic variation in humans and play a major role in the genetics of human phenotypic variation and the genetic basis of complex human diseases. Recently, there has been considerable interest in understanding the possible role of the CYP11B2 gene in corticosterone methyl oxidase deficiency, primary aldosteronism, and cardio-cerebro-vascular diseases. Hence, elucidating the function and molecular dynamic behavior of CYP11B2 mutations is crucial in current genomics. In this study, we investigated the pathogenic effect of 51 non-synonymous SNPs (nsSNPs) and 26 UTR SNPs in the CYP11B2 gene using computational platforms. Using a combination of SIFT, PolyPhen, I-Mutant Suite, and the ConSurf server, four nsSNPs (F487V, V129M, T498A, and V403E) were identified as potentially affecting the structure, function, and activity of the CYP11B2 protein. Furthermore, molecular dynamics simulation and structure analyses also confirmed the impact of these nsSNPs on the stability and secondary properties of the CYP11B2 protein. Additionally, using UTRscan, MirSNP, PolymiRTS and miRNASNP, three SNPs in the 3′UTR were predicted to exhibit a pattern change in the upstream open reading frames (uORFs), and eight microRNA binding sites were found to be highly affected by 3′UTR SNPs. This cataloguing of deleterious SNPs is essential for narrowing down the number of CYP11B2 mutations to be screened in genetic association studies and for a better understanding of the functional and structural aspects of the CYP11B2 protein.


Introduction
Single nucleotide polymorphisms (SNPs) are the most abundant class of genetic variation in the human genome, occurring approximately once every 100 to 300 base pairs [1]. Given that there are millions of SNPs in the entire human genome, SNPs are important as markers for constructing genetic maps and have potential as direct functional variants associated with common and genetically complex diseases and drug responses. The vast majority of SNPs are neutral allelic variants; thus, one of the main goals of SNP research is the identification of functional SNPs, which is a crucial step for understanding the molecular basis of complex traits and diseases in humans [2]. However, identifying these functional sets of SNPs can be a daunting task. Although experimental techniques provide the strongest evidence for the functional role of a genetic variant [3], it is not feasible to perform laboratory experiments for all SNPs in the human genome or even in a single gene. Hence, theoretical and/or computational methods are becoming indispensable for the identification and prioritization of SNPs with functional significance from an enormous number of non-risk alleles [4]. Computational methods are sufficiently fast and flexible to provide reliable predictions of functionally significant SNPs, with an accuracy of 80-85% [5][6][7][8][9] when they combine sequence, structure, and phylogenetic information.
The aldosterone synthase (CYP11B2) gene is situated on chromosome 8q24.3 and encodes aldosterone synthase, which is the key rate-limiting enzyme for the terminal steps of aldosterone biosynthesis [10]. Previously, Strushkevich et al. determined the CYP11B2 structure by means of X-ray crystallography [11]. In recent years, there has been considerable interest in understanding the possible role of the CYP11B2 gene in assessing the risk associated with corticosterone methyl oxidase deficiency (including CMO I and CMO II), primary aldosteronism, and cardio-cerebro-vascular diseases [12][13][14][15][16][17]. However, most disease association studies have focused on just a few SNPs, particularly T-344C (rs1799998). Other SNPs in the CYP11B2 gene have not been studied, and in silico investigations of SNPs in the CYP11B2 gene remain scarce. Lately, Hui et al. described a 33-year-old Chinese man with findings compatible with type 2 aldosterone synthase deficiency who carried a heterozygous mutation, c.977C>A (p.Thr326Lys), in exon 3; computational analysis also supported the deleterious nature of this missense variant [18]. This illustrates the distinct advantages of bioinformatics in understanding the relationship between genes and diseases. In this study, we performed computational analyses of non-synonymous SNPs (nsSNPs) and UTR-region SNPs in the CYP11B2 gene to identify all of the possible deleterious mutations and propose a modeled structure for the mutant protein. We expect that the results of our study will provide a further understanding of the CYP11B2 gene in human diseases, as well as a guide for future experimental work.

Dataset collection
The SNP information (SNP ID, amino acid position, mRNA accession number NM_000498.3, and protein accession number NP_000489.3) of the human CYP11B2 gene used in our computational analyses was retrieved from the National Center for Biotechnology Information (NCBI) database of SNPs (dbSNP; http://www.ncbi.nlm.nih.gov/snp/) [19]. The workflow, tools, and databases used to identify the potential functional SNPs in the human CYP11B2 gene are shown in Figure 1.

Assessment of nsSNP functionality
The functional context of nsSNPs was predicted using SIFT, PolyPhen and I-Mutant Suite.
SIFT (http://sift.bii.a-star.edu.sg/index.html) is a sequence-homology-based tool that predicts whether an amino acid substitution in a protein would be tolerated or damaging [20]. We ran SIFT by submitting queries as SNP IDs or as chromosome positions and alleles to the nsSNVs tool. Variants at positions with a tolerance index score ≤0.05 are considered deleterious. A lower tolerance index indicates that the particular amino acid substitution is likely to have a greater functional impact [21,22].
PolyPhen (http://genetics.bwh.harvard.edu/pph2/) is an automatic tool that predicts the possible impact of an amino acid substitution using a number of features, including sequence, phylogenetic, and structural information [23]. The query was submitted in the form of a protein sequence with the mutational position and substitution. The PolyPhen output comprises a score that ranges from 0 to 1, with zero indicating a neutral effect of the amino acid substitution on protein function. Conversely, a high score represents a variant that is more likely to be damaging.
I-Mutant Suite is a suite of support vector machine (SVM)-based predictors of protein stability changes according to Gibbs free energy change, enthalpy change, heat capacity change, and transition temperature [24]. The analyses were performed based on the protein sequence combined with the mutational position and the corresponding new residue. The predicted free energy change (ΔΔG) classifies each prediction into one of three classes: largely unstable (ΔΔG < −0.5 kcal/mol), largely stable (ΔΔG > 0.5 kcal/mol), or neutral (−0.5 ≤ ΔΔG ≤ 0.5 kcal/mol). I-Mutant Suite is available at http://gpcr2.biocomp.unibo.it/cgi/predictors/I-Mutant3.0/I-Mutant3.0.cgi.
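The cut-offs described above can be sketched as two small helper functions. This is only an illustration of the decision rules as stated in the text (a SIFT tolerance index of at most 0.05 is deleterious; a predicted free energy change below −0.5, above 0.5, or in between defines the three stability classes). The function names and the example scores are hypothetical and do not come from the tools or from Table 1.

```python
SIFT_CUTOFF = 0.05  # tolerance index at or below this value is treated as deleterious
DDG_CUTOFF = 0.5    # kcal/mol; magnitude separating "neutral" from the other two classes


def sift_is_deleterious(tolerance_index: float) -> bool:
    """Apply the SIFT rule from the text: score <= 0.05 means deleterious."""
    return tolerance_index <= SIFT_CUTOFF


def imutant_class(ddg: float) -> str:
    """Map a predicted free energy change (kcal/mol) to one of three classes."""
    if ddg < -DDG_CUTOFF:
        return "largely unstable"
    if ddg > DDG_CUTOFF:
        return "largely stable"
    return "neutral"


# Hypothetical (SIFT score, free energy change) pairs, for illustration only.
example = {"variant_a": (0.01, -1.3), "variant_b": (0.40, 0.2)}
for name, (sift_score, ddg) in example.items():
    print(name, sift_is_deleterious(sift_score), imutant_class(ddg))
```

Both boundary values (0.05 for SIFT, ±0.5 kcal/mol for the free energy change) fall on the inclusive side of the rules as written in the text.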

Evolutionary conservation analysis of nsSNPs
An amino acid that plays an essential role, e.g., in enzymatic catalysis, is likely to remain unaltered despite random evolutionary drift. Hence, the level of evolutionary conservation is often indicative of the importance of the position for maintaining the protein's structure and/or function. The ConSurf server is a bioinformatics tool for estimating the evolutionary conservation of amino/nucleic acid positions in a protein/DNA/RNA molecule based on the phylogenetic relationships between homologous sequences [25]. After entering the 3D structure of the query protein, the conservation scores are calculated based on the evolutionary relationships among the protein and its homologs [26,27]. A conservation score between 1 and 4 is considered variable, whereas a score of 5-6 is intermediate, and a score in the range of 7 to 9 indicates conserved. Using the empirical Bayesian method, the accuracy of the conservation score estimation was significantly improved, particularly when a small number of sequences are used for the calculations [26]. ConSurf is available at http://consurftest.tau.ac.il.
Figure 1. Workflow, tools, and databases used to identify potential functional SNPs in CYP11B2. doi:10.1371/journal.pone.0104311.g001
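The conservation bands described for ConSurf (1 to 4 variable, 5 to 6 intermediate, 7 to 9 conserved) amount to a simple lookup. The sketch below assumes integer grades on the 1 to 9 scale; the function name is hypothetical and is not part of the ConSurf server.

```python
def consurf_category(grade: int) -> str:
    """Translate a ConSurf grade (1-9) into the band used in the text."""
    if not 1 <= grade <= 9:
        raise ValueError("ConSurf grades are expected to lie between 1 and 9")
    if grade <= 4:
        return "variable"
    if grade <= 6:
        return "intermediate"
    return "conserved"


# Print the band for every possible grade.
for g in range(1, 10):
    print(g, consurf_category(g))
```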

Evaluation of the functional context of SNPs in the UTR region
The 5′ and 3′ untranslated regions (UTRs) of eukaryotic mRNAs play crucial roles in the post-transcriptional regulation of gene expression through the modulation of nucleocytoplasmic mRNA transport, translation efficiency, subcellular localization, and message stability [28][29][30]. The functional impacts of UTR SNPs were analyzed using UTRscan [30], MirSNP [31], PolymiRTS [32] and miRNASNP [33].
The program UTRscan looks for UTR functional elements by searching user-submitted sequence data for the patterns defined in the UTRsite collection. UTRsite is a collection of regulatory elements located in the 5′ and 3′UTRs whose function and structure have been experimentally determined and published. If the allelic sequences of a UTR SNP are found to have different functional patterns, that UTR SNP is predicted to have functional significance. A pattern change caused by a SNP in the UTR can occur in either direction: from "have pattern" to "no pattern", or from "no pattern" to "have pattern". UTRscan is available at http://itbtools.ba.itb.cnr.it/utrscan.
MirSNP is a database of SNPs used to predict whether a SNP within a target site would decrease/break or enhance/create a microRNA-mRNA binding site, based on information from dbSNP135 and miRBase 18. The output of a single search by gene name includes the mirSVR score, the effect of the different alleles, the predicted score, conservation information, and the start, end, and binding information. Combined with GWAS or eQTL data sets, MirSNP is highly sensitive and covers most experimentally confirmed SNPs that affect miRNA function. MirSNP is available at http://cmbi.bjmu.edu.cn/mirsnp.
PolymiRTS is a database of naturally occurring DNA variations in microRNA seed regions and microRNA target sites. By integrating data from CLASH (cross linking, ligation and sequencing of hybrids) experiments, the PolymiRTS database provides more complete and accurate microRNA-mRNA interactions. The polymorphic microRNA target sites are assigned to four classes: 'D' (the derived allele disrupts a conserved microRNA site), 'N' (the derived allele disrupts a nonconserved microRNA site), 'C' (the derived allele creates a new microRNA site) and 'O' (other cases when the ancestral allele cannot be determined unambiguously). Class 'C' may cause abnormal gene repression, and class 'D' may cause loss of normal repression control, so these two PolymiRTS classes are the most likely to have functional impacts. PolymiRTS is available at http://compbio.uthsc.edu/miRSNP/.
miRNASNP is a database that predicts the effect (loss or gain of function) of SNPs within pre-miRNAs, mature miRNAs, miRNA target sequences, and flanking regions. Using the SNP IDs of the query gene as input, it produces a list of targets with the energy change, the SNP-miRNA/target duplexes, and the gain/loss effect of a SNP in the miRNA seed or gene 3′UTR. Because it focuses on the potential effects of SNPs on miRNA biogenesis and target binding through both prediction and experimental validation, miRNASNP is a useful resource for guiding further experiments. miRNASNP is available at http://www.bioguo.org/miRNASNP/.
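The comparison rule described for UTRscan (a UTR SNP is flagged when its two allelic sequences match different sets of functional patterns) can be sketched as a set difference. This is an illustration of the stated rule only, not UTRscan itself; the function names are hypothetical, and the pattern lists would in practice come from the UTRscan output for each allele.

```python
def pattern_change(ref_patterns, alt_patterns):
    """Return the patterns lost and gained when moving from the reference to the alternate allele."""
    ref, alt = set(ref_patterns), set(alt_patterns)
    return {"lost": sorted(ref - alt), "gained": sorted(alt - ref)}


def is_functional(ref_patterns, alt_patterns) -> bool:
    """A SNP is flagged if any pattern appears or disappears between alleles."""
    change = pattern_change(ref_patterns, alt_patterns)
    return bool(change["lost"] or change["gained"])


# Hypothetical example: the reference allele matches a uORF pattern, the alternate does not.
print(pattern_change(["uORF"], []))
print(is_functional(["uORF"], ["uORF"]))
```

The "lost" list corresponds to the "have pattern" to "no pattern" direction, and the "gained" list to the opposite direction.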
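The PolymiRTS class filter described above (classes 'D' and 'C' are the most likely to have functional impact) can be written as a short selection step. The record layout and the SNP identifiers below are hypothetical placeholders; only the four class letters and their meanings are taken from the text.

```python
POLYMIRTS_CLASSES = {
    "D": "derived allele disrupts a conserved microRNA site",
    "N": "derived allele disrupts a nonconserved microRNA site",
    "C": "derived allele creates a new microRNA site",
    "O": "other cases; ancestral allele cannot be determined unambiguously",
}
LIKELY_FUNCTIONAL = {"D", "C"}  # the two classes singled out in the text


def likely_functional(records):
    """Keep only records whose PolymiRTS class is 'D' or 'C'."""
    return [r for r in records if r["class"] in LIKELY_FUNCTIONAL]


# Hypothetical records for illustration (not real PolymiRTS output).
records = [
    {"snp": "snp_a", "class": "D"},
    {"snp": "snp_b", "class": "N"},
    {"snp": "snp_c", "class": "C"},
    {"snp": "snp_d", "class": "O"},
]
for r in likely_functional(records):
    print(r["snp"], POLYMIRTS_CLASSES[r["class"]])
```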

Molecular modeling and molecular dynamics simulation
A structural analysis was performed to evaluate the structural stability of the native and mutant proteins. The crystal structure of the CYP11B2 protein was acquired from PDB [Protein Data Bank; PDB ID = 4DVQ (A chain)] [34]. The Modeller 9.11 package was used to map the mutations on the structure [35]. Furthermore, we used energy minimization and molecular dynamics simulation (MDS) techniques to understand the structural variations in the mutant protein with respect to the native structure using the NAMD 2.6 package [36]. The native and mutant protein structures were solvated in a water sphere using the VMD 1.9.1 package [37]. The cutoff for electrostatic and van der Waals interactions was 12.0 Å. The temperature was maintained constant at 310 K through the use of Langevin dynamics, which provides a means of controlling the kinetic energy of the system with a damping coefficient (gamma) of 1/ps. The energy minimization and molecular dynamics simulations were performed using the CHARMM force field with 5000 iterations and a 1-ns timescale, respectively. The trajectory files were analyzed to obtain the root-mean-square deviation (RMSD), radius of gyration (Rg), and solvent-accessible surface area (SASA).
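As a reminder of what the RMSD extracted from a trajectory measures, the following minimal sketch computes the root-mean-square deviation between a reference frame and a later frame. It assumes the two frames list the same atoms in the same order and have already been superimposed; it is a plain-Python illustration, not the analysis procedure used with NAMD or VMD in this study, and the coordinates are hypothetical.

```python
import math


def rmsd(reference, frame):
    """Root-mean-square deviation (same units as the input) between two coordinate sets.

    Both arguments are sequences of (x, y, z) tuples for the same atoms in the
    same order. No superposition is performed here.
    """
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (xr, yr, zr), (xf, yf, zf) in zip(reference, frame):
        total += (xr - xf) ** 2 + (yr - yf) ** 2 + (zr - zf) ** 2
    return math.sqrt(total / len(reference))


# Hypothetical two-atom example in angstroms.
start = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
later = [(0.0, 0.0, 0.0), (1.5, 0.0, 2.0)]
print(rmsd(start, later))
```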

Statistical analysis
To determine the differences in the RMSD, Rg, and SASA values between the native and mutant protein structures, statistical analyses were performed with SAS 9.1 software (SAS Institute, Inc., Cary, NC). If the quantitative data satisfied both normality and homogeneity of variance, Student's t-test was used to compare the native and mutant groups; otherwise, the nonparametric Wilcoxon two-sample test was used. The parameters were summarized as medians and interquartile ranges (IQRs). All P-values are two-sided, and values less than 0.05 were considered statistically significant.
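The test-selection rule and the median/IQR summary described above can be sketched with the Python standard library. This is an illustration only: the study used SAS 9.1, whose quantile definition may differ from the "inclusive" method assumed here, and the function names and sample values are hypothetical.

```python
import statistics


def choose_test(is_normal: bool, equal_variance: bool) -> str:
    """Pick the two-group test according to the rule stated in the text."""
    if is_normal and equal_variance:
        return "Student's t-test"
    return "Wilcoxon two-sample test"


def median_iqr(values):
    """Return (median, (Q1, Q3)) using linear interpolation between data points."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, (q1, q3)


# Hypothetical values for illustration only.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
print(choose_test(is_normal=False, equal_variance=True))
print(median_iqr(sample))
```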

CYP11B2 database construction
The database at http://203.81.25.54 contains the results obtained from this work. The natural variants listed in the database come from dbSNP. For each nsSNP, we provide predictions of the functional effects using SIFT, PolyPhen-2, and I-Mutant Suite. We also list the UTR SNPs that were predicted to have functional significance by MirSNP, PolymiRTS and miRNASNP. In addition, PDB structure files of the native and mutant proteins, as well as the results of the molecular dynamics simulations, can be downloaded. This database is freely available and will be regularly updated.

SNP dataset from dbSNP
The human CYP11B2 gene contains a total of 358 SNPs, of which 51 (14.2%) are nsSNPs and 36 (10.1%) are coding synonymous SNPs. The non-coding region includes 166 SNPs (46.4%) in the intronic region, 79 SNPs (22.1%) in the "near gene" region, and 26 SNPs (7.3%) in the mRNA UTR region. The distribution of SNPs is shown in Figure 2. We selected the nsSNPs and UTR-region SNPs for our subsequent investigations.

Identification of deleterious and damaging nsSNPs
The identification of the nsSNPs that confer susceptibility or resistance to human diseases should become increasingly feasible with improved in silico tools. In this analysis, we employed three in silico tools to determine the functional significance of nsSNPs in the CYP11B2 gene. Table 1 presents the results obtained through the SIFT, PolyPhen-2, and I-Mutant Suite analyses of the CYP11B2 nsSNPs.
To improve the prediction accuracy of structure-based tools, we then used I-Mutant Suite. We found that 24 nsSNPs (47.1%) exhibit a ΔΔG value of less than −0.5 kcal/mol, which indicates that these are largely unstable.
The predictive power of determining the functional impact of a given nsSNP can be significantly increased by combining information from a variety of tools [38]. Accordingly, we combined the SIFT, PolyPhen, and I-Mutant Suite programs to predict the influence of nsSNPs on protein function and structure. Figure 3 shows the distribution of deleterious and benign nsSNPs obtained using SIFT, PolyPhen, and I-Mutant Suite. Of all of the nsSNPs, 37.3%, 45.1%, and 47.1% were identified as deleterious by SIFT, PolyPhen, and I-Mutant Suite, respectively. In addition, six nsSNPs (F499C, Y275C, V129M, T498A, F487V, and V403E) were predicted to be functionally significant by all three tools.
Because each in silico tool relies on a different set of alignments and molecular characteristics, the results of the three tools differed slightly.
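Selecting the nsSNPs flagged by all three tools is a set intersection. In the sketch below, the six consensus variants are those named in the text, whereas the additional per-tool entries ("hypothetical_1" and so on) are invented placeholders added only so that the three input sets differ; the real per-tool lists are given in Table 1.

```python
def consensus(*tool_hits):
    """Return the variants reported as deleterious by every tool."""
    if not tool_hits:
        return set()
    return set.intersection(*(set(hits) for hits in tool_hits))


shared = {"F499C", "Y275C", "V129M", "T498A", "F487V", "V403E"}  # named in the text

# Placeholder extras are hypothetical and exist only for illustration.
sift_hits = shared | {"hypothetical_1"}
polyphen_hits = shared | {"hypothetical_1", "hypothetical_2"}
imutant_hits = shared | {"hypothetical_3"}

print(sorted(consensus(sift_hits, polyphen_hits, imutant_hits)))
```

Requiring agreement among all tools trades sensitivity for specificity: a variant flagged by only one or two tools is dropped from the consensus list.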

Analysis of nsSNPs in the conserved region
A disease-causing mutation often resides in highly conserved positions. Conservation analyses of the six nsSNPs that were predicted to be deleterious by the above-mentioned three tools were performed using the ConSurf server based on protein structure. Of the six nsSNPs, the four nsSNP positions of V129M, T498A, F487V, and V403E were considered to be located in a highly conserved amino acid region through homologous sequence alignment with the SWISS-PROT, UniProt, and UniRef90 protein databases. The main results are shown in Table 2 and

Functional SNPs in the UTR region
UTRs are known to play vital roles in the post-transcriptional regulation of gene expression, and their importance is emphasized by the finding that UTR variations can lead to serious pathology [39]. All of the 26 UTR SNPs were analyzed using UTRscan. After comparing the functional elements for each UTR SNP, we predicted that three SNPs, namely rs61763988, rs35574522, and SNPs were found to highly affect the microRNA binding targets. Combining the results of these three tools, eight SNPs (rs188784518, rs117910248, rs61763989, rs61757284, rs28390200, rs7463238, rs3802228 and rs3097) showed the highest likelihood of significantly altering microRNA targeting of the sequence (Table 3).

Molecular dynamics simulation of native and mutant CYP11B2 proteins
To further understand the structural consequences of the prioritized deleterious mutations, molecular dynamics simulations were conducted to analyze the conformational changes in the native and mutant structures (V129M, V403E, F487V, and T498A). The trajectory files were produced after the molecular dynamics simulation, and we then investigated the RMSD, Rg, and SASA variations between the native and the four mutant structures.
We calculated the RMSD for all atoms relative to the initial structure, which was taken as the reference, to measure the convergence of the protein system (Figure 5). In all five structures, considerable structural changes were observed during the initial few picoseconds, leading to an RMSD of approximately 1.2 Å, followed by notable structural deviations during the rest of the simulations. In the last 200 picoseconds of the simulation, the median of RMSD is 1. 21 Table 4). The statistical analysis showed significant differences between the native structure and the four   particularly F487V). Moreover, small fluctuations in the average RMSD value after the relaxation period led to the conclusion that the simulation generated a stable trajectory and thus provides a credible basis for further analyses.
Rg is defined as the mass-weighted root mean square distance of a collection of atoms from their common center of mass. Hence, it provides insight into the overall dimension of a protein. The Rg plot for the Cα atoms of the protein as a function of time at 310 K is shown in Figure 6, and the results of the data analyses are shown in Table 4. The statistical analysis of the Rg values over the last 200 picoseconds of the simulation showed that F487V, V129M, and T498A differed significantly from the native structure [native:   Figure 6, the F487V mutant curve differed significantly and fluctuated at a higher rate during the simulation time period, indicating that the mutant conformation is flexible throughout the simulation time and that its structure acquires an expanded conformation compared to the native structure. In contrast, no difference was found between the native structure and the V403E structure.
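The definition of Rg given above (the mass-weighted root mean square distance of a set of atoms from their common center of mass) translates directly into a few lines of code. The sketch below is a plain-Python illustration under that definition, not the routine used to analyze the trajectories in this study; the function name and the example coordinates and masses are hypothetical.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted RMS distance of atoms from their center of mass.

    coords is a sequence of (x, y, z) tuples and masses the matching atomic masses.
    """
    if len(coords) != len(masses) or not coords:
        raise ValueError("coords and masses must be non-empty and of equal length")
    total_mass = sum(masses)
    # Center of mass, one component at a time.
    com = tuple(
        sum(m * c[i] for m, c in zip(masses, coords)) / total_mass for i in range(3)
    )
    # Mass-weighted sum of squared distances from the center of mass.
    weighted = sum(
        m * sum((c[i] - com[i]) ** 2 for i in range(3)) for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted / total_mass)


# Hypothetical example: two equal-mass atoms placed 2 angstroms apart.
print(radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [12.0, 12.0]))
```

A larger Rg for the same set of atoms indicates a more expanded conformation, which is how the Rg curves are interpreted in the text.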
The SASA is the surface area of a biomolecule that is accessible to a solvent and can be related to the hydrophobic core. It is typically calculated using the 'rolling ball' algorithm developed by Shrake and Rupley in 1973 [40]. The SASA was calculated for native and mutant trajectories and is depicted in Table 4 and Figure 7. Data analyses showed that there were significant differences between all four mutant structures and native structure   exposed amino acid residues and could affect the tertiary structure of the protein.
To properly visualize the crystal structure differences between the native and mutant proteins, we spatially superimposed the molecules (Figure 8). The results show that F487V and V129M exhibit a high displacement (5 Å; shown in red) and that T498A and V403E present a low displacement (0 Å; shown in blue).
Furthermore, we ranked the above four SNPs based on the results of the RMSD, Rg, and SASA variations and the spatial superimposition (Table 5). F487V had the highest likelihood of a deleterious effect, followed by V129M, T498A, and V403E in descending order.

CYP11B2 database
During the execution of this project, the CYP11B2 database was created to show a more updated and complete set of in silico analyses per mutation. This database allows a user to quickly retrieve and rapidly analyse the predicted effects of protein variants. With its interactive interface, the CYP11B2 database allows dynamic utilization by enabling users to select only the results of the mutations and algorithms that are most important to them. The in silico analysis of CYP11B2 in this database will be helpful in the design of further experimental research. The CYP11B2 database is available at http://203.81.25.54/.

Discussion
Because of the application of high-throughput sequencing technologies, the number of identified genomic variants, particularly SNPs, in the human genome is rapidly growing. The latest release of the NCBI dbSNP database (build 141) contains nearly 44 million validated human SNPs [19]. A principal objective of studies in molecular biology and population genetics is to distinguish functionally deleterious SNPs from neutral SNPs and to characterize them. This is also an inevitable step in genetic association studies of complex genes and diseases [41]. To the best of our knowledge, this study provides the first computational analysis of functional SNPs associated with the CYP11B2 gene. The value and novelty of this study lie in prioritizing SNPs with functional significance from an enormous number of non-risk alleles and in providing new insights for further genetic association studies. Moreover, these identified SNPs could contribute to aldosterone-induced cardiovascular disease, possibly representing novel targets for therapy. Of the 358 SNPs, we selected the nsSNPs and UTR-region SNPs for our investigations; variants in the near-gene and intronic regions were not explored.
In this study, we attempted to evaluate the deleterious nsSNPs in three contexts: (1) identification of deleterious nsSNPs through both sequence- and structure-based methods (SIFT, PolyPhen and I-Mutant Suite), (2) calculation of the evolutionary conservation of amino acid positions through a conservation score (ConSurf server), and (3) measurement of alterations in the protein 3D structure due to deleterious nsSNPs through a molecular dynamics approach. Of the 51 nsSNPs associated with the CYP11B2 gene, four nsSNPs, namely F487V, V129M, T498A, and V403E, were finally identified as highly deleterious based on the above comprehensive analyses, particularly F487V.
A number of recent studies have mainly focused on the T-344C polymorphism, which impacts CYP11B2 promoter activity, but the literature on coding substitutions that directly influence the structure of the protein is scarce. However, T498A, one of the four above-mentioned nsSNPs that were predicted to be deleterious, was found to be strongly associated with CMO-II deficiency, which shows very low levels of aldosterone synthesis (0.5% or less compared with the wild-type enzyme). The in vitro analysis of the enzyme activities of the T498A mutation showed efficient 11β-hydroxylase activity but a loss of C18 activity, resulting in poor aldosterone synthesis [41]. Hence, it appears reasonable to speculate that nsSNPs can disrupt the secondary structure of the enzyme and thereby impair aldosterone synthesis while leaving part of its catalytic activity intact. It is worth noting that some patients, such as CMO-II deficiency patients who reach adulthood, could be asymptomatic and able to synthesize adequate amounts of aldosterone at the expense of elevated levels of aldosterone precursors. The existence of ostensibly asymptomatic individuals with significantly compromised aldosterone synthase function may reflect problems of ascertainment and may at least partly explain why few coding mutations in the CYP11B2 gene have been reported.
Because the translational regulation of gene expression is as important as transcriptional regulation for normal cell function, and because its dysfunction is related to the pathophysiology of various diseases [42][43][44], the UTR SNPs in the CYP11B2 gene were also evaluated using UTRscan, MirSNP, PolymiRTS and miRNASNP. In our study, we found that 7.3% of the SNPs are located in the UTR region. After comparing the functional elements for each UTR SNP using UTRscan, we found that three SNPs in the 3′UTR were predicted to exhibit a pattern change in their upstream open reading frames (uORFs). However, the uORF in the 3′UTR is hypothesized to have no functional importance.
Given the importance of microRNAs in translational regulation, we further studied whether the 3′UTR SNPs change the profile of microRNA binding to the CYP11B2 gene using MirSNP, PolymiRTS and miRNASNP. Of the 26 UTR SNPs, eight (rs188784518, rs117910248, rs61763989, rs61757284, rs28390200, rs7463238, rs3802228 and rs3097) were found to highly affect the microRNA binding targets according to MirSNP, PolymiRTS and miRNASNP. These SNPs can break, create, enhance, or decrease microRNA binding (i.e., a single SNP can break a microRNA binding site and also potentially create another site), with consequences for the regulation of the mRNA degradation pathway, thereby affecting mRNA turnover and microRNA function. Therefore, these UTR SNPs could disturb aldosterone biosynthesis. Mounting evidence suggests that aldosterone plays crucial roles in a variety of cerebrovascular, cardiovascular, and renal complications [45]. Nevertheless, validation and pathomechanism experiments on these predicted deleterious UTR SNPs remain scarce. Several studies indicated that rs3802228 might be associated with atrial structural remodeling and the presence of coronary artery disease [46,47]. As reflected in Table 3, rs3802228 could disturb the interactions between the mRNA and microRNA-331-5p. Consistent with this idea, one recent study demonstrated that upregulation of rno-miR-331* could serve as a biomarker of prognosis in the clinical therapy of heart failure [48]. In addition, rs3097 (G5937C), one of the above eight detrimental SNPs, was also found to be associated with cardiac wall thickness [49]. Collectively, these facts and speculations suggest a potential role of these identified UTR SNPs in the pathogenesis of aldosterone-induced cardiovascular complications.
It is therefore of considerable interest that variants in the CYP11B2 gene could underlie some cardiovascular diseases, not limited to primary aldosteronism, and that aldosterone may act as a central player in this pathological process. Accordingly, aldosterone antagonist treatment seems to be of considerable therapeutic value for controlling and limiting the progression of these diseases. This new pathway of CYP11B2 SNPs/aldosterone/cardiovascular disease opens new research insights and therapeutic avenues for cardiovascular diseases. The CYP11B2 protein is a steroid hydroxylase cytochrome P450 enzyme involved in the biosynthesis of the mineralocorticoid aldosterone. It is the sole enzyme capable of synthesizing aldosterone in humans and plays an important role in electrolyte balance and blood pressure. Mutations in the CYP11B2 gene can disturb the biosynthesis of aldosterone, resulting in aldosterone synthase deficiency, also known as corticosterone methyloxidase deficiency. In addition, CYP11B2 gene variations can change gene expression and therefore play an important role in many diseases, such as hypertension, primary aldosteronism, and heart failure. Nicod et al. also found that CYP11B2 is strongly associated with the rate of decline in renal allograft function [50]. Our in silico studies identified various deleterious SNPs, the majority of which have not been reported experimentally so far. Nevertheless, these findings highlight attractive screening targets for disease association studies involving the CYP11B2 protein and also provide a guide for future experimental work.
Although the prediction of deleterious SNPs appears to become more accurate as more valuable information is integrated, some challenges remain. Computational tools can predict with strong confidence whether a variant is deleterious, but information about which disease the variant is related to, and with which disease it has a causal relation, is still missing [51]. In addition, variants in regulatory regions may alter the consensus of transcription factor binding sites or promoter elements, and variants in introns and silent variants in exons may alter splicing efficiency. Nevertheless, prediction of these variants from genomic sequence remains one of the most challenging tasks for bioinformatics. The biggest problem is overprediction: (1) the prediction of promoters is expressed cryptically; (2) the vast majority of transcription factor binding sites lack distinguishing characteristics in either length or sequence; (3) cis-regulatory elements, such as ESE (exonic splicing enhancer), ESS (exonic splicing silencer), ISE (intronic splicing enhancer) and ISS (intronic splicing silencer) sites, are very poorly defined and may be located at almost any position within exons and introns. For these reasons, we did not perform predictions for variants in the near-gene and intronic regions.
In summary, using combined in silico investigations, the current study identified four nsSNPs, denoted F487V, V129M, T498A, and V403E, as deleterious to the structure and function of the CYP11B2 gene. The molecular dynamics simulation analyses also confirmed that the four nsSNPs that were predicted to be deleterious may induce changes in the stability of the protein by altering the RMSD, Rg, and SASA. In addition, three SNPs in the 3′UTR were predicted to influence the translation pattern of the CYP11B2 gene through UTRscan analysis, and eight 3′UTR SNPs may affect microRNA binding sites, as determined through MirSNP, PolymiRTS and miRNASNP analyses. Altered CYP11B2 function due to mutations and protein expression may play a critical role in determining susceptibility to complex diseases. This cataloguing of deleterious SNPs is essential for narrowing down the number of CYP11B2 mutations to be screened in genetic association studies and for a better understanding of the functional and structural aspects of the CYP11B2 protein.